Deriving pathway maps from automated text analysis using a grammar-based approach
University of Skövde, School of Humanities and Informatics. ORCID iD: 0000-0001-6254-4335
University of Skövde, School of Humanities and Informatics. ORCID iD: 0000-0001-6233-8996
University of Skövde, School of Humanities and Informatics.
2006 (English). In: Journal of Bioinformatics and Computational Biology, ISSN 0219-7200, E-ISSN 1757-6334, Vol. 4, no 2, p. 483-501. Article in journal (Refereed), Published
Abstract [en]

We demonstrate how automated text analysis can be used to support the large-scale analysis of metabolic and regulatory pathways by deriving pathway maps from textual descriptions found in the scientific literature. The main assumption is that correct syntactic analysis combined with domain-specific heuristics provides a good basis for relation extraction. Our method uses an algorithm that searches through the syntactic trees produced by a parser based on a Referent Grammar formalism, identifies relations mentioned in the sentence, and classifies them with respect to their semantic class and epistemic status (facts, counterfactuals, hypotheses). The semantic categories used in the classification are based on the relation set used in KEGG (Kyoto Encyclopedia of Genes and Genomes), so that pathway maps using KEGG notation can be automatically generated. We present the current version of the relation extraction algorithm and an evaluation based on a corpus of abstracts obtained from PubMed. The results indicate that the method is able to combine a reasonable coverage with high accuracy. We found that 61% of all sentences were parsed, and 97% of the parse trees were judged to be correct. The extraction algorithm was tested on a sample of 300 parse trees and was found to produce correct extractions in 90.5% of the cases.
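The extraction step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the `Node` tree shape, the labels `NEG` and `MOD`, the `VERB_TO_RELATION` lexicon, and the rule that negation marks a counterfactual and a modal marks a hypothesis are all assumptions made for illustration, and the actual output of the Referent Grammar parser is not specified in this record. Only the relation names (activation, inhibition, phosphorylation) follow KEGG's relation vocabulary.

```python
from dataclasses import dataclass, field
from typing import List, NamedTuple, Optional


@dataclass
class Node:
    """Hypothetical parse-tree node; the real parser output may differ."""
    label: str                    # e.g. "S", "NP", "VP", "V", "NEG", "MOD"
    word: Optional[str] = None    # set on leaf nodes only
    children: List["Node"] = field(default_factory=list)


class Relation(NamedTuple):
    agent: str
    relation: str   # KEGG-style relation name
    target: str
    status: str     # "fact", "counterfactual" or "hypothesis"


# Illustrative verb lexicon (an assumption, not taken from the paper).
VERB_TO_RELATION = {
    "activates": "activation", "activate": "activation",
    "inhibits": "inhibition", "inhibit": "inhibition",
    "phosphorylates": "phosphorylation", "phosphorylate": "phosphorylation",
}


def leaves(node: Node) -> List[str]:
    """Collect the words under a node, left to right."""
    if node.word is not None:
        return [node.word]
    words: List[str] = []
    for child in node.children:
        words.extend(leaves(child))
    return words


def first_child(node: Node, label: str) -> Optional[Node]:
    """Return the first direct child with the given label, if any."""
    return next((c for c in node.children if c.label == label), None)


def extract(sentence: Node) -> Optional[Relation]:
    """Extract one subject-verb-object relation from a sentence tree."""
    subject = first_child(sentence, "NP")
    vp = first_child(sentence, "VP")
    if subject is None or vp is None:
        return None
    verb = first_child(vp, "V")
    obj = first_child(vp, "NP")
    if verb is None or obj is None or verb.word not in VERB_TO_RELATION:
        return None
    # Assumed heuristic: negation -> counterfactual, modal -> hypothesis.
    labels = {c.label for c in vp.children}
    if "NEG" in labels:
        status = "counterfactual"
    elif "MOD" in labels:
        status = "hypothesis"
    else:
        status = "fact"
    return Relation(" ".join(leaves(subject)), VERB_TO_RELATION[verb.word],
                    " ".join(leaves(obj)), status)


# Usage: "MEK1 may inhibit ERK2" as a hand-built tree.
tree = Node("S", children=[
    Node("NP", word="MEK1"),
    Node("VP", children=[Node("MOD", word="may"),
                         Node("V", word="inhibit"),
                         Node("NP", word="ERK2")]),
])
print(extract(tree))
```

Keeping the epistemic status as a separate field on each extracted relation would let a downstream map generator draw only the relations classified as facts, or render hypotheses differently; that design choice is likewise an assumption of this sketch.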

Place, publisher, year, edition, pages
World Scientific, 2006. Vol. 4, no 2, p. 483-501
Identifiers
URN: urn:nbn:se:his:diva-1858
DOI: 10.1142/S0219720006002041
Scopus ID: 2-s2.0-33745684308
OAI: oai:DiVA.org:his-1858
DiVA, id: diva2:32134
Available from: 2007-09-12. Created: 2007-09-12. Last updated: 2020-10-29. Bibliographically approved

Authority records

Olsson, Björn; Gawronska, Barbara; Erlendsson, Björn
