Term ambiguity and variation in biomedical nomenclature and literature - problems for information extraction
University of Skövde, School of Humanities and Informatics.
2009 (English). In: Proceedings of the 4th Language and Technology Conference / [ed] Zygmunt Vetulani, Fundacja Uniwersytetu im A. Mickiewicza, 2009, 492-496 p. Conference paper (Refereed)
Abstract [en]

Named entity recognition in life sciences is reported to achieve up to 0.9 F-score when tested on test corpora. The results which are obtained for casually chosen texts are usually not as good. We believe that the task may still be underestimated, together with the basic tasks of tokenization. We present here the problems which we have encountered in our attempt to identify gene names and chemical substance names in research articles. The two problems which information extraction has to cope with are language variation and ambiguity. Both are present not only in unstructured texts but also in the nomenclature of life sciences. We also note the discrepancies between the nomenclature registered in terminologies and the actual use of terms in texts. These problems are intimately entangled with text segmentation problems.

Place, publisher, year, edition, pages
Fundacja Uniwersytetu im A. Mickiewicza, 2009. 492-496 p.
National Category
Computer and Information Science
Research subject
Technology
Identifiers
URN: urn:nbn:se:his:diva-3545
ISBN: 978-83-7177-746-2 (print)
OAI: oai:DiVA.org:his-3545
DiVA: diva2:284578
Conference
4th Language & Technology Conference, November 6-8, 2009, Poznań, Poland
Available from: 2010-01-07. Created: 2010-01-07. Last updated: 2017-06-12. Bibliographically approved.

By author/editor
Dura, Elzbieta