Novelty extraction from special and parallel corpora
Dura, Elżbieta (University of Skövde, School of Humanities and Informatics; University of Skövde, The Informatics Research Centre)
Gawronska, Barbara (University of Skövde, School of Humanities and Informatics; University of Skövde, The Informatics Research Centre)
2009 (English)
In: Human Language Technology. Challenges of the Information Society: Third Language and Technology Conference, LTC 2007, Poznan, Poland, October 5-7, 2007, Revised Selected Papers / [ed] Zygmunt Vetulani, Hans Uszkoreit, Springer Berlin/Heidelberg, 2009, p. 291-302
Chapter in book (Refereed)
Abstract [en]

How can corpora assist translators in ways in which resources like translation memories or term databases cannot? Our tests on the English, Polish and Swedish parts of the JRC-Acquis Multilingual Parallel Corpus show that corpora can provide support for term standardization and variation, and, most importantly, for tracing novel expressions. A corpus tool with an explicit dictionary representation is particularly suitable for the last task. Culler is a tool which allows one to select expressions with words absent from its dictionary. Even if the extracted material contains some noise, it has an undeniable value for translators and lexicographers. The quality of extraction depends in a rather obvious way on the dictionary and text processing but also on the query.

Place, publisher, year, edition, pages
Springer Berlin/Heidelberg, 2009. p. 291-302
Series
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), ISSN 0302-9743, E-ISSN 1611-3349 ; 5603 LNAI
Keywords [en]
corpus, novelty, terminology, term extraction, translation, dictionary
National Category
Computer Sciences
Research subject
Technology
Identifiers
URN: urn:nbn:se:his:diva-2222
DOI: 10.1007/978-3-642-04235-5_25
ISI: 000270337100025
Scopus ID: 2-s2.0-70349339535
ISBN: 978-3-642-04234-8 (print)
ISBN: 978-3-642-04235-5 (electronic)
OAI: oai:DiVA.org:his-2222
DiVA, id: diva2:37484
Conference
Third Language and Technology Conference, LTC 2007, Poznan, Poland, October 5-7, 2007
Note

Part of the Lecture Notes in Computer Science book series (LNCS, volume 5603). Also part of the Lecture Notes in Artificial Intelligence book subseries (LNAI, volume 5603). Original paper 2007 in Proceedings of the 3rd Language & Technology Conference 2007 (pp. 305-309), ISBN 978-83-7177-407-2. http://ltc.amu.edu.pl/a2007/content.en.html

Available from: 2008-10-06
Created: 2008-10-06
Last updated: 2019-03-05
Bibliographically approved

Authority records BETA

Dura, Elżbieta; Gawronska, Barbara
