Show simple item record

dc.contributor.author: Cholakov, Kostadin
dc.contributor.author: Biemann, Chris
dc.contributor.author: Eckle-Kohler, Judith
dc.contributor.author: Gurevych, Iryna
dc.description: This article describes a lexical substitution dataset for German. The whole dataset contains 2,040 sentences from the German Wikipedia, with one target word in each sentence. There are 51 target nouns, 51 adjectives, and 51 verbs, randomly selected from 3 frequency groups based on the lemma frequency list of the German WaCKy corpus. Of these sentences, 200 were annotated by 4 professional annotators; the remaining sentences were annotated by 1 professional annotator and 5 additional annotators recruited via crowdsourcing. The resulting dataset can be used to evaluate not only lexical substitution systems, but also different sense inventories and word sense disambiguation systems. [en_US]
dc.rights: CC BY-SA 3.0
dc.subject.classification: 409-05 Interaktive und intelligente Systeme, Bild- und Sprachverarbeitung, Computergraphik und Visualisierung [en_US]
dc.title: Lexical Substitution Dataset for German. [en_US]

Except where otherwise noted, this item's license is described as CC BY-SA 3.0