dc.contributor.author: Biemann, Chris
dc.date.accessioned: 2021-05-17T09:24:24Z
dc.date.available: 2021-05-17T09:24:24Z
dc.date.issued: 2010-02-01
dc.identifier.uri: https://tudatalib.ulb.tu-darmstadt.de/handle/tudatalib/2768
dc.description: Turk Bootstrap Word Sense Inventory (TWSI) 2.0. This lexical resource, created by a crowdsourcing process using Amazon Mechanical Turk (http://www.mturk.com), encompasses a sense inventory for lexical substitution for 1,012 highly frequent English common nouns. Along with each sense, a large number of sense-annotated occurrences in context are given, as well as a weighted list of substitutions. Sense distinctions are not motivated by lexicographic considerations, but driven by substitutability: two usages belong to the same sense if their substitutions overlap considerably. After laying out the need for such a resource, the data is characterized in terms of organization and quantity. Then, we briefly describe how this data was used to create a system for lexical substitutions. Training a supervised lexical substitution system on a smaller version of the resource resulted in well over 90% acceptability for lexical substitutions provided by the system. Thus, this resource can be used to set up reliable, enabling technologies for semantic natural language processing (NLP), some of which we discuss briefly. [en_US]
dc.language.iso: en [en_US]
dc.relation: IsReferencedBy;URL;https://www.aclweb.org/anthology/L12-1101/
dc.rights: Creative Commons Attribution Share-Alike 4.0
dc.rights.uri: https://creativecommons.org/licenses/by-sa/4.0/
dc.subject: lexical substitution [en_US]
dc.subject: nlp
dc.subject: mturk
dc.subject: semantic word sense
dc.subject.classification: 409-05 Interactive and Intelligent Systems, Image and Language Processing, Computer Graphics and Visualisation [en_US]
dc.subject.ddc: 004
dc.title: Turk Bootstrap Word Sense Inventory (TWSI) 2.0 [en_US]
dc.type: Dataset [en_US]
dc.description.version: 2 [en_US]
tud.unit: TUDa


Except where otherwise noted, this item's license is described as Creative Commons Attribution Share-Alike 4.0