Annotation Curricula to Implicitly Train Non-Expert Annotators

dc.contributor.author Lee, Ji-Ung
dc.contributor.author Klie, Jan-Christoph
dc.contributor.author Gurevych, Iryna
dc.date.accessioned 2021-06-04T17:24:16Z
dc.date.available 2021-06-04T17:24:16Z
dc.date.created 2021
dc.date.issued 2021-06-04
dc.description Annotation studies often require annotators to familiarize themselves with the task, its annotation scheme, and the data domain. This can be overwhelming at first and mentally taxing, and it can introduce errors into the resulting annotations, especially in citizen science or crowdsourcing scenarios where domain expertise is not required and only annotation guidelines are provided. To alleviate these issues, we propose annotation curricula, a novel approach to implicitly train annotators. We gradually introduce annotators to the task by ordering the instances to be annotated according to a learning curriculum. To do so, we first formalize annotation curricula for sentence- and paragraph-level annotation tasks, define an ordering strategy, and identify well-performing heuristics and interactively trained models on three existing English datasets. We then conduct a user study with 40 voluntary participants who are asked to identify the most fitting misconception for English tweets about the Covid-19 pandemic. Our results show that using a simple heuristic to order instances can already significantly reduce the total annotation time while preserving a high annotation quality. Annotation curricula can thus provide a novel way to improve data collection. To facilitate future research, we further share our code and data consisting of 2,400 annotations. en_US
dc.identifier.uri https://tudatalib.ulb.tu-darmstadt.de/handle/tudatalib/2783
dc.language.iso en en_US
dc.relation.isreferencedby https://arxiv.org/abs/2106.02382
dc.rights.license CC-BY-4.0 (https://creativecommons.org/licenses/by/4.0)
dc.subject NLP en_US
dc.subject Annotation Curriculum en_US
dc.subject Interactive Learning en_US
dc.subject Semantic Similarity en_US
dc.subject.classification 4.43-06
dc.subject.ddc 004
dc.title Annotation Curricula to Implicitly Train Non-Expert Annotators en_US
dc.type Dataset en_US
dc.type Text en_US
dcterms.accessRights openAccess
person.identifier.orcid #PLACEHOLDER_PARENT_METADATA_VALUE#
person.identifier.orcid 0000-0003-0181-6450
person.identifier.orcid 0000-0003-2187-7621
tuda.history.classification Version=2020-2024;409-06 Information Systems, Process and Knowledge Management
tuda.project EU/EFRE | 20005482 | TexPrax - Gurevych
tuda.project DFG | GU798/21-1 | Infrastruktur für in
tuda.unit TUDa

Files

Original bundle

Name | Description | Size | Format
data.zip | - | 48.71 KB | ZIP archive