dc.contributor.author: Lee, Ji-Ung
dc.contributor.author: Pfetsch, Marc
dc.contributor.author: Gurevych, Iryna
dc.date.accessioned: 2024-04-08T09:54:18Z
dc.date.available: 2024-04-08T09:54:18Z
dc.date.issued: 2024-04
dc.identifier.uri: https://tudatalib.ulb.tu-darmstadt.de/handle/tudatalib/4205
dc.description: This work proposes a novel method to generate C-Tests, a variant of cloze tests (gap-filling exercises) in which only the last part of a word is turned into a gap. In contrast to previous works, which vary only the gap size or the gap placement and thus reach locally optimal solutions, we propose a mixed-integer programming (MIP) approach. This allows us to consider gap size and placement simultaneously, to achieve globally optimal solutions, and to directly integrate state-of-the-art models for gap difficulty prediction into the optimization problem. A user study with 40 participants across four C-Test generation strategies (including GPT-4) shows that our approach (*MIP*) significantly outperforms two of the baseline strategies (based on gap placement and GPT-4) and performs on par with the third (based on gap size). Our analysis shows that GPT-4 still struggles to fulfill explicit constraints during generation and that *MIP* produces C-Tests that correlate best with the perceived difficulty. We publish our code, model, and collected data consisting of 32 English C-Tests with 20 gaps each (3,200 gap responses in total) under an open source license. [de_DE]
dc.language.iso: en [de_DE]
dc.rights: Creative Commons Attribution 4.0
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: C-Test [de_DE]
dc.subject: NLP [de_DE]
dc.subject: Language Learning [de_DE]
dc.subject: Constrained Optimization [de_DE]
dc.subject: Machine Learning [de_DE]
dc.subject.classification: 4.43-04 Artificial Intelligence and Machine Learning Methods [de_DE]
dc.subject.classification: 4.43-05 Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
dc.subject.ddc: 004
dc.title: Constrained C-Test Generation via Mixed-Integer Programming (Supplementary Material) [de_DE]
dc.type: Dataset [de_DE]
dc.type: Text [de_DE]
dc.type: Software [de_DE]
tud.unit: TUDa
tud.history.classification: Version=2016-2020; 409-05 Interactive and Intelligent Systems, Image and Language Processing, Computer Graphics and Visualisation


Files in this item

12 files (no thumbnails available)

This item appears in:

Except where otherwise noted, this item's license is described as: Creative Commons Attribution 4.0