RWSE Wikipedia Revision Dataset
datacite.relation.isSupplementTo | https://www.aclweb.org/anthology/E12-1054/ | |
dc.contributor.author | Zesch, Torsten | |
dc.date.accessioned | 2020-07-25T09:25:04Z | |
dc.date.available | 2020-07-25T09:25:04Z | |
dc.date.created | 2012 | |
dc.date.issued | 2020-07-25 | |
dc.description | Real-word spelling error datasets mined from the Wikipedia revision history. Each instance consists of the original sentence containing an error and the sentence in which the error has been corrected. Each instance also contains the IDs of the Wikipedia article and of the revision, so it can be traced back to the original Wikipedia article. | en_US |
dc.identifier.uri | https://tudatalib.ulb.tu-darmstadt.de/handle/tudatalib/2451 | |
dc.language.iso | other | en_US |
dc.rights.license | CC-BY-4.0 (https://creativecommons.org/licenses/by/4.0) | |
dc.subject.classification | 4.43-04 | |
dc.subject.classification | 4.43-05 | |
dc.subject.ddc | 004 | |
dc.title | RWSE Wikipedia Revision Dataset | en_US |
dc.type | Dataset | en_US |
dcterms.accessRights | openAccess | |
tuda.history.classification | Version=2016-2020;409-05 Interaktive und intelligente Systeme, Bild- und Sprachverarbeitung, Computergraphik und Visualisierung | |
tuda.project | Volkswagen | I/82806 | e-NLP - Stiftungsmit | en_US |
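
The instance structure given in dc.description (an erroneous sentence, its corrected counterpart, an article ID and a revision ID) can be sketched in Python as follows. This is only an illustration: the class name `RwseInstance`, its field names, and the helper `changed_tokens` are hypothetical, the on-disk layout of the text files is not described in this record, and the example sentence is made up rather than taken from the dataset. The helper assumes whitespace tokenization and a one-for-one token replacement.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class RwseInstance:
    """One dataset instance. Field names are hypothetical, not read from the files."""
    article_id: int
    revision_id: int
    error_sentence: str
    corrected_sentence: str


def changed_tokens(instance: RwseInstance) -> List[Tuple[int, str, str]]:
    """Return (position, erroneous token, corrected token) for each differing token.

    Assumes whitespace tokenization and that the correction swaps tokens
    one-for-one, so both sentences have the same number of tokens.
    """
    wrong = instance.error_sentence.split()
    right = instance.corrected_sentence.split()
    if len(wrong) != len(right):
        raise ValueError("sentences differ in token count")
    return [(i, w, r) for i, (w, r) in enumerate(zip(wrong, right)) if w != r]


# Made-up example values for illustration only.
example = RwseInstance(
    article_id=1,
    revision_id=2,
    error_sentence="They went to there house .",
    corrected_sentence="They went to their house .",
)
print(changed_tokens(example))  # [(3, 'there', 'their')]
```

Keeping the article and revision IDs on the instance preserves the traceability to Wikipedia that the description mentions.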
Files
Original bundle
Name | Description | Size | Format |
---|---|---|---|
en_artificial_noun.txt | | 158.23 KB | Plain Text |
en_artificial_token.txt | | 151.72 KB | Plain Text |
en_natural_test.txt | | 71.16 KB | Plain Text |
en_natural_train.txt | | 15.8 KB | Plain Text |
de_artificial_noun.txt | | 170.49 KB | Plain Text |
de_artificial_token.txt | | 162.85 KB | Plain Text |
de_natural_test.txt | | 30.29 KB | Plain Text |
de_natural_train.txt | | 16.9 KB | Plain Text |
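
The file names in the bundle follow a three-part pattern, which the minimal sketch below splits into fields so that files can be selected programmatically. The meaning assigned to each part (a language code, artificial versus natural errors, and a subset such as `noun`, `token`, `train` or `test`) is inferred from the names alone and is an assumption, as are the names `BundleFile` and `parse_bundle_name`.

```python
from typing import NamedTuple


class BundleFile(NamedTuple):
    """Parts of a bundle file name; the interpretation is an assumption."""
    language: str  # "en" or "de"
    origin: str    # "artificial" or "natural"
    subset: str    # "noun", "token", "train" or "test"


def parse_bundle_name(filename: str) -> BundleFile:
    """Split a name such as 'en_natural_test.txt' into its three parts."""
    stem, dot, extension = filename.rpartition(".")
    if dot != "." or extension != "txt":
        raise ValueError(f"expected a .txt file name, got {filename!r}")
    parts = stem.split("_")
    if len(parts) != 3:
        raise ValueError(f"expected three underscore-separated parts in {filename!r}")
    return BundleFile(*parts)


# The eight files listed in the bundle.
FILES = [
    "en_artificial_noun.txt",
    "en_artificial_token.txt",
    "en_natural_test.txt",
    "en_natural_train.txt",
    "de_artificial_noun.txt",
    "de_artificial_token.txt",
    "de_natural_test.txt",
    "de_natural_train.txt",
]

# Select the English files whose second name part is "natural".
natural_english = [
    name for name in FILES
    if parse_bundle_name(name)[:2] == ("en", "natural")
]
print(natural_english)  # ['en_natural_test.txt', 'en_natural_train.txt']
```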