DBS Corpus

datacite.relation.isDescribedBy https://aclweb.org/anthology/C16-1099
datacite.relation.isVersionOf https://github.com/AIPHES/DBS
datacite.relation.references https://www.bildungsserver.de/
dc.contributor.author Benikova, Darina
dc.contributor.author Mieskes, Margot
dc.contributor.author Meyer, Christian M.
dc.contributor.author Gurevych, Iryna
dc.date.accessioned 2019-02-18T09:28:24Z
dc.date.available 2019-02-18T09:28:24Z
dc.date.created 2016-11-28
dc.date.issued 2019-02-18
dc.description The DBS corpus contains 93 multi-document summaries for 293 German documents about 30 education-related topics. We sampled the topics from the Deutscher Bildungsserver (DBS) webpage and crawled the documents linked there. The documents are highly heterogeneous in terms of text type, genre, and style. The multi-document summaries are the result of a seven-step annotation process yielding coherent extracts, a novel type of summary based on phrases extracted from the original documents that have been ordered and minimally redacted to form a readable, coherent text. The data from all intermediate steps is included in the repository to allow for extensive system evaluation. If you use the corpus in academic work, please cite our COLING paper. en_US
dc.description.version 1.0
dc.identifier.uri https://tudatalib.ulb.tu-darmstadt.de/handle/tudatalib/1915
dc.language.iso de en_US
dc.rights Creative Commons Attribution Share-Alike 4.0
dc.rights.licenseother
dc.rights.uri https://creativecommons.org/licenses/by-sa/4.0/
dc.subject Multi-document Summarization en_US
dc.subject Heterogeneous Sources en_US
dc.subject Information Aggregation en_US
dc.subject Natural Language Processing en_US
dc.subject AIPHES en_US
dc.subject.ddc 000 Computer science, information science, general works en_US
dc.subject.ddc 430 Germanic languages; German en_US
dc.title DBS Corpus en_US
dc.type Dataset en_US
dc.type Text en_US
dc.type Workflow en_US
tuda.tubiblio 97945

Files

Original bundle

Name                      Description         Size       Format
dbs-corpus-v1-public.zip                      612.66 KB  ZIP archive
restricted.txt            restricted license  2 B        Plain Text

Collections

Version History

Version  Date                 Summary
         2019-02-18 13:02:50  20 additional topics.
1*       2019-02-18 10:28:24

* Selected version