PeerQA: A Scientific Question Answering Dataset from Peer Reviews

dc.contributor.author Baumgärtner, Tim
dc.contributor.author Briscoe, Ted
dc.contributor.author Gurevych, Iryna
dc.date.accessioned 2025-02-17T13:24:15Z
dc.date.available 2025-02-17T13:24:15Z
dc.date.created 2025
dc.date.issued 2025-02-17
dc.description We present PeerQA, a real-world, scientific, document-level Question Answering (QA) dataset. PeerQA questions are sourced from peer reviews, which contain questions that reviewers raised while thoroughly examining a scientific article. Answers were annotated by the original authors of each paper. The dataset contains 579 QA pairs from 208 academic articles, with a majority from ML and NLP and a subset from other scientific communities such as Geoscience and Public Health. PeerQA supports three critical tasks for developing practical QA systems: evidence retrieval, unanswerable question classification, and answer generation. We provide a detailed analysis of the collected dataset and conduct experiments establishing baseline systems for all three tasks. Our experiments and analyses reveal the need for decontextualization in document-level retrieval: even simple decontextualization approaches consistently improve retrieval performance across architectures. On answer generation, PeerQA serves as a challenging benchmark for long-context modeling, as the papers have an average length of 12k tokens. de_DE
dc.description.version 1.0 de_DE
dc.identifier.uri https://tudatalib.ulb.tu-darmstadt.de/handle/tudatalib/4467
dc.language.iso en de_DE
dc.rights CC-BY-NC-SA-4.0
dc.rights.licenseother
dc.rights.uri https://creativecommons.org/licenses/by-nc-sa/4.0/deed.en
dc.subject Question Answering, Retrieval, Answerability, Scientific de_DE
dc.subject.classification 4.43-04
dc.subject.ddc 004
dc.title PeerQA: A Scientific Question Answering Dataset from Peer Reviews de_DE
dc.type Dataset de_DE
dcterms.accessRights openAccess
tuda.project DFG | GU798/18-3 | QAScilnf: Automatisc
tuda.unit TUDa

Files

Original bundle

Name peerqa-data-v1.0.zip
Size 2.38 MB
Format ZIP archive