SciCoQA: Quality Assurance for Scientific Paper-Code Alignment

datacite.relation.isPartOf https://arxiv.org/abs/2601.12910
dc.contributor.author Baumgärtner, Tim
dc.contributor.author Gurevych, Iryna
dc.date.accessioned 2026-03-26T13:54:57Z
dc.date.created 2026-01
dc.date.issued 2026-03-26
dc.description We present SciCoQA, a dataset for detecting discrepancies between scientific publications and their codebases to ensure faithful implementations. We construct SciCoQA from GitHub issues and reproducibility papers, and to scale our dataset, we propose a synthetic data generation method for constructing paper-code discrepancies. We analyze the paper-code discrepancies in detail and propose discrepancy types and categories to better understand the mismatches that occur. In total, our dataset consists of 635 paper-code discrepancies (92 real, 543 synthetic), covering the AI domain with real-world data and extending to Physics, Quantitative Biology, and other computational sciences through synthetic data. Our evaluation of 22 LLMs demonstrates the difficulty of SciCoQA, particularly for instances involving omitted paper details, long-context inputs, and data outside the models' pre-training corpus. The best-performing models in our evaluation, Gemini 3.1 Pro and GPT-5 Mini, detect only 46.7% of real-world paper-code discrepancies.
dc.description.version v1.1
dc.identifier.uri https://tudatalib.ulb.tu-darmstadt.de/handle/tudatalib/4994.2
dc.language.iso en
dc.rights.license CC-BY-4.0 (https://creativecommons.org/licenses/by/4.0)
dc.subject AI4Science
dc.subject Peer Review
dc.subject Paper-Code Alignment
dc.subject.classification 4.43-04
dc.subject.ddc 004
dc.title SciCoQA: Quality Assurance for Scientific Paper-Code Alignment
dc.type Text
dcterms.accessRights openAccess
person.identifier.orcid 0000-0001-6903-5509
person.identifier.orcid 0000-0003-2187-7621
tuda.agreements true
tuda.unit TUDa

Files

Original bundle

Name              Description  Size     Format
scicoqa-v1.0.zip               1.87 MB  ZIP archive
scicoqa-v1.1.zip               2.02 MB  ZIP archive

Version History

Version  Date                 Summary
2*       2026-03-26 05:26:41  Additional Data: 11 real-world discrepancies, 13 synthetic discrepancies, and the pooled annotations of model predictions
         2026-01-19 08:58:11
* Selected version