File access restricted

CORE-T: COherent REtrieval of Tables for Text-to-SQL

Files

core-t-data-v2.0.zip (2.85 GB)

Date

2026-01-16

Type

Dataset
Text

Authors

Soliman, Hassan

Description

We provide preprocessed text-to-SQL benchmarks for BIRD, SPIDER, MMQA, and BEAVER. For BIRD, SPIDER, and MMQA, we adapt the datasets to our open-book setting by merging the tables from multiple databases (for MMQA, the question-specific schemas) into a single retrieval corpus per benchmark. BEAVER is already released in an open-book format; for consistency with the other benchmarks, we convert its original MySQL database files into SQLite database files. We provide the preprocessed retrieval data together with the corresponding SQL databases.
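
To illustrate what merging tables from multiple databases into a single retrieval corpus can look like, here is a minimal sketch in Python. It is not the preprocessing code used to build this dataset. The helper names (`list_tables`, `table_columns`, `build_corpus`), the `db_id#table` identifier scheme, and the two in-memory demo databases are assumptions made for illustration only; the actual layout of the released archive may differ.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all user tables in a SQLite database, sorted."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


def table_columns(conn, table):
    """Return the column names of one table, in declaration order."""
    quoted = table.replace('"', '""')  # escape embedded double quotes
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    return [row[1] for row in conn.execute(f'PRAGMA table_info("{quoted}")')]


def build_corpus(databases):
    """Flatten the tables of several databases into one list of entries.

    `databases` maps a database id to an open sqlite3 connection.
    Each entry keeps its source database id so that a retrieved table
    can be traced back to the database it came from.
    """
    corpus = []
    for db_id in sorted(databases):
        conn = databases[db_id]
        for table in list_tables(conn):
            corpus.append({
                "id": f"{db_id}#{table}",  # hypothetical identifier scheme
                "db_id": db_id,
                "table": table,
                "columns": table_columns(conn, table),
            })
    return corpus


def make_demo_db(statements):
    """Create an in-memory SQLite database from CREATE TABLE statements."""
    conn = sqlite3.connect(":memory:")
    for statement in statements:
        conn.execute(statement)
    return conn


# Two toy databases standing in for the per-benchmark source databases.
demo = {
    "concerts": make_demo_db([
        "CREATE TABLE singer (singer_id INTEGER PRIMARY KEY, name TEXT)",
        "CREATE TABLE concert (concert_id INTEGER PRIMARY KEY, year INTEGER)",
    ]),
    "schools": make_demo_db([
        "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT)",
    ]),
}

corpus = build_corpus(demo)
for entry in corpus:
    print(entry["id"], entry["columns"])
```

In this sketch every table becomes one corpus entry regardless of which database it belongs to, so a retriever searches across all tables at once instead of being told the target database in advance.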

Keywords

Information Retrieval, Text-to-SQL, Multi-table Selection

Identifier

https://tudatalib.ulb.tu-darmstadt.de/handle/tudatalib/4993

DFG Classification

4.43-04 Artificial Intelligence and Machine Learning Methods

License

Except where otherwise noted, this item is licensed under CC BY-SA 4.0
