CORE-T: COherent REtrieval of Tables for Text-to-SQL

Date

2026-01-16

Description

We provide preprocessed text-to-SQL benchmarks for BIRD, SPIDER, MMQA, and BEAVER. For BIRD, SPIDER, and MMQA, we adapt the datasets to our open-book setting by merging the tables from multiple databases (for MMQA, from the question-specific schemas) into a single retrieval corpus per benchmark. BEAVER is already released in an open-book format; for consistency with the other benchmarks, we convert its original MySQL database files into SQLite database files. We release the preprocessed retrieval data together with the corresponding SQL databases.
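
As an illustration of the merging step described above, the following minimal sketch collects the tables of several SQLite files into one flat list that a retriever could index. It is not the preprocessing code used for this dataset: the function name `build_retrieval_corpus`, the entry fields (`id`, `db_id`, `table`, `columns`), and the `db_id.table` identifier format are all assumptions made for the example, and the released files may be organized differently.

```python
import sqlite3
from pathlib import Path


def build_retrieval_corpus(db_paths):
    """Collect every user table from several SQLite files into one flat list.

    Hypothetical helper: the entry layout below is an assumption for
    illustration, not the format of the released retrieval data.
    """
    corpus = []
    for db_path in db_paths:
        db_id = Path(db_path).stem  # use the file name as the database id
        conn = sqlite3.connect(str(db_path))
        try:
            rows = conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            ).fetchall()
            for (table,) in rows:
                if table.startswith("sqlite_"):
                    continue  # skip SQLite's internal bookkeeping tables
                quoted = '"' + table.replace('"', '""') + '"'
                # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk)
                columns = [r[1] for r in conn.execute(f"PRAGMA table_info({quoted})")]
                corpus.append(
                    {
                        "id": f"{db_id}.{table}",  # assumed id format
                        "db_id": db_id,
                        "table": table,
                        "columns": columns,
                    }
                )
        finally:
            conn.close()
    return corpus
```

Prefixing each entry with its source database keeps tables that share a name across databases distinct once they sit in the same corpus.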

License

Except where otherwise noted, this item is licensed under CC BY-SA 4.0.