General Table Question Answering via Answer-Formula Joint Generation

Zhongyuan Wang, Richong Zhang, Zhijie Nie, Hangyu Mao

Published: 2025/3/16

Abstract

Advanced table question answering (TableQA) methods prompt large language models (LLMs) to generate answer text, SQL queries, Python code, or custom operations, which substantially improves performance on the complex reasoning problems in the TableQA task. However, each of these methods is tailored to specific question types or table structures and lacks the versatility to cope with others. In contrast, the Spreadsheet Formula, a widely used and well-defined operation language for tabular data, has not been thoroughly explored as a means of solving TableQA. In this paper, we make the first attempt to use the Formula as the executable representation for solving complex reasoning on tables with different structures. Specifically, we construct \texttt{FormulaQA}, a large Formula-annotated TableQA dataset built from existing datasets. In addition, we propose \texttt{TabAF}, a general table answering framework that solves multiple types of tasks over multiple types of tables simultaneously by decoding both answers and Formulas with a single LLM backbone. Extensive experiments demonstrate the versatility and generalization of \texttt{TabAF}. Under the same model size, \texttt{TabAF} achieves new state-of-the-art performance on WikiTableQuestion, HiTab, and TabFact.
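
To make the idea of a Formula as an executable representation concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a table stored as a list of rows addressed in A1 notation, and it supports only a single aggregate over one range. The names `cell_to_index`, `range_values`, `evaluate`, and `AGGREGATES`, as well as the sample table, are hypothetical and do not come from the paper.

```python
import re


def cell_to_index(ref):
    """Convert an A1-style reference such as 'B2' to zero-based (row, col)."""
    match = re.fullmatch(r"([A-Z]+)([1-9][0-9]*)", ref)
    if match is None:
        raise ValueError(f"bad cell reference: {ref}")
    letters, digits = match.groups()
    col = 0
    for ch in letters:  # column letters are base-26 with A = 1
        col = col * 26 + (ord(ch) - ord("A") + 1)
    return int(digits) - 1, col - 1


def range_values(grid, start, end):
    """Return all cells in the rectangular range start:end, row by row."""
    r0, c0 = cell_to_index(start)
    r1, c1 = cell_to_index(end)
    return [grid[r][c] for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]


# Hypothetical subset of spreadsheet aggregates, chosen for illustration only.
AGGREGATES = {
    "SUM": sum,
    "MAX": max,
    "MIN": min,
    "AVERAGE": lambda values: sum(values) / len(values),
}


def evaluate(grid, formula):
    """Evaluate a formula of the form '=FUNC(A1:B2)' against the grid."""
    match = re.fullmatch(r"=([A-Z]+)\(([A-Z]+[0-9]+):([A-Z]+[0-9]+)\)", formula)
    if match is None:
        raise ValueError(f"unsupported formula: {formula}")
    name, start, end = match.groups()
    if name not in AGGREGATES:
        raise ValueError(f"unsupported function: {name}")
    # Keep numeric cells only, mirroring how spreadsheet aggregates skip text.
    values = [v for v in range_values(grid, start, end)
              if isinstance(v, (int, float))]
    return AGGREGATES[name](values)


# Hypothetical table: row 1 is the header, so the data occupies B2:B4.
table = [
    ["City", "Sales"],
    ["Oslo", 120],
    ["Lima", 80],
    ["Pune", 100],
]

print(evaluate(table, "=SUM(B2:B4)"))  # prints 300
```

In this sketch, a question such as "What are the total sales?" would be answered by generating the string `=SUM(B2:B4)` and executing it, rather than by generating the number directly. Because the formula addresses cells by position, the same representation applies regardless of how the header rows or columns are arranged.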