Revisiting Query Variants: The Advantage of Retrieval Over Generation of Query Variants for Effective QPP

Fangzheng Tian, Debasis Ganguly, Craig Macdonald

Published: 2025/10/2

Abstract

Leveraging query variants (QVs), i.e., queries with potentially similar information needs to the target query, has been shown to improve the effectiveness of query performance prediction (QPP) approaches. Existing QV-based QPP methods generate QVs through either query expansion or non-contextual embeddings, which may introduce topic drift and hallucinations. In this paper, we propose a method that instead retrieves QVs from a training set (e.g., MS MARCO) for a given target query of QPP. To achieve high recall in retrieving the training-set queries whose information needs are most similar to that of the target query, we extend the directly retrieved QVs (1-hop QVs) with a second retrieval that uses their labeled relevant documents (which yields 2-hop QVs). Our experiments, conducted on TREC DL'19 and DL'20, show that QPP methods using QVs retrieved by our method outperform the best-performing existing generated-QV-based QPP approaches by up to around 20% on neural ranking models such as MonoT5.
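The two-stage retrieval described above can be illustrated with a small sketch. It is not the authors' implementation: the Jaccard token overlap stands in for whatever retriever the paper uses, the second hop is assumed to issue each relevant document's text as a query against the training queries, and all names (`top_k_queries`, `retrieve_query_variants`, the toy queries and documents) are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def tokens(text: str) -> Set[str]:
    """Lowercased whitespace tokens; a stand-in for real text analysis."""
    return set(text.lower().split())


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity (assumed here in place of a real retriever)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def top_k_queries(text: str, train_queries: Dict[str, str], k: int,
                  exclude: Set[str] = frozenset()) -> List[str]:
    """Return ids of the k training queries most similar to `text`."""
    scored = [(jaccard(text, q), qid)
              for qid, q in train_queries.items() if qid not in exclude]
    scored = [(s, qid) for s, qid in scored if s > 0]  # drop non-matches
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [qid for _, qid in scored[:k]]


def retrieve_query_variants(target: str, train_queries: Dict[str, str],
                            qrels: Dict[str, List[str]], docs: Dict[str, str],
                            k: int = 1) -> Tuple[List[str], List[str]]:
    # 1-hop: training queries retrieved directly with the target query.
    one_hop = top_k_queries(target, train_queries, k)
    # 2-hop: training queries retrieved with the relevant documents
    # of each 1-hop query variant (assumed formulation).
    seen = set(one_hop)
    two_hop: List[str] = []
    for qid in one_hop:
        for doc_id in qrels.get(qid, []):
            for cand in top_k_queries(docs[doc_id], train_queries, k, exclude=seen):
                seen.add(cand)
                two_hop.append(cand)
    return one_hop, two_hop


# Toy training set (hypothetical data, not drawn from MS MARCO).
train_queries = {
    "q1": "what causes seasonal allergies",
    "q2": "pollen hay fever triggers",
    "q3": "how to bake sourdough bread",
}
qrels = {"q1": ["d1"], "q2": ["d1"], "q3": ["d2"]}
docs = {
    "d1": "seasonal allergies also called hay fever are triggered by pollen",
    "d2": "sourdough bread needs a starter and a long fermentation",
}

one_hop, two_hop = retrieve_query_variants(
    "seasonal allergies causes", train_queries, qrels, docs, k=1)
print(one_hop, two_hop)  # ['q1'] ['q2']
```

In this toy run, `q2` shares no terms with the target query and is only reached through the relevant document of `q1`, which is the kind of recall gain the second hop is meant to provide.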
