Event Tokenization and Next-Token Prediction for Anomaly Detection at the Large Hadron Collider
Ambre Visive, Polina Moskvitina, Clara Nellist, Roberto Ruiz de Austri, Sascha Caron
Published: 2025/9/30
Abstract
We propose a novel use of Large Language Models (LLMs) as unsupervised anomaly detectors in particle physics. Using lightweight LLM-like networks with encoder-based architectures trained to reconstruct background events via masked-token prediction, our method identifies anomalies through deviations in reconstruction performance, without prior knowledge of signal characteristics. Applied to searches for simultaneous four-top-quark production, this token-based approach shows competitive performance against established unsupervised methods and effectively captures subtle discrepancies in collider data, suggesting a promising direction for model-independent searches for new physics.
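The scoring idea in the abstract (discretize an event into tokens, mask each token, and measure how well a model trained only on background recovers it) can be illustrated with a minimal sketch. This is not the authors' implementation: a smoothed count-based predictor stands in for the encoder network, and the names `tokenize`, `CountMaskedModel`, and `anomaly_score`, the binning range, and the toy events are all hypothetical choices made for illustration.

```python
import math
from collections import Counter, defaultdict

PAD = -1  # hypothetical marker for "no neighbour" at the edges of an event


def tokenize(value, lo, hi, n_bins):
    """Map a continuous value to an integer bin index in [0, n_bins - 1]."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    idx = int((value - lo) / (hi - lo) * n_bins)
    return min(max(idx, 0), n_bins - 1)  # clamp out-of-range values


class CountMaskedModel:
    """Toy stand-in for an encoder (an assumption, not the paper's network).

    Predicts a masked token from its left and right neighbours using
    Laplace-smoothed counts collected on background events.
    """

    def __init__(self, vocab_size, alpha=1.0):
        self.vocab_size = vocab_size
        self.alpha = alpha
        self.counts = defaultdict(Counter)

    @staticmethod
    def _context(tokens, i):
        left = tokens[i - 1] if i > 0 else PAD
        right = tokens[i + 1] if i + 1 < len(tokens) else PAD
        return (left, right)

    def fit(self, events):
        for tokens in events:
            for i, tok in enumerate(tokens):
                self.counts[self._context(tokens, i)][tok] += 1

    def prob(self, tokens, i):
        """Smoothed probability of the true token at position i when masked."""
        ctx = self.counts.get(self._context(tokens, i), Counter())
        total = sum(ctx.values())
        return (ctx[tokens[i]] + self.alpha) / (total + self.alpha * self.vocab_size)


def anomaly_score(model, tokens):
    """Mean negative log-probability over positions, masking one at a time."""
    nll = -sum(math.log(model.prob(tokens, i)) for i in range(len(tokens)))
    return nll / len(tokens)


# Toy "background" events: made-up feature values binned into 8 tokens.
raw_background = [[15.0, 30.0, 45.0, 60.0]] * 3
background = [[tokenize(v, 0.0, 100.0, 8) for v in ev] for ev in raw_background]

model = CountMaskedModel(vocab_size=8)
model.fit(background)

typical = anomaly_score(model, background[0])
unusual = anomaly_score(model, [tokenize(v, 0.0, 100.0, 8) for v in [95.0, 5.0, 90.0, 5.0]])
print(f"typical={typical:.3f} unusual={unusual:.3f}")
```

In this sketch a higher score means the event is reconstructed less well by the background-trained model, which is the quantity one would threshold to flag anomaly candidates. A real setup would replace the count table with a trained masked-token network and use the event features and binning the analysis calls for.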