BitHydra: Towards Bit-flip Inference Cost Attack against Large Language Models

Xiaobei Yan, Yiming Li, Hao Wang, Han Qiu, Tianwei Zhang

Published: 2025/5/22

Abstract

Large language models (LLMs) are widely deployed, but their growing compute demands expose them to inference cost attacks that maximize output length. We reveal that prior attacks are fundamentally self-targeting because they rely on crafted inputs, so the added cost accrues to the attacker's own queries and scales poorly in practice. In this work, we introduce the first bit-flip inference cost attack that directly modifies model weights to induce persistent overhead for all users of a compromised LLM. Such attacks are stealthy yet realistic in practice: for instance, in shared MLaaS environments, co-located tenants can exploit hardware-level faults (e.g., Rowhammer) to flip memory bits storing model parameters. We instantiate this attack paradigm with BitHydra, which (1) minimizes a loss that suppresses the end-of-sequence (EOS) token and (2) employs an efficient yet effective critical-bit search focused on the EOS embedding vector, sharply reducing the search space while preserving benign-looking outputs. We evaluate BitHydra on 11 LLMs (1.5B–14B parameters) under int8 and float16, demonstrating that it efficiently achieves scalable cost inflation with only a few bit flips, while remaining effective even against potential defenses.
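To make the two ideas in the abstract concrete, the sketch below illustrates (1) an EOS-suppression loss and (2) a bit search restricted to the EOS embedding row. This is not the authors' code: the toy linear LM head, the `VOCAB`/`DIM`/`EOS_ID` constants, and the exhaustive single-flip greedy search are all illustrative assumptions standing in for the paper's actual model and search procedure.

```python
# Minimal sketch (assumptions, not the paper's implementation):
# (1) a loss equal to the mean log-probability of the EOS token, which the
#     attacker wants to drive down so generations run longer, and
# (2) a greedy search over the float16 bits of the EOS embedding row for the
#     single bit flip that lowers that loss the most.
import numpy as np
import torch

torch.manual_seed(0)

VOCAB, DIM, EOS_ID = 1000, 64, 2  # toy vocabulary size, hidden size, EOS token id

# Toy "model": a tied embedding matrix acting as the LM head, plus stand-in
# decoder hidden states. A real attack would use the target LLM's weights.
embedding = torch.randn(VOCAB, DIM, dtype=torch.float16)
hidden_states = torch.randn(8, DIM, dtype=torch.float16)


def eos_suppression_loss(emb: torch.Tensor) -> torch.Tensor:
    """Mean log-probability of EOS over positions; lower values mean EOS is
    predicted less often, so outputs tend to grow longer."""
    logits = hidden_states.float() @ emb.float().t()
    log_probs = torch.log_softmax(logits, dim=-1)
    return log_probs[:, EOS_ID].mean()


def flip_bit_fp16(value: np.float16, bit: int) -> np.float16:
    """Flip one bit (0..15) in the IEEE-754 half-precision encoding of `value`."""
    raw = value.view(np.uint16)
    return np.uint16(raw ^ (1 << bit)).view(np.float16)


# Critical-bit search restricted to the EOS embedding row: try every
# (dimension, bit) pair once and keep the flip that most reduces the loss.
base_loss = eos_suppression_loss(embedding).item()
best = (base_loss, None)
eos_row = embedding[EOS_ID].numpy().astype(np.float16)

for dim in range(DIM):
    for bit in range(16):
        candidate = eos_row.copy()
        candidate[dim] = flip_bit_fp16(candidate[dim], bit)
        if not np.isfinite(candidate[dim]):
            continue  # skip flips producing inf/NaN, which would be easy to detect
        trial = embedding.clone()
        trial[EOS_ID] = torch.from_numpy(candidate)
        loss = eos_suppression_loss(trial).item()
        if loss < best[0]:
            best = (loss, (dim, bit))

print(f"baseline EOS log-prob: {base_loss:.4f}")
print(f"best single-bit flip (dim, bit) = {best[1]} -> EOS log-prob: {best[0]:.4f}")
```

Restricting the search to the EOS embedding row keeps the candidate set at `DIM * 16` bits rather than every bit in the model, which is the intuition behind the paper's reduced search space; the actual selection criterion and multi-flip strategy are described in the paper itself.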
