Achilles' Heel of Mamba: Essential difficulties of the Mamba architecture demonstrated by synthetic data

Tianyi Chen, Pengxiao Lin, Zhiwei Wang, Zhi-Qin John Xu

Published: 2025/9/22

Abstract

State Space Models (SSMs) have emerged as promising alternatives to attention mechanisms, with the Mamba architecture demonstrating impressive performance and linear complexity for processing long sequences. However, the fundamental differences between Mamba and Transformer architectures remain incompletely understood. In this work, we use carefully designed synthetic tasks to reveal Mamba's inherent limitations. Through experiments, we identify that Mamba's nonlinear convolution introduces an asymmetry bias that significantly impairs its ability to recognize symmetrical patterns and relationships. Using composite function and inverse sequence matching tasks, we demonstrate that Mamba strongly favors compositional solutions over symmetrical ones and struggles with tasks requiring the matching of reversed sequences. We show these limitations stem not from the SSM module itself but from the nonlinear convolution preceding it, which fuses token information asymmetrically. These insights provide a new understanding of Mamba's constraints and suggest concrete architectural improvements for future sequence models.
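To make the claim about the "nonlinear convolution preceding the SSM" concrete, below is a minimal, simplified sketch of a Mamba-style token-mixing path, assuming the standard Mamba block design (a causal depthwise convolution followed by a SiLU nonlinearity, then an SSM recurrence). The class name, hyperparameters, and the non-selective SSM simplification are illustrative assumptions, not the authors' implementation; the point is only to show where the asymmetric, left-to-right token fusion happens before the state-space scan.

```python
# Minimal sketch of a Mamba-style mixing path (assumed structure, not the
# authors' code): causal depthwise conv + SiLU ("nonlinear convolution"),
# followed by a simplified, non-selective SSM recurrence.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimplifiedMambaMixer(nn.Module):
    def __init__(self, d_model: int, d_state: int = 16, d_conv: int = 4):
        super().__init__()
        # Depthwise causal convolution: each position sees only itself and the
        # previous (d_conv - 1) tokens, so token fusion is one-directional.
        self.conv1d = nn.Conv1d(d_model, d_model, kernel_size=d_conv,
                                groups=d_model, padding=d_conv - 1)
        # Simplified SSM parameters (per-channel diagonal state space).
        self.A = nn.Parameter(-torch.rand(d_model, d_state))        # decay
        self.B = nn.Parameter(torch.randn(d_model, d_state) * 0.1)  # input map
        self.C = nn.Parameter(torch.randn(d_model, d_state) * 0.1)  # readout

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        bsz, seq_len, _ = x.shape
        # Nonlinear convolution: causal conv + SiLU fuses each token with its
        # left neighbors only -- the asymmetry the paper identifies.
        x_conv = self.conv1d(x.transpose(1, 2))[..., :seq_len].transpose(1, 2)
        u = F.silu(x_conv)
        # SSM recurrence h_t = exp(A) * h_{t-1} + B * u_t, y_t = <C, h_t>.
        dA = torch.exp(self.A)                              # (D, N)
        h = x.new_zeros(bsz, x.shape[-1], self.A.shape[1])  # (batch, D, N)
        outputs = []
        for t in range(seq_len):
            h = dA * h + self.B * u[:, t].unsqueeze(-1)
            outputs.append((h * self.C).sum(dim=-1))        # (batch, D)
        return torch.stack(outputs, dim=1)                  # (batch, seq_len, D)
```

Under this assumed structure, removing or symmetrizing the convolution (e.g., feeding `x` directly into the SSM recurrence) is the kind of ablation the abstract alludes to when attributing the asymmetry bias to the convolution rather than the SSM module itself.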
