EDmamba: Rethinking Efficient Event Denoising with Spatiotemporal Decoupled SSMs

Ciyu Ruan, Zihang Gong, Ruishan Guo, Jingao Xu, Xinlei Chen

Published: 2025/5/8

Abstract

Event cameras provide microsecond latency and broad dynamic range, yet their raw streams are marred by spatial artifacts (e.g., hot pixels) and temporally inconsistent background activity. Existing methods jointly process the entire 4D event volume (x, y, p, t), forcing heavy spatio-temporal attention that inflates parameters, FLOPs, and latency. We introduce EDmamba, a compact event-denoising framework built on the key insight that spatial and temporal noise arise from different physical mechanisms and can therefore be suppressed independently. A polarity- and geometry-aware encoder first extracts coarse cues, which are then routed to two lightweight state-space branches: a Spatial-SSM that learns location-conditioned filters to silence persistent artifacts, and a Temporal-SSM that models causal signal dynamics to eliminate bursty background events. This decoupled design distills the network to only 88.9K parameters and 2.27 GFLOPs, enabling real-time throughput of 100K events in 68 ms on a single GPU, 36x faster than recent Transformer baselines. Despite its economy, EDmamba establishes new state-of-the-art accuracy on four public benchmarks, outscoring the strongest prior model by 2.1 percentage points.
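To make the decoupling idea concrete, here is a minimal toy sketch, not the authors' implementation. It assumes the encoder output has already been binned into a small grid indexed as grid[t][i] (time bin t, flattened spatial position i), and it uses a fixed scalar linear recurrence in place of the learned selective state-space layers. The function names (ssm_scan, spatial_branch, temporal_branch, decoupled_block), the decay value, and the sum fusion are all hypothetical choices made for illustration.

```python
from typing import List

Grid = List[List[float]]  # grid[t][i]: time bin t, flattened spatial position i


def ssm_scan(xs: List[float], a: float, b: float, c: float) -> List[float]:
    """Scalar linear state-space recurrence: h_k = a*h_{k-1} + b*x_k, y_k = c*h_k."""
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x
        ys.append(c * h)
    return ys


def spatial_branch(grid: Grid, a: float = 0.5) -> Grid:
    """Scan across spatial positions, independently within each time bin."""
    return [ssm_scan(row, a, 1.0 - a, 1.0) for row in grid]


def temporal_branch(grid: Grid, a: float = 0.5) -> Grid:
    """Causal scan along time, independently at each spatial position."""
    n_time, n_pos = len(grid), len(grid[0])
    out = [[0.0] * n_pos for _ in range(n_time)]
    for i in range(n_pos):
        ys = ssm_scan([grid[t][i] for t in range(n_time)], a, 1.0 - a, 1.0)
        for t in range(n_time):
            out[t][i] = ys[t]
    return out


def decoupled_block(grid: Grid) -> Grid:
    """Run both branches on the same input and fuse by summation (an assumption)."""
    s = spatial_branch(grid)
    t = temporal_branch(grid)
    return [[sv + tv for sv, tv in zip(s_row, t_row)] for s_row, t_row in zip(s, t)]


if __name__ == "__main__":
    toy = [[1.0, 0.0], [0.0, 0.0]]  # a single event at t=0, position 0
    print(decoupled_block(toy))
```

Each branch scans along only one axis, so the cost grows linearly with the number of grid cells rather than quadratically as in joint spatio-temporal attention. That is the general motivation for state-space scans; the actual EDmamba layers, parameterization, and fusion are described in the paper itself.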
