Efficient Quantization-Aware Neural Receivers: Beyond Post-Training Quantization

SaiKrishna Saketh Yellapragada, Esa Ollila, Mario Costa

Published: 2025-09-17

Abstract

As wireless communication systems advance toward Sixth Generation (6G) Radio Access Networks (RAN), Deep Learning (DL)-based neural receivers are emerging as transformative solutions for Physical Layer (PHY) processing, delivering superior Block Error Rate (BLER) performance compared to traditional model-based approaches. Practical deployment on resource-constrained hardware, however, requires efficient quantization to reduce latency, energy, and memory without sacrificing reliability. We extend Post-Training Quantization (PTQ) baselines with Quantization-Aware Training (QAT), which incorporates low-precision simulation during training to achieve robustness at ultra-low bitwidths. Our study applies QAT and PTQ to a neural receiver architecture and evaluates it across 3GPP Clustered Delay Line (CDL)-B/D channels in LoS and NLoS environments at user velocities of up to 40 m/s. Results show that 4-bit and 8-bit QAT models achieve BLERs similar to those of FP32 models at the 10% target BLER. QAT models are also shown to outperform PTQ models by up to 3 dB while yielding 8x compression. These results demonstrate that QAT is a key enabler of low-complexity, latency-constrained inference at the PHY, facilitating real-time processing in 6G edge devices.
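To illustrate the mechanism the abstract refers to, the following is a minimal sketch of QAT via fake quantization with a straight-through estimator, written in PyTorch. It is not the paper's implementation; the module names (FakeQuantize, QATLinear), the per-tensor symmetric scaling, and the bitwidth handling are illustrative assumptions.

```python
import torch
import torch.nn as nn

class FakeQuantize(nn.Module):
    """Simulates uniform symmetric quantization in the forward pass while
    passing gradients straight through (straight-through estimator).
    Hypothetical helper for illustration; not from the paper."""
    def __init__(self, num_bits: int = 4):
        super().__init__()
        self.num_bits = num_bits

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        qmax = 2 ** (self.num_bits - 1) - 1                 # e.g. 7 for signed 4-bit
        scale = x.detach().abs().max().clamp(min=1e-8) / qmax
        x_q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax) * scale
        # Quantized values in the forward pass, identity gradient in the backward pass.
        return x + (x_q - x).detach()

class QATLinear(nn.Module):
    """Linear layer whose weights and activations are fake-quantized during training."""
    def __init__(self, in_features: int, out_features: int, num_bits: int = 4):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.w_quant = FakeQuantize(num_bits)
        self.a_quant = FakeQuantize(num_bits)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w_q = self.w_quant(self.linear.weight)
        x_q = self.a_quant(x)
        return nn.functional.linear(x_q, w_q, self.linear.bias)

# Usage sketch: drop such layers into a receiver model and train as usual;
# the straight-through estimator lets gradients flow through the rounding step.
layer = QATLinear(64, 128, num_bits=4)
out = layer(torch.randn(8, 64))
out.sum().backward()
```

In contrast, PTQ would quantize a fully trained FP32 model without any such in-training simulation, which is why it tends to degrade more at ultra-low bitwidths.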
