Room Impulse Response Prediction with Neural Networks: From Energy Decay Curves to Perceptual Validation
Imran Muhammad, Gerald Schuller
Published: 2025/9/29
Abstract
Prediction of room impulse responses (RIRs) is essential for room acoustics, spatial audio, and immersive applications, yet conventional simulations and measurements remain computationally expensive and time-consuming. This work proposes a neural network framework that predicts energy decay curves (EDCs) from room dimensions, material absorption coefficients, and source-receiver positions, and reconstructs the corresponding RIRs via reverse-differentiation. A large training dataset was generated using room acoustic simulations with realistic geometries, frequency-dependent absorption, and diverse source-receiver configurations. Objective evaluation employed root mean squared error (RMSE) and a custom loss for the EDCs, as well as correlation, mean squared error (MSE), and spectral similarity for the reconstructed RIRs. Perceptual validation through a MUSHRA listening test confirmed no significant perceptual differences between predicted and reference RIRs. The results demonstrate that the proposed framework provides accurate and perceptually reliable RIR predictions, offering a scalable solution for practical acoustic modeling and audio rendering applications.
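To make the relationship between an EDC and its RIR concrete, the following is a minimal sketch and not the authors' implementation. It assumes the EDC is defined by Schroeder backward integration of the squared RIR, and it reads "reverse-differentiation" as taking the first difference of the EDC to recover the squared RIR. The function names and the synthetic test signal are hypothetical. Note that differencing recovers only the magnitude of each sample; how the sign of the RIR is restored is not stated in the abstract and is left open here.

```python
import numpy as np

def schroeder_edc(rir):
    """EDC by backward integration: EDC[n] = sum of rir[k]**2 for k >= n."""
    energy = rir ** 2
    return np.cumsum(energy[::-1])[::-1]

def squared_rir_from_edc(edc):
    """Invert the backward integration: rir[n]**2 = EDC[n] - EDC[n+1]."""
    padded = np.append(edc, 0.0)          # EDC is zero past the last sample
    diff = padded[:-1] - padded[1:]
    return np.maximum(diff, 0.0)          # clip tiny negative rounding errors

# Synthetic stand-in for an RIR (assumption): exponentially decaying noise.
rng = np.random.default_rng(0)
fs = 16000
t = np.arange(fs // 2) / fs
rir = rng.standard_normal(t.size) * np.exp(-t / 0.05)

edc = schroeder_edc(rir)
energy = squared_rir_from_edc(edc)
magnitude = np.sqrt(energy)               # |rir[n]|; the sign is not recoverable
```

In a prediction setting, `edc` would come from the network rather than from a known RIR, so the differencing step would also amplify any non-monotonic prediction error, which is one reason the clipping step is included.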