This paper introduces a multimodal deep learning model for real-time generation of spatial room impulse responses (SRIRs) for VR auralization. The model leverages scene information and low-order reflections (LoR) computed via geometrical acoustics as inputs, enabling efficient SRIR generation. Experiments on a newly constructed, diverse dataset demonstrate the model's superior performance in reconstructing scene-specific auditory perception.
Achieve real-time VR auralization by combining deep learning with geometrical acoustics to generate spatial room impulse responses.
We propose a multimodal deep learning model for VR auralization that generates spatial room impulse responses (SRIRs) in real time to reconstruct scene-specific auditory perception. Employing SRIRs as the output reduces computational complexity and facilitates integration with personalized head-related transfer functions. The model takes two input modalities: scene information and a waveform, where the waveform corresponds to the low-order reflections (LoR). LoR can be computed efficiently using geometrical acoustics (GA) but remain difficult for deep learning models to predict accurately. Scene geometry, acoustic properties, and source and listener coordinates are therefore first used to compute the LoR in real time via GA, and both the LoR and these features are then fed to the model. To train and evaluate the model, a new dataset was constructed, consisting of multiple scenes and their corresponding SRIRs and exhibiting greater diversity than existing SRIR datasets. Experimental results demonstrate the superior performance of the proposed model in reconstructing scene-specific auditory perception.
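To make the two-branch design concrete, below is a minimal sketch of such a model in PyTorch. All module names, layer sizes, the 4-channel (first-order ambisonic) SRIR format, and the concatenation-based fusion are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of a two-branch SRIR generator, assuming PyTorch.
# Dimensions and fusion strategy are hypothetical, for illustration only.
import torch
import torch.nn as nn

class SRIRGenerator(nn.Module):
    """Fuses scene features with a GA-computed low-order-reflection (LoR)
    waveform to predict a multichannel spatial room impulse response."""

    def __init__(self, scene_dim=16, sr_len=8192, n_channels=4):
        super().__init__()
        # Branch 1: encode scene geometry/absorption + source/listener coords.
        self.scene_encoder = nn.Sequential(
            nn.Linear(scene_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
        )
        # Branch 2: encode the LoR waveform with strided 1-D convolutions.
        self.lor_encoder = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=15, stride=4, padding=7), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=15, stride=4, padding=7), nn.ReLU(),
            nn.AdaptiveAvgPool1d(32),
        )
        # Decoder: map the fused embedding to the full SRIR waveform.
        self.decoder = nn.Sequential(
            nn.Linear(128 + 64 * 32, 1024), nn.ReLU(),
            nn.Linear(1024, n_channels * sr_len),
        )
        self.n_channels, self.sr_len = n_channels, sr_len

    def forward(self, scene_feats, lor_wave):
        # scene_feats: (batch, scene_dim); lor_wave: (batch, n_channels, T)
        s = self.scene_encoder(scene_feats)
        w = self.lor_encoder(lor_wave).flatten(1)
        out = self.decoder(torch.cat([s, w], dim=1))
        return out.view(-1, self.n_channels, self.sr_len)

# Example forward pass with a 4-channel (first-order ambisonic) LoR input.
model = SRIRGenerator()
srir = model(torch.randn(2, 16), torch.randn(2, 4, 8192))
print(srir.shape)  # torch.Size([2, 4, 8192])
```

In a real-time pipeline, the GA stage would regenerate `lor_wave` whenever the source or listener moves, while the scene features stay fixed per scene, so only one lightweight forward pass is needed per update.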