A Multi-Grid Implicit Neural Representation for Multi-View Videos

Qingyue Ling, Zhengxue Cheng, Donghui Feng, Shen Wang, Chen Zhu, Guo Lu, Heming Sun, Jiro Katto, Li Song

Published: 2025/9/20

Abstract

Multi-view videos are becoming widely used in many fields, but their high resolution and multi-camera capture pose significant challenges for storage and transmission. In this paper, we propose MV-MGINR, a multi-grid implicit neural representation for multi-view videos. It combines a time-indexed grid, a view-indexed grid, and an integrated time-view grid. The first two grids capture representative content shared across views and across time, respectively, while the third captures local details at a specific view and time. A synthesis network then upsamples the multi-grid latents and generates the reconstructed frames. Additionally, a motion-aware loss is introduced to improve reconstruction quality in moving regions. The proposed framework effectively integrates the common and local features of multi-view videos, achieving high-quality reconstruction. Compared with the MPEG immersive video test model TMIV, MV-MGINR achieves a 72.3% bitrate saving at the same PSNR.
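To make the multi-grid design concrete, below is a minimal PyTorch sketch of the latent structure the abstract describes: a per-frame grid shared across views, a per-view grid shared across time, a joint grid for local (frame, view) detail, and a synthesis network that upsamples the fused latents. All names, grid resolutions, the concatenation-based fusion, the synthesis architecture, and the motion-weighting rule are assumptions for illustration; the paper does not specify them in the abstract.

```python
# Hypothetical sketch of MV-MGINR's multi-grid latents, not the authors' code.
import torch
import torch.nn as nn

class MultiGridINR(nn.Module):
    def __init__(self, n_frames, n_views, c=64, h=16, w=16):
        super().__init__()
        # Time-indexed grid: one latent per frame, shared across all views.
        self.time_grid = nn.Parameter(torch.randn(n_frames, c, h, w) * 0.01)
        # View-indexed grid: one latent per camera, shared across all frames.
        self.view_grid = nn.Parameter(torch.randn(n_views, c, h, w) * 0.01)
        # Integrated time-view grid: local details per (frame, view) pair.
        self.joint_grid = nn.Parameter(
            torch.randn(n_frames, n_views, c, h, w) * 0.01)
        # Synthesis net: upsamples fused latents to a full-resolution RGB frame
        # (16x16 latents -> 256x256 output with two 4x transposed convolutions).
        self.synthesis = nn.Sequential(
            nn.ConvTranspose2d(3 * c, 64, kernel_size=4, stride=4), nn.GELU(),
            nn.ConvTranspose2d(64, 32, kernel_size=4, stride=4), nn.GELU(),
            nn.Conv2d(32, 3, kernel_size=3, padding=1), nn.Sigmoid(),
        )

    def forward(self, t, v):
        # t, v: index tensors of shape (B,). Fuse shared and local latents
        # by channel-wise concatenation (one plausible fusion choice).
        z = torch.cat([self.time_grid[t], self.view_grid[v],
                       self.joint_grid[t, v]], dim=1)
        return self.synthesis(z)

def motion_aware_loss(pred, target, prev_target, alpha=2.0):
    # Hypothetical motion-aware loss: up-weight pixels that change between
    # consecutive ground-truth frames. The abstract only states that moving
    # regions are emphasized; this weighting is an illustrative guess.
    motion = (target - prev_target).abs().mean(dim=1, keepdim=True)
    weight = 1.0 + alpha * motion / (motion.max() + 1e-8)
    return (weight * (pred - target) ** 2).mean()

# Usage: reconstruct frames 0 and 5 as seen from cameras 1 and 3.
model = MultiGridINR(n_frames=30, n_views=8)
t, v = torch.tensor([0, 5]), torch.tensor([1, 3])
frames = model(t, v)  # shape (2, 3, 256, 256)
```

Because the time and view grids are shared across many frames, the per-frame storage cost is dominated by the (much smaller) joint grid, which is one plausible reading of how the representation saves bits relative to coding each view independently.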
