Quantitative convergence of trained single layer neural networks to Gaussian processes

Eloy Mosig, Andrea Agazzi, Dario Trevisan

Published: 2025/9/29

Abstract

In this paper, we study the quantitative convergence of shallow neural networks trained via gradient descent to their associated Gaussian processes in the infinite-width limit. While previous work has established qualitative convergence under broad settings, precise, finite-width estimates remain limited, particularly during training. We provide explicit upper bounds on the quadratic Wasserstein distance between the network output and its Gaussian approximation at any training time $t \ge 0$, demonstrating polynomial decay with network width. Our results quantify how architectural parameters, such as width and input dimension, influence convergence, and how training dynamics affect the approximation error.
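The sketch below is a hypothetical illustration (not the paper's construction): it empirically estimates the quadratic Wasserstein distance between the output of a randomly initialized single-hidden-layer ReLU network at one fixed input and its Gaussian limit, for several widths, assuming a standard $1/\sqrt{n}$ output scaling. It only probes the distance at initialization ($t = 0$), not the trained dynamics studied in the paper; all variable names and parameter choices are illustrative.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical sketch (not the paper's method): estimate the 1-D quadratic
# Wasserstein distance W_2 between the output of a width-n single-hidden-layer
# ReLU network at a fixed input (over random initializations) and its
# Gaussian limit, for several widths n.

rng = np.random.default_rng(0)
d = 10                         # input dimension (illustrative choice)
x = np.ones(d) / np.sqrt(d)    # fixed unit-norm input
n_samples = 10_000             # number of independent initializations

def network_output(width, batch=2_000):
    # f(x) = (1/sqrt(width)) * a^T relu(W x), with W_ij, a_i ~ N(0, 1) i.i.d.
    outs = []
    for _ in range(n_samples // batch):
        W = rng.standard_normal((batch, width, d))
        a = rng.standard_normal((batch, width))
        pre = np.einsum("bkd,d->bk", W, x)      # hidden pre-activations
        outs.append((a * np.maximum(pre, 0.0)).sum(axis=1) / np.sqrt(width))
    return np.concatenate(outs)

def w2_to_gaussian(samples, sigma):
    # 1-D quadratic Wasserstein distance via quantile matching against N(0, sigma^2).
    q = (np.arange(len(samples)) + 0.5) / len(samples)
    gauss_quantiles = norm.ppf(q, scale=sigma)
    return np.sqrt(np.mean((np.sort(samples) - gauss_quantiles) ** 2))

# Limiting variance at initialization: E[a^2] * E[relu(w.x)^2] = 1 * 1/2
# for unit-norm x, since w.x ~ N(0, 1).
sigma_limit = np.sqrt(0.5)

for width in (16, 64, 256, 1024):
    print(width, w2_to_gaussian(network_output(width), sigma_limit))
```

Under this setup one would expect the printed distances to shrink as the width grows, consistent with the polynomial-in-width decay the abstract describes; the exact rates and the behavior along training are what the paper quantifies.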
