Stein's unbiased risk estimate and Hyvärinen's score matching

Sulagna Ghosh, Nikolaos Ignatiadis, Frederic Koehler, Amber Lee

Published: 2025/2/27

Abstract

Given a collection of observed signals corrupted with Gaussian noise, how can we learn to optimally denoise them? This fundamental problem arises in both empirical Bayes and generative modeling. In empirical Bayes, the predominant approach is via nonparametric maximum likelihood estimation (NPMLE), while in generative modeling, score matching (SM) methods have proven very successful. In our setting, Hyvärinen's implicit SM is equivalent to another classical idea from statistics -- Stein's Unbiased Risk Estimate (SURE). Revisiting SURE minimization, we establish, for the first time, that SURE achieves nearly parametric rates of convergence of the regret in the classical empirical Bayes setting with homoscedastic noise. We also prove that SURE-training can achieve fast rates of convergence to the oracle denoiser in a commonly studied misspecified model. In contrast, the NPMLE may not even converge to the oracle denoiser under misspecification of the class of signal distributions. We show how to practically implement our method in settings involving heteroscedasticity and side-information, such as in an application to the estimation of economic mobility in the Opportunity Atlas. Our empirical results demonstrate the superior performance of SURE-training over NPMLE under misspecification. Collectively, our findings advance SURE/SM as a strong alternative to the NPMLE for empirical Bayes problems in both theory and practice.
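To illustrate the core idea of SURE minimization described above, here is a minimal toy sketch (not the paper's method, which trains flexible denoisers): for observations y_i = θ_i + ε_i with ε_i ~ N(0, σ²), SURE gives an unbiased estimate of the denoising risk of a candidate denoiser f, namely mean((f(y) − y)²) + 2σ²·mean(f′(y)) − σ², which can be minimized over a family of denoisers without ever observing θ. The example below does this for the simple linear-shrinkage family f(y) = c·y; all names and the Gaussian prior are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma = 5000, 1.0
theta = rng.normal(0.0, 2.0, size=n)    # latent signals (illustrative prior N(0, 4))
y = theta + sigma * rng.normal(size=n)  # noisy observations y = theta + noise

def sure_linear(c, y, sigma):
    """SURE for the linear shrinkage denoiser f(y) = c*y (so f'(y) = c).

    Unbiased estimate of the per-coordinate risk E[(f(y) - theta)^2]:
        mean((f(y) - y)^2) + 2*sigma^2 * mean(f'(y)) - sigma^2
    """
    return np.mean((c * y - y) ** 2) + 2 * sigma**2 * c - sigma**2

# Minimize SURE over the shrinkage factor c by grid search.
cs = np.linspace(0.0, 1.0, 1001)
risks = np.array([sure_linear(c, y, sigma) for c in cs])
c_hat = cs[np.argmin(risks)]

# For this one-parameter family the SURE minimizer has a closed form:
# c* = 1 - n*sigma^2 / sum(y^2)  (James-Stein-type shrinkage).
c_star = 1 - n * sigma**2 / np.sum(y**2)

print(c_hat, c_star)
print(sure_linear(c_hat, y, sigma), np.mean((c_hat * y - theta) ** 2))
```

Because SURE is computed from (y, σ) alone, the selected c_hat closely tracks the closed-form minimizer, and the SURE value at c_hat approximates the true (unobservable) risk against θ; the paper's point is that the same training principle extends to rich nonparametric denoiser classes and heteroscedastic noise.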
