Linear Convergence of Gradient Descent for Quadratically Regularized Optimal Transport
Alberto González-Sanz, Marcel Nutz, Andrés Riveros Valdevenito
Published: 2025/9/10
Abstract
In optimal transport, quadratic regularization is an alternative to entropic regularization when sparse couplings or small regularization parameters are desired. Here quadratic regularization means that transport couplings are penalized by the squared $L^2$ norm, or equivalently the $\chi^2$ divergence. While a number of computational approaches have been shown to work in practice, quadratic regularization is analytically less tractable than entropic regularization, and we are not aware of a previous theoretical convergence rate analysis. We focus on the gradient descent algorithm for the dual transport problem in continuous and semi-discrete settings. This problem is convex but not strongly convex; its solutions are the potential functions that approximate the Kantorovich potentials of unregularized optimal transport. The gradient descent steps are straightforward to implement and stable for small regularization parameters, in contrast to Sinkhorn's algorithm in the entropic setting. Our main result is that gradient descent converges linearly; that is, the $L^2$ distance between the iterates and the limiting potentials decreases exponentially fast. Our analysis centers on the linearization of the gradient descent operator at the optimum and uses functional-analytic arguments to bound its spectrum. These techniques seem to be novel in this area and are substantially different from the approaches familiar in entropic optimal transport.
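For orientation, here is a minimal sketch of the objects the abstract refers to; the notation, normalization, and sign conventions below are assumptions chosen for illustration and are not taken from the paper. With marginals $\mu,\nu$, cost $c$, and regularization parameter $\varepsilon>0$, the quadratically regularized problem and its dual are commonly written as
\[
\inf_{\pi\in\Pi(\mu,\nu)} \int c\,d\pi + \frac{\varepsilon}{2}\Big\|\frac{d\pi}{d(\mu\otimes\nu)}\Big\|_{L^2(\mu\otimes\nu)}^2,
\qquad
\sup_{f,g}\ \int f\,d\mu + \int g\,d\nu - \frac{1}{2\varepsilon}\int \big(f(x)+g(y)-c(x,y)\big)_+^2 \,d\mu(x)\,d\nu(y).
\]
In this convention, a gradient step on the (concave) dual with step size $\tau$, written for the potential $f$ and symmetrically for $g$, is
\[
f_{k+1}(x) = f_k(x) + \tau\Big(1 - \frac{1}{\varepsilon}\int \big(f_k(x)+g_k(y)-c(x,y)\big)_+\,d\nu(y)\Big),
\]
equivalently a descent step on the negative of the dual objective. The optimal coupling has density $\frac{1}{\varepsilon}\big(f(x)+g(y)-c(x,y)\big)_+$ with respect to $\mu\otimes\nu$, whose positive-part structure is what allows quadratic regularization to produce sparse couplings.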