Perturbed Iterate SGD for Lipschitz Continuous Loss Functions with Numerical Error and Adaptive Step Sizes

Michael R. Metel

Published: 2022/11/9

Abstract

Motivated by neural network training in finite-precision arithmetic environments, this work studies the convergence of perturbed iterate SGD with adaptive step sizes in the presence of numerical error. For a general stochastic Lipschitz continuous loss function, asymptotic convergence to a Clarke stationary point is proven, as well as non-asymptotic convergence to an approximate stationary point in expectation. It is assumed that only an approximation of the loss function's stochastic gradient can be computed, and that there is additional error in computing the SGD step itself.
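
To make the setting concrete, below is a minimal sketch of the kind of algorithm the abstract describes: SGD where the stochastic gradient is evaluated at a randomly perturbed iterate, both the gradient and the update step are corrupted by numerical error, and the step size adapts to the observed gradients. The perturbation radius, error magnitudes, and the AdaGrad-norm-style step-size rule are illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

def perturbed_iterate_sgd(stoch_grad, x0, num_steps=1000, eta=0.1,
                          perturb_radius=1e-3, grad_err=1e-6, step_err=1e-6,
                          rng=None):
    """Sketch of perturbed iterate SGD with an adaptive step size and
    simulated numerical error.

    stoch_grad(x) should return an (approximate) stochastic gradient of
    the Lipschitz loss at x. Error magnitudes and the adaptive rule are
    assumptions for illustration.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float).copy()
    grad_norm_sq_sum = 0.0  # accumulator driving the adaptive step size
    for _ in range(num_steps):
        # Evaluate the stochastic gradient at a perturbed iterate, then
        # corrupt it to model finite-precision gradient computation.
        u = rng.uniform(-perturb_radius, perturb_radius, size=x.shape)
        g = stoch_grad(x + u) + grad_err * rng.standard_normal(x.shape)
        # AdaGrad-norm-style adaptive step size (illustrative choice).
        grad_norm_sq_sum += float(np.dot(g, g))
        alpha = eta / np.sqrt(1.0 + grad_norm_sq_sum)
        # SGD step, with additional error modeling inexact arithmetic
        # in computing the update itself.
        x = x - alpha * g + step_err * rng.standard_normal(x.shape)
    return x

# Usage: minimize a nonsmooth Lipschitz function, f(x) = ||x||_1,
# given noisy subgradients.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noisy_subgrad = lambda x: np.sign(x) + 0.1 * rng.standard_normal(x.shape)
    x_final = perturbed_iterate_sgd(noisy_subgrad, x0=np.ones(5), rng=rng)
    print(x_final)
```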
