Minimal-Dissipation Learning for Energy-Based Models

Jeff Hnybida, Simon Verret

Published: 2025/10/3

Abstract

We show that the bias of the approximate maximum-likelihood estimation (MLE) objective of a persistent-chain energy-based model (EBM) is exactly equal to the thermodynamic excess work of an overdamped Langevin dynamical system. We then ask whether such a model can be trained in a finite amount of time with minimal excess work, that is, with minimal energy dissipation. We find that a Gaussian energy function with constant variance can be trained with minimal excess work by controlling only the learning rate. This proves that a persistent-chain EBM can be trained in finite time with minimal dissipation, and it also provides a lower bound on the energy required for the computation. We refer to a learning process that minimizes the excess work as minimal-dissipation learning. Finally, we generalize the optimal learning-rate schedule to general potentials and find that it induces a natural gradient flow on the MLE objective, a well-known second-order optimization method.
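For readers unfamiliar with the setting, the following is a minimal sketch of persistent-chain training for a one-dimensional Gaussian energy with fixed variance, E(x; mu) = (x - mu)^2 / (2 sigma^2), where the negative-phase samples come from an overdamped Langevin chain that is carried over between parameter updates. It is not the paper's method: the constant learning rate `lr` is a placeholder for the optimal schedule derived in the paper, and the function names, step sizes, and chain count are all illustrative assumptions.

```python
import numpy as np


def energy_grad_x(x, mu, sigma):
    """dE/dx for the Gaussian energy E(x; mu) = (x - mu)^2 / (2 sigma^2)."""
    return (x - mu) / sigma**2


def energy_grad_mu(x, mu, sigma):
    """dE/dmu for the same energy."""
    return -(x - mu) / sigma**2


def train_gaussian_ebm(data, sigma=1.0, lr=0.05, langevin_step=0.1,
                       n_steps=2000, n_chains=512, seed=0):
    """Fit the mean mu with persistent-chain approximate MLE.

    Assumption: `lr` is held constant here for simplicity; it is NOT the
    minimal-dissipation schedule from the paper.
    """
    rng = np.random.default_rng(seed)
    mu = 0.0
    # Persistent chains: initialized once, never reset between updates.
    chain = rng.normal(0.0, sigma, size=n_chains)

    for _ in range(n_steps):
        # One overdamped Langevin step under the current energy.
        noise = rng.normal(size=n_chains)
        chain = (chain
                 - langevin_step * energy_grad_x(chain, mu, sigma)
                 + np.sqrt(2.0 * langevin_step) * noise)

        # Approximate MLE gradient of the negative log-likelihood:
        # data average of dE/dmu minus chain average of dE/dmu.
        grad = (energy_grad_mu(data, mu, sigma).mean()
                - energy_grad_mu(chain, mu, sigma).mean())
        mu = mu - lr * grad

    return mu


if __name__ == "__main__":
    # Synthetic data for illustration only.
    data = np.random.default_rng(1).normal(3.0, 1.0, size=1000)
    print("estimated mean:", train_gaussian_ebm(data))
```

Because the chain lags behind the moving parameter, its samples are not drawn from the current model distribution, which is the source of the bias in the approximate MLE gradient that the abstract relates to excess work. For this Gaussian family the Fisher information with respect to mu is 1/sigma^2, so a natural-gradient update would simply rescale `grad` by sigma^2.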

Read Full Paper (arXiv.org)