A Memory Efficient Adjoint Method to Enable Billion Parameter Optimization on a Single GPU in Dynamic Problems

Leon Herrmann, Tim Bürchner, László Kudela, Stefan Kollmannsberger

Published: 2025/9/19

Abstract

Dynamic optimization is currently limited by sensitivity computations that require information from the full forward and adjoint wave fields. Since the forward and adjoint solutions are computed in opposing time directions, the forward solution must be stored. For large-scale problems this requires a substantial amount of memory, even with checkpointing or data compression techniques. As a result, when working with modern GPU-based implementations that have limited memory capacity, the problem size is bound by memory rather than by wall-clock time. To overcome this limitation, we introduce a new approach for approximate sensitivity computation based on the adjoint method (for self-adjoint problems) that relies on the principle of superposition. The approximation allows an iterative computation of the sensitivity, reducing the memory burden to that of the solution at a small number of time steps, i.e., on the order of the number of degrees of freedom. This enables sensitivity computations for problems with billions of degrees of freedom on current GPUs, such as NVIDIA's A100 (released in 2020). We demonstrate the approach on full waveform inversion and transient acoustic topology optimization problems, relying on a highly efficient finite-difference forward solver implemented in CUDA. Since the approximation technique is limited to self-adjoint problems, phenomena such as damping cannot be considered.
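To make the memory argument concrete: a single field snapshot with 10^9 degrees of freedom occupies 4 GB in single precision, so storing even a few thousand forward time steps far exceeds the 40-80 GB of an A100, while retaining only two or three slices fits easily. The paper's method achieves this via superposition; the sketch below illustrates the same O(#DOF) memory pattern with a related, simpler technique: time-reversal reconstruction of a lossless 1D wave field during the adjoint sweep. It is a minimal illustration, not the authors' algorithm; all function names, the leapfrog scheme, and the cross-correlation gradient kernel are assumptions introduced here for exposition.

import numpy as np

def laplacian(u, dx):
    """Second-order finite-difference Laplacian; fixed (Dirichlet) ends."""
    lap = np.zeros_like(u)
    lap[1:-1] = (u[2:] - 2.0 * u[1:-1] + u[:-2]) / dx**2
    return lap

def step(u_curr, u_other, c2, dx, dt, src_val, isrc):
    """One leapfrog update; the identical stencil advances or reverses time."""
    u_new = 2.0 * u_curr - u_other + dt**2 * c2 * laplacian(u_curr, dx)
    u_new[isrc] += dt**2 * src_val  # point-source injection (illustrative scaling)
    return u_new

def gradient_low_memory(c2, src, d_obs, dx, dt, isrc, irec):
    """Misfit gradient w.r.t. c2 without storing the forward time history."""
    n, nt = len(c2), len(src)

    # Forward pass: keep only two field slices plus the receiver trace.
    u_prev, u_curr = np.zeros(n), np.zeros(n)
    u_rec = np.zeros(nt)
    for it in range(nt):
        u_next = step(u_curr, u_prev, c2, dx, dt, src[it], isrc)
        u_prev, u_curr = u_curr, u_next
        u_rec[it] = u_curr[irec]
    residual = u_rec - d_obs  # adjoint source: data misfit at the receiver

    # Backward pass: reverse the lossless forward recursion from its two
    # final slices while propagating the adjoint field, accumulating the
    # zero-lag cross-correlation kernel on the fly (O(n) memory overall).
    lam_prev, lam_curr = np.zeros(n), np.zeros(n)
    grad = np.zeros(n)
    for it in range(nt - 1, -1, -1):
        # Reverse the forward field one step, re-injecting the source term.
        u_before = step(u_prev, u_curr, c2, dx, dt, src[it], isrc)
        u_curr, u_prev = u_prev, u_before
        # Adjoint step: runs forward in reversed time, driven by the residual.
        lam_next = step(lam_curr, lam_prev, c2, dx, dt, residual[it], irec)
        lam_prev, lam_curr = lam_curr, lam_next
        # Standard imaging kernel for dJ/dc2, up to sign convention; the
        # time alignment here is correct to within one step, which suffices
        # for a sketch (a production discrete adjoint would match exactly).
        grad += dt * lam_curr * laplacian(u_curr, dx)
    return grad

Because the undamped leapfrog stencil is algebraically invertible, the reversed recursion reproduces the forward field up to roundoff, provided no energy leaves the domain (reflecting boundaries, no damping). This is exactly where the self-adjointness restriction stated in the abstract shows up: with damping or absorbing boundaries the backward reconstruction breaks down.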
