Optimal Stopping for Sequential Bayesian Experimental Design
Chen Cheng, Xun Huan
公開日: 2025/9/26
Abstract
In sequential Bayesian experimental design, the number of experiments is usually fixed in advance. In practice, however, campaigns may terminate early, raising the fundamental question: when should one stop? Threshold-based rules are simple to implement but inherently myopic, as they trigger termination based on a fixed criterion while ignoring the expected future information gain that additional experiments might provide. We develop a principled Bayesian framework for optimal stopping in sequential experimental design, formulated as a Markov decision process where stopping and design policies are jointly optimized. We prove that the optimal rule is to stop precisely when the immediate terminal reward outweighs the expected continuation value. To learn such policies, we introduce a policy gradient method, but show that na\"ive joint optimization suffers from circular dependencies that destabilize training. We resolve this with a curriculum learning strategy that gradually transitions from forced continuation to adaptive stopping. Numerical studies on a linear-Gaussian benchmark and a contaminant source detection problem demonstrate that curriculum learning achieves stable convergence and outperforms vanilla methods, particularly in settings with strong sequential dependencies.