Multi-period Asset-liability Management with Reinforcement Learning in a Regime-Switching Market

Zhongqin Gao, Ping Chen, Xun Li, Yan Lv, Wenhao Zhang

Published: 2025/9/3

Abstract

This paper explores the mean-variance portfolio selection problem in a multi-period financial market characterized by regime-switching dynamics and uncontrollable liabilities. To address the uncertainty in the decision-making process within the financial market, we incorporate reinforcement learning (RL) techniques. Specifically, the study examines an exploratory mean-variance (EMV) framework where investors aim to minimize risk while maximizing returns under incomplete market information, influenced by shifting economic regimes. The market model includes risk-free and risky assets, with liability dynamics driven by a Markov regime-switching process. To align with real-world scenarios where financial decisions are made over discrete time periods, we adopt a multi-period dynamic model. We present an optimal portfolio strategy derived using RL techniques that adapt to these market conditions. The proposed solution addresses the inherent time inconsistency in classical mean-variance models by integrating a pre-committed strategy formulation. Furthermore, we incorporate partial market observability, employing stochastic filtering techniques to estimate unobservable market states. Numerical simulations and empirical tests on real financial data demonstrate that our method achieves superior returns, lower risk, and faster convergence compared to traditional models. These findings highlight the robustness and adaptability of our RL-based solution in dynamic and complex financial environments.
