A model-free approach for continuous-time optimal tracking control with unknown user-defined cost and constrained control input via advantage function

Duc Cuong Nguyen, Quang Huy Dao, Phuong Nam Dao

Published: 2025/9/20

Abstract

This paper presents a novel off-policy continuous-time Q-learning framework for solving the linear quadratic regulation (LQR) and linear quadratic tracking (LQT) problems with constrained inputs. The method introduces an Advantage function for linear continuous-time systems, which allows solutions to be obtained without prior knowledge of the reward weight matrices, without state resetting, and without assuming that a predefined admissible controller exists. The framework comprises several algorithms that address these control problems in a model-free setting, requiring no knowledge of the system dynamics. Two implementation methods are explored: the first processes state and input data over a fixed time interval, which makes it well suited to LQR problems, while the second operates over multiple intervals and offers a practical solution for tracking problems with constrained inputs. The convergence of the proposed algorithms is established theoretically. Finally, simulation results on an F-16 aircraft system are presented for both problems to validate the effectiveness of the proposed method.
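
For orientation, the sketch below works through a scalar, model-based LQR example. It is not the paper's model-free algorithm, and the advantage used here is the textbook Hamiltonian form (running cost plus the rate of change of the value function), which may differ from the paper's definition. The system values `a`, `b`, `q`, `r` and the helper names `scalar_lqr`, `advantage`, and `saturate` are assumptions chosen for illustration. The point it shows is that this advantage is zero at the optimal input and positive elsewhere, which is the property a learning scheme could exploit.

```python
import math


def scalar_lqr(a, b, q, r):
    """Return (p, k) for dx/dt = a*x + b*u with cost integral of q*x^2 + r*u^2.

    Model-based reference solution (assumes a and b are known).
    """
    # Stabilizing root of the scalar Riccati equation 2*a*p - (b*p)^2 / r + q = 0.
    p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    k = b * p / r  # optimal feedback is u* = -k * x
    return p, k


def advantage(x, u, a, b, q, r, p):
    """Textbook Hamiltonian-style advantage for V(x) = p*x^2.

    Equals running cost + dV/dt along (x, u); algebraically r * (u + k*x)^2.
    """
    return q * x * x + r * u * u + 2.0 * p * x * (a * x + b * u)


def saturate(u, u_max):
    """Clip the input to [-u_max, u_max] to mimic a constrained actuator."""
    return max(-u_max, min(u_max, u))


# Hypothetical plant and cost weights, chosen so the numbers come out round.
a, b, q, r = 1.0, 2.0, 3.0, 0.5
p, k = scalar_lqr(a, b, q, r)

x = 1.0
u_opt = -k * x
adv_opt = advantage(x, u_opt, a, b, q, r, p)      # zero at the optimal input
adv_zero = advantage(x, 0.0, a, b, q, r, p)       # positive for a suboptimal input
u_sat = saturate(u_opt, 1.0)                      # constrained version of u_opt
```

With these values the Riccati solution is `p = 0.75` and the gain is `k = 3`, so the unconstrained closed loop is `dx/dt = -5x`. Saturating the input at magnitude 1 changes the optimal control problem itself, which is why the constrained case in the paper needs separate treatment rather than simple clipping.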

Read Full Paper (arXiv.org)