Measuring General Associations in Time Series: An Adaptation and Empirical Evaluation of the CODEC Coefficient in Determining Autoregressive Dynamics
Juan Pablo Montaño, Mario E. Arrieta-Prieto
Published: 2025/9/7
Abstract
Identifying the number of lags to include in an autoregressive model remains an open research problem due to the computational burden of treating it as a hyperparameter, especially in complex models. This study explores model-agnostic association measures, including Pearson, Spearman, and an adaptation of the recently proposed conditional dependence coefficient (CODEC), for guiding lag selection in time series. We adapt and implement the CODEC-based Feature Ordering by Conditional Independence (CODEC-FOCI) algorithm and evaluate its performance through extensive simulations across linear, nonlinear, stationary, nonstationary, seasonal, and heteroskedastic processes. Results show that CODEC outperforms classical correlation-based measures in nonlinear and nonstationary settings, especially for large sample sizes. In contrast, Pearson performs better in purely linear models. Applications to benchmark datasets confirm that the CODEC approach identifies lag structures consistent with those reported in the literature. These findings highlight CODEC's potential as a practical, model-free tool for exploratory lag identification in time series analysis.
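To make the idea of scoring lags with model-agnostic association measures concrete, the following is a minimal sketch, not the adaptation or the CODEC-FOCI implementation evaluated in this study. It scores each candidate lag of a simulated AR(1) series with Pearson, Spearman, and the unconditional form of the Azadkia-Chatterjee coefficient as we understand it from the original CODEC proposal. The function names (`lag_scores`, `codec_unconditional`), the AR coefficient 0.8, the sample size, and the brute-force nearest-neighbor search are all illustrative assumptions.

```python
import numpy as np

def rank(a):
    # Ranks 1..n; assumes continuous data, so ties are not handled specially.
    return np.argsort(np.argsort(a)) + 1

def pearson(y, z):
    return float(np.corrcoef(y, z)[0, 1])

def spearman(y, z):
    # Spearman correlation is the Pearson correlation of the ranks.
    return pearson(rank(y), rank(z))

def codec_unconditional(y, z):
    # Assumed form: T_n(Y, Z) = sum_i (n * min(R_i, R_M(i)) - L_i^2) / sum_i L_i * (n - L_i),
    # where R_i = #{j : y_j <= y_i}, L_i = #{j : y_j >= y_i},
    # and M(i) indexes the nearest neighbor of z_i among the other points.
    n = len(y)
    r = rank(y)
    l = n - r + 1
    dist = np.abs(z[:, None] - z[None, :])  # O(n^2) brute force, fine for a sketch
    np.fill_diagonal(dist, np.inf)          # a point is not its own neighbor
    m = dist.argmin(axis=1)
    num = np.sum(n * np.minimum(r, r[m]) - l ** 2)
    den = np.sum(l * (n - l))
    return float(num / den)

def lag_scores(series, max_lag, measure):
    # Association between y_t and y_{t-k} for each candidate lag k.
    return {k: measure(series[k:], series[:-k]) for k in range(1, max_lag + 1)}

# Simulated AR(1) process: x_t = 0.8 * x_{t-1} + e_t (illustrative parameters).
rng = np.random.default_rng(0)
n = 1000
e = rng.standard_normal(n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.8 * x[t - 1] + e[t]

pearson_scores = lag_scores(x, 5, pearson)
spearman_scores = lag_scores(x, 5, spearman)
codec_scores = lag_scores(x, 5, codec_unconditional)
```

Each lag is scored marginally here, so an AR(1) series still shows nonzero association at lags beyond 1, much as an autocorrelation function would. A FOCI-style procedure would instead use the conditional version of the coefficient to ask what a new lag adds given the lags already selected; that step is not shown in this sketch.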