Deep learning for interval-censored failure time data from case-cohort studies
Yeyu Xiao, Yonghong Long
Published: 2025/9/26
Abstract
Interval-censored data are common in fields such as epidemiology and demography. When the failure event of interest is relatively rare and the collection of covariates is costly, researchers often adopt the case-cohort design to reduce study costs. However, existing methods typically assume that covariate effects are linear, an assumption that may fail to capture the complex, nonlinear relationships present in real data. To address this limitation, we consider a class of transformation models with unspecified covariate-dependent functions. We propose a sieve maximum weighted likelihood approach for interval-censored data arising from the case-cohort design, which combines deep neural networks with Bernstein polynomials. The method employs a deep neural network to flexibly represent the covariate-dependent function and uses Bernstein polynomials to approximate the cumulative baseline hazard function. We establish the consistency and convergence rate of the proposed estimator and show that the resulting nonparametric deep neural network estimator attains the minimax optimal rate of convergence (up to a polylogarithmic factor). Simulation studies suggest that the proposed method performs well in practice. Finally, we apply the method to a real dataset and use the SHAP (SHapley Additive exPlanations) approach to attribute the neural network predictions of the covariate-dependent function to individual covariates. The results indicate that our method is both accurate and interpretable.
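As a rough illustration of the two ingredients named in the abstract, the following Python sketch builds a monotone Bernstein-polynomial approximation of a cumulative baseline hazard and evaluates a weighted log-likelihood for interval-censored observations. It is not the authors' implementation. The function names, the exp-then-cumsum reparameterization of the coefficients, the logarithmic transformation family indexed by r, the form of the weights, and the use of a plain array g_x in place of the deep neural network output are all assumptions made for illustration.

```python
import numpy as np
from math import comb


def bernstein_cumhaz(t, phi, lower, upper):
    """Monotone Bernstein-polynomial approximation of a cumulative hazard.

    phi holds unconstrained parameters (hypothetical parameterization):
    cumulative sums of exp(phi) give positive, increasing coefficients,
    which makes the polynomial nondecreasing in t on [lower, upper].
    """
    t = np.asarray(t, dtype=float)
    m = len(phi) - 1                                  # polynomial degree
    u = np.clip((t - lower) / (upper - lower), 0.0, 1.0)
    coef = np.cumsum(np.exp(phi))                     # increasing coefficients
    basis = np.stack(
        [comb(m, k) * u**k * (1.0 - u) ** (m - k) for k in range(m + 1)],
        axis=-1,
    )
    return basis @ coef


def weighted_loglik(L, R, g_x, weights, phi, lower, upper, r=0.0):
    """Weighted log-likelihood for observations known to fail in (L, R].

    Assumed model: S(t | x) = exp(-G(Lambda(t) * exp(g(x)))), with
    G(v) = v for r == 0 (proportional hazards) and log(1 + r v) / r otherwise.
    L == 0 encodes left censoring, R == inf encodes right censoring.
    g_x stands in for the neural network output g(x).
    """
    L = np.asarray(L, dtype=float)
    R = np.asarray(R, dtype=float)
    finite = np.isfinite(R)

    def surv(t):
        v = bernstein_cumhaz(t, phi, lower, upper) * np.exp(g_x)
        G = v if r == 0 else np.log1p(r * v) / r
        return np.exp(-G)

    S_L = np.where(L <= 0, 1.0, surv(L))              # S(0) = 1
    S_R = np.where(finite, surv(np.where(finite, R, upper)), 0.0)  # S(inf) = 0
    contrib = np.log(np.clip(S_L - S_R, 1e-12, None))
    return float(np.sum(weights * contrib))


# Toy usage with made-up numbers (not data from the paper).
phi = np.array([-1.0, -0.5, 0.0, 0.2])
L = np.array([0.0, 1.0, 2.0])
R = np.array([1.5, 3.0, np.inf])
g_x = np.array([0.1, -0.2, 0.3])
weights = np.array([1.0, 1.0, 4.0])                   # hypothetical sampling weights
ll = weighted_loglik(L, R, g_x, weights, phi, lower=0.0, upper=5.0)
```

In a case-cohort analysis the weights would commonly be 1 for cases and the inverse of the subcohort sampling probability for sampled non-cases. The abstract does not state the exact weighting scheme, so that choice is likewise an assumption here.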