Efficient Log-Rank Updates for Random Survival Forests

Erik Sverdrup, James Yang, Michael LeBlanc

Published: 2025/10/4

Abstract

Random survival forests are widely used for estimating covariate-conditional survival functions under right-censoring. Their standard log-rank splitting criterion is typically recomputed at each candidate split. This O(M) cost per split, with M the number of distinct event times in a node, creates a bottleneck for large cohort datasets with long follow-up. We revisit approximations proposed by LeBlanc and Crowley (1995) and develop simple constant-time updates for the log-rank criterion. The method is implemented in grf and substantially reduces training time on large datasets while preserving predictive performance.
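To make the per-split cost concrete, here is a minimal sketch of the standard two-sample log-rank statistic for a single candidate split. It illustrates the baseline recomputation the abstract refers to, not the constant-time update developed in the paper and not the grf implementation. The function name `logrank_statistic` and its arguments are hypothetical, and the counts are recomputed naively for clarity rather than speed.

```python
def logrank_statistic(times, events, in_left):
    """Two-sample log-rank chi-square statistic for one candidate split.

    times:   observed follow-up times in the node
    events:  1 if the time is an event, 0 if right-censored
    in_left: True if the sample falls in the left child
    (All names are illustrative assumptions, not from the paper.)
    """
    # The M distinct event times in the node.
    event_times = sorted({t for t, e in zip(times, events) if e == 1})

    numerator = 0.0
    variance = 0.0
    for t in event_times:
        # Samples still at risk at time t, overall and in the left child.
        at_risk = sum(1 for s in times if s >= t)
        at_risk_left = sum(
            1 for s, left in zip(times, in_left) if s >= t and left
        )
        # Events occurring exactly at time t, overall and in the left child.
        deaths = sum(1 for s, e in zip(times, events) if s == t and e == 1)
        deaths_left = sum(
            1
            for s, e, left in zip(times, events, in_left)
            if s == t and e == 1 and left
        )

        # Observed minus expected events in the left child.
        numerator += deaths_left - at_risk_left * deaths / at_risk

        # Hypergeometric variance term; undefined when one sample is at risk.
        if at_risk > 1:
            frac = at_risk_left / at_risk
            variance += (
                frac * (1 - frac) * deaths * (at_risk - deaths) / (at_risk - 1)
            )

    # A split that puts everything in one child has zero variance.
    return numerator ** 2 / variance if variance > 0 else 0.0
```

The outer loop visits every distinct event time, so the statistic has to be rebuilt from scratch for each candidate split. That repeated pass over the M event times is the cost the paper's updates are designed to avoid.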
