Semi-supervised inference for treatment heterogeneity

Yilizhati Anniwaer, Yuqian Zhang

Published: 2025/9/5

Abstract

In causal inference, measuring treatment heterogeneity is crucial because it provides scientific insight into how treatments influence outcomes and guides personalized decision-making. In this work, we study semi-supervised settings where a labeled dataset is accompanied by a large unlabeled dataset, and consider two measures of treatment heterogeneity: the total treatment heterogeneity (TTH) and the explained treatment heterogeneity (ETH) of a simplified working model. We propose semi-supervised estimators for both quantities and demonstrate their improved robustness and efficiency compared with supervised methods. For ETH estimation, we show that direct semi-supervised approaches may result in efficiency loss relative to their supervised counterparts. To address this, we introduce a re-weighting strategy that assigns data-dependent weights to labeled and unlabeled samples to optimize efficiency. The proposed approach guarantees an asymptotic variance no larger than that of the supervised method, ensuring its safe use. We evaluate the performance of the proposed estimators through simulation studies and a real-data application based on an AIDS clinical trial.
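
The idea of a data-dependent weight that can never do worse than the supervised estimator can be illustrated on a much simpler problem than ETH. The sketch below is not the paper's estimator: it is a toy example for estimating a mean E[Y], assuming a labeled sample (y, g_lab) and an independent unlabeled sample g_unlab, where g is any working prediction of Y computed from covariates. The function name and all variable names are hypothetical. The weight w is chosen to minimize the estimated variance; w = 0 recovers the supervised mean, so the plug-in variance of the weighted estimator cannot exceed the supervised one.

```python
import numpy as np

def safe_semi_supervised_mean(y, g_lab, g_unlab):
    """Toy 'safe' semi-supervised estimate of E[Y] (illustrative only).

    y       : outcomes on the labeled sample (length n)
    g_lab   : working predictions g(X) on the labeled sample (length n)
    g_unlab : working predictions g(X) on the unlabeled sample (length N)
    """
    y = np.asarray(y, dtype=float)
    g_lab = np.asarray(g_lab, dtype=float)
    g_unlab = np.asarray(g_unlab, dtype=float)
    n, N = len(y), len(g_unlab)

    # Supervised estimator and its estimated variance.
    theta_sup = y.mean()
    var_sup = y.var(ddof=1) / n

    # Sample covariance between Y and g(X), and pooled variance of g(X).
    cov_yg = np.cov(y, g_lab, ddof=1)[0, 1]
    var_g = np.concatenate([g_lab, g_unlab]).var(ddof=1)

    # Variance of theta_sup - w * (mean(g_lab) - mean(g_unlab)) is
    #   Var(Y)/n - 2*w*Cov(Y,g)/n + w^2 * Var(g) * (1/n + 1/N),
    # which is minimized at the weight below. w = 0 is the supervised case.
    w = cov_yg / (var_g * (1.0 + n / N)) if var_g > 0 else 0.0

    theta_ss = theta_sup - w * (g_lab.mean() - g_unlab.mean())
    # Plug-in variance at the optimal w; w * cov_yg >= 0, so var_ss <= var_sup.
    var_ss = var_sup - w * cov_yg / n
    return theta_sup, var_sup, theta_ss, var_ss, w

# Usage on simulated data (the data-generating process is an assumption).
rng = np.random.default_rng(0)
x_lab, x_unlab = rng.normal(size=200), rng.normal(size=5000)
y = 1.0 + 2.0 * x_lab + rng.normal(size=200)
theta_sup, var_sup, theta_ss, var_ss, w = safe_semi_supervised_mean(y, x_lab, x_unlab)
print(f"supervised: {theta_sup:.3f} (var {var_sup:.5f})")
print(f"weighted:   {theta_ss:.3f} (var {var_ss:.5f}), w = {w:.3f}")
```

By contrast, fixing w = 1 regardless of the data (a "direct" use of the unlabeled sample in this toy setting) can inflate the variance when g is a poor predictor of Y, which is the kind of efficiency loss the re-weighting is meant to rule out.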
