Towards Data Drift Monitoring for Speech Deepfake Detection in the context of MLOps
Xin Wang, Wanying Ge, Junichi Yamagishi
Published: 2025/9/12
Abstract
When deployed in cloud applications or services, static speech deepfake detectors that are never updated become vulnerable to newly created speech deepfake attacks. From the perspective of machine learning operations (MLOps), this paper asks whether we can monitor new, unseen speech deepfake data that drifts away from a seen reference dataset. We further ask whether, once drift is detected, we can fine-tune the detector on similarly drifted data to reduce the drift and improve detection performance. On a toy dataset and the large-scale MLAAD dataset, we show that the drift caused by new text-to-speech (TTS) attacks can be monitored using distances between the distributions of the new data and the reference data. Furthermore, we demonstrate that fine-tuning the detector on data generated by the new TTS attacks can reduce both the drift and the detection error rates.
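To make the idea of monitoring drift through a distance between distributions concrete, the following is a minimal sketch, not the method used in the paper. It assumes that each utterance has been reduced to a single scalar (for example, a detector score) and compares the reference and new samples with the 1-D Wasserstein distance. The choice of distance, the function names `wasserstein_1d` and `drift_detected`, and the threshold value are all illustrative assumptions; the abstract does not specify which distance or features are used.

```python
import bisect
import random


def wasserstein_1d(ref, new):
    """Empirical 1-D Wasserstein-1 distance between two samples.

    Computed as the integral of |CDF_ref(x) - CDF_new(x)| over x.
    The use of this particular distance is an assumption for illustration.
    """
    ref_sorted = sorted(ref)
    new_sorted = sorted(new)
    all_vals = sorted(ref_sorted + new_sorted)
    dist = 0.0
    for left, right in zip(all_vals, all_vals[1:]):
        # Empirical CDF of each sample evaluated at the left edge of the interval.
        ref_cdf = bisect.bisect_right(ref_sorted, left) / len(ref_sorted)
        new_cdf = bisect.bisect_right(new_sorted, left) / len(new_sorted)
        dist += abs(ref_cdf - new_cdf) * (right - left)
    return dist


def drift_detected(ref, new, threshold):
    """Flag drift when the distance exceeds a threshold.

    The threshold is hypothetical and would need to be calibrated,
    for example on held-out data drawn from the reference distribution.
    """
    return wasserstein_1d(ref, new) > threshold


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-ins for scalar scores; not real detector outputs.
    reference = [rng.gauss(0.0, 1.0) for _ in range(500)]
    same_source = [rng.gauss(0.0, 1.0) for _ in range(500)]
    new_attack = [rng.gauss(2.0, 1.0) for _ in range(500)]

    print("same source:", wasserstein_1d(reference, same_source))
    print("new attack: ", wasserstein_1d(reference, new_attack))
    print("drift flagged:", drift_detected(reference, new_attack, threshold=0.5))
```

In this synthetic setting, data drawn from the same distribution as the reference yields a small distance, while the shifted sample yields a much larger one. In the scenario the abstract describes, one would expect the distance to shrink after the detector is fine-tuned on data from the new attack.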