A Benchmark Dataset for Satellite-Based Estimation and Detection of Rain

Simon Pfreundschuh, Malarvizhi Arulraj, Ali Behrangi, Linda Bogerd, Alan James Peixoto Calheiros, Daniele Casella, Neda Dolatabadi, Clement Guilloteau, Jie Gong, Christian D. Kummerow, Pierre Kirstetter, Gyuwon Lee, Maximilian Maahn, Lisa Milani, Giulia Panegrossi, Rayana Palharini, Veljko Petković, Soorok Ryu, Paolo Sanò, Jackson Tan

Published: 2025/9/10

Abstract

Accurately tracking the global distribution and evolution of precipitation is essential for both research and operational meteorology. Satellite observations remain the only means of achieving consistent, global-scale precipitation monitoring. While machine learning has long been applied to satellite-based precipitation retrieval, the absence of a standardized benchmark dataset has hindered fair comparisons between methods and limited progress in algorithm development. To address this gap, the International Precipitation Working Group has developed SatRain, the first AI-ready benchmark dataset for satellite-based detection and estimation of rain, snow, graupel, and hail. SatRain includes multi-sensor satellite observations representative of the major platforms currently used in precipitation remote sensing, paired with high-quality reference estimates from ground-based radars corrected using rain gauge measurements. It offers a standardized evaluation protocol to enable robust and reproducible comparisons across machine learning approaches. In addition to supporting algorithm evaluation, the diversity of sensors and inclusion of time-resolved geostationary observations make SatRain a valuable foundation for developing next-generation AI models to deliver more accurate, detailed, and globally consistent precipitation estimates.
