iDDS: Intelligent Distributed Dispatch and Scheduling for Workflow Orchestration

Wen Guan, Tadashi Maeno, Aleksandr Alekseev, Fernando Harald Barreiro Megino, Kaushik De, Edward Karavakis, Alexei Klimentov, Tatiana Korchuganova, FaHui Lin, Paul Nilsson, Torre Wenaus, Zhaoyu Yang, Xin Zhao

Published: 2025/10/3

Abstract

The intelligent Distributed Dispatch and Scheduling (iDDS) service is a versatile workflow orchestration system designed for large-scale, distributed scientific computing. iDDS extends traditional workload and data management by integrating data-aware execution, conditional logic, and programmable workflows, enabling automation of complex and dynamic processing pipelines. Originally developed for the ATLAS experiment at the Large Hadron Collider, iDDS has evolved into an experiment-agnostic platform that supports both template-driven workflows and a Function-as-a-Task model for Python-based orchestration. This paper presents the architecture and core components of iDDS, highlighting its scalability, modular message-driven design, and integration with systems such as PanDA and Rucio. We demonstrate its versatility through real-world use cases: fine-grained tape resource optimization for ATLAS, orchestration of large Directed Acyclic Graph (DAG) workflows for the Rubin Observatory, distributed hyperparameter optimization for machine learning applications, active learning for physics analyses, and AI-assisted detector design at the Electron-Ion Collider. By unifying workload scheduling, data movement, and adaptive decision-making, iDDS reduces operational overhead and enables reproducible, high-throughput workflows across heterogeneous infrastructures. We conclude with current challenges and future directions, including interactive, cloud-native, and serverless workflow support.