Cloud abstractions for AI workloads

Marco Canini, Theophilus A. Benson, Ricardo Bianchini, Íñigo Goiri, Dejan Kostić, Peter Pietzuch, Simon Peter

Published: 2025/1/16

Abstract

AI workloads, often hosted in multi-tenant cloud environments, require vast computational resources but suffer from inefficiencies caused by limited coordination between tenants and providers. Tenants lack insight into the infrastructure, while providers lack the workload details needed to optimize tasks such as partitioning, scheduling, and fault tolerance. We propose HarmonAIze to redefine cloud abstractions, enabling cooperative optimization for improved performance, efficiency, resiliency, and sustainability. We outline the key opportunities this vision opens and the challenges it faces.
