Tracing and Metrics Design Patterns for Monitoring Cloud-native Applications
Carlos Albuquerque, Filipe F. Correia
Published: 2025/10/3
Abstract
Observability helps ensure the reliability and maintainability of cloud-native applications. As software architectures become increasingly distributed and subject to change, it becomes a greater challenge to diagnose system issues effectively, often having to deal with fragmented observability and more difficult root cause analysis. This paper builds upon our previous work and introduces three design patterns that address key challenges in monitoring cloud-native applications. Distributed Tracing improves visibility into request flows across services, aiding in latency analysis and root cause detection, Application Metrics provides a structured approach to instrumenting applications with meaningful performance indicators, enabling real-time monitoring and anomaly detection, and Infrastructure Metrics focuses on monitoring the environment in which the system is operated, helping teams assess resource utilization, scalability, and operational health. These patterns are derived from industry practices and observability frameworks and aim to offer guidance for software practitioners.