Title
AWS re:Invent 2022 - Observability in the real world: Improve systems w/iterative approach (PRT263)
Summary
- Junaid Ahmed from Datadog discusses the importance of Application Performance Monitoring (APM) and observability in modern cloud environments.
- The talk covers the challenges developers and SREs face when monitoring complex systems, and argues that deep observability improves both engineering culture and business growth.
- Junaid introduces a four-step approach to improving observability: service instrumentation, metrics and traces, correlation, and discovery.
- He explains three low-effort service instrumentation techniques: eBPF, Kubernetes admission controllers, and OS runtime primitives.
- The talk includes demos showcasing how to instrument services using these techniques and how to gain insights from the collected data.
- Ariel Allen from Starbucks shares her experience with APM and observability, focusing on defining SLOs, service mapping, and the importance of partnership with Datadog.
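
The summary mentions defining SLOs but gives no details of how Starbucks or Datadog compute them. As a rough illustration only, the following sketch shows one common way to evaluate a request-based availability SLO and its error budget. The names `SloReport` and `evaluate_slo`, and the 99.9% target used in the usage note, are hypothetical and do not come from the talk or from any Datadog API.

```python
from dataclasses import dataclass


@dataclass
class SloReport:
    availability: float     # fraction of requests that were good
    error_budget: float     # allowed fraction of bad requests (1 - target)
    budget_consumed: float  # share of the error budget used (1.0 = fully spent)
    target_met: bool        # True when availability is at or above the target


def evaluate_slo(good: int, total: int, target: float) -> SloReport:
    """Evaluate a request-based SLO over one time window (hypothetical helper)."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= good <= total:
        raise ValueError("good must be between 0 and total")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be strictly between 0 and 1")

    availability = good / total
    error_budget = 1.0 - target
    bad_fraction = (total - good) / total
    return SloReport(
        availability=availability,
        error_budget=error_budget,
        budget_consumed=bad_fraction / error_budget,
        target_met=availability >= target,
    )
```

For example, `evaluate_slo(9995, 10000, 0.999)` describes a window in which 5 of 10,000 requests failed against a 99.9% target, which uses roughly half of the error budget for that window.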
Insights
- Deep Observability: Deep observability helps teams identify and resolve issues quickly, which in turn supports a robust engineering culture and frequent, confident deployments.
- Iterative Approach: The talk recommends building observability in stages, starting with basic service instrumentation and then adding metrics, tracing, and AI-powered insights, so that system monitoring and performance improve continuously.
- Low-Effort Instrumentation: Techniques such as eBPF, Kubernetes admission controllers, and OS runtime primitives point to a trend toward adding observability while minimizing changes to existing application code.
- Real-World Application: Ariel Allen's account of how Starbucks uses Datadog's APM tools shows how observability practices are applied in a large organization and which challenges arise, such as service mapping across multiple accounts and tracing end-to-end transactions.
- Vendor-Client Collaboration: Close collaboration between observability vendors such as Datadog and their clients helps improve features and address specific needs, as shown by Starbucks' feedback and enhancement requests.
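
The correlation step and the end-to-end tracing challenge mentioned above can be sketched in a few lines. This is a minimal illustration, assuming spans and logs are plain dictionaries that share a `trace_id` field. The field names (`trace_id`, `service`, `duration_ms`, `message`) and the helpers `correlate` and `slowest_span` are hypothetical and are not Datadog's data model or API.

```python
def correlate(spans, logs):
    """Group spans by trace_id and attach logs that carry the same trace_id."""
    traces = {}
    for span in spans:
        entry = traces.setdefault(span["trace_id"], {"spans": [], "logs": []})
        entry["spans"].append(span)
    for log in logs:
        trace_id = log.get("trace_id")
        # Logs without a trace_id, or with an unknown one, are skipped here.
        if trace_id in traces:
            traces[trace_id]["logs"].append(log)
    return traces


def slowest_span(trace):
    """Return the span with the longest duration within one correlated trace."""
    return max(trace["spans"], key=lambda span: span["duration_ms"])


spans = [
    {"trace_id": "t1", "service": "web", "duration_ms": 120.0},
    {"trace_id": "t1", "service": "db", "duration_ms": 300.0},
    {"trace_id": "t2", "service": "web", "duration_ms": 15.0},
]
logs = [
    {"trace_id": "t1", "message": "query timeout"},
    {"message": "log line without trace context"},
]

traces = correlate(spans, logs)
print(slowest_span(traces["t1"])["service"])
```

Joining on a shared identifier is what lets a slow request be followed from the span that took longest to the log lines emitted during that same request.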