Title
AWS re:Invent 2022 - How Discovery increased operational efficiency with AWS observability (COP201)
Summary
- Warner Bros. Discovery (WBD) faced challenges with operational intelligence due to a large and diverse technical footprint.
- WBD treats observability as crucial and defines it as the unification of all operationally relevant data.
- WBD started with fragmented observability solutions and lacked standardization, which led to inefficiencies.
- They introduced an Operational Metadata Specification (OMD) to standardize operational data.
- WBD partnered with AWS to create a singular observability platform using AWS services and open-source solutions.
- They implemented a templatized approach for logging and metrics, using Fluentd, Kinesis, OpenSearch, Grafana Agent, Amazon Managed Service for Prometheus (AMP), and Grafana.
- WBD achieved substantial progress in operational efficiency, scaling from 3 to 12 terabytes of log data ingestion per day and reducing query times.
- Future goals include converting logs to metrics, Kubernetes-based cost attribution, OpenTelemetry-backed distributed tracing, and OpenSearch cross-cluster search optimizations.
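The future goal of converting logs to metrics can be illustrated with a minimal sketch. It is not WBD's implementation: the function name `logs_to_metrics`, the JSON log format, and the field names `service` and `level` are all assumptions made for illustration. The idea is simply to aggregate individual log events into counters that a metrics backend could store far more cheaply than raw log lines.

```python
import json
from collections import Counter


def logs_to_metrics(log_lines):
    """Aggregate JSON log lines into counts keyed by (service, level)."""
    counts = Counter()
    for line in log_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            record = None
        if not isinstance(record, dict):
            # Count malformed lines instead of dropping them silently.
            counts[("unknown", "unparseable")] += 1
            continue
        # "service" and "level" are hypothetical field names, not taken
        # from the talk or from any published specification.
        service = record.get("service", "unknown")
        level = record.get("level", "unknown")
        counts[(service, level)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        '{"service": "player", "level": "error", "msg": "timeout"}',
        '{"service": "player", "level": "info", "msg": "started"}',
        '{"service": "player", "level": "error", "msg": "timeout"}',
    ]
    for (service, level), value in sorted(logs_to_metrics(sample).items()):
        # Emit one counter sample per label combination.
        print(f'log_events_total{{service="{service}",level="{level}"}} {value}')
```

In a real pipeline the aggregation would run continuously over a stream and reset or roll up per time window; this sketch only shows the counting step.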
Insights
- WBD's approach to observability emphasizes the importance of standardization and governance of operational data.
- Combining AWS services with open-source tools allowed WBD to streamline its observability processes and reduce the number of tools and interfaces engineers need to work with.
- The adoption of a templatized solution for logging and metrics demonstrates the benefits of a scalable and repeatable approach to observability.
- WBD's future goals suggest a trend towards real-time data processing and a focus on cost management and efficiency in observability practices.
- The session highlights the importance of managed services in reducing the operational burden and allowing companies to focus on their core business rather than managing infrastructure.
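The emphasis on standardization and governance of operational data can be sketched as a small validation step. This is an illustrative assumption only: the summary does not list the fields of WBD's Operational Metadata Specification, so the required keys below (`service`, `environment`, `team`) and the helper `missing_metadata` are hypothetical.

```python
# Hypothetical required keys; the actual specification's fields are not
# given in the summary.
REQUIRED_KEYS = frozenset({"service", "environment", "team"})


def missing_metadata(record):
    """Return the sorted required keys that are absent or empty in record."""
    return sorted(key for key in REQUIRED_KEYS if not record.get(key))


if __name__ == "__main__":
    event = {"service": "player", "environment": "prod"}
    gaps = missing_metadata(event)
    if gaps:
        print("rejecting event, missing metadata:", ", ".join(gaps))
    else:
        print("event accepted")
```

A check like this, applied where data enters the pipeline, is one way a shared specification could be enforced rather than merely documented.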