How Discovery Increased Operational Efficiency with AWS Observability (COP201)

Title

AWS re:Invent 2022 - How Discovery increased operational efficiency with AWS observability (COP201)

Summary

  • Warner Bros. Discovery (WBD) faced challenges with operational intelligence due to a large and diverse technical footprint.
  • Observability is crucial for WBD, and they define it as the unification of all operationally relevant data.
  • WBD started with fragmented observability solutions and lacked standardization, which led to inefficiencies.
  • They introduced an Operational Metadata Specification (OMD) to standardize operational data.
  • WBD partnered with AWS to create a singular observability platform using AWS services and open-source solutions.
  • They implemented a templatized approach for logging and metrics, using Fluentd, Kinesis, OpenSearch, Grafana Agent, Amazon Managed Service for Prometheus (AMP), and Grafana.
  • WBD achieved substantial progress in operational efficiency, scaling from 3 to 12 terabytes of log data ingestion per day and reducing query times.
  • Future goals include converting logs to metrics, Kubernetes-based cost attribution, OpenTelemetry-backed distributed tracing, and OpenSearch cross-cluster search optimizations.
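
To make two of the ideas above more concrete (standardized operational metadata and converting logs to metrics), here is a minimal Python sketch. It is not WBD's implementation. The field names (service, team, environment), the "omd" key, and the helper functions enrich and error_counts are all hypothetical, since this summary does not list the actual contents of the specification.

```python
from collections import Counter

# Hypothetical required metadata keys. The real OMD field list is not
# given in this summary, so these names are assumptions for illustration.
REQUIRED_FIELDS = ("service", "team", "environment")


def enrich(record: dict, metadata: dict) -> dict:
    """Return a copy of a log record with standardized metadata attached.

    Raises ValueError if any required metadata field is missing, so that
    non-conforming records are rejected before they are shipped.
    """
    missing = [key for key in REQUIRED_FIELDS if key not in metadata]
    if missing:
        raise ValueError(f"missing operational metadata: {missing}")
    return {**record, "omd": dict(metadata)}


def error_counts(records) -> Counter:
    """Derive a simple metric from logs: ERROR-level record count per service."""
    counts = Counter()
    for record in records:
        if record.get("level") == "ERROR":
            counts[record["omd"]["service"]] += 1
    return counts


if __name__ == "__main__":
    metadata = {"service": "playback", "team": "video", "environment": "prod"}
    logs = [
        enrich({"level": "ERROR", "msg": "stream failed"}, metadata),
        enrich({"level": "INFO", "msg": "stream started"}, metadata),
        enrich({"level": "ERROR", "msg": "timeout"}, metadata),
    ]
    for service, count in error_counts(logs).items():
        print(service, count)
```

Because every record carries the same metadata keys, a per-service aggregation like error_counts can be written once and reused across teams, which is the kind of repeatability the templatized approach aims for.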

Insights

  • WBD's approach to observability emphasizes the importance of standardization and governance of operational data.
  • The use of AWS services and open-source tools allowed WBD to streamline their observability processes and reduce the number of tools and interfaces engineers need to interact with.
  • The adoption of a templatized solution for logging and metrics demonstrates the benefits of a scalable and repeatable approach to observability.
  • WBD's future goals suggest a trend towards real-time data processing and a focus on cost management and efficiency in observability practices.
  • The session highlights the importance of managed services in reducing the operational burden and allowing companies to focus on their core business rather than managing infrastructure.