An Introduction to Open-Source Monitoring with AWS (OPN201)

Title

AWS re:Invent 2022 - An introduction to open-source monitoring with AWS (OPN201)

Summary

  • Speakers: Imayakumar Jagannathan (Principal Solutions Architect, AWS) and Vikram Venkatraman (Principal Solutions Architect, AWS).
  • Topics Covered:
    • Evolution from monolithic to microservices architecture and the complexity it brings to observability.
    • Importance of observability in understanding system states through external signals.
    • The role of metrics, logs, traces, and profiles in observability.
    • Open-source tools for observability and their relevance.
    • Introduction to OpenTelemetry as a vendor-neutral way to collect observability data.
    • AWS's contribution to OpenTelemetry and the AWS Distro for OpenTelemetry (ADOT).
    • Discussion on Prometheus for metrics collection and Amazon Managed Service for Prometheus.
    • Overview of other open-source tools like Jaeger for tracing, Pixie for profiling, and OpenSearch for logs.
    • The challenge of day-two operations and the introduction of AWS managed services for open-source tools.
    • Visualization of observability data using Grafana and Amazon Managed Grafana.
    • Patterns of observability adoption by customers.
    • Dimensions to consider when choosing observability tools: workload, capacity, ease of use, and hybrid support.
    • AWS Observability Accelerator as a tool to implement observability on Amazon EKS.
    • AWS Observability Workshop for hands-on experience with AWS monitoring services.

Insights

  • Observability Challenges:

    • Transitioning from monolithic to microservices architectures has increased the complexity of observability.
    • The ephemeral nature of containers and serverless architectures necessitates robust observability tools to troubleshoot issues effectively.
  • OpenTelemetry:

    • OpenTelemetry provides a unified way to collect metrics, logs, and traces, making it easier to switch between vendors and avoid vendor lock-in.
    • AWS actively contributes to OpenTelemetry and provides ADOT, which gives customers a secure and supported way to implement OpenTelemetry.
  • Prometheus and Managed Services:

    • Prometheus is a popular choice for metrics collection, especially in Kubernetes environments.
    • Amazon Managed Service for Prometheus addresses the scalability and management challenges of running Prometheus at scale.
  • Other Open-Source Tools:

    • Jaeger is highlighted for tracing, Pixie for profiling using eBPF, and OpenSearch for logs.
    • Self-managing these tools brings day-two operations challenges (ongoing work such as scaling, patching, and upgrades), which AWS managed services aim to alleviate.
  • Visualization and Alerting:

    • Grafana is emphasized for its ability to transform telemetry data into actionable dashboards and alerts.
    • Amazon Managed Grafana provides a managed solution for customers who prefer not to handle the operational overhead.
  • Customer Adoption Patterns:

    • Some customers prefer auto-instrumentation and out-of-the-box solutions for observability to focus on their core business logic.
    • Others make strategic decisions based on the platform they use, such as Amazon EKS, and choose tools that integrate well with that platform.
  • Choosing the Right Tools:

    • AWS suggests considering dimensions such as workload, capacity, ease of use, and hybrid support when selecting observability tools.
    • AWS is exploring the development of a recommendation engine to help customers choose the most suitable observability services.
  • AWS Observability Accelerator:

    • This open-source tool helps customers quickly implement observability on Amazon EKS by deploying ADOT, provisioning Amazon Managed Service for Prometheus, and setting up Amazon Managed Grafana with curated dashboards.
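
The vendor-neutral collection idea noted under OpenTelemetry, and the Prometheus metrics it can feed, can be illustrated with a small sketch. This is a toy model in plain Python, not the OpenTelemetry SDK or ADOT: the names Counter, Exporter, PrometheusTextExporter, and JsonLinesExporter are hypothetical and exist only for this example. The one real-world detail it relies on is the Prometheus text exposition format (HELP and TYPE comment lines followed by name{label="value"} samples). The point is that instrumentation code records a metric once, and the backend can be swapped without touching that code.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, Protocol, Tuple

# A label set is stored as a sorted tuple of (key, value) pairs so it can be a dict key.
LabelSet = Tuple[Tuple[str, str], ...]


class Exporter(Protocol):
    """Hypothetical backend interface (an assumption, not an OpenTelemetry class)."""

    def export(self, name: str, help_text: str, samples: Dict[LabelSet, float]) -> str: ...


@dataclass
class Counter:
    """A metric that only increases, tracked separately per label set."""

    name: str
    help_text: str
    samples: Dict[LabelSet, float] = field(default_factory=dict)

    def add(self, amount: float = 1.0, **labels: str) -> None:
        if amount < 0:
            raise ValueError("counters can only increase")
        key = tuple(sorted(labels.items()))
        self.samples[key] = self.samples.get(key, 0.0) + amount

    def export_with(self, exporter: Exporter) -> str:
        # Instrumentation stays the same; only the exporter changes per backend.
        return exporter.export(self.name, self.help_text, self.samples)


def escape_label_value(value: str) -> str:
    """Escape backslash, double quote, and newline, as the Prometheus text format requires."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


class PrometheusTextExporter:
    """Renders samples in the Prometheus text exposition format."""

    def export(self, name: str, help_text: str, samples: Dict[LabelSet, float]) -> str:
        lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
        for labels, value in sorted(samples.items()):
            if labels:
                rendered = ",".join(f'{k}="{escape_label_value(v)}"' for k, v in labels)
                lines.append(f"{name}{{{rendered}}} {value}")
            else:
                lines.append(f"{name} {value}")
        return "\n".join(lines) + "\n"


class JsonLinesExporter:
    """Stand-in for any other backend: one JSON object per sample."""

    def export(self, name: str, help_text: str, samples: Dict[LabelSet, float]) -> str:
        return "\n".join(
            json.dumps({"metric": name, "labels": dict(labels), "value": value})
            for labels, value in sorted(samples.items())
        )


# Record once, export to two different backends.
requests = Counter("http_requests_total", "Total HTTP requests handled.")
requests.add(method="GET", status="200")
requests.add(method="GET", status="200")
requests.add(method="POST", status="500")

prom_text = requests.export_with(PrometheusTextExporter())
json_text = requests.export_with(JsonLinesExporter())
print(prom_text)
print(json_text)
```

In a real deployment the recording side would be the OpenTelemetry API and the export side a collector such as ADOT writing to Amazon Managed Service for Prometheus; the sketch only mirrors that separation of concerns.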