Running High-Throughput, Real-Time Ad Platforms in the Cloud (ADM302)

Title

AWS re:Invent 2022 - Running high-throughput, real-time ad platforms in the cloud (ADM302)

Summary

  • The session covered lessons learned from building high-throughput, real-time ad platforms capable of processing over 1 trillion transactions per day.
  • Victor Gershkovich from AppsFlyer and Akhil Andapalli from AWS presented the session.
  • AppsFlyer moved from a classic EC2 deployment to EKS, adopted Graviton instances for cost and performance, and gave developers self-serve tools for managing infrastructure.
  • The presenters stressed the importance of a stateless architecture, Spot Instances, and Graviton processors for cost efficiency.
  • They shared best practices for reducing costs and improving performance, including minimizing boot time, using attribute-based instance selection, and monitoring Spot placement scores.
  • The session also covered the benefits of using Graviton processors, including significant cost savings and reduced energy consumption.
  • Tips for optimizing code and infrastructure were provided, such as using fast JSON libraries, optimizing web server scalability, and employing non-blocking tasks to reduce latency.
  • The session concluded with the announcement of an open-source real-time bidding application and an invitation for feedback.
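
The non-blocking-task tip above can be illustrated with a minimal Python sketch. It is not the presenters' code: the handler, the bid markup, and the audit-log function are hypothetical names invented for illustration. The idea is to build the response on the hot path and schedule slow side effects, such as logging, as a background task instead of awaiting them.

```python
import asyncio


async def write_audit_log(record: dict, sink: list) -> None:
    """Hypothetical slow side effect, e.g. shipping a log record."""
    await asyncio.sleep(0.05)  # stands in for network or disk I/O
    sink.append(record)


async def handle_bid(request: dict, sink: list, background: set) -> dict:
    """Build the response first, then schedule logging without awaiting it."""
    # Assumed pricing rule for illustration only: bid 10% above the floor.
    response = {"id": request["id"], "price": round(request["floor"] * 1.1, 4)}
    task = asyncio.create_task(write_audit_log(response, sink))
    background.add(task)  # keep a reference so the task is not garbage-collected
    task.add_done_callback(background.discard)
    return response


async def main() -> tuple:
    sink, background = [], set()
    response = await handle_bid({"id": "r1", "floor": 1.0}, sink, background)
    logged_before_drain = len(sink)  # the log write has not run yet
    await asyncio.gather(*background)  # drain pending work, e.g. at shutdown
    return response, logged_before_drain, sink


response, logged_before_drain, sink = asyncio.run(main())
```

The caller gets its response before the log write completes, so the slow I/O no longer adds to request latency. The trade-off is that background work must be drained or otherwise accounted for at shutdown, which the sketch does with `asyncio.gather`.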

Insights

  • Transitioning to a stateless architecture and separating processors from data producers can significantly improve scalability and reduce costs.
  • Using Spot Instances and Graviton processors can lead to substantial cost savings, with Graviton offering up to 40% better price performance than comparable x86 instances.
  • Optimizing code by replacing standard libraries with more efficient alternatives can drastically reduce latency and increase throughput.
  • Implementing non-blocking tasks and optimizing data streams can further reduce latency and improve overall system performance.
  • The use of open-source tools and community contributions, such as the TerraCrust API for Terraform, can simplify infrastructure management for developers.
  • The session highlighted the importance of continuous optimization and experimentation to achieve the best performance and cost efficiency in cloud-based ad platforms.
  • The announcement of an open-source real-time bidding application indicates AWS's commitment to sharing knowledge and tools with the wider community, encouraging collaboration and innovation.
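
As a hedged illustration of replacing a standard library with a faster alternative, the sketch below wraps JSON encoding behind two small functions. The summary does not name the library the presenters used; `orjson` is an assumption chosen here as one widely used fast JSON package, and the code falls back to the standard `json` module when it is not installed. The sample bid payload and its field names are also hypothetical.

```python
import json

try:
    import orjson  # third-party and optional; an assumption, not named in the session

    def encode(obj) -> bytes:
        return orjson.dumps(obj)  # orjson returns bytes directly

    def decode(data: bytes):
        return orjson.loads(data)

except ImportError:

    def encode(obj) -> bytes:
        # Compact separators drop the whitespace json.dumps adds by default.
        return json.dumps(obj, separators=(",", ":")).encode("utf-8")

    def decode(data: bytes):
        return json.loads(data)


# Hypothetical bid request used only to exercise the round trip.
bid = {"id": "r1", "imp": [{"id": "1", "bidfloor": 0.5}], "tmax": 100}
payload = encode(bid)
roundtrip = decode(payload)
```

Keeping the codec behind `encode` and `decode` lets the rest of the service stay unchanged while libraries are swapped and benchmarked, which fits the emphasis on continuous experimentation. Any actual speedup depends on payload shape and should be measured rather than assumed.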