Serverless Data Streaming: Amazon Kinesis Data Streams and AWS Lambda (COM308)

Title

AWS re:Invent 2023 - Serverless data streaming: Amazon Kinesis Data Streams and AWS Lambda (COM308)

Summary

  • The talk focused on the integration of Amazon Kinesis Data Streams and AWS Lambda, emphasizing the importance of understanding failures in complex systems.
  • The speaker, Anahit, a lead cloud software engineer and AWS Data Hero, shared personal experiences and insights.
  • Anahit discussed the "storage-first" pattern using Kinesis for data capture and Lambda for processing, highlighting the simplicity and popularity of this architecture.
  • A real-life story was shared where data loss occurred in a seemingly well-functioning system, leading to an investigation into Kinesis and Lambda's behavior.
  • Kinesis Data Streams is a managed, scalable service for streaming data, with features like data retention and replay, but it requires shard management for scalability.
  • Two capacity modes for Kinesis were discussed: provisioned capacity mode (manual scaling) and on-demand capacity mode (auto-scaling with limitations).
  • Writing data to Kinesis can be done individually or in batches, with the latter being more efficient but requiring careful handling of partial failures and retries.
  • Event Source Mapping is a key component when Lambda consumes a stream: it reads records in batches, tracks progress through the stream, and handles errors.
  • The speaker emphasized the importance of proper error handling, retry strategies, and monitoring to avoid data loss and unnecessary costs.
  • Event Source Mapping features like batch bisecting, partial success handling, and failure destinations were discussed as ways to improve reliability.
  • The talk concluded with a call to embrace failures as learning opportunities and to be prepared for them in distributed systems.
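
The point about partial failures in batch writes can be sketched as follows. This is a minimal illustration, not the speaker's code: it assumes a boto3-style Kinesis client whose `put_records` call returns `FailedRecordCount` plus one result entry per submitted record, in request order. The function name, attempt count, and delay values are hypothetical.

```python
import random
import time


def put_records_with_retry(client, stream_name, records,
                           max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Send a batch with PutRecords and resend only the entries that failed.

    Returns the records that were still unsent after max_attempts, so the
    caller can log them or route them elsewhere instead of losing them.
    """
    pending = list(records)
    for attempt in range(max_attempts):
        response = client.put_records(StreamName=stream_name, Records=pending)
        # A successful HTTP response can still contain failed entries.
        if response.get("FailedRecordCount", 0) == 0:
            return []
        # Result entries line up with the request; failures carry an ErrorCode.
        pending = [
            record
            for record, result in zip(pending, response["Records"])
            if "ErrorCode" in result
        ]
        if attempt < max_attempts - 1:
            # Exponential backoff with jitter (illustrative values).
            sleep(random.uniform(0, base_delay * (2 ** attempt)))
    return pending
```

Resending the whole batch instead of only the failed entries would duplicate the records that already succeeded, which is why the sketch filters on the per-record result.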

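Partial success handling on the consumer side can be sketched like this. The handler assumes the event source mapping has `ReportBatchItemFailures` enabled in its `FunctionResponseTypes`; the `process` function and its validation rule are hypothetical stand-ins for real business logic.

```python
import base64
import json


def process(payload):
    """Hypothetical business logic: reject payloads without an 'id' field."""
    if "id" not in payload:
        raise ValueError("missing id")


def handler(event, context):
    """Process Kinesis records in order and report the first failure."""
    for record in event["Records"]:
        sequence_number = record["kinesis"]["sequenceNumber"]
        try:
            # Kinesis record data arrives base64-encoded in the Lambda event.
            payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
            process(payload)
        except Exception:
            # Stop at the first failed record so that only it and the records
            # after it are retried, not the ones already processed.
            return {"batchItemFailures": [{"itemIdentifier": sequence_number}]}
    # An empty list tells Lambda the whole batch succeeded.
    return {"batchItemFailures": []}
```

Without this response shape, a single bad record causes the entire batch to be retried, which is one way retries end up costing more than expected.
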
Insights

  • Understanding the underlying mechanisms of AWS services is crucial for building reliable and efficient systems, especially when dealing with distributed systems where failures are common.
  • Kinesis Data Streams' on-demand capacity mode, while offering auto-scaling, can be significantly more expensive than a well-sized stream in provisioned capacity mode, highlighting the need for cost-benefit analysis when choosing between the two.
  • Proper error handling and retry strategies are essential when working with batch operations in distributed systems to prevent data loss and excessive costs due to retries.
  • AWS SDK defaults for retries and timeouts can be dangerous if left unexamined; they should be set explicitly for the specific use case to keep processing close to real time and costs under control.
  • Event Source Mapping's advanced features can greatly enhance Lambda's ability to process Kinesis data streams by providing more granular control over error handling and record processing.
  • The speaker's emphasis on embracing failures and learning from them underscores the importance of a resilient mindset when working with cloud services and distributed architectures.
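
The insight about SDK defaults can be made concrete with a small sketch. It assumes the Python SDK (boto3 with botocore's `Config`), which the talk summary does not specify; the timeout and attempt values are illustrative, and `worst_case_wait_seconds` is a hypothetical helper that gives only a rough bound because it ignores backoff delays between attempts.

```python
# Illustrative values only; tune them to your own latency budget.
CLIENT_SETTINGS = {
    "connect_timeout": 1,  # seconds to wait when opening a connection
    "read_timeout": 1,     # seconds to wait for a response
    "retries": {"total_max_attempts": 3, "mode": "standard"},
}


def worst_case_wait_seconds(settings):
    """Rough upper bound on time spent waiting, ignoring backoff delays."""
    attempts = settings["retries"]["total_max_attempts"]
    return attempts * (settings["connect_timeout"] + settings["read_timeout"])


def make_kinesis_client(region_name):
    """Build a Kinesis client with explicit timeouts and retry limits."""
    # Imported lazily so the settings can be inspected without boto3 installed.
    import boto3
    from botocore.config import Config

    return boto3.client(
        "kinesis", region_name=region_name, config=Config(**CLIENT_SETTINGS)
    )
```

Working out the bound before deploying makes it easier to check that a single stalled call cannot consume most of a Lambda invocation's time budget.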