Title
AWS re:Invent 2023 - Dive deep on Amazon S3 (STG314)
Summary
- Amy Therrien and Seth Markle, both with over a decade of experience with Amazon S3, presented a deep dive into S3's history, threat modeling, and how S3 has evolved to handle unexpected scale and threats.
- S3 has been in operation for over 17 years, and the team has used that experience to make its systems more resilient and to mitigate threats proactively.
- The team applies Amazon's document-writing culture to threat modeling, writing documents that lay out potential threats and their mitigations.
- They discussed the importance of multipart uploads, byte-range GETs, and multi-value answer DNS for leveraging S3's scale and handling greater-than-expected customer usage.
- The AWS Common Runtime (CRT) was introduced as a mitigation for software complexity and potential bugs; it packages best practices for multipart uploads, byte-range GETs, and DNS usage.
- Mountpoint for Amazon S3, a FUSE-based file client, was discussed, including new features such as caching, container support, and support for the new high-performance storage class, S3 Express One Zone.
- The indexing system of S3 was explained, highlighting its ability to handle over 350 trillion objects and 100 million requests per second, and how customers can use prefix strategies for scaling.
- Seth Markle discussed S3's durability, including the 11 nines (99.999999999%) durability design, end-to-end checksumming, erasure coding, and replication across Availability Zones.
- The new S3 Express One Zone storage class was introduced, offering high performance with the trade-off that data is stored in a single Availability Zone.
- The talk concluded with a discussion on the importance of building for exceptional cases and the organizational focus on robust and correct software development.
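The multipart upload and byte-range GET points above both rest on the same idea: split one large object into independent byte ranges that can be transferred in parallel. A minimal sketch of that splitting step follows. The function names and the 8 MiB part size are illustrative assumptions, not details taken from the talk, and the code only computes ranges; it does not call S3.

```python
# Hypothetical helpers: compute the byte ranges a client could fetch
# (or upload as parts) in parallel. No S3 calls are made here.

def split_into_ranges(object_size: int, part_size: int) -> list[tuple[int, int]]:
    """Return inclusive (start, end) byte offsets that cover the whole object."""
    if object_size < 0 or part_size <= 0:
        raise ValueError("object_size must be >= 0 and part_size must be > 0")
    ranges = []
    start = 0
    while start < object_size:
        # The last range may be shorter than part_size.
        end = min(start + part_size, object_size) - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def range_header(start: int, end: int) -> str:
    """Format an inclusive byte range as an HTTP Range header value."""
    return f"bytes={start}-{end}"


if __name__ == "__main__":
    MiB = 1024 * 1024
    # Assumed sizes for illustration: a 20 MiB object in 8 MiB parts.
    for start, end in split_into_ranges(20 * MiB, 8 * MiB):
        print(range_header(start, end))
```

Run as a script, this prints three ranges: `bytes=0-8388607`, `bytes=8388608-16777215`, and `bytes=16777216-20971519`. Each range can be requested on its own connection and retried on its own if it fails, which is the kind of best practice the summary attributes to the Common Runtime.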
Insights
- S3's evolution from a reactive to a proactive service highlights the importance of continuous learning and improvement in cloud services.
- The use of threat modeling and documentation is a key practice within the S3 team, ensuring that potential issues are identified and mitigated before they impact customers.
- The multipart upload feature is crucial for optimizing data transfer speeds and reliability, especially for large objects.
- The common runtime library encapsulates best practices and mitigations for software complexity, which is a valuable tool for developers to ensure high performance and reliability.
- The introduction of Mountpoint for Amazon S3 and its new features demonstrates AWS's commitment to providing versatile and high-performance storage solutions.
- The explanation of S3's indexing system and prefix strategies provides valuable insights for customers on how to effectively scale their applications using S3.
- The discussion on S3's durability, including the 11 nines design and zone replication, underscores the robustness of AWS's storage infrastructure.
- The new S3 Express One Zone storage class targets high-performance applications, but its data resides in a single Availability Zone, a resilience trade-off that customers need to weigh when choosing a storage class.
- The emphasis on building for exceptional cases and the organizational focus on robust software development practices are key takeaways for any organization looking to build resilient cloud services.
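The prefix strategies mentioned above concern how object keys are laid out so that request load spreads across S3's index. The summary does not say which strategy the speakers recommended, so the sketch below shows one commonly used approach as an assumption: prepending a short hash so that keys sharing a leading component, such as a date, fan out over several prefixes. The `spread_key` helper and the key layout are hypothetical.

```python
import hashlib
from collections import Counter


def spread_key(key: str, hex_digits: int = 1) -> str:
    """Prepend a short, deterministic hash so keys fan out across prefixes.

    With hex_digits=1 there are 16 possible leading prefixes ("0" to "f").
    """
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return f"{digest[:hex_digits]}/{key}"


if __name__ == "__main__":
    # Hypothetical date-first keys, which would otherwise share one prefix.
    keys = [f"2023/11/28/log-{i:04d}.json" for i in range(1000)]
    per_prefix = Counter(spread_key(k).split("/")[0] for k in keys)
    for prefix, count in sorted(per_prefix.items()):
        print(prefix, count)
```

The mapping is deterministic, so a reader can recompute the full key from the original name. The cost is that keys are no longer ordered by date, which makes listing a single day's objects less convenient.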