Title

AWS re:Invent 2023 - Achieve high performance consistency using Amazon EBS (STG331)

Summary

Mark Olson, a Senior Principal Software Engineer, and Vienna Chen, a Principal Product Manager, discuss improving storage performance with Amazon EBS.
They emphasize that storage performance is one part of overall system performance.
Queueing theory and Little's Law are introduced to understand system capacity and concurrency.
Real-world examples, such as airport traffic, are used to illustrate queueing and system capacity.
The presenters delve into the technical aspects of I/O operations, file systems, and block sizes.
They discuss the Nitro system and its role in delivering high-performance, secure I/O operations.
SRD (Scalable Reliable Datagram) is introduced as a network protocol designed for AWS data centers to deliver lower latency and higher throughput.
EBS volume types are explained, with a focus on gp3 and io2 Block Express volumes.
The importance of choosing the right EC2 instance based on dedicated EBS bandwidth is highlighted.
CloudWatch metrics are recommended for monitoring EBS performance at the volume, instance, and application levels.
Benchmarking is suggested to understand performance needs, with tools like iostat and blktrace recommended for Linux users.
Considerations for running SQL and NoSQL databases on EBS are discussed, including high availability, backup strategies, and cost implications.
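
The Little's Law relationship mentioned in this summary can be sketched in a few lines. This is a minimal illustration, not material from the talk: the function names and the sample numbers are hypothetical, and it assumes a steady-state system where average concurrency equals throughput multiplied by average latency.

```python
# Little's Law: L = lambda * W
#   L      = average number of I/Os in flight (queue depth)
#   lambda = throughput in I/Os per second (IOPS)
#   W      = average time each I/O spends in the system (latency)
# Function names and sample values are hypothetical, for illustration only.


def required_queue_depth(target_iops: float, latency_ms: float) -> float:
    """Average outstanding I/Os needed to sustain target_iops at latency_ms."""
    return target_iops * latency_ms / 1000


def achievable_iops(queue_depth: float, latency_ms: float) -> float:
    """Throughput reachable with a fixed queue depth at latency_ms."""
    return queue_depth * 1000 / latency_ms


if __name__ == "__main__":
    # Assumed example: sustaining 16,000 IOPS at 1 ms average latency.
    print(required_queue_depth(16_000, 1.0))
    # Assumed example: a single-threaded writer (queue depth 1) at 0.5 ms.
    print(achievable_iops(1, 0.5))
```

Read this way, an application that issues one I/O at a time is bounded by latency alone, so reaching a volume's provisioned IOPS generally requires enough concurrent requests in flight.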

Insights

Understanding and applying queueing theory, such as Little's Law, can help optimize storage performance by managing concurrency and contention.
The Nitro system's role in EBS performance is critical, as it provides a secure and efficient way to handle I/O operations.
SRD protocol is a significant innovation for AWS, offering improved performance and reliability over traditional TCP in AWS's controlled network environment.
The transition of all io2 volumes to deliver the same performance as io2 Block Express volumes is a notable enhancement, providing consistent sub-millisecond latency.
The choice of EBS volume type and EC2 instance size can significantly impact the performance and cost-effectiveness of AWS workloads.
CloudWatch metrics are essential for monitoring and understanding EBS performance, allowing for informed decisions on volume and instance adjustments.
Benchmarking is crucial for assessing the performance of AWS services against specific workloads, and tools like iostat and blktrace can provide valuable insights into I/O patterns.
For database applications, the choice of storage and instance configuration affects performance, availability, and cost. Options like SQL Server Failover Cluster Instances offer new possibilities for high availability without the need for expensive enterprise licenses.
EBS's ability to write larger I/Os in an all-or-nothing fashion lets database engines disable double-write protection, which improves transaction performance and reduces write latency.
Running NoSQL databases on EBS volumes can offer a balance between performance, durability, and cost, with the added benefit of easy backup and recovery through EBS snapshots.
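
As a rough sketch of how volume-level CloudWatch metrics can be turned into the numbers discussed above, the helpers below derive average read latency and average IOPS from period sums. They assume you have already retrieved the Sum statistic of the EBS metrics VolumeTotalReadTime (seconds) and VolumeReadOps (count) for the same period; the function names and sample values are hypothetical, and fetching the metrics is left out.

```python
from typing import Optional

# Hypothetical helpers: inputs are Sum statistics for one CloudWatch period.
# VolumeTotalReadTime is assumed to be in seconds, VolumeReadOps a count.


def average_read_latency_ms(total_read_time_s: float, read_ops: float) -> Optional[float]:
    """Average time per read in milliseconds, or None for an idle period."""
    if read_ops == 0:
        return None  # no reads completed, so latency is undefined
    return total_read_time_s * 1000 / read_ops


def average_iops(ops: float, period_s: float) -> float:
    """Average operations per second over the period."""
    return ops / period_s


if __name__ == "__main__":
    # Assumed sample: 30,000 reads taking 15 s in total over a 60 s period.
    print(average_read_latency_ms(15.0, 30_000))
    print(average_iops(30_000, 60))
```

Comparing these derived values against the volume's provisioned IOPS and the instance's dedicated EBS bandwidth is one way to judge whether the volume or the instance is the limiting factor.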
