Title
AWS re:Invent 2023 - Advanced data modeling with Amazon DynamoDB (DAT410)
Summary
- DynamoDB Basics: Key terms include Table, Item, Primary Key, and Attributes. DynamoDB tables are separate resources, and each item must have a unique primary key. DynamoDB is schemaless except for the primary key. There are two types of primary keys: simple and composite.
- Primary Key Design: Critical for success with DynamoDB. Data is partitioned across storage nodes, and the partition key determines the partition an item belongs to. This design ensures consistent performance at scale.
- Multi-Tenant Architecture: DynamoDB runs on infrastructure shared across a whole region, which sets it apart from most other databases. This architecture is what lets the service absorb enormous scale and operational complexity.
- Unique Aspects of DynamoDB: Every request is an index hit, and reads are eventually consistent by default. Pricing is operation-based: you are charged for the read and write capacity units that your requests consume.
- Advanced Data Modeling Patterns:
  - Search Flight Options: Use domain knowledge to build an efficient flight search that covers both direct and connecting flights. Consider downloading static data to the compute layer for efficiency.
  - Booking Flights: Use condition expressions to enforce constraints, and structure items so that each operation can act on them directly. Use DynamoDB transactions for atomic operations and AWS Step Functions for long-running transactions.
  - Complex Filtering: For small datasets, overfetch and filter client-side. For larger items, use reduced projections in secondary indexes. For very large datasets or full-text search, consider an external system such as OpenSearch.
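The booking pattern above (condition expressions combined with a transaction) can be sketched as a single request. This is an illustration under stated assumptions, not the schema used in the talk: the table name `Flights`, the `PK`/`SK` key layout, and the attributes `SeatsAvailable` and `BookedBy` are all hypothetical. The function only builds the parameter dictionary for the `TransactWriteItems` operation (for example, to pass to boto3's `client.transact_write_items(**params)`); it does not call AWS.

```python
def build_booking_transaction(flight_id: str, seat: str, customer_id: str) -> dict:
    """Build TransactWriteItems parameters that book one seat atomically.

    Table, key, and attribute names are hypothetical, chosen for illustration.
    """
    flight_key = {"PK": {"S": f"FLIGHT#{flight_id}"}, "SK": {"S": "METADATA"}}
    seat_item = {
        "PK": {"S": f"FLIGHT#{flight_id}"},
        "SK": {"S": f"SEAT#{seat}"},
        "BookedBy": {"S": customer_id},
    }
    return {
        "TransactItems": [
            {
                # Decrement the seat counter only while seats remain.
                "Update": {
                    "TableName": "Flights",
                    "Key": flight_key,
                    "UpdateExpression": "SET SeatsAvailable = SeatsAvailable - :one",
                    "ConditionExpression": "SeatsAvailable >= :one",
                    "ExpressionAttributeValues": {":one": {"N": "1"}},
                }
            },
            {
                # Create the seat item only if nobody has booked it yet.
                "Put": {
                    "TableName": "Flights",
                    "Item": seat_item,
                    "ConditionExpression": "attribute_not_exists(PK)",
                }
            },
        ]
    }
```

If either condition fails, DynamoDB cancels the whole transaction, so a seat is never recorded without the counter being decremented, or the other way round.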
Insights
- DynamoDB's Design Philosophy: The design of DynamoDB emphasizes direct interaction with the database's infrastructure, forcing developers to think about data access patterns and primary key design upfront. This approach can lead to more efficient and scalable applications.
- Operational Simplicity: DynamoDB's multi-tenant architecture and operation-based pricing model simplify operational management for developers. It abstracts away the complexities of scaling, partitioning, and infrastructure maintenance.
- Secondary Indexes and Projections: The use of secondary indexes and projections can significantly optimize read operations and costs. Careful consideration of which attributes to include in secondary indexes can lead to more efficient data retrieval patterns.
- Integration with Other AWS Services: DynamoDB integrates with other AWS services such as Lambda, Step Functions, and EventBridge, which allows for flexible and powerful data processing workflows. Developers can use this integration to build complex applications that react to data changes in real time.
- Handling Complex Filtering: For complex filtering requirements, DynamoDB may not always be the best tool. Developers should evaluate the use of external systems and ensure that they keep the operational burden to a minimum by offloading only the necessary operations.
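For the small-dataset case, the "overfetch and filter client-side" approach might look like the sketch below. It assumes the items have already been fetched and deserialized into plain dictionaries (for instance, the `Items` list of each page of a Query response), and the attribute names `Price` and `Stops` are hypothetical. The helper itself makes no DynamoDB calls.

```python
from typing import Any, Callable, Iterable, Iterator

Item = dict[str, Any]
Predicate = Callable[[Item], bool]


def filter_client_side(
    pages: Iterable[list[Item]], predicates: list[Predicate]
) -> Iterator[Item]:
    """Yield only the items that satisfy every predicate.

    `pages` stands in for the pages of an overfetching Query; each page
    is a list of already-deserialized items.
    """
    for page in pages:
        for item in page:
            if all(predicate(item) for predicate in predicates):
                yield item


# Example: two fetched pages, filtered on price and number of stops.
pages = [
    [{"Price": 120, "Stops": 0}, {"Price": 300, "Stops": 1}],
    [{"Price": 90, "Stops": 2}],
]
cheap_and_short = list(
    filter_client_side(
        pages,
        [lambda item: item["Price"] < 200, lambda item: item["Stops"] <= 1],
    )
)
```

Because the filtering happens after the read, every fetched item still consumes read capacity, which is why this approach suits small result sets only.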