Title
AWS re:Invent 2023 - What’s new in AWS Lake Formation (ANT303)
Summary
- Presenters: Leon Stichter (Product Management for Lake Formation and Glue Data Catalog) and Pravajan Narayanaswamy (Software Delivery Manager), with a guest appearance by Brad Alford from Duke Energy.
- Key Points:
- AWS Lake Formation and Glue Data Catalog have seen significant feature updates.
- Four major categories of updates: Discover and Secure, Connect and Share, Scale and Optimize, Audit and Monitor.
- Integration with AWS IAM Identity Center for identity-based data access.
- Cross-region capabilities for data sharing and access.
- New features for AWS Glue Crawlers, including bring-your-own driver versions, partition indexes, and integration with Lake Formation.
- Support for open table formats such as Apache Hudi, Apache Iceberg, and Linux Foundation Delta Lake.
- Hybrid access mode and fine-grained permissions on nested data types.
- Lake Formation tag-based access control and tag delegation for decentralized architectures.
- General availability of AWS Glue Data Catalog statistics for cost-based optimizers.
- Automated compaction for Apache Iceberg Tables.
- New integrations with partners like Collibra, Dremio, and Privacera.
- Brad Alford from Duke Energy shared their data journey using AWS Lake Formation and Glue Data Catalog, highlighting their data mesh approach and future plans for automation and integration with new Lake Formation features.
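To make the tag-based access control point above concrete, here is a minimal sketch of how an LF-tag expression is evaluated against a resource's tags: every key in the expression must match (AND), and any one of the listed values for that key is sufficient (OR). The helper names `matches` and `build_grant_request`, the tag keys, and the role ARN are hypothetical. The dictionary returned by `build_grant_request` is shaped like the keyword arguments of boto3's `lakeformation.grant_permissions`, but no AWS call is made here.

```python
from typing import Dict, List

# An LF-tag expression is a list of clauses. Each clause names one tag key
# and the values that are acceptable for it.
Expression = List[Dict[str, object]]


def matches(resource_tags: Dict[str, str], expression: Expression) -> bool:
    """Return True if the resource's tags satisfy every clause (AND across keys,
    OR across the values listed for each key)."""
    return all(
        resource_tags.get(clause["TagKey"]) in clause["TagValues"]
        for clause in expression
    )


def build_grant_request(principal_arn: str, expression: Expression,
                        permissions: List[str]) -> dict:
    """Build arguments shaped like boto3's lakeformation.grant_permissions.

    Illustrative only: this function just assembles a dictionary.
    """
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "LFTagPolicy": {"ResourceType": "TABLE", "Expression": expression}
        },
        "Permissions": permissions,
    }


# Hypothetical tags and principal, for illustration.
expression = [
    {"TagKey": "domain", "TagValues": ["sales", "marketing"]},
    {"TagKey": "sensitivity", "TagValues": ["public"]},
]
sales_table = {"domain": "sales", "sensitivity": "public"}
hr_table = {"domain": "hr", "sensitivity": "public"}

request = build_grant_request(
    "arn:aws:iam::111122223333:role/ExampleAnalystRole", expression, ["SELECT"]
)
```

Because the grant is attached to the expression rather than to individual tables, any table later tagged `domain=sales, sensitivity=public` would be covered without a new grant, which is what makes this approach scale.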
Insights
- Integration with IAM Identity Center: This integration allows for seamless identity-based access control, improving security and auditability by tying actions directly to individual user identities rather than just IAM roles.
- Cross-region and Cross-account Capabilities: These features facilitate a more flexible and distributed data architecture, allowing organizations to manage data governance across different geographical locations and business units.
- Open Table Formats Support: The support for open table formats like Apache Hudi, Apache Iceberg, and Delta Lake indicates AWS's commitment to letting customers handle transactional workloads and schema evolution within their data lakes.
- Tag-based Access Control and Delegation: This feature simplifies permission management at scale and supports decentralized data governance models, which are becoming increasingly important in complex, multi-team environments.
- Automated Compaction for Iceberg Tables: This feature addresses the small file problem in data lakes, optimizing query performance without manual intervention.
- Partner Ecosystem Expansion: The announcement of new integrations with partners like Collibra, Dremio, and Privacera shows AWS's focus on interoperability and on giving customers a choice of tools that best fit their needs.
- Duke Energy's Use Case: The detailed account of Duke Energy's use of AWS Lake Formation and Glue Data Catalog provides a real-world example of how enterprises use AWS services for data governance and management. Their plans to further automate and to integrate with new Lake Formation features suggest confidence in the AWS ecosystem as a foundation for their long-term data strategy.
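The small file problem mentioned under automated compaction can be illustrated with a short sketch. The function below greedily groups undersized files into batches that each fit within a target size, which is the general idea behind compaction. It is not a description of how the managed Glue compaction for Iceberg tables is implemented; the function name `plan_compaction`, the 512 MB target, and the file sizes are assumptions chosen for illustration.

```python
from typing import List


def plan_compaction(file_sizes_mb: List[int], target_mb: int = 512) -> List[List[int]]:
    """Greedily group small files into batches of at most target_mb each.

    Files already at or above the target are left alone, and batches that
    would contain a single file are dropped because rewriting them gains nothing.
    """
    batches: List[List[int]] = []
    current: List[int] = []
    current_size = 0
    # Visit larger files first so each batch fills up quickly.
    for size in sorted(file_sizes_mb, reverse=True):
        if size >= target_mb:
            continue  # already large enough
        if current and current_size + size > target_mb:
            batches.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        batches.append(current)
    return [batch for batch in batches if len(batch) > 1]


# Hypothetical file sizes in MB.
plan = plan_compaction([600, 300, 200, 100, 50], target_mb=512)
```

In this example five data files would become three after compaction: the 600 MB file is untouched, and the four smaller files are rewritten as two larger ones, so a query engine has fewer files to open and plan over.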