Solving Large-Scale Data Access Challenges with Amazon S3 (STG328)

Title

AWS re:Invent 2022 - Solving large-scale data access challenges with Amazon S3 (STG328)

Summary

  • Presenter: Becky Weiss, Senior Principal Engineer at AWS, specializing in cloud security.
  • Focus: Strategies for controlling access to Amazon S3 data at large scale, and the use cases each strategy fits.
  • Key Points:
    • IAM basics and policy authorizations for S3 data access.
    • Strategies for scaling access control beyond IAM basics.
    • Use of S3 access points for managing numerous discrete access patterns.
    • IAM session broker pattern for dynamic access control with business logic.
    • AWS Lake Formation for structured data access and permissions at the schema level.
  • Strategies Discussed:
    • IAM role-based access control (RBAC) and tagging strategies.
    • S3 access points for granular access control: up to 10,000 access points per account per Region (the default quota), each with its own policy.
    • IAM session broker for dynamic and business logic-driven access patterns.
    • Lake Formation for structured data access with SQL-like permissions.
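
To make the access point strategy concrete, here is a minimal sketch of what one access point's policy might look like when it lets a single IAM role read a single prefix. The Region, account ID, access point name, role name, and prefix are all hypothetical values chosen for illustration; they do not come from the session.

```python
import json


def access_point_policy(region, account_id, access_point, role_arn, prefix):
    """Build a policy for one access point that lets one role read one prefix."""
    ap_arn = f"arn:aws:s3:{region}:{account_id}:accesspoint/{access_point}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": "s3:GetObject",
                # Objects reached through an access point are addressed as
                # <access point ARN>/object/<key>.
                "Resource": f"{ap_arn}/object/{prefix}/*",
            }
        ],
    }


# Hypothetical example values (assumptions, not from the talk).
policy = access_point_policy(
    region="us-east-1",
    account_id="111122223333",
    access_point="finance-reports",
    role_arn="arn:aws:iam::111122223333:role/FinanceAnalyst",
    prefix="reports/finance",
)
print(json.dumps(policy, indent=2))
```

Because every access point has a policy of its own, each discrete access pattern can live in a separate small document like this one rather than in a single shared bucket policy.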

Insights

  • IAM Basics: IAM is the foundation for controlling access to AWS services, including S3, so understanding it comes before any of the scaling strategies.
  • Scale Dimensions: Different strategies are required based on the scale dimensions of the use case, such as the number of prefixes, IAM roles, or dynamic access patterns.
  • S3 Access Points: Each access point carries its own policy, so access rules that would not fit in a single bucket policy can be spread across many access points. This suits use cases with a large number of discrete, static access patterns.
  • IAM Session Broker Pattern: This pattern suits dynamic access control, where proxying every request to S3 through an application would not scale. Instead, a broker applies business logic and issues temporary, finely scoped IAM credentials that the caller then uses against S3 directly.
  • Lake Formation: It's a specialized service for managing access to structured data in S3, allowing permissions to be set at the column and row level, which is not possible with IAM alone.
  • Service Integration: Athena, Lake Formation, and the Glue Data Catalog work together: tables are defined in the Data Catalog, Lake Formation holds the permissions on them, and Athena queries run against only the data the caller is permitted to see.
  • Security and Scalability: The session stressed balancing security with scalability. No single mechanism fits every case; the choice among IAM policies, access points, a session broker, and Lake Formation depends on the complexity and scale of the data access requirements.
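
The session broker pattern described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the presenter's implementation: the function names, the role, the bucket, and the "one prefix per user" rule are hypothetical. The sketch relies on the STS AssumeRole `Policy` parameter (a session policy), which narrows the returned credentials to the intersection of the role's permissions and the policy passed in. The STS client is taken as a parameter, so the code runs without any AWS SDK installed.

```python
import json


def session_policy_for(bucket, prefix):
    """Session policy limiting credentials to objects under one prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            }
        ],
    }


def issue_credentials(sts_client, role_arn, bucket, user_id, allowed_prefix):
    """Return temporary credentials scoped to the prefix the caller may read.

    `allowed_prefix` stands in for the outcome of the broker's business
    logic (hypothetical here: each user maps to exactly one prefix).
    """
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName=f"broker-{user_id}",
        Policy=json.dumps(session_policy_for(bucket, allowed_prefix)),
        DurationSeconds=900,  # 15 minutes, the shortest lifetime STS accepts
    )
    return response["Credentials"]
```

With boto3 installed and credentials configured, `sts_client` would be `boto3.client("sts")`. The caller receives the temporary credentials and talks to S3 directly, so the broker handles only the credential request and stays out of the data path.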
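
For the Lake Formation point, a column-level permission can be expressed as a grant on a table restricted to named columns. The sketch below only builds the request that the Lake Formation `GrantPermissions` API expects; the principal, database, table, and column names are hypothetical, and row-level filtering (which uses separate data filters) is not shown.

```python
def column_grant_request(principal_arn, database, table, columns):
    """Keyword arguments for a Lake Formation grant of SELECT on named columns."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                "ColumnNames": list(columns),
            }
        },
        "Permissions": ["SELECT"],
    }


# Hypothetical example values (assumptions, not from the talk).
request = column_grant_request(
    principal_arn="arn:aws:iam::111122223333:role/SalesAnalyst",
    database="sales",
    table="orders",
    columns=["order_id", "order_date"],
)
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("lakeformation").grant_permissions(**request)
```

A principal granted only these columns could then query `order_id` and `order_date` through Athena, while the table's other columns stay hidden from it.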