Title: AWS re:Inforce 2024 - Keeping people away from data: A generative AI use case (GAI326)
Insights:
- Generative AI and Data Security: The session focuses on the intersection of generative AI and data security, in particular how to manage the data behind generative AI applications securely.
- Partnership Announcement: OpenAI and Apple have announced a partnership to integrate generative AI into Apple devices, raising questions about data security and management.
- Customer Concerns: Common questions from customers include whether there is a need to store data for generative AI applications and what controls are necessary if data storage is required.
- General Architecture: The typical architecture for generative AI applications involves Amazon SageMaker, Amazon OpenSearch Service, and Amazon S3, with data moving between these and other AWS services.
- Data Storage Best Practices: If data storage is not necessary, it is recommended to delete the data to minimize exposure. If storage is required, implementing robust controls is essential.
- Identity-Based Controls: Fine-grained access controls using AWS IAM roles, tags, and groups are crucial for managing who can access specific data within S3 buckets.
- Data Protection Controls: AWS services such as AWS KMS for encryption at rest, AWS Certificate Manager (ACM) for the TLS certificates that protect data in transit, and Amazon Comprehend for PII redaction are recommended for securing data.
- Defense in Depth: Combining identity-based controls with data protection measures is advised to create a comprehensive security strategy.
- Auditability and Traceability: For regulated industries, demonstrating auditability and traceability is critical. AWS CloudTrail with trusted identity propagation helps track user actions across services.
- Trusted Identity Propagation: This feature allows sharing user identity context across AWS services, enhancing the ability to trace and audit user actions.
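
The identity-based and data protection controls above can be combined in a single policy document. The following is a minimal sketch, not the presenters' implementation: it builds an identity-based IAM policy as a Python dictionary. The function name, the bucket name, and the `project` tag key are hypothetical; the condition keys (`s3:ExistingObjectTag`, `aws:PrincipalTag`, `s3:x-amz-server-side-encryption`, `aws:SecureTransport`) are standard IAM and S3 condition keys.

```python
import json


def build_defense_in_depth_policy(bucket: str, tag_key: str = "project") -> dict:
    """Build an illustrative IAM policy that layers identity and data protection controls.

    `bucket` and `tag_key` are placeholders chosen for this sketch.
    """
    bucket_arn = f"arn:aws:s3:::{bucket}"
    objects_arn = f"{bucket_arn}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Identity-based control: allow reads only when the object's tag
                # matches the same tag on the calling principal.
                "Sid": "AllowReadWhenTagsMatch",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": objects_arn,
                "Condition": {
                    "StringEquals": {
                        f"s3:ExistingObjectTag/{tag_key}": f"${{aws:PrincipalTag/{tag_key}}}"
                    }
                },
            },
            {
                # Data protection control: reject uploads that do not request SSE-KMS.
                "Sid": "DenyUploadsWithoutKmsEncryption",
                "Effect": "Deny",
                "Action": "s3:PutObject",
                "Resource": objects_arn,
                "Condition": {
                    "StringNotEquals": {"s3:x-amz-server-side-encryption": "aws:kms"}
                },
            },
            {
                # Data protection control: reject any request not sent over TLS.
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Action": "s3:*",
                "Resource": [bucket_arn, objects_arn],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_defense_in_depth_policy("example-genai-data"), indent=2))
```

Note that the second statement also denies uploads that omit the encryption header and rely on bucket default encryption, so a real deployment may need to adjust it.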
Quotes:
- "Basically, that's a way of democratizing generative AI. But what it's made me start thinking is data security, right?"
- "As much as possible, you want to keep people away from data."
- "If the data is not needed, get away of it. Get it away."
- "We recommend defense in depth here, which means it is very, very fine for you to build both data and data protection, both identity and data protection controls."
- "You want to ensure you have a human in that loop when you're walking through, you know, manipulating your data."
- "With trusted identity propagation, we are going to provide access to AWS resources based on the user's identity context."
- "Data not needed, keep it away from people by deleting them. Those that are needed, you want to start designing controls."
- "You want to use CloudTrail with the new feature of trusted identity propagation to be able to audit and trace users' actions when they are interacting with your data."
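
To illustrate the auditing idea in the quotes above, here is a minimal sketch that groups already-exported CloudTrail records by the end user they were performed on behalf of. It assumes each record is a parsed JSON dictionary and that, when trusted identity propagation is in use, the user context appears under `userIdentity.onBehalfOf.userId`. The function name and the sample records are hypothetical and were not shown in the session.

```python
from collections import defaultdict


def actions_by_user(records):
    """Group (eventTime, eventSource, eventName) tuples by propagated user ID.

    Records without an `onBehalfOf` block (no propagated identity) are skipped.
    """
    grouped = defaultdict(list)
    for record in records:
        on_behalf_of = record.get("userIdentity", {}).get("onBehalfOf")
        if not on_behalf_of:
            continue
        grouped[on_behalf_of["userId"]].append(
            (record["eventTime"], record["eventSource"], record["eventName"])
        )
    return dict(grouped)


# Hypothetical sample records, trimmed to the fields used above.
SAMPLE_RECORDS = [
    {
        "eventTime": "2024-06-10T15:00:00Z",
        "eventSource": "s3.amazonaws.com",
        "eventName": "GetObject",
        "userIdentity": {"type": "AssumedRole", "onBehalfOf": {"userId": "user-1111"}},
    },
    {
        "eventTime": "2024-06-10T15:05:00Z",
        "eventSource": "s3.amazonaws.com",
        "eventName": "PutObject",
        "userIdentity": {"type": "AssumedRole"},
    },
    {
        "eventTime": "2024-06-10T15:10:00Z",
        "eventSource": "kms.amazonaws.com",
        "eventName": "Decrypt",
        "userIdentity": {"type": "AssumedRole", "onBehalfOf": {"userId": "user-1111"}},
    },
]

if __name__ == "__main__":
    for user_id, actions in actions_by_user(SAMPLE_RECORDS).items():
        print(user_id, actions)
```

Grouping by the propagated user rather than by the assumed role is what makes the trail attributable to a person, which is the point of the auditability requirement for regulated industries.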