Title: AWS re:Inforce 2024 - Securing generative AI: Privacy and compliance considerations (GAI222)
Insights:
- Introduction to Generative AI Privacy and Compliance: The session focuses on privacy and compliance considerations for generative AI applications, emphasizing the importance of understanding data usage, storage, and regulatory requirements.
- Scoping Matrix Overview: The Scoping Matrix categorizes generative AI applications into five scopes:
  - Scope 1: Consumer apps (e.g., ChatGPT) that are free and publicly available.
  - Scope 2: Enterprise applications with standard SLAs and terms (e.g., CodeWhisperer, Salesforce).
  - Scope 3: Custom applications using pre-trained models (e.g., Bedrock).
  - Scope 4: Fine-tuned models with added specific data (e.g., SageMaker JumpStart).
  - Scope 5: Self-trained models using proprietary data, requiring significant resources.
- Data Access and Usage: Critical questions include who has access to the data, how prompts and responses are used (for example, to train models or improve the service), and where the data is stored. Data residency regulations are crucial, especially for paid services.
- Regulatory and Compliance Considerations: Organizations must ensure compliance with data storage, processing, and access control regulations, particularly when building custom applications.
- Data Source and Pipeline Management: Understanding the source and legality of data used for fine-tuning models is essential. Organizations must ensure data pipelines are compliant and secure.
- Self-Trained Models: Building self-trained models involves significant responsibility, including data sourcing, validation, and ensuring compliance with regulatory requirements.
- Legal and Compliance Support: AWS can assist with compliance journeys but cannot provide legal compliance guarantees. Organizations should involve legal counsel in their AI projects.
- Emerging Regulations: Key themes in emerging AI regulations include data privacy, transparency, automated decision-making, regulatory classification, profiling, and safety.
- Data Privacy: Avoid recording unnecessary data, especially protected information. Legal counsel can help determine the necessity of data for AI training.
- Transparency and Explainability: Regulators expect clear explanations of data sources and model training processes. Disclosure of AI interactions is necessary.
- Automated Decision-Making: AI systems making decisions with legal impacts must include human intervention and provide rights of appeal.
- Regulatory Classification: Certain AI workloads are banned in the EU, such as unfiltered scanning of mass CCTV surveillance or facial recognition by organizations that are not law enforcement. High-risk processing requires careful legal review.
- Profiling: Using protected characteristics for decision-making is problematic. Organizations must ensure tight data controls and compliance.
- Safety: AI systems impacting life or property require rigorous testing and independent verification. Executive orders and emerging standards emphasize safety.
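
The scoping matrix and the "don't record unnecessary data" advice can be sketched in a few lines of Python. This is a minimal illustration, not something shown in the session: the enum member names, the `needs_data_source_review` helper, and the record keys are all hypothetical. Only the five scopes and the protected categories (religion, ethnicity, trade union membership) come from the summary itself.

```python
from enum import IntEnum


class Scope(IntEnum):
    """The five scopes from the summary; member names are illustrative."""
    CONSUMER_APP = 1
    ENTERPRISE_APP = 2
    PRETRAINED_MODEL = 3
    FINE_TUNED_MODEL = 4
    SELF_TRAINED_MODEL = 5


# Hypothetical record keys for the protected categories named in the session.
PROTECTED_FIELDS = frozenset({"religion", "ethnicity", "trade_union_membership"})


def needs_data_source_review(scope: Scope) -> bool:
    """Assumption: scopes 4 and 5 add the organization's own data to a model,
    so the source and legality of that data should be reviewed."""
    return scope >= Scope.FINE_TUNED_MODEL


def strip_protected_fields(record: dict) -> dict:
    """Return a copy of the record without protected fields, so they are
    never written to logs or training sets."""
    return {k: v for k, v in record.items() if k not in PROTECTED_FIELDS}


record = {"user_id": "u-123", "prompt": "Summarize my contract", "religion": "n/a"}
cleaned = strip_protected_fields(record)
print(cleaned)
print(needs_data_source_review(Scope.FINE_TUNED_MODEL))
```

Whether a given field is actually unnecessary for a workload is a legal question rather than a coding one; a filter like this only enforces a decision that counsel has already made.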
Quotes:
- "How do we help customers with their use cases, defining what they are, and focusing those conversations when it comes to privacy and compliance and data considerations?"
- "Who's got access to your data? How are they using your prompts and responses? Are they using them to train the models? Are they using them to improve in a service?"
- "You really want to understand where your data is coming from. Can you use it? Are you allowed to use it? The outputs in certain contexts?"
- "We often get this question from customers around, do you think my workload is compliant? And the answer to that is actually very simple. We kind of can't help you with that."
- "Don't record unnecessary data. So what we mean by that is in situations where you may be recording protected information about people such as their religion, their ethnicity, trade union membership."
- "A regulator will expect you, particularly when you're doing what's called a high-risk workload, to have a really good way of explaining where your data came from and how you trained your model."
- "Expecting people to have a right of appeal because when you have an AI system making decisions that impact people's lives there's got to be some sort of right of appeal for that person."
- "In the EU we have banned workloads. Example, suppose you are doing unfiltered scanning of mass CCTV surveillance or facial recognition and you're not law enforcement. Well guess what, in the EU you don't do that."
- "Using those characteristics is going to be problematic in the EU, unless you've got very tight controls around your data pipelines and can demonstrate to regulators you've got security on your data."
- "It's very expensive to unwind it at that point. Get them involved at some point in your brilliant new start-up idea, or if you're a very mature enterprise, you'll probably have a lot of resources in the space to help you out."