Title

AWS re:Invent 2023 - Data patterns for generative AI applications (DAT338)

Summary

Siva Raghupathy and Vlad Vlasceanu presented on data patterns for generative AI (Gen AI) applications on AWS.
The session focused on data as the differentiator in Gen AI applications, covering both structured and unstructured data.
Three main patterns for feeding data into Gen AI systems were discussed: Contextual Engineering with Retrieval Augmented Generation (RAG), fine-tuning foundation models with cleansed and labeled data, and building custom models with curated data.
The importance of data lakes, data warehouses, ETL pipelines, stream processing, data cataloging, data quality, and data governance was highlighted.
The session covered the architecture of a classic Gen AI application, including the use of AWS services like Amazon DynamoDB, Amazon DocumentDB, Amazon MemoryDB, Amazon Kendra, Amazon OpenSearch Service, and Amazon SageMaker.
The presenters discussed the importance of vector embeddings and vector data stores, comparing Amazon RDS for PostgreSQL (using the pgvector extension) against Amazon OpenSearch Service.
A customer use case from CS Disco was shared, demonstrating the application of these concepts in a real-world scenario.
The session concluded with insights on evolving data strategies to accommodate Gen AI, including considerations for security, compliance, and unified data views.
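
To make the vector embedding and vector data store discussion concrete, here is a minimal sketch of the similarity search that such a store performs. It is not taken from the session: the in-memory dictionary stands in for a real store such as pgvector or OpenSearch, the document IDs and three-dimensional vectors are hypothetical, and a plain linear scan replaces the approximate indexes that real stores use.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction; treat it as unrelated
    return dot / (norm_a * norm_b)


def top_k(query, store, k=2):
    """Return the k (doc_id, score) pairs most similar to the query vector."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in store.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings; real embeddings have hundreds or thousands of dimensions.
store = {
    "doc-billing": [0.9, 0.1, 0.0],
    "doc-shipping": [0.1, 0.9, 0.1],
    "doc-returns": [0.2, 0.8, 0.3],
}
query = [0.1, 0.9, 0.0]  # assumed to be the embedding of the user's question

results = top_k(query, store, k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

A managed vector store does the same ranking, but with an index so that it does not have to scan every vector on each query.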

Data is the key differentiator in Gen AI applications, and how it is structured, stored, and utilized can significantly impact the effectiveness of the application.
AWS provides a comprehensive set of services that can be used to build and manage the data infrastructure required for Gen AI applications.
Contextual Engineering with RAG is the easiest pattern to start with for incorporating data into Gen AI applications, as it does not require deep machine learning expertise.
Fine-tuning and building custom models are more complex and resource-intensive but can lead to more specialized and effective Gen AI applications.
The choice of vector data store (e.g., Amazon RDS for PostgreSQL with the pgvector extension vs. Amazon OpenSearch Service) should be based on familiarity, ease of implementation, scalability, performance, flexibility, and cost.
Data governance and security become more complex with Gen AI applications, as they involve new types of interactions with data and require expanded controls.
The session stressed that backend processes such as data ingestion, storage, processing, and governance deserve attention, because the front-end application depends on them to function effectively.
AWS encourages the use of its services for building Gen AI applications but also supports the integration of third-party solutions where appropriate.
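
The RAG pattern described above can be sketched as a prompt-assembly step: retrieved passages are placed in front of the user's question so that the foundation model answers from them. This is a minimal illustration, not the presenters' implementation. The function name `build_rag_prompt`, the `max_chars` budget, the prompt wording, and the sample passages are all assumptions, and the call to an actual model is left out.

```python
def build_rag_prompt(question, passages, max_chars=500):
    """Assemble a prompt that grounds the model in retrieved passages.

    Passages are added in order until the character budget would be exceeded.
    """
    context_lines = []
    used = 0
    for i, passage in enumerate(passages, start=1):
        line = f"[{i}] {passage}"
        if used + len(line) > max_chars:
            break  # stop before overflowing the context budget
        context_lines.append(line)
        used += len(line)
    context = "\n".join(context_lines)
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Hypothetical passages, assumed to come back from a vector data store lookup.
passages = [
    "Returns are accepted within 30 days of delivery.",
    "Refunds are issued to the original payment method.",
]
question = "What is the return window?"

prompt = build_rag_prompt(question, passages)
short_prompt = build_rag_prompt(question, passages, max_chars=60)
print(prompt)
```

Because the model itself is untouched, this pattern needs no training step, which is consistent with the point that it is the easiest of the three patterns to start with.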