Title
AWS re:Invent 2023 - LLM inference and fine-tuning on secure enterprise data (AIM208)
Summary
- Speaker: Miles Adkins, a data cloud principal for AI/ML workloads at Snowflake.
- Snowflake Overview: Traced Snowflake's evolution from startup to its 2020 IPO, addressing on-prem data warehouse bottlenecks and poor data lake performance by building on public cloud infrastructure, and growing into a full-fledged data platform known as the Snowflake Data Cloud.
- AI in Seconds: Snowflake's approach to delivering LLM-enabled experiences with managed complexity and operations, including Document AI, Universal Search, and Snowflake Copilot.
- Applications in Minutes: Introduction of Snowflake Cortex, a collection of AI and LLM serverless functions, vector support, and Streamlit integration for building LLM applications.
- Customization in Hours: Snowpark Container Services for accessing GPUs, managed Kubernetes layer for model fine-tuning, and partner ecosystem for deploying proprietary commercial LLM applications.
- Demos: Showcased Snowflake Copilot for generating SQL from natural language, translation and summarization functions, and building a chat application using retrieval augmented generation.
- Fine-Tuning and Deployment: Demonstrated fine-tuning an embedding model on a GPU and deploying it as a real-time inference endpoint within Snowflake.
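The chat-application demo follows the standard retrieval augmented generation pattern: embed the question, retrieve the most similar documents, and assemble them into a prompt for an LLM. The sketch below illustrates that flow with toy stand-ins; all names are illustrative assumptions, not Snowflake's Cortex API, which would supply the real embedding, vector search, and completion functions.

```python
# Illustrative RAG flow: toy embedding + cosine-similarity retrieval + prompt
# assembly. In the talk's demo, Snowflake Cortex would handle the embedding,
# vector search, and LLM completion steps.
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": bag-of-words term counts (stand-in for a real model).
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, docs: list[str], k: int = 2) -> list[str]:
    # Rank documents by similarity to the question; keep the top k.
    q = embed(question)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(question: str, docs: list[str]) -> str:
    # Retrieved context plus the question; this prompt would then be sent
    # to an LLM completion function for the final grounded answer.
    context = "\n".join(retrieve(question, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

docs = [
    "Snowpark Container Services provides GPU access inside Snowflake.",
    "Streamlit lets you build data apps in pure Python.",
    "Snowflake Copilot generates SQL from natural language.",
]
prompt = build_prompt("How do I get GPU access?", docs)
```

The key design point the demo relies on is that retrieval keeps the LLM grounded in enterprise data: only documents already inside the secure boundary reach the prompt.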
Insights
- Snowflake's Differentiation: Snowflake differentiates itself by offering a comprehensive data platform that handles multiple enterprise workloads, including AI/ML, and provides a marketplace for data sharing and partner applications.
- Generative AI Focus: Snowflake emphasizes ease of use and security for generative AI applications, leveraging enterprise data as a core differentiator.
- Product Strategy: Snowflake's product strategy includes providing pre-built AI experiences, tools for building custom LLM applications, and infrastructure for fine-tuning and deploying models.
- Partner Ecosystem: Snowflake's partner ecosystem is a key component, allowing third-party developers to deploy and offer their LLM applications within Snowflake's secure environment.
- Streamlit Acquisition: The acquisition of Streamlit by Snowflake 18 months prior to the presentation has enabled the integration of Streamlit applications within Snowflake, enhancing the ability to build and share LLM applications.
- Snowpark Container Services: This service is a significant advancement, offering GPU access and a managed Kubernetes platform tailored for data scientists and developers, simplifying the deployment and scaling of LLM applications.
- Model Registry and Deployment: Snowflake's model registry feature within Snowpark ML allows for the deployment of custom models as real-time inference endpoints, showcasing Snowflake's commitment to supporting the full lifecycle of AI/ML model development and deployment.
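The model-registry step above can be pictured as: register a model under a name and version, look it up, and call it per request. The sketch below shows that generic pattern with plain-Python stand-ins; the class and function names are assumptions for illustration, not the Snowpark ML registry API, which registers models against a Snowflake session and serves them on Snowpark Container Services.

```python
# Generic model-registry pattern (illustrative stand-in for Snowpark ML).
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Registry:
    _models: dict = field(default_factory=dict)

    def log_model(self, name: str, version: str, predict_fn: Callable) -> None:
        # Store the model's predict callable under (name, version).
        self._models[(name, version)] = predict_fn

    def get_model(self, name: str, version: str) -> Callable:
        return self._models[(name, version)]

# A trivial "embedding" model standing in for the fine-tuned one in the demo.
def embed_v2(text: str) -> list[float]:
    return [len(text) / 10.0, text.count(" ") / 10.0]

registry = Registry()
registry.log_model("doc-embedder", "v2", embed_v2)

# "Real-time inference": fetch the registered version and call it per request.
model = registry.get_model("doc-embedder", "v2")
vector = model("hello snowflake")
```

Versioned registration is what makes the demo's workflow repeatable: the fine-tuned model can be promoted or rolled back by version without changing the inference endpoint's calling code.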