Title

AWS re:Invent 2023 - Navigating the future of AI: Deploying generative models on Amazon EKS (CON312)

Summary

Organizations are exploring how to support machine learning workloads on Kubernetes, particularly generative AI (Gen AI), given Gen AI's growing importance and range of potential applications.
DevOps and MLOps engineers are challenged with providing the necessary environments for data scientists, including faster storage, more computing power, and access to machine learning toolkits, all while maintaining security and managing costs.
The session, led by Mike (EKS Product Manager) and Rama (AWS Container Specialist), discusses how Amazon EKS can accelerate the deployment of Gen AI workloads by integrating AWS machine learning innovations in compute, networking, and infrastructure.
Rama provides a background on why Kubernetes is suitable for machine learning, addressing challenges such as dependency management, compute at scale, distributed training and inference, logging, monitoring, security, and compliance.
Mike discusses the evolution of AI, the rise of Gen AI, and how EKS can help with the challenges of running Gen AI workloads, such as scaling, performance, and cost.
John Weber, Senior Director of Developer Productivity at Adobe, shares Adobe's experience in extending their EKS platform to support machine learning workloads and the development of Adobe Firefly.
The session also introduces the JARK stack (JupyterHub, Argo, Ray, Kubernetes) as a recommended end-to-end ML stack and the Data on EKS (DoEKS) project for deploying data workloads on EKS.

Kubernetes is increasingly being adopted for machine learning workloads due to its ability to manage infrastructure primitives and its strong open-source community support.
The shift from traditional machine learning models to generative AI models is significant: Gen AI models require less labeled data and adapt more readily to a variety of tasks.
Adobe's use of EKS and their internal developer platform, Ethos, demonstrates how large organizations are leveraging Kubernetes to streamline the development process and improve developer productivity.
The introduction of the JARK stack and the DoEKS project indicates AWS's commitment to providing standardized, open-source solutions for machine learning and data workloads on EKS.
The session highlights the importance of collaboration between AWS and its customers to address common challenges such as GPU scarcity, Kubernetes API rate limiting, and container startup times.
Adobe's experience with EKS shows tangible benefits such as improved cluster-to-operator ratios, the retirement of homegrown CI/CD pipelines in favor of Argo, and a focus on reducing developer friction.
The session emphasizes the need for platforms and abstractions that allow developers to focus on building core business value rather than managing infrastructure, which is a key advantage of using EKS for machine learning workloads.