Title

AWS re:Invent 2023 - How Fetch built world-class ML models to power their business (SEG301)

Summary

Fetch is a premier digital native business (DNB) customer of AWS, offering a consumer rewards app that processes millions of receipts daily.
The presentation covers the technical aspects of machine learning, focusing on model development and deployment in the cloud.
Fetch moved from relying on third-party components to building an in-house machine learning team that handles document scanning and fraud detection.
The company emphasizes the importance of team structure, with cross-functional, independent project teams for better integration and deployment of ML models.
Key projects include fraud detection, document understanding, and product intelligence.
The presentation highlights the use of Streamlit for building full-stack demos, the importance of domain-driven design, and the use of shadow pipelines for safe deployment.
The talk discusses the challenges of scaling model servers and recommends auto-scaling based on requests per second.
The importance of data quality, annotation, and stakeholder engagement is stressed for successful ML projects.
Fetch's future directions include improving personalization, discovery, recommendation systems, stateful ML services, and generative AI.
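
The summary does not say how Fetch computes replica counts, so the following is only a minimal sketch of what auto-scaling on requests per second could look like. It uses the proportional rule familiar from target-tracking autoscalers: divide the observed request rate by the rate one replica is expected to handle, round up, and clamp to configured bounds. The function name, the per-replica target, and the bounds are all hypothetical values chosen for illustration.

```python
import math


def desired_replicas(
    total_rps: float,
    target_rps_per_replica: float,
    min_replicas: int = 1,
    max_replicas: int = 20,
) -> int:
    """Return how many model-server replicas to run for the observed load.

    All parameters are illustrative assumptions, not values from the talk.
    """
    if target_rps_per_replica <= 0:
        raise ValueError("target_rps_per_replica must be positive")

    # Replicas needed so that each one stays at or below its target rate.
    needed = math.ceil(total_rps / target_rps_per_replica)

    # Never scale below the floor or above the ceiling.
    return max(min_replicas, min(max_replicas, needed))


if __name__ == "__main__":
    # Hypothetical load levels against a target of 50 requests/s per replica.
    for rps in (0, 120, 450, 5000):
        print(rps, "rps ->", desired_replicas(rps, 50), "replicas")
```

In practice the per-replica target would presumably come from load-testing the model server itself, since the sustainable request rate depends on the model and the hardware.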

Fetch's transition to an in-house ML team reflects a broader trend of companies seeking greater control and customization over their core business processes.
The use of Streamlit for prototyping and the emphasis on domain-driven design suggest a pragmatic approach to ML development, focusing on business needs rather than just technical capabilities.
Shadow pipelines and the separation of ML endpoints from backend services demonstrate a mature approach to deploying ML models, ensuring minimal disruption to existing systems.
The discussion on scaling challenges with model servers and the recommendation to auto-scale by requests per second provide practical advice for ML engineers facing similar deployment issues.
The emphasis on data quality and annotation highlights the often-underestimated importance of data preparation in ML projects, which can significantly impact model performance.
Fetch's future focus on personalization and recommendation systems indicates a strategic move to leverage their unique dataset to enhance user experience and business value.
The mention of generative AI and vector databases suggests Fetch is keeping pace with cutting-edge ML technologies, potentially offering innovative solutions to complex data tasks.
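
The shadow-pipeline idea mentioned above can be illustrated with a small sketch. The source gives no implementation details, so this is one assumed shape of the pattern: the live model answers every request, the candidate model sees the same input, and its output is only logged for comparison and never returned to the caller. The names `serve_with_shadow`, `primary`, `shadow`, and `comparison_log` are hypothetical.

```python
from typing import Any, Callable, Dict, List


def serve_with_shadow(
    request: Any,
    primary: Callable[[Any], Any],
    shadow: Callable[[Any], Any],
    comparison_log: List[Dict[str, Any]],
) -> Any:
    """Answer with the primary model while exercising a shadow model.

    Hypothetical sketch: a failure in the shadow model is recorded but
    never affects the response returned to the caller.
    """
    live_result = primary(request)

    try:
        shadow_result = shadow(request)
        comparison_log.append(
            {
                "request": request,
                "live": live_result,
                "shadow": shadow_result,
                "match": live_result == shadow_result,
            }
        )
    except Exception as exc:  # the shadow path must not break live traffic
        comparison_log.append(
            {"request": request, "live": live_result, "shadow_error": repr(exc)}
        )

    # Only the primary model's output ever reaches the caller.
    return live_result


if __name__ == "__main__":
    log: List[Dict[str, Any]] = []
    # Toy stand-ins for a production model and a candidate model.
    answer = serve_with_shadow("receipt-123", str.upper, str.title, log)
    print(answer)
    print(log)
```

This sketch calls the shadow model inline for simplicity. A production setup would more likely run it off the request path, for example through a queue or a background task, so that it adds no latency to live traffic.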