Title

AWS re:Invent 2022 - Drive positive impact with feature flags and data (PRT053)

Summary

Trevor, a co-founder at Split, and Ariel Perez, VP of Measurement and Experimentation at Split, discuss the use of feature flags and data to drive positive impact in software development.
They outline feature flag adoption as a four-stage journey: basic flagging, percentage-based rollouts, data integration for impact analysis, and full experimentation with documented hypotheses.
A survey of Split's customers shows varying adoption levels: 15% have some features behind flags, 23% have most features behind flags, 8% are integrating data, and over 50% are running experiments.
Ariel presents two technical use cases, traffic mirroring and tap compare testing, which are used to monitor rollouts and carry out zero-downtime migrations.
The first case study involves migrating from Amazon Kinesis to Kafka with the goals of reducing costs, maintaining event integrity, and ensuring no impact on latency or throughput, all with zero downtime.
The second case study discusses moving from Amazon S3 to MongoDB to enable more complex querying and improve write throughput, again aiming for zero downtime.
Both migrations used feature flags and data to monitor and measure impact, so the transitions completed without service interruptions.
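
To make the percentage-based rollout and traffic mirroring ideas concrete, here is a minimal Python sketch. It is not Split's SDK and not the speakers' implementation: the InMemoryStore class, the flag name "mirror-to-candidate", and the hash-based bucketing are assumptions chosen for illustration. Every event goes to the legacy system, and a flag-controlled share of traffic is also copied to the candidate system.

```python
import hashlib


def bucket(key: str, flag_name: str) -> int:
    """Map a key to a stable bucket in [0, 100) so the same key always gets the same answer."""
    digest = hashlib.sha256(f"{flag_name}:{key}".encode()).hexdigest()
    return int(digest, 16) % 100


def is_enabled(key: str, flag_name: str, rollout_percent: int) -> bool:
    """Percentage-based rollout: the flag is on for keys whose bucket is below the threshold."""
    return bucket(key, flag_name) < rollout_percent


class InMemoryStore:
    """Hypothetical stand-in for a real backend such as a stream or a database."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.events = []

    def write(self, event: dict) -> None:
        self.events.append(event)


def publish(event, key, legacy, candidate, mirror_percent):
    """Always write to the legacy store; mirror to the candidate for the flagged share of traffic."""
    legacy.write(event)
    if is_enabled(key, "mirror-to-candidate", mirror_percent):
        try:
            candidate.write(event)
        except Exception:
            # A failure on the mirrored path must never affect the primary path.
            pass


legacy = InMemoryStore("legacy")
candidate = InMemoryStore("candidate")
for i in range(1000):
    publish({"id": i}, f"user-{i}", legacy, candidate, mirror_percent=20)

print(len(legacy.events), len(candidate.events))
```

Raising mirror_percent step by step toward 100 lets a team watch cost, latency, and event integrity on the candidate system while the legacy system continues to serve all traffic.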

Feature flags are a powerful tool for managing feature rollouts, enabling safer deployments, and facilitating A/B testing and experimentation.
The adoption of feature flags is a journey, and organizations may find themselves at different stages, from basic use to sophisticated experimentation.
Integrating data with feature flags allows teams to make informed decisions based on the actual impact of changes, rather than assumptions or guesses.
Traffic mirroring and tap compare testing are advanced techniques that can be used to ensure that new services or migrations do not negatively affect the existing system's performance or reliability.
The use of feature flags and data in migrations, as demonstrated in the case studies, can lead to significant improvements in system performance and cost efficiency while maintaining high availability.
Split's platform appears to offer robust tools for monitoring and analyzing the impact of feature flag-driven changes, providing causal analysis to understand the direct effects of migrations or new features.
The emphasis on zero downtime migrations reflects a growing industry trend towards continuous delivery and high availability, where services must remain operational even during significant system changes.
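
Tap compare testing, mentioned above, can be sketched in a few lines of Python. This is an illustrative assumption about how such a check might be wired, not the approach shown in the talk: the TapCompare class and the two lambda "services" are hypothetical. Each request is answered by the existing system, the candidate system receives the same request on the side, and any difference between the two responses is recorded for later analysis.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Tuple


@dataclass
class TapCompare:
    """Serve from the primary, shadow-call the candidate, and record mismatches."""

    primary: Callable[[Any], Any]
    candidate: Callable[[Any], Any]
    mismatches: List[Tuple[Any, Any, Any]] = field(default_factory=list)
    compared: int = 0

    def handle(self, request: Any) -> Any:
        expected = self.primary(request)
        try:
            actual = self.candidate(request)
        except Exception as exc:
            # A candidate failure is recorded as a mismatch, never surfaced to the caller.
            actual = exc
        self.compared += 1
        if actual != expected:
            self.mismatches.append((request, expected, actual))
        # Callers only ever see the primary's response.
        return expected

    def mismatch_rate(self) -> float:
        return len(self.mismatches) / self.compared if self.compared else 0.0


# Hypothetical services: the candidate has a bug for the key "c".
tap = TapCompare(
    primary=lambda key: key.upper(),
    candidate=lambda key: key if key == "c" else key.upper(),
)
results = [tap.handle(k) for k in ["a", "b", "c", "d"]]
print(results, tap.mismatch_rate())
```

In a real migration the candidate call would usually run asynchronously so it cannot add latency to the primary path; it is kept synchronous here for brevity.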