Title
AWS re:Invent 2023 - Platform engineering with Amazon EKS (CON311)
Summary
- Introduction: Kevin Coleman and Roland Barcia from AWS, along with Ahmed Bebars from The New York Times, discuss platform engineering with Amazon EKS.
- Platform Engineering: Defined as building compute abstractions for internal customers to enable efficient cloud adoption and accelerate software delivery.
- Infrastructure Management Spectrum: Organizations can choose between decentralized (application teams manage infrastructure) and centralized (central teams manage infrastructure) models.
- Abstractions in Cloud: Abstractions allow developers to focus on application development rather than infrastructure management.
- Internal Platforms: More than infrastructure abstractions, they include self-service APIs, tools, services, knowledge, and support, and are treated as internal products.
- Benefits of Internal Platforms: Increased velocity, governance, and efficiency, leading to cost savings and economies of scale.
- Customer Examples: Salesforce's Hyperforce and NASA's data platform are highlighted as successful EKS-based internal platforms.
- Platform Implementation Patterns: Roland discusses ownership, level of abstraction, adoption, observability, and isolation as key considerations in platform engineering.
- Developer Experience: Emphasizes the importance of treating the platform as a product, providing escape hatches for developers, and the necessity of documentation and education.
- The New York Times Platform: Ahmed Bebars shares insights on building an internal developer platform at The New York Times, focusing on standardization, efficiency, integration, scalability, and visibility.
- Future Goals: The New York Times aims for more managed add-ons, better IAM provisioning, and expanding workloads to include data-intensive applications and machine learning.
Insights
- Platform Engineering as a Product: Treating internal platforms as products with a customer-centric approach is crucial for adoption and success.
- Multi-Tenant vs. Single-Tenant Clusters: The choice between multi-tenant and single-tenant clusters depends on specific organizational needs, with trade-offs in management overhead and cost optimization.
- Documentation and Education: These are essential components of a successful platform, ensuring that developers can effectively use the platform without constant support.
- Customer Feedback: Continuous feedback from internal customers (developers) is vital for iterative improvement and relevance of the platform.
- Managed Services: There is a desire for more AWS-managed services to reduce the complexity of managing components like Istio and Karpenter.
- Multi-Account Architecture: The New York Times uses a multi-account architecture for better security, cost management, and resource organization.
- Observability Across Workflows: Ensuring observability at every stage of the development lifecycle is critical for understanding and improving the platform.
- Future-Proofing: Platforms should be designed with future expansions in mind, such as supporting ARM workloads and more complex applications.
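
The abstraction and escape-hatch ideas above can be illustrated with a minimal sketch. It is not taken from the talk: the function names, the default values, and the choice of one namespace per team are all assumptions made for illustration. A developer supplies a small spec, the platform expands it into a standard Kubernetes Deployment manifest, and an optional overrides mapping acts as the escape hatch.

```python
from copy import deepcopy

# Hypothetical platform defaults; the values are illustrative, not from the talk.
PLATFORM_DEFAULTS = {"replicas": 2, "cpu": "250m", "memory": "256Mi"}


def deep_merge(base, overrides):
    """Return a copy of base with overrides merged in recursively.

    Only nested dicts are merged; lists and scalars are replaced wholesale.
    """
    merged = deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def render_deployment(name, image, team, overrides=None):
    """Expand a small developer-facing spec into a Deployment manifest."""
    labels = {"app": name, "team": team}
    manifest = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        # Assumption: each team gets its own namespace in a shared cluster.
        "metadata": {"name": name, "namespace": team, "labels": labels},
        "spec": {
            "replicas": PLATFORM_DEFAULTS["replicas"],
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "resources": {
                                "requests": {
                                    "cpu": PLATFORM_DEFAULTS["cpu"],
                                    "memory": PLATFORM_DEFAULTS["memory"],
                                }
                            },
                        }
                    ]
                },
            },
        },
    }
    # Escape hatch: developers may override any field the platform generated.
    return deep_merge(manifest, overrides or {})


default_manifest = render_deployment("checkout", "example.com/checkout:1.0", "payments")
scaled_manifest = render_deployment(
    "checkout",
    "example.com/checkout:1.0",
    "payments",
    overrides={"spec": {"replicas": 5}},
)
```

In this sketch most teams would call `render_deployment` with only a name, image, and team, while a team with unusual needs passes `overrides` instead of leaving the platform. A real platform would more likely express this with Helm charts, Kustomize overlays, or custom resources than with a Python function.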