Goldman Sachs: Accelerating Time to Value in Data Analytics (FSI201)

Title

AWS re:Invent 2022 - Goldman Sachs: Accelerating time to value in data analytics (FSI201)

Summary

  • Gerard Cowburn, Senior Solutions Architect at AWS, introduces the session on accelerating time to value in data analytics.
  • Ram Rajamoni, VP and Tech Fellow in Data Engineering at Goldman Sachs, and Francesco Pontrandolfo, Product Manager for Goldman Sachs Financial Cloud, co-present the session.
  • The session covers how Goldman Sachs blends diverse hybrid data sources to inform investment hypotheses and deliver insights quickly.
  • The speakers give an overview of Goldman Sachs and its data-driven investment process.
  • They walk through the architecture of Goldman Sachs Financial Cloud, focusing on high-speed, low-latency, real-time analytics for financial market data.
  • They examine key data sourcing integrations and the use of the open-source Legend platform for data modeling and wrangling.
  • The session concludes with insights and lessons learned to help others accelerate their data analytics journeys.

Insights

  • Goldman Sachs invests heavily in technology, and engineers make up a significant portion of its workforce.
  • Time and speed are critical factors for competitiveness in the financial industry.
  • The Goldman Sachs Financial Cloud is a modular set of services addressing data curation, management, and analytics.
  • Data curation includes access to GS proprietary data and third-party data sets with an added curation layer.
  • Users can spin up compute instances on the GS Financial Cloud to ingest data and enrich it against a consistent data model.
  • Data analysis tools include REST endpoints, a Python SDK called GS Quant, and a data visualization tool called PlotTool Pro.
  • The architecture of GS Financial Cloud on AWS includes ECS, DynamoDB, ElastiCache, OpenSearch, and a custom time series database optimized for AWS.
  • Real-time market data is streamed into the platform using a bespoke API and a solution called Electron.
  • Challenges in financial data include its evolving nature, the breadth and structure of data sources, and the need for real-time data streaming.
  • Cloud scalability, managed serverless infrastructure, and early engagement in risk-managed data transfer are key enablers for overcoming these challenges.
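
To make the "consistent data model" and time series points above concrete, here is a minimal Python sketch. It is not Goldman Sachs code and does not use the GS Quant or Legend APIs. The Observation class, the two vendor record layouts, and the helper names are all hypothetical, chosen only to show how differently shaped sources can be mapped onto one model and then joined "as of" a timestamp.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class Observation:
    """Hypothetical common model: one value for one instrument at one time."""
    instrument: str
    timestamp: datetime
    field: str
    value: float


def from_vendor_a(row: dict) -> Observation:
    # Hypothetical source A: epoch seconds, upper-case ticker, price as text.
    return Observation(
        instrument=row["ticker"],
        timestamp=datetime.fromtimestamp(row["ts"], tz=timezone.utc),
        field="price",
        value=float(row["px"]),
    )


def from_vendor_b(row: dict) -> Observation:
    # Hypothetical source B: ISO-8601 timestamps, lower-case symbols.
    return Observation(
        instrument=row["symbol"].upper(),
        timestamp=datetime.fromisoformat(row["time"]),
        field="sentiment",
        value=float(row["score"]),
    )


def as_of(series: List[Observation], when: datetime) -> Optional[Observation]:
    """Latest observation at or before `when`.

    Assumes `series` holds a single instrument and is sorted by timestamp.
    """
    times = [obs.timestamp for obs in series]
    i = bisect_right(times, when)
    return series[i - 1] if i else None


# Illustrative records only; the values are made up for the example.
prices = [from_vendor_a({"ticker": "ABC", "ts": 1_669_900_000, "px": "101.5"})]
scores = sorted(
    (
        from_vendor_b(r)
        for r in [
            {"symbol": "abc", "time": "2022-12-01T12:00:00+00:00", "score": 0.2},
            {"symbol": "abc", "time": "2022-12-01T13:30:00+00:00", "score": 0.7},
        ]
    ),
    key=lambda obs: obs.timestamp,
)

# Enrich each price with the most recent score known at that moment.
enriched = [(p, as_of(scores, p.timestamp)) for p in prices]
```

The as-of lookup only ever returns data that was available at the price's timestamp, which is one simple way to avoid look-ahead when blending a fast feed with a slower data set. A production system would key the lookup by instrument and would not rebuild the list of timestamps on every call.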