Title
AWS re:Invent 2023 - Explore geocoded datasets with PwC’s innovative data platform (AIM244)
Summary
- PwC has developed an enterprise geospatial insights platform to handle complex geocoded datasets.
- Geospatial data is challenging due to its variety, volume, and the need for accuracy and up-to-date information.
- The UK government's significant investment in geospatial technology signals that the industry is growing.
- PwC's platform uses AWS services like Glue, DataBrew, Step Functions, Athena, SageMaker, Lake Formation, and Redshift Serverless to process and analyze data.
- The platform integrates various datasets, including transportation, income, demographics, and retail data, to create over 600 unique metrics.
- A 3D-printed map of London and a VR interface were developed for visualizing and interacting with the data.
- The platform is used by businesses to make informed decisions on store locations, competitive analysis, and consumer behavior.
- The web front end, Geo Hub, allows users to build queries and visualize geospatial data in real time.
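
As a rough illustration of how integrated datasets can yield a derived metric, the sketch below joins two per-area tables on a shared area key and computes a store-density figure. The function name, the area keys, the metric itself, and all numbers are hypothetical and are not taken from the talk; the platform itself runs this kind of join in AWS services such as Athena and Redshift Serverless, not in plain Python.

```python
def stores_per_thousand_residents(population_by_area, stores_by_area):
    """Join two per-area tables on the area key and derive a density metric."""
    metric = {}
    for area, population in population_by_area.items():
        if population <= 0:
            continue  # skip areas where the ratio is undefined
        stores = stores_by_area.get(area, 0)  # areas with no store record count as 0
        metric[area] = 1000 * stores / population
    return metric


# Made-up sample data, for illustration only.
population_by_area = {"area_a": 20000, "area_b": 5000, "area_c": 0}
stores_by_area = {"area_a": 10, "area_b": 5}

metric = stores_per_thousand_residents(population_by_area, stores_by_area)
print(metric)
```

The same pattern (a join on a common area key, followed by a ratio or aggregate) extends to other dataset pairs, such as income and retail data.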
Insights
- Geospatial data processing is computationally intensive and expensive: PwC's data refreshes cost thousands of dollars.
- The platform's architecture needed to evolve to handle the scale of data, leading to the use of Redshift Serverless for more efficient in-memory joins.
- PwC created a new role within the company, combining data science with geospatial expertise, to manage the complexity of the data.
- The platform uses Uber's H3, an open-source hexagonal spatial indexing system, for its geocoded data, which shows how open-source tools are being adopted for spatial analysis.
- Real-world applications of the platform include optimizing retail locations, analyzing competition, and enhancing customer experiences.
- The innovative use of 3D printing and VR for data visualization indicates a trend towards more interactive and immersive data analysis methods.
- The platform's ability to analyze millions of data points in seconds showcases the power of AWS services in handling big data and providing actionable insights.
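
The idea behind cell-based spatial indexing, as mentioned in the H3 insight above, can be sketched in a few lines. The example below does not use H3: it substitutes a plain square latitude/longitude grid for H3's hexagons so that it runs with the standard library alone. The cell size, function names, and sample points are assumptions made for illustration.

```python
import math
from collections import defaultdict

CELL_SIZE_DEG = 0.01  # hypothetical cell size in degrees, not an H3 resolution


def cell_id(lat, lon, size=CELL_SIZE_DEG):
    """Map a coordinate to the integer (row, col) of the square cell containing it."""
    return (math.floor(lat / size), math.floor(lon / size))


def aggregate_by_cell(points, size=CELL_SIZE_DEG):
    """Sum a numeric value per cell; points is an iterable of (lat, lon, value)."""
    totals = defaultdict(float)
    for lat, lon, value in points:
        totals[cell_id(lat, lon, size)] += value
    return dict(totals)


# Made-up sample points in central London, for illustration only.
points = [
    (51.5074, -0.1278, 3.0),
    (51.5079, -0.1271, 2.0),  # falls in the same cell as the first point
    (51.5155, -0.0922, 5.0),
]

totals = aggregate_by_cell(points)
print(totals)
```

Once every record carries a cell identifier, datasets from different sources can be joined on that identifier instead of on raw geometry. A hexagonal system such as H3 adds properties a square grid lacks, including a multi-resolution hierarchy and more uniform distances between neighboring cells.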