As part of a strategic data initiative with AB InBev, our team was tasked with exploring whether alternative data could enhance the company’s Business-to-Business (B2B) product recommendation systems. Traditionally, AB InBev relied heavily on internally generated data, such as customer purchase activity, for these predictions. We were brought in to investigate whether novel external data sources could improve recommendation efficacy, particularly for new or inactive customers, where experimentation carried low risk.
Our objective was to identify, integrate, and evaluate external data sources (such as socioeconomic indicators, weather data, regional beer flavor preferences, and local events) that could be linked to Points of Commerce (POCs) and serve as features in downstream machine learning models. We built scalable data pipelines to source, scrape, and process these datasets, then used H3 hexagonal geospatial indexing to aggregate them spatially and align them with AB InBev’s internal POC data. Once joined, the enriched datasets powered experimental models, including a Graph Neural Network that framed product recommendation as edge prediction. As an exploratory step, we also clustered customers to uncover early, interpretable insights into customer segmentation. This lightweight, cost-effective analysis surfaced performance patterns tied to geospatial and behavioral factors, ensuring that actionable intelligence was not overlooked in favor of more complex black-box approaches.
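The spatial alignment step can be illustrated with a minimal sketch. It is not the project's pipeline: to stay dependency-free it swaps H3 cells for a plain square latitude/longitude grid, and every name in it (`cell_id`, `aggregate_by_cell`, `enrich_pocs`, `CELL_DEG`, the field names) and every record is hypothetical. The idea it shows is the same one described above: assign each external observation and each POC to a spatial cell, aggregate the external data per cell, then join on the cell key.

```python
import math
from collections import defaultdict
from statistics import mean

# Hypothetical cell size in degrees. H3 would use a resolution level instead.
CELL_DEG = 0.01


def cell_id(lat, lng, size=CELL_DEG):
    """Stand-in for an H3 cell index: the (row, col) of a square lat/lng grid."""
    return (math.floor(lat / size), math.floor(lng / size))


def aggregate_by_cell(observations, value_key):
    """Average an external measurement over all observations in the same cell."""
    buckets = defaultdict(list)
    for obs in observations:
        buckets[cell_id(obs["lat"], obs["lng"])].append(obs[value_key])
    return {cell: mean(values) for cell, values in buckets.items()}


def enrich_pocs(pocs, cell_features, feature_name):
    """Left-join the per-cell feature onto POCs; None where no external data exists."""
    return [
        {**poc, feature_name: cell_features.get(cell_id(poc["lat"], poc["lng"]))}
        for poc in pocs
    ]


# Made-up illustrative records, not project data.
weather = [
    {"lat": 40.7131, "lng": -74.0062, "temp_c": 20.0},
    {"lat": 40.7139, "lng": -74.0068, "temp_c": 22.0},
    {"lat": 34.0522, "lng": -118.2437, "temp_c": 28.0},
]
pocs = [
    {"poc_id": "A", "lat": 40.7135, "lng": -74.0065},
    {"poc_id": "B", "lat": 51.5074, "lng": -0.1278},
]

enriched = enrich_pocs(pocs, aggregate_by_cell(weather, "temp_c"), "avg_temp_c")
```

In this sketch POC A shares a cell with two weather observations and receives their average, while POC B has no nearby external data and keeps a missing value, which a downstream model would need to impute or handle explicitly. With H3, only `cell_id` would change: it would return the hexagonal cell index for the chosen resolution.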
The project culminated in two key deliverables: a fully processed, ML-ready dataset that can be reused across pipelines, and a comprehensive Alternative Data Playbook documenting our methods, decisions, and lessons learned. Together, the enriched data and accompanying framework demonstrated the untapped value of alternative data for strategic decision-making and laid the foundation for more agile, low-inertia experimentation within AB InBev’s data science ecosystem.
Mentor: Nick Eubank
