Decoding the Dollar: Real-Time FX Forecasting with LLMs

: BNY Mellon
: Banking
: 2026

This capstone project addresses a concrete decision-support problem in financial risk management: analysts face a high-volume stream of macroeconomic and market news, but only a small subset of articles contains actionable information for foreign-exchange risk hedging. The project’s final scope, as documented in the clean-up branch and the capstone presentation, is an interpretable pipeline that ingests news, extracts article content and publication time, enriches articles with macro context, classifies DXY-relevant events with a large language model, maps those predictions to minute-level U.S. Dollar Index moves, and evaluates whether “critical” classifications identify unusually large market reactions.

The strongest evidence in the available materials supports five headline conclusions. First, the project converged on the U.S. Dollar Index (DXY) as the target variable because it provides a single composite proxy for USD strength against a six-currency basket and aligns directly with FX risk. Second, the clean-up branch implements an interpretable multi-stage pipeline rather than an end-to-end black box, with explicit event taxonomy, tiering, criticality labels, and direction chains. Third, the final presentation reports that Claude Haiku 4.5 was selected over Qwen 3 and Gemini Flash 2.5 using a human-reviewed set of 62 articles, balancing alignment, latency, and cost. Fourth, the team back-tested more than 2,300 articles covering approximately August 2025 through March 2026. Fifth, the reported evaluation shows statistically significant separation between high-criticality and non-high articles, with larger absolute DXY moves for high-criticality classifications across horizons from 5 minutes through 1 day.

At the same time, the repository shows that the system is still a strong prototype rather than a frictionless production package.
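The fifth conclusion above, larger absolute DXY moves for high-criticality articles, can be illustrated with a minimal sketch. The available materials do not say which statistical test the team used, so the one-sided permutation test below is an assumption chosen for illustration, not the project's method. The function names `abs_move_bps` and `permutation_p_value` are hypothetical, and the numbers are synthetic toy values for a single horizon, not project results.

```python
import random
from statistics import mean


def abs_move_bps(price_at_publish: float, price_at_horizon: float) -> float:
    """Absolute DXY move, in basis points, between publication time and a horizon."""
    return abs(price_at_horizon / price_at_publish - 1.0) * 10_000


def permutation_p_value(high, other, n_perm=5_000, seed=0):
    """One-sided permutation test: is the mean of `high` larger than the mean of `other`?

    Assumption: this test stands in for whatever test the project actually used.
    """
    rng = random.Random(seed)
    observed = mean(high) - mean(other)
    pooled = list(high) + list(other)
    n_high = len(high)
    hits = 0
    for _ in range(n_perm):
        # Randomly reassign group labels and recompute the difference in means.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_high]) - mean(pooled[n_high:])
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


# Synthetic toy moves (basis points) at ONE horizon; NOT project data.
high_moves = [12.0, 9.5, 15.2, 11.1, 8.7, 14.3]
other_moves = [3.1, 4.0, 2.2, 5.5, 3.8, 2.9, 4.4, 3.3]

p = permutation_p_value(high_moves, other_moves)
print(f"mean high = {mean(high_moves):.2f} bps")
print(f"mean other = {mean(other_moves):.2f} bps")
print(f"one-sided permutation p-value = {p:.4f}")
```

In a real back-test the same comparison would be repeated per horizon (5 minutes through 1 day), with moves computed from minute-level DXY prices aligned to each article's publication time.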
The branch remains file-based, many paths are hard-coded, the orchestration and documentation are not fully synchronized, DXY intraday data must be staged manually, and some imported libraries do not appear in the root requirements file. The documentation below therefore presents the project as a rigorous capstone prototype with meaningful empirical results and clear architectural logic, while explicitly marking items that remain unspecified or inconsistent in the available sources.

Project Overview

Introduction and problem statement. The project’s central problem formulation is that financial analysts face information overload, signal ambiguity, and missed hedging windows when trying to interpret large volumes of economic and policy news quickly enough for market-risk mitigation. The capstone presentation frames the project question as how to capture forward-facing signals from news in time to support hedging decisions, and the final branch operationalizes that goal as DXY event classification and response evaluation.

Why DXY. The final presentation states that DXY was selected because it measures the value of the U.S. dollar against a basket of six major currencies, captures broad USD demand, and provides a single FX-impact signal instead of forcing the team to manage six individual bilateral exchange rates. The same presentation lists the basket weights as EUR 57.6%, JPY 13.6%, GBP 11.9%, CAD 9.1%, SEK 4.2%, and CHF 3.6%, and argues that this makes DXY a reasonable target for macro-news-driven FX risk analysis.

Background and related work. The available materials do not contain a conventi
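To make the basket weights listed under "Why DXY" concrete, the sketch below combines six exchange rates into one index value as a weighted geometric average. It assumes the standard published ICE formulation, including the scaling constant 50.14348112; neither the constant nor the formula appears in the project materials. The function name `dxy_from_usd_rates` and the example rates are hypothetical and purely illustrative.

```python
import math
from typing import Mapping

# Basket weights as listed in the presentation, expressed as fractions of 1.
WEIGHTS = {
    "EUR": 0.576,
    "JPY": 0.136,
    "GBP": 0.119,
    "CAD": 0.091,
    "SEK": 0.042,
    "CHF": 0.036,
}

# Assumption: scaling constant from the published ICE index formula.
ICE_CONSTANT = 50.14348112


def dxy_from_usd_rates(foreign_per_usd: Mapping[str, float]) -> float:
    """Weighted geometric average of the dollar's value against each currency.

    `foreign_per_usd` holds units of foreign currency per 1 USD (e.g. USDJPY).
    EUR and GBP are usually quoted the other way round (EURUSD, GBPUSD),
    so those quotes must be inverted before calling this function.
    """
    log_index = sum(w * math.log(foreign_per_usd[ccy]) for ccy, w in WEIGHTS.items())
    return ICE_CONSTANT * math.exp(log_index)


# Hypothetical rates for illustration only; not market data from the project.
example_rates = {
    "EUR": 1 / 1.10,  # inverted from an EURUSD quote of 1.10
    "JPY": 150.0,
    "GBP": 1 / 1.27,  # inverted from a GBPUSD quote of 1.27
    "CAD": 1.36,
    "SEK": 10.5,
    "CHF": 0.88,
}

base = dxy_from_usd_rates(example_rates)

# A 1% rise of the dollar against the euro alone moves the index by about 0.576%.
bumped_rates = dict(example_rates, EUR=example_rates["EUR"] * 1.01)
bumped = dxy_from_usd_rates(bumped_rates)

print(f"index at example rates: {base:.3f}")
print(f"relative change after a 1% USD rise against EUR: {bumped / base - 1:.5%}")
```

The geometric form means each currency's influence scales with its weight, which is why euro moves dominate the index and why a single series can stand in for the six bilateral rates.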

Mentor: David Ye

Project poster (PDF)