Portfolio Details

Project information

  • Category: Data Engineering & Time Series Forecasting
  • Platform: Databricks Lakehouse
  • Project type: Academic / Professional Portfolio
  • Scope: End-to-end FX forecasting pipeline with validation

FX Lakehouse Forecasting

Overview:

This project implements an end-to-end Foreign Exchange (FX) forecasting solution using a Lakehouse architecture on Databricks. The pipeline ingests historical FX rates from the Alpha Vantage API, processes them through Bronze, Silver, and Gold layers, and fits time series models that produce six-month forecasts validated against a held-out test period.

Objectives:

  • Design a scalable FX data pipeline using Delta Lake
  • Model trend and seasonality in historical FX rates
  • Generate short-term forecasts (6 months ahead)
  • Validate models using standard forecast error metrics such as MAPE
  • Produce analytics-ready datasets for BI tools

Architecture:

  • Bronze: Raw FX rates ingested from Alpha Vantage API
  • Silver: Cleaned and standardised FX time series
  • Gold: Curated datasets optimised for ML and forecasting
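
As an illustration of the Bronze-to-Silver step, the sketch below flattens a raw API payload into a clean, chronologically ordered series. It is not the project's actual code: the function name `bronze_to_silver`, the sample values, and the assumed response layout (a "Time Series FX (Daily)" object with "4. close" fields, as Alpha Vantage's daily FX endpoint is understood to return) are assumptions made for illustration. In the real pipeline this logic would presumably run on Spark and write to Delta tables.

```python
from datetime import date

# Hypothetical payload shaped like a daily FX response; all values are invented.
SAMPLE_PAYLOAD = {
    "Time Series FX (Daily)": {
        "2024-01-03": {"1. open": "4.90", "4. close": "4.92"},
        "2024-01-02": {"1. open": "4.85", "4. close": "4.90"},
        "2024-01-04": {"1. open": "4.92", "4. close": ""},  # unusable row
    }
}


def bronze_to_silver(payload, pair):
    """Turn a raw (Bronze) payload into cleaned (Silver) rows.

    Rows with a missing, unparsable, or non-positive close are dropped,
    and the result is sorted by date.
    """
    rows = []
    for day, fields in payload["Time Series FX (Daily)"].items():
        try:
            close = float(fields["4. close"])
        except (KeyError, ValueError):
            continue  # skip rows without a usable close price
        if close <= 0:
            continue  # an FX rate must be strictly positive
        rows.append({"pair": pair, "date": date.fromisoformat(day), "close": close})
    rows.sort(key=lambda row: row["date"])
    return rows


silver = bronze_to_silver(SAMPLE_PAYLOAD, "USD/BRL")
```

Keeping the raw payload untouched in Bronze and applying the cleaning rules only in Silver means the rules can be revised and replayed later without calling the API again.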

Currency Pairs:

  • USD → BRL
  • USD → NZD
  • BRL → NZD (including synthetic reconstruction via USD pivot)
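
The synthetic reconstruction via a USD pivot can be sketched as follows. If USD → BRL is quoted as BRL per 1 USD and USD → NZD as NZD per 1 USD, then BRL → NZD (NZD per 1 BRL) is the ratio of the two. The function name and the sample rates are hypothetical, and the quoting convention is an assumption that should be checked against the actual data.

```python
def synthetic_brl_nzd(usd_brl, usd_nzd):
    """Rebuild BRL -> NZD from two USD-based series.

    Both inputs map an ISO date string to a rate quoted per 1 USD.
    Only dates present in both series are kept.
    """
    shared_dates = sorted(usd_brl.keys() & usd_nzd.keys())
    return {day: usd_nzd[day] / usd_brl[day] for day in shared_dates}


# Invented rates, for illustration only.
usd_brl = {"2024-01-02": 5.0, "2024-01-03": 4.0}
usd_nzd = {"2024-01-02": 1.6, "2024-01-04": 1.7}
cross = synthetic_brl_nzd(usd_brl, usd_nzd)
```

Restricting the result to shared dates avoids combining rates observed on different days, which would introduce a spurious move in the cross rate.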

Modelling Approach:

  • ARIMA and SARIMAX models with weekly seasonality
  • Time-based train/test split
  • Forecast horizon of approximately 180 days
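
A minimal sketch of this approach, assuming the series is an ordered list of daily closing rates: the last observations are held out as the test set without shuffling, and a SARIMAX model is fitted on the remainder. The helper names, the model orders, and the seasonal period of 5 (one trading week) are placeholders rather than the values used in the project; a period of 7 would suit calendar-day data.

```python
def time_split(series, horizon):
    """Hold out the final `horizon` observations, preserving time order."""
    if not 0 < horizon < len(series):
        raise ValueError("horizon must be between 1 and len(series) - 1")
    return series[:-horizon], series[-horizon:]


def fit_and_forecast(train, steps, seasonal_period=5):
    """Fit a SARIMAX model and return point forecasts with 95% intervals.

    The orders below are illustrative placeholders, not tuned values.
    """
    # Imported here so time_split can be used without statsmodels installed.
    from statsmodels.tsa.statespace.sarimax import SARIMAX

    model = SARIMAX(
        train,
        order=(1, 1, 1),
        seasonal_order=(1, 0, 1, seasonal_period),
    )
    result = model.fit(disp=False)
    forecast = result.get_forecast(steps=steps)
    return forecast.predicted_mean, forecast.conf_int(alpha=0.05)


train, test = time_split(list(range(10)), 3)
```

Splitting by position rather than at random keeps every test observation strictly after the training data, so the model is never evaluated on dates it has already seen.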

Results:

  • All models achieved MAPE below 3.5%
  • Consistent accuracy across the three currency pairs despite their different dynamics
  • Validated forecasts aligned with real historical behaviour
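
For reference, MAPE (mean absolute percentage error) averages the absolute forecast error relative to the actual value and expresses it as a percentage. The sketch below is a generic implementation with invented numbers, not the project's evaluation code.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes every actual value is non-zero, which holds for FX rates.
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Errors of 10% and 5% average to 7.5%.
score = mape([100.0, 200.0], [110.0, 190.0])
```

Because the error is scaled by the actual value, MAPE allows pairs quoted at very different levels, such as USD → BRL and USD → NZD, to be compared on the same footing.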

Outputs:

  • Delta tables for historical, forecasted and validation data
  • Forecast plots with confidence intervals
  • Datasets ready for Power BI and Tableau integration

Technology Stack:

  • Databricks, Apache Spark, Delta Lake
  • Python, Pandas, NumPy
  • Statsmodels (ARIMA / SARIMAX)
  • Alpha Vantage API

Key Learnings:

  • Practical application of Lakehouse architecture
  • Correct temporal validation for time series forecasting
  • Integration of data engineering and statistical modelling
  • Design of auditable, production-oriented pipelines

Disclaimer: This project is for educational and analytical purposes only and does not constitute financial advice.

Full source code is available at https://github.com/fmulato/Databricks