From spreadsheet chaos to Snowflake in 8 weeks: what actually worked
A practical account of migrating six data silos into a unified warehouse — the decisions, the surprises, and what we'd do differently.
Fatima Al-Rashid
Lead Data Engineer
Nov 20, 2024
7 min read
Eight weeks. Six source systems. Zero downtime. Here's what happened.
Source system archaeology
Before any data moves, you need to understand what you have. Six Shopify stores, a custom ERP with undocumented schema modifications, Google Analytics 4, three advertising platforms, and a warehouse management system. The ERP was the deepest rabbit hole — denormalised aggressively, with data stored as serialised JSON columns.
The source-of-truth problem
Revenue when the order is placed (Shopify) vs. revenue when the order is fulfilled (ERP) sounds trivial until you're producing a consolidated P&L. We spent a full week in workshops with the CFO just defining terms. These are business questions, not technical ones, and getting them right before you write a single dbt model separates a warehouse that gets used from one that gets ignored.
dbt: the right tool for the transformation layer
The layering convention: raw sources in staging, business logic in intermediate, business-facing marts on top. This means business logic changes happen in one place. When Orla changed how they calculate customer lifetime value, the update was a change to one dbt model — not a surgical operation across fifteen reports.
What we'd do differently
Write dbt tests alongside the models, not after. The tests we wrote caught two silent data quality issues that had been in source systems for months.
Tags
Author
Fatima Al-Rashid
Lead Data Engineer
More in Data
Want to work together?
We build the things we write about. Start with a 30-minute discovery call.
Book a call