When Every Dollar Counts: Rewriting Revenue From Source of Truth

How I fixed broken revenue calculations in TrueProfit's reportfns by abandoning incremental patches and recalculating from raw Shopify order data — and why a reconciliation job I built got rejected anyway

Introduction

Revenue is the first number a merchant looks at. Everything else — profit, ROAS, LTV — flows from it. If revenue is wrong, the entire analytics product is wrong. Merchants are making ad spend decisions, inventory decisions, hiring decisions based on numbers they trust to be correct.

At TrueProfit, we build a Shopify analytics app. Our core product is helping merchants understand their true profit after ad spend, COGS, shipping, and refunds. The centerpiece is the revenue figure shown on the dashboard: gross revenue, net revenue after refunds, revenue by channel.

For months, that number was slightly off. Not wildly wrong — close enough that merchants didn't immediately notice. But off. A few percentage points here, a discrepancy on a specific day there. The kind of wrong that erodes trust slowly, until one day a merchant with a spreadsheet emails support asking why their Shopify admin shows $48,230 but TrueProfit shows $47,110.

This is the story of how I found it, fixed it, and what I learned about building systems that handle financial data.

The Revenue Pipeline

TrueProfit's revenue data flows through several layers. Shopify sends webhooks when orders are created, updated, or refunded. We process these in reportfns — serverless Lambda functions responsible for aggregating order data into daily/weekly/monthly report summaries stored in MongoDB.

Revenue data flow — from Shopify webhook to merchant dashboard

The reportfns Lambda receives order events and incrementally updates a daily report document. When an order comes in: add to gross revenue. When a refund arrives: subtract from revenue, add to refunds. When an order is edited: apply the delta.

Simple in theory. The problem is that Shopify's order model is surprisingly complex, and the real world doesn't send events in the clean sequence you'd expect.

The Problem

When we started investigating, the first thing we did was build a comparison tool: pull the last 30 days of orders directly from the Shopify Admin API for a set of test shops, calculate what revenue should be, and compare against what our database stored.

The results were uncomfortable. ~6% of daily revenue figures were wrong — some over-counted, some under-counted. The errors weren't random noise. They clustered around specific patterns: shops with lots of refunds, shops using multi-currency, shops that frequently edit orders after creation.

Edge Cases Everywhere

Shopify's order model has more states than you'd think:

Partial refunds — a merchant refunds one line item out of five. Shopify fires a refunds/create event. But the refund object contains line item refunds, shipping refunds, and adjustments as separate fields. Easy to double-count or miss parts.
Order edits — Shopify added orders/edited webhooks relatively recently. Older webhook handlers don't know about this event type and silently drop it. The order's total changes but your database doesn't.
Multi-currency — Shopify presents prices in the shop's currency and the customer's currency. Refunds are issued in the customer's currency. If you convert at different times (order creation vs. refund creation), you get phantom currency gains/losses.
Timezone mismatches — Shopify timestamps are UTC. Merchants think about revenue in their local timezone. An order at 11:30pm UTC on Tuesday is already Wednesday morning in Vietnam but still Tuesday evening in New York. If you bucket by Shopify's UTC date, you'll disagree with the merchant's Shopify admin view, which shows local time.
Test orders — Shopify marks some orders as test orders. They shouldn't count toward revenue. Easy to accidentally include them if you don't check the test field.
Cancelled orders with refunds — if an order is cancelled and fully refunded, should it appear in revenue at all? Shopify says yes (gross revenue minus refunds = 0). Some merchants expect it to not appear at all.
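
The timezone case is the easiest one to reproduce in isolation. Here is a minimal sketch, not the actual reportfns code: localDateKey and mustParseUTC are hypothetical helper names, and the blank import of time/tzdata is only there so the example works on machines without a system timezone database.

```go
package main

import (
	"fmt"
	"time"
	_ "time/tzdata" // embed the IANA database so LoadLocation works everywhere
)

// localDateKey returns the calendar date (YYYY-MM-DD) on which t falls
// in the given IANA timezone. Hypothetical helper, not from reportfns.
func localDateKey(t time.Time, tz string) (string, error) {
	loc, err := time.LoadLocation(tz)
	if err != nil {
		return "", fmt.Errorf("load timezone %q: %w", tz, err)
	}
	return t.In(loc).Format("2006-01-02"), nil
}

// mustParseUTC parses an RFC 3339 timestamp and panics on malformed input.
// Only meant for examples and tests.
func mustParseUTC(s string) time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return t
}
```

With the timestamp 2024-03-05T23:30:00Z (a Tuesday), localDateKey returns 2024-03-06 for Asia/Ho_Chi_Minh and 2024-03-05 for America/New_York, so the same order belongs to different report days depending on the shop.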

Accumulated Patches

The original reportfns code was written to handle the happy path: order comes in, add revenue. As edge cases were discovered over time, patches were applied on top. The result was something like this:

reportfns/handler.go — BEFORE (accumulated patches)

func handleOrderEvent(ctx context.Context, event OrderEvent) error {
    order := event.Order

    // Original logic
    revenue := order.TotalPrice

    // Patch #1 (3 months later): subtract refunds
    if order.TotalRefunded > 0 {
        revenue -= order.TotalRefunded
    }

    // Patch #2 (5 months later): handle currency
    // TODO: this is wrong for multi-currency, fix later
    if order.Currency != shopCurrency {
        revenue = convertCurrency(revenue, order.Currency, shopCurrency)
    }

    // Patch #3 (7 months later): someone noticed test orders
    if order.Test {
        return nil
    }

    // Patch #4 (8 months later): cancelled orders
    if order.CancelledAt != nil && order.TotalRefunded >= order.TotalPrice {
        // skip? or not? unclear...
        // return nil
    }

    // Update the daily report
    return upsertDailyReport(ctx, shopID, orderDate(order), revenue)
}

Notice the commented-out return, the TODO that never got fixed, and the test-order check buried below the refund and currency logic instead of sitting at the top of the function. This code had been touched by multiple engineers over 18 months. Nobody held a full mental model of what it actually did.

Worse: each patch was applied as an incremental delta handler. There was no mechanism to go back and recalculate old data when a bug was fixed. Every fix only applied going forward. Old reports stayed wrong.

Investigation

Tracing the Discrepancy

I started by writing a diagnostic tool that compared our stored revenue against a freshly calculated figure for a given shop and date range. The approach:

Fetch all orders for the shop directly from Shopify Admin API (paginated)
Apply our revenue calculation rules to those raw orders
Compare against what MongoDB stored
Log discrepancies with the specific order IDs involved

tools/revenue_audit.go

type AuditResult struct {
    ShopID           string
    Date             time.Time
    StoredRevenue    float64
    ActualRevenue    float64
    Delta            float64
    DeltaPct         float64
    DiscrepantOrders []string
}

func auditShopRevenue(ctx context.Context, shopID string, from, to time.Time) ([]AuditResult, error) {
    // Fetch raw orders from Shopify (source of truth)
    orders, err := shopifyClient.GetOrdersInRange(ctx, shopID, from, to)
    if err != nil {
        return nil, fmt.Errorf("fetch orders for %s: %w", shopID, err)
    }

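    // Group by local date (using shop's timezone)
    shop, err := shopRepo.Get(ctx, shopID)
    if err != nil {
        return nil, fmt.Errorf("get shop %s: %w", shopID, err)
    }
    loc, err := time.LoadLocation(shop.Timezone)
    if err != nil {
        return nil, fmt.Errorf("load timezone %s: %w", shop.Timezone, err)
    }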

    byDate := make(map[string][]ShopifyOrder)
    for _, o := range orders {
        localDate := o.CreatedAt.In(loc).Format("2006-01-02")
        byDate[localDate] = append(byDate[localDate], o)
    }

    var results []AuditResult
    for dateStr, dayOrders := range byDate {
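        actual := calculateRevenueFromOrders(dayOrders, shop)
        stored, err := reportRepo.GetDailyRevenue(ctx, shopID, dateStr)
        if err != nil {
            return nil, fmt.Errorf("get stored revenue for %s on %s: %w", shopID, dateStr, err)
        }

        delta := actual - stored
        if math.Abs(delta) > 0.01 { // ignore float rounding < 1 cent
            // Avoid dividing by zero on days where the recalculated revenue is 0.
            var deltaPct float64
            if actual != 0 {
                deltaPct = delta / actual * 100
            }
            results = append(results, AuditResult{
                ShopID:        shopID,
                Date:          parseDate(dateStr),
                StoredRevenue: stored,
                ActualRevenue: actual,
                Delta:         delta,
                DeltaPct:      deltaPct,
            })
        }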
    }
    return results, nil
}

Root Causes

Running the audit across 50 shops over 90 days surfaced four distinct root causes:

Root Cause	Frequency	Impact
Timezone bucketing (UTC vs. shop local time)	All shops	Revenue moved across day boundaries, net zero but days wrong
Partial refund double-counting	~40% of shops	Refunds subtracted twice when `refunds/create` event arrived after `orders/updated`
Multi-currency conversion timing	~25% of shops	Order converted at creation rate, refund converted at different rate
Order edit events dropped	~15% of shops	`orders/edited` webhook not registered; post-creation order changes lost

The timezone issue was the most pervasive. Every shop was affected, but in a way that averaged out over the month (orders shift between days rather than disappearing). The partial refund double-count was the most damaging in absolute dollar terms.

The Fix

The Source of Truth Approach

The key insight was this: incremental event processing is the wrong model for financial data.

Incremental updates work fine when each event carries the full delta and events arrive in order and are never duplicated. Shopify's webhook delivery satisfies none of these properties. Webhooks can arrive out of order. They can be retried (duplicate delivery). A single order state change can generate multiple overlapping events. And there's no guaranteed ordering between orders/updated and refunds/create for the same refund.

The fix was to change the model entirely: instead of applying deltas, recalculate from the raw order state.

When any order event arrives for a shop on a given day, we don't try to figure out what changed. We:

Fetch the current state of all orders for that shop on that day from Shopify
Calculate revenue from scratch using a pure function
Write the result, overwriting whatever was there before

This is idempotent. It's safe to re-run. It handles out-of-order events naturally because we don't care about order — we care about the final state. And it will automatically correct any previous miscalculation.

flowchart TD
    E["Order Event Arrives\n(create/update/refund/edit)"]
    D["Extract shop_id + date\n(using shop timezone)"]
    F["Fetch ALL orders for\nshop + date from Shopify API"]
    C["calculateRevenue(orders)\npure function, no side effects"]
    W["Write to MongoDB\n(upsert, idempotent)"]
    E --> D --> F --> C --> W

New approach: recalculate from source of truth on every event
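
To make the difference concrete, here is a toy simulation rather than real reportfns code. The event and orderState types are simplified assumptions that do not mirror Shopify's payloads. applyDeltas mimics the old incremental handler, and recalculate mimics the new model that only looks at final state.

```go
package main

// event is a simplified stand-in for a webhook delivery.
// Field names are illustrative only.
type event struct {
	Kind          string  // "create", "updated", or "refund"
	TotalPrice    float64 // set on "create"
	TotalRefunded float64 // cumulative refunds reported by "updated"
	RefundAmount  float64 // amount carried by "refund"
}

// applyDeltas mimics an incremental handler: every event adjusts a running total.
func applyDeltas(events []event) float64 {
	var net float64
	for _, e := range events {
		switch e.Kind {
		case "create":
			net += e.TotalPrice
		case "updated":
			net -= e.TotalRefunded
		case "refund":
			net -= e.RefundAmount
		}
	}
	return net
}

// orderState is the final state of an order as fetched from the source of truth.
type orderState struct {
	TotalPrice float64
	Refunds    []float64
}

// recalculate ignores event history and computes net revenue from final state.
func recalculate(orders []orderState) float64 {
	var net float64
	for _, o := range orders {
		net += o.TotalPrice
		for _, r := range o.Refunds {
			net -= r
		}
	}
	return net
}
```

For a 100.00 order with a single 30.00 refund, feeding applyDeltas a create event, an orders/updated event reporting 30.00 refunded, and a refunds/create event for the same 30.00 yields 40.00. A retried refund webhook pushes it down to 10.00. recalculate returns 70.00 no matter how many times it runs or in which order the events arrived.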

Implementation

The core of the new implementation is a pure calculateRevenue function that takes a slice of Shopify orders and returns the revenue breakdown for that day. No database calls. No side effects. Easy to test.

reportfns/revenue.go — AFTER

// RevenueResult is the daily revenue breakdown calculated from raw orders.
// All values are in the shop's currency.
type RevenueResult struct {
    GrossRevenue   float64
    TotalRefunds   float64
    NetRevenue     float64
    OrderCount     int
    RefundCount    int
}

// CalculateRevenue computes revenue from a slice of raw Shopify orders.
// Pure function: no I/O, no side effects, safe to call multiple times.
func CalculateRevenue(orders []ShopifyOrder, shopCurrency string) RevenueResult {
    var result RevenueResult

    for _, order := range orders {
        // Skip test orders — they don't represent real revenue
        if order.Test {
            continue
        }
        // Skip orders that are fully cancelled AND fully refunded — they net to zero
        // and pollute order counts without representing real business activity
        if order.CancelledAt != nil && isFullyRefunded(order) {
            continue
        }

        // Convert order total to shop currency at the exchange rate
        // recorded at the time of order creation.
        // CRITICAL: always use order.PresentmentCurrency + order.ExchangeRate
        // never re-fetch current exchange rates — that introduces drift.
        grossInShopCurrency := toShopCurrency(order.TotalPrice, order.PresentmentCurrency, order.ExchangeRate, shopCurrency)
        result.GrossRevenue += grossInShopCurrency
        result.OrderCount++

        // Calculate refunds: sum all refund transactions
        // DO NOT use order.TotalRefunded — it can be stale on webhook payloads.
        // Walk order.Refunds[] directly.
        for _, refund := range order.Refunds {
            refundAmt := refundAmount(refund)
            refundInShopCurrency := toShopCurrency(refundAmt, order.PresentmentCurrency, order.ExchangeRate, shopCurrency)
            result.TotalRefunds += refundInShopCurrency
            result.RefundCount++
        }
    }

    result.NetRevenue = result.GrossRevenue - result.TotalRefunds
    return result
}

// isFullyRefunded returns true if the order's refunds cover its entire price.
func isFullyRefunded(order ShopifyOrder) bool {
    var totalRefunded float64
    for _, r := range order.Refunds {
        totalRefunded += refundAmount(r)
    }
    // Use a small epsilon to handle float precision
    return totalRefunded >= order.TotalPrice-0.01
}

// refundAmount returns the amount actually refunded, in the order's presentment currency.
// It sums successful refund transactions rather than rebuilding the total from
// line item refunds, shipping refunds, and adjustments.
func refundAmount(refund ShopifyRefund) float64 {
    var total float64
    for _, t := range refund.Transactions {
        if t.Kind == "refund" && t.Status == "success" {
            total += t.Amount
        }
    }
    return total
}
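
toShopCurrency is not shown above. Here is a minimal sketch of what it could look like, assuming ExchangeRate is stored as the multiplier from presentment currency to shop currency (which is how the multi-currency test case in this post reads it) and assuming a two-decimal currency. The rounding step is my assumption for this sketch, not necessarily what the production helper does.

```go
package main

import "math"

// toShopCurrency converts an amount from the order's presentment currency into
// the shop's currency using the rate stored on the order.
// Assumption: 1 unit of presentment currency = rate units of shop currency.
func toShopCurrency(amount float64, presentmentCurrency string, rate float64, shopCurrency string) float64 {
	if presentmentCurrency == shopCurrency {
		return amount // nothing to convert
	}
	return roundCents(amount * rate)
}

// roundCents rounds to two decimal places. Zero-decimal currencies such as JPY
// would need different handling.
func roundCents(v float64) float64 {
	return math.Round(v*100) / 100
}
```

Because the order total and every refund go through the same stored rate, a 100.00 USD order at 0.92 becomes 92.00 EUR and a 30.00 USD refund on it becomes 27.60 EUR, regardless of what the market rate is on the day the refund is issued.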

The handler becomes much simpler. It doesn't need to understand what changed in the event — it just triggers a recalculation:

reportfns/handler.go — AFTER

func handleOrderEvent(ctx context.Context, event OrderEvent) error {
    shop, err := shopRepo.Get(ctx, event.ShopID)
    if err != nil {
        return fmt.Errorf("get shop %s: %w", event.ShopID, err)
    }

    // Determine the affected date in the shop's local timezone
    loc, err := time.LoadLocation(shop.Timezone)
    if err != nil {
        return fmt.Errorf("load timezone %s: %w", shop.Timezone, err)
    }
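    // Build local midnight explicitly: Truncate(24 * time.Hour) rounds to a UTC
    // day boundary, not the shop's local one.
    local := event.OccurredAt.In(loc)
    affectedDate := time.Date(local.Year(), local.Month(), local.Day(), 0, 0, 0, 0, loc)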

    // Fetch current order state from Shopify (source of truth)
    orders, err := shopifyClient.GetOrdersForDate(ctx, event.ShopID, affectedDate, loc)
    if err != nil {
        return fmt.Errorf("fetch orders for %s on %s: %w", event.ShopID, affectedDate.Format("2006-01-02"), err)
    }

    // Calculate revenue from scratch — pure, idempotent
    rev := CalculateRevenue(orders, shop.Currency)

    // Upsert the daily report — overwrite, don't accumulate
    return reportRepo.UpsertDailyRevenue(ctx, UpsertRevenueParams{
        ShopID:       event.ShopID,
        Date:         affectedDate,
        GrossRevenue: rev.GrossRevenue,
        TotalRefunds: rev.TotalRefunds,
        NetRevenue:   rev.NetRevenue,
        OrderCount:   rev.OrderCount,
        RefundCount:  rev.RefundCount,
        RecalcAt:     time.Now(),
    })
}

Notice what's gone: no conditionals on event type, no delta logic, no if order.Test scattered around, no TODO comments. The handler is about 30 lines. The complexity lives in CalculateRevenue, a pure function with unit tests covering every edge case.
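
GetOrdersForDate is not shown here. Whatever it does internally, it needs the bounds of the shop-local day in order to query Shopify. One detail worth knowing when computing them in Go: Time.Truncate works on absolute time since the zero time, so Truncate(24 * time.Hour) lands on a UTC day boundary rather than a local midnight. A sketch of one way to get the bounds, with dayBounds and boundsUTC as hypothetical names:

```go
package main

import (
	"time"
	_ "time/tzdata" // embed the IANA database so LoadLocation works everywhere
)

// dayBounds returns the half-open interval [start, end) covering the calendar
// day that contains t in loc.
func dayBounds(t time.Time, loc *time.Location) (start, end time.Time) {
	local := t.In(loc)
	start = time.Date(local.Year(), local.Month(), local.Day(), 0, 0, 0, 0, loc)
	// AddDate moves to the next local midnight, so DST days of 23 or 25 hours work.
	end = start.AddDate(0, 0, 1)
	return start, end
}

// boundsUTC is a convenience wrapper for examples: RFC 3339 in, RFC 3339 (UTC) out.
func boundsUTC(ts, tz string) (string, string, error) {
	loc, err := time.LoadLocation(tz)
	if err != nil {
		return "", "", err
	}
	t, err := time.Parse(time.RFC3339, ts)
	if err != nil {
		return "", "", err
	}
	start, end := dayBounds(t, loc)
	return start.UTC().Format(time.RFC3339), end.UTC().Format(time.RFC3339), nil
}
```

For an event at 2024-03-05T23:30:00Z in Asia/Ho_Chi_Minh, the bounds are 2024-03-05T17:00:00Z to 2024-03-06T17:00:00Z. For America/New_York on 2024-03-10, the day the clocks spring forward, the interval is only 23 hours long: 05:00Z to 04:00Z the next day.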

Edge Case Handling

Because we now have a pure function with full control over the calculation, edge cases become explicit and testable:

reportfns/revenue_test.go — table-driven tests

func TestCalculateRevenue(t *testing.T) {
    tests := []struct {
        name     string
        orders   []ShopifyOrder
        currency string
        want     RevenueResult
    }{
        {
            name: "simple order, no refunds",
            orders: []ShopifyOrder{
                {TotalPrice: 100.0, PresentmentCurrency: "USD", ExchangeRate: 1.0},
            },
            currency: "USD",
            want: RevenueResult{GrossRevenue: 100.0, NetRevenue: 100.0, OrderCount: 1},
        },
        {
            name: "partial refund — only refunded transactions, not order.TotalRefunded",
            orders: []ShopifyOrder{
                {
                    TotalPrice:          100.0,
                    PresentmentCurrency: "USD",
                    ExchangeRate:        1.0,
                    Refunds: []ShopifyRefund{{
                        Transactions: []RefundTransaction{
                            {Kind: "refund", Status: "success", Amount: 30.0},
                        },
                    }},
                },
            },
            currency: "USD",
            want: RevenueResult{GrossRevenue: 100.0, TotalRefunds: 30.0, NetRevenue: 70.0, OrderCount: 1, RefundCount: 1},
        },
        {
            name: "test order — excluded from revenue",
            orders: []ShopifyOrder{
                {TotalPrice: 200.0, Test: true},
            },
            currency: "USD",
            want:     RevenueResult{},
        },
        {
            name: "fully cancelled and refunded — excluded",
            orders: []ShopifyOrder{
                {
                    TotalPrice:   50.0,
                    CancelledAt:  ptr(time.Now()),
                    Refunds: []ShopifyRefund{{
                        Transactions: []RefundTransaction{
                            {Kind: "refund", Status: "success", Amount: 50.0},
                        },
                    }},
                },
            },
            currency: "USD",
            want:     RevenueResult{},
        },
        {
            name: "multi-currency: USD order in EUR shop",
            orders: []ShopifyOrder{
                {
                    TotalPrice:          100.0,
                    PresentmentCurrency: "USD",
                    ExchangeRate:        0.92, // 1 USD = 0.92 EUR at order time
                },
            },
            currency: "EUR",
            want: RevenueResult{GrossRevenue: 92.0, NetRevenue: 92.0, OrderCount: 1},
        },
    }

    for _, tt := range tests {
        t.Run(tt.name, func(t *testing.T) {
            got := CalculateRevenue(tt.orders, tt.currency)
            if !almostEqual(got.GrossRevenue, tt.want.GrossRevenue) {
                t.Errorf("GrossRevenue: got %.2f, want %.2f", got.GrossRevenue, tt.want.GrossRevenue)
            }
            // ... other field checks
        })
    }
}

These tests are the real value. Before the rewrite, there were zero unit tests for revenue calculation. Now there are 23 test cases covering every edge case we found in production.
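
The test table relies on two small helpers, almostEqual and ptr, that are not shown. A plausible minimal version follows. The half-cent tolerance is an assumption, so pick one that matches your own rounding policy.

```go
package main

import "math"

// almostEqual reports whether two currency amounts differ by less than half a cent.
func almostEqual(a, b float64) bool {
	return math.Abs(a-b) < 0.005
}

// ptr returns a pointer to v, handy for optional fields such as CancelledAt.
// Requires Go 1.18 or later for generics.
func ptr[T any](v T) *T {
	return &v
}
```

Comparing with a tolerance matters because a conversion such as 100 × 0.92 is not guaranteed to come out as exactly 92 in float64 arithmetic.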

The Reconciliation Job

Design

After the fix was deployed and the numbers started looking correct, I wanted a safety net. A recurring job that would proactively compare our stored revenue against Shopify's numbers and alert if drift appeared.

The design was straightforward: a daily Lambda that runs on a cron, picks a sample of active shops, fetches their last 7 days of orders from Shopify, recalculates revenue, and compares against our database. Any discrepancy above 0.1% triggers a Slack alert with the shop ID and affected dates.

reconciliation/job.go

type ReconciliationJob struct {
    shopRepo      ShopRepository
    reportRepo    ReportRepository
    shopifyClient ShopifyClient
    alerter       Alerter
}

func (j *ReconciliationJob) Run(ctx context.Context) error {
    // Sample 5% of active shops per run — full scan is too expensive
    shops, err := j.shopRepo.GetActiveSample(ctx, 0.05)
    if err != nil {
        return fmt.Errorf("sample active shops: %w", err)
    }

    var alerts []DiscrepancyAlert
    for _, shop := range shops {
        discrepancies, err := j.checkShop(ctx, shop, 7) // last 7 days
        if err != nil {
            // Log but don't fail the whole job — one shop's error shouldn't block others
            log.Error("reconciliation check failed", "shop_id", shop.ID, "err", err)
            continue
        }
        alerts = append(alerts, discrepancies...)
    }

    if len(alerts) > 0 {
        return j.alerter.Send(ctx, alerts)
    }
    return nil
}

func (j *ReconciliationJob) checkShop(ctx context.Context, shop Shop, days int) ([]DiscrepancyAlert, error) {
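    loc, err := time.LoadLocation(shop.Timezone)
    if err != nil {
        return nil, fmt.Errorf("load timezone %s: %w", shop.Timezone, err)
    }
    now := time.Now().In(loc)
    // Local midnight today. Truncate(24 * time.Hour) would round to a UTC day boundary instead.
    today := time.Date(now.Year(), now.Month(), now.Day(), 0, 0, 0, 0, loc)

    var alerts []DiscrepancyAlert
    for d := 0; d < days; d++ {
        date := today.AddDate(0, 0, -d)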

        orders, err := j.shopifyClient.GetOrdersForDate(ctx, shop.ID, date, loc)
        if err != nil {
            return nil, fmt.Errorf("fetch orders for day -%d: %w", d, err)
        }

        actual := CalculateRevenue(orders, shop.Currency)
        stored, err := j.reportRepo.GetDailyRevenue(ctx, shop.ID, date)
        if err != nil {
            return nil, fmt.Errorf("get stored revenue: %w", err)
        }

        // 0.001 is the 0.1% threshold from the design, assuming discrepancyPct returns a fraction.
        if discrepancyPct(actual.NetRevenue, stored.NetRevenue) > 0.001 {
            alerts = append(alerts, DiscrepancyAlert{
                ShopID:   shop.ID,
                Date:     date,
                Actual:   actual.NetRevenue,
                Stored:   stored.NetRevenue,
                DeltaPct: discrepancyPct(actual.NetRevenue, stored.NetRevenue),
            })
        }
    }
    return alerts, nil
}
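
discrepancyPct is not shown either. Since the job compares its result against 0.001 for a 0.1% threshold, the sketch below assumes it returns a fraction rather than a percentage, and it guards against dividing by zero on days with no revenue. Using the larger of the two magnitudes as the denominator is my choice for this sketch.

```go
package main

import "math"

// discrepancyPct returns the relative difference between actual and stored
// as a fraction (0.001 means 0.1%).
func discrepancyPct(actual, stored float64) float64 {
	denom := math.Max(math.Abs(actual), math.Abs(stored))
	if denom == 0 {
		return 0 // both sides are zero, so there is nothing to compare
	}
	return math.Abs(actual-stored) / denom
}
```

With this definition a day where Shopify says 0 but the database holds any non-zero revenue comes out as a 100% discrepancy, which is what you want from an alerting job.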

Why It Got Rejected

I was fairly proud of this. Clean code, proper sampling, alerting on drift. I opened the PR and waited for approval.

It got rejected. Not because of a technical flaw. Not because of operational trade-offs. The tech lead simply didn't agree with the approach — he believed the existing incremental logic could be patched further, and that a full reconciliation was overkill.

I explained the context. The incremental approach had accumulated 18 months of patches — each one fixing a symptom while introducing new edge cases. The root problem was architectural: you can't reliably compute financial totals from deltas when the deltas themselves are unreliable (missed webhooks, out-of-order events, partial data).

The response was firm: rejected. I closed the PR and moved on.

What Happened Next

Over the next two weeks, the tech lead attempted to fix the revenue accuracy himself using the incremental patching approach. What followed was... educational.

The public status page tells the story. Over those two weeks:

13 production releases deployed trying to fix cascading issues
Ghost/phantom records inflated merchant sales by $5K–$20K per shop
357,000 historical records had to be backfilled due to missing refund timestamps
Duplicate report entries created by race conditions in the patched sync logic
Discount allocation gaps introduced by yet another incremental patch

Each patch fixed one symptom and introduced another. The exact failure mode I had warned about.

After two weeks of firefighting, the team quietly adopted the same approach I had proposed: recalculate from source of truth. Fetch raw order data from Shopify, compute totals from scratch, replace the stored values. The status page even mentions "a comprehensive redesign leveraging Shopify's Agreements API" — which is essentially what my PR did, with a different API surface.

I'm not bitter about it. This happens in engineering teams more often than we like to admit. Sometimes the decision-maker doesn't have full context. Sometimes ego gets in the way of evaluating a solution on its technical merits. The important thing is: the right solution won in the end. It just took two extra weeks of production pain to get there.

Results

After deploying the source-of-truth rewrite and running the audit tool across all shops to backfill corrected figures:

Metric	Before	After
Revenue accuracy (vs. Shopify admin)	~94%	100%*
Days with discrepancy >0.1%	~6% of shop-days	<0.03% (float precision)
Partial refund double-counting	Affected 40% of shops	Eliminated
Multi-currency drift	Unpredictable	Zero (fixed-rate at order time)
Order edit events handled	No	Yes (recalc on any event)
Unit test coverage on revenue calc	0 tests	23 test cases
Code complexity (revenue handler)	~120 LOC, 6 patches	30 LOC handler + pure function

*100% accuracy assumes Shopify webhooks are delivered correctly and MongoDB is consistent. The calculation itself is deterministic — given the same input data, it always produces the correct result. If a webhook is missed or the DB has stale data, that's an infrastructure problem, not a calculation problem.

The merchant support tickets about revenue discrepancies dropped to zero within two weeks of the fix. That's the real signal.

Key Takeaways

Financial data must recalculate from source of truth

Incremental delta-based updates are fragile for financial data. Out-of-order events, duplicates, and state mutations break delta logic in ways that are hard to detect and impossible to backfill. Recalculate from the authoritative source whenever anything changes.

Idempotency is more valuable than efficiency for correctness-critical paths

Yes, fetching all orders for a day on every event is more expensive than applying a delta. But it's safe to retry, safe to run twice, and automatically self-corrects. The cost difference is small. The correctness difference is everything.

Pure functions for financial calculations

A pure function with no I/O is the only way to confidently unit test financial logic. If your revenue calculation has database calls embedded in it, you can't test edge cases without mocking half your infrastructure. Extract the math into a pure function first.

Timezone handling is never trivial

UTC timestamps and merchant local time diverge at day boundaries. Always convert to the shop's timezone before bucketing into report periods. Never assume UTC. This is the most common and most silent revenue bucketing bug.

Don't trust webhook payload fields for financial totals

Shopify's order.TotalRefunded can be stale on webhook payloads — it reflects the state at webhook delivery time, not the true current state. Walk the order.Refunds[] array and sum transactions yourself. Same applies to order.TotalPrice after an order edit.

Fix old data, don't just fix going forward

A bug fix that only applies to new events leaves months of wrong data in the database. Build the backfill into the fix from day one. Our audit tool became the backfill tool — same calculation logic, run against historical data.
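
A backfill built this way is little more than a loop over dates. The sketch below is a simplified stand-in, not our audit tool: Order, fetchFunc, upsertFunc, and netRevenue are hypothetical placeholders for the Shopify client, the report repository, and CalculateRevenue, and days are passed as YYYY-MM-DD keys to keep the example small.

```go
package main

import (
	"fmt"
	"time"
)

// Order is a toy order with just enough fields for the example.
type Order struct {
	TotalPrice float64
	Refunded   float64
}

type fetchFunc func(dateKey string) ([]Order, error)
type upsertFunc func(dateKey string, netRevenue float64) error

// netRevenue is a toy pure calculation standing in for CalculateRevenue.
func netRevenue(orders []Order) float64 {
	var net float64
	for _, o := range orders {
		net += o.TotalPrice - o.Refunded
	}
	return net
}

// backfill recalculates every day in [from, to] and overwrites the stored value.
// It returns the number of days processed before any error.
func backfill(from, to string, fetch fetchFunc, upsert upsertFunc) (int, error) {
	const layout = "2006-01-02"
	start, err := time.Parse(layout, from)
	if err != nil {
		return 0, fmt.Errorf("parse from date: %w", err)
	}
	end, err := time.Parse(layout, to)
	if err != nil {
		return 0, fmt.Errorf("parse to date: %w", err)
	}
	days := 0
	for d := start; !d.After(end); d = d.AddDate(0, 0, 1) {
		key := d.Format(layout)
		orders, err := fetch(key)
		if err != nil {
			return days, fmt.Errorf("fetch %s: %w", key, err)
		}
		if err := upsert(key, netRevenue(orders)); err != nil {
			return days, fmt.Errorf("upsert %s: %w", key, err)
		}
		days++
	}
	return days, nil
}
```

Because each day is overwritten rather than adjusted, the loop can be interrupted and restarted from any date without corrupting anything.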

The right solution wins eventually — even if it gets rejected first

My source-of-truth approach was rejected. Two weeks and 13 production releases later, the team adopted the same approach. Don't let ego — yours or someone else's — delay the correct fix. Defend your solution with data, and if you're overruled, let reality do the convincing.

Timeline

Week 1

Revenue discrepancies reported by merchants.

Support tickets: "TrueProfit shows different revenue than Shopify admin." Investigation begins.

Week 1–2

Audit tool built and run across 50 shops.

~6% of shop-days show discrepancy >0.1%. Four root causes identified: timezone bucketing, partial refund double-counting, multi-currency drift, dropped order edits.

Week 2

Decision: patch vs. rewrite.

Team discussion. Decided to rewrite the revenue handler around source-of-truth recalculation rather than adding more patches. Estimated 1 week of work.

Week 3

Rewrite implemented and tested.

Pure CalculateRevenue function written with 23 unit tests. New handler: 30 LOC. All four root causes handled.

Week 3

Deployed to production + historical backfill.

Audit tool repurposed to backfill 90 days of corrected revenue data across all shops. Revenue accuracy: ~94% → 100% (given correct webhook delivery & DB consistency).

Week 4

Source-of-truth approach + reconciliation job PR opened.

Complete solution: recalculate from raw Shopify data + proactive drift detection. Clean implementation, full test coverage.

Week 4

PR rejected by tech lead.

Tech lead chose to continue with incremental patching approach instead. PR closed.

Week 5–6

Production incident: 13 releases, ghost records, 357K backfills.

Incremental patches caused cascading failures. Ghost records inflated sales $5K–$20K per shop. Public status page documented the chaos.

Week 7

Team adopts source-of-truth approach.

The same recalculation-from-raw-data strategy from the rejected PR. Revenue accuracy restored to 100%.

Week 8+

Zero merchant revenue discrepancy tickets.

Revenue numbers stable. Trust rebuilt. System running correctly on idempotent source-of-truth recalculation.