Field note № 002

The five reports your CFO is wrong about

11 min read · Published June 1, 2026

The number that doesn't survive the question

The call usually comes the morning after a board meeting. The CMO put a revenue slide up and the CFO put a different one up, and for a minute the room stopped talking about strategy and started relitigating arithmetic. By Wednesday somebody books a discovery call. The first sentence of that call almost never survives a follow-up question. "Our blended CAC is forty-three dollars." "Our ROAS on Meta is four-point-two." Both are technically defensible, and neither is the same number twice. The claims have a lineage nobody at the table can trace, and a CFO who has read the same dashboard three times eventually stops believing it.

The five reports your CFO is wrong about are almost never wrong because the metric itself is bad. They are wrong because the definition was never written down, and the lineage from raw source to slide was never traced. The dashboard is the receipt. The actual error is upstream, in the conversation that didn't happen the day a sales engineer hand-built the first attribution view in a notebook and emailed it to a marketing lead, who forwarded it to a CFO who put a number in a deck.

We get pulled into these rooms after the trust has already cracked. The board challenged a number, finance went off to reconcile it, and the reconciliation took three weeks. The answer was technically defensible, but the conversation never came back. Now the CMO is hedging every claim with "according to GA4" or "per the platform," and the CFO is rebuilding the math in a spreadsheet on Monday mornings. The wry truth is that a company can run for years on a dashboard nobody believes; it just cannot make decisions on one.

The pattern we keep finding is not five broken metrics. It is five reports where the definition has drifted, the lineage is opaque, and the people quoting them are quoting them from memory. What follows is the five we keep replacing in the first ninety days of a Diagnose engagement, and what we put in their place once the room finally agrees on what the numbers actually are.

The five we keep replacing

Blended CAC

Blended CAC reads as one of the most defensible numbers on a marketing dashboard, because it sounds like an average and averages feel honest. Total spend on the top, new customers on the bottom, divide. The problem is almost never the math. The problem is that "total spend" is being read from the ad platforms' billed dollars, "new customers" is being read from a Shopify or Stripe extract that uses a different date logic, and neither side has agreed on what a customer is. We have audited shops where the CMO is reading blended CAC against new email subscribers, the CFO is reading it against first-paid-order, and the analyst rebuilding the view every Monday is reading it against first-completed-shipment with returns netted in the original cohort. All three numbers are reasonable, and none of them is the same number. The CFO is wrong about blended CAC because the definition the CFO is defending is not the definition the CMO is reporting, and nobody in the building owns the gap.
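A minimal sketch of the arithmetic: one spend figure divided by three different readings of "new customer" gives three different blended CACs. Every figure and name below is invented for illustration and is not taken from any audit.

```python
# Illustrative only: every figure and name here is hypothetical.
TOTAL_SPEND = 86_000.00  # billed ad spend for the period, in dollars

# Three competing definitions of "new customer" for the same period.
NEW_CUSTOMERS = {
    "new_email_subscribers": 4_000,      # the CMO's reading
    "first_paid_orders": 2_000,          # the CFO's reading
    "first_completed_shipments": 1_600,  # the analyst's reading, returns netted
}


def blended_cac(total_spend: float, new_customers: int) -> float:
    """Total spend divided by new customers, under one stated definition."""
    if new_customers <= 0:
        raise ValueError("new_customers must be positive")
    return total_spend / new_customers


cac_by_definition = {
    name: round(blended_cac(TOTAL_SPEND, count), 2)
    for name, count in NEW_CUSTOMERS.items()
}

for name, cac in cac_by_definition.items():
    print(f"{name}: ${cac:.2f}")
```

The division is identical in all three cases; only the denominator's definition changes, which is why the gap cannot be closed by checking the math.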

Last-click ROAS

Every paid-media review meeting we sit through has the same moment. The Meta dashboard says four-point-two, the Google dashboard says three-point-eight, and the internal ROAS view, the one the head of growth quotes in the all-hands, says two-point-nine. Each is technically right inside its own attribution window. GA4's seven-day-click conversion view, the Meta Ads Manager view, and the warehouse-computed view are simply three different questions wearing the same name. The CFO is wrong about ROAS not because the platforms lie (although Meta CAPI's deduplication logic has its own honest opinions about which conversion to keep) but because the room is debating a number whose definition floats. Pin the definition to a specific touchpoint window, a specific revenue source, and a specific exclusion logic, and the same sources begin to agree to within a couple of percent. That is usually the moment the meeting changes tone.
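As a rough illustration of what pinning a definition means, the sketch below computes ROAS from one set of conversions under an explicit click window and refund rule. The `Conversion` and `RoasDefinition` types, the figures, and the assumption that all revenue comes from a single warehouse orders table are hypothetical; this does not reproduce any platform's attribution logic.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Conversion:
    days_since_click: int
    revenue: float
    refunded: bool


@dataclass(frozen=True)
class RoasDefinition:
    click_window_days: int   # how long after a click a sale still counts
    exclude_refunded: bool   # whether refunded orders are dropped


def roas(conversions: Iterable[Conversion], spend: float,
         definition: RoasDefinition) -> float:
    """Attributed revenue divided by spend, under one explicit definition."""
    if spend <= 0:
        raise ValueError("spend must be positive")
    attributed = sum(
        c.revenue
        for c in conversions
        if c.days_since_click <= definition.click_window_days
        and not (definition.exclude_refunded and c.refunded)
    )
    return attributed / spend


# Hypothetical data: one channel, one period.
SPEND = 1_000.00
CONVERSIONS = [
    Conversion(days_since_click=0, revenue=1_500.00, refunded=False),
    Conversion(days_since_click=3, revenue=1_400.00, refunded=False),
    Conversion(days_since_click=5, revenue=500.00, refunded=True),
    Conversion(days_since_click=12, revenue=800.00, refunded=False),
]

wide_window = roas(CONVERSIONS, SPEND, RoasDefinition(28, exclude_refunded=False))
seven_day = roas(CONVERSIONS, SPEND, RoasDefinition(7, exclude_refunded=False))
seven_day_net = roas(CONVERSIONS, SPEND, RoasDefinition(7, exclude_refunded=True))

print(f"28-day, gross: {wide_window:.1f}")
print(f"7-day, gross:  {seven_day:.1f}")
print(f"7-day, net:    {seven_day_net:.1f}")
```

The same four conversions produce three ROAS figures. Once the window and the exclusion rule are passed in explicitly rather than implied by the tool, any two systems given the same definition have to return the same number.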

Session conversion rate

Session conversion rate is the metric that quietly drifts the furthest the fastest. GA4 changed its session boundary in 2024 and most teams never re-baselined. Engineering shipped a single-page-app refactor in the third quarter and the session count halved without anyone touching the goal definition. Marketing launched a paid push that drove low-intent traffic, conversion fell two points in a week, and an emergency meeting got called about a landing page that was working fine. Session conversion rate is rarely wrong as a calculation. It is just measuring a denominator that keeps changing for reasons unrelated to the funnel. The CFO is wrong about it because the CFO is reading a percentage and assuming the denominator is constant; meanwhile the data team is silently aware that the denominator moved twice this quarter, and that the chart is, strictly speaking, a chart of the denominator.
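A small worked example of the denominator effect, using invented figures: orders are held constant while the counted sessions halve, as they might after a change in how sessions are bounded.

```python
import math


def conversion_rate(orders: int, sessions: int) -> float:
    """Orders divided by sessions; the funnel itself is not modeled here."""
    if sessions <= 0:
        raise ValueError("sessions must be positive")
    return orders / sessions


# Hypothetical figures: the same orders, counted against two session totals.
ORDERS = 300
SESSIONS_BEFORE = 20_000
SESSIONS_AFTER = 10_000  # session count halves; nothing in the funnel changed

rate_before = conversion_rate(ORDERS, SESSIONS_BEFORE)
rate_after = conversion_rate(ORDERS, SESSIONS_AFTER)

print(f"before: {rate_before:.2%}, after: {rate_after:.2%}")
assert math.isclose(rate_after, 2 * rate_before)
```

The reported rate doubles although not one additional order was placed, which is the sense in which the chart is a chart of the denominator.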

NPS roll-ups

NPS is the report that survives executive scrutiny the longest because the headline number is small, the calculation is simple, and the response volume is usually high enough to feel statistical. The roll-up is the part that breaks. We have seen NPS roll-ups quoted at the company level that were weighted by survey response volume, which over-indexes the small product lines that send the most surveys, and we have seen NPS roll-ups quoted at the company level that were weighted by revenue, which over-indexes the enterprise accounts and washes out the SMB cohort the product team is actually trying to fix. We have also seen NPS roll-ups quoted with no weighting scheme at all, which is its own answer. The CFO is wrong about NPS because the headline number is a weighted average whose weights nobody specified, and the weights are doing all of the work.
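To make the weighting point concrete, here is a sketch that rolls the same two segment scores up three ways. The segments, scores, response counts, and revenue figures are all invented for illustration.

```python
from typing import Dict, Optional

# Hypothetical segments: per-segment NPS, survey responses, and revenue.
SEGMENTS: Dict[str, Dict[str, float]] = {
    "smb": {"nps": 10, "responses": 800, "revenue": 1_000_000},
    "enterprise": {"nps": 50, "responses": 200, "revenue": 4_000_000},
}


def rolled_up_nps(segments: Dict[str, Dict[str, float]],
                  weight_key: Optional[str] = None) -> float:
    """Weighted average of segment NPS; weight_key=None weights segments equally."""
    weighted_sum = 0.0
    total_weight = 0.0
    for segment in segments.values():
        weight = 1.0 if weight_key is None else float(segment[weight_key])
        weighted_sum += segment["nps"] * weight
        total_weight += weight
    if total_weight == 0:
        raise ValueError("total weight must be non-zero")
    return weighted_sum / total_weight


by_responses = rolled_up_nps(SEGMENTS, "responses")
by_revenue = rolled_up_nps(SEGMENTS, "revenue")
unweighted = rolled_up_nps(SEGMENTS)

print(f"response-weighted: {by_responses:.0f}")
print(f"revenue-weighted:  {by_revenue:.0f}")
print(f"equal-weighted:    {unweighted:.0f}")
```

With these inputs the company-level number is 18, 42, or 30 depending only on the weights, so a headline NPS quoted without its weighting scheme is not yet a single number.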

"Active users" without a definition

Every product dashboard has an "active users" line. We have audited products where active means "had a session in the last 28 days," products where active means "completed a billable event in the last 7 days," and products where active means "is not in a deactivated billing state and logged in once last quarter." A growth team will optimize against any of those and the number will move; a CFO will read the same line on a slide and assume it means the third one. We do not say it on the kickoff call, but the most expensive dashboard in any SaaS company is usually the one labelled "active users." The CFO is wrong about active users because the active-user line on the slide is one of three possible definitions, and the slide did not say which.
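The three definitions can be written as three predicates over the same user records. This is a sketch under stated assumptions: the `User` fields and sample users are hypothetical, a session is treated as a login, and "last quarter" is approximated as the trailing 90 days.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

AS_OF = date(2026, 6, 1)  # hypothetical reporting date


@dataclass(frozen=True)
class User:
    user_id: str
    last_session: Optional[date]
    last_billable_event: Optional[date]
    billing_deactivated: bool


def within_days(event: Optional[date], days: int) -> bool:
    """True if the event happened in the trailing `days` days before AS_OF."""
    return event is not None and 0 <= (AS_OF - event).days <= days


def active_by_session_28d(u: User) -> bool:
    return within_days(u.last_session, 28)


def active_by_billable_7d(u: User) -> bool:
    return within_days(u.last_billable_event, 7)


def active_by_billing_state_90d(u: User) -> bool:
    # Assumption: a session counts as a login; a quarter is taken as 90 days.
    return not u.billing_deactivated and within_days(u.last_session, 90)


def days_ago(n: int) -> date:
    return AS_OF - timedelta(days=n)


USERS = [
    User("a", days_ago(3), days_ago(2), False),
    User("b", days_ago(20), days_ago(40), False),
    User("c", days_ago(60), None, False),
    User("d", days_ago(5), days_ago(5), True),
    User("e", days_ago(80), None, False),
]

active_counts = {
    "session_28d": sum(active_by_session_28d(u) for u in USERS),
    "billable_7d": sum(active_by_billable_7d(u) for u in USERS),
    "billing_state_90d": sum(active_by_billing_state_90d(u) for u in USERS),
}
print(active_counts)
```

Five users yield three different "active users" totals, and each predicate is a reasonable reading of the label. Naming the predicate on the slide is the whole fix.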

It's almost never the metric. It's the definition.

The instinct, when a number breaks, is to debate the methodology: move to a modeled attribution view, adopt Meta's view of the world, switch from GA4 to Segment. We have sat in rooms where the question on the table was "which BI tool should we buy" and the answer the team needed was "which definition of revenue does the company actually use." A new tool will not fix a metric whose definition floats; it will just provide a fourth version of the same disagreement.

A multi-brand CPG holding company rebuilt its cross-channel attribution model three times in eighteen months. Each rebuild was technically more sophisticated than the last: last-click first, then a position-based hybrid, then a Markov-chain removal-effect model with per-channel calibration weights. The model improved each time; the arguments did not. By the time we were brought in, three brands inside the same holding company were running their media reviews against three different versions of the same model because nobody had written down which decisions the model was supposed to defend. The fix was not a fourth attribution model. It was a one-page measurement plan, signed by the brand leads and the CFO, naming the seven decisions the attributed-revenue number had to support each quarter and the smallest signal that would change course. We documented that work at /work/cross-channel-attribution; the current model has survived two more CMO changes.

A healthtech marketplace was reporting blended CAC payback of fourteen months while finance was reporting blended CAC payback of nine. The gap was twenty-eight percent on cohort revenue, and neither team thought the other was lying. Marketing was counting an "activated patient" as a completed signup. Finance was counting completed-net revenue, which excluded patients who cancelled in the trial window and netted returns into the cohort the patient was acquired in. Both were measuring something real, and neither had written the definition down. We rebuilt the cohort-economics mart at the grain of cohort by channel by month-since-acquisition, with cumulative net revenue against fully-loaded spend, and the first sponsor sign-off was a one-page metric contract on what an activated patient was. CAC payback compressed from fourteen months to six once two unprofitable channels got cut, but the actual deliverable was the page the CFO signed. The work lives at /work/marketplace-cohort-economics, and the model is still in production.

What both engagements have in common is the diagnosis. The metric was not the problem. The metric was a faithful implementation of a definition nobody had written down. Replace the definition with a signed one, trace the lineage from source to slide, and the metric mostly does the right thing on its own. We keep waiting for an engagement where the methodology really is the bottleneck; we have not had one yet.

Writing the metric contract

The artifact that does the work is unglamorous. It is a single markdown file, usually called metric_contract.md, that names every metric on every executive dashboard. Each entry carries the SQL source naming the dbt model the panel reads from, a business definition in English plain enough that a new finance hire can read it on day one, the refresh cadence set against decision cadence rather than compute cost, a named owner who is one human and not a Slack channel, the on-track and off-track thresholds written against historical baselines rather than ambition, and a dated sign-off line carrying the name of the executive now responsible for defending the number in the next board meeting.
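The fields listed above can also be expressed as a small record with a completeness check, which is one way to lint a contract before anyone signs it. This is a sketch, not the format of any actual metric_contract.md: the class, the field names, the rule that an owner starting with "#" is a channel, and the sample entries (including the dbt model name) are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class MetricContractEntry:
    name: str
    dbt_model: str               # the model the dashboard panel reads from
    business_definition: str     # plain English, readable on day one
    refresh_cadence: str         # set against decision cadence
    owner: str                   # one named human
    on_track_threshold: float
    off_track_threshold: float
    signed_off_by: Optional[str] = None
    signed_off_on: Optional[date] = None


def contract_gaps(entry: MetricContractEntry) -> List[str]:
    """Return the reasons this entry is not yet a signed, ownable contract."""
    gaps: List[str] = []
    if not entry.dbt_model.strip():
        gaps.append("missing dbt model")
    if not entry.business_definition.strip():
        gaps.append("missing business definition")
    if not entry.owner.strip() or entry.owner.startswith("#"):
        gaps.append("owner must be one named person, not a channel")
    if entry.signed_off_by is None or entry.signed_off_on is None:
        gaps.append("missing dated sign-off")
    return gaps


# Hypothetical entries: one complete, one still a draft.
signed = MetricContractEntry(
    name="blended_cac",
    dbt_model="mart_cohort_economics",
    business_definition="Fully-loaded spend divided by first paid orders.",
    refresh_cadence="weekly",
    owner="Jane Doe",
    on_track_threshold=45.0,
    off_track_threshold=60.0,
    signed_off_by="CFO",
    signed_off_on=date(2026, 6, 1),
)
draft = MetricContractEntry(
    name="active_users",
    dbt_model="mart_product_usage",
    business_definition="",
    refresh_cadence="daily",
    owner="#growth-analytics",
    on_track_threshold=0.0,
    off_track_threshold=0.0,
)

signed_gaps = contract_gaps(signed)
draft_gaps = contract_gaps(draft)
print(signed_gaps)
print(draft_gaps)
```

A check like this does not make a definition correct. It only makes it visible when a metric reaches a dashboard without a definition, an owner, or a signature.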

We pair the contract with a traced lineage. Every metric in the contract gets walked from raw source through the staging layer, through the marts, into the BI panel where the executive reads it. Where the trace breaks, we mark it. Most metric contracts ship with a list of breakages in their first week, because the act of tracing is what surfaces them. dbt source freshness, configured against the SLA the contract names, then keeps the trace honest after we leave. When a source breaks, the named owner gets the alert. When the alert gets ignored, the contract has a follow-up clause.
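dbt's source freshness check compares the most recent load time of a source against warn and error thresholds. The sketch below shows that comparison in plain Python as an illustration of the idea; it is not dbt's implementation, and the function name, status labels, thresholds, and timestamps are assumptions.

```python
from datetime import datetime, timedelta, timezone


def freshness_status(loaded_at: datetime, now: datetime,
                     warn_after: timedelta, error_after: timedelta) -> str:
    """Classify a source by the age of its latest load against two thresholds."""
    if error_after < warn_after:
        raise ValueError("error_after must not be shorter than warn_after")
    age = now - loaded_at
    if age > error_after:
        return "error"
    if age > warn_after:
        return "warn"
    return "pass"


# Hypothetical SLA taken from a contract entry: warn at 6 hours, error at 24.
NOW = datetime(2026, 6, 1, 12, 0, tzinfo=timezone.utc)
WARN_AFTER = timedelta(hours=6)
ERROR_AFTER = timedelta(hours=24)

fresh = freshness_status(NOW - timedelta(hours=2), NOW, WARN_AFTER, ERROR_AFTER)
stale = freshness_status(NOW - timedelta(hours=8), NOW, WARN_AFTER, ERROR_AFTER)
broken = freshness_status(NOW - timedelta(hours=30), NOW, WARN_AFTER, ERROR_AFTER)

print(fresh, stale, broken)
```

The design point is that the thresholds come from the contract's stated SLA rather than from whatever the pipeline happens to achieve, so a "warn" is a statement about the decision cadence, not about compute.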

We grade the resulting fix list by severity, because not every broken metric is a fire. P0 is reserved for revenue-at-risk or actively wrong numbers in front of an executive; P1 covers the metrics whose trust is eroding but where the bleeding is slow; P2 is polish, hygiene, and future-proofing. Most engagements ship with three or four P0s, ten or twelve P1s, and a longer tail of P2s. The CFO defends the P0 list first. The data team owns the P1 list across the next quarter. The P2 list seeds the next round of work, or it doesn't, and that is also a defensible call.

This is the spine of every Diagnose engagement we run. Two weeks, a written audit, a signed metric contract, a severity-graded fix list, and an honest build-buy-kill verdict on every tool. The metrics on the slides stop being a private argument and start being a defensible artifact. The CFO stops being wrong about the five reports, not because the metrics changed, but because the room finally agreed which question each one is answering.

If you are recognising the meeting in this piece, the work to fix it is smaller than you think, and the surprise on the other side is that it is mostly clerical: write the definition, trace the lineage, sign the page. The five reports stop being wrong the same week. The longer arc, where this work fits in the maturity model that lets a stack go from broken to defensible, lives in our field guide.