How to Audit 12 Months of Broken UTM Data Before Your QBR
Your QBR is in a week and the channel numbers do not reconcile. Here is the fast, honest way to audit a year of UTM data before you present it.
Your quarterly business review is a week out, and the attribution slide does not reconcile. Paid social shows three different utm_source values, half your email is sitting in Unassigned, and the channel totals do not match what finance has. The data is not wrong about reality — it is fragmented, because twelve months of links were written twelve different ways.
Why twelve months of UTM data drifts#
Drift is not carelessness — it is the default behavior of any shared link process. Spreadsheets stop enforcing consistency once more than two people touch them, and a year is plenty of time for three contractors, a new hire, and an agency to each invent their own spelling for the same channel. SEMrush (2024) found that 67% of teams use UTMs but only 58% have a documented strategy, and 30% skip UTM markup in more than 30% of their campaigns. The gap between "we use UTMs" and "we use them the same way" is exactly where your QBR numbers go to die.
Two GA4 mechanics turn that gap into broken reports. First, GA4 is case-sensitive: email, Email, and EMAIL are three different mediums, so one channel quietly splits into three rows. Second, any hit GA4 cannot map to a channel lands in Unassigned — commonly 10–20% of sessions — which is where un-tagged and mis-tagged links pile up. When the data is this fragmented, as much as 26% of conversions can be credited to the wrong channel, which is the difference between renewing a budget and cutting it.
The point of the audit
You are not trying to fix every row. You are trying to find the three or four clusters that move the channel totals, fix those, and document the rule so they do not come back. That is what makes a year of data presentable in an afternoon.
The six-step pre-QBR UTM audit#
This is the same sequence whether you do it by hand or with a UTM audit tool. The difference is only speed: the manual version is a spreadsheet weekend, the automated version is a scan plus a review. The steps do not change.
- 1
Export your source/medium report
In GA4, open Reports → Acquisition → Traffic acquisition, set the date range to the last 12 months, add
session_source/session_medium/session_campaign, and export to CSV. This is your raw evidence — every spelling that actually reached your property. - 2
Grade it
Paste or upload the CSV and get an A–F health grade with a one-sentence reason. The grade gives you a number for the QBR ("we started the quarter at a C, we are at a B now") and tells you how deep the cleanup goes before you spend any time on it.
- 3
Cluster the variants
Review the issue clusters ranked by traffic impact. This is where the Anthropic Claude API groups
fb,Facebook,facebook, andmetainto one canonical channel and surfaces case duplicates and typos. The clustering is bounded and reviewable — it proposes, you decide. - 4
Approve the canonical taxonomy
Accept the suggested canonical values in one click — one lowercase token per source, medium, and campaign. Override anything that does not match how your team actually talks. Nothing is written to GA4 or your ad platforms; you are approving a map, not pushing changes.
- 5
Re-map the history
GA4 hits are immutable, so you collapse the duplicates where you report. Take the variant→canonical map into Looker Studio, BigQuery, or your QBR spreadsheet so
Facebookandfacebookfinally sum to one channel. New links go through a governed builder so the drift stops at the source instead of being re-cleaned next quarter. - 6
Document and share
Export the branded PDF health report and share the link before the meeting, so the channel owners see the same grade you do. Save the locked taxonomy as the team's reference. That document is what turns a one-time cleanup into a standard.
The grade is deliberately blunt because a QBR audience does not want a 40-tab workbook — they want to know whether the numbers can be trusted and what changed. A letter does that. Underneath it, the cluster list is where the actual GA4 UTM cleanup happens, and that is the next step to look at closely.
What the cluster step actually finds#
Here is what a year of drift looks like in the raw export. Every one of these reached GA4 as a separate row, and every one was meant to be the same two channels: paid Facebook and the monthly newsletter.
| What GA4 recorded | Canonical value | Why it split |
|---|---|---|
| facebook / Facebook / fb / meta | Case + nickname drift across four people | |
| cpc / CPC / paid-social | cpc | Case mismatch and a medium synonym |
| spring_sale / Spring-Sale / springsale | spring_sale | Separator and casing inconsistency |
| newsletter / Newsletter | newsletter | A single capital letter forks the source |
| email / Email / EMAIL | GA4 treats all three as different mediums |
Once the map is applied, the four Facebook rows become one and the two newsletter rows merge — and your QBR channel totals finally match. The deeper mechanics of why a single capital letter forks a channel are covered in our note on a UTM naming convention your whole team will actually follow.
Days of spreadsheet work, compressed honestly#
It is fair to ask what you actually save. The honest answer: the scan runs synchronously for up to roughly 500 links and returns a grade in about a minute, and the first public scan has no signup wall. You still spend a few minutes reading the clusters and deciding the canonical names — that judgment is yours, not the tool's. What disappears is the part nobody enjoys: building pivot tables, eyeballing thousands of rows, and arguing about whether paid-social and cpc are the same thing.
The spreadsheet weekend
- Manually pivot 12 months of rows by source and medium
- Eyeball thousands of values to spot case duplicates
- Argue over canonical names in a thread, with no record
- Re-do it next quarter because nothing was documented
A one-minute audit + a few clicks
- Paste the CSV → A–F grade and ranked clusters in ~60s
- Variants pre-grouped by traffic impact for review
- Approve canonical values once; export a branded PDF
- Governed builder + recurring audits keep it clean
26%
of conversions can be credited to the wrong channel when UTM data is fragmented
10–20%
of GA4 sessions commonly land in Unassigned
29%
better attribution accuracy after standardizing (Cometly, 2026)
Common mistakes that sink a pre-QBR cleanup#
- Trying to fix every row instead of the few clusters that move channel totals. Sort by traffic impact and stop when the curve flattens.
- Assuming GA4 will merge
Facebookandfacebookfor you. It will not — case-sensitivity is a feature, and you must canonicalize them yourself. - Cleaning the export but leaving the link process untouched, so the same variants reappear before the next QBR. Fix forward with a governed builder.
- Renaming sources in a way no teammate agreed to, then watching everyone ignore the new convention. Document the canonical map and share it.
- Forgetting Unassigned. Un-tagged links never show up in your UTM columns at all — you have to look at where those Unassigned sessions came from separately.
- Presenting a clean quarter with no plan to keep it clean. Pair the audit with automated drift detection so next quarter starts ahead.
Your pre-QBR UTM audit checklist
- Exported the last 12 months of source / medium / campaign from GA4 to CSV
- Ran the audit and recorded the starting A–F grade for the slide
- Reviewed every cluster ranked above ~5% of broken sessions
- Approved one canonical token per source, medium, and campaign
- Built the variant→canonical map and applied it in Looker / BigQuery / the QBR sheet
- Checked the Unassigned bucket for un-tagged campaigns
- Exported the branded PDF and shared the public report with channel owners
- Saved the locked taxonomy and routed new links through the governed builder
- Scheduled a recurring audit so drift is caught before the next review
How do I audit a year of UTM data before a QBR without spending the weekend on it?
Export the last 12 months of source/medium/campaign from GA4 as a CSV, then run a single audit instead of cleaning rows by hand. You get an A–F grade and the issue clusters ranked by traffic impact in about a minute, review the few clusters that matter, approve the canonical values, and apply that map in your reporting layer. The judgment is yours; the grinding is gone.
Can I clean up UTM data that is already in GA4?
Not in GA4 itself — historical hits are immutable, so Facebook and facebook stay as separate recorded values forever. The practical fix is to re-map the canonical values where you report (Looker Studio, BigQuery, or your QBR spreadsheet) so the duplicates sum correctly, and to stop the drift going forward with a governed link builder. The audit gives you the exact variant→canonical map to apply.
How long does a UTM audit actually take?
The scan runs synchronously for up to roughly 500 links and returns a grade in about a minute, with no signup wall on the first public scan. Reviewing the clusters and approving canonical names adds a few more minutes depending on how messy a year has been. The slow part of a manual cleanup — pivoting and eyeballing thousands of rows — is what the tool removes.
What counts as a good UTM health grade?
There is no universal pass mark, but a first audit of a year-old account often lands around a C: readable, with a few clusters splitting real channels and 10–20% of sessions in Unassigned. An A means your source/medium/campaign values are consistent and almost nothing is fragmenting. The grade is most useful as a trend — show the QBR you moved from a C to a B and that the channel totals can now be trusted.
Do I need a UTM audit tool, or can I just use a spreadsheet?
A spreadsheet works for a one-person, low-volume setup. It stops enforcing consistency the moment more than two people create links, which is why a year of shared work drifts. A dedicated tool clusters the variants for you, ranks them by traffic impact, exports a shareable report, and locks a taxonomy so the cleanup holds — the parts a spreadsheet cannot do on its own.
Grade your UTM data before the QBR
Paste your GA4 export and get an A–F health grade with ranked issue clusters in about a minute. No signup for the first scan.