UTM Audits

How to Audit 12 Months of Broken UTM Data Before Your QBR

Your QBR is in a week and the channel numbers do not reconcile. Here is the fast, honest way to audit a year of UTM data before you present it.

Jun 29, 2026 8 min readBy The UTM Drift Guard team
UTM Audits
igInstagraminstainstagram

Your quarterly business review is a week out, and the attribution slide does not reconcile. Paid social shows three different utm_source values, half your email is sitting in Unassigned, and the channel totals do not match what finance has. The data is not wrong about reality — it is fragmented, because twelve months of links were written twelve different ways.

Why twelve months of UTM data drifts#

Drift is not carelessness — it is the default behavior of any shared link process. Spreadsheets stop enforcing consistency once more than two people touch them, and a year is plenty of time for three contractors, a new hire, and an agency to each invent their own spelling for the same channel. SEMrush (2024) found that 67% of teams use UTMs but only 58% have a documented strategy, and 30% skip UTM markup in more than 30% of their campaigns. The gap between "we use UTMs" and "we use them the same way" is exactly where your QBR numbers go to die.

Two GA4 mechanics turn that gap into broken reports. First, GA4 is case-sensitive: email, Email, and EMAIL are three different mediums, so one channel quietly splits into three rows. Second, any hit GA4 cannot map to a channel lands in Unassigned — commonly 10–20% of sessions — which is where un-tagged and mis-tagged links pile up. When the data is this fragmented, as much as 26% of conversions can be credited to the wrong channel, which is the difference between renewing a budget and cutting it.

Case duplicates (Facebook vs facebook)
36%
of broken sessions
Source / medium swapped or blank
23%
of broken sessions
No UTM → Unassigned
17%
of broken sessions
Campaign typos & date suffixes
12%
of broken sessions
Stray spaces / un-encoded characters
8%
of broken sessions
Illustrative: in one 6-person team's audit, the broken sessions clustered in a handful of issue types. Yours will differ, but the shape rarely does — a few clusters carry most of the damage.

The point of the audit

You are not trying to fix every row. You are trying to find the three or four clusters that move the channel totals, fix those, and document the rule so they do not come back. That is what makes a year of data presentable in an afternoon.

The six-step pre-QBR UTM audit#

This is the same sequence whether you do it by hand or with a UTM audit tool. The difference is only speed: the manual version is a spreadsheet weekend, the automated version is a scan plus a review. The steps do not change.

  1. 1

    Export your source/medium report

    In GA4, open Reports → Acquisition → Traffic acquisition, set the date range to the last 12 months, add session_source / session_medium / session_campaign, and export to CSV. This is your raw evidence — every spelling that actually reached your property.

  2. 2

    Grade it

    Paste or upload the CSV and get an A–F health grade with a one-sentence reason. The grade gives you a number for the QBR ("we started the quarter at a C, we are at a B now") and tells you how deep the cleanup goes before you spend any time on it.

  3. 3

    Cluster the variants

    Review the issue clusters ranked by traffic impact. This is where the Anthropic Claude API groups fb, Facebook, facebook, and meta into one canonical channel and surfaces case duplicates and typos. The clustering is bounded and reviewable — it proposes, you decide.

  4. 4

    Approve the canonical taxonomy

    Accept the suggested canonical values in one click — one lowercase token per source, medium, and campaign. Override anything that does not match how your team actually talks. Nothing is written to GA4 or your ad platforms; you are approving a map, not pushing changes.

  5. 5

    Re-map the history

    GA4 hits are immutable, so you collapse the duplicates where you report. Take the variant→canonical map into Looker Studio, BigQuery, or your QBR spreadsheet so Facebook and facebook finally sum to one channel. New links go through a governed builder so the drift stops at the source instead of being re-cleaned next quarter.

  6. 6

    Document and share

    Export the branded PDF health report and share the link before the meeting, so the channel owners see the same grade you do. Save the locked taxonomy as the team's reference. That document is what turns a one-time cleanup into a standard.

A
B
C
D
F
C71 / 100C — a handful of clusters are splitting real channels
A first audit of a year-old account usually grades in the C range — readable, but leaking sessions to duplicates and Unassigned. The grade is the headline number you bring to the QBR.

The grade is deliberately blunt because a QBR audience does not want a 40-tab workbook — they want to know whether the numbers can be trusted and what changed. A letter does that. Underneath it, the cluster list is where the actual GA4 UTM cleanup happens, and that is the next step to look at closely.

What the cluster step actually finds#

Here is what a year of drift looks like in the raw export. Every one of these reached GA4 as a separate row, and every one was meant to be the same two channels: paid Facebook and the monthly newsletter.

Six links from one account — two real channels, six recorded variants.
?utm_source=facebook&utm_medium=cpc&utm_campaign=spring_sale
?utm_source=Facebook&utm_medium=CPC&utm_campaign=Spring-Sale
?utm_source=fb&utm_medium=paid-social&utm_campaign=spring_sale_2026
?utm_source=meta&utm_medium=cpc&utm_campaign=springsale
?utm_source=newsletter&utm_medium=email&utm_campaign=june
?utm_source=Newsletter&utm_medium=Email&utm_campaign=June_Send
facebookFacebookfbmeta
facebook
cpcCPCpaid-social
cpc
newsletterNewsletter
newsletter
emailEmailEMAIL
email
The cluster step collapses many spellings into one canonical token per channel — the map you approve in a click.
How the same six links read after the canonical map is applied in your reporting layer.
What GA4 recordedCanonical valueWhy it split
facebook / Facebook / fb / metafacebookCase + nickname drift across four people
cpc / CPC / paid-socialcpcCase mismatch and a medium synonym
spring_sale / Spring-Sale / springsalespring_saleSeparator and casing inconsistency
newsletter / NewsletternewsletterA single capital letter forks the source
email / Email / EMAILemailGA4 treats all three as different mediums

Once the map is applied, the four Facebook rows become one and the two newsletter rows merge — and your QBR channel totals finally match. The deeper mechanics of why a single capital letter forks a channel are covered in our note on a UTM naming convention your whole team will actually follow.

Days of spreadsheet work, compressed honestly#

It is fair to ask what you actually save. The honest answer: the scan runs synchronously for up to roughly 500 links and returns a grade in about a minute, and the first public scan has no signup wall. You still spend a few minutes reading the clusters and deciding the canonical names — that judgment is yours, not the tool's. What disappears is the part nobody enjoys: building pivot tables, eyeballing thousands of rows, and arguing about whether paid-social and cpc are the same thing.

The spreadsheet weekend

  • Manually pivot 12 months of rows by source and medium
  • Eyeball thousands of values to spot case duplicates
  • Argue over canonical names in a thread, with no record
  • Re-do it next quarter because nothing was documented

A one-minute audit + a few clicks

  • Paste the CSV → A–F grade and ranked clusters in ~60s
  • Variants pre-grouped by traffic impact for review
  • Approve canonical values once; export a branded PDF
  • Governed builder + recurring audits keep it clean
Same six steps, very different week. The judgment stays with you; the grinding goes away.

26%

of conversions can be credited to the wrong channel when UTM data is fragmented

10–20%

of GA4 sessions commonly land in Unassigned

29%

better attribution accuracy after standardizing (Cometly, 2026)

Common mistakes that sink a pre-QBR cleanup#

  • Trying to fix every row instead of the few clusters that move channel totals. Sort by traffic impact and stop when the curve flattens.
  • Assuming GA4 will merge Facebook and facebook for you. It will not — case-sensitivity is a feature, and you must canonicalize them yourself.
  • Cleaning the export but leaving the link process untouched, so the same variants reappear before the next QBR. Fix forward with a governed builder.
  • Renaming sources in a way no teammate agreed to, then watching everyone ignore the new convention. Document the canonical map and share it.
  • Forgetting Unassigned. Un-tagged links never show up in your UTM columns at all — you have to look at where those Unassigned sessions came from separately.
  • Presenting a clean quarter with no plan to keep it clean. Pair the audit with automated drift detection so next quarter starts ahead.

Your pre-QBR UTM audit checklist

  • Exported the last 12 months of source / medium / campaign from GA4 to CSV
  • Ran the audit and recorded the starting A–F grade for the slide
  • Reviewed every cluster ranked above ~5% of broken sessions
  • Approved one canonical token per source, medium, and campaign
  • Built the variant→canonical map and applied it in Looker / BigQuery / the QBR sheet
  • Checked the Unassigned bucket for un-tagged campaigns
  • Exported the branded PDF and shared the public report with channel owners
  • Saved the locked taxonomy and routed new links through the governed builder
  • Scheduled a recurring audit so drift is caught before the next review

How do I audit a year of UTM data before a QBR without spending the weekend on it?

Export the last 12 months of source/medium/campaign from GA4 as a CSV, then run a single audit instead of cleaning rows by hand. You get an A–F grade and the issue clusters ranked by traffic impact in about a minute, review the few clusters that matter, approve the canonical values, and apply that map in your reporting layer. The judgment is yours; the grinding is gone.

Can I clean up UTM data that is already in GA4?

Not in GA4 itself — historical hits are immutable, so Facebook and facebook stay as separate recorded values forever. The practical fix is to re-map the canonical values where you report (Looker Studio, BigQuery, or your QBR spreadsheet) so the duplicates sum correctly, and to stop the drift going forward with a governed link builder. The audit gives you the exact variant→canonical map to apply.

How long does a UTM audit actually take?

The scan runs synchronously for up to roughly 500 links and returns a grade in about a minute, with no signup wall on the first public scan. Reviewing the clusters and approving canonical names adds a few more minutes depending on how messy a year has been. The slow part of a manual cleanup — pivoting and eyeballing thousands of rows — is what the tool removes.

What counts as a good UTM health grade?

There is no universal pass mark, but a first audit of a year-old account often lands around a C: readable, with a few clusters splitting real channels and 10–20% of sessions in Unassigned. An A means your source/medium/campaign values are consistent and almost nothing is fragmenting. The grade is most useful as a trend — show the QBR you moved from a C to a B and that the channel totals can now be trusted.

Do I need a UTM audit tool, or can I just use a spreadsheet?

A spreadsheet works for a one-person, low-volume setup. It stops enforcing consistency the moment more than two people create links, which is why a year of shared work drifts. A dedicated tool clusters the variants for you, ranks them by traffic impact, exports a shareable report, and locks a taxonomy so the cleanup holds — the parts a spreadsheet cannot do on its own.

Grade your UTM data before the QBR

Paste your GA4 export and get an A–F health grade with ranked issue clusters in about a minute. No signup for the first scan.

Run a free audit