Published 21 May 2026

Finding the gaps in your CRM before you enrich

Last updated: 21 May 2026

A CRM gap analysis before enrichment measures five things: completeness (what percentage of records have each key field), validity (how many populated fields are well-formed and not stale), match-key sufficiency (whether records hold enough data to match against an external reference file), duplicate volume, and field-by-field decay rate. A typical UK B2B CRM has 30% to 60% email completeness, 15% to 40% direct-dial completeness, and 5% to 15% duplicate volume. Run the analysis before requesting a quote so suppliers can price the work accurately.

Why run a gap analysis before you ask for a quote?

Most enrichment projects go wrong in the scoping phase, not the delivery phase. A sales director forwards a CRM export and asks for "direct dials and emails appended to everything." The supplier runs a match pass, returns a file with 28% coverage, and the buyer is surprised. The issue was not the supplier's data quality. It was that the buyer never checked how many records were matchable in the first place.

A gap analysis takes two to four hours of analyst time and prevents that conversation entirely. It also changes the economics. If your CRM holds 40,000 contacts but only 22,000 have enough identity data to match, you are pricing 22,000 enrichments, not 40,000. The project cost drops sharply, and the matched output is higher quality because you are not returning partial appends against weak records.

There is a compliance angle too. Under UK GDPR, data enrichment is a form of data processing. Knowing what fields are missing from your records, and documenting that you identified the gap, forms part of the paper trail that supports your Legitimate Interests Assessment (LIA) for B2B contacts.

What are the five dimensions of a CRM gap analysis?

1. Field completeness

Completeness is the simplest dimension: for each field, what percentage of records carry a non-null value? Count nulls, blanks, and placeholder values ("N/A", "Unknown", "-") as missing. A completeness score of 55% on the email field means 45% of your contacts cannot be reached by email from your own data today.
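As a minimal sketch of that counting rule, the Python below treats nulls, blanks, and placeholder strings as missing. The function names, field names, and sample records are illustrative assumptions, and the placeholder set should be extended to match whatever your own team types into empty fields.

```python
from typing import Iterable, Mapping, Optional

# Placeholder strings counted as missing (compared in lower case).
# Assumption: extend this set with the values your own team uses.
PLACEHOLDERS = {"", "n/a", "unknown", "-"}


def is_missing(value: Optional[str]) -> bool:
    """Return True for None, blank strings, and known placeholder values."""
    if value is None:
        return True
    return value.strip().lower() in PLACEHOLDERS


def completeness_pct(records: Iterable[Mapping[str, Optional[str]]], field: str) -> float:
    """Percentage of records carrying a real value in `field`, to one decimal place."""
    rows = list(records)
    if not rows:
        return 0.0
    present = sum(1 for row in rows if not is_missing(row.get(field)))
    return round(100.0 * present / len(rows), 1)


# Hypothetical sample records, for illustration only.
sample = [
    {"email": "a.jones@example.co.uk", "job_title": "Finance Director"},
    {"email": "N/A", "job_title": ""},
    {"email": None, "job_title": "Unknown"},
    {"email": "b.patel@example.com", "job_title": "Head of IT"},
]

print(completeness_pct(sample, "email"))      # 50.0
print(completeness_pct(sample, "job_title"))  # 50.0
```

Keeping every record in the denominator is the point: the score should describe the whole CRM, not only the records that already look healthy.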

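Fields worth measuring in any UK B2B CRM audit include business email, direct-dial number, mobile number, job title, LinkedIn URL, postcode, and UK SIC 2007 code. The benchmark table later in this article gives typical completeness ranges for each.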

2. Field validity

A field can be populated without being valid. An email address may be structurally sound, but if the individual left the company 18 months ago it is functionally worthless. Validity checking has two layers: format validation (does the value look like an email, a UK phone number, a correctly formatted postcode?) and freshness validation (when was this field last verified, and does the underlying contact still hold the role?).

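Format validation is quick. Run a regex against email addresses to flag anything without an @ symbol or a valid top-level domain. For UK telephone numbers, strip spaces and punctuation, convert a leading +44 to 0, then check that the value starts with 01, 02, 03, 07, or 08 and is 11 digits long (a small number of older geographic and freephone numbers have 10 digits, so review those rather than rejecting them outright). For postcodes, validate against the Royal Mail postcode format. A field that fails format validation is as useless as a null field for enrichment matching.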
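The checks above can be sketched in a few lines of Python. The patterns here are deliberately loose assumptions rather than authoritative definitions: the email regex only tests overall shape, the postcode regex covers the common layouts but is not a Royal Mail lookup, and the phone check applies the 11-digit rule described above.

```python
import re

# Loose email shape: one "@", no spaces, a dotted domain, alphabetic TLD of 2+ letters.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[A-Za-z]{2,}$")

# Common UK postcode layouts such as "SW1A 1AA" or "M1 1AE" (shape check only).
POSTCODE_RE = re.compile(r"^[A-Z]{1,2}[0-9][A-Z0-9]? ?[0-9][A-Z]{2}$")

UK_PREFIXES = ("01", "02", "03", "07", "08")


def valid_email(value: str) -> bool:
    return bool(EMAIL_RE.match(value.strip()))


def normalise_uk_phone(value: str) -> str:
    """Keep digits only and convert a leading 44 / 0044 country code to a single 0."""
    digits = re.sub(r"\D", "", value)
    if digits.startswith("0044"):
        digits = digits[4:]
    elif digits.startswith("44"):
        digits = digits[2:]
    else:
        return digits
    # Handles both "+44 20 ..." and "+44 (0)20 ..." styles.
    return "0" + digits.lstrip("0")


def valid_uk_phone(value: str) -> bool:
    digits = normalise_uk_phone(value)
    return len(digits) == 11 and digits.startswith(UK_PREFIXES)


def valid_postcode(value: str) -> bool:
    # Upper-case and collapse repeated whitespace before matching.
    return bool(POSTCODE_RE.match(" ".join(value.upper().split())))


print(valid_email("jane.doe@example.co.uk"))   # True
print(valid_uk_phone("+44 (0)20 7946 0000"))   # True
print(valid_postcode("sw1a 1aa"))              # True
```

Values that fail these checks are candidates for review, not automatic deletion; a stricter rule set may be appropriate once you see what your own export contains.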

3. Match-key sufficiency

Match-key sufficiency is the most underappreciated dimension. An enrichment supplier matches your records against their reference file using one or more identity signals. The richness of those signals determines whether a match is possible at all, and how confident the match result will be.

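For UK B2B enrichment, the match-key hierarchy runs from strong to unmatchable. The strongest match key is a business email address plus company name. Where email is absent, the combination of first name, last name, job title, company name, and postcode can still achieve a fuzzy match, though accuracy drops. Records with only a first name and company name are often unmatchable.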

Count how many records fall into each tier. The strong plus good tier is your realistic enrichable universe. The weak and unmatchable tiers are where you either do manual research or accept a data gap permanently.
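One way to produce those counts is a small classifier run over the export. The tier rules below are assumptions made for illustration (email plus company as strong, full name plus company plus a job title or postcode as good, full name plus company alone as weak); agree the real definitions with your supplier before relying on the numbers. Field names are hypothetical.

```python
from collections import Counter
from typing import Mapping, Optional

Record = Mapping[str, Optional[str]]


def has(record: Record, field: str) -> bool:
    """True when the field holds a non-blank value."""
    return bool((record.get(field) or "").strip())


def match_key_tier(record: Record) -> str:
    """Assign a record to an (assumed) match-key tier."""
    full_name = has(record, "first_name") and has(record, "last_name")
    company = has(record, "company_name")
    if has(record, "email") and company:
        return "strong"
    if full_name and company and (has(record, "job_title") or has(record, "postcode")):
        return "good"
    if full_name and company:
        return "weak"
    return "unmatchable"


# Hypothetical sample records, for illustration only.
records = [
    {"email": "a.jones@example.co.uk", "company_name": "Example Ltd"},
    {"first_name": "Bina", "last_name": "Patel", "company_name": "Example Ltd", "postcode": "M1 1AE"},
    {"first_name": "Chris", "last_name": "Lee", "company_name": "Example Ltd"},
    {"first_name": "Dana", "company_name": "Example Ltd"},
]

tier_counts = Counter(match_key_tier(r) for r in records)
enrichable = tier_counts["strong"] + tier_counts["good"]
print(dict(tier_counts))
print(enrichable)  # 2
```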

4. Duplicate volume

Duplicates inflate your apparent CRM size and waste enrichment budget. A record appearing three times in your system means you pay three times for the append, then merge the records later and discard two of the enriched copies. For a CRM of 30,000 records with a 12% duplicate rate, that is 3,600 wasted enrichment credits before you start.
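The arithmetic is simple enough to keep as a helper when comparing scenarios. This sketch assumes the duplicate rate means the share of records that are redundant copies; the function name is illustrative.

```python
def wasted_credits(total_records: int, duplicate_rate: float) -> int:
    """Records that would be enriched and then discarded when duplicates are merged."""
    return round(total_records * duplicate_rate)


print(wasted_credits(30_000, 0.12))  # 3600
```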

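Run two passes. The first is deterministic: flag every pair of records that share the same email address or the same phone number. These are almost always duplicates. The second pass is probabilistic: normalise the first name, last name, and company name (lower-case, strip legal suffixes such as Ltd, map common nicknames to full names), then flag pairs where each field falls within a small edit distance (typically a Levenshtein distance below 2). The probabilistic pass catches records entered with slight spelling variations or nickname differences, such as "Rob Smith at Acme" and "Robert Smyth at Acme Ltd". The normalisation step matters for that example: "Rob" and "Robert" are three edits apart, so the pair only meets the threshold once the nickname has been expanded and the "Ltd" removed.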
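A minimal sketch of the two passes, assuming records are dictionaries with hypothetical `id`, `email`, `first_name`, `last_name`, and `company_name` keys. The nickname map and suffix list are tiny illustrative stand-ins, and the pairwise comparison is quadratic, so a real file of tens of thousands of records would need blocking (for example, comparing only within the same company initial or postcode area).

```python
from collections import defaultdict
from itertools import combinations

COMPANY_SUFFIXES = {"ltd", "limited", "plc", "llp"}                  # illustrative list
NICKNAMES = {"rob": "robert", "bob": "robert", "liz": "elizabeth"}  # illustrative map


def levenshtein(a: str, b: str) -> int:
    """Edit distance by dynamic programming (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost))
        previous = current
    return previous[-1]


def norm_first(name: str) -> str:
    key = name.strip().lower()
    return NICKNAMES.get(key, key)


def norm_company(name: str) -> str:
    words = name.lower().replace(".", "").split()
    return " ".join(w for w in words if w not in COMPANY_SUFFIXES)


def exact_duplicates(records, field):
    """Deterministic pass: groups of ids sharing the same non-blank value."""
    groups = defaultdict(list)
    for rec in records:
        value = (rec.get(field) or "").strip().lower()
        if value:
            groups[value].append(rec["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


def fuzzy_duplicates(records, max_distance=1):
    """Probabilistic pass: pairs where every normalised field is within max_distance edits."""
    pairs = []
    for a, b in combinations(records, 2):
        if (levenshtein(norm_first(a["first_name"]), norm_first(b["first_name"])) <= max_distance
                and levenshtein(a["last_name"].lower(), b["last_name"].lower()) <= max_distance
                and levenshtein(norm_company(a["company_name"]), norm_company(b["company_name"])) <= max_distance):
            pairs.append((a["id"], b["id"]))
    return pairs


# Hypothetical sample records, for illustration only.
crm = [
    {"id": 1, "first_name": "Rob", "last_name": "Smith", "company_name": "Acme", "email": None},
    {"id": 2, "first_name": "Robert", "last_name": "Smyth", "company_name": "Acme Ltd", "email": None},
    {"id": 3, "first_name": "Jane", "last_name": "Doe", "company_name": "Globex", "email": "j.doe@globex.example"},
    {"id": 4, "first_name": "J", "last_name": "Doe", "company_name": "Globex PLC", "email": "J.Doe@globex.example"},
]

print(exact_duplicates(crm, "email"))  # [[3, 4]]
print(fuzzy_duplicates(crm))           # [(1, 2)]
```

Records 3 and 4 are caught by the deterministic pass but not the fuzzy one, which is why both passes are worth running.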

5. Field-by-field decay rate

Decay rate is the hardest dimension to measure from within your CRM alone, but it is critical for understanding how quickly you need to re-enrich after the first pass. UK B2B contact data decays at roughly 25% to 35% per year, driven primarily by job moves, redundancies, company restructures, and domain changes.

To estimate your current decay rate without sending a campaign, cross-reference your CRM against a reference dataset (or run a validation pass through an email verification service or telephone verification service). The proportion of records that come back as invalid gives you a point-in-time decay estimate. Compare that figure against the last-verified date on your records to calculate an annualised rate. A CRM where 30% of emails come back as invalid and records were last verified two years ago implies an annual decay rate of roughly 15% to 16% (15% as a simple average, about 16% if decay compounds year on year), which is lower than average and suggests good CRM hygiene historically.
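The annualisation step can be done two ways, sketched below with illustrative function names. The simple average divides the invalid share by the years elapsed; the compound version assumes the same fraction of still-valid records goes stale each year. Which model fits your data better is an assumption to test, not a given.

```python
def annual_decay_linear(invalid_share: float, years: float) -> float:
    """Simple average: invalid share spread evenly across the years elapsed."""
    return invalid_share / years


def annual_decay_compound(invalid_share: float, years: float) -> float:
    """Constant yearly rate r such that (1 - r) ** years == 1 - invalid_share."""
    return 1 - (1 - invalid_share) ** (1 / years)


# The example from the text: 30% invalid, last verified two years ago.
print(round(annual_decay_linear(0.30, 2), 3))    # 0.15
print(round(annual_decay_compound(0.30, 2), 3))  # 0.163
```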

How to run the analysis: SQL, export, and audit

If your CRM exports to a flat file, most of the analysis is a set of SQL queries or pivot-table operations. For a Salesforce or HubSpot export, a single spreadsheet with one row per contact and one column per field is enough to run completeness and format-validity checks in under an hour.

Completeness query pattern

For each field you want to audit, the logic is:

COUNT(records where field is not null and not blank) / COUNT(all records) * 100

In SQL on an exported table called crm_contacts:

-- NULLIF(TRIM(col), '') turns blank strings into NULL so COUNT() skips them.
-- No WHERE clause: every record must stay in the denominator.
SELECT
  COUNT(*) AS total_records,
  ROUND(100.0 * COUNT(NULLIF(TRIM(email), '')) / COUNT(*), 1) AS email_completeness_pct,
  ROUND(100.0 * COUNT(NULLIF(TRIM(phone_direct), '')) / COUNT(*), 1) AS phone_completeness_pct,
  ROUND(100.0 * COUNT(NULLIF(TRIM(job_title), '')) / COUNT(*), 1) AS job_title_completeness_pct,
  ROUND(100.0 * COUNT(NULLIF(TRIM(linkedin_url), '')) / COUNT(*), 1) AS linkedin_completeness_pct
FROM crm_contacts;

Adjust for your field names and for the placeholder values your team uses (filter them the same way as nulls). If you are working in Excel rather than SQL, a COUNTIF formula with a "not blank" condition gives the same result per column.

Duplicate detection without specialist tools

Sort the export by email address, then use a COUNTIF to flag any email that appears more than once. Do the same by phone number. Then sort by company name plus last name and eyeball the first 500 rows. You will find most of the high-confidence duplicates in that pass. For a CRM of over 20,000 contacts, a proper deduplication tool or a short data-cleansing commission from your enrichment supplier is worth the cost.

CRM fields audit: typical UK B2B benchmarks

The table below gives benchmark completeness ranges for UK B2B CRMs that have not been enriched in the past 24 months. Your figures may sit outside these ranges depending on how thoroughly your sales team captures data at point of entry, and how much inbound demand you receive versus outbound prospecting.

| CRM field | Typical completeness (UK B2B) | Enrichable? | Notes |
| Company name | 90%+ | No (usually present) | Needed as match key; low absence rate |
| Postcode / county | 65%–85% | Yes, via Companies House | Can be derived from registered address |
| UK SIC 2007 code | 20%–50% | Yes, via Companies House | Often blank if CRM was not set up to capture it |
| Job title | 40%–65% | Yes | Quality varies; titles need normalisation |
| Business email | 30%–60% | Yes | Often the highest-value field to enrich |
| Direct-dial number | 15%–40% | Yes | DDIs are harder to source than switchboard numbers |
| Mobile number | 10%–30% | Yes | Business mobiles increasingly available via public sources |
| LinkedIn URL | 10%–25% | Yes | High commercial value for ABM targeting |
| Employee count / revenue | 25%–55% | Yes, via Companies House | Annual filed accounts give employee and turnover bands |
| Duplicate records | 5%–15% of total | Resolve before enriching | Higher in CRMs with multiple data-entry points |

In our experience, the gap between what sales teams think their CRM completeness is and what the audit reveals is usually 15 to 25 percentage points. Email completeness feels like 70% because everyone enters an email at the point a deal is opened, but the CRM also holds thousands of older contacts, imported list segments, and inbound leads that were never fully qualified.

How the gap analysis shapes enrichment scope and price

Once you have your five dimensions measured, you can translate them into an enrichment brief. The brief answers four questions:

  1. Which fields need appending? Prioritise by commercial impact. Direct dials and business emails drive the highest return on enrichment cost for outbound campaigns.
  2. How many records are in scope? Your matchable universe (strong and good match-key tiers only). Give this number to your supplier, not the total CRM size.
  3. What is the realistic match rate? For a UK B2B file enriched against a well-sourced reference dataset, expect 40% to 75% match on email and 25% to 55% on direct dial, depending on job seniority and industry sector. C-suite contacts in regulated sectors typically have lower match rates than mid-level roles in sectors with high staff turnover.
  4. What fields need validation rather than appending? Records that already carry a value but may be stale need a verification pass, not an append. Email verification and telephone verification are priced differently from append work and should be scoped separately.
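To turn questions 2 and 3 into numbers for the brief, multiply the matchable universe by the low and high ends of the expected match rate. This sketch uses the 22,000-record matchable universe from the earlier example and the match-rate ranges quoted above. Applying each rate to the whole matchable universe is a simplifying assumption; in practice you would apply it only to matchable records missing that field.

```python
from typing import Tuple


def expected_appends(matchable_records: int, low_rate: float, high_rate: float) -> Tuple[int, int]:
    """Low and high estimates of records likely to come back with the field appended."""
    return round(matchable_records * low_rate), round(matchable_records * high_rate)


matchable = 22_000  # strong plus good match-key tiers

print(expected_appends(matchable, 0.40, 0.75))  # email: (8800, 16500)
print(expected_appends(matchable, 0.25, 0.55))  # direct dial: (5500, 12100)
```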

Supplying this brief means your supplier can provide a fixed price rather than an estimate. It also means you can compare quotes accurately because every supplier is working from the same scope. Without it, one supplier may quote on total CRM size and another on matchable records, and the numbers will be incomparable.

For more on what enrichment delivers commercially once the gaps are filled, see our piece on CRM enrichment ROI for UK B2B teams.

Deduplicating before enriching: the order matters

The order of operations is deduplication first, then gap analysis, then enrichment. Running enrichment before deduplication is a common mistake and a costly one. Consider a CRM of 25,000 contacts in which 2,500 records belong to duplicate pairs. You commission an enrichment project, pay for 25,000 appends, and the supplier returns a file with 18,000 enriched records. You then run your deduplication pass and find that 2,200 of the enriched records form 1,100 duplicate pairs, so you merge each pair and discard one enriched copy each time. You have paid for 1,100 enrichments you immediately threw away.

Deduplication also improves match rate. When a supplier's matching algorithm encounters two records for the same individual, it sometimes splits the match evidence across both records rather than consolidating it. One record gets the email, the other gets the phone number, and neither gets both. Merging the records first presents the matching algorithm with a single, richer target, which raises confidence and completeness in the output.

Practical sequencing checklist

  1. Export all contacts from CRM to a flat file.
  2. Run completeness counts for each priority field.
  3. Run format-validity checks (email regex, phone format, postcode format).
  4. Classify records by match-key tier (strong, good, weak, unmatchable).
  5. Run duplicate detection (deterministic pass, then probabilistic pass).
  6. Resolve duplicates and re-import the clean file.
  7. Document gap analysis output (field by field, with counts and percentages).
  8. Submit clean file and gap analysis to enrichment supplier for scoped quote.

Frequently asked questions

What is a CRM gap analysis?
A CRM gap analysis is a structured audit of your contact and company records that quantifies five dimensions before any enrichment work begins: field completeness (what percentage of records carry each key field), validity (how many populated fields contain well-formed, non-stale values), match-key sufficiency (whether records hold enough data to match against an external source), duplicate volume, and field-by-field decay rate. The output tells you the scale of work required, which fields to prioritise, and what a realistic match rate from enrichment will look like.

What are typical UK B2B CRM benchmarks for completeness?
Typical UK B2B CRMs carry 30% to 60% email completeness, 15% to 40% direct-dial completeness, and 5% to 15% duplicate volume. Company name and country are almost always present (90%+), but job title completeness often sits at 40% to 65%, and LinkedIn URL completeness below 25% for most organisations that have not previously enriched their data.

What is match-key sufficiency and why does it matter for enrichment?
Match-key sufficiency means a record holds enough data for the enrichment supplier to locate a corresponding record in their reference file. For B2B enrichment, the strongest match key is a business email address plus company name. Where email is absent, the combination of first name, last name, job title, company name, and postcode can still achieve a fuzzy match, though accuracy drops. Records with only a first name and company name are often unmatchable. Knowing your match-key sufficiency rate before requesting a quote lets the supplier give you an accurate price and prevents disappointment when the matched file comes back smaller than expected.

Should you deduplicate your CRM before enriching it?
Yes, always deduplicate before enriching. If you enrich first, you pay to append data to records that will later be merged, wasting budget on the duplicate copy. Run a deduplication pass using a deterministic rule (exact email match) followed by a probabilistic pass (name plus company plus postcode). Resolve surviving near-matches manually if the duplicate volume is below 2,000 records, or automate with a threshold confidence score above that. Once the file is clean, the enrichment match rate improves because there are no competing records for the same individual.

How often should a UK B2B CRM be audited for data gaps?
Run a light completeness check every quarter and a full gap analysis (including decay and match-key sufficiency) at least annually. UK B2B contact data decays at roughly 25% to 35% per year because of job moves, company restructures, and domain changes. A CRM that was fully enriched 18 months ago may have 40% of its direct dials and 30% of its emails already stale. Quarterly checks let you prioritise which segments need refreshing before a campaign, rather than discovering the problem after the mail has gone out.