What is entity resolution?
Entity resolution is the process of determining whether multiple data records refer to the same real-world entity — a person, company, or organization. Also called record linkage, entity matching, or deduplication, it's essential for maintaining accurate CRM, compliance, and analytics systems where the same entity may appear in multiple forms.
The problem is common in CRM systems, where the same person may appear as "John Smith" at "Acme Corp" and as "J. Smith" at "Acme Corporation". Entity resolution determines that these are the same person and merges or links the records.
How entity resolution works
Entity resolution follows a multi-step process to identify and link records that refer to the same real-world entity:
- Data ingestion — Records are collected from multiple sources (CRM, marketing automation, third-party lists, web forms) and normalized into a consistent format.
- Blocking and indexing — Rather than comparing every record against every other record (which scales quadratically), blocking groups records by shared attributes — like first letter of last name and company domain — to reduce the comparison space.
- Pairwise comparison — Records within each block are compared across multiple fields using fuzzy matching algorithms (Levenshtein distance, Jaro-Winkler, phonetic matching) and semantic similarity.
- Scoring and threshold — Each pair receives a match score based on field-by-field similarity. Pairs above a high-confidence threshold are auto-merged; pairs in the gray zone are flagged for review.
- Merge or link decision — Matched records are either merged into a single golden record or linked as related records, depending on the use case and data governance policy.
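The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any vendor's implementation: the sample records, field names, weights, and thresholds are all hypothetical, the email domain stands in for the company domain, and the standard library's `difflib.SequenceMatcher` is used as a stand-in for Levenshtein or Jaro-Winkler.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical sample records; the field names are assumptions for illustration.
RECORDS = [
    {"id": 1, "name": "John Smith", "company": "Acme Corp", "email": "john.smith@acme.com"},
    {"id": 2, "name": "J. Smith", "company": "Acme Corporation", "email": "jsmith@acme.com"},
    {"id": 3, "name": "Jane Smythe", "company": "Globex", "email": "jane@globex.com"},
    {"id": 4, "name": "John Smith", "company": "Acme Corp.", "email": "john.smith@acme.com"},
]

# Illustrative field weights and thresholds; real values would be tuned per dataset.
WEIGHTS = {"name": 0.5, "company": 0.3, "email": 0.2}
AUTO_MERGE_THRESHOLD = 0.85
REVIEW_THRESHOLD = 0.60


def normalize(text):
    """Lowercase, drop punctuation, and collapse whitespace."""
    cleaned = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())


def block_key(record):
    """Blocking key: first letter of the last name plus the email domain."""
    last_name = normalize(record["name"]).split()[-1]
    domain = record["email"].split("@")[-1].lower()
    return (last_name[0], domain)


def similarity(a, b):
    """Similarity in [0, 1]; SequenceMatcher is a stand-in for a dedicated fuzzy matcher."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def score_pair(r1, r2):
    """Weighted sum of field-by-field similarities."""
    return sum(w * similarity(r1[field], r2[field]) for field, w in WEIGHTS.items())


def classify(score):
    """Map a match score to a decision: auto-merge, manual review, or distinct."""
    if score >= AUTO_MERGE_THRESHOLD:
        return "merge"
    if score >= REVIEW_THRESHOLD:
        return "review"
    return "distinct"


def naive_pair_count(n):
    """Number of comparisons without blocking: n choose 2."""
    return n * (n - 1) // 2


def resolve(records):
    """Group records into blocks, then score and classify pairs within each block."""
    blocks = defaultdict(list)
    for record in records:
        blocks[block_key(record)].append(record)

    decisions = []
    for members in blocks.values():
        for r1, r2 in combinations(members, 2):
            score = score_pair(r1, r2)
            decisions.append((r1["id"], r2["id"], round(score, 3), classify(score)))
    return decisions


for left, right, score, decision in resolve(RECORDS):
    print(left, right, score, decision)
```

With these four sample records, blocking places the three Acme records in one block and the Globex record in another, so only three of the six possible pairs are ever compared. The same idea is what keeps the comparison count manageable at CRM scale.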
Modern entity resolution increasingly uses ML-based approaches that learn from historical match decisions, improving accuracy over time — especially for ambiguous cases like common names, multinational companies, and records with sparse data.
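One way to picture "learning from historical match decisions" is a small logistic regression trained on pairs that reviewers have already labeled. The sketch below is written in plain Python for readability. The labeled pairs are invented for illustration, and a production system would more likely use an established ML library and many more features.

```python
import math

# Hypothetical training data: per-field similarity scores (name, company, email)
# for pairs a reviewer has already labeled. 1 = same entity, 0 = different entities.
LABELED_PAIRS = [
    ((0.95, 0.90, 1.00), 1),
    ((0.82, 0.72, 0.90), 1),
    ((0.88, 0.95, 0.40), 1),
    ((0.90, 0.10, 0.05), 0),  # common name, unrelated company
    ((0.30, 0.85, 0.20), 0),
    ((0.15, 0.20, 0.10), 0),
]


def sigmoid(z):
    """Squash a raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Estimated probability that a pair refers to the same entity."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def train(pairs, epochs=2000, learning_rate=0.5):
    """Fit field weights with stochastic gradient descent on the log loss."""
    weights = [0.0] * len(pairs[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in pairs:
            error = predict(weights, bias, features) - label
            weights = [w - learning_rate * error * x for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


weights, bias = train(LABELED_PAIRS)
for features, label in LABELED_PAIRS:
    print(features, label, round(predict(weights, bias, features), 3))
```

Instead of hand-picking field weights, the model infers them from past decisions, so each newly reviewed pair can be added to the training set and the weights refit.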
Why entity resolution matters
Without entity resolution, CRM and operational systems steadily accumulate duplicates as records arrive from multiple sources. The downstream impacts are significant:
- Inflated pipeline — Duplicate records create phantom opportunities, making pipeline reviews unreliable and forecast accuracy worse.
- Wasted marketing spend — The same person receives the same campaign multiple times from different records, increasing unsubscribe rates and eroding trust.
- Compliance risk — In regulated industries, failing to link records for the same entity can mean missed sanctions matches or incomplete KYC verification.
- Rep frustration — When two reps unknowingly work the same account from different records, it creates internal conflict and a poor customer experience.
Enterprise CRMs average 20–30% duplicate rates. For a CRM with 500K records, that's 100K–150K records that should be merged or linked — inflating every metric downstream.
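The arithmetic behind that estimate is easy to reproduce for any CRM size. The helper below is a hypothetical illustration that takes the 20% and 30% rates from the figure above as defaults.

```python
def duplicate_range(total_records, low_pct=20, high_pct=30):
    """Return the (low, high) count of records expected to be duplicates.

    Uses integer percentages to avoid floating-point rounding surprises.
    """
    return total_records * low_pct // 100, total_records * high_pct // 100


low, high = duplicate_range(500_000)
print(f"{low:,} to {high:,} records to merge or link")
```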
How Salmon approaches entity resolution
Salmon's AI engine performs entity resolution in real time — as records enter your CRM. Fuzzy matching, cross-source verification, and confidence scoring ensure duplicates merge correctly without manual review. Every merge decision includes full source attribution and audit trail, so you can see exactly why two records were linked.