Accuracy & Matching
How MatchAudit standardizes inputs, supports multilingual names, finds candidates, and scores confidence—so your decisions stand up to banks, auditors, and regulators.
1) Data Sources & Freshness by Plan
Free/Starter:
- Screens run against offline JSON lists under /public/sanctions-lists.
- Lists are refreshed daily via an import/cron job and served locally for fast response.
- Matching uses Fuse.js fuzzy search with strict thresholding (see below).
Paid plans:
- Screens use the OpenSanctions API for near real-time freshness.
- API responses include provenance/timestamps; we propagate these to your audit trail.
- If the API is unavailable, we fall back to the latest offline snapshot and mark it as such.
Covered jurisdictions and update cadence are listed in Coverage & Credence.
2) Normalization (used in Free/Starter)
Before matching, we standardize both inputs and list values:
- Transliteration to Latin script (e.g., “Александр” → “Aleksandr”).
- Lower-case + Unicode normalization (NFD).
- Diacritics removal (e.g., “Łukashenko” → “Lukashenko”).
- Punctuation stripped and whitespace collapsed to single spaces.
This reduces false mismatches caused by formatting while preserving informative tokens.
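The steps above can be sketched in TypeScript. This is a minimal illustration, not MatchAudit's actual code: the function and map names are assumptions, and transliteration from non-Latin scripts (covered in the next section) is a separate step omitted here.

```typescript
// Letters with strokes (Ł, Ø, Đ) carry no combining mark, so NFD alone
// cannot strip them; a small explicit map handles those cases.
const SPECIAL_LATIN: Record<string, string> = {
  "ł": "l", "ø": "o", "đ": "d", "ß": "ss",
};

function normalizeName(input: string): string {
  return input
    .toLowerCase()
    .normalize("NFD")                             // split base letters from diacritics
    .replace(/[\u0300-\u036f]/g, "")              // strip combining diacritical marks
    .replace(/[łøđß]/g, (c) => SPECIAL_LATIN[c] ?? c)
    .replace(/[^\p{L}\p{N}\s]/gu, " ")            // strip punctuation, keep token breaks
    .replace(/\s+/g, " ")                         // collapse whitespace to single spaces
    .trim();
}
```

Applied to both the query and the list entry, this makes “Łukashenko” and “lukashenko” compare equal without discarding informative tokens.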
3) Multilingual Input & Search
- Inputs can be in many languages/scripts. We transliterate queries to Latin and match against normalized list entries. Examples: “محمد” → “Muhammad”; “Александр Лукашенко” → “Aleksandr Lukashenko”.
- Non-Latin aliases are preserved where available. For sources like the UN/EU that include Arabic/Cyrillic aliases, we index those and display them in the details so reviewers see the native form.
- We do not translate semantics; we normalize characters. If both native and romanized variants exist in the source, we can match either and show both in the result details.
Practical tip: when possible, search with the most complete legal name plus any known local-language spelling.
4) Candidate Matching (Fuse.js)
We search across multiple name fields and aliases using Fuse.js with a strict threshold:
- Name fields: name, primaryName, wholeName, firstName, secondName, thirdName, lastName, surname, familyName
- Alias fields: aliases, aliasName, nameAliases.wholeName, nameAliases.firstName, nameAliases.lastName
- Fuse threshold = 0.10 (strict; fewer weak candidates).
- Fuse returns match.score (0 = perfect). We convert it to confidence as 1 − score.
- We include matched snippets and a human-readable rationale per result.
No phonetic algorithms are used by default; this reduces over-broad hits and keeps rationales transparent.
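Assuming Fuse.js's documented options shape, the configuration implied above might look like the sketch below. The threshold and key list come from this page; everything else (includeMatches, the score inversion helper) is an assumption about how the pieces fit together.

```typescript
// Illustrative Fuse.js options; the keys and threshold are from this page,
// the remaining flags follow Fuse.js's documented API.
const fuseOptions = {
  includeScore: true,    // expose match.score (0 = perfect match)
  includeMatches: true,  // matched snippets feed the per-result rationale
  threshold: 0.10,       // strict: weaker candidates are dropped
  keys: [
    "name", "primaryName", "wholeName",
    "firstName", "secondName", "thirdName",
    "lastName", "surname", "familyName",
    "aliases", "aliasName",
    "nameAliases.wholeName", "nameAliases.firstName", "nameAliases.lastName",
  ],
};

// Fuse scores run from 0 (perfect) to 1 (worst); confidence inverts that.
function toConfidence(fuseScore: number): number {
  return 1 - fuseScore;
}
```

With threshold 0.10, any surviving candidate already has confidence ≥ 0.90 before banding.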
5) Per-List Detail Extraction
We parse and attach contextual fields to help you disambiguate candidates:
- EU: gender, titles/functions, birthdates, citizenships, EU reference, regulations/links, remarks.
- UN: designations, nationality, dates/places of birth, addresses, UN reference, listed on, comments, aliases (incl. non-Latin).
- OFAC (US): SDN type, programs, gender, titles/designations, nationality, addresses, AKAs, IDs, remarks.
- UK (OFSI): primary name, aliases, sanctions text vs “other information”, addresses/countries.
- Switzerland (SECO): sex, nationality, birthdate, justification.
- Australia (DFAT): birthdate, nationality, address, reason, regime.
- Canada (SEMA): DOB, nationality, reason, regime, date listed.
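One possible shape for these attached details, as an illustrative TypeScript interface. The field names and the example values are assumptions for the sketch, not MatchAudit's actual schema.

```typescript
// Illustrative container for the per-list context fields listed above.
interface CandidateDetails {
  source: "EU" | "UN" | "OFAC" | "UK" | "CH" | "AU" | "CA";
  birthDates?: string[];        // dates or partial dates as published
  nationalities?: string[];
  addresses?: string[];
  aliases?: string[];           // incl. non-Latin forms where the source has them
  programsOrRegimes?: string[]; // e.g. OFAC programs, DFAT/SEMA regimes
  reference?: string;           // EU/UN reference, date listed, etc.
  remarks?: string;             // free-text comments / justification
}

// Example of how a UN entry's context might be attached (values illustrative):
const example: CandidateDetails = {
  source: "UN",
  aliases: ["Example Alias", "مثال"],
  nationalities: ["Exampleland"],
  reference: "UN-REF-EXAMPLE",
  remarks: "Listed on 2020-01-01.",
};
```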
6) Confidence Bands & Labels
Each candidate receives a confidence value (0–1, shown as %) and label:
- ≥ 0.95 (95–100%): Very High confidence match
- 0.85–0.949 (85–94.9%): Medium confidence match
- < 0.85 (<85%): Low confidence match
Labels reflect the current implementation; future tuning may add a distinct “High” band.
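The banding above as a small sketch. The cutoffs are taken from this page; labelFor is an illustrative name, and a future “High” band would slot in between the first two branches.

```typescript
type ConfidenceLabel =
  | "Very High confidence match"
  | "Medium confidence match"
  | "Low confidence match";

// Map a 0–1 confidence value to the label shown in results.
function labelFor(confidence: number): ConfidenceLabel {
  if (confidence >= 0.95) return "Very High confidence match";
  if (confidence >= 0.85) return "Medium confidence match";
  return "Low confidence match";
}
```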
7) What’s Different on Paid Plans (API)
- Data freshness: near real-time via OpenSanctions API (vs daily offline refresh).
- Provenance: source timestamps/identifiers carried into your audit trail and PDFs.
- Throughput: better for bulk/API workloads; UI can stream/paginate large sets.
If the API is unavailable, we fall back to the last snapshot and mark the source in the result.
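A hedged sketch of this fallback behaviour, assuming the live lookup and the snapshot loader are injected by the caller. fetchLive and loadSnapshot are hypothetical stand-ins, not real MatchAudit endpoints; the point is that the result always carries its source and timestamp.

```typescript
interface ScreenResult {
  hits: string[];
  source: "live-api" | "offline-snapshot"; // surfaced so reviewers see the data source
  asOf: string;                            // provenance timestamp for the audit trail
}

async function screen(
  query: string,
  fetchLive: (q: string) => Promise<ScreenResult>,
  loadSnapshot: (q: string) => ScreenResult,
): Promise<ScreenResult> {
  try {
    return await fetchLive(query); // near real-time data with API provenance
  } catch {
    return loadSnapshot(query);    // degraded source, marked as such in the result
  }
}
```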
8) Customer Review & Decisioning
MatchAudit does not provide human reviewers or manual adjudication services. You and your team are responsible for reviewing candidates and making final decisions per your policy.
- Compare DOB, nationality, program/regime, roles/titles, and addresses (where available).
- Pay attention to aliases/AKAs and non-Latin forms shown in the details.
- Record an outcome (e.g., “Cleared”, “Escalated”, “Confirmed”) in your workflow for audit readiness.
9) Known Limitations
- Transliteration variants and rare spellings can still be missed.
- Strict thresholds reduce noise but may hide very weak candidates; try alternative spellings when in doubt.
- Source list quality and identifiers vary by jurisdiction and update cycle.
Screening reduces risk but does not eliminate it. Always follow your internal due-diligence procedures.
10) Legal
Screening outputs are risk indicators, not legal determinations. MatchAudit provides timestamps, list versioning, and rationales to support audit readiness; responsibility for final decisions remains with you.
Effective date: 2025-08-16. We update this page as thresholds and sources evolve.