Accuracy & Matching

How MatchAudit standardizes inputs, supports multilingual names, finds candidates, and scores confidence—so your decisions stand up to banks, auditors, and regulators.

1) Data Sources & Freshness by Plan

Free & Starter
  • Screens run against offline JSON lists under /public/sanctions-lists.
  • Lists are refreshed daily via import/cron and served locally for fast response.
  • Matching uses Fuse.js fuzzy search with strict thresholding (see below).
Pro, Business, Enterprise
  • Screens use the OpenSanctions API for near real-time freshness.
  • API responses include provenance/timestamps; we propagate these to your audit trail.
  • If the API is unavailable, we fall back to the latest offline snapshot and mark it as such.

Covered jurisdictions and update cadence are listed in Coverage & Credence.
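
The split above can be pictured as a small source-selection step. The sketch below is illustrative only: the type and function names (Plan, ScreeningSource, resolveSource, openSanctionsIsReachable) are assumptions for this example, not MatchAudit's actual internals.

```ts
// Illustrative sketch only; names and shapes are assumptions, not MatchAudit's API.
type Plan = "free" | "starter" | "pro" | "business" | "enterprise";

interface ScreeningSource {
  kind: "offline-snapshot" | "opensanctions-api";
  retrievedAt: string;   // ISO timestamp, carried into the audit trail
  fallback?: boolean;    // true when the API was unavailable
}

// Hypothetical availability check; a real one would call the OpenSanctions API.
async function openSanctionsIsReachable(): Promise<boolean> {
  return true;
}

async function resolveSource(plan: Plan): Promise<ScreeningSource> {
  const now = new Date().toISOString();
  if (plan === "free" || plan === "starter") {
    // Free/Starter: offline JSON lists refreshed daily and served locally.
    return { kind: "offline-snapshot", retrievedAt: now };
  }
  if (await openSanctionsIsReachable()) {
    // Paid plans: near real-time data from the OpenSanctions API.
    return { kind: "opensanctions-api", retrievedAt: now };
  }
  // API unavailable: fall back to the latest snapshot and mark it as such.
  return { kind: "offline-snapshot", retrievedAt: now, fallback: true };
}
```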

2) Normalization (used in Free/Starter)

Before matching, we standardize both inputs and list values:

  • Transliteration to Latin script (e.g., “Александр” → “Aleksandr”).
  • Lower-casing and Unicode normalization (NFD).
  • Diacritics removal (e.g., “Łukashenko” → “Lukashenko”).
  • Punctuation stripping, with whitespace collapsed to single spaces.

This reduces false mismatches caused by formatting while preserving informative tokens.
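
As a rough sketch (not MatchAudit's exact code), the steps above chain together as follows; the transliteration step is stubbed here because the library used for it is not specified on this page.

```ts
// Sketch of the normalization pipeline described above.
// The transliteration step is a stub: a real implementation would map
// e.g. "Александр" → "Aleksandr" and stroke letters like "Ł" → "L".
function transliterateToLatin(input: string): string {
  return input; // placeholder
}

function normalizeName(raw: string): string {
  return transliterateToLatin(raw)
    .toLowerCase()
    .normalize("NFD")                   // decompose accented characters
    .replace(/[\u0300-\u036f]/g, "")    // strip combining diacritics (e.g. "é" → "e")
    .replace(/[^\p{L}\p{N}\s]/gu, " ")  // strip punctuation
    .replace(/\s+/g, " ")               // collapse whitespace
    .trim();
}

// normalizeName("  Aleksandr   LUKASHENKO! ") → "aleksandr lukashenko"
```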

3) Multilingual Input & Search

  • Inputs can be in many languages/scripts. We transliterate queries to Latin and match against normalized list entries. Examples: “محمد” → “Muhammad”; “Александр Лукашенко” → “Aleksandr Lukashenko”.
  • Non-Latin aliases are preserved where available. For sources like the UN/EU that include Arabic/Cyrillic aliases, we index those and display them in the details so reviewers see the native form.
  • We do not translate semantics; we normalize characters. If both native and romanized variants exist in the source, we can match either and show both in the result details.

Practical tip: when possible, search with the most complete legal name plus any known local-language spelling.
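
To illustrate the alias handling (the field names here are hypothetical, not MatchAudit's schema): both the romanized and the native-script forms are kept on a record, so either spelling of a query can hit it, while the native form remains available for display.

```ts
// Hypothetical record shape for illustration only.
interface ListEntry {
  primaryName: string;
  aliases: string[];        // may mix scripts, e.g. Latin and Cyrillic forms
  nativeAliases: string[];  // native-script forms preserved for reviewer display
}

const entry: ListEntry = {
  primaryName: "Aleksandr Lukashenko",
  aliases: ["Alexander Lukashenko", "Александр Лукашенко"],
  nativeAliases: ["Александр Лукашенко"],
};

// All forms are indexed (after the normalization sketched in section 2),
// so "Aleksandr Lukashenko" and "Александр Лукашенко" resolve to the same entry.
const searchTerms = [entry.primaryName, ...entry.aliases];
```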

4) Candidate Matching (Fuse.js)

We search across multiple name fields and aliases using Fuse.js with a strict threshold:

Fields we search
  • name, primaryName, wholeName, firstName, secondName, thirdName, lastName, surname, familyName
  • Aliases: aliases, aliasName, nameAliases.wholeName, nameAliases.firstName, nameAliases.lastName
Threshold & scoring
  • Fuse threshold = 0.10 (strict; fewer weak candidates).
  • Fuse returns match.score (0 = perfect). We convert this to confidence as 1 − score.
  • We include matched snippets and a human-readable rationale per result.

No phonetic algorithms are used by default; this reduces over-broad hits and keeps rationales transparent.
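
A minimal Fuse.js sketch using the documented keys and the 0.10 threshold; the record shape is simplified here, and the exact options in production may differ.

```ts
import Fuse from "fuse.js";

// Simplified records for illustration; real entries carry many more fields.
const entries = [
  { name: "Aleksandr Lukashenko", aliases: ["Alexander Lukashenko"] },
  { name: "John Smith", aliases: [] },
];

const fuse = new Fuse(entries, {
  keys: ["name", "aliases"],   // nested keys like "nameAliases.wholeName" work the same way
  threshold: 0.1,              // strict: fewer weak candidates
  includeScore: true,          // score of 0 means a perfect match
  includeMatches: true,        // matched snippets feed the human-readable rationale
});

const candidates = fuse.search("aleksandr lukashenko").map((result) => ({
  candidate: result.item,
  confidence: 1 - (result.score ?? 1),  // confidence = 1 − score
  matchedFields: result.matches,        // which fields/substrings matched
}));
```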

5) Per-List Detail Extraction

We parse and attach contextual fields to help you disambiguate candidates (an illustrative shape is sketched after the list):

  • EU: gender, titles/functions, birthdates, citizenships, EU reference, regulations/links, remarks.
  • UN: designations, nationality, dates/places of birth, addresses, UN reference, listed on, comments, aliases (incl. non-Latin).
  • OFAC (US): SDN type, programs, gender, titles/designations, nationality, addresses, AKAs, IDs, remarks.
  • UK (OFSI): primary name, aliases, sanctions text vs “other information”, addresses/countries.
  • Switzerland (SECO): sex, nationality, birthdate, justification.
  • Australia (DFAT): birthdate, nationality, address, reason, regime.
  • Canada (SEMA): DOB, nationality, reason, regime, date listed.
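
As a rough illustration of how these per-list details might be shaped in a result (field names are examples, not MatchAudit's exact schema):

```ts
// Hypothetical detail shapes; field names are illustrative, not MatchAudit's schema.
interface EuDetails {
  list: "EU";
  gender?: string;
  titles?: string[];
  birthdates?: string[];
  citizenships?: string[];
  euReference?: string;
  regulations?: string[];
  remarks?: string;
}

interface OfacDetails {
  list: "OFAC";
  sdnType?: string;
  programs?: string[];
  nationality?: string;
  addresses?: string[];
  akas?: string[];
  ids?: string[];
  remarks?: string;
}

// Each result carries the block matching its source list; the UN, UK (OFSI),
// SECO, DFAT, and SEMA shapes follow the same pattern.
type ListDetails = EuDetails | OfacDetails;
```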

6) Confidence Bands & Labels

Each candidate receives a confidence value (0–1, shown as %) and label:

  • ≥ 0.95 (95–100%): Very High confidence match
  • 0.85–0.949… (85–94.9%): Medium confidence match
  • < 0.85 (<85%): Low confidence match

Labels reflect the current implementation; future tuning may add a distinct “High” band.
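
The banding reduces to a small lookup; this is a sketch of that logic using the thresholds above, not the exact implementation.

```ts
// Sketch of the confidence banding described above.
type ConfidenceLabel =
  | "Very High confidence match"
  | "Medium confidence match"
  | "Low confidence match";

function labelFor(confidence: number): ConfidenceLabel {
  if (confidence >= 0.95) return "Very High confidence match";
  if (confidence >= 0.85) return "Medium confidence match";
  return "Low confidence match";
}

// labelFor(0.97) → "Very High confidence match"
// labelFor(0.90) → "Medium confidence match"
// labelFor(0.60) → "Low confidence match"
```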

7) What’s Different on Paid Plans (API)

  • Data freshness: near real-time via OpenSanctions API (vs daily offline refresh).
  • Provenance: source timestamps/identifiers carried into your audit trail and PDFs.
  • Throughput: better for bulk/API workloads; UI can stream/paginate large sets.

If the API is unavailable, we fall back to the last snapshot and mark the source in the result.
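
To make the provenance point concrete, an audit-trail entry on paid plans might look roughly like this; the field names are assumptions for illustration.

```ts
// Illustrative audit-trail entry; field names are assumptions, not MatchAudit's schema.
interface AuditEntry {
  query: string;
  screenedAt: string;                               // when the screen ran (ISO timestamp)
  source: "opensanctions-api" | "offline-snapshot"; // marked on fallback
  sourceTimestamp?: string;                         // provenance timestamp from the API
  datasetVersion?: string;                          // list versioning, where provided
  results: Array<{ candidateId: string; confidence: number; label: string }>;
}
```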

8) Customer Review & Decisioning

MatchAudit does not provide human reviewers or manual adjudication services. You and your team are responsible for reviewing candidates and making final decisions per your policy.

  • Compare DOB, nationality, program/regime, roles/titles, and addresses (where available).
  • Pay attention to aliases/AKAs and non-Latin forms shown in the details.
  • Record an outcome (e.g., “Cleared”, “Escalated”, “Confirmed”) in your workflow for audit readiness.

9) Known Limitations

  • Transliteration variants and rare spellings can still be missed.
  • Strict thresholds reduce noise but may hide very weak candidates; try alternative spellings when in doubt.
  • Source list quality and identifiers vary by jurisdiction and update cycle.

Screening reduces risk but does not eliminate it. Always follow your internal due-diligence procedures.

10) Legal

Screening outputs are risk indicators, not legal determinations. MatchAudit provides timestamps, list versioning, and rationales to support audit readiness; responsibility for final decisions remains with you.

Effective date: 2025-08-16. We update this page as thresholds and sources evolve.