Chapter 8 Exercises

Sanctions Screening: Watchlists, False Positives, and Calibration


Exercise 8.1: Regime Classification

Difficulty: Introductory

For each of the following sanctions-related scenarios, identify the relevant sanctions regime(s) and the primary compliance obligation triggered.

a) A US bank receives a wire transfer instruction with a beneficiary named "Pyongyang National Finance Corp", registered in North Korea.

b) A UK bank's customer, a UK-registered company, seeks to make a payment to a company in Iran that is not on any published UK or EU sanctions list, but whose ultimate owner is on the OFAC SDN list.

c) A German bank discovers that a corporate customer's majority shareholder (60% ownership) was added to the EU Consolidated Financial Sanctions List last week.

d) An EU-incorporated bank operating a London branch wishes to process a USD wire transfer for a customer whose counterparty appears on the EU Consolidated List but not on the OFAC SDN list.

e) A fintech platform discovers that one of its business customers recently changed its name, and the new name matches an alias on the UN Al-Qaida sanctions list.


Exercise 8.2: False Positive Analysis and Threshold Trade-offs

Difficulty: Introductory-Intermediate

A community bank is evaluating threshold settings for its new sanctions screening system. Testing was conducted against a sample of 10,000 customer names, of which 8 are known true sanctions matches.

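| Threshold | Total Alerts | True Positives Captured | Missed True Matches |
|-----------|--------------|-------------------------|---------------------|
| 0.70      | 2,840        | 8                       | 0                   |
| 0.75      | 1,620        | 8                       | 0                   |
| 0.80      | 740          | 8                       | 0                   |
| 0.85      | 310          | 7                       | 1                   |
| 0.90      | 95           | 6                       | 2                   |
| 0.95      | 22           | 4                       | 4                   |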

a) Calculate the false positive rate and precision for each threshold setting.

b) At what threshold does the "missed true match" problem begin? Describe the regulatory risk associated with the first missed true match.

c) The bank's compliance team can review 50 sanctions alerts per week. Treating each threshold's total alert count as the weekly alert volume, at which thresholds is the volume operationally manageable?

d) Recommend a threshold setting and justify your recommendation. Your justification should explicitly address both the false positive operational burden and the regulatory risk of missed matches.

e) Beyond threshold adjustment, what additional controls could the bank implement to manage the alert volume at the 0.80 threshold while maintaining detection capability?
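
The metrics in part (a) can be computed with a small helper. The sketch below is one possible formulation, not the chapter's own code: it assumes "false positive rate" means false alerts divided by the number of non-matching names in the sample (some practitioners instead report the share of alerts that are false, which is 1 minus precision). The function name, the field names, and the example figures are hypothetical and deliberately differ from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningMetrics:
    false_positives: int
    precision: float            # true positives / total alerts
    false_positive_rate: float  # false positives / non-matching names
    recall: float               # true positives / known true matches


def screening_metrics(total_names, true_matches, total_alerts, true_positives):
    """Derive alert-quality metrics for one threshold setting.

    Assumption: every alert is either a true positive or a false positive,
    so false positives = total alerts - true positives.
    """
    false_positives = total_alerts - true_positives
    non_matches = total_names - true_matches
    return ScreeningMetrics(
        false_positives=false_positives,
        precision=true_positives / total_alerts if total_alerts else 0.0,
        false_positive_rate=false_positives / non_matches if non_matches else 0.0,
        recall=true_positives / true_matches if true_matches else 0.0,
    )


# Hypothetical sample: 1,000 names, 5 known matches, 105 alerts, all 5 captured.
example = screening_metrics(1_000, 5, 105, 5)
print(example)
```

Applying the same function to each row of the table gives the values requested in part (a). Recall is included because it makes the cost of a missed match in part (b) directly visible.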


Exercise 8.3: Supporting Data Integration

Difficulty: Intermediate

The following sanctions alerts have been generated for a financial institution. For each, assess the likely false positive probability based on the available supporting data and recommend a review priority (HIGH/MEDIUM/LOW). Provide your reasoning.

Alert A:
- Customer name: "Mahmoud Al-Rashidi" (DOB: 1975-03-14, Nationality: UK)
- Matched watchlist entry: "Mahmoud Al-Rashidi" (DOB: 1975-03-14, Nationality: Iraqi)
- Name similarity score: 1.00 (exact match)

Alert B:
- Customer name: "Chen Wei" (DOB: 1988-11-22, Nationality: Singapore)
- Matched watchlist entry: "Chen Wei" (DOB: 1962-05-07, Nationality: PRC)
- Name similarity score: 1.00 (exact match)

Alert C:
- Customer name: "Viktor Petrov" (DOB: 1979-08-09, Nationality: Russian)
- Matched watchlist entry: "Viktor Petrov" (DOB: 1979-08-09, Nationality: Russian)
- Name similarity score: 1.00 (exact match)

Alert D:
- Customer name: "Fatima Al-Mansouri" (DOB: 1990-02-14, Nationality: UAE)
- Matched watchlist entry: "Fatimah Al-Mansouri" (DOB: unknown, Nationality: unknown)
- Name similarity score: 0.88 (fuzzy match)

Alert E:
- Customer name: "Global Trade Finance Ltd" (Registered: UK)
- Matched watchlist entry: "Global Trade Finance Co" (Registered: Iran)
- Name similarity score: 0.86


Exercise 8.4: Payment Screening Design

Difficulty: Intermediate

A digital payment platform processes an average of 150,000 wire transfers per day, primarily UK domestic and EU cross-border payments, with approximately 8,000 international wire transfers per day involving non-EU counterparties.

Design a tiered payment screening architecture that:

a) Screens all payments against the required watchlists before execution

b) Operates within the payment processing latency constraints (< 5 seconds for domestic payments; < 30 seconds for international)

c) Prioritizes depth of screening for higher-risk payment types

d) Manages the alert volume to a level reviewable by a team of 4 analysts

In your design, specify:
- Which payment types receive which level of screening
- How alerts are held and managed during analyst review
- What information is available to the analyst for each alert
- What escalation path exists for confirmed true matches
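
Requirement (d) can be turned into a concrete alert-rate budget before any tiering decisions are made. The sketch below is a back-of-the-envelope calculation only. The daily payment volume and the team size come from the exercise; the per-analyst review capacity is a hypothetical placeholder that you should replace with a figure you can justify.

```python
def max_sustainable_alert_rate(payments_per_day, analysts, alerts_per_analyst_per_day):
    """Return the largest fraction of payments that can alert without a backlog."""
    if payments_per_day <= 0:
        raise ValueError("payments_per_day must be positive")
    return analysts * alerts_per_analyst_per_day / payments_per_day


PAYMENTS_PER_DAY = 150_000       # from the exercise
ANALYSTS = 4                     # from the exercise
ALERTS_PER_ANALYST_PER_DAY = 40  # hypothetical assumption, not from the exercise

rate = max_sustainable_alert_rate(
    PAYMENTS_PER_DAY, ANALYSTS, ALERTS_PER_ANALYST_PER_DAY
)
print(f"Daily review capacity: {ANALYSTS * ALERTS_PER_ANALYST_PER_DAY} alerts")
print(f"Maximum sustainable alert rate: {rate:.4%} of payments")
```

A budget of this kind gives each tier in your design a target: the combined alert output of all tiers has to stay below the team's review capacity, which is one argument for reserving the most sensitive matching for the roughly 8,000 non-EU international payments.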


Coding Exercise 8.5: Build a Multi-Algorithm Screening Engine

Difficulty: Coding — Intermediate

Extend the SanctionsScreener class from Section 8.2 to implement a multi-algorithm ensemble matching approach:

  1. Implement three separate matching functions:
     - levenshtein_similarity(s1, s2) (already provided; use or improve)
     - ngram_similarity(s1, s2, n=2): n-gram overlap score
     - token_sort_similarity(s1, s2): sort tokens alphabetically, then compare (handles word-order variations such as "James Brown" vs "Brown James")

  2. Implement an ensemble_score(name1, name2, weights=(0.4, 0.3, 0.3)) function that combines all three scores using the provided weights.

  3. Modify SanctionsScreener.screen() to use the ensemble score instead of Levenshtein alone.

  4. Test your implementation with the following name pairs and explain why each algorithm produces different scores:
     - "Mohammed Al-Hassan" vs "Mohamed Al-Hassan" (transliteration variant)
     - "Hassan Mohammed Al-" vs "Al-Hassan Mohammed" (word order variant)
     - "Abdurahman" vs "Abdurrahman" (doubled consonant)
     - "John Smith" vs "Smith John" (word reversal)
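
As a starting point, the two new matching functions might look like the sketch below. It rests on several assumptions that are not taken from the chapter: n-gram overlap is computed as a Dice coefficient over sets of character n-grams (so repeated n-grams count once), and difflib.SequenceMatcher from the standard library stands in for the chapter's levenshtein_similarity inside token_sort_similarity. SequenceMatcher's ratio is not a Levenshtein distance, so swap in the provided function when you integrate this with SanctionsScreener.

```python
from difflib import SequenceMatcher


def _ngrams(s, n):
    """Return the set of character n-grams in s (case-insensitive)."""
    s = s.lower()
    if len(s) < n:
        return {s} if s else set()
    return {s[i:i + n] for i in range(len(s) - n + 1)}


def ngram_similarity(s1, s2, n=2):
    """Dice coefficient over character n-gram sets, in the range 0.0 to 1.0."""
    a, b = _ngrams(s1, n), _ngrams(s2, n)
    if not a and not b:
        return 1.0
    if not a or not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def token_sort_similarity(s1, s2):
    """Sort whitespace-separated tokens, then compare the rejoined strings."""
    def normalise(s):
        return " ".join(sorted(s.lower().split()))
    # Stand-in comparison; replace with levenshtein_similarity from Section 8.2.
    return SequenceMatcher(None, normalise(s1), normalise(s2)).ratio()


pairs = [
    ("Mohammed Al-Hassan", "Mohamed Al-Hassan"),
    ("Hassan Mohammed Al-", "Al-Hassan Mohammed"),
    ("Abdurahman", "Abdurrahman"),
    ("John Smith", "Smith John"),
]
for left, right in pairs:
    print(left, "|", right,
          "| ngram:", round(ngram_similarity(left, right), 3),
          "| token_sort:", round(token_sort_similarity(left, right), 3))
```

Note that splitting on whitespace keeps "Al-Hassan" as a single token, so the second pair is not treated as a pure reordering. Whether to split on hyphens as well is a design decision worth discussing in your answer.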


Coding Exercise 8.6: Designation-Triggered Screening Simulation

Difficulty: Coding — Advanced

Write a Python simulation of a designation-triggered screening process:

  1. Create a SanctionsListManager class that:
     - Maintains a current watchlist (list of dict records)
     - Has a method add_designation(entity: dict) -> list[str] that adds a new designation and returns the IDs of all customers who now match the new designation (above threshold)
     - Uses the SanctionsScreener to perform the matching

  2. Simulate the following scenario:
     - Initialize with 1,000 synthetic customer records (random names, DOBs, nationalities)
     - Initialize with a watchlist of 50 synthetic sanctions entries
     - Add 3 new designations sequentially
     - For each new designation, report how many customers were re-screened, how many new potential matches were found, and how quickly the re-screening completed (time the operation)

  3. Your simulation should demonstrate why designation-triggered screening requires compute capacity planning: if adding one new designation triggers re-screening of 1,000 customers in your simulation, extrapolate what this means for an institution with 5 million customers and a typical week where OFAC publishes 3–5 new designations.
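
The timing and extrapolation in steps 2 and 3 could be scaffolded along the following lines. This is a minimal sketch under stated assumptions, not a solution: it does not implement SanctionsListManager, it uses difflib.SequenceMatcher as a stand-in for the chapter's SanctionsScreener, it generates random letter strings rather than realistic names, and it assumes re-screening cost scales linearly with customer count on a single thread. All function names and the designation name are hypothetical.

```python
import random
import string
import time
from difflib import SequenceMatcher


def random_name(rng, length=8):
    """Generate a synthetic, meaningless name token."""
    letters = (rng.choice(string.ascii_lowercase) for _ in range(length))
    return "".join(letters).title()


def rescreen(customers, designation_name, threshold=0.80):
    """Return IDs of customers whose name scores at or above the threshold."""
    target = designation_name.lower()
    return [
        customer_id
        for customer_id, name in customers.items()
        if SequenceMatcher(None, name.lower(), target).ratio() >= threshold
    ]


def extrapolate_seconds(measured_seconds, measured_customers,
                        target_customers, designations):
    """Linear extrapolation: assumes cost per comparison stays constant."""
    return measured_seconds * target_customers * designations / measured_customers


rng = random.Random(42)  # fixed seed so runs are repeatable
customers = {
    f"C{i:04d}": f"{random_name(rng)} {random_name(rng)}" for i in range(1000)
}

start = time.perf_counter()
matches = rescreen(customers, "Example Designee")
elapsed = time.perf_counter() - start

# 5,000,000 customers and up to 5 designations per week, as in the exercise.
projected = extrapolate_seconds(elapsed, len(customers), 5_000_000, designations=5)
print(f"Re-screened {len(customers)} customers in {elapsed:.4f}s, "
      f"{len(matches)} potential matches")
print(f"Projected single-threaded time at scale: {projected / 60:.1f} minutes")
```

The linear projection is a lower bound in practice: real watchlist entries carry multiple aliases, and production screening also compares dates of birth and nationalities, each of which multiplies the number of comparisons per customer.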


Applied Exercise 8.7: The Demographic Equity Audit

Difficulty: Applied — Research Required

The chapter notes that common names from certain cultural backgrounds generate disproportionately high false positive rates in sanctions screening. This has equity implications: customers from these backgrounds experience higher rates of payment delay and compliance friction.

a) Research the concept of "de-risking" and its relationship to sanctions compliance. How has over-aggressive sanctions screening contributed to financial exclusion for certain demographic groups and geographic regions?

b) Several jurisdictions (US, UK, EU) have issued guidance on "de-risking" as a concern. Summarize the regulatory position: is over-aggressive screening that results in de-risking considered a compliance success or a compliance failure by regulators?

c) A financial institution's analytics team finds that customers with Arabic names generate 4.7x more sanctions false positive alerts per customer than customers with Anglo-Saxon names. The institution is considering implementing separate, less sensitive screening thresholds for this customer segment to reduce the disparity. What are the arguments for and against this approach from both a compliance and an equity perspective?

d) What technical or process improvements could reduce the demographic disparity in false positive rates without compromising screening effectiveness?

Write a 400-word summary addressing all four questions.