Chapter 23: Quiz — NLP for Regulatory Intelligence and Horizon Scanning
14 questions covering the concepts, techniques, and practical applications from Chapter 23. Questions progress from foundational recall to applied analysis.
Questions
Question 1
Which of the following best describes the core structural problem that regulatory intelligence NLP is designed to address?
A. Regulators use inconsistent terminology across jurisdictions, making manual reading unreliable
B. Compliance analysts lack the legal training to interpret regulatory obligations without assistance
C. The volume of regulatory output exceeds the throughput capacity of human-only monitoring processes
D. Regulatory documents are published in too many languages for a single compliance team to cover
Question 2
A regulatory intelligence system classifies a single ESMA document as: Topic = "Reporting," Topic = "Algorithmic Trading," Business Line = "Trading," Business Line = "Asset Management," Urgency = "High." This is an example of:
A. Single-label classification applied twice to the same document
B. Multi-label classification, where a document receives multiple labels across multiple dimensions
C. An error — a document should belong to only one topic category
D. Hierarchical classification, where labels must be nested under a single parent category
Question 3
Priya's firm is considering using FinBERT rather than general-purpose BERT for classifying regulatory documents. The primary reason for this choice is:
A. FinBERT is significantly faster at inference time than BERT
B. FinBERT was pre-trained on financial and regulatory text corpora, providing better representations of domain-specific vocabulary
C. FinBERT requires less labeled training data than BERT
D. FinBERT supports multi-label classification natively, while BERT does not
Question 4
Named entity recognition (NER) in a regulatory intelligence system is used to extract which of the following from regulatory text? Select all that apply:
A. Regulatory references (e.g., "Article 26 of MiFIR")
B. Effective and compliance dates
C. Sentiment scores for the regulator's tone
D. Types of firms to which a regulation applies
E. Specific financial instruments referenced in the document
Question 5
A compliance analyst searches a regulatory corpus for documents about "transaction monitoring threshold requirements." The keyword search returns three results. The semantic search over the same corpus returns eleven results, including several that discuss "suspicious transaction trigger levels," "monitoring calibration standards," and "STR alert thresholds." Which statement best explains this difference?
A. The semantic search had lower precision because it retrieved more documents
B. Semantic search represents documents and queries as dense vectors in a meaning space, retrieving conceptually related content regardless of exact wording
C. Keyword search uses a smaller index than semantic search, limiting its coverage
D. The regulatory corpus was indexed incorrectly for keyword search
Question 6
A compliance team receives a delta analysis report comparing ESMA's 2022 transaction reporting guidelines with the amended 2025 version. The report flags a paragraph in Article 4 as "semantically changed." However, reading both versions, the team finds that the sentence structure was simplified but the substantive obligation appears identical. What does this finding suggest about the delta analysis system?
A. The system is malfunctioning and should be replaced
B. Semantic change detection has a sensitivity that may flag editorial changes; human review of flagged paragraphs remains necessary to assess substantive significance
C. The team has misread the amendment; semantic change detection is authoritative
D. The system correctly identified a hidden substantive change that the team missed
Question 7
Obligation extraction from regulatory text is described in the chapter as the hardest NLP task in regulatory intelligence. Which of the following best explains why?
A. Regulatory documents are too long for current NLP models to process in a single pass
B. Obligations are expressed in many forms — explicit, conditional, implicit, and cross-referenced — and missed obligations create direct compliance gaps
C. Regulators intentionally obscure obligations to preserve interpretive discretion
D. The NER models required for obligation extraction are not yet commercially available
Question 8
A compliance team deploys an LLM-based Q&A chatbot over their regulatory corpus without a RAG architecture. A user asks: "Does our firm need to submit a DORA Article 19 incident report for service disruptions affecting non-critical systems?" The chatbot responds confidently with a specific answer. Which of the following is the most appropriate next step for the compliance team member?
A. Accept the answer — LLMs trained on regulatory text are reliable for specific article queries
B. Verify the chatbot's answer against the primary text of DORA Article 19 before relying on it for compliance purposes
C. Ask the chatbot to repeat its answer to confirm consistency
D. Escalate immediately to external legal counsel, as LLM answers are never usable
Question 9
Explain the architecture of a RAG (Retrieval-Augmented Generation) system for regulatory Q&A. Your answer should describe the two-phase process (indexing and query), the role of the vector database, and why RAG reduces hallucination compared to a context-free LLM query.
[Open-response question — see answer key for model response]
Question 10
A compliance manager reviews the monthly metrics for the firm's regulatory intelligence platform and finds that the trading compliance team has marked 78% of their alerts as "Not Relevant" over the past three months. The most likely explanation for this finding, and the most appropriate remedial action, is:
A. The trading team is not taking regulatory review seriously; remediation is a training and culture issue
B. The classification model's routing rules for "Trading" are too broad — the precision is low, and the classifier should be recalibrated based on the team's feedback
C. The regulatory environment has been quieter than usual; 78% irrelevance is acceptable variation
D. The alert system has a technical fault in the routing table; the IT team should investigate
Question 11
The chapter describes the "80/20 principle" in regulatory intelligence automation. In this context, what does the 80 percent represent, and what does the 20 percent represent?
A. 80% of regulatory documents are from the top five regulators; 20% are from niche bodies
B. 80% of compliance budget is spent on technology; 20% on human review
C. Automation handles approximately 80% of routine monitoring work; the remaining 20% involves genuinely ambiguous cases where human compliance judgment is essential
D. 80% of obligations are extracted accurately by NLP; 20% require manual identification
Question 12
Why is tracking effective dates a particularly important component of obligation extraction in regulatory intelligence systems?
A. Effective dates determine which version of a regulation the classification model should use
B. Regulatory systems cannot process documents without a valid effective date field
C. Effective dates define the compliance deadline — missing an effective date means a firm may not complete required changes before legal obligations apply, potentially resulting in breach
D. Effective dates are used to prioritize regulatory publications in the ingestion queue
Question 13
A mid-sized UK investment manager is deciding whether to build a custom regulatory intelligence platform or license a commercial vendor. The firm covers primarily UK and EU regulatory bodies, follows standard asset management regulation (FCA, ESMA, PRA), and has a compliance team of six. Which recommendation is most defensible?
A. Build custom — a six-person team has enough capacity to manage an in-house NLP system
B. Buy commercial — the firm's standard jurisdictional footprint is well-covered by existing platforms, and the ongoing data science and taxonomy management costs of a custom build would likely exceed licensing costs without providing proportionate benefit
C. Build custom — commercial vendors cannot cover EU regulatory content adequately
D. Buy commercial — but only from vendors headquartered in the UK to ensure post-Brexit compliance
Question 14
A compliance officer argues that the firm does not need to maintain audit trails for regulatory intelligence decisions because the obligation register already records what requirements apply. A senior compliance director disagrees. Who is correct, and why?
A. The compliance officer is correct — the obligation register is sufficient evidence of regulatory compliance
B. The senior compliance director is correct — regulators expect to see not just what obligations were identified, but evidence of how the firm's process identified them, who reviewed them, when, what conclusions were reached, and what remediation was taken. The audit trail is the process evidence; the obligation register is the outcome. Both are necessary.
C. Both are partially correct — audit trails are required only for High Urgency obligations
D. The compliance officer is correct — audit trails are a technology feature, not a regulatory requirement
Answer Key
Q1: C. The core problem is throughput: the volume of regulatory publications exceeds the capacity of a human-only monitoring process. The chapter quantifies this as hundreds of regulatory change alerts per day for large institutions. This is a structural problem, not a skills or language problem.
Q2: B. Multi-label classification allows a single document to receive multiple labels across multiple dimensions simultaneously. This is essential for regulatory text because a single publication commonly spans multiple topics and applies to multiple business lines. Single-label classification would force an artificial choice.
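The multi-label pattern can be illustrated with a toy tagger. The keyword rules below are purely illustrative stand-ins for a fine-tuned transformer classifier; the point is the output shape — each dimension can legitimately emit several labels for one document.

```python
# Toy multi-label tagger: a stand-in for a fine-tuned transformer classifier.
# The keyword rules and sample document are invented for illustration.
TOPIC_KEYWORDS = {
    "Reporting": ["report", "submission"],
    "Algorithmic Trading": ["algorithmic", "algo trading"],
}
BUSINESS_LINE_KEYWORDS = {
    "Trading": ["trading", "order execution"],
    "Asset Management": ["fund", "portfolio", "asset management"],
}

def tag(text, keyword_map):
    """Return every label whose keywords appear in the text (multi-label)."""
    lowered = text.lower()
    return sorted(
        label
        for label, words in keyword_map.items()
        if any(w in lowered for w in words)
    )

doc = ("ESMA guidance on transaction reporting obligations for "
       "algorithmic trading desks and asset management portfolios.")
labels = {
    "topics": tag(doc, TOPIC_KEYWORDS),
    "business_lines": tag(doc, BUSINESS_LINE_KEYWORDS),
}
print(labels)
# One document, multiple labels per dimension — no artificial single choice.
```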
Q3: B. FinBERT was pre-trained on financial corpora, which means its tokenizer and embedding weights already capture financial vocabulary — including regulatory and legal terminology — far more faithfully than general-purpose BERT's. This provides a better initialization point for fine-tuning on regulatory classification, requiring less labeled data to achieve comparable performance.
Q4: A, B, D, E. NER in regulatory intelligence extracts regulatory references, dates, firm types, and financial instruments. Sentiment scoring is a different NLP task (sentiment analysis) and is not typically considered part of NER-based regulatory intelligence. The four correct answers represent the actionable metadata that NER extracts from regulatory text.
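The kinds of entities involved can be sketched with simple patterns. A real system would use a trained sequence-labelling NER model; the regexes and sample sentence below are illustrative only, showing the metadata categories (references, dates, firm types) rather than a production extractor.

```python
import re

# Pattern-based extraction as a crude stand-in for a trained NER model.
# Sample text and patterns are invented for illustration.
text = ("Firms must comply with Article 26 of MiFIR by 1 October 2025. "
        "Investment firms trading equity derivatives are in scope.")

# Regulatory references, e.g. "Article 26 of MiFIR"
reg_refs = re.findall(r"Article \d+ of [A-Z][A-Za-z]+", text)

# Effective/compliance dates in "D Month YYYY" form
dates = re.findall(
    r"\b\d{1,2} (?:January|February|March|April|May|June|July|"
    r"August|September|October|November|December) \d{4}\b", text)

# Firm types the regulation applies to
firm_types = re.findall(r"\b[Ii]nvestment firms?\b", text)

print(reg_refs, dates, firm_types)
```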
Q5: B. Semantic search encodes both documents and queries as dense vectors in a semantic embedding space, where proximity represents conceptual similarity. Documents discussing "STR alert thresholds" and "monitoring calibration standards" are semantically close to "transaction monitoring threshold requirements" even without shared vocabulary. Keyword search cannot make this connection because it operates on exact term matching.
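The retrieval mechanics can be shown with cosine similarity over dense vectors. The three-dimensional vectors below are hand-picked toy values standing in for embeddings that a sentence transformer would produce in a real system; only the comparison logic is the point.

```python
import math

# Hand-picked toy vectors standing in for sentence-transformer embeddings.
corpus = {
    "STR alert thresholds":             [0.9, 0.8, 0.1],
    "monitoring calibration standards": [0.8, 0.7, 0.2],
    "annual report typography rules":   [0.0, 0.1, 0.9],
}
query_text = "transaction monitoring threshold requirements"
query_vec = [0.85, 0.75, 0.15]  # toy embedding of the query

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Keyword search: only documents sharing an exact query term match.
keyword_hits = [d for d in corpus
                if any(w in d.split() for w in query_text.split())]

# Semantic search: rank every document by vector proximity instead.
ranked = sorted(corpus, key=lambda d: cosine(query_vec, corpus[d]),
                reverse=True)
print(keyword_hits, ranked)
# "STR alert thresholds" shares no query term yet ranks first semantically.
```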
Q6: B. Semantic change detection computes vector distance between paragraph representations across document versions. Editorial reformatting can shift vector representations enough to exceed the detection threshold even without substantive change. This is a known limitation of semantic delta analysis: flagged changes require human review to assess whether they are substantively significant or merely stylistic. The finding does not indicate system malfunction — it indicates the system is working as designed, and that human review of flagged paragraphs is a necessary part of the workflow.
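The thresholding workflow looks like this. A real delta-analysis system compares embedding vectors; here `difflib.SequenceMatcher`'s character-level ratio is used as a crude similarity stand-in, and the article text is invented — the point is that a rewrite trips the flag even when the obligation is unchanged.

```python
from difflib import SequenceMatcher

# Paragraph-level delta analysis. SequenceMatcher.ratio() stands in for
# embedding similarity; sample article text is invented for illustration.
old = {
    "Art. 4(1)": "Firms shall submit transaction reports no later than "
                 "the close of the following working day.",
    "Art. 4(2)": "Reports shall include the identity of the executing firm.",
}
new = {
    "Art. 4(1)": "Transaction reports must be submitted by firms before "
                 "the close of the next working day.",
    "Art. 4(2)": "Reports shall include the identity of the executing firm.",
}

THRESHOLD = 0.9  # similarity below this is flagged for human review

flagged = [ref for ref in old
           if SequenceMatcher(None, old[ref], new[ref]).ratio() < THRESHOLD]
print(flagged)
# Art. 4(1) is flagged although the obligation is substantively identical:
# the system surfaces candidates, humans judge significance.
```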
Q7: B. Obligation extraction is difficult because regulatory obligations take many linguistic forms — explicit "shall" statements, conditional requirements, definitional expansions that implicitly extend existing obligations, and obligations embedded in cross-references to other documents. Missing an obligation in the extraction process is not a technical inconvenience: it creates a compliance gap that may result in regulatory breach. The stakes are what make it the hardest task.
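The gap between the easy and hard cases is easy to demonstrate. A naive deontic-pattern spotter (sample text invented) catches explicit "shall"/"must" statements but, by construction, misses every conditional, implicit, and cross-referenced obligation — which is exactly why the task is hard.

```python
import re

# Naive explicit-obligation spotter. It finds only "shall"/"must" sentences
# and would miss conditional, implicit, and cross-referenced obligations.
# Sample regulatory text is invented for illustration.
text = ("A firm shall maintain records for five years. Where a firm "
        "executes orders algorithmically, it must notify the authority. "
        "Records may be stored electronically.")

sentences = re.split(r"(?<=\.)\s+", text)
obligations = [s for s in sentences if re.search(r"\b(shall|must)\b", s)]
print(obligations)
# Permissive "may" sentences are correctly excluded, but so would be an
# obligation phrased as "is required under Article 12" — a false negative.
```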
Q8: B. The correct response is to verify the answer against the primary text of DORA Article 19. An LLM without RAG answers from training memory, which may be outdated, incomplete, or hallucinated. DORA is a complex and recent regulation; the LLM's training data may not accurately reflect its current provisions. Verification against the primary source is the minimum due diligence required before relying on the answer for compliance purposes. Neither blind acceptance nor immediate escalation to external counsel is the proportionate first step.
Q9: Model response. A RAG system operates in two phases. In the indexing phase, all regulatory documents in the corpus are split into chunks (typically paragraphs), each chunk is encoded into a numerical vector by a sentence transformer model (such as all-MiniLM-L6-v2), and these vectors are stored in a vector database (such as FAISS, Pinecone, or Chroma). The vector database supports fast nearest-neighbor search over high-dimensional vectors.
In the query phase, when a user asks a question, that question is also encoded into a vector by the same sentence transformer. The vector database retrieves the N most similar document chunks (those whose vectors are closest to the query vector). These chunks are provided to the LLM as context, along with the user's question. The LLM then answers the question based on the provided context, citing the specific passages it drew from.
RAG reduces hallucination compared to context-free LLM queries because the LLM is constrained to answer based on the retrieved document text rather than from memory. When the answer must be grounded in provided text, the LLM cannot invent regulatory provisions that were not in the source documents. Every claim the LLM makes can be traced to a specific retrieved passage, enabling verification. RAG does not eliminate hallucination entirely — LLMs can misread context — but it makes outputs verifiable, which is the crucial property for compliance use.
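The two-phase process in the model response can be sketched end to end. Bag-of-words counts stand in for the sentence-transformer embeddings and a plain list stands in for the vector database; the sample DORA-style chunks and the question are invented, and the final prompt is what would be sent to the LLM.

```python
import math
from collections import Counter

# Minimal RAG sketch: index chunks, embed the query, retrieve the nearest
# chunk, and build a grounded prompt. Corpus text is invented; a real
# system would use a sentence transformer and a vector database.
chunks = [
    "Article 19 requires firms to report major ICT-related incidents.",
    "Article 5 sets out governance requirements for ICT risk management.",
    "Article 28 covers contractual arrangements with third-party providers.",
]

def embed(text):
    # Bag-of-words counts as a toy stand-in for dense embeddings.
    return Counter(text.lower().replace(".", "").replace("?", "").split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

index = [(c, embed(c)) for c in chunks]          # indexing phase

question = "When must we report a major ICT incident under Article 19?"
q_vec = embed(question)                          # query phase
best = max(index, key=lambda item: cosine(q_vec, item[1]))[0]

prompt = f"Answer using ONLY this context:\n{best}\n\nQuestion: {question}"
print(prompt)  # this grounded prompt is what the LLM would receive
```

Constraining the LLM to the retrieved context is what makes every claim traceable to a specific passage, as the model response notes.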
Q10: B. A 78% irrelevance rate indicates a precision problem in the classification and routing system. The trading team's alert stream contains too many documents that are genuinely not relevant to their area. The correct remediation is to analyze the false positive alerts, identify why they were classified as "Trading / Relevant," and adjust either the classification model's decision boundary for that business line or the routing rules that determine which classified documents are sent to that team. Training and culture are not the explanation; the workflow data is the signal.
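The arithmetic behind the diagnosis: reviewer feedback gives routing precision directly, since the "Not Relevant" rate is its complement. The monthly alert volume below is a hypothetical figure; only the 78% rate comes from the scenario.

```python
# Routing precision from reviewer feedback. The 78% "Not Relevant" rate is
# from the scenario; the monthly alert volume is a hypothetical figure.
alerts_reviewed = 400
not_relevant_rate = 0.78

relevant = round(alerts_reviewed * (1 - not_relevant_rate))
precision = relevant / alerts_reviewed
print(f"routing precision: {precision:.0%}")
# Only about 1 in 5 alerts is useful — a recalibration signal, not a
# culture problem.
```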
Q11: C. The 80/20 principle in this context refers to the division of labor between automation and human judgment. Approximately eighty percent of the regulatory intelligence workflow — ingestion, classification, routing, obligation extraction from clear regulatory text — can be automated effectively. The remaining twenty percent consists of cases where classification confidence is low, applicability depends on firm-specific facts, regulatory interpretation is contested, or obligations are expressed ambiguously. These cases require the legal knowledge, business understanding, and professional experience of a trained compliance professional.
Q12: C. Effective dates define when a regulatory obligation becomes legally binding. An obligation extract without an effective date leaves the compliance team without the information they need to schedule remediation work. If the compliance team does not know that an obligation applies from 1 October 2025, they cannot plan to be ready by that date — and if they are not ready, they are in breach of the obligation on day one of its effectiveness. Effective date extraction is one of the most operationally consequential outputs of the obligation extraction module.
Q13: B. A mid-sized firm with a standard UK/EU jurisdictional footprint and asset management regulatory profile is well-suited to a commercial platform. Thomson Reuters Regulatory Intelligence, Wolters Kluwer FRR, and Compliance.ai all cover FCA and ESMA content comprehensively. The ongoing costs of a custom build — data science capacity for model maintenance, taxonomy management, regulatory content sourcing, and workflow integration — would consume significant resources for a six-person compliance team, likely without providing proportionate benefit over a mature commercial product. Custom builds are most justified when the firm's regulatory footprint is genuinely unusual or when very tight integration with proprietary systems is required.
Q14: B. The senior compliance director is correct. Regulators conducting reviews of a firm's compliance program want to see evidence of how the regulatory intelligence process works: that documents are systematically identified, reviewed by an accountable person, assessed for firm-specific impact, and tracked through to remediation. The obligation register records what obligations were identified — the outcomes. The audit trail records the process: who reviewed the document, on what date, what they concluded, and what actions they took. Both are required. A process without an audit trail cannot be demonstrated to have existed, which is the same as not having had the process at all.