Case Study 1: JPMorgan's COIN — How Python Replaced 360,000 Hours of Legal Work

Background

In early 2017, JPMorgan Chase & Co. disclosed a program called COIN — short for Contract Intelligence — that had quietly gone into production the previous year. The system used natural language processing (NLP) to review commercial loan agreements, a task that had previously consumed approximately 360,000 hours of work annually by lawyers and loan officers. COIN could review the same documents in seconds.

The announcement sent ripples through both the financial services and legal industries. Headlines oscillated between alarm ("AI is replacing lawyers") and enthusiasm ("JPMorgan saves millions with machine learning"). But the real story was more nuanced — and more instructive for business professionals learning to work with code and data.

This case study examines what COIN actually did, how it was built, what it replaced, and what it tells us about the role of Python-based automation in professional services. The goal is not to teach you to build an NLP system (that comes in Chapter 14) but to illustrate the central theme of Chapter 3: code is a business tool that multiplies human capability.

The Problem: 12,000 Agreements, 150 Attributes Each

JPMorgan's investment banking division handles thousands of commercial loan agreements every year. Each agreement is a dense legal document, typically 10 to 20 pages, specifying terms such as interest rates, collateral requirements, borrower covenants, grace periods, prepayment penalties, and dozens of other attributes.

Before COIN, the process for reviewing these agreements was entirely manual:

  1. Document intake. A loan agreement arrived, usually as a PDF or scanned image.
  2. Manual review. A lawyer or trained loan officer read the entire agreement, identifying and extracting approximately 150 specific data points per document.
  3. Data entry. The extracted information was entered into JPMorgan's internal systems — compliance databases, risk models, and portfolio management tools.
  4. Quality check. A second reviewer verified the extraction to catch errors.

This process was expensive in three ways:

  • Time. At an average of 30 hours per agreement (including review, extraction, entry, and verification), 12,000 agreements consumed 360,000 person-hours per year.
  • Cost. At fully loaded professional rates, this represented tens of millions of dollars annually in labor costs.
  • Error. Human review of repetitive, detail-heavy documents is error-prone. Missed covenants or misrecorded terms could expose the bank to regulatory risk or financial loss.

The challenge was not that the work was intellectually complex — it was that it was voluminous, repetitive, and detail-intensive. These are precisely the characteristics that make a task a candidate for automation.

The Solution: Natural Language Processing at Scale

COIN used a combination of NLP techniques to automate the extraction of key terms from loan agreements:

  • Optical Character Recognition (OCR) converted scanned documents into machine-readable text.
  • Named Entity Recognition (NER) identified specific entities — party names, dates, monetary amounts, percentages — within the text.
  • Pattern matching and classification mapped extracted entities to the 150 required data fields.
  • Confidence scoring flagged extractions where the model was uncertain, routing those items to human reviewers rather than auto-populating.
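The pattern-matching and flagging steps above can be sketched in a few lines of Python. This is an illustration only, not JPMorgan's implementation: the sample clause, field names, and regular expressions are invented for the example, and a production system would rely on trained NER models rather than bare regexes.

```python
import re

# Hypothetical snippet of loan-agreement text (invented for illustration).
text = ("The Borrower shall repay the principal amount of $2,500,000 "
        "at an interest rate of 4.25% per annum, maturing on 2019-06-30.")

# Simple patterns standing in for the NER / pattern-matching stage.
patterns = {
    "loan_amount":   r"\$([\d,]+)",
    "interest_rate": r"([\d.]+)%",
    "maturity_date": r"(\d{4}-\d{2}-\d{2})",
}

extracted = {}
for field, pattern in patterns.items():
    match = re.search(pattern, text)
    # When no confident match is found, route the field to a human
    # reviewer instead of auto-populating -- COIN's escalation idea.
    extracted[field] = match.group(1) if match else "NEEDS_REVIEW"

print(extracted)
# {'loan_amount': '2,500,000', 'interest_rate': '4.25', 'maturity_date': '2019-06-30'}
```

Real NER handles far messier language than this, but the routing logic is the important idea: every field either extracts cleanly or is flagged for a human.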

The system was built primarily in Python, leveraging the language's deep ecosystem of NLP and machine learning libraries. While JPMorgan did not publish its internal codebase, the typical technology stack for a system of this kind includes:

  • pandas for data manipulation and structuring extracted information
  • scikit-learn for classification models
  • spaCy or NLTK for text processing and named entity recognition
  • PyPDF2 for PDF parsing and Tesseract (typically via pytesseract) for OCR
  • Custom-trained models for domain-specific legal language

The development team — a mix of technologists, data scientists, and legal domain experts — spent approximately 18 months building, training, and validating the system before production deployment.

What COIN Did Not Do

Understanding what COIN did not do is as important as understanding what it did.

COIN did not replace lawyers. It replaced one specific task that lawyers performed — the manual extraction of structured data from standardized documents. Lawyers were still needed for:

  • Negotiating agreement terms
  • Advising on unusual or non-standard provisions
  • Interpreting ambiguous language
  • Making judgment calls about risk
  • Reviewing the cases that COIN flagged as low-confidence

COIN did not make decisions. It extracted data. The decisions about whether to approve a loan, how to price risk, or whether a covenant violation had occurred were still made by humans using the data COIN provided.

COIN did not work on all documents. The system was trained on commercial loan agreements — a specific, relatively standardized document type. It was not a general-purpose legal document reader. Extending it to other document types (mergers and acquisitions contracts, regulatory filings, litigation documents) would require additional training data and model development.

The Business Impact

JPMorgan reported several quantifiable benefits:

  • Time per agreement. Before: approximately 30 hours. After: seconds of machine processing, plus minutes of human review for flagged items.
  • Annual hours consumed. Before: 360,000. After: an estimated fewer than 40,000, spent on human oversight of flagged extractions.
  • Error rate. Before: industry-typical manual rates of 2-5% per field. After: significantly reduced, since machine extraction is consistent.
  • Scalability. Before: linear (more agreements meant proportionally more hours). After: near-constant (processing 12,000 or 20,000 agreements takes roughly the same compute time).

Beyond the direct cost savings, COIN delivered strategic benefits:

  • Speed to decision. Loan officers could access extracted terms within hours of document receipt, rather than waiting days or weeks for manual review.
  • Consistency. Unlike human reviewers, COIN applied the same extraction logic to every document, every time. This reduced the variance in data quality across the portfolio.
  • Redeployment of talent. The lawyers and loan officers who had spent significant time on extraction work were reassigned to higher-value activities — client advisory, complex deal structuring, and risk analysis that requires human judgment.

The Python Connection

Why does this case study appear in a chapter about learning Python? Because COIN illustrates a pattern that recurs throughout the business applications of AI:

The hard part is not the algorithm — it is the data pipeline.

COIN's NLP models were sophisticated, but the majority of the engineering effort went into the less glamorous work:

  • Parsing PDFs with inconsistent formatting, varying page layouts, and occasional scan quality issues
  • Cleaning and standardizing text extracted from OCR (handling misspellings, encoding issues, inconsistent abbreviations)
  • Structuring extracted data into the 150-field schema that JPMorgan's downstream systems expected
  • Building validation rules that flagged suspicious extractions (e.g., a loan amount that is an order of magnitude larger than typical, or a date that falls on a weekend)

This is pandas work. It is string manipulation, data cleaning, type conversion, groupby operations, and automated quality checks — exactly the kinds of operations you learned in this chapter. The ML model sits at the center, but the data pipeline wraps around it.
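As a concrete (and purely illustrative) sketch of this kind of pipeline work, the snippet below cleans hypothetical extracted loan records with pandas: it strips currency formatting, converts strings to numeric and date types, standardizes borrower names for a groupby, and flags suspicious rows for human review. All field names and values are invented for the example.

```python
import pandas as pd

# Hypothetical extraction output: messy strings, inconsistent casing.
df = pd.DataFrame({
    "borrower":      ["ACME Corp.", "acme corp", "Globex LLC"],
    "loan_amount":   ["$1,250,000.00", "1250000", "$34,000,000.00"],
    "maturity_date": ["2019-06-28", "2019-07-15", "2019-07-13"],
})

# String cleaning and type conversion.
df["loan_amount"] = (
    df["loan_amount"].str.replace(r"[$,]", "", regex=True).astype(float)
)
df["maturity_date"] = pd.to_datetime(df["maturity_date"])
df["borrower"] = (
    df["borrower"].str.lower().str.replace(".", "", regex=False).str.strip()
)

# Validation rules: flag outlier amounts and weekend maturity dates.
median_amount = df["loan_amount"].median()
df["needs_review"] = (
    (df["loan_amount"] > 10 * median_amount)    # order-of-magnitude outlier
    | (df["maturity_date"].dt.dayofweek >= 5)   # Saturday=5, Sunday=6
)

# Aggregate exposure per (standardized) borrower.
exposure = df.groupby("borrower")["loan_amount"].sum()
print(df[["borrower", "needs_review"]])
print(exposure)
```

Nothing here is exotic: it is exactly the loading, cleaning, type-conversion, and groupby toolkit this chapter introduced, applied to a document-extraction setting.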

This is why Chapter 3 exists before Chapters 7 through 18 (the machine learning chapters). You cannot build AI systems if you cannot load, clean, transform, and validate data. And you cannot do those things efficiently without Python.

Lessons for Business Professionals

1. Automation targets repetitive, structured tasks — not judgment

COIN succeeded because it targeted a task that was high-volume, rule-based, and structured (extracting specific fields from standardized documents). It did not attempt to automate tasks requiring interpretation, negotiation, or strategic judgment.

Implication: When you evaluate automation opportunities in your organization, look for tasks that are repetitive, well-defined, and currently performed by skilled people who could be doing higher-value work. The question is not "Can we replace this person?" but "Can we free this person to do more important things?"

2. Domain expertise is non-negotiable

COIN was not built by technologists alone. It required deep legal domain expertise to define the 150 extraction fields, to train the models on legal language, and to establish the validation rules that distinguished a correct extraction from an error.

Implication: AI projects succeed when they combine technical capability with business domain knowledge. This is the core argument for MBA-level AI literacy — you do not need to build the model, but you need to understand the business problem well enough to guide the people who do.

3. The 80/20 rule applies

COIN handled the straightforward cases automatically and flagged the ambiguous ones for human review. This is a common and effective pattern: automate the 80% that is routine, and focus human attention on the 20% that requires judgment.

Implication: Do not demand 100% automation. A system that correctly handles 80% of cases and escalates the rest is far more valuable than a system that attempts 100% and fails unpredictably.
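The escalation logic behind this pattern is simple to express in code. The threshold below is illustrative, not a figure JPMorgan has published:

```python
CONFIDENCE_THRESHOLD = 0.90  # illustrative cutoff; tuned per use case in practice

def route(extraction: dict) -> str:
    """Auto-accept confident extractions; escalate the rest to a human."""
    if extraction["confidence"] >= CONFIDENCE_THRESHOLD:
        return "auto"
    return "human_review"

# Hypothetical batch of model outputs.
batch = [
    {"field": "interest_rate", "value": "4.25%",   "confidence": 0.98},
    {"field": "grace_period",  "value": "30 days", "confidence": 0.62},
]
decisions = [route(item) for item in batch]
print(decisions)  # ['auto', 'human_review']
```

Choosing the threshold is a business decision, not a purely technical one: raising it sends more cases to humans (safer but slower), while lowering it automates more (faster but riskier).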

4. Start with the data, not the model

Before any machine learning model was trained, JPMorgan's team spent months building the data pipeline — the infrastructure to ingest, parse, clean, and structure legal documents. Without this pipeline, there was nothing to model.

Implication: The skills you learned in this chapter — loading data, cleaning it, exploring it, structuring it — are not preliminary to the "real" work of AI. They are the foundation.

5. Measure impact in business terms

JPMorgan did not report COIN's success in terms of model accuracy or F1 scores. They reported it in hours saved, error rates reduced, and talent redeployed. The business impact was the headline.

Implication: When you present AI work to business stakeholders, translate technical metrics into business outcomes. "The model achieved 94% accuracy" means less to a CFO than "We reduced processing time from 30 hours to 3 minutes and freed 12 FTEs for client-facing work."

The Broader Trend

COIN was notable when it launched in 2017, but the pattern it established has since become widespread across professional services:

  • Accounting firms use Python-based tools to automate audit procedures, extracting and cross-referencing financial data across thousands of documents.
  • Insurance companies use NLP to process claims, extracting policy details, damage descriptions, and settlement amounts from adjuster reports.
  • Consulting firms use automated analysis pipelines to process large datasets during due diligence engagements, reducing the time from weeks to days.
  • Healthcare organizations use NLP to extract clinical information from unstructured medical records, enabling population-level health analytics.

In each case, the pattern is the same: Python-based data pipelines feed machine learning models that automate repetitive extraction and classification tasks, with human professionals handling the exceptions and providing judgment.

The business professionals who thrive in this environment are not the ones who ignore the technology or the ones who become full-time programmers. They are the ones who understand the technology well enough to identify opportunities, frame problems, evaluate solutions, and communicate results. They are the people this textbook is designed to create.


Discussion Questions

  1. Identify an automation candidate. Think of a task in your current or previous job (or a job you aspire to) that shares characteristics with the loan agreement review that COIN automated: repetitive, structured, detail-intensive. Describe the task and explain why it might be a candidate for Python-based automation. What percentage of the task could be automated, and what would still require human judgment?

  2. The redeployment question. COIN freed approximately 320,000 hours of professional labor annually. If you were a manager at JPMorgan, how would you redeploy that capacity? What higher-value activities could those professionals focus on? What risks would you consider (e.g., skill atrophy, employee resistance)?

  3. Domain expertise in AI development. The case emphasizes that COIN required deep legal domain expertise, not just technical skill. Give an example from another industry (marketing, supply chain, HR, finance) where domain expertise would be critical for an AI automation project. What specific knowledge would a domain expert contribute that a data scientist alone could not?

  4. The 80/20 automation pattern. COIN did not attempt to handle 100% of cases automatically. It automated the straightforward cases and flagged ambiguous ones for human review. What are the risks of pushing for 100% automation? What are the risks of settling for less than 80%? How would you determine the right threshold for a given use case?

  5. Ethical and workforce considerations. When a task that previously required 360,000 hours of human labor is automated, what happens to the people who performed that work? Discuss the ethical responsibilities of an organization implementing this kind of automation. How does this connect to the broader themes of AI governance that we will explore in Part 5 of this book?

  6. Python as the starting point. The case notes that the majority of engineering effort for COIN went into data pipeline work — parsing, cleaning, structuring — rather than the ML model itself. How does this insight change your view of the Python skills introduced in this chapter? How does it change your expectations for what you will need to learn in the chapters ahead?


Further Exploration

  • Son, H. (2017). "JPMorgan Software Does in Seconds What Took Lawyers 360,000 Hours." Bloomberg, February 28, 2017.
  • JPMorgan Chase & Co. (2017). Annual Report, Technology Section.
  • Markoff, J. (2011). "Armies of Expensive Lawyers, Replaced by Cheaper Software." The New York Times, March 4, 2011. (An earlier exploration of the same trend.)
  • Susskind, R. & Susskind, D. (2015). The Future of the Professions: How Technology Will Transform the Work of Human Experts. Oxford University Press.