Case Study 2 — Reconciling Two Systems with EXCEPT
Not every case study is a disaster. This one shows set operations doing exactly what they're best at: comparing two data sets to find precisely what's in one but not the other.
EXCEPTin both directions is a reconciliation superpower.
Background
A company was migrating order data from a legacy system into a new PostgreSQL database. After the bulk import, the critical question was: did every order make it, and did anything get corrupted or duplicated in transit? With hundreds of thousands of orders, eyeballing was impossible. The team needed to know, precisely:
- Which orders are in the legacy system but missing from the new one (lost in migration)?
- Which orders are in the new system but not in the legacy one (spuriously created)?
- Do the orders that exist in both agree on their key fields?
Set operations answered all three cleanly.
The reconciliation queries
Both systems exposed an order summary as (order_id, customer_id, order_date, total). (Assume a foreign-data-wrapper or a staging table legacy_orders made the legacy data queryable alongside the new orders.)
1. Orders missing from the new system — in legacy EXCEPT new:
SELECT order_id FROM legacy_orders
EXCEPT
SELECT order_id FROM orders;
If this returns any rows, those order ids were lost in migration. (Empty result = nothing lost.)
2. Orders that shouldn't be there — in new EXCEPT legacy:
SELECT order_id FROM orders
EXCEPT
SELECT order_id FROM legacy_orders;
Rows here are orders in the new system with no legacy counterpart — duplicates or spurious inserts.
3. Rows that exist in both but disagree on a field — compare full rows with EXCEPT both ways:
-- Rows in legacy whose exact values don't match new (same id, different data)
SELECT order_id, customer_id, order_date, total FROM legacy_orders
EXCEPT
SELECT order_id, customer_id, order_date, total FROM orders;
Because EXCEPT compares entire rows, any order whose customer_id, order_date, or total changed during migration shows up here — even though its order_id exists in both. Running it the other direction (orders EXCEPT legacy_orders) catches the mirror discrepancies.
4. A count sanity check with INTERSECT:
-- How many order_ids exist in BOTH systems?
SELECT order_id FROM legacy_orders
INTERSECT
SELECT order_id FROM orders;
If the INTERSECT count equals both source counts and both EXCEPTs are empty, the migration is provably complete and faithful.
The outcome
The first run revealed 37 orders in legacy EXCEPT new (lost — traced to a malformed character in an address field that aborted their import batch) and 0 in new EXCEPT legacy (nothing spurious). The full-row EXCEPT surfaced 12 orders whose total differed by a cent — a rounding difference from a currency-conversion change, caught before it reached finance. All were fixed, the queries re-run until both EXCEPTs returned empty, and the migration was signed off with evidence rather than hope.
The analysis
-
EXCEPTis the precise tool for "what's in A but not B." TwoEXCEPTs (A−B and B−A) together give you a complete picture of differences between two sets — the foundation of any reconciliation. -
Compare full rows to catch value-level drift. Selecting all the key columns (not just the id) makes
EXCEPTflag rows that exist in both but disagree on content. This catches corruption that an id-only check would miss. -
INTERSECTconfirms overlap. "How many are in both?" plus "both differences are empty" is a complete, auditable proof of a faithful copy. -
Set operations turn "looks right" into "provably right." For migrations, dedups, audits, and system reconciliations, these operators replace spot-checking with exhaustive comparison. Empty
EXCEPTresults are a strong, checkable guarantee. -
This is the constructive twin of the anti-join. The same idea (
EXCEPT= difference) that finds "customers who never ordered" finds "orders that didn't migrate." One mental tool, many uses.
Discussion questions
- Explain what
legacy_orders EXCEPT orderstells you, and why you must also run it the other way. - Why does selecting all key columns (not just
order_id) catch value-level corruption? - How do two empty
EXCEPTs plus a matchingINTERSECTcount constitute a proof of a faithful migration? - Why are set operations better than manual spot-checking for this task?
- ⭐ Could you do the same reconciliation with
NOT EXISTSjoins? Sketch it, and say when theEXCEPTform is clearer.