Building a Bank Reconciliation Engine with Fuzzy Matching
How I built a progressive fuzzy-match reconciliation engine for five bank accounts: tier-priority confidence scoring, review queues for ambiguous matches, and discrepancies that surface before month-end, not after.
The problem
Traditional bank reconciliation tools assume clean data. Real data is messy: memos change, a single amount gets split across multiple postings, dates drift by a day, and descriptions differ between banks. A pure exact-match engine breaks on the first of these edge cases.
What you'll build
- A multi-tier match engine (exact → date-window → fuzzy-memo → amount-split)
- Confidence scoring so you know which matches to trust
- A review queue that surfaces only what a human needs to look at
- Audit logs that make every match reversible
Prerequisites
- Python 3.11+ with pandas and rapidfuzz
- Sample bank statements and a GL/AP export
- Basic comfort with SQL and dataframe joins
1. Designing the match pipeline
Coming soon — why progressive matching beats a single pass, and the tier order I use.
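In the meantime, here is a minimal sketch of the progressive idea, not the final engine. It assumes a hypothetical `Txn` record, amounts stored as integer cents, and a one-day date tolerance; `exact_tier`, `date_window_tier`, and `reconcile` are illustrative names. Only the first two tiers are shown. The made-up sample data illustrates one reason order matters: a loose tier run on its own can claim a ledger entry that a stricter tier would have paired correctly.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Callable, Optional


@dataclass(frozen=True)
class Txn:
    """Hypothetical minimal transaction record used by this sketch."""
    id: str
    posted: date
    amount_cents: int  # integer cents avoid float rounding issues
    memo: str


# A tier takes one bank line plus the still-unmatched ledger entries
# and returns the ledger entry it matches, or None.
Tier = Callable[[Txn, list[Txn]], Optional[Txn]]


def exact_tier(bank: Txn, ledger: list[Txn]) -> Optional[Txn]:
    """Same amount, same posting date, same memo."""
    for entry in ledger:
        if (entry.amount_cents, entry.posted, entry.memo) == (
            bank.amount_cents, bank.posted, bank.memo
        ):
            return entry
    return None


def date_window_tier(bank: Txn, ledger: list[Txn], days: int = 1) -> Optional[Txn]:
    """Same amount, posting date within +/- `days` (assumed tolerance)."""
    window = timedelta(days=days)
    for entry in ledger:
        same_amount = entry.amount_cents == bank.amount_cents
        if same_amount and abs(entry.posted - bank.posted) <= window:
            return entry
    return None


def reconcile(bank_lines, ledger, tiers):
    """Run tiers in the given order; each tier only sees what earlier tiers left."""
    remaining_bank = list(bank_lines)
    remaining_ledger = list(ledger)
    matches = []  # (bank_id, ledger_id, tier_name)
    for name, tier in tiers:
        still_unmatched = []
        for bank in remaining_bank:
            hit = tier(bank, remaining_ledger)
            if hit is None:
                still_unmatched.append(bank)
            else:
                remaining_ledger.remove(hit)  # consume so it cannot match twice
                matches.append((bank.id, hit.id, name))
        remaining_bank = still_unmatched
    return matches, remaining_bank, remaining_ledger


# Made-up sample data: two identical rent payments a day apart.
bank = [
    Txn("b1", date(2024, 3, 4), 10000, "RENT"),
    Txn("b2", date(2024, 3, 5), 10000, "RENT"),
]
ledger = [
    Txn("g1", date(2024, 3, 5), 10000, "RENT"),
    Txn("g2", date(2024, 3, 3), 10000, "RENT"),
]

progressive = reconcile(
    bank, ledger, [("exact", exact_tier), ("date_window", date_window_tier)]
)
single_pass = reconcile(bank, ledger, [("date_window", date_window_tier)])

print(progressive[0])
print(single_pass[0], [t.id for t in single_pass[1]])
```

With this data the progressive run pairs both bank lines, while the single loose pass lets `b1` take `g1` and leaves `b2` stranded. Running strict tiers first protects the unambiguous matches before anything looser gets a chance to claim them.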
2. Writing the match tiers
Coming soon — exact, date-window, fuzzy-memo, and amount-split matchers — with code.
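Until the full write-up lands, a rough sketch of the two looser tiers might look like the following. Several things here are assumptions: the dict shape of each record, the 0.8 similarity cut-off, the three-part cap on splits, and all function names. To keep the sketch runnable with the standard library alone, it uses `difflib.SequenceMatcher` as a stand-in scorer; with rapidfuzz installed, a scorer such as `rapidfuzz.fuzz.token_sort_ratio` (which returns 0 to 100 rather than 0 to 1) would be the natural swap.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations


def normalize_memo(memo: str) -> str:
    """Lowercase, drop digits and punctuation, sort tokens so word order is ignored."""
    tokens = re.findall(r"[a-z]+", memo.lower())
    return " ".join(sorted(tokens))


def memo_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; a stdlib stand-in for a rapidfuzz scorer."""
    return SequenceMatcher(None, normalize_memo(a), normalize_memo(b)).ratio()


def fuzzy_memo_match(bank, ledger, min_similarity=0.8):
    """Return the best ledger entry with an equal amount and a similar memo, else None.

    `bank` and each ledger entry are dicts with 'id', 'amount_cents', and 'memo'
    (a hypothetical shape chosen for this sketch).
    """
    best, best_score = None, min_similarity
    for entry in ledger:
        if entry["amount_cents"] != bank["amount_cents"]:
            continue
        score = memo_similarity(bank["memo"], entry["memo"])
        if score >= best_score:
            best, best_score = entry, score
    return best


def amount_split_match(bank, ledger, max_parts=3):
    """Return the first group of 2..max_parts ledger entries summing to the bank amount."""
    for size in range(2, max_parts + 1):
        for combo in combinations(ledger, size):
            if sum(e["amount_cents"] for e in combo) == bank["amount_cents"]:
                return list(combo)
    return None


# Made-up sample data.
bank_line = {"id": "b1", "amount_cents": 15000, "memo": "ACME CORP PMT 4471"}
ledger = [
    {"id": "g1", "amount_cents": 15000, "memo": "City Water Utility"},
    {"id": "g2", "amount_cents": 15000, "memo": "Pmt Acme Corp"},
    {"id": "g3", "amount_cents": 10000, "memo": "Acme invoice part 1"},
    {"id": "g4", "amount_cents": 5000, "memo": "Acme invoice part 2"},
]

memo_hit = fuzzy_memo_match(bank_line, ledger)
split_hit = amount_split_match(bank_line, ledger)
print(memo_hit["id"], [e["id"] for e in split_hit])
```

The split search is brute force, so the `max_parts` cap matters: the number of combinations grows quickly with the size of the unmatched pool. It also returns the first combination it finds; when more than one combination sums to the bank amount, that ambiguity is a good reason to send the line to review instead of accepting it.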
3. Confidence scoring
Coming soon — how I weight each signal and set a threshold for "auto" vs. "review."
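As a placeholder, here is one way the weighting could be sketched. Every number below is an assumption for illustration, not the weights this engine ships with: the per-signal weights, the per-tier caps (one possible reading of "tier-priority"), and the two thresholds. `Signals`, `confidence`, and `route` are hypothetical names.

```python
from dataclasses import dataclass

# Assumed values for illustration only; tune them against labelled history.
WEIGHTS = {"amount": 0.5, "date": 0.2, "memo": 0.3}
TIER_CAP = {"exact": 1.0, "date_window": 0.95, "fuzzy_memo": 0.85, "amount_split": 0.75}
AUTO_THRESHOLD = 0.90
REVIEW_THRESHOLD = 0.60


@dataclass(frozen=True)
class Signals:
    amount_equal: bool
    days_apart: int
    memo_similarity: float  # 0.0 to 1.0


def confidence(signals: Signals, tier: str, max_days: int = 3) -> float:
    """Weighted sum of the signals, capped by the tier that produced the match."""
    amount_score = 1.0 if signals.amount_equal else 0.0
    # Linear decay: 1.0 on the same day, 0.0 at max_days apart or more.
    date_score = max(0.0, 1.0 - signals.days_apart / max_days)
    weighted = (
        WEIGHTS["amount"] * amount_score
        + WEIGHTS["date"] * date_score
        + WEIGHTS["memo"] * signals.memo_similarity
    )
    return round(min(TIER_CAP[tier], weighted), 4)


def route(score: float) -> str:
    """Decide whether a match is applied automatically, reviewed, or dropped."""
    if score >= AUTO_THRESHOLD:
        return "auto"
    if score >= REVIEW_THRESHOLD:
        return "review"
    return "unmatched"


for tier, signals in [
    ("exact", Signals(True, 0, 1.0)),
    ("date_window", Signals(True, 1, 1.0)),
    ("fuzzy_memo", Signals(True, 0, 1.0)),
    ("fuzzy_memo", Signals(False, 0, 0.9)),
]:
    score = confidence(signals, tier)
    print(tier, score, route(score))
```

With these assumed numbers, a cap below the auto threshold means every fuzzy-memo or amount-split match lands in review no matter how good the individual signals look, and an amount mismatch can never score high enough to auto-apply.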
4. The review queue
Coming soon — the minimal UI I render as HTML + the audit log that survives re-runs.
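Ahead of the full section, a minimal sketch of both pieces under stated assumptions: the audit log is an append-only list of events keyed by a stable hash of the matched IDs, so a re-run neither duplicates a match nor silently redoes one a reviewer reversed. The log lives in memory here (a real one would persist to a file or table), and `match_key`, `AuditLog`, and `render_review_queue` are hypothetical names, as are the row fields.

```python
import hashlib
import html


def match_key(bank_id: str, ledger_ids: list[str]) -> str:
    """Stable key for a match: the same inputs give the same key on every run."""
    raw = bank_id + "|" + ",".join(sorted(ledger_ids))
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]


class AuditLog:
    """Append-only event list; nothing is ever edited or deleted."""

    def __init__(self):
        self.events = []

    def _state(self, key):
        """Latest action recorded for a key, or None if the key is new."""
        for event in reversed(self.events):
            if event["key"] == key:
                return event["action"]
        return None

    def record_match(self, bank_id, ledger_ids, tier, score):
        key = match_key(bank_id, ledger_ids)
        if self._state(key) is not None:
            return key  # already matched or reversed earlier: a re-run changes nothing
        self.events.append({
            "key": key, "action": "match", "bank_id": bank_id,
            "ledger_ids": sorted(ledger_ids), "tier": tier, "score": score,
        })
        return key

    def reverse(self, key, reason):
        """Undo a match by appending a reversal event."""
        if self._state(key) != "match":
            raise ValueError(f"no active match for key {key}")
        self.events.append({"key": key, "action": "reverse", "reason": reason})

    def active_matches(self):
        return [
            e for e in self.events
            if e["action"] == "match" and self._state(e["key"]) == "match"
        ]


def render_review_queue(rows) -> str:
    """Render review candidates as a bare HTML table, escaping every text value."""
    header = "<tr><th>Bank line</th><th>Candidate</th><th>Score</th></tr>"
    body = "".join(
        "<tr><td>{}</td><td>{}</td><td>{:.2f}</td></tr>".format(
            html.escape(row["bank_memo"]), html.escape(row["ledger_memo"]), row["score"]
        )
        for row in rows
    )
    return "<table>" + header + body + "</table>"


log = AuditLog()
key = log.record_match("b1", ["g1"], "exact", 1.0)
log.record_match("b1", ["g1"], "exact", 1.0)  # simulated re-run: no new event
log.reverse(key, "wrong vendor")
log.record_match("b1", ["g1"], "exact", 1.0)  # stays reversed after another run

page = render_review_queue(
    [{"bank_memo": "A&B <pmt>", "ledger_memo": "A and B payment", "score": 0.85}]
)
print(len(log.events), len(log.active_matches()))
print(page)
```

Recording a reversal as a new event instead of deleting the match is what keeps every decision traceable: the log shows both that the engine proposed the match and that someone rejected it, along with the reason.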
Wrap-up
Coming soon — what went wrong on the first version and how I fixed it.