Bank AML and Fraud AI assistant prototype
An early prototype that combined fraud and AML signals into a single pre-investigation brief, aimed at the part of an AML officer's day that costs the most time and produces the least insight.
TL;DR
An early prototype that fused fraud and AML signals into a single per-customer pre-investigation brief, aimed at the part of an AML officer's day that eats the most time and yields the least insight. It mapped the officer's workflow end to end and showed how human-in-the-loop AI could collapse the hours of context-loading that precede every investigation, with the officer always deciding. Built ahead of the broader agent wave; the strategic and workflow design outlived the code.
The AML officer's day is mostly spent before the investigation actually begins. The model the industry runs on (rules-based monitoring throws alerts, a junior analyst triages, a senior officer investigates the survivors) is throughput-bound at the triage step, not at the investigation step. An officer can investigate a few cases in depth per day; the triage backlog runs to hundreds. That is the squeeze.
This prototype was built around that squeeze, ahead of the broader agent wave that would later make human-in-the-loop AI a default pattern. The premise: aggregate the signals that already exist inside the bank (transaction monitoring output, KYC posture, historical patterns, fraud alerts) and produce a single, structured pre-investigation brief per customer, so the officer opens the file already knowing what is there, where to look, and what to verify next.
Combining fraud and AML. The two functions are usually split across separate teams and separate tools: fraud sits with payment operations, AML sits with compliance, and neither sees the other's signal. In practice the same customer often shows up in both, with patterns that only make sense when read together. The prototype was built around the assumption that the unit of value is one report per customer combining both signals, not two parallel queues that re-converge at a senior officer's desk a week later.
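The "one report per customer" idea can be sketched as a merge of the two queues on a shared customer key. This is an illustrative sketch only, not the prototype's code: the `Alert` type, the `merge_queues` and `seen_by_both` helpers, and the assumption that both systems expose a common `customer_id` are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    customer_id: str   # assumed shared key across both systems (hypothetical)
    description: str


def merge_queues(fraud_alerts, aml_alerts):
    """Fold two separate alert queues into one bucket per customer."""
    merged = defaultdict(lambda: {"fraud": [], "aml": []})
    for alert in fraud_alerts:
        merged[alert.customer_id]["fraud"].append(alert)
    for alert in aml_alerts:
        merged[alert.customer_id]["aml"].append(alert)
    return dict(merged)


def seen_by_both(merged):
    """Customer IDs that carry at least one fraud and one AML signal."""
    return sorted(
        cid for cid, signals in merged.items()
        if signals["fraud"] and signals["aml"]
    )
```

Customers returned by `seen_by_both` are the ones whose patterns need to be read together, which is the case the split-team setup tends to miss.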
Pre-investigation triage as the unit of value. The report does not make the decision; it makes the human decision faster. Each report leads with a structured summary (what triggered, what the bank already knows about this customer, what historical patterns matter), drills into the specific transactions or relationships flagged, and lays out the next questions a competent officer would ask. The officer opens the file already past the first 30–60 minutes of context-loading and goes straight into the part of the work that needs their judgement.
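The report layout described above maps naturally onto a small schema. The following is a minimal sketch under stated assumptions: the class name `PreInvestigationBrief`, its field names, and the plain-text rendering are hypothetical and chosen for illustration; the original prototype's actual output format is not documented here.

```python
from dataclasses import dataclass, field


@dataclass
class PreInvestigationBrief:
    customer_id: str
    # Structured summary, in the order the report leads with it.
    triggers: list[str]
    known_context: list[str]
    historical_patterns: list[str]
    # Drill-down and follow-up sections.
    flagged_items: list[str] = field(default_factory=list)
    next_questions: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the brief as plain text, one titled section per field."""
        sections = [
            ("SUMMARY: WHAT TRIGGERED", self.triggers),
            ("SUMMARY: WHAT THE BANK ALREADY KNOWS", self.known_context),
            ("SUMMARY: HISTORICAL PATTERNS THAT MATTER", self.historical_patterns),
            ("FLAGGED TRANSACTIONS AND RELATIONSHIPS", self.flagged_items),
            ("NEXT QUESTIONS FOR THE OFFICER", self.next_questions),
        ]
        lines = [f"Pre-investigation brief for customer {self.customer_id}"]
        for title, items in sections:
            lines.append("")
            lines.append(title)
            # An empty section is shown explicitly rather than silently dropped.
            lines.extend(f"  - {item}" for item in (items or ["(none recorded)"]))
        return "\n".join(lines)
```

Note that the schema has no field for a recommendation or verdict: the brief ends with questions, and the decision stays with the officer.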
Human in the loop, by design. No automated decisioning, no auto-closing of cases, no auto-filing of suspicious-activity reports. Every report ends in a human officer making the call. That constraint was not imposed late; it was the entire design premise. In a regulated context, the operating model that passes audit is the one where the human is the decision-maker and the AI is the brief that lands on their desk. That distinction shaped every piece of UI and every output the prototype produced.
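One way to make that constraint structural rather than procedural is to give the case object a single path to a decision, and have that path require a named officer. This is a hedged sketch of the idea, not the prototype's implementation; `Decision`, `Case`, and `record_decision` are hypothetical names.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    CLOSE = "close"
    ESCALATE = "escalate"
    FILE_SAR = "file_sar"


@dataclass
class Case:
    customer_id: str
    brief: str                           # AI-produced text; informs, never decides
    decision: Optional[Decision] = None
    decided_by: Optional[str] = None

    def record_decision(self, decision: Decision, officer_id: str) -> None:
        """The only way a case leaves the open state: a named human decides."""
        if not officer_id or not officer_id.strip():
            raise ValueError("a decision requires a named human officer")
        self.decision = decision
        self.decided_by = officer_id

    @property
    def is_open(self) -> bool:
        return self.decision is None
```

In this sketch the brief-generating side only ever populates `brief`; anything that changes the case outcome goes through `record_decision`, which also leaves a record of who decided.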
Designed against the real data shape. Banks do not have neat datasets. AML and fraud signals come from rules engines, KYC systems, historical case management, sanctions screening, and customer master data: most of it siloed, much of it inconsistent, none of it built to be joined at the case level. The prototype was sketched on the assumption that those systems exist, that the bank would not rebuild them, and that the value layer was at the synthesis step: turning what the bank already had into the brief the officer actually needs.
Built ahead of the wave. The model class and tooling that would make this kind of work routine a year later did not yet exist. The agent abstractions everyone uses now (tool use, multi-step plans, structured outputs anchored to schemas) had to be reasoned about from scratch. The fact that the workflow design from then maps cleanly onto how this kind of system gets built today is, in retrospect, the most useful artefact of the work.
Why a regulated industry was the right place to start. Banking AML is the cleanest possible test bed for human-in-the-loop AI: the regulatory regime requires a human decision-maker, the volume problem is genuine, the cost of an error is bounded by an existing compliance framework, and the officers themselves know exactly where their time goes. Solving the workflow here is what builds the credibility to solve it in places where the constraints are softer.
The prototype stayed a prototype. The production AML and fraud tooling that has shipped since follows the same broad pattern: a synthesis layer that produces a structured brief and routes a human officer into the investigation already informed. What this work proved was that the workflow was buildable and that the human-in-the-loop constraint was strategically the right one to design around. That this particular prototype did not become the product is a fact about the wave that arrived afterwards, not about the design that anticipated it.