From Volume to Value

For two decades, compliance programs have been measured by their inputs: how many alerts they generate, how many SARs they file, how many analysts they employ. These metrics tell you almost nothing about whether the program actually works.

FinCEN’s proposed rule changes the question. Under the effectiveness standard, the examiner is asking whether your screening produced meaningful outcomes. This article proposes six metrics that form a scorecard translating the effectiveness standard into operational language compliance leaders can act on immediately.

Why the Old Metrics Fail

Activity metrics answer “are we doing things?” They do not answer “are the things we’re doing working?” Under the effectiveness standard, activity that does not produce outcomes is not effective — no matter how much of it there is.

The most dangerous metric in compliance is one that goes up every quarter and tells you nothing about whether you are safer. Alert volume is that metric.

The Six Metrics

Metric 1: Detection Effectiveness Rate

What it measures: The percentage of genuine risks that screening and monitoring systems successfully identify — the true-positive detection rate.

How to calculate it: Run periodic back-tests: take known enforcement actions and sanctions designations from the past twelve months, determine whether your system would have flagged each one, and divide the number flagged by the number tested.

What good looks like: Above 90% for sanctions screening. Below that is a material gap in program establishment.
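
The back-test can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the `BacktestCase` type, its field names, and the sample cases are hypothetical, and the `flagged` value is assumed to come from replaying each known case through your own screening system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BacktestCase:
    """One known designation or enforcement action replayed through screening."""
    case_id: str   # hypothetical identifier
    flagged: bool  # did the system raise an alert for this case?


def detection_effectiveness_rate(cases: list[BacktestCase]) -> float:
    """Share of known cases the system flagged: flagged / total tested."""
    if not cases:
        raise ValueError("back-test sample is empty")
    return sum(case.flagged for case in cases) / len(cases)


# Illustrative sample only; a real back-test would cover twelve months of cases.
sample = [
    BacktestCase("case-001", True),
    BacktestCase("case-002", True),
    BacktestCase("case-003", False),
    BacktestCase("case-004", True),
]

rate = detection_effectiveness_rate(sample)
print(f"Detection effectiveness rate: {rate:.0%}")  # 75%
```

In this made-up sample the rate falls below the 90% threshold, so the missed case would be the starting point for tuning.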

Metric 2: Investigation Yield

What it measures: The percentage of investigated alerts resulting in a meaningful compliance action: SAR filing, EDD decision, account restriction, or law enforcement referral.

How to calculate it: Divide alerts resulting in action by total alerts investigated. Track by screening program.

What good looks like: Leading institutions report 30–60%. A yield below 10% means more than 90% of investigation capacity produces no outcome.
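
As a rough sketch of the calculation, the Python below divides actioned alerts by investigated alerts for each screening program. The program names and the sample records are hypothetical; in practice the records would come from your case management system.

```python
from collections import defaultdict

# Hypothetical investigated alerts: (screening program, led to a compliance action?)
investigated_alerts = [
    ("sanctions", True),
    ("sanctions", False),
    ("sanctions", False),
    ("sanctions", False),
    ("transaction_monitoring", True),
    ("transaction_monitoring", True),
    ("transaction_monitoring", False),
    ("transaction_monitoring", False),
    ("transaction_monitoring", False),
]


def investigation_yield(alerts):
    """Return {program: actioned alerts / investigated alerts}."""
    totals = defaultdict(int)
    actioned = defaultdict(int)
    for program, led_to_action in alerts:
        totals[program] += 1
        if led_to_action:
            actioned[program] += 1
    return {program: actioned[program] / totals[program] for program in totals}


yield_by_program = investigation_yield(investigated_alerts)
for program, value in yield_by_program.items():
    print(f"{program}: {value:.0%}")
```

Tracking the ratio per program rather than in aggregate keeps a weak program from hiding behind a strong one.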

Metric 3: False Positive Cost per Alert

What it measures: The fully-loaded cost of investigating a single false positive, including analyst time, technology overhead, QA, and management.

How to calculate it: Sum all investigation costs and divide by alerts investigated to get the average cost per alert. Multiply by the false positive rate to get the portion of that cost spent on false positives.

What good looks like: Industry average is $15–40 per false positive. At a 90% false positive rate with 10,000 daily alerts, that is 9,000 false positives a day, or roughly $49–131 million per year in waste if alerts arrive every day of the year.
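
The arithmetic behind these figures can be checked with a short Python sketch. The function names are hypothetical, the $1,000,000 and 40,000-alert inputs are made up for illustration, and the annual figure assumes alerts arrive 365 days a year, which is the assumption that reproduces the range quoted above.

```python
def false_positive_cost_per_alert(total_cost, alerts_investigated, fp_rate):
    """Average cost per investigated alert, scaled by the false positive rate."""
    if alerts_investigated <= 0:
        raise ValueError("alerts_investigated must be positive")
    cost_per_alert = total_cost / alerts_investigated
    return cost_per_alert * fp_rate


def annual_false_positive_cost(daily_alerts, fp_rate, cost_per_fp, days_per_year=365):
    """Yearly spend on false positives. days_per_year=365 is an assumption."""
    return daily_alerts * fp_rate * cost_per_fp * days_per_year


# Made-up inputs: $1,000,000 spent on 40,000 investigated alerts at a 90% FP rate.
per_alert = false_positive_cost_per_alert(1_000_000, 40_000, 0.90)

# The article's scenario: 10,000 daily alerts, 90% FP rate, $15 to $40 per false positive.
low = annual_false_positive_cost(10_000, 0.90, 15)
high = annual_false_positive_cost(10_000, 0.90, 40)
print(f"${low / 1e6:.1f}M to ${high / 1e6:.1f}M per year")  # $49.3M to $131.4M per year
```

If alerts are generated only on business days, pass a smaller `days_per_year` and the totals shrink proportionally.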

Metric 4: Time-to-Disposition

What it measures: Elapsed time from alert generation to final disposition.

How to calculate it: Track timestamps at each stage. Calculate median and 90th-percentile by alert tier.

What good looks like: Within 24 hours for high-risk sanctions alerts and within 72 hours for standard alerts. 30-day averages are difficult to defend as “effective.”
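
One way to compute the median and 90th percentile by tier is sketched below in Python. The tier names and timestamps are hypothetical, and the percentile uses the simple nearest-rank method, which is one of several reasonable definitions; a statistics library may interpolate instead and give slightly different values.

```python
from collections import defaultdict
from datetime import datetime
from math import ceil
from statistics import median

# Hypothetical records: (alert tier, generated at, final disposition at)
alerts = [
    ("high_risk", "2025-01-06T09:00", "2025-01-06T15:00"),  # 6 hours
    ("high_risk", "2025-01-06T10:00", "2025-01-07T04:00"),  # 18 hours
    ("high_risk", "2025-01-07T08:00", "2025-01-08T14:00"),  # 30 hours
    ("standard", "2025-01-06T09:00", "2025-01-08T09:00"),   # 48 hours
    ("standard", "2025-01-06T12:00", "2025-01-10T12:00"),   # 96 hours
]


def hours_between(start, end):
    """Elapsed hours between two ISO 8601 timestamps."""
    delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
    return delta.total_seconds() / 3600


def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% at or below it."""
    ordered = sorted(values)
    rank = max(1, ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def time_to_disposition(records):
    """Return {tier: {"median_hours": ..., "p90_hours": ...}}."""
    by_tier = defaultdict(list)
    for tier, generated_at, disposed_at in records:
        by_tier[tier].append(hours_between(generated_at, disposed_at))
    return {
        tier: {
            "median_hours": median(hours),
            "p90_hours": nearest_rank_percentile(hours, 90),
        }
        for tier, hours in by_tier.items()
    }


summary = time_to_disposition(alerts)
for tier, stats in summary.items():
    print(tier, stats)
```

Reporting the 90th percentile alongside the median exposes the slow tail that an average would hide.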

Metric 5: Risk Coverage Ratio

What it measures: Percentage of identified ML/TF risk categories covered by active controls.

How to calculate it: Map every risk in the assessment to an active control. Covered categories divided by total categories.

What good looks like: 100%. Any gap is a gap in program establishment.
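
The mapping exercise lends itself to a simple data structure. The Python sketch below is illustrative only: the risk categories and control names are invented, and a real mapping would be drawn from your own risk assessment and control inventory.

```python
# Hypothetical mapping: risk category -> active controls that address it
risk_to_controls = {
    "sanctions_evasion": ["name_screening"],
    "structuring": ["transaction_monitoring"],
    "trade_based_ml": [],  # no active control mapped
    "terrorist_financing": ["name_screening", "transaction_monitoring"],
}


def risk_coverage(mapping):
    """Return (covered categories / total categories, sorted uncovered categories)."""
    if not mapping:
        raise ValueError("risk assessment has no categories")
    uncovered = sorted(risk for risk, controls in mapping.items() if not controls)
    ratio = (len(mapping) - len(uncovered)) / len(mapping)
    return ratio, uncovered


ratio, gaps = risk_coverage(risk_to_controls)
print(f"Risk coverage ratio: {ratio:.0%}")  # 75%
print("Uncovered categories:", gaps)
```

Returning the list of uncovered categories with the ratio makes the result actionable, since the target is 100% and every named gap needs a control.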

Metric 6: Cycle-over-Cycle Improvement Rate

What it measures: The rate of improvement across the five metrics above from one measurement period to the next.

How to calculate it: Compare each metric to the prior quarter and calculate the percentage improvement. For cost and time metrics, a decrease counts as improvement.

What good looks like: Any positive trend. A program improving from 5% to 25% investigation yield over twelve months is demonstrably more effective than one maintaining 5% for a decade.
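
A period-over-period comparison might look like the Python sketch below. The metric keys, the direction flags, and the quarterly values are hypothetical; the one design choice worth noting is that cost and time are treated as lower-is-better, so a decrease is reported as a positive improvement.

```python
# True where a higher value is better; False where a lower value is better.
HIGHER_IS_BETTER = {
    "detection_effectiveness_rate": True,
    "investigation_yield": True,
    "false_positive_cost_per_alert": False,
    "time_to_disposition_hours": False,
    "risk_coverage_ratio": True,
}


def improvement_rate(previous, current, higher_is_better):
    """Relative change from previous to current, signed so positive means better."""
    if previous == 0:
        raise ValueError("previous value must be non-zero")
    change = (current - previous) / previous
    return change if higher_is_better else -change


# Made-up values for two consecutive measurement periods.
prior_period = {
    "detection_effectiveness_rate": 0.80,
    "investigation_yield": 0.05,
    "false_positive_cost_per_alert": 30.0,
    "time_to_disposition_hours": 40.0,
    "risk_coverage_ratio": 0.75,
}
current_period = {
    "detection_effectiveness_rate": 0.88,
    "investigation_yield": 0.25,
    "false_positive_cost_per_alert": 24.0,
    "time_to_disposition_hours": 30.0,
    "risk_coverage_ratio": 1.00,
}

improvements = {
    name: improvement_rate(prior_period[name], current_period[name], HIGHER_IS_BETTER[name])
    for name in HIGHER_IS_BETTER
}
for name, rate in improvements.items():
    print(f"{name}: {rate:+.0%}")
```

The result is a relative change, so a move from 5% to 25% investigation yield shows as +400%, not as 20 percentage points.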

Under the effectiveness standard, the most powerful thing a compliance leader can show an examiner is not a perfect score on any single metric. It is a consistent trend line across all six.

How to Use the Scorecard

Board reporting. Translate the six metrics into a single-page dashboard. The board needs to see outcomes and improvement, not technical details.

Examiner presentation. Walk through each metric: what we measure, current performance, trend, and what we’re doing to improve. Connect each to the risk assessment and technology decisions.

Vendor evaluation. Require vendors to demonstrate impact on each metric. A vendor that cannot articulate its effect on detection effectiveness, investigation yield, and false positive cost is selling activity, not outcomes.

Conclusion

The shift from volume to value is the operational translation of FinCEN’s effectiveness standard. These six metrics — detection effectiveness rate, investigation yield, false positive cost per alert, time-to-disposition, risk coverage ratio, and cycle-over-cycle improvement rate — answer the only question the proposed rule asks: does your program work?

Start measuring them this quarter. The baseline you establish today is the starting point for the improvement trajectory that will define your next examination.
