Threshold Tuning
Threshold tuning is the process of calibrating score decision thresholds, Risk Indicator (RI) score scales, and RI weights so that the fraud detection system performs optimally for your institution's portfolio. Tuning balances the fraud catch rate (sensitivity) against the false positive rate — the two key performance indicators for any fraud detection system.
Key tuning concepts
- False Positive Rate (FPR): The percentage of legitimate transactions incorrectly scored as HIGH or CRITICAL. High FPR increases customer friction and operational cost.
- False Negative Rate (FNR): The percentage of fraudulent transactions scored below the REVIEW threshold. High FNR means fraud passes through undetected.
- Decision threshold: The composite score value at which the risk level changes (e.g., the boundary between MEDIUM and HIGH). Raising a threshold reduces FPR at the cost of higher FNR.
- RI score scale: The mapping function that converts a raw RI value (e.g., transaction velocity = 12 transactions/hour) into a sub-score (0–100). Score scales can be numeric, Boolean, or string-based.
- RI weight: The relative importance of an RI in the composite score calculation. Higher-weight RIs have more influence over the final score.
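To make the role of RI weights concrete, the following is a minimal Python sketch of how sub-scores might be combined into a composite score. It assumes a weighted average, which this guide does not confirm as the product's formula; the RI names, sub-scores, and weights are hypothetical.

```python
def composite_score(sub_scores, weights):
    """Weighted average of 0-100 sub-scores.

    Assumed formula for illustration; the product's actual
    composite calculation may differ.
    """
    total_weight = sum(weights[name] for name in sub_scores)
    if total_weight == 0:
        raise ValueError("at least one RI must have a non-zero weight")
    weighted_sum = sum(sub_scores[name] * weights[name] for name in sub_scores)
    return weighted_sum / total_weight


# Hypothetical RIs, sub-scores, and weights (illustration only).
sub_scores = {"txn_velocity": 80, "new_payee": 100, "geo_mismatch": 20}
weights = {"txn_velocity": 3, "new_payee": 2, "geo_mismatch": 1}

print(round(composite_score(sub_scores, weights), 1))  # 76.7
```

Under this assumption, doubling the weight of new_payee from 2 to 4 moves the composite from 76.7 to 82.5, which is the sense in which a higher-weight RI has more influence over the final score.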
RI score scale types
Every RI uses one of three score scale types to map its raw value to a 0–100 sub-score. Scale definitions live in ri-config.ini.
Numeric scale
Maps a numeric RI value to a score using range brackets. Use this for RIs with continuous numeric outputs (velocity counts, amount values, days elapsed).
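As an illustration of range brackets, here is a small Python sketch of a numeric scale lookup. The bracket boundaries and scores are hypothetical, and the in-memory list is not the ri-config.ini syntax, which this section does not show.

```python
from bisect import bisect_right

# Hypothetical brackets for a transactions-per-hour velocity RI.
# Each pair is (lower bound of the bracket, sub-score for that bracket).
VELOCITY_BRACKETS = [(0, 0), (5, 25), (10, 60), (20, 100)]


def numeric_scale(value, brackets):
    """Return the sub-score of the highest bracket whose lower bound <= value."""
    lower_bounds = [lower for lower, _ in brackets]
    index = bisect_right(lower_bounds, value) - 1
    if index < 0:
        return 0  # value falls below the first bracket (assumed behavior)
    return brackets[index][1]


print(numeric_scale(12, VELOCITY_BRACKETS))  # 60
```

A value of 12 transactions/hour falls in the bracket that starts at 10, so it maps to a sub-score of 60 in this hypothetical scale.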
Boolean scale
Use this for binary RIs, where a condition is either true or false. The scale assigns a fixed score when the condition is true and 0 when it is false.
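A Boolean scale reduces to a single conditional. The Python sketch below assumes a hypothetical first-time-payee RI with a fixed score of 70; neither the RI nor the value comes from the product configuration.

```python
def boolean_scale(condition_met, score_when_true):
    """Return the fixed score when the condition holds, otherwise 0."""
    if not 0 <= score_when_true <= 100:
        raise ValueError("sub-scores must stay within 0-100")
    return score_when_true if condition_met else 0


# Hypothetical RI: the payee has never been paid from this account before.
print(boolean_scale(True, 70))   # 70
print(boolean_scale(False, 70))  # 0
```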
String / categorical scale
Maps specific string values to scores. Use this for channel, country, or transaction type RIs where each category has a defined risk level.
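A categorical scale can be pictured as a lookup table. In the Python sketch below, the channel names, their scores, and the fallback score for unmapped values are all assumptions made for illustration; this section does not state how unmapped categories are handled.

```python
# Hypothetical channel-to-score mapping (illustration only).
CHANNEL_SCORES = {"BRANCH": 10, "MOBILE": 30, "WEB": 45, "API": 60}

# Assumed fallback for categories that have no explicit entry.
DEFAULT_SCORE = 50


def string_scale(value, scores, default=DEFAULT_SCORE):
    """Map a category string to its sub-score, ignoring case and padding."""
    return scores.get(value.strip().upper(), default)


print(string_scale("web", CHANNEL_SCORES))    # 45
print(string_scale("KIOSK", CHANNEL_SCORES))  # 50
```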
Tuning decision thresholds
Decision thresholds are set per BTA in threshold-config.yaml. Different payment types warrant different thresholds: a wire transfer to a new international payee carries inherently more risk than an ACH credit, so the HIGH threshold for WEB_WIRE_TRANSFER should be lower (more sensitive) than for ACH_CREDIT_BATCH.
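The effect of per-BTA thresholds can be sketched in Python as follows. The threshold values and the LOW level are hypothetical, and the dictionary is only an in-memory stand-in; it does not reproduce the schema of threshold-config.yaml.

```python
# Hypothetical lower bounds per risk level for two BTAs (composite score 0-100).
THRESHOLDS = {
    "WEB_WIRE_TRANSFER": {"MEDIUM": 30, "HIGH": 55, "CRITICAL": 80},
    "ACH_CREDIT_BATCH": {"MEDIUM": 40, "HIGH": 70, "CRITICAL": 90},
}


def risk_level(bta, score):
    """Return the highest risk level whose threshold the score meets."""
    levels = THRESHOLDS[bta]
    for level in ("CRITICAL", "HIGH", "MEDIUM"):  # check the strictest level first
        if score >= levels[level]:
            return level
    return "LOW"  # assumed name for scores below every threshold


print(risk_level("WEB_WIRE_TRANSFER", 60))  # HIGH
print(risk_level("ACH_CREDIT_BATCH", 60))   # MEDIUM
```

The same composite score of 60 is HIGH for the wire transfer but only MEDIUM for the ACH batch, because the wire transfer's HIGH threshold is set lower.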
Quarterly tuning workflow
- Pull the Performance Report: In the FraudShield Operations Console, go to Reports > Model Performance. Download the quarterly report for each active BTA. Key metrics: FPR, FNR, precision, recall, and alert volume trend.
- Identify out-of-range RIs: Review the RI Contribution Report for RIs with consistently low sub-scores across fraud cases. These RIs may be over-weighted relative to their predictive value. Flag any RI where the average fraud-case sub-score is below 30.
- Run threshold simulation: Use the Threshold Simulator (Operations Console > Tuning > Simulate) to replay the last 90 days of transactions against proposed new thresholds. Review the simulated FPR/FNR impact before changing production settings.
- Update configuration files: Apply approved changes to ri-config.ini and threshold-config.yaml in the staging environment. Validate with a shadow-mode run (scoring without influencing decisions) for 5–10 business days.
- Promote to production: Submit a change request and, once approved, deploy configuration files to production. Monitor alert volume, FPR, and FNR for the first 48 hours using the Operations Dashboard.
- Document the tuning rationale: Record the business rationale, simulation results, and approvals in the Model Change Log. This forms part of the model governance audit trail required under SR 11-7 and CFPB model risk guidelines.
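The two analytical steps in this workflow (flagging low-signal RIs and simulating a threshold change) can be approximated offline. The Python sketch below is not the Threshold Simulator; it is a simplified stand-in that assumes a single HIGH threshold decides whether a transaction is flagged, and all scores, labels, and RI names are made-up sample data.

```python
def low_signal_ris(fraud_case_sub_scores, cutoff=30):
    """Return RIs whose average sub-score across fraud cases is below the cutoff."""
    return sorted(
        name
        for name, scores in fraud_case_sub_scores.items()
        if scores and sum(scores) / len(scores) < cutoff
    )


def simulate(transactions, high_threshold):
    """Return (FPR, FNR) in percent for a candidate HIGH threshold.

    transactions: list of (composite_score, is_fraud) pairs.
    Simplification: a transaction counts as flagged when its score
    is at or above high_threshold.
    """
    legit = [score for score, is_fraud in transactions if not is_fraud]
    fraud = [score for score, is_fraud in transactions if is_fraud]
    false_positives = sum(score >= high_threshold for score in legit)
    false_negatives = sum(score < high_threshold for score in fraud)
    return (100 * false_positives / len(legit),
            100 * false_negatives / len(fraud))


# Made-up sample data (illustration only).
fraud_sub_scores = {
    "txn_velocity": [80, 65, 90],
    "geo_mismatch": [10, 25, 40],
    "new_payee": [100, 0, 100],
}
history = [(10, False), (20, False), (35, False), (50, False),
           (62, False), (15, False), (40, False), (72, False),
           (58, True), (66, True), (75, True), (90, True)]

print(low_signal_ris(fraud_sub_scores))  # ['geo_mismatch']
print(simulate(history, 55))             # (25.0, 0.0)
print(simulate(history, 70))             # (12.5, 50.0)
```

On this sample, raising the threshold from 55 to 70 halves the FPR but lets half of the fraud cases through, which is the trade-off the simulation step is meant to surface before any production change.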
Population-based tuning
For RIs whose optimal thresholds vary significantly by customer segment (for example, high-net-worth clients vs. basic retail accounts), FraudShield AI supports population-scoped RI configurations. Set the population_scope parameter on any RI to apply a separate score scale for a defined customer segment.
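The idea of a population-scoped scale can be sketched in Python as a per-segment lookup with a fallback. Only the population_scope parameter name comes from this guide; the segment names, amount brackets, and the fallback to a default scale are assumptions made for illustration.

```python
# Hypothetical amount brackets per population scope.
# Each pair is (lower bound in USD, sub-score for that bracket).
AMOUNT_SCALES = {
    "DEFAULT": [(0, 0), (1_000, 40), (10_000, 90)],
    "HIGH_NET_WORTH": [(0, 0), (25_000, 40), (250_000, 90)],
}


def scoped_amount_score(amount, population_scope):
    """Score an amount with the scale for the customer's segment.

    Segments without their own scale fall back to DEFAULT (assumed behavior).
    """
    brackets = AMOUNT_SCALES.get(population_scope, AMOUNT_SCALES["DEFAULT"])
    score = 0
    for lower_bound, bracket_score in brackets:
        if amount >= lower_bound:
            score = bracket_score  # keep the highest bracket reached
    return score


print(scoped_amount_score(15_000, "BASIC_RETAIL"))    # 90
print(scoped_amount_score(15_000, "HIGH_NET_WORTH"))  # 0
```

In this sketch a 15,000 USD transfer scores 90 for a basic retail account but 0 for a high-net-worth client, reflecting that the same raw value can carry very different risk across segments.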