• Patrick Özer, Partner
• Jori van Schijndel, Senior Manager

Anti-Money Laundering (hereinafter: ‘AML’) is a required component of the financial industry’s ongoing efforts to detect and prevent financial crimes. Most financial institutions rely on rule-based and/or machine learning models to identify suspicious activity. However, the effectiveness of these models is directly tied to the quality of the feedback they receive. A key method for refining models in general is backtesting, which evaluates model performance by comparing what the model predicted with what actually happened according to historical data. This process not only helps identify potential weaknesses, but also allows for adjustments that improve overall performance. For instance, to validate credit risk models that predict the probability of loan default, financial institutions analyze past loan data, including borrower characteristics, loan terms, and repayment histories, and check the predictions against the defaults that actually occurred.
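
To make the credit risk example concrete, the following is a minimal Python sketch of a backtest, not a description of any institution's actual method. It assumes each historical loan carries the model's predicted default probability and the observed outcome. The names LoanRecord and backtest, the 0.5 threshold, and the choice of metrics (precision, recall, Brier score) are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence


@dataclass
class LoanRecord:
    """Hypothetical record: one historical loan with its prediction and outcome."""
    predicted_default_prob: float  # model output at the time of the decision
    defaulted: bool                # observed outcome, i.e. the ground truth


def backtest(records: Sequence[LoanRecord],
             threshold: float = 0.5) -> Dict[str, Optional[float]]:
    """Compare historical predictions with observed outcomes."""
    if not records:
        raise ValueError("backtest needs at least one historical record")

    tp = fp = fn = 0
    for r in records:
        flagged = r.predicted_default_prob >= threshold
        if flagged and r.defaulted:
            tp += 1   # predicted default, loan defaulted
        elif flagged and not r.defaulted:
            fp += 1   # predicted default, loan was repaid
        elif not flagged and r.defaulted:
            fn += 1   # missed default

    # Precision and recall are undefined when their denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else None
    recall = tp / (tp + fn) if (tp + fn) else None

    # Brier score: mean squared gap between predicted probability and outcome.
    brier = sum((r.predicted_default_prob - float(r.defaulted)) ** 2
                for r in records) / len(records)

    return {"precision": precision, "recall": recall, "brier": brier}
```

The point of the sketch is that every metric depends on the defaulted field: a backtest is only as good as the recorded outcomes it can compare against.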

In theory, backtesting AML models should allow financial institutions to fine-tune them, increasing true positives and reducing false positives. In practice, however, financial institutions often face significant challenges in conducting proper backtesting. Why can AML models not benefit from backtesting the same way that, for example, credit models can?

The challenge – lack of feedback on alerts

A significant challenge in the backtesting process is the lack of ground truth for model outcomes. When a transaction monitoring (hereinafter: ‘TM’) model flags unusual transactions, the resulting alerts are initially reviewed by an internal alert handling team and a compliance department. If deemed ‘unusual’, these alerts are escalated to relevant authorities, such as the Financial Intelligence Unit (hereinafter: ‘FIU’). However, financial institutions rarely receive follow-up information on whether these alerts resulted in investigations or prosecutions, or whether they were ultimately classified as false positives. Moreover, the absence of feedback does not by itself justify treating an alert as a false positive. Consequently, financial institutions cannot use this crucial feedback as ground truth for refining their models. As a result, the accuracy of detection models is hard to improve and false positive rates remain high. Regulators like the Dutch central bank (DNB) expect institutions to demonstrate compliance with strict laws, such as the Anti-Money Laundering and Anti-Terrorist Financing Act (Wwft) and the Sanctions Act (SW), but without reliable backtesting, demonstrating that compliance is difficult.
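
The effect of missing feedback on a backtest can be illustrated with a small Python sketch. It assumes, hypothetically, that each escalated alert is labelled True (confirmed suspicious), False (confirmed false positive), or None (no feedback received). The function name precision_bounds and this labelling scheme are assumptions made for illustration. Because an unlabelled alert could be either, the sketch reports a range for precision rather than a single number.

```python
from typing import Optional, Sequence, Tuple


def precision_bounds(alert_feedback: Sequence[Optional[bool]]) -> Tuple[float, float]:
    """Return (lowest, highest) possible precision over a set of alerts.

    True  = confirmed suspicious
    False = confirmed false positive
    None  = no feedback received (outcome unknown)
    """
    total = len(alert_feedback)
    if total == 0:
        raise ValueError("at least one alert is required")

    confirmed_true = sum(1 for f in alert_feedback if f is True)
    unknown = sum(1 for f in alert_feedback if f is None)

    # Worst case: every unknown alert was a false positive.
    lower = confirmed_true / total
    # Best case: every unknown alert was a true positive.
    upper = (confirmed_true + unknown) / total
    return lower, upper
```

For example, precision_bounds([True, None, None, False, None]) returns (0.2, 0.8). With three of five outcomes unknown, the measured precision could lie anywhere in that interval, which is too wide to guide model tuning.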

A similar issue arises in onboarding and Know Your Customer (hereinafter: ‘KYC’) procedures. Financial institutions might reject clients based on risk assessments, but without proper feedback, they cannot verify whether their decision was justified. This can lead to confirmation bias, i.e., it is assumed the decision was correct, even in the absence of concrete evidence. This mirrors a known challenge in models related to the issuance of loans, where bias arises because financial institutions lack data on individuals who were denied loans. Sanctions screening models, another type of model used for AML purposes, also face challenges with respect to backtesting. The sparse examples of true positive sanction alerts make it difficult for institutions to validate their decisions, which both increases the risk of non-compliance and limits efficiency gains.

A collaborative solution – sharing insights to improve model performance

A potential (partial) solution to improve model performance in transaction monitoring could lie in collaboration between financial institutions. By sharing anonymized or aggregated insights into transaction patterns, both normal and suspicious, institutions can refine their detection models. Provided that sensitive or personal information is removed first, such collaboration can comply with privacy regulations such as the General Data Protection Regulation (GDPR). This exchange of information enables the continuous improvement of detection models and, potentially, a significant reduction in both false positives and false negatives.
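
As a rough illustration of what sharing aggregated insights might look like, the Python sketch below reduces transactions to counts per pattern and suppresses small groups before anything leaves the institution. The field names (amount_band, channel, flagged), the function name aggregate_patterns, and the min_count threshold are all hypothetical. Dropping identifiers and suppressing small cells reduces re-identification risk, but does not by itself establish GDPR compliance.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping, Tuple

Pattern = Tuple[str, str, bool]  # (amount_band, channel, flagged)


def aggregate_patterns(transactions: Iterable[Mapping[str, object]],
                       min_count: int = 10) -> Dict[Pattern, int]:
    """Count transactions per pattern and drop patterns seen fewer than min_count times.

    Only the three pattern fields are read; customer identifiers and any
    other fields on the input records never reach the output.
    """
    counts = Counter(
        (str(t["amount_band"]), str(t["channel"]), bool(t["flagged"]))
        for t in transactions
    )
    # Small-cell suppression: rare patterns could point to individual customers.
    return {pattern: n for pattern, n in counts.items() if n >= min_count}
```

Institutions could then compare or pool such tables, for instance to see which patterns are flagged far more often at one institution than at others, without exchanging individual transactions.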

Improving backtesting through collaboration between financial institutions can be the key to developing more effective TM models. By participating in non-sensitive data-sharing initiatives, institutions can not only improve their model effectiveness, but also contribute to a more secure global financial system. However, while collaborative backtesting can enhance TM models, it will not solve all AML-related challenges, such as the distinct issues faced by KYC and sanctions screening models. What other strategies do you think could further strengthen AML efforts?