Having established the potential benefits of ML-based matching, you must now determine the right solution for your organization. This article addresses one important factor to consider when evaluating different ML-based offerings.
Not all AI is created equal
Whether you develop in-house or opt for a vendor solution, recognize that not all ML is created equal!
Many different ML algorithms and implementations exist, each with its own strengths and weaknesses. The effectiveness of ML will also depend on whether it is coupled with use-case-specific algorithms for transaction matching.
Metrics over marketing: trust but verify
With the extensive range of prebuilt AI libraries available nowadays, many solution providers are making calls to such libraries and touting AI capabilities in their marketing literature.
How, then, to meaningfully evaluate these claimed AI capabilities? Evaluating AI requires a different approach from evaluating traditional software, where products can be compared and contrasted using checklists of features and functionality.
The behaviour of an AI system is by definition driven by the specifics of the data on which it is trained. The only way to truly and objectively assess the effectiveness of an AI system is therefore to run it on example data from the specific planned deployment and measure the relevant operational KPIs. Ideally, this data should be your own production reconciliation data for the most accurate insights. However, evaluating multiple vendor solutions on your production reconciliation data may be difficult: either the data must be sent outside your organization to the vendors (necessitating sign-off from your information security team), or permission must be obtained to install multiple vendor solutions on your own infrastructure so the evaluations can be run in-house (also requiring legal and security sign-off).
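To make this concrete, here is a minimal sketch of how such an evaluation could be structured, assuming ground-truth matches and the system's proposed matches are both expressed as pairs of ledger and statement entries. The IDs and the system output shown are illustrative assumptions, not real data or any particular vendor's API.

```python
# Minimal evaluation sketch. Assumes ground truth and the system's output are
# both sets of (ledger_id, statement_id) pairs; the IDs and the system output
# below are illustrative, not real data.

def evaluate_matches(truth: set, predicted: set) -> dict:
    correct = truth & predicted
    return {
        "proposed": len(predicted),            # matches created automatically
        "correct": len(correct),               # of those, the right ones
        "mismatches": len(predicted - truth),  # wrong matches needing correction
        "missed": len(truth - predicted),      # still require manual matching
    }

truth = {("L1", "S1"), ("L2", "S2"), ("L3", "S3"), ("L4", "S4")}
predicted = {("L1", "S1"), ("L2", "S2"), ("L3", "S9")}

print(evaluate_matches(truth, predicted))
# {'proposed': 3, 'correct': 2, 'mismatches': 1, 'missed': 2}
```

These raw counts are the inputs to the KPIs discussed in the sections that follow.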
The Power and Transparency of Public Benchmark Data
A much less onerous way to compare solutions is to use published vendor key performance indicators (KPIs) measured on a publicly available representative dataset (or, if vendors have not published them, you may request that vendors evaluate their systems on publicly available datasets). This allows rapid, low-effort comparison of ML models from various vendors or internal solutions.
Operartis' dataset and metrics (https://www.operartis.com/benchrec), launched at the ACM International Conference on AI in Finance (https://ai-finance.org/icaif-23-competitions-datasets/#Fin-Tran-Match), were provided for this purpose. This dataset is the only publicly available dataset for transaction matching and consists of obfuscated production data from a general ledger to bank reconciliation obtained from a Tier-1 bank.
Operartis encourages solution providers and academics to contribute their own datasets as well, to expand the comprehensiveness of the benchmarks.
Selecting KPIs: Balancing Automation and Accuracy
Transaction matching with ML aims to reduce manual effort by increasing the number of automatically matched transactions. However, these additional matches should not come at the cost of too many mismatches.
The most effective ML classifier should:
Automate as many matches as possible (high recall)
Keep mismatches low (high precision)
The total time saved by the classifier is the reduction in manual workload from the additional matches, minus the time spent identifying and correcting mismatches. The right balance between automation level and precision can be found by considering the average time spent on a manual match versus on a mismatch correction, as the sketch below illustrates.
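This sketch reuses the counts from the earlier example; the per-item handling times are purely illustrative assumptions and should be replaced with your own operational averages.

```python
# KPI sketch using the counts from the previous example. The per-item times
# are illustrative assumptions; substitute your own operational averages.

correct, proposed, true_total, mismatches = 2, 3, 4, 1

recall = correct / true_total     # share of true matches automated
precision = correct / proposed    # share of proposed matches that are right

avg_manual_match_min = 2.0   # assumed minutes to create one match by hand
avg_correction_min = 6.0     # assumed minutes to find and fix one mismatch

net_saving = correct * avg_manual_match_min - mismatches * avg_correction_min
print(f"recall={recall:.2f}  precision={precision:.2f}  net saving={net_saving:+.1f} min")
```

Note that on these toy numbers the net saving is actually negative (-2.0 minutes): the single mismatch costs more to fix than the two automated matches save, which is exactly why precision must be weighed against recall.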
In addition to these KPIs, if the solution provider's matching engine assigns a confidence value to each match, then the calibration curve for this confidence should be checked. This is a plot of confidence values against the actual correctness of matches carrying that confidence. A perfectly calibrated system shows an exact correspondence between confidence and observed correctness; in other words, the reported confidence value is accurate. It is important that the confidence value for a match is a true reflection of its correctness so that operational decisions can be made based on it (e.g. if confidence is below a threshold, review manually; otherwise, apply the match automatically).
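As a sketch of how such a check could be run, assume the engine reports a confidence in [0, 1] for each proposed match and that, after review, you know which matches were actually correct. The data below is synthetic and generated to be well calibrated; scikit-learn's reliability-curve helper does the binning.

```python
# Calibration-check sketch, assuming the engine reports a confidence in [0, 1]
# per proposed match and post-review you know which matches were correct.
# The data here is synthetic and generated to be well calibrated.
import numpy as np
from sklearn.calibration import calibration_curve

rng = np.random.default_rng(0)
confidence = rng.uniform(0.5, 1.0, size=1_000)     # engine's reported confidences
is_correct = rng.uniform(size=1_000) < confidence  # review outcome per match

frac_correct, mean_conf = calibration_curve(is_correct, confidence, n_bins=10)
for conf, frac in zip(mean_conf, frac_correct):
    print(f"confidence ~{conf:.2f} -> {frac:.2f} actually correct")
# A well-calibrated engine shows frac ~ conf in every bin.
```

Because the synthetic outcomes here are drawn to agree with the confidences, the bins line up; a real engine that reports 0.95 confidence on matches that turn out correct only 70% of the time would show a clear gap, and its confidence values should not be trusted for threshold-based automation decisions.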
Unlock the Power of Reconciliation Automation
Are you ready to transform your reconciliation processes? Matchimus offers a compelling solution that delivers unmatched efficiency, accuracy, and intelligence.
Get a Quote Personalized to Your Use Case with Our Proof of Value (PoV) Assessment.
We'll measure the match rate improvement on your reconciliation data and provide an automation report detailing your ROI before you commit to purchase.
Let's talk it over – schedule your demo to increase efficiency, improve exception management, reduce costs, and enhance visibility into your financial data. Matchimus – the future of reconciliation is here.