March 15, 2026
What We Learned Building Transaction Intelligence for a NZ Finance Company
A look at the challenges and lessons from building production ML for financial transaction classification.
One of our first major production engagements was building transaction classification and intelligence capabilities for a growing New Zealand finance company. The project spanned roughly three to four months and touched nearly every part of their data pipeline. Here's what we learned — without giving away anything proprietary.
The client was dealing with a growing volume of financial transactions that needed to be categorised accurately and consistently. As their lending portfolio scaled, manual classification was becoming a serious bottleneck. Inconsistencies in how transactions were labelled were creating downstream issues — particularly around compliance reporting and internal analytics.
They needed a system that could handle the complexity of their transaction landscape, adapt to new patterns over time, and integrate cleanly into their existing operations.
This wasn't a quick model-and-deploy job. The project involved multiple workstreams running in parallel — data exploration, pipeline architecture, model development, integration work, and extensive testing. Financial data is messy and nuanced, and getting it right required patience.
We spent the first stretch just understanding the domain: how the client's business worked, what their transaction categories actually meant in practice, and where the edge cases lived. That groundwork shaped everything that followed.
From there, the work moved through several phases — cleaning and structuring historical data, building and validating models, designing the serving infrastructure, and iterating closely with the client's team to make sure the outputs made sense in their day-to-day operations.
The time we spent understanding the client's business before writing any code paid for itself many times over. Domain knowledge informed our feature engineering, our validation strategy, and how we handled ambiguous cases. It was the single most valuable thing we did.
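To make that concrete, here's a minimal sketch of what domain-informed feature engineering can look like for transaction descriptions. The keyword lists, feature names, and category thresholds below are invented for illustration — nothing here reflects the client's actual categories or rules.

```python
import re

# Hypothetical keyword lists — illustrative only, not the client's taxonomy.
RENT_KEYWORDS = {"rent", "tenancy", "bond"}
UTILITY_KEYWORDS = {"power", "electricity", "broadband", "water"}

def extract_features(description: str, amount: float) -> dict:
    """Turn a raw transaction description and amount into model features."""
    tokens = set(re.findall(r"[a-z]+", description.lower()))
    return {
        "has_rent_keyword": bool(tokens & RENT_KEYWORDS),
        "has_utility_keyword": bool(tokens & UTILITY_KEYWORDS),
        # Domain insight: round-dollar amounts often indicate transfers.
        "is_round_amount": amount == round(amount),
        "amount_band": "high" if amount >= 1000 else "low",
    }

features = extract_features("Weekly rent payment - Tenancy 42", 450.0)
# → {"has_rent_keyword": True, "has_utility_keyword": False,
#    "is_round_amount": True, "amount_band": "low"}
```

Features like these only exist because someone who knows the domain can say which signals matter — that's the knowledge the early groundwork bought us.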
We worked closely with the client's operations and compliance teams across the full engagement. Their feedback shaped the model at every stage — catching edge cases, validating outputs, and helping us understand what "good enough" actually looked like in their context.
We knew from the start that this system would need to evolve — new transaction types, changing regulatory requirements, shifting business priorities. We designed the pipeline to be modular and well-documented, so the client's team could make changes and retrain models without depending on us.
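The shape of that modularity can be sketched in a few lines. This is a toy illustration of the design principle, not the client's actual architecture — the stage names, the lambda stubs, and the dictionary "model" are all stand-ins.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Pipeline:
    load: Callable[[], list]          # fetch labelled transactions
    clean: Callable[[list], list]     # apply data-quality rules
    train: Callable[[list], object]   # fit and return a model

    def run(self) -> object:
        return self.train(self.clean(self.load()))

# Each stage is an independent, swappable function, so supporting a new
# transaction type or rule means replacing one piece, not the whole system.
pipeline = Pipeline(
    load=lambda: [("rent payment", "housing"), ("", "unknown")],
    clean=lambda rows: [r for r in rows if r[0]],              # drop empty descriptions
    train=lambda rows: {desc: label for desc, label in rows},  # toy "model"
)
model = pipeline.run()
# → {"rent payment": "housing"}
```

The point of this structure is that the client's team can retrain by swapping one function, with the rest of the pipeline untouched and well-documented.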
Historical labels had inconsistencies that propagated through training in ways we didn't fully anticipate. We spent more time than planned on data cleaning and reconciliation. In future projects, we'd scope data quality as a distinct workstream with its own timeline.
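A data-quality workstream would start with checks like the one sketched below: finding descriptions that carry conflicting historical labels. The sample rows and normalisation rule are invented for illustration — real reconciliation involves far more than lowercasing.

```python
from collections import defaultdict

def find_inconsistent_labels(rows):
    """Group transactions by normalised description and flag any
    description that carries more than one historical label."""
    labels_by_desc = defaultdict(set)
    for description, label in rows:
        labels_by_desc[description.strip().lower()].add(label)
    return {d: labels for d, labels in labels_by_desc.items() if len(labels) > 1}

# Invented sample history showing the kind of conflict that hurts training.
history = [
    ("COUNTDOWN AUCKLAND", "groceries"),
    ("countdown auckland", "household"),
    ("Z ENERGY", "fuel"),
]
conflicts = find_inconsistent_labels(history)
# → {"countdown auckland": {"groceries", "household"}}
```

Running checks like this before training — rather than discovering the conflicts through model errors — is exactly the scoping change we'd make next time.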
Some integration issues could have been caught sooner if we'd deployed to a staging environment earlier in the process. We've since adjusted our approach to bring integration work forward.
The system is now in production and processing transactions daily with high accuracy. It's significantly reduced the time the client's team spends on manual classification and brought much-needed consistency to their categorisation across the portfolio.
More importantly, the client's team understands how the system works and can maintain it independently. That's what we mean when we say we build your capability, not your dependency. This project was a success — not because it was easy, but because we invested the time to do it properly.