Why machine learning still offers rich potential for the tax function

In the rush to embrace gen AI, tax leaders shouldn’t forget the benefits of conventional machine learning, says Stuart Tait.

Why isn’t anyone talking about machine learning anymore?

Given the (not unjustified) hype surrounding generative AI, you could be forgiven for thinking it has supplanted machine learning (ML) – which was the transformative technology of the day only a few years ago.

Yet there’s still huge potential for conventional ML systems to automate tax procedures, making them faster, more accurate and more efficient.

It’s worth remembering that gen AI is machine learning: it’s just the latest iteration of it.

AI is so called because it emulates human cognition to predict outcomes. A conventional ML model does this by inferring rules from the specific data it has been trained on; in its simplest form, it uses those rules to choose between two options. In a tax context, that might be deciding whether or not a transaction is tax deductible.

Gen AI takes this a stage further. Trained on vast quantities of text, much of it drawn from the internet, tools like ChatGPT predict the next word in a sentence to generate human-like content.

I’ve looked at gen AI’s use cases in the tax function in a previous blog. Gen AI tools are undoubtedly powerful, but don’t rush headlong into pointing them at every tax problem. There are many scenarios in which a conventional machine-learning tool will do the job.

Stuart Tait

Tax Technology & Innovation Partner

KPMG in the UK

Harnessing machine learning

The process of adopting ML in the tax function can be complex – especially if you’re developing your own algorithms. Here’s an overview of what’s involved:

Decide whether to build or buy

There are plenty of well-established machine-learning tax tools out there. So there may be a ready-made model you can use, rather than creating one from scratch.

This will come down to two factors. First, how common is the question you need answered? And second, how specific to your organisation is the context in which it arises?

By way of example: the factors affecting the tax deductibility of, say, expenditure on capital assets or entertainment are broadly the same across sectors. It would usually make sense – and save you a lot of time, money and effort – to work with a partner who has already built a model for that purpose. In that case, you can move straight to step six: keeping on top of the fundamentals.

When it comes to legal fees, however, tax deductibility is more complicated. It depends on the purpose of the legal services being procured, which in turn depends on the nature of the business. Each firm will have different reasons for hiring legal support.

As such, automating the process of deciding whether your legal expenses are deductible will likely require a bespoke algorithm. That means following all of the steps below.

Assess your training data

Machine learning algorithms must be trained on data. Lots of data. Working with a data scientist, you’ll need to analyse your training data from two points of view:
- Availability: You’ll need at least 10,000 relevant data points. Can you get access to those internally? If not, developing your own algorithm won’t be possible.
- Quality: Does your data contain all the fields required for an algorithm to make the decision you want to automate? Are there any gaps? Does it contain fields that aren’t relevant? Are all fields filled in comprehensively and accurately?
You may want to set some rules for your quality evaluation. For instance, data fields must contain a certain number of characters to provide enough information; fields containing only digits shouldn’t be used; and so on. Codifying your requirements will help you decide whether your data is up to scratch.
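
To make the idea of codified quality rules concrete, here is a minimal Python sketch of the two example rules mentioned above. The function names and the 15-character threshold are hypothetical, chosen purely for illustration; your data scientist would set the real rules and limits for your data.

```python
# Hypothetical threshold: the minimum length for a description to be informative.
MIN_CHARS = 15


def passes_quality_rules(description: str, min_chars: int = MIN_CHARS) -> bool:
    """Return True if a free-text field is informative enough to train on."""
    text = description.strip()
    if len(text) < min_chars:
        return False  # too short to provide enough information
    if text.isdigit():
        return False  # digits only, for example a bare reference number
    return True


def usable_share(descriptions: list[str]) -> float:
    """Return the fraction of records that pass every quality rule."""
    if not descriptions:
        return 0.0
    passed = sum(passes_quality_rules(d) for d in descriptions)
    return passed / len(descriptions)
```

Running `usable_share` over a sample of your transaction descriptions gives a quick, repeatable measure of whether the data is up to scratch.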

Select your algorithm type

There are various types of ML algorithm, all with their own pros and cons and ideal use cases. It’s impossible to explain these here without getting into reams of technical detail. Suffice it to say, you’ll need the help of a data scientist to identify which sort best meets your particular requirements.

Train and test your algorithm

The next stage is to divide your data into two sets:
- A training set – containing around 80% of the data points. Feed this into the algorithm, along with the outcome to each scenario (the ‘right answer’). That will allow it to infer the rules and make predictions.
- A test set – comprising the remaining 20%. Input this without the outcomes, and compare the model’s outputs to the correct conclusion.
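
The split described above can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, the fixed seed is an assumption made so the split is repeatable, and in practice your data scientist may use a library routine instead.

```python
import random


def split_train_test(records, train_share=0.8, seed=42):
    """Shuffle a copy of the records, then split into training and test sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = int(len(shuffled) * train_share)
    return shuffled[:cut], shuffled[cut:]


def accuracy(predicted, actual):
    """Share of test cases where the model's output matches the right answer."""
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)
```

With 10,000 data points, `split_train_test` returns 8,000 records for training and 2,000 for testing; `accuracy` then compares the model’s outputs on the test set with the known outcomes.
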
Your algorithm won’t ever be 100% accurate (though it should be close). You’ll need to analyse its errors to understand whether they’re too conservative or too optimistic. In a tax context, of course, you’ll want to err on the side of caution, to reduce the risk of underpayment or non-compliance.
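
One way to analyse the direction of the errors is to count them separately. The sketch below assumes a deductibility model where `True` means ‘deductible’, and it treats a wrongly claimed deduction as the optimistic error and a missed deduction as the conservative one. Those labels and the function name are illustrative assumptions, not a standard.

```python
from collections import Counter


def error_profile(predicted, actual):
    """Count correct answers and errors by direction. True means 'deductible'."""
    counts = Counter()
    for p, a in zip(predicted, actual):
        if p == a:
            counts["correct"] += 1
        elif p and not a:
            counts["optimistic"] += 1    # claimed a deduction that was not allowed
        else:
            counts["conservative"] += 1  # missed a deduction that was allowed
    return counts
```

If the optimistic count dominates, the model is leaning towards underpayment risk, which is the direction you most want to avoid.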

With each response, your algorithm will provide a confidence rating. In our tax deductibility scenario, an output might be: ‘This transaction is deductible: 80% confidence’. Use this rating to set parameters governing how to use the responses – such as:
- 90% confident or higher: act on the model’s recommendation
- 80% to just under 90% confident: take the more conservative position
- Below 80% confident: get a person to make the decision
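
Parameters like these translate directly into a simple routing rule. The Python sketch below is illustrative only: the function name and return values are hypothetical, and it assumes that the conservative position is to treat the transaction as not deductible, with exactly 90% falling in the top band and exactly 80% in the middle one.

```python
def route_decision(predicted_deductible: bool, confidence: float) -> str:
    """Decide how to use a model output, given its confidence (0.0 to 1.0)."""
    if confidence >= 0.90:
        # High confidence: act on the model's recommendation.
        return "deductible" if predicted_deductible else "not deductible"
    if confidence >= 0.80:
        # Medium confidence: take the more conservative position
        # (assumed here to mean not claiming the deduction).
        return "not deductible"
    # Low confidence: hand the decision to a person.
    return "refer to reviewer"
```

So the example output above, ‘deductible’ at 80% confidence, would be treated conservatively rather than acted on.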

Manage, review and iterate

Once your solution is live, regularly review its accuracy, and keep feeding in new data to improve its performance.
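
A regular accuracy review could be as simple as comparing a sample of the model’s decisions against a human reviewer’s answers, period by period. The sketch below assumes you record each checked decision as a `(period, predicted, actual)` tuple; the function names and the 95% floor are hypothetical.

```python
from collections import defaultdict


def accuracy_by_period(reviews):
    """reviews: iterable of (period, predicted, actual) from human-checked samples."""
    totals = defaultdict(lambda: [0, 0])  # period -> [correct, checked]
    for period, predicted, actual in reviews:
        totals[period][0] += int(predicted == actual)
        totals[period][1] += 1
    return {period: correct / checked for period, (correct, checked) in totals.items()}


def periods_needing_attention(reviews, floor=0.95):
    """Return the periods where reviewed accuracy fell below the chosen floor."""
    by_period = accuracy_by_period(reviews)
    return sorted(p for p, acc in by_period.items() if acc < floor)
```

A period that appears in the second function’s output is a prompt to investigate, for instance whether a rule change has made the training data out of date.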

You must also keep it up to date: tax regulation never stands still. Tweaks to the rules must be reflected in the algorithm’s training data (again, you’ll need support from a data scientist here). Wholesale changes may mean having to create a whole new model.

Keep on top of the fundamentals

Whether you’ve bought in an ML solution or developed your own, make sure you and your users are constantly aware of three vital facets:
- What the tool is meant to do: users must understand what the model is intended to decide and, crucially, what it’s not for. Asking it questions it hasn’t been trained to answer will produce unreliable responses.
- The quality and accuracy of your data: always assess new data before feeding it into the algorithm.
- What good looks like: be clear on how you’re benchmarking the success of the algorithm. For example, how often does it provide the right answer with a certain level of confidence?
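
As an illustration of the last point, one possible benchmark is to measure how often the model is right when it is confident, alongside how many cases reach that confidence level at all. The sketch below assumes each result is stored as a `(predicted, actual, confidence)` tuple; the function name and the 90% default are hypothetical.

```python
def accuracy_at_confidence(results, min_confidence=0.9):
    """Return (coverage, accuracy) for outputs at or above min_confidence.

    coverage: share of all cases that reach the confidence level.
    accuracy: share of those confident cases that were right, or None
              if no case reached the level.
    """
    results = list(results)
    confident = [(p, a) for p, a, c in results if c >= min_confidence]
    if not confident:
        return 0.0, None
    coverage = len(confident) / len(results)
    accuracy = sum(p == a for p, a in confident) / len(confident)
    return coverage, accuracy
```

Tracking both numbers matters: a model that is almost always right when confident is of limited use if it is rarely confident.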

Developing, running and maintaining ML models is a highly technical task: you’ll need specialist support along the way.

KPMG’s data scientists and AI experts understand how machine learning can be applied to tax operations, regulation and compliance. We can help wherever you are on your journey. We can offer a one-hour overview of ML in the tax function; a half-day session to identify your pain points; or a full Ignition workshop, where we’ll recommend the right solutions for your business.

Please get in touch to see how we can take you through the process of developing and adopting machine-learning tools.
