By focusing on the development of specialised small language models that are fine-tuned on high-quality, proprietary data and self-hosted on controlled infrastructure, we achieve 3 key strategic advantages: superior accuracy, reliability, and radical cost-efficiency.

Study: specialised AI models’ big advantage in precision tasks

15 September 2025

The article at a glance

Specialised small language models deliver far greater accuracy, better stability and much lower energy costs than generalised large language models (LLMs) in legal and regulatory workflows that combine AI speed with human judgment, says a new study from the Regulatory Genome Project at Cambridge Judge Business School, University of Cambridge. The specialised models achieved a 38% relative accuracy gain over the best-performing LLM, ran at 1/80th of its cost and consumed an estimated 1/200th of its energy.

Artificial intelligence (AI) is rapidly changing how organisations, including regulators and law firms, compile, analyse and classify data and documents. Many such organisations are experimenting with agentic AI systems that handle multi-step tasks through Human-in-the-Loop (HITL) workflows, which combine AI speed with human judgment to improve AI performance and enable collaboration in workflows that are likely to remain human-centric.

A new study from the Regulatory Genome Project, a public-private research initiative hosted at Cambridge Judge Business School, finds that, for a common legal and compliance workload requiring precision, specialised small language models were far more accurate and consumed significantly less energy than a panel of 6 leading generalist large language models (LLMs) used for such tasks.

The research, published today (15 September) by the Regulatory Genome Project in a White Paper, found that, in classifying Anti-Money Laundering regulatory documents, the specialised small language model had a 38% relative accuracy gain compared with the best-performing LLM in the panel, Google's Gemini 2.5 Pro, while running at 1/80th the cost and consuming an estimated 1/200th the energy. In classifying cryptocurrency documents, the specialised small language model achieved a 72% relative accuracy gain, showing an even greater advantage for domain-specific model training in rapidly evolving regulatory and legal landscapes.
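A "relative accuracy gain" is easy to confuse with a gain in percentage points. The short Python sketch below shows the arithmetic under the usual definition, (specialised - baseline) / baseline. That definition is an assumption here, since the article does not state how the White Paper computes the figure, and the two accuracy values are hypothetical numbers chosen only to illustrate the calculation, not results from the study.

```python
def relative_gain(specialised: float, baseline: float) -> float:
    """Relative improvement of `specialised` over `baseline`, as a fraction.

    Assumed definition: (specialised - baseline) / baseline.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (specialised - baseline) / baseline


# Hypothetical accuracies, NOT taken from the White Paper.
baseline_accuracy = 0.50      # illustrative accuracy of a generalist LLM
specialised_accuracy = 0.69   # illustrative accuracy of a specialised model

gain = relative_gain(specialised_accuracy, baseline_accuracy)
point_gain = specialised_accuracy - baseline_accuracy

print(f"relative accuracy gain: {gain:.0%}")
print(f"absolute gain: {point_gain * 100:.0f} percentage points")
```

With these illustrative inputs, a 19-percentage-point improvement corresponds to a 38% relative gain, which is why the size of a relative gain depends on how accurate the baseline model was to begin with.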

Specialised language model developed by RegGenome, born from University of Cambridge research

The specialised small language model examined in the White Paper was developed by Regulatory Genome Development Ltd, known as RegGenome, a regulatory data technology company that was born out of research at the University of Cambridge and is a founding member of the Regulatory Genome Project. RegGenome uses AI to process regulation from around the world, transforming human-readable information into machine-readable data that is in use by nearly 100 regulatory authorities and commercial customers worldwide.

The 34-page White Paper is co-authored by Max Ashton-Lelliott, Senior Data Scientist at RegGenome, and Robert Wardrop, Management Practice Professor of Finance at Cambridge Judge Business School and Founder of the Regulatory Genome Project. While the research was focused on regulatory and legal workflows, the authors believe the findings are generalisable to many other sectors and tasks given the common interplay at work between AI and human interaction.

Practical workplace effect is reduced human review time and AI-assisted review at scale

As outlined in a case study in the White Paper, the practical workplace effect of the superior performance of specialised small language models is that the higher AI success rate directly reduces the manual effort required by analysts as part of the HITL workflows.

“The RegGenome small language model creates a step-change in efficiency,” says the White Paper. “Its near-instantaneous processing time makes the entire workflow faster and more responsive.” The paper says its analysis “proves that for a HITL system to be truly effective, it must be evaluated on total end-to-end efficiency. The RegGenome model is not just more accurate or cheaper in isolation, it enables a fundamentally more efficient and collaborative relationship between the analyst and the AI, saving days of expert-level work and transforming the viability of AI-assisted review at scale.”

AI needs accuracy and speed for effective workflow

“The impressive capabilities of generalist LLMs like GPT and Claude has attracted widespread adoption – recent research indicates 40% of knowledge workers are using AI tools to increase their personal productivity,” the authors say in an Executive Summary of the White Paper. “But the same users who have integrated these tools into their personal workflows describe them as unreliable when encountered within enterprise workflows.

“For AI to be a viable part of a workflow it must exceed an acceptable threshold of accuracy while also having the speed, stability, and resource efficiency necessary for effective collaboration,” the authors say.

The White Paper evaluates specialised and generalist language models on a 2-level task, the classification of a document and of its regulatory obligation, that is commonly undertaken in legal and compliance workflows. The models are compared for accuracy, cost, speed, stability and estimated energy consumption, both in established regulatory domains such as Anti-Money Laundering and in emerging areas of regulation such as the oversight of cryptocurrencies.
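To make the 2-level task concrete, the Python sketch below scores predictions at the document level, at the obligation level, and jointly (both levels correct). It is only an illustration: the `Label` type, the `two_level_accuracy` function, the label strings and the scoring scheme are all hypothetical, and the article does not describe how the White Paper actually scores the task.

```python
from typing import NamedTuple


class Label(NamedTuple):
    document_class: str  # level 1: hypothetical document category
    obligation: str      # level 2: hypothetical regulatory obligation


def two_level_accuracy(gold: list[Label], predicted: list[Label]) -> dict[str, float]:
    """Return accuracy per level and for both levels jointly."""
    if not gold or len(gold) != len(predicted):
        raise ValueError("gold and predicted must be non-empty and equal in length")
    n = len(gold)
    pairs = list(zip(gold, predicted))
    doc_hits = sum(g.document_class == p.document_class for g, p in pairs)
    obl_hits = sum(g.obligation == p.obligation for g, p in pairs)
    joint_hits = sum(g == p for g, p in pairs)  # both levels must match
    return {
        "document": doc_hits / n,
        "obligation": obl_hits / n,
        "joint": joint_hits / n,
    }


# Made-up labels for illustration only.
gold = [
    Label("AML", "customer due diligence"),
    Label("AML", "reporting"),
    Label("AML", "record keeping"),
    Label("Other", "none"),
]
predicted = [
    Label("AML", "customer due diligence"),  # both levels correct
    Label("AML", "record keeping"),          # obligation wrong
    Label("AML", "record keeping"),          # both levels correct
    Label("AML", "none"),                    # document class wrong
]

scores = two_level_accuracy(gold, predicted)
print(scores)
```

Reporting the joint score alongside the per-level scores matters in a HITL setting, because an analyst has to intervene whenever either level is wrong.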

This article was published on 15 September 2025.