“A broken agent could cost a company billions of dollars.” Conscium featured in Sifted magazine

09.12.25

4 mins read

Éanna Kelly, a contributing editor at Sifted, writes about companies testing and monitoring AI agents.

If AI proponents are to be believed, at some point in the next few years there will be more AI agents than humans in the workforce. Some of these agents — autonomous digital secretaries powered by AI — will be making big decisions, from approving transactions to accessing sensitive data. As their autonomy expands, the need to keep tabs not only on what they’re doing, but whether they’re authorised to do so, will become essential.

This is where an emerging crop of startups — ones testing, monitoring and authenticating agents — comes in. “A broken agent could cost a company billions of dollars,” says Ted Lappas, cofounder of London-based Conscium, which has developed a platform to test agents.

There are various ways of describing the agent-policing services startups offer, such as “observability”, “verification” or AI agent “improvement”. It’s a young field but competition is already growing.

This year, investors have backed Israeli companies Deepchecks and Digma, and US-based Kolena, Braintrust and Vouched, which all build tools to test or identify AI models.


There have already been some M&A moves in this bubbling field. In August, AI model-maker Anthropic acquired the cofounders and most of the team behind London-based Humanloop, a platform for LLM evaluation and observability. In March, US AI infrastructure company CoreWeave bought a similar startup called Weights & Biases.


What about European challengers? Below are some of the early players helping companies to assess AI accuracy and identity.


Langfuse
HQ:
Berlin
Founders:
Marc Klingen (CEO), Max Deichmann and Clemens Rawert
Equity raised:
$4m seed
Lead investors:
Lightspeed Venture Partners, La Famiglia and Y Combinator
The Berlin open source company — which is rumoured to be raising investment from VCs — helps to debug and improve LLM applications. Langfuse has amassed some 19k “stars” on developer hub GitHub — which means a lot of people feel passionate about it — and the company says 19 of the “Fortune 50” companies use its services.
“When a customer pings us, we pop open Langfuse to understand what’s going on,” says Alexander Danilowicz, cofounder of Californian vibe coding site Magic Patterns. “This allows us to keep customer support resolutions at less than eight minutes on average.”

Conscium
HQ:
London
Founders:
Daniel Hulme (CEO), Ed Charvet, Calum Chace, Ted Lappas and Panagiotis Repoussis
Equity raised:
$0m
The London startup has developed a platform to verify AI agents for accuracy, responsiveness and other factors.
Cofounder Lappas says “every job title that doesn’t include physical work is on the table” for AI agents to take over in the next few years. He says the first wave of agents will be limited in what they do (they’re chatbots, essentially, helping companies retrieve information).
Agents will gradually get more sophisticated, he says, and be given more responsibilities until we get to the ultimate agent: the version that operates like a human and is able to adapt to a changing environment. “That will be by far the most interesting and popular development in 2026,” says Lappas.
Conscium’s platform is intended to be used regularly by companies running agents. “Testing needs to be continuous, multiple times a day. Every time there’s a change in the agent’s universe and you have a different input to the agent’s brain, you have to test it again,” he says.
