AI and the need for testing

Written by Barney Quinn, Ambassador, Insoft Services

The problem

There is no doubt AI is advancing fast. Governments are backing it, companies are racing to adopt it, and investment is booming. But despite the excitement, we are seeing growing numbers of failures – often serious ones.

These failures are not due to bad luck or poor tech. They are often the result of weak governance, inadequate testing, and teams cutting corners to keep up with the pace.

Even the big players struggle. In February 2025, an Apple AI feature wrongly claimed a BBC article said someone had taken their own life – completely false. Apple responded quickly, but if it can happen to Apple, it can happen to anyone.

I am not anti-AI. I believe AI brings innovation, creativity, and speed. But it must be checked, tested, and governed before it is trusted. A quick “eyeball test” is not enough in complex systems.  

 

AI needs control

I approach this from a risk and governance perspective. Boards and management need to know what is being built – and ensure proper oversight. Not later. Not after it fails.

Today, I would label all AI projects with a red flag – not because AI is bad, but because many teams are skipping QA and testing steps or doing them poorly. In a rush to ship new tools, too many organizations have gotten sloppy.

If professional teams get it wrong, what happens when non-technical departments build and deploy AI without formal processes? It is happening more than you think.

AI is powerful, but we cannot let it run unchecked as if it were infallible. That is a fast track to failure. Remember, AI is often overconfident and has no accountability. That is the key difference between a human doing the work under proper controls and uncontrolled AI: the AI faces no consequences when it is wrong, even when the mistake has a huge negative impact on the business.

 

Speed is the pressure

The business world has changed. Systems used to take years to build and deploy. Now, it is quarters – or weeks. AI enables that speed, but it also increases risk.

With AI high on every CEO’s agenda, solutions are often rushed. The faster we move, the more we need governance, QA, and testing to catch up. Sadly, those areas have often been left behind.

Teams now assume the tech will “just work.” But it does not – not without control.

We also need to look more closely at the tools we use. Some tools (like DeepSeek) may have regional or political biases. We must apply due diligence – not just to the AI itself, but also to the systems behind it.

 

The legal view

Governments are starting to react. Regulation is coming – sometimes strong, sometimes vague. In the UK, existing company law applies, but that may be too slow. The EU’s AI Act, whose first provisions took effect in February 2025, includes major fines – up to €35 million or 7% of global annual turnover, whichever is higher.

One proposed law in California used the word “carefulness.” I like that. It is what we need: governance and tests – before things break.

 

The importance of testing

This is not just about new AI systems. It applies to upgrades, scripts, patches – everything. Assumptions that “it’ll work” can lead to serious consequences.

Think CrowdStrike in 2024. The global impact was massive. Some companies suffered for months.

We need to test properly. Always.

I recently spoke with Sandeep Shah at Webtrends Optimize (disclosure: I am a shareholder). His view? “Testing is common sense, not rocket science.” Good testing teams follow best practices, use smart tools, and experiment before going live.

 

Experimentation (A/B Testing)

Tech-savvy companies like Booking.com and Amazon run A/B or split tests before full deployment. They compare two versions, watch how users react, and refine.

Start small. Test. Improve. Then go live. That is what professionals do.
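To make the idea concrete, here is a minimal sketch in Python of how a split test might be evaluated: it compares conversion rates for two variants with a simple two-proportion z-test. The variant names and numbers are purely illustrative and are not drawn from any of the companies mentioned above.

```python
import math
from dataclasses import dataclass


@dataclass
class Variant:
    name: str
    visitors: int     # users shown this variant
    conversions: int  # users who completed the goal

    @property
    def rate(self) -> float:
        return self.conversions / self.visitors


def two_proportion_z_test(a: Variant, b: Variant) -> tuple[float, float]:
    """Return (z, p_value) for the difference in conversion rates."""
    pooled = (a.conversions + b.conversions) / (a.visitors + b.visitors)
    se = math.sqrt(pooled * (1 - pooled) * (1 / a.visitors + 1 / b.visitors))
    z = (b.rate - a.rate) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Illustrative numbers only: variant B (the new flow) vs. control A.
control = Variant("A (current flow)", visitors=10_000, conversions=1_150)
candidate = Variant("B (new flow)", visitors=10_000, conversions=1_240)

z, p = two_proportion_z_test(control, candidate)
print(f"A: {control.rate:.2%}  B: {candidate.rate:.2%}  z={z:.2f}  p={p:.4f}")
if p < 0.05:
    print("Difference is statistically significant - consider rolling out B.")
else:
    print("No significant difference yet - keep testing before going live.")
```

The point is not the statistics but the discipline: the new version only replaces the old one after the evidence supports it.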

 

Shadow AI projects: A real risk

Many AI systems are now being developed by non-technical teams – often without the board even knowing. A Salesforce study in Q4 2023 showed that 50% of employees using GenAI did so without approval.

That is a problem. These teams often lack the skills to test or govern what they build. Without oversight, these systems go live – untested, unreviewed, and risky.

Companies today are complex. Systems talk to each other. If one breaks, many break. You need ecosystem-level visibility and governance.

 

Real-world examples

– A well-known delivery company’s chatbot spiraled out of control. It was taken offline after going viral.
– A GM car dealership’s chatbot in the US agreed to sell a $76,000 car for $1 (thankfully, the deal was not honored).
– A UK train company’s ticketing app issued incorrect tickets and had to be shut down.
– A BBC report in February 2025 found that AI chatbots failed to accurately summarize even basic news content.

We cannot have another Post Office Horizon-style disaster in ten years, with executives claiming “We thought the AI was right.”

These incidents all point to the same thing: failure to test, govern, and communicate clearly – especially at the board level.

 

What should companies do?

We do not want to kill innovation. But we must manage risk.

Recommendations:
– Ensure all AI projects are visible to the board and management.
– Use sandbox environments to test AI safely before launch.
– Have a separate team evaluate the AI – not just the creators.
– Maintain a risk register, reviewed monthly by management (see the sketch after this list).
– For major or sensitive projects, engage a professional testing company for independent QA.
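As a rough illustration of the risk-register point, here is a minimal sketch in Python of how AI project entries could be tracked and flagged when a monthly review is overdue or independent testing is missing. The field names and thresholds are assumptions for illustration only, not a prescribed format, and no substitute for a proper governance tool.

```python
from dataclasses import dataclass
from datetime import date, timedelta


# Hypothetical structure for one AI project entry in a risk register.
@dataclass
class AIProjectRisk:
    project: str
    owner: str
    risk_level: str            # e.g. "red", "amber", "green"
    independently_tested: bool
    last_review: date

    def review_overdue(self, today: date, cadence_days: int = 30) -> bool:
        """Flag entries not reviewed within the monthly cadence."""
        return today - self.last_review > timedelta(days=cadence_days)


# Illustrative entries only.
register = [
    AIProjectRisk("Customer chatbot", "Marketing", "red", False, date(2025, 1, 10)),
    AIProjectRisk("Invoice OCR pilot", "Finance", "amber", True, date(2025, 2, 20)),
]

today = date(2025, 3, 1)
for entry in register:
    if entry.review_overdue(today) or not entry.independently_tested:
        print(f"Escalate to board: {entry.project} "
              f"(level={entry.risk_level}, tested={entry.independently_tested})")
```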

 

Insoft: A partner in testing

As an Ambassador for Insoft Services, I can say their team of expert testers works both onshore and offshore. They use advanced tools like Tricentis Tosca, which include self-healing features – saving time and reducing errors.

But tools are not magic. You still need professionals to manage them effectively.

 

Final thought

AI is here to stay. It brings value, speed, and creativity – but only when deployed with care.

Do not assume it will work. Do not wait for failure to evaluate.
Test early. Test often. Govern always.

