AI tools are making ‘repeated factual errors’, major new research warns

The latest wave of internet-based AI search tools “often make mistakes, misread information and even give risky advice,” according to a damning study by Which?
The consumer group Which? surveyed 4,189 adults in the UK in September 2025 about their AI habits and found that nearly a third of them considered AI searches more important to them than standard web searching.
Additionally, nearly half of respondents said they had a “great deal” or a “fair amount” of confidence in the information they received from AI search engines, a share that rose to two-thirds among frequent users.
The team tested six AI tools: ChatGPT, two Google offerings — Gemini and AI Overviews (AIO), the summaries that appear in standard Google searches — Microsoft’s Copilot, Meta AI, and Perplexity.
Each AI engine was asked 40 common questions on topics such as money and finance, law, health and nutrition, and consumer rights and travel issues. Which? experts then evaluated the answers, rating them on factors such as accuracy, usefulness, and ethical responsibility.
The team said all the AI tools tested made “repeated factual errors”, gave incomplete advice, and offered overly confident answers without considering ethical issues. The tools sometimes relied on poor sources, such as old forum threads, and directed users to “dangerous premium services” rather than free tools and resources, meaning people risked overpaying or engaging with “dubious services”.
“There are too many false and misleading statements for comfort, especially given how much people now use and rely on these tools,” the Which? team said.
It added: “AI is the future, but relying too much on it right now could be costly.”
The investigation into the reliability of AI comes as Sundar Pichai, CEO of Google parent company Alphabet, said AI models were “error-prone” and encouraged people to use them in conjunction with other tools.
Speaking to the BBC this week, Mr Pichai said people should not “blindly trust” new technology, and that the mistakes made by AI tools highlight the importance of having a rich information ecosystem rather than relying solely on AI.
Responding to the study, a Google spokesperson said: “We have always been transparent about the limitations of generative AI, and we place reminders directly in the Gemini app to ensure users double-check information. For sensitive topics, such as legal, medical or financial matters, Gemini goes a step further by recommending that users consult qualified professionals.”
Microsoft said: “Copilot answers questions by distilling information from multiple web sources into a single answer. Answers include linked citations so users can do further research, just like traditional search. As with any AI system, we encourage people to verify the accuracy of the content, and we’re committed to listening to feedback to improve our AI technologies.”
An OpenAI spokesperson said: “If you’re using ChatGPT to research consumer products, we recommend choosing the built-in search tool. It shows you where the information is coming from and gives you links so you can check it yourself. Improving accuracy is something the whole industry is working on. We’re making good progress, and our latest default model, GPT-5, is the smartest and most accurate model we’ve developed.”