Opinion: Harnessing AI handyman’s tools

ARTIFICIAL Intelligence’s tendency to hallucinate will not disappear in the near future. According to the Vectara Hallucination Leaderboard, large language models like Claude and ChatGPT continue to produce incorrect responses in up to 10.8 percent of cases.
At the same time, artificial intelligence is blamed for large-scale layoffs in the technology industry. How can we maintain accuracy and truth if humans are removed from the process, especially given that the people building these systems do not fully understand what hallucinations are or why they occur?
“These are essentially what you might call black box systems,” said Nofil Khan, founder of Sydney-based artificial intelligence consultancy Avicenna.
“We designed the algorithms, we gave it the data, and the output is that it works really well. But we don’t actually know what’s going on.”
Step one
The first step in reducing AI hallucinations is for individuals to ask whether they need AI in the first place.
“A lot of companies come to me and say, ‘Hey, we want to implement AI,’ and I say, ‘Okay, great, why?’ And then there’s no answer. Or they say competitors are doing it and they should do it too,” Mr Khan said.
As a result, businesses are using AI in places where it is neither necessary nor strategically sound.
Agentic AI
While LLMs like Claude and ChatGPT are designed to generate content, agentic AI uses LLMs to orchestrate other tools rather than trying to do everything itself.
Wei Liu, associate professor of computer science and software engineering at the University of Western Australia, describes AI agents as being like conductors of an orchestra.
In this vein, Mr Khan said agentic AI works from a defined set of tools, performing precise work by selecting the precise tool for each task.
For example, users can create an executive assistant agent that triages and responds to emails.
AI agents, while still powered by LLMs, can help reduce hallucinations thanks to the many precise, non-LLM tools they have at their disposal.
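The idea can be sketched in a few lines. This is a minimal illustration, not a real agent framework: the tool names and the fact that the caller supplies the tool choice are assumptions for brevity, whereas a production agent would let the LLM pick the tool via structured function-calling.

```python
# Sketch: an agent delegates to precise, deterministic tools instead of
# letting the language model answer everything itself.
from datetime import date


def calculator(expression: str) -> str:
    # Deterministic arithmetic -- no hallucination possible here.
    # (eval is restricted to plain expressions with no builtins.)
    return str(eval(expression, {"__builtins__": {}}, {}))


def today(_: str) -> str:
    return date.today().isoformat()


# The agent's "toolbox" -- hypothetical names for illustration.
TOOLS = {"calculate": calculator, "date": today}


def agent(request: str, tool_name: str, tool_input: str) -> str:
    # In a real system the LLM would choose tool_name and tool_input
    # itself; here the caller supplies them to keep the sketch short.
    return TOOLS[tool_name](tool_input)


print(agent("What is 17 * 3?", "calculate", "17 * 3"))  # -> 51
```

Because the arithmetic is done by ordinary code rather than generated text, the answer is exact every time the tool is chosen correctly.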
AI harnesses
“An AI harness is the system that holds and guides an AI model,” Mr Khan told Business News.
“The harness constrains the model’s ability and allows the model to understand what tools it can use and when it should use them.
“If the AI model is the mechanic, the harness is the toolbox and manual that tells it which tools to use and when.”
He said the harness helps reduce hallucinations by keeping the AI focused on what the user wants, while also governing how it operates.
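A harness can be as simple as a wrapper that refuses any tool outside an approved list. The sketch below is a hypothetical illustration of that idea, assuming invented tool names; real harnesses also validate inputs, log calls, and enforce when each tool may run.

```python
# Sketch of an AI "harness": a wrapper that constrains which tools
# a model may call, instead of letting it improvise.
class Harness:
    def __init__(self, allowed_tools):
        # The approved toolbox: name -> callable.
        self.allowed_tools = dict(allowed_tools)

    def run(self, tool_name, *args):
        # Refuse anything outside the approved toolbox.
        if tool_name not in self.allowed_tools:
            raise PermissionError(f"tool '{tool_name}' is not permitted")
        return self.allowed_tools[tool_name](*args)


# Hypothetical example: the model may look up case files, nothing else.
harness = Harness({"lookup_case": lambda case_id: f"summary of {case_id}"})

print(harness.run("lookup_case", "2021-042"))  # allowed
# harness.run("send_email", "...")             # raises PermissionError
```

The point is that the model's reach is bounded by code the user controls, not by the model's own judgement.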
RAG
Dr Liu said retrieval-augmented generation (RAG) reduces hallucinations by grounding AI responses in validated internal data, such as structured databases or collections of corporate documents not usually available on the public internet.
This, he said, reduces hallucinations by effectively tailoring the model to a specific domain, while also giving users greater confidence in the LLM’s sources of information.
Mr Khan said a practical example of this was that of a law firm that wanted to use its past case archive.
“You can ask the AI, ‘In our previous cases, note all the different situations where xyz occurred’,” he said.
“It can then go through and review hundreds, thousands of cases to find the most relevant cases for that input.”
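The law firm example boils down to two steps: retrieve the most relevant internal documents, then ground the model's prompt in them. The sketch below is a toy version under stated assumptions: the case texts are invented, and the word-overlap scorer stands in for the vector search a real RAG system would use.

```python
# Minimal RAG sketch: retrieve relevant internal documents, then build
# a prompt that grounds the model in those documents only.

# Invented, illustrative case archive -- not real data.
CASE_ARCHIVE = {
    "case_001": "Dispute over a commercial lease termination clause.",
    "case_002": "Negligence claim following a workplace injury.",
    "case_003": "Lease renewal dispute between landlord and tenant.",
}


def retrieve(query: str, k: int = 2) -> list[str]:
    # Toy relevance score: count of shared words. Real systems use
    # embeddings and a vector index instead.
    q = set(query.lower().split())
    scored = sorted(
        CASE_ARCHIVE.items(),
        key=lambda item: len(q & set(item[1].lower().split())),
        reverse=True,
    )
    return [doc_id for doc_id, _ in scored[:k]]


def build_prompt(query: str) -> str:
    # Ground the model: instruct it to answer only from retrieved text.
    context = "\n".join(CASE_ARCHIVE[d] for d in retrieve(query))
    return f"Answer using only these sources:\n{context}\n\nQuestion: {query}"


print(retrieve("lease dispute"))  # -> ['case_001', 'case_003']
```

Because the prompt carries the firm's own documents, the model's answer is anchored to material the user can check, rather than to whatever the model recalls from public training data.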
Garbage in, garbage out
None of the approaches mentioned above can completely eliminate hallucinations. Therefore, the final and arguably most important protection is the quality of the data at both ends of the AI process.
On the front end, this means making sure good data goes in, especially when using RAG, Mr Khan said.
“[AI is] only as good as the training data,” he added.
Human verification on the backend remained critical. “At the end of the day, you still need a human being to be involved,” Dr Liu told Business News.
Beyond verification, there also needs to be accountability. Artificial intelligence can produce output but cannot take responsibility for it.
This still belongs to us.
• Dr Kate Raynes-Goldie is chief curiosity officer at The Up Next Company, Oceania’s leading LEGO® Serious Play® expert, engaging keynote speaker and creator of SUPERCONNECT. Since 2002, she has been helping people understand and wonder about innovation, technology and the future.


