Vietnam.vn - A platform promoting Vietnam

Warning about ChatGPT 'hallucinations'

Recent tests have shown that GPT o3 and o4-mini – the most powerful models in OpenAI's product portfolio – fabricate even more false information than their predecessors.

ZNews, 20/04/2025

The two newly launched ChatGPT models fabricate information more frequently than the previous generation. Photo: Fireflies.

Just two days after announcing GPT-4.1, OpenAI officially launched not one, but two new models, named o3 and o4-mini. Both models demonstrate superior reasoning capabilities with many powerful improvements.

However, according to TechCrunch, these two new models still "hallucinate", i.e. invent information on their own. In fact, they hallucinate more than some of OpenAI's older models.

According to IBM, hallucination is a phenomenon in which large language models (LLMs) – often chatbots or computer vision tools – perceive patterns that do not exist or are imperceptible to humans, and consequently produce nonsensical or inaccurate output.

In other words, users expect AI to produce accurate answers grounded in its training data. In some cases, however, the AI's output is not based on accurate data, resulting in a "fabricated" response.

In its latest report, OpenAI found that o3 "hallucinated" when answering 33% of the questions on PersonQA, the company's internal benchmark for measuring the accuracy of a model's knowledge about people.

For comparison, this is roughly double the hallucination rate of OpenAI's previous reasoning models, o1 and o3-mini, which scored 16% and 14.8%, respectively. The o4-mini model fared even worse on PersonQA, hallucinating on 48% of the questions.

More concerning still, the "father of ChatGPT" does not actually know why this is happening. In its technical report on o3 and o4-mini, OpenAI states that "more research is needed to understand why hallucinations worsen" as reasoning models are scaled up.

o3 and o4-mini do perform better in some areas, including programming and mathematical tasks. However, because they tend to "make more claims overall," both models end up producing "more accurate claims, but also more inaccurate claims."

Source: https://znews.vn/canh-bao-ve-chatgpt-ao-giac-post1547242.html
