ChatGPT passes a key test for telling machines from humans

GPT-4.5 is the largest model OpenAI has ever built. Source: The Verge.

A new study from the Department of Cognitive Science at the University of California, San Diego marks a milestone in the field of artificial intelligence: OpenAI's GPT-4.5 model was judged to be human more often than not in a Turing test when it was prompted to interact using a "personality"-based approach.

It is being described as the most human-like conversational AI system to date, a result that opens up many potential applications in the field of social intelligence.

GPT-4.5 is billed by OpenAI as “a major leap forward in scaling pre-training and post-training.” It is the largest model OpenAI has ever built, with a size and computational power that surpass those of previous versions.

According to OpenAI's official blog post of February 27, 2025, GPT-4.5 began rolling out to ChatGPT Pro users on the day of its announcement.

Can AI fool humans?

The experiment compared four representative AI systems: the 1960s chatbot ELIZA, Meta AI’s LLaMa-3.1-405B, and OpenAI’s GPT-4o and GPT-4.5. The team designed two independent tests with 250 participants each, for a total of 500 people recruited from online platforms such as Prolific. Participants varied in age, gender, and education level to ensure a diverse sample.

Comparison table of four typical AI systems. Source: AIbase

The test used the traditional Turing format: each participant chatted via a text interface with two subjects (one human, one AI) for 5 minutes, then judged which one was human.

The results were surprising: GPT-4.5 was judged to be the human in up to 73% of conversations, a “Turing test pass” rate above the human average (60-70%). This is the first time an AI model has actually “passed” the standard Turing test. Meanwhile, GPT-4o scored slightly lower, LLaMa-3.1-405B approached or reached human performance in some contexts, and ELIZA fell far short.
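To make the "pass rate" concrete, here is a minimal sketch of how such a figure could be computed and compared against guessing. It assumes that each round ends with the judge picking one of two subjects as the human, so a judge who cannot tell them apart would pick the AI about 50% of the time. The 100 rounds below are hypothetical illustration data, not the study's records, and the exact binomial test is a swapped-in technique, since the article does not say which statistical method the researchers used.

```python
from math import comb


def win_rate(verdicts):
    """Fraction of rounds in which the judge picked the AI as the human."""
    return sum(verdicts) / len(verdicts)


def p_value_above_chance(successes, trials, chance=0.5):
    """One-sided exact binomial test.

    Returns the probability of seeing at least `successes` AI picks in
    `trials` rounds if judges were only guessing at rate `chance`.
    """
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical data: 100 rounds, AI picked as "the human" in 73 of them.
verdicts = [True] * 73 + [False] * 27

rate = win_rate(verdicts)
p_value = p_value_above_chance(sum(verdicts), len(verdicts))

print(f"AI judged human in {rate:.0%} of rounds")
print(f"p-value against 50% guessing: {p_value:.2e}")
```

A rate near 50% would mean judges could not distinguish the AI from the human. A rate well above 50% with a small p-value would mean judges picked the AI over the real person more often than guessing would explain.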

Ability to interact like a human

What stood out about GPT-4.5 was not just its fluency in language, but also its ability to express emotion and adapt its responses to the nuances of its interlocutor's communication. Many participants described it as "friendly" and "authentic."

Notably, when users appeared confused or stressed, GPT-4.5 could offer humorous or comforting responses, leading many to believe they were chatting with a real person.

Conversation between two subjects (one AI, one human) during the test. Photo: UC San Diego.

Meanwhile, LLaMa-3.1-405B, while technically impressive, was less expressive and less contextually adaptive than GPT-4.5. GPT-4o, while powerful, fell short in personalization and in adapting its responses to the situation.

The GPT-4.5 breakthrough could open up a range of practical applications, from virtual tutors to psychological support to customer care. However, as AI becomes more human-like, telling the real from the artificial and regulating how this technology is used will become key societal challenges.

The research comes amid rapid advances in AI. The success of GPT-4.5 is not only a technical triumph for OpenAI, but also raises profound questions about the relationship between humans and machines. One tester commented that it felt like he was talking to a friend – until he realized it was all just lines of code. The dialogue between humans and AI may have only just begun.

Source: https://znews.vn/chatgpt-da-vuot-qua-bai-danh-gia-quan-trong-xac-dinh-may-nguoi-post1542945.html