Vietnam.vn - Platform for promoting Vietnam

AI's intelligence is being put to the test with the game Super Mario

Super Mario games have become the new 'playground' in the race between AI models.

Báo Thanh niên, 05/03/2025

According to TechCrunch, many people consider Pokémon the toughest test for artificial intelligence (AI). But the challenges for AI do not stop there: researchers at the University of California San Diego (USA) recently launched a new one using the game Super Mario Bros. The results show that not every AI can 'reach the finish line'.

Mario games are being used to test the performance of large AI models

PHOTO: TECHCRUNCH SCREENSHOT

Super Mario poses a huge challenge for AI models

Hao AI Lab put AI models into the world of Mario to test the capabilities of today's leading language models. Anthropic's Claude 3.7 performed best, followed by Claude 3.5. Meanwhile, Google's Gemini 1.5 Pro and OpenAI's GPT-4o had more difficulty playing the game.

It's worth noting that this isn't quite the original 1985 Super Mario Bros. The game runs in an emulator integrated with the GamingAgent framework, which lets the AI control Mario. GamingAgent gives the AI basic instructions along with screenshots of the game, and the AI then generates Python code to control the character.
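To make the loop described above concrete, here is a minimal sketch of how such a setup could work. It is not GamingAgent's actual code or API: `FakeEmulator`, `stub_model`, and `run_agent` are hypothetical names invented for illustration, the screenshot is a text label rather than an image, and the model is a stub that always returns the same two button presses.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class FakeEmulator:
    """Hypothetical stand-in for an emulator: records button presses instead of running a game."""
    frame: int = 0
    pressed: List[str] = field(default_factory=list)

    def screenshot(self) -> str:
        # A real setup would return image data; a text label keeps the sketch runnable.
        return f"frame-{self.frame}"

    def press(self, button: str, frames: int = 1) -> None:
        # Record the button and advance the frame counter by how long it was held.
        self.pressed.append(button)
        self.frame += frames


def stub_model(instructions: str, screenshot: str) -> str:
    """Hypothetical model call: returns Python source that drives the controller."""
    return "controller.press('right', frames=10)\ncontroller.press('A', frames=5)"


def run_agent(emulator: FakeEmulator,
              model: Callable[[str, str], str],
              instructions: str,
              steps: int) -> FakeEmulator:
    """Repeat: take a screenshot, ask the model for code, run that code against the emulator."""
    for _ in range(steps):
        code = model(instructions, emulator.screenshot())
        # Running model-generated code with exec is unsafe outside a sandbox;
        # the restricted namespace here is illustrative only, not a security boundary.
        exec(code, {"__builtins__": {}}, {"controller": emulator})
    return emulator


if __name__ == "__main__":
    emu = run_agent(FakeEmulator(), stub_model, "Avoid enemies and reach the flag.", steps=2)
    print(emu.pressed, emu.frame)
```

In a real system the stub would be replaced by a call to a language model, and the generated code would need to run in an isolated environment.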

According to the lab, the game forces models to 'learn' to plan complex moves and develop gameplay strategies. Interestingly, 'reasoning' models such as OpenAI's o1, which score higher on most benchmarks, struggled more than 'non-reasoning' models.

The reason given is that reasoning models take longer to decide on each action, while Super Mario Bros. demands quick reflexes. A one-second delay can lead to failure.
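A rough calculation shows why the delay matters. The sketch below assumes the game advances at about 60 frames per second (an assumption, not a figure from the article), and the two latencies are hypothetical values chosen only to illustrate the gap between a fast and a slow model.

```python
FPS = 60  # assumed frame rate; not stated in the article


def frames_elapsed(latency_seconds: float, fps: int = FPS) -> int:
    """Number of game frames that pass while the model is still deciding."""
    return round(latency_seconds * fps)


if __name__ == "__main__":
    # Hypothetical decision latencies, for illustration only.
    for name, latency in [("fast model", 0.2), ("slow reasoning model", 3.0)]:
        print(f"{name}: {frames_elapsed(latency)} frames pass before it acts")
```

Under these assumptions, a model that needs one second to respond acts on a screenshot that is already about 60 frames out of date.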

Games have long been used to evaluate AI, but many experts are skeptical about how accurate this method is. They argue that games are far simpler than the real world and offer an abundance of training data, so results in games do not reflect AI's true capabilities in the real world.

Andrej Karpathy, a research scientist and founding member of OpenAI, calls this an 'evaluation crisis'. He admits that there is currently no accurate metric for assessing AI capabilities.

While the debate over the accuracy of evaluating AI through games continues, watching AI 'fight' its way through Mario's world is still an interesting experience and helps people better understand what AI can do.

