Vietnam.vn - Vietnam promotion platform

AI's intelligence is being challenged with the game Super Mario

Super Mario has become the new 'playground' in the race between AI models.

Báo Thanh niên, 05/03/2025

According to TechCrunch, many people considered Pokémon the toughest test for artificial intelligence (AI). The challenges have not stopped there: researchers at the University of California San Diego (USA) recently launched a new test built on the game Super Mario Bros. The results show that not every AI model can 'reach the finish line'.

Mario games are being used to test the performance of large AI models

PHOTO: TECHCRUNCH SCREENSHOT

Super Mario poses a huge challenge for AI models

Hao AI Labs put several of today's leading language models into the world of Mario to test their capabilities. Anthropic's Claude 3.7 performed best, followed by Claude 3.5. Google's Gemini 1.5 Pro and OpenAI's GPT-4o, by contrast, struggled to play the game on their own.

It's worth noting that this is not exactly the original 1985 Super Mario Bros. The game runs in an emulator integrated with the GamingAgent framework, which lets the AI control Mario. GamingAgent gives the AI basic instructions and screenshots of the game, and the AI then generates Python code to control the character.
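The loop described above (screenshot in, Python code out, code executed against the game) can be sketched roughly as follows. This is only an illustration under stated assumptions: `FakeEmulator`, `stub_model`, `run_agent`, and the instruction text are hypothetical names invented for this example and are not the GamingAgent API; a real setup would call an actual emulator and an actual model.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical instruction text, for illustration only.
INSTRUCTIONS = "If an obstacle or enemy is near, move or jump to dodge it."


@dataclass
class FakeEmulator:
    """Stand-in for a game emulator: tracks Mario's x position and button presses."""
    x: int = 0
    log: List[str] = field(default_factory=list)

    def screenshot(self) -> str:
        # A real agent would capture pixels; a text summary stands in here.
        return f"mario_x={self.x}"

    def press(self, button: str) -> None:
        self.log.append(button)
        if button == "right":
            self.x += 1


def stub_model(instructions: str, screenshot: str) -> str:
    """Stand-in for a language model: returns Python source that presses buttons."""
    return "press('right')\npress('jump')"


def run_agent(emulator: FakeEmulator,
              model: Callable[[str, str], str],
              steps: int) -> None:
    for _ in range(steps):
        frame = emulator.screenshot()            # 1. observe the game
        code = model(INSTRUCTIONS, frame)        # 2. ask the model for code
        # 3. run the generated code, exposing only `press`.
        #    Note: emptying builtins is NOT a real security sandbox.
        exec(code, {"__builtins__": {}}, {"press": emulator.press})


emu = FakeEmulator()
run_agent(emu, stub_model, steps=3)
print(emu.x, emu.log)
```

Each pass through the loop costs one model call, so the time the model takes to answer directly limits how quickly the character can react.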

According to Hao AI Labs, the game forces models to 'learn' to plan complex maneuvers and develop gameplay strategies. Interestingly, 'reasoning' models such as OpenAI's o1, which are stronger than 'non-reasoning' models on most tests, fared worse here.

The reason given is that reasoning models take time to decide on each action, while Super Mario Bros. demands quick reflexes: a one-second delay can mean failure.
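The timing argument can be made concrete with a small back-of-the-envelope sketch. The function name `clears_pit` and all numbers below are illustrative assumptions, not measurements from the study: the point is only that a decision which arrives after Mario reaches the hazard is useless, however good it is.

```python
def clears_pit(distance_to_pit: float, speed: float, latency: float) -> bool:
    """Return True if a jump decided now is issued before Mario reaches the pit.

    distance_to_pit: tiles between Mario and the pit edge
    speed: Mario's running speed in tiles per second
    latency: seconds the model needs to produce its decision
    """
    time_until_pit = distance_to_pit / speed
    return latency < time_until_pit


# Illustrative numbers only.
DISTANCE = 3.0   # tiles
SPEED = 4.0      # tiles per second, so the pit is 0.75 s away

for name, latency in [("quick model", 0.3), ("slow reasoning model", 1.5)]:
    outcome = "jumps in time" if clears_pit(DISTANCE, SPEED, latency) else "falls"
    print(f"{name}: {outcome}")
```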

Games have been used to evaluate AI for a long time, but many experts doubt how meaningful the method is. They argue that games are too simple and supply too much training data, so the results do not reflect AI's true capabilities in the real world.

Andrej Karpathy, a research scientist at OpenAI, calls this an 'evaluation crisis'. He acknowledges that there is currently no reliable metric for assessing AI capabilities.

While debate continues over how accurately games measure AI, watching AI 'fight' its way through Mario's world is still an interesting experience and helps people better understand what these models can do.

