Vietnam.vn - Platform for promoting Vietnam

Announcing a benchmark for assessing the reasoning and interaction abilities of Vietnamese LLMs

Zalo AI and the Japan Advanced Institute of Science and Technology (JAIST) have introduced a new version of VMLU, encouraging the Vietnamese AI community to refine advanced LLMs.

ZNews, 01/10/2025

First introduced in 2023, VMLU (Vietnamese Multitask Language Understanding) has become a pioneering “Make in Vietnam” benchmark, motivating many domestic research groups to improve the quality of Vietnamese large language models (LLMs).

According to figures for 2024, VMLU listed 45 LLMs on its leaderboard, received evaluation requests from more than 155 organizations and individuals, and recorded 691 downloads of the benchmark and 3,729 LLM evaluations on the platform. The benchmark is used by many domestic and foreign organizations, such as VinBigData, VNPT AI, Viettel Solutions, University of Science and Technology - VNU-HCM, UONLP x Ontocord - University of Oregon (USA), DAMO Academy - Alibaba Group, SDSRV teams - Samsung...

[Image 1: Zalo AI and JAIST introduce a new version of VMLU.]

As AI models become increasingly capable, VMLU has been upgraded to assess more complex competencies. Specifically, the expanded benchmark assesses three core skills of a modern LLM:

Reading Comprehension (ViSQuAD): 3,310 questions assess the ability to understand text in depth and to handle complex questions grounded in the specific characteristics of the Vietnamese language and context.

Reasoning (ViDrop): 3,090 questions challenge an LLM's logical reasoning through tasks such as comparison, counting, and arithmetic.

Interaction (ViDialog): 210 dialogues assess coherence, contextual understanding, and the application of multidisciplinary knowledge (history, geography, logic) in conversation.

The highlight of the new benchmark is its more advanced assessment method, which combines several formats: multiple-choice questions, open-ended questions, and tasks that require step-by-step reasoning. In particular, VMLU applies the "LLM as a judge" method (using one LLM to evaluate another), an approach the global AI community is adopting to obtain more objective results at large scale.
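To make the "LLM as a judge" idea concrete, here is a minimal Python sketch of how such a grading loop might look. It is not VMLU's implementation: the prompt template, the 1-5 scale, and the names `judge_answer`, `mean_score`, and `stub_judge` are all assumptions made for illustration, and the stub stands in for a real model call so the example runs offline.

```python
import re
from typing import Callable, Iterable, Optional

# Hypothetical rubric prompt; VMLU's actual judge prompt is not described here.
JUDGE_TEMPLATE = (
    "You are grading an answer to a Vietnamese-language question.\n"
    "Question: {question}\n"
    "Reference answer: {reference}\n"
    "Model answer: {answer}\n"
    "Reply with 'Score: N', where N is an integer from 1 (wrong) to 5 (fully correct)."
)


def judge_answer(judge: Callable[[str], str], question: str,
                 reference: str, answer: str) -> Optional[int]:
    """Ask the judge model for a 1-5 score; return None if the reply cannot be parsed."""
    prompt = JUDGE_TEMPLATE.format(question=question, reference=reference, answer=answer)
    reply = judge(prompt)
    match = re.search(r"Score:\s*([1-5])\b", reply)
    return int(match.group(1)) if match else None


def mean_score(scores: Iterable[Optional[int]]) -> float:
    """Average the parsed scores, skipping replies the judge failed to score."""
    valid = [s for s in scores if s is not None]
    return sum(valid) / len(valid) if valid else 0.0


def stub_judge(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption): full marks if the reference appears in the answer."""
    reference = re.search(r"Reference answer: (.*)", prompt).group(1)
    answer = re.search(r"Model answer: (.*)", prompt).group(1)
    return "Score: 5" if reference.strip().lower() in answer.lower() else "Score: 1"


if __name__ == "__main__":
    question = "Thủ đô của Việt Nam là gì?"
    scores = [
        judge_answer(stub_judge, question, "Hà Nội", "Thủ đô của Việt Nam là Hà Nội."),
        judge_answer(stub_judge, question, "Hà Nội", "Huế"),
    ]
    print(scores, mean_score(scores))
```

In practice, `stub_judge` would be replaced by a call to a strong judge model, and unparseable replies (returned as `None` here) would typically be logged and re-queried rather than silently skipped.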

With 10,880 multiple-choice questions covering 58 topics at multiple difficulty levels, the 2023 version focused on assessing the foundational knowledge of LLMs. The new benchmark goes a step further, measuring the reasoning and interaction abilities of LLMs in real-life contexts. This upgrade not only helps developers evaluate models more comprehensively, but also pushes LLMs to deliver real value to end users.

[Image 2: The expanded set of criteria assesses the three core skills of a modern LLM.]

“There are currently hundreds of different benchmarks in the world to evaluate the capabilities of large language models. However, the number of benchmarks specifically for Vietnamese is very limited. With the launch of benchmarks in 2023 and 2025, we hope to diversify the assessment aspects,” said Dr. Chau Thanh Duc, Director of Artificial Intelligence Research & Development at Zalo AI.

The new set of standards has been launched on the VMLU website https://vmlu.ai/ for individuals and research groups to evaluate their models.

[Image 3: The new set of standards has been launched on the VMLU website.]

With the cooperation of leading experts at Zalo AI and JAIST, VMLU will continue to research and develop assessment standards that are more diverse in subject area and difficulty. In the future, VMLU also aims to develop standards for assessing safety and integrity, helping ensure that LLMs are developed responsibly.

Source: https://znews.vn/bo-tieu-chuan-make-in-vietnam-danh-gia-suy-luan-tuong-tac-cua-llm-post1589609.html

