
Huawei claims its AI training approach is better than DeepSeek's

Huawei's progress in AI model architecture is significant as the company seeks to reduce its dependence on US technologies.

ZNews, 05/06/2025

Huawei's Ascend chips delivered strong performance running techniques improved from DeepSeek's AI training approach. Photo: Reuters.

Researchers working on Huawei's Pangu large language model (LLM) announced on June 4 that they had improved on DeepSeek's original approach to training artificial intelligence (AI) by leveraging Huawei's own proprietary hardware, SCMP reported.

Specifically, the paper published by Huawei's Pangu team, credited to 22 core contributors and 56 additional researchers, introduces the concept of Mixture of Grouped Experts (MoGE), an upgraded version of the Mixture of Experts (MoE) technique that played a key role in DeepSeek's cost-effective AI models.

According to the paper, while MoE offers low execution costs for models with very large parameter counts and strong learning capacity, it often leads to inefficiency: because each input activates only a subset of the experts, the activation load is frequently uneven across experts, which hurts performance when the model runs on multiple devices in parallel.
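To see where that imbalance comes from, here is a minimal sketch of conventional top-k MoE routing in Python. The expert count, the random router, and the top-k value are illustrative assumptions rather than details from Huawei's paper; the point is that plain top-k routing places no bound on how many tokens any single expert receives.

```python
import numpy as np

rng = np.random.default_rng(0)

num_tokens, hidden, num_experts, top_k = 1024, 64, 16, 2

# Token representations and a router (random weights stand in for learned ones).
tokens = rng.normal(size=(num_tokens, hidden))
router_w = rng.normal(size=(hidden, num_experts))

# Conventional MoE: each token picks its global top-k experts by router score.
scores = tokens @ router_w
chosen = np.argsort(scores, axis=1)[:, -top_k:]   # shape: (num_tokens, top_k)

# Count how many tokens each expert receives. Nothing bounds this per expert,
# so some experts (and the devices hosting them) can end up overloaded while
# others sit idle -- the uneven activation the paper describes.
load = np.bincount(chosen.ravel(), minlength=num_experts)
print("tokens per expert:", load)
```

With a trained router the skew tends to be worse, since gating networks favor a few experts unless an auxiliary load-balancing loss is added.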

MoGE, by contrast, groups the experts during the selection process, which the researchers say balances the workload across the "experts" far more evenly.

In AI training, the term “expert” refers to a specialized sub-model or component within a larger model. Each expert is designed to handle specific tasks or distinct types of data, allowing the overall system to draw on diverse specializations to improve performance.
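Building on the sketch above, a grouped selection in the spirit of MoGE can be illustrated as follows: the experts are partitioned into equal groups (in practice these would map onto devices), and each token selects the same number of experts within every group. The group count and per-group top-k here are assumptions for illustration, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

num_tokens, hidden = 1024, 64
num_groups, experts_per_group, k_per_group = 4, 4, 1  # 16 experts, 4 picked per token

tokens = rng.normal(size=(num_tokens, hidden))
router_w = rng.normal(size=(hidden, num_groups * experts_per_group))

# Grouped selection: reshape scores to (tokens, groups, experts_per_group) and
# take the top-k *within each group* instead of a global top-k over all experts.
scores = (tokens @ router_w).reshape(num_tokens, num_groups, experts_per_group)
chosen_in_group = np.argsort(scores, axis=2)[:, :, -k_per_group:]

# Map group-local indices back to global expert indices.
group_offsets = (np.arange(num_groups) * experts_per_group)[None, :, None]
chosen = (chosen_in_group + group_offsets).reshape(num_tokens, -1)

# Each group now receives exactly num_tokens * k_per_group assignments, so when
# groups map one-to-one onto devices, per-device load is balanced by design.
# (Individual experts inside a group can still vary.)
load = np.bincount(chosen.ravel(), minlength=num_groups * experts_per_group)
print("tokens per expert:", load)
print("tokens per group: ", load.reshape(num_groups, experts_per_group).sum(axis=1))
```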

According to Huawei, the training process consists of three main phases: pre-training, long-context expansion, and post-training. Pre-training covered 13.2 trillion tokens, and long-context expansion was carried out on 8,192 Ascend chips, Huawei's most powerful AI processors, which the company is positioning to challenge Nvidia's dominance in high-end chip design.
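Schematically, the reported pipeline looks like the sketch below. Only the phase names, the 13.2-trillion-token figure, and the 8,192-chip figure come from the article; the structure and field names are illustrative assumptions.

```python
# Schematic of the reported three-phase training pipeline. Only the phase
# names and the two figures come from the article; everything else is an
# illustrative assumption.
training_pipeline = [
    {"phase": "pre-training", "tokens": 13_200_000_000_000},
    {"phase": "long-context expansion", "hardware": "8,192 Ascend NPUs"},
    {"phase": "post-training"},
]

for stage in training_pipeline:
    print(stage)
```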

By testing the new architecture on an Ascend neural processing unit (NPU) specifically designed to accelerate AI tasks, the researchers found that MoGE “results in better expert load balancing and more efficient performance for both model training and inference.”

As a result, compared with models such as DeepSeek-V3, Alibaba's Qwen2.5-72B, and Meta Platforms' Llama-405B, Pangu came out ahead on most general English benchmarks and on all Chinese benchmarks, and it demonstrated superior performance in long-context training.

Source: https://znews.vn/huawei-tuyen-bo-huan-luyen-ai-tot-hon-deepseek-post1558359.html

