Vietnam.vn - Platform for promoting Vietnam

DeepSeek reveals its secrets.

DeepSeek has revealed for the first time how it built one of the world's leading open-source AI models at low cost, thanks to the co-design of hardware and software.

ZNews, 19/05/2025

DeepSeek reveals how it builds low-cost AI models. Photo: Bloomberg.

In a research report published on May 15th, DeepSeek shared for the first time details on how it built one of the world's most powerful open-source AI systems at a fraction of the cost of its competitors.

The study, titled “Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures,” is co-authored by founder Liang Wenfeng. DeepSeek attributes its success to the co-design of hardware and software, an approach that sets it apart from the many companies that still optimize software independently of the hardware it runs on.

“DeepSeek-V3, trained on 2,048 NVIDIA H800 GPUs, demonstrates how hardware-aware model co-design can effectively address these challenges, enabling cost-efficient training and inference at scale,” the research team wrote in the report. DeepSeek and its parent hedge fund, High-Flyer, had stockpiled H800 chips before the US banned their export to China in 2023.

According to the paper, the DeepSeek research team was well aware of the hardware limitations and the exorbitant cost of training large language models (LLMs), the underlying technology behind chatbots like OpenAI's ChatGPT. The team therefore implemented a series of technical optimizations to improve memory efficiency, speed up communication between chips, and raise the overall efficiency of the AI infrastructure.

Furthermore, DeepSeek emphasizes the role of the Mixture-of-Experts (MoE) architecture. This is a machine learning method that divides the AI model into subnetworks, or "experts," each of which processes a separate portion of the input data, with their outputs combined to produce the final result.

Because only a few experts are active for any given input, MoE helps reduce training costs and speed up inference. This method is now widely adopted in the Chinese tech industry, including in Alibaba's latest Qwen3 model.
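
The routing idea behind MoE can be illustrated with a minimal sketch. This is a toy example written for this article, not DeepSeek's implementation: the class name ToyMoE, the expert count, the top_k value, and the random linear experts are all assumptions chosen for illustration. A gate scores every expert for a given input, only the top-scoring few are run, and their outputs are blended by the gate weights.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(rng, dim):
    """Build one hypothetical expert: a random dim x dim linear map."""
    weights = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]

    def expert(x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

    return expert


class ToyMoE:
    """Toy Mixture-of-Experts layer (illustrative assumption, not DeepSeek's code)."""

    def __init__(self, num_experts=8, top_k=2, dim=4, seed=0):
        rng = random.Random(seed)
        self.top_k = top_k
        self.experts = [make_expert(rng, dim) for _ in range(num_experts)]
        # One gating vector per expert; its dot product with x is that expert's score.
        self.gate = [[rng.uniform(-1, 1) for _ in range(dim)]
                     for _ in range(num_experts)]

    def route(self, x):
        """Return (expert_index, weight) pairs for the top_k highest-scoring experts."""
        scores = [sum(g * xi for g, xi in zip(row, x)) for row in self.gate]
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        chosen = ranked[:self.top_k]
        weights = softmax([scores[i] for i in chosen])
        return list(zip(chosen, weights))

    def forward(self, x):
        """Run only the selected experts and blend their outputs by gate weight."""
        out = [0.0] * len(x)
        for index, weight in self.route(x):
            y = self.experts[index](x)
            out = [o + weight * yi for o, yi in zip(out, y)]
        return out


moe = ToyMoE()
token = [1.0, 0.5, -0.3, 2.0]
print(moe.route(token))    # which 2 of the 8 experts handle this input
print(moe.forward(token))  # blended output of those 2 experts
```

In this sketch only 2 of the 8 experts do any work per input, which is where the cost saving comes from: the model can hold many parameters while spending compute on a small fraction of them at a time.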

DeepSeek first gained attention when it released its V3 base model in December 2024 and its R1 reasoning model in January 2025. These releases caused a stir in global markets, contributing to a widespread drop in AI-related technology stocks.

Although DeepSeek has not announced any further plans recently, it has kept the community's interest through regular reports. In late March, the company released a minor update to DeepSeek-V3, and at the end of April it quietly launched Prover-V2, a system for working with mathematical proofs.

Source: https://znews.vn/deepseek-tiet-lo-bi-mat-post1554222.html

