DeepSeek develops a model that can self-verify mathematical inferences

DeepSeek has developed an AI model that not only writes code but also checks its own work and proves it correct.

DeepSeekMath-V2 has posted record results in rigorous academic competitions. Notably, the model reached gold-medal level on the problems of the 2025 International Mathematical Olympiad (IMO) and scored a striking 118/120 on the Putnam 2024 exam, well above the top human score of 90 in that competition.

But what really makes this model groundbreaking is not the score but its "self-verification" capability.

DeepSeek's self-verification and error correction mechanism

For many years, artificial intelligence (AI) models, and large language models (LLMs) in particular, have had a serious weakness on problems that demand strict logic, such as mathematics: "wrong reasoning but correct answer".

That is, the model may arrive at the correct final answer by chance, while the inference steps, formulas, or logical arguments that led to it are incorrect, incomplete, or hallucinated.

In science, engineering and mathematics, a correct answer reached through a wrong solution has little value and significantly reduces the reliability of an AI system. DeepSeekMath-V2 was created to address this unreliability.

Self-verification is at the core of DeepSeekMath-V2's success. It acts as an "internal auditor" of the AI's thinking process. Instead of making a single inference and outputting an answer, the model works in two roles.

The first is the proving role, where the model generates an initial chain of arguments and solutions. The model then automatically triggers an internal checker system, which reviews each logical step of the chain of arguments just generated, looking for errors, inconsistencies, or unreasonable leaps.

This process is very similar to how the IMO-ProofBench evaluation system works, where one AI generates an argument and another AI verifies it. By repeating this cross-checking until the checker finds no remaining flaws in the chain of arguments, DeepSeekMath-V2 aims to ensure that not only the answer but also the path to it is correct and transparent.
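The prove-then-check cycle described above can be sketched in a few lines of Python. This is only an illustrative toy, not DeepSeek's implementation: the names `prove_then_verify`, `Review`, `toy_generate`, `toy_verify` and `toy_refine` are hypothetical, and the "proof" is a list of simple arithmetic steps standing in for the natural-language arguments a real model would produce and check.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Proof = List[str]  # toy stand-in: each step is a string such as "5 * 4 = 20"


@dataclass
class Review:
    ok: bool
    bad_step: Optional[int] = None  # index of the first step the checker rejects
    note: str = ""


def prove_then_verify(
    generate: Callable[[], Proof],
    verify: Callable[[Proof], Review],
    refine: Callable[[Proof, Review], Proof],
    max_rounds: int = 4,
) -> Tuple[Optional[Proof], int]:
    """Generate a proof, then alternate checking and repairing it."""
    proof = generate()
    for round_no in range(1, max_rounds + 1):
        review = verify(proof)
        if review.ok:
            return proof, round_no      # every step passed the checker
        proof = refine(proof, review)   # repair the flagged step and re-check
    return None, max_rounds             # give up rather than return an unchecked proof


def evaluate(lhs: str) -> int:
    """Evaluate a toy left-hand side of the form 'a + b' or 'a * b'."""
    a, op, b = lhs.split()
    return int(a) + int(b) if op == "+" else int(a) * int(b)


def toy_generate() -> Proof:
    # Hypothetical first draft containing two faulty steps.
    return ["2 + 3 = 5", "5 * 4 = 21", "20 + 1 = 22"]


def toy_verify(proof: Proof) -> Review:
    # Check each step independently and report the first one that fails.
    for i, step in enumerate(proof):
        lhs, rhs = step.split("=")
        if evaluate(lhs.strip()) != int(rhs):
            return Review(False, i, f"step {i} does not hold")
    return Review(True)


def toy_refine(proof: Proof, review: Review) -> Proof:
    # Recompute only the step the checker flagged.
    fixed = list(proof)
    lhs = fixed[review.bad_step].split("=")[0].strip()
    fixed[review.bad_step] = f"{lhs} = {evaluate(lhs)}"
    return fixed
```

Calling `prove_then_verify(toy_generate, toy_verify, toy_refine)` repairs one flagged step per round and accepts the proof on the third check. With `max_rounds=2` it returns `None` instead of an unverified proof, which mirrors the idea that an answer only counts once its reasoning has passed the checker.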

Unlocking the Future of Trustworthy AI

This self-verifying inference method could set a new standard of transparency and trustworthiness for real-world AI applications.

In the future, the approach could be applied in other important areas, such as software development, where AI would not only write code but also check it and prove its correctness, minimizing serious errors.

In addition, AI can automatically verify complex chains of reasoning when developing hypotheses or proving new theorems, thereby ensuring the rationality and safety of important decisions made by AI.

DeepSeek's decision to publicly release the model's source code on platforms like Hugging Face and GitHub is a strategic move, allowing the global research community to access and build on this verifiable inference principle.

DeepSeekMath-V2 represents a quantum leap forward, not only demonstrating AI’s superior ability to solve the most difficult problems, but also ensuring that this ability is built on a foundation of trust and unshakeable logic. This is proof that the next generation of AI will not only be smarter, but also more honest and transparent in its thinking process.

EAST SEA

Source: https://tuoitre.vn/deepseek-phat-trien-mo-hinh-co-kha-nang-tu-kiem-chung-cac-suy-luan-trong-toan-hoc-2025113016585069.htm