Candidates are stressed during the high school graduation exam - Illustration photo: NAM TRAN
In 2025, the high school graduation exam will be held with an important change: no standardized question bank will be used, instead the exam will be created using expert methods.
This is a flexible solution in the context of the transition to the new general education program. However, changing the method of setting questions also raises an important warning: the score distribution and basic statistical indicators of this exam cannot be used to evaluate teaching quality or to plan education policy.
Score distribution is not a measure of test quality
In the 2025 high school graduation exam, for the first time, the Ministry of Education and Training will not use a standardized question bank, but will switch to a manual question-making method conducted by experts. This change will not only affect the way the exam is constructed, but also directly affect the way the quality of the questions and teaching effectiveness are analyzed and evaluated.
Immediately after the exam ends, the score distribution and basic statistical parameters such as average and median scores continue to be announced and become the focus of public attention. However, it is important to clearly recognize that the score distribution is only a descriptive statistical tool, not a direct measure of the difficulty or quality of the exam.
The score distribution can help identify some general characteristics of the exam, such as whether the test results are skewed left or right, concentrated at certain score levels, or have unusual peaks.
However, these are only indirect indicators, influenced by many factors outside the exam such as the candidate's academic level, review level, exam preparation orientation and random factors during the test.
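To make the point concrete, the "descriptive" statistics mentioned above are easy to compute. The sketch below, using made-up scores (not real exam data), shows the shape indicators a score distribution supports: mean, median, and skewness. A negative skewness means a left-skewed distribution (mass at high scores, tail of low scores); none of these numbers, by themselves, says anything about the quality of the test.

```python
# Illustration with hypothetical scores (0-10 scale); NOT real exam data.
import statistics

scores = [3.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.0, 8.5, 9.0, 9.0, 9.5]

mean = statistics.mean(scores)
median = statistics.median(scores)
stdev = statistics.pstdev(scores)  # population standard deviation

# Fisher-Pearson skewness: negative = left-skewed (many high scores,
# a tail of low scores); positive = right-skewed.
n = len(scores)
skewness = sum(((x - mean) / stdev) ** 3 for x in scores) / n

print(f"mean={mean:.2f} median={median:.2f} skewness={skewness:.2f}")
```

These values describe the shape of the results; they cannot distinguish "the test was easy" from "the candidates were well prepared."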
The assessment of the difficulty, accuracy and classification of the test cannot be based solely on the score distribution.
To reach a scientific conclusion, it is necessary to carefully analyze the structure of the exam, each specific question, and the degree to which the questions meet the requirements of the program, and to apply specialized indices such as: the difficulty index, which reflects how challenging each question is; the discrimination index, which evaluates a question's ability to separate strong students from weak ones; and the reliability coefficient, which measures the stability and consistency of the exam as a whole.
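As an illustration only, all three indices can be computed from a matrix of scored item responses. The sketch below uses a tiny made-up 0/1 response matrix (rows are candidates, columns are questions), a top-half/bottom-half discrimination index, and Cronbach's alpha as the reliability coefficient; real item analysis would use far larger samples.

```python
# Hypothetical item analysis for a small dichotomous (0/1) test.
# Rows = candidates, columns = items. Data is invented for illustration.
responses = [
    [1, 1, 1, 0, 1],
    [1, 1, 0, 0, 1],
    [1, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 1],
    [1, 1, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [1, 1, 1, 1, 0],
]

n_items = len(responses[0])
totals = [sum(row) for row in responses]

def mean(xs):
    return sum(xs) / len(xs)

def variance(xs):  # population variance, as used in Cronbach's alpha
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# Difficulty index: proportion of candidates answering each item
# correctly (higher value = easier item).
difficulty = [mean([row[j] for row in responses]) for j in range(n_items)]

# Discrimination index: item pass rate in the top half of candidates
# (ranked by total score) minus the pass rate in the bottom half.
order = sorted(range(len(responses)), key=lambda i: totals[i])
half = len(responses) // 2
low, high = order[:half], order[-half:]
discrimination = [
    mean([responses[i][j] for i in high]) - mean([responses[i][j] for i in low])
    for j in range(n_items)
]

# Cronbach's alpha: a standard internal-consistency reliability coefficient.
item_vars = [variance([row[j] for row in responses]) for j in range(n_items)]
alpha = (n_items / (n_items - 1)) * (1 - sum(item_vars) / variance(totals))

print("difficulty:", [round(p, 2) for p in difficulty])
print("discrimination:", [round(d, 2) for d in discrimination])
print("Cronbach's alpha:", round(alpha, 2))
```

Note that none of these indices can be read off a score distribution alone: they require the per-question response data that standardization (trial testing, calibration) is designed to produce.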
In the context of the 2025 exam not being standardized, using the score distribution to reflect the quality of the exam or to conclude whether the exam is easy or difficult lacks scientific basis. Instead, this year's score distribution should be understood as a statistical tool primarily serving enrollment, and cannot be used to provide feedback on the quality of teaching or the level of meeting the requirements of the new general education program.
The score distribution is only valid when the test meets the standards.
Score distributions and statistical parameters such as mean scores, standard deviations, and passing rates are important tools in analyzing test results. In theory, they can reflect the difficulty level of the test, the ability to classify students, and even teaching trends over time.
However, the prerequisite for these indicators to be valuable is that the test must be a standardized measuring tool. This includes: having a clear test matrix and specifications; questions that are tested for difficulty and discrimination; having experimental data from trial tests; and having a strict construction - review - acceptance process.
If the test is not standardized, then no matter how clean the score distribution looks, it does not reflect the underlying reality. A left-skewed score distribution (many high scores) does not necessarily mean the test was easy, and a low average score does not necessarily mean students are weak; everything depends on the reliability of the test.
Exams using expert methods: flexible but not a substitute for standardization
Expert-based question making is not uncommon in education. It is often used when quick response is needed, when there is not enough time to build a standard question bank, or in internal exams. However, this method lacks objectivity and stability compared to a standardized question system.
When questions are set by expert judgment: the level of difficulty depends on the subjectivity of the compilers; the questions have not been field-tested in practice; there is no comparison data with which to calibrate; and the distribution of difficulty and skills is designed from experience rather than from data.
As a result, the score distribution becomes the product of an uncalibrated measure. Using it to draw conclusions about student ability, teaching quality, or program relevance would be a serious methodological error.
Unreliable data, inaccurate conclusions, inappropriate policies
In the context of education reform at a pivotal stage, using data from exams to evaluate, compare and make decisions is extremely necessary. However, the most dangerous thing is to rely on unreliable data to make systematic policies.
If we use the 2025 high school exam score distribution, which is not based on standardized test questions, to evaluate teaching quality between regions, to compare results across student groups, or to analyze the suitability of the new education program, such analyses lack a scientific basis, easily misrepresent the current situation, and can lead to counterproductive policies.
An entrance exam cannot be equated with a systematic assessment exam.
It is important to make a clear distinction: an exam may be good enough to serve as a graduation or admission criterion, but it is not qualified to be a tool to measure the quality of the education system.
The 2025 high school graduation exam, as the first exam under the new general education program, can fully assume the function of graduation assessment and university entrance screening. However, expecting the score distribution to evaluate the quality of teaching, program effectiveness or student level nationwide is unrealistic and methodologically incorrect.
Unstandardized tests → unreliable data → cannot be used as a benchmark for educational analysis or policy making.
Operational convenience should not replace scientific principles
In education, as in any field that uses data to make decisions, the principle that "reliable data comes from reliable measurement tools" must be strictly adhered to. The desire to have data cannot come at the expense of standardization in the tools used to collect it.
Organizing the 2025 high school graduation exam using expert-generated questions is an acceptable organizational option at the operational level. However, the results of this exam should not, and cannot, be used to make systematic assessments or policy recommendations.
Measurement science does not allow for an inaccurate measurement to be used as a benchmark. Education cannot base policy on unreliable data.
Source: https://tuoitre.vn/khong-the-lay-pho-diem-lam-can-cu-danh-gia-chat-luong-giao-duc-20250716150343597.htm