Surprised by the level of 'flattery' of Chinese and American AI models

The study, published in early October, tested 11 large language models (LLMs) by asking them to advise users in situations involving interpersonal conflict, manipulation, and deception. The results showed that AI chatbots were often too easy to agree with and support users' views, rather than challenge or give honest advice.

Among the models analyzed, DeepSeek V3 (released December 2024) was one of the most “sycophantic,” agreeing with users 55% more than humans, while the average of all models was 47%.

chatbot china linkedin — Chinese and American AI chatbots tend to flatter users too much. Photo: LinkedIn

Similarly, Alibaba Cloud's Qwen2.5-7B-Instruct model (launched in January 2025) was rated as the most user-flattering model, going against the Reddit community's correct judgment 79% of the time, topping the list.

DeepSeek-V3 came in second, siding with the poster 76% of the time even when they were wrong.

To construct the “human norm,” the team used data from the Reddit community “Am I The Ahole”**, where users post real-life situations asking who is at fault.

When comparing the AI's responses with the conclusions of the community (largely English speakers), the researchers found that the AI tended to side with the poster, even when they were clearly wrong.

“These trends create a counterproductive effect – causing humans to favor flattering AI models, and developers to train AI to flatter more to please users,” the authors warn.

The phenomenon of “AI flattery” is not only a social problem but also affects businesses, according to Professor Jack Jiang, Director of the AI Evaluation Lab at the University of Hong Kong Business School.

“It would be dangerous if a model consistently agreed with the analysis or conclusions of experts in the business,” he said. “That could lead to erroneous or untested decisions.”

This research contributes to the elucidation of an emerging ethical issue in the era of generative AI – where models designed to please users may sacrifice objectivity and honesty, leading to unintended consequences in human-machine interactions that can negatively impact users’ social relationships and mental health.

Source: https://vietnamnet.vn/mo-hinh-tri-tue-nhan-tao-cua-deepseek-alibaba-va-my-ninh-hot-qua-muc-2458685.html