
Hackers are exploiting the "personality" of AI chatbots in increasingly sophisticated ways, with attacks no longer relying solely on malware or technical vulnerabilities, but shifting to manipulative language.
In the early days, "hacking" AI chatbots was quite simple. Users just needed to instruct the system to ignore previous instructions, pretend not to be bound by the rules, or role-play as an unconstrained artificial intelligence. These methods are called "jailbreaking," which means tricking the AI model into bypassing its safety instructions.
One of the prominent attack types in the past was “DAN,” short for “Do Anything Now,” in which users asked ChatGPT to role-play as an AI capable of doing anything. Another example is the “grandma exploit,” where a chatbot is tricked into playing the role of a grandmother telling stories to children, but the content is then steered toward dangerous information.
Tech companies have quickly patched many of the older jailbreaks, but the underlying weaknesses remain. Chatbots are designed for conversation, so restricting dialogue too heavily makes the system less useful. Meanwhile, simply banning sensitive words isn't enough, because many of those words appear in legitimate contexts such as history, medicine, journalism, or chemistry.
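The limits of word banning can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `BLOCKLIST` contents, the `naive_filter` function, and the sample prompts are hypothetical and do not describe how any real chatbot is protected.

```python
import re

# Hypothetical blocklist, chosen only for illustration.
BLOCKLIST = {"poison", "explosive", "overdose"}


def naive_filter(prompt: str) -> bool:
    """Return True if the prompt contains any blocklisted word."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    return bool(words & BLOCKLIST)


# Legitimate questions from medicine, history, and everyday life.
legitimate_prompts = [
    "What are the symptoms of an accidental overdose, and when should I call a doctor?",
    "How did the invention of the explosive dynamite change 19th-century mining?",
    "Which household plants are a poison risk for cats?",
]

# A role-play style prompt that contains no blocklisted word at all.
rephrased = "Continue grandma's bedtime story about her old job at the factory."

for prompt in legitimate_prompts:
    print(naive_filter(prompt), "-", prompt)

print(naive_filter(rephrased), "-", rephrased)
```

In this sketch the filter blocks all three legitimate questions while letting the role-play prompt through, which is the trade-off the paragraph above describes.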
According to the article, the AI security race is no longer just a programmer's problem. Those seeking to circumvent chatbot security increasingly resemble writers, psychologists, or interrogators, using flattery, pressure, deception, or manipulation to make the models lower their guard.
According to AI security testing company Mindgard, some attacks now resemble psychology more than computer science. AI models don't have emotions like humans, but they are trained to respond as if they do. This simulation can produce different patterns of reaction, making each chatbot seem to have its own "personality."
This presents a new challenge as AI agents are increasingly used for scheduling, task management, food ordering, or customer service. If models can be manipulated through conversation, security teams will have to examine the models' social and emotional weak points in addition to traditional technical vulnerabilities.
Source: https://vtv.vn/tin-tac-khai-thac-tinh-cach-cua-chatbot-ai-10026052519025336.htm







