
At the Google I/O 2025 event, Google shocked the tech world when it introduced a video-generating AI model called Veo 3, marking a major step by the tech giant into a controversial field.
According to The Verge reporter Allison Johnson, who tested the tool, the most sophisticated thing about Veo 3 is its ability to create original audio for each video, from sound effects and background noise to character dialogue.
"Veo 3 strikes me as an absolutely 'garbage' AI content generator," commented The Verge reporter.
New features and shocking realism
“We’re entering a new era of creativity,” Google’s Gemini VP Josh Woodward explained during the Veo 3 launch keynote, highlighting the ability to create “ultra-realistic” videos.
Johnson was initially skeptical, but after trying the AI tool for herself, she was convinced Woodward wasn’t exaggerating. Veo 3 is capable of producing output that is frighteningly realistic.
Specifically, The Verge reporter generated a short video of a news anchor announcing a fire. The clip is extremely convincing, with sound quality and visuals similar to those of any traditional news report.
A scene from a video created by Veo 3. Photo: The Verge.
One Reddit post, a series of videos in which AI-generated characters protest the prompts used to create them, has since racked up 50,000 upvotes. The scenes include a disaster, a woman lying in a hospital bed on a ventilator, and a character being threatened with a gun, all with spoken dialogue and realistic background sounds.
Compared to other AI video creation tools, Veo 3 has made things a lot simpler. All it takes is a basic prompt, a few minutes of waiting for the platform to process it, and a subscription to Google's AI Ultra plan ($249.99 per month).
Johnson found it even easier to create videos using less specific prompts, and that pointed to one thing: Veo 3 excels at creating the lowest-common-denominator type of YouTube content for kids.
The end of the "silent film era"
Until now, no AI video generation model had been able to provide synchronized audio, or any audio at all, to accompany its video output. Veo 3, with its synchronized audio generation capabilities, is looking to end the “silent era”.
“We are exiting the silent era of video creation,” Google DeepMind CEO Demis Hassabis said during a press conference.
The widespread availability of video generation tools has led to an explosion of vendors, to the point where the space is becoming saturated.
From startups like Runway, Lightricks, Genmo, Pika, Higgsfield, Kling, and Luma, to tech giants like OpenAI and Alibaba, models are being released at a rapid pace. In many cases, there is little difference between these models.
It remains to be seen whether Veo 3 will be able to surpass OpenAI's Sora in terms of video quality, but the ability to output fully produced clips with both picture and sound could immediately make Veo 3 a more compelling platform.
The most outstanding feature of Veo 3 is its ability to create sound that is "perfectly" synchronized with video. Photo: Google.
“In the world of film and television, background noise and sound effects are often the work of artists. Now, imagine if all you had to do was describe to Veo the sound you wanted in the background and attached to the action, and it would output it all, including video and dialogue. This is work that animators would spend weeks or months doing,” Johnson commented.
If Veo 3 can actually follow prompts and output hours of consistent video and audio, it won't be long before we have the first animated feature film created entirely with AI.
Soon after Veo 3 launched, creators began sharing clips on platforms like X, including a stand-up comedy video created entirely with AI. Viewers were amazed to learn that the entire scene, including voice, video, and even audience audio, was created from just a text description.
Then there’s another viral clip that recreates Pythagoras explaining his famous theorem, complete with ancient context and accurate dialogue. There’s even a music video made entirely by Veo 3, where the visuals and music are in perfect sync.
The Economic Times commented that this type of technology could be called "a new era of filmmaking", allowing anyone, from individual creators to major media studios, to produce professional content at low cost and with minimal resources.
Source: https://znews.vn/ac-mong-tu-ai-tao-video-moi-cua-google-post1556018.html