Google Gemini raises the bar for AI image editing: change backgrounds, try hairstyles, and combine photos with a single command

The Google Gemini upgrade uses the “nano banana” image model, officially named Gemini 2.5 Flash Image and developed by Google DeepMind. The feature is now available globally to both free and paid users. Its biggest strength is keeping faces and objects consistent across edits, something other AI tools often struggle with.

“We’ve really pushed the image quality and the model’s ability to follow instructions,” said Nicole Brichtova, product lead at DeepMind. “This update makes editing more seamless and the results are good enough to be used for any purpose.”

Keep “you” in every photo

AI photos often look fake because small details get distorted. Google says Gemini solves this problem, letting you change the entire scene while keeping the face and expression the same. You can try a new hairstyle, change the color of a wall, or bring a pet into the scene without worrying about distortion.

Image: Merging two existing photos into a new scene with Google Gemini. Source: Google

Gemini also lets you upload multiple photos and combine them into one. For example, you can merge a portrait of yourself with a photo of your cat to create a single image of the two of you riding together on the road.

Gemini supports multi-turn editing, so users can build up a space detail by detail, from wallpaper and furniture to paint color. The advantage is that only the part being edited changes, while the rest of the image stays the same.

Gemini can also mix styles between photos, for example turning rain boots into floral-print shoes or creating a butterfly-patterned dress.

AI image creation race among technology giants

Google's upgrade comes as the AI imaging war heats up. OpenAI previously launched image generation in GPT-4o, which went viral with a wave of Studio Ghibli-style memes. CEO Sam Altman said usage grew so much that the company's GPUs almost “melted.”

To keep up, Meta announced a partnership with Midjourney, while German startup Black Forest Labs tops many benchmark rankings with its FLUX model.

Image: Google Gemini's multi-step photo editing capabilities. Source: Google

Google hopes Gemini can close the gap with ChatGPT. According to CEO Sundar Pichai, Gemini currently has 450 million monthly users, well behind ChatGPT, which has more than 700 million weekly users.

Brichtova said Gemini is designed for real-world scenarios, from visualizing living rooms and gardens to creating entertaining photos. The model has better “world knowledge” and can combine multiple photos and color palettes into a single rendering.

However, Google also sets strict limits. All generated images carry a visible watermark and hidden identifiers in the metadata. To prevent deepfake abuse, the company strictly prohibits creating sensitive images of people without their consent.

Google has previously apologized for Gemini’s inaccurate historical imagery. This time, the company believes it has struck a balance between creativity and safety. “We want users to be creative, but not everything is allowed,” Brichtova stressed.

With Gemini 2.5 Flash Image, Google is betting on a better AI photo editing experience, hoping to retain existing users and attract new ones in a fierce technology race with OpenAI, Meta, and other competitors.

(According to TechCrunch, Tom's Guide)


Source: https://vietnamnet.vn/google-gemini-nang-tam-ai-tao-anh-doi-nen-kieu-toc-chi-bang-mot-cau-lenh-2436782.html