
New AI tool creates high-quality photos, 9 times faster

Scientists from MIT and NVIDIA have successfully developed HART - a tool that creates high-quality images at an exceptionally fast speed, while consuming so few resources that it can run directly on a laptop or smartphone.

VietNamNet, 26/03/2025

This image of an astronaut riding a horse was created using two types of generative AI models. Photo: MIT News


When speed and quality are no longer a trade-off

In the field of AI imaging, there are currently two main approaches:

Diffusion models allow for sharp, detailed images. However, they are slow and computationally expensive, requiring dozens of processing steps to remove noise from each pixel.

Autoregressive models are much faster because they predict small parts of an image sequentially. But they often produce images with less detail and are prone to errors.

HART (hybrid autoregressive transformer) combines the two, providing the “best of both worlds”. It first uses an autoregressive model to construct the overall image by encoding it into discrete tokens. Then, a lightweight diffusion model takes over to fill in the residual tokens – the detailed information lost during encoding.
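The two-stage idea above can be illustrated with a toy sketch in Python. This is not HART's code: the codebook, the helper names (`quantize`, `decode`, `residual`, `refine`), and the one-dimensional "image" are all hypothetical, and the second stage here simply adds back the true residual instead of running a diffusion model. The sketch only shows why keeping a residual alongside discrete tokens can recover detail lost during encoding.

```python
# Toy illustration only: a 1-D "image" of floats and a tiny hypothetical codebook.
CODEBOOK = [0.0, 0.25, 0.5, 0.75, 1.0]  # hypothetical discrete token values


def quantize(pixels):
    """Map each value to the index of its nearest codebook entry (a discrete token)."""
    return [
        min(range(len(CODEBOOK)), key=lambda i: abs(CODEBOOK[i] - p))
        for p in pixels
    ]


def decode(tokens):
    """Turn discrete tokens back into coarse pixel values."""
    return [CODEBOOK[t] for t in tokens]


def residual(pixels, coarse):
    """Detail lost by quantization: the part a second stage would have to supply."""
    return [p - c for p, c in zip(pixels, coarse)]


def refine(coarse, predicted_residual):
    """Stand-in for the lightweight second stage: add predicted detail to the coarse image."""
    return [c + r for c, r in zip(coarse, predicted_residual)]


def max_error(a, b):
    """Largest absolute difference between two equally long sequences."""
    return max(abs(x - y) for x, y in zip(a, b))


pixels = [0.1, 0.33, 0.58, 0.91]
tokens = quantize(pixels)
coarse = decode(tokens)
res = residual(pixels, coarse)

# Assumption: a perfect residual predictor. A real model would only approximate it.
restored = refine(coarse, res)

print("tokens:", tokens)
print("coarse error:", max_error(pixels, coarse))
print("restored error:", max_error(pixels, restored))
```

In this toy setup the coarse reconstruction is off by up to one tenth, while the restored version is accurate to floating-point precision. A learned model would land somewhere in between.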

The resulting images match or exceed the quality of those from state-of-the-art diffusion models, but are generated about 9 times faster and with about 31% less computation.

New approach to creating quality images at high speed

One of HART's notable innovations is how it handles the information loss inherent in autoregressive models. Converting images into discrete tokens speeds up generation, but it also discards fine details such as object edges, hair, eyes, mouths, and other facial features.

HART's solution is to have the diffusion model focus only on "patching up" these details through residual tokens. Because the autoregressive model has already done most of the work, the diffusion model needs only 8 processing steps instead of the 30 or more that a standard diffusion model requires.

“The diffusion model has an easier job to do, which leads to more efficiency,” explains co-author Haotian Tang.

Specifically, by combining an autoregressive transformer model with 700 million parameters and a lightweight diffusion model with 37 million parameters, HART matches the image quality of a diffusion model with 2 billion parameters while running about 9 times faster.
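To put those figures in proportion, the short Python sketch below works through the arithmetic. The parameter counts and step counts are the ones quoted in this article, while the variable names and the derived ratios are illustrative and are not results reported by the researchers. Note that model size alone does not account for the 9x speedup, which also depends on how many diffusion steps are run.

```python
# Figures quoted in the article; the ratios below are simple arithmetic,
# not measurements reported by the research team.
AR_PARAMS = 700_000_000          # autoregressive transformer
DIFFUSION_PARAMS = 37_000_000    # lightweight diffusion model
BASELINE_PARAMS = 2_000_000_000  # diffusion model used for comparison

HART_DIFFUSION_STEPS = 8
BASELINE_DIFFUSION_STEPS = 30    # the article says "over 30", so this is a lower bound

hart_total = AR_PARAMS + DIFFUSION_PARAMS
share_of_baseline = hart_total / BASELINE_PARAMS
diffusion_share = DIFFUSION_PARAMS / hart_total
min_step_ratio = BASELINE_DIFFUSION_STEPS / HART_DIFFUSION_STEPS

print(f"HART total parameters: {hart_total:,}")
print(f"Size relative to baseline: {share_of_baseline:.2f}")
print(f"Diffusion share of HART: {diffusion_share:.3f}")
print(f"Diffusion steps saved (at least): {min_step_ratio:.2f}x")
```

By this count, the diffusion component is only about a twentieth of HART's parameters, which is consistent with the article's description of it as a lightweight final stage.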

Initially, the team also tried integrating the diffusion model into the early stages of the image generation process, but this caused errors to accumulate. The most effective approach was to let the diffusion model handle only the final step and focus on the “missing” parts of the image.

Opening the future of multimedia AI

The team’s next step is to build next-generation vision-language AI models based on the HART architecture. Because HART is scalable and can be adapted to multiple data types (modalities), they expect to apply it to video generation, audio prediction, and many other areas.

This research was funded by several organizations, including the MIT-IBM Watson AI Lab, the MIT and Amazon Science Hub, the MIT AI Hardware Program, and the US National Science Foundation. NVIDIA also donated the GPU infrastructure used to train the model.

(According to MIT News)


Source: https://vietnamnet.vn/cong-cu-ai-moi-tao-anh-chat-luong-cao-nhanh-gap-9-lan-2384719.html

