Bbabo NET

Artificial Intelligence Reaches New Levels of Realism with Microsoft's VASA

Microsoft has announced a new neural network called VASA that can bring photos and virtual characters to life. Using just one static image and a speech audio track, VASA creates videos of talking faces that display a wide range of emotions, natural head movements and facial expressions. According to Microsoft, extensive experiments, including evaluation on a set of new metrics, show that VASA outperforms previous methods.

VASA not only produces high-quality video, but also supports online (streaming) generation at 512x512 resolution at up to 40 fps with low initial latency. This could pave the way for real-time interaction with virtual faces that mimic human communication.

Realism: The model is capable of synchronizing lip movements with audio and capturing a wide range of emotions, expressive facial nuances and natural head movements.

Controllable generation: The diffusion model can be conditioned on optional parameters such as gaze direction, head position, and changes in emotion.

Out-of-distribution generalization: The method can process photographs and audio that fall outside the training data, including drawings and illustrations. VASA can also handle singing audio tracks and non-English speech.

Real-time generation: The method generates 512x512 video frames at 45fps offline and can support up to 40fps online with latency as low as 170ms on a PC with a single NVIDIA RTX 4090 graphics card.
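To put these figures in perspective, the short Python sketch below converts the reported frame rates into a per-frame time budget and expresses the 170 ms startup latency as a number of frames. The function names are illustrative only and are not part of any Microsoft API; the only inputs are the numbers quoted above.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available to produce one frame at the given rate, in milliseconds."""
    return 1000.0 / fps


def latency_in_frames(latency_ms: float, fps: float) -> float:
    """How many frame intervals fit into the given latency."""
    return latency_ms * fps / 1000.0


# Figures quoted in the article (512x512 output, single RTX 4090).
OFFLINE_FPS = 45
ONLINE_FPS = 40
STARTUP_LATENCY_MS = 170
WIDTH = HEIGHT = 512

offline_budget = frame_budget_ms(OFFLINE_FPS)
online_budget = frame_budget_ms(ONLINE_FPS)
startup_frames = latency_in_frames(STARTUP_LATENCY_MS, ONLINE_FPS)
pixels_per_second = WIDTH * HEIGHT * ONLINE_FPS

print(f"Offline budget per frame: {offline_budget:.1f} ms")
print(f"Online budget per frame:  {online_budget:.1f} ms")
print(f"Startup latency:          {startup_frames:.1f} frame intervals")
print(f"Online pixel throughput:  {pixels_per_second:,} pixels/s")
```

In other words, in online mode the model has roughly 25 ms to produce each frame, and the initial delay before the first frame appears corresponds to a little under seven frame intervals.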

Microsoft acknowledges the risks and says the technology should be used only for good. Still, there is a danger that VASA could become a powerful tool in the hands of fraudsters. For that reason, Microsoft currently has no plans to release an online demo, API, or product, or to provide additional implementation details, until it is certain that the technology will be used responsibly and in accordance with clear rules.

Given both the potential of the technology and the dangers associated with VASA, the development of such AI will likely proceed more slowly than it technically could.

Do you think there should be strict rules for using such technologies?
