Science & Technology News

2024-04-24 09:55:28+00:00

Yandex has updated the YandexART diffusion neural network to version 1.3

Hello! My name is Evgeniy Lyapustin, and I am a senior developer on the computer vision team. Together with our colleagues from Yandex Research, we have updated the YandexART diffusion neural network to version 1.3.

The main change is that the neural network now uses latent diffusion. In addition, the dataset on which the model was trained grew about 2.5-fold. As a result, the new version of YandexART understands text prompts better and creates more realistic images.

YandexART 1.3 is already used in Shedevrum (Masterpiece), whose users can now create images in different aspect ratios, such as 16:9, 4:3, or 3:4. The updated neural network will later be rolled out to other Yandex services.

With cascade diffusion, the image is refined progressively as its resolution increases. Latent diffusion works differently: it first forms an intermediate latent representation, a compact code that holds the essential information about the image in compressed form. The neural network then decodes this code into a full high-resolution image in one step.
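The difference can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the 8x compression factor, the 20 steps, and the stand-in functions `toy_denoise_step` and `toy_decode` are hypothetical and say nothing about how YandexART is actually built. The only point is the shape of the process: all iterative work happens on a small latent grid, and a single decoding pass produces the full-size image.

```python
import random

DOWNSCALE = 8   # assumed compression factor between pixels and latent
STEPS = 20      # assumed number of denoising steps


def toy_denoise_step(latent):
    """Stand-in for one pass of a trained denoiser: shrinks every value slightly."""
    return [[0.9 * value for value in row] for row in latent]


def toy_decode(latent):
    """Stand-in for a decoder: upsamples the latent grid to pixel size in one pass."""
    image = []
    for row in latent:
        wide_row = [value for value in row for _ in range(DOWNSCALE)]
        image.extend(list(wide_row) for _ in range(DOWNSCALE))
    return image


def generate(height, width, seed=0):
    rng = random.Random(seed)
    # Start from noise in the compact latent space, not in pixel space.
    latent = [[rng.gauss(0.0, 1.0) for _ in range(width // DOWNSCALE)]
              for _ in range(height // DOWNSCALE)]
    for _ in range(STEPS):
        # Every iterative step touches only the small latent grid.
        latent = toy_denoise_step(latent)
    # One decoding pass expands the latent into the full-resolution image.
    return latent, toy_decode(latent)


latent, image = generate(256, 256)
print(len(latent), len(latent[0]))  # 32 32
print(len(image), len(image[0]))    # 256 256
```

Under these assumed numbers each denoising step processes 32 x 32 = 1,024 values instead of 256 x 256 = 65,536, which is the intuition behind the lower compute cost.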

Latent diffusion consumes fewer computing resources and produces more realistic images. We have seen this in practice: we trained two versions of the model, cascade and latent, under conditions that were as similar as possible, and at every stage of training the latent version won on both quality and speed measurements.

The dataset has grown from 330 million image-text pairs to more than 850 million. To help the model understand user requests better, synthetic texts were added to the training dataset: more detailed image descriptions generated by a neural network. The picture below shows an example of synthetic text.
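As a quick check of the figures quoted above, the growth factor can be computed directly. The new size is given only as "more than 850 million", so the result is a lower bound.

```python
old_pairs = 330_000_000
new_pairs = 850_000_000  # lower bound: the text says "more than 850 million"

growth = new_pairs / old_pairs
print(round(growth, 2))  # 2.58
```

So the "2.5 times" figure is a slightly rounded-down version of the actual ratio.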

In addition, so that YandexART takes more details of the prompt into account, the new model uses two text encoders instead of one. The first is our encoder from the previous version, 1.2, which was trained on matched image-text pairs.

The second is new for us and is based on the open-source umt5_xxl. Unlike the first, this encoder was trained only on text. The two encoders give the model signals of a different nature.
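The text does not say how the outputs of the two encoders are combined. One common approach, shown here purely as an assumption, is to bring both sets of token vectors to the same width and concatenate them along the token axis, so the image model can attend to both. The functions `encoder_a` and `encoder_b` are hypothetical stand-ins that return dummy vectors. A real system would use the trained encoders and typically a learned projection rather than zero padding.

```python
def encoder_a(prompt):
    """Hypothetical stand-in for an image-text encoder: one 4-dim vector per word."""
    return [[float(len(word))] * 4 for word in prompt.split()]


def encoder_b(prompt):
    """Hypothetical stand-in for a text-only encoder: one 6-dim vector per word."""
    return [[float(ord(word[0]) % 7)] * 6 for word in prompt.split()]


def pad_to_width(tokens, width):
    """Zero-pad every token vector to a common width (a simplification)."""
    return [vector + [0.0] * (width - len(vector)) for vector in tokens]


def combine(prompt):
    """Concatenate both encoders' token vectors into one conditioning sequence."""
    tokens_a = encoder_a(prompt)
    tokens_b = encoder_b(prompt)
    width = max(len(tokens_a[0]), len(tokens_b[0]))
    return pad_to_width(tokens_a, width) + pad_to_width(tokens_b, width)


conditioning = combine("a red fox in the snow")
print(len(conditioning), len(conditioning[0]))  # 12 6
```

The six-word prompt yields six vectors from each encoder, so the combined sequence has twelve entries of width six.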

According to side-by-side (SBS) comparisons by Yandex assessors, YandexART 1.3 wins in 57 percent of cases against Midjourney V5.2 and in 63 percent of cases against the previous version, YandexART 1.2.
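In a side-by-side comparison, an assessor sees two images generated for the same prompt and picks the better one. The win rate is the share of comparisons a model wins. A minimal sketch follows. The vote list is made-up toy data, not Yandex's measurements, and the way ties are handled in the real evaluation is not stated in the text, so ties are not modeled here.

```python
def sbs_win_rate(votes, model):
    """Share of pairwise comparisons in which `model` was picked as the winner."""
    if not votes:
        raise ValueError("no votes to evaluate")
    return sum(1 for winner in votes if winner == model) / len(votes)


# Toy votes for illustration only: each entry names the preferred model.
toy_votes = ["new", "old", "new", "new", "old", "new", "old", "new"]

rate = sbs_win_rate(toy_votes, "new")
print(f"{rate:.1%}")  # 62.5%
```

A win rate above 50 percent means assessors preferred that model more often than its opponent.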

bbabo.Net