Over the past year, most people who have come across AI-generated images may have encountered the buzzword “generative” for the first time. But anyone who knows a little more about AI will be familiar with the fact that generative AI has its origins in the emergence of GANs.
GAN, the beginning of generative AI
In 2014, a group of researchers including former Google Brain research scientist Ian Goodfellow and his doctoral advisor, Turing Award winner Yoshua Bengio, published a paper on Generative Adversarial Networks (GANs). The idea was to use neural networks in an imaginative way: two networks are pitted against each other, each trying to outsmart the other. Both are trained on the same image dataset, and the system ultimately learns to produce convincingly realistic new fake images.
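To make the two-player setup concrete, here is a minimal sketch of a GAN training loop in PyTorch. The network sizes, learning rates, and 28×28 image dimension are illustrative assumptions, not details from the original paper.

```python
import torch
import torch.nn as nn

IMG_DIM, NOISE_DIM = 28 * 28, 64

# Generator: maps random noise to a fake image.
G = nn.Sequential(
    nn.Linear(NOISE_DIM, 256), nn.ReLU(),
    nn.Linear(256, IMG_DIM), nn.Tanh(),
)

# Discriminator: predicts the probability that an image is real.
D = nn.Sequential(
    nn.Linear(IMG_DIM, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid(),
)

loss_fn = nn.BCELoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

def train_step(real_images: torch.Tensor) -> None:
    """One round of the two-player game: D learns to spot fakes,
    then G learns to fool the updated D."""
    batch = real_images.size(0)
    real_labels = torch.ones(batch, 1)
    fake_labels = torch.zeros(batch, 1)

    # 1) Train the discriminator on real and generated images.
    fakes = G(torch.randn(batch, NOISE_DIM))
    d_loss = loss_fn(D(real_images), real_labels) + \
             loss_fn(D(fakes.detach()), fake_labels)
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # 2) Train the generator to make D classify its fakes as real.
    g_loss = loss_fn(D(G(torch.randn(batch, NOISE_DIM))), real_labels)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```

Each call to `train_step` first sharpens the discriminator’s ability to spot fakes, then updates the generator against that stronger critic; repeated over many batches, this adversarial game is what eventually yields convincing images.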
GANs gave deep learning models something they didn’t have before: imagination. For this reason, GANs can be looked back on as one of the biggest steps towards giving machines human-like consciousness. The paper caused a kind of earthquake in the community, and Goodfellow became an instant celebrity.
In an exclusive chat with Analytics India Magazine, AI veteran Bengio raved about how far generative models have come since, pointing to recent advances in text-to-image generators such as OpenAI’s DALL.E and StabilityAI’s Stable Diffusion.
Origin of the diffusion model
Bengio also applauded the rapidly increasing work on diffusion models. Diffusion models, introduced in 2015 in a paper titled “Deep Unsupervised Learning Using Nonequilibrium Thermodynamics”, were the next step in shaping generative models.
Diffusion models worked faster and produced images of better quality than those generated by GANs.
These models rely on denoising: they take corrupted images and iteratively refine them until a clean final image emerges. This is how Stable Diffusion works: during training, small amounts of noise are repeatedly added to an image, and the model learns to remove that noise step by step until the final output is produced.
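The add-noise/remove-noise idea can be sketched in a few lines. Here the `denoiser` argument stands in for a trained network (in Stable Diffusion, a U-Net), and the linear noise schedule and step count are assumptions for illustration; the update follows the simplified DDPM sampler rather than Stable Diffusion’s exact scheduler.

```python
import torch

T = 1000                                   # number of diffusion steps
betas = torch.linspace(1e-4, 0.02, T)      # noise schedule (assumed linear)
alphas_bar = torch.cumprod(1.0 - betas, dim=0)

def add_noise(x0: torch.Tensor, t: int):
    """Forward process: corrupt a clean image x0 with t steps of noise."""
    eps = torch.randn_like(x0)
    xt = alphas_bar[t].sqrt() * x0 + (1 - alphas_bar[t]).sqrt() * eps
    return xt, eps

def sample(denoiser, shape) -> torch.Tensor:
    """Reverse process: start from pure noise and iteratively denoise."""
    x = torch.randn(shape)
    for t in reversed(range(T)):
        eps_hat = denoiser(x, t)           # network predicts the noise in x
        alpha, a_bar = 1.0 - betas[t], alphas_bar[t]
        # Remove the predicted noise (simplified DDPM update).
        x = (x - betas[t] / (1 - a_bar).sqrt() * eps_hat) / alpha.sqrt()
        if t > 0:
            x = x + betas[t].sqrt() * torch.randn_like(x)
    return x
```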
Breakthroughs leading to modern generative models
None of these concepts were new to Bengio and other members of the AI community. “Some of the ideas behind this go back over a decade,” he said. Bengio has a series of papers showing this. In 2013, Bengio, Li Yao, and several other researchers published a paper titled “Generalized Denoising Autoencoders as Generative Models”, addressing the importance of denoising autoencoders.
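In that spirit, a denoising autoencoder is simple to state in code: corrupt the input, then train the network to reconstruct the clean original. This minimal PyTorch sketch uses assumed layer sizes and a Gaussian corruption level; it illustrates the general technique, not the paper’s exact setup.

```python
import torch
import torch.nn as nn

class DenoisingAutoencoder(nn.Module):
    def __init__(self, dim: int = 784, hidden: int = 128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())
        self.decoder = nn.Sequential(nn.Linear(hidden, dim), nn.Sigmoid())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))

model = DenoisingAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

def train_step(clean: torch.Tensor) -> torch.Tensor:
    corrupted = clean + 0.3 * torch.randn_like(clean)  # corrupt the input
    recon = model(corrupted)                           # try to undo the damage
    loss = nn.functional.mse_loss(recon, clean)        # compare to the clean image
    opt.zero_grad(); loss.backward(); opt.step()
    return loss
```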
All of today’s leading text-to-image generators, including DALL.E 2, Google’s Imagen, and Stable Diffusion, use diffusion models.
Important breakthroughs achieved last year contributed to their prominence.
In December 2021, latent diffusion models were introduced in a paper titled “High-Resolution Image Synthesis with Latent Diffusion Models”. These models use an autoencoder to compress images into a relatively small latent space during training; the decoder is then used to decompress the final latent representation into the output image. Running the iterative diffusion process in this compressed latent space greatly reduces image-generation time and cost.
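That division of labor can be summarized in a short sketch. The `autoencoder` object with `encode`/`decode` methods, the `denoise_loop` helper, and the tensor shapes are all hypothetical names for illustration, loosely following the shapes Stable Diffusion uses.

```python
import torch

# Training-time: the autoencoder compresses pixels into a small latent, and
# the diffusion model is trained on those latents rather than raw images.
def encode_for_training(autoencoder, image: torch.Tensor) -> torch.Tensor:
    return autoencoder.encode(image)          # e.g., 3x512x512 -> 4x64x64

# Sampling: run the iterative denoising loop entirely in latent space,
# then decode once at the end to recover a full-resolution image.
def generate(autoencoder, denoise_loop, latent_shape=(1, 4, 64, 64)):
    latent = denoise_loop(torch.randn(latent_shape))   # cheap: small tensors
    return autoencoder.decode(latent)                  # one decode -> pixels
```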
Latent diffusion models can also do other things that previous generative models couldn’t, such as image inpainting, super-resolution, and semantic scene synthesis.
A return to probabilistic ML and other concepts
Bengio has also been enthusiastic about the area of probabilistic machine learning for some time and firmly believes that advances in this area can solve some of the challenges in deep learning. “I am very excited about the progress in the field of probabilistic machine learning,” he said.
Probabilistic machine learning designs AI models that learn from experience, using statistics to predict the likelihood of future outcomes under random events or actions. For example, a probabilistic classifier could assign a probability of, say, 0.8 to the “cat” class, indicating a high degree of confidence that the animal in the image is a cat. The same idea carries over directly to applications such as self-driving cars, where a system must act on uncertain predictions.
Probabilistic machine learning has a significant advantage: it gives a clear picture of how confident or uncertain the machine is about the accuracy of its predictions.
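A toy example shows what that looks like in practice: a softmax classifier returns a full probability distribution over classes, and the distribution’s entropy is a direct readout of the model’s uncertainty. The logits below are made up for illustration.

```python
import torch

logits = torch.tensor([2.2, 0.4, -1.0])    # raw scores for cat/dog/bird
probs = torch.softmax(logits, dim=0)       # e.g., ~[0.83, 0.14, 0.03]
entropy = -(probs * probs.log()).sum()     # low entropy = high confidence

print(f"P(cat) = {probs[0]:.2f}, uncertainty (entropy) = {entropy:.2f}")
```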
According to Bengio, many of these old ideas, like old wine in a new bottle, have the potential to be revived with modern AI. Another idea he believes could be transformative is causality. Although thinking about cause and effect dates back to pre-Socratic times, deep learning researchers recognized its importance much later, prompted by the challenges posed by the lack of causal representations in machine learning models.
“One of the biggest limitations of current machine learning, including deep learning, is the ability to generalize well to new settings, such as new distributions. It’s something humans are very good at,” Bengio said. There is good reason to think humans are good at it because they have a causal model of the world, he added.