Introduction to Generative Models
Introduction
This is an introduction to the main settings encountered in generative modelling. The goal of this Lecture is to understand the relationship between vanilla unconditional generative modelling and industrial generative models such as DALL-E, Stable Diffusion, and GPT.
Unconditional Generative Modelling
What In unconditional generative modelling, we are given a set of unlabelled data $\{x_1, \dots, x_n\}$ (e.g. a collection of images).
Assumption The core underlying assumption of generative modelling is that the data $x_1, \dots, x_n$ is drawn from some unknown underlying distribution $p_{\mathrm{data}}$: $x_i \sim p_{\mathrm{data}}$ for all $i \in \{1, \dots, n\}$.
Goal Using the empirical data distribution $\hat{p}_{\mathrm{data}} = \frac{1}{n} \sum_{i=1}^{n} \delta_{x_i}$, the goal is to generate new samples $x_{\mathrm{new}}$ that look like they were drawn from the same unknown distribution $p_{\mathrm{data}}$.
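As a toy illustration of this setting (everything below, including the choice of a single Gaussian as the model and all variable names, is an assumption made for this sketch, not a method from the lecture), one can fit the simplest possible generative model to unlabelled 2D points and then sample new points from it:

```python
import numpy as np

rng = np.random.default_rng(0)

# Unlabelled dataset x_1, ..., x_n: toy 2D points standing in for images
x = rng.normal(loc=[2.0, -1.0], scale=0.5, size=(1000, 2))

# "Training": fit a single Gaussian as a crude model of p_data
mu = x.mean(axis=0)             # empirical mean
cov = np.cov(x, rowvar=False)   # empirical covariance

# "Sampling": draw new points that should look like draws from p_data
x_new = rng.multivariate_normal(mu, cov, size=5)
print(x_new)
```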
Class-Conditional Generative Modelling
What In class-conditional generative modelling, we are given a set of labelled data $\{(x_1, y_1), \dots, (x_n, y_n)\}$, where each $y_i$ is a class label (e.g. $y_i \in \{\text{cat}, \text{dog}\}$).
Assumption For class-conditional generative models, the assumption is that the data $x_1, \dots, x_n$ is drawn from some unknown underlying conditional probability distributions: $x_i \sim p_{\mathrm{data}}(\cdot \mid y_i)$ for all $i \in \{1, \dots, n\}$.
Goal Using the empirical data distributions $\hat{p}_{\mathrm{data}}(\cdot \mid y)$, the goal is to generate new samples that look like they were drawn from the same unknown distributions $p_{\mathrm{data}}(\cdot \mid y)$. More precisely, we want to be able to generate new images of cats and dogs that follow the conditional probability distributions $p_{\mathrm{data}}(\cdot \mid y = \text{cat})$ and $p_{\mathrm{data}}(\cdot \mid y = \text{dog})$.
Remark i) To train class-conditional generative models, we could split the dataset into two parts, one with all the cat images and one with all the dog images, and train two separate unconditional generative models. However, this would not leverage similarities between the two classes: both cats and dogs have four legs, a tail, fur, etc. Class-conditional generative models can share information across classes.
Remark ii) Generative modelling is a very different task from standard supervised learning. The usual classification task is the following: given an empirical labelled data distribution $\{(x_i, y_i)\}_{i=1}^{n}$, the goal is to estimate the probability that a given new image $x$ is a cat or a dog, i.e. we want to estimate $p(y \mid x)$. On the contrary, in class-conditional generative modelling, we are given a class $y$ (e.g. cat), and we want to estimate the probability distribution of images of cats $p_{\mathrm{data}}(\cdot \mid y = \text{cat})$, and sample new images from this distribution.
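Remarks i) and ii) can be made concrete with a toy sketch (the Gaussian-per-class model, the shared covariance, and all names below are assumptions made purely for illustration): each class gets its own mean while the covariance is shared across classes (a crude form of information sharing), and the same fitted model can be used either to sample $x \sim p(\cdot \mid y)$ or, via Bayes' rule, to compute $p(y \mid x)$:

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)

# Toy labelled dataset: 2D "images", two classes (0 = cat, 1 = dog)
x_cat = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(500, 2))
x_dog = rng.normal(loc=[3.0, 3.0], scale=1.0, size=(500, 2))
x = np.vstack([x_cat, x_dog])
y = np.array([0] * 500 + [1] * 500)

# Class-conditional model: one mean per class, one covariance shared by both
# classes -- sharing parameters across classes is the point of Remark i)
mus = np.stack([x[y == k].mean(axis=0) for k in (0, 1)])
cov = np.cov(x - mus[y], rowvar=False)                 # pooled covariance
priors = np.array([(y == k).mean() for k in (0, 1)])

# Generative use (Remark ii): sample a new "cat" image from p(x | y = cat)
x_new_cat = rng.multivariate_normal(mus[0], cov)

# Discriminative use: p(y | x) for a new image, obtained via Bayes' rule
x_test = np.array([1.5, 1.5])
likelihoods = np.array([multivariate_normal.pdf(x_test, mus[k], cov) for k in (0, 1)])
posterior = likelihoods * priors / (likelihoods * priors).sum()

print("new cat sample:", x_new_cat)
print("p(y | x_test):", posterior)
```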
Text-Conditional Generative Modelling
What In text-conditional generative modelling, we are given a set of data (e.g. images) $x_1, \dots, x_n$ and their text descriptions $c_1, \dots, c_n$.
For instance, Stable Diffusion was trained on the LAION-5B dataset, a dataset of 5 billion images and their textual descriptions.
Assumption For text-conditional generative models, the assumption is that the data $x_1, \dots, x_n$ is drawn from some unknown underlying conditional probability distributions: $x_i \sim p_{\mathrm{data}}(\cdot \mid c_i)$ for all $i \in \{1, \dots, n\}$.
The main difference with the class-conditional setting is that the conditioning variable is now a free-form text description, not one of a fixed number of classes.
Goal Using the data $x_1, \dots, x_n$ and their text descriptions $c_1, \dots, c_n$, the goal is to generate new samples given a text description. More precisely, given a text description $c$, we want to be able to generate new images that follow the conditional probability distribution $p_{\mathrm{data}}(\cdot \mid c)$.
Remark iii) Text-conditional generative modelling is very challenging in several respects:
- one usually observes only one sample $x_i$ per textual description $c_i$, i.e., one has to leverage similarities between text descriptions to learn the conditional distributions $p_{\mathrm{data}}(\cdot \mid c)$.
- one has to handle new text descriptions that were not seen during training, i.e., the model needs to be able to generalize to new text.
- text descriptions are complex objects that are not easy to handle (discrete sequences of variable length). Handling text conditioning requires a lot of engineering (tokenization, embeddings, transformers, etc.) and is out of the scope of this introductory Lecture.
Remark iv) Even though text-conditional generative modelling is very challenging, the tools, algorithms, and concepts are the same as those used for unconditional generative modelling.
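As a deliberately crude sketch of Remark iv) (the bag-of-words embedding, the linear conditional-mean Gaussian model, and every name below are assumptions made for this toy example only, and bear no resemblance to how DALL-E or Stable Diffusion actually handle text), conditioning simply becomes an extra input, while the fit-and-sample machinery of the unconditional setting is reused unchanged:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy paired dataset: each 2D "image" x_i comes with a short caption c_i
vocab = ["small", "large", "cat", "dog"]
captions = ["small cat", "large cat", "small dog", "large dog"] * 250

def embed(caption):
    # Crude bag-of-words embedding of a caption over the tiny vocabulary
    return np.array([float(w in caption.split()) for w in vocab])

C = np.stack([embed(c) for c in captions])                    # (1000, 4) embeddings
true_W = np.array([[0.0, 0.0], [3.0, 3.0], [0.0, 2.0], [2.0, 0.0]])
X = C @ true_W + 0.3 * rng.normal(size=(len(captions), 2))    # toy "images"

# "Training": model p(x | c) as a Gaussian whose mean is linear in the embedding
W, *_ = np.linalg.lstsq(C, X, rcond=None)
cov = np.cov(X - C @ W, rowvar=False)

# "Sampling": generate a new image from p(x | c) for a given caption
def generate(caption):
    return rng.multivariate_normal(embed(caption) @ W, cov)

print(generate("large dog"))
```

Here the model can only handle captions built from a four-word vocabulary; the engineering challenge mentioned in Remark iii) is precisely to scale this conditioning to arbitrary real text.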