I am very new to neural networks and found this presentation very interesting. At age 75 the math was a bit over my head, but I asked myself: why do we need the generator at all? What if we just feed an image into a discriminator and compare its output to the actual input image to generate the error signal to train on? Train the discriminator until its output image matches the input image. Then slowly add noise to the input image, while still generating the error signal as before, and keep training the network. In the limit, as we slowly reduce the image content of the input and raise the noise content, we should be able to generate the desired image from noise alone. Thus the discriminator becomes an image generator. Now change the characteristics of the noise input and see what images result. Is this a dumb idea?
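A minimal sketch of the training loop being proposed here, written in PyTorch (the network, the noise schedule, and `data_loader` are illustrative placeholders, not anything from the talk):

```python
import torch
import torch.nn as nn

# Stand-in for the "discriminator" described above: a small network
# that maps a (possibly noisy) image back to a clean image.
net = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def train_step(clean_batch, noise_level):
    """One step of the proposed scheme: blend noise into the input,
    then train the network to reproduce the original clean image."""
    noise = torch.randn_like(clean_batch)
    noisy = (1.0 - noise_level) * clean_batch + noise_level * noise
    opt.zero_grad()
    loss = loss_fn(net(noisy), clean_batch)
    loss.backward()
    opt.step()
    return loss.item()

# Anneal the noise level from 0 (pure image) toward 1 (pure noise).
for epoch in range(100):
    noise_level = min(1.0, epoch / 100)
    for clean_batch in data_loader:  # data_loader is assumed to exist
        train_step(clean_batch, noise_level)
```

Read today, this idea is strikingly close in spirit to denoising diffusion models, which make a related scheme work by conditioning the network on the noise level and denoising in many small steps.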
@saikongz 4 years ago
This is a brilliant idea! I'm not sure if I'm right, but my intuition is that this approach might lead to mode collapse, meaning different random noise inputs would end up at a few predefined outputs. However, we usually expect a generator to produce results with higher variance.
@Mayur7Garg 3 years ago
I know this is a 2-year-old comment, but I came across this video just recently. The idea you describe is amazing. In fact, I once tried to implement a similar concept myself: train a standard autoencoder but slowly transform it into a denoising autoencoder by polluting the input data over time. Once the noise on the input reaches 100%, the autoencoder should, in principle, behave like a generator.

However, I feel there is an issue with this approach. In the standard GAN setup, the generator uses a point from an n-dimensional input space to generate the output. In the process, it learns to associate each dimension with a specific latent attribute that affects the output. Once the GAN is trained, moving a value along this n-dimensional input increases or decreases the effect of the corresponding latent attribute in the output image in a deterministic manner. For instance, in some GANs I have seen, one of the input features encodes the pose of the generated face; changing the value of that specific input while keeping the others constant causes the face in the output image to tilt from left to right.

In the approach you suggest, such an encoding is not possible, because the input features are replaced by noise while the outputs are unchanged. This makes the relationship between any latent attribute of the output image and the input effectively null, i.e. random. For instance, consider a face dataset with an equal number of male and female images. When we add noise to the input, we cannot guarantee that at least one of the input features will carry the gender information, i.e. have one distribution of values for male images and another for female images, because noise is inherently random. The same argument applies to any latent or visible attribute of the training data. Simply put, the model will get worse as we add more noise, because the input carries less and less information about the output. This is not a problem for standard GANs, since they are not trained to produce a fixed output but are trained instead to match the distribution of the training data.

The approach you suggest will not work because the model's objective is to find a mapping between individual input and output data points that generalizes well to all the data, and that mapping ceases to exist as we keep adding noise to the input. For standard GANs, the objective is not to search for such a mapping but instead to create a mapping between the entire input and output distributions.

And never for one moment think this is a dumb idea. On the contrary, this was one of those ideas that took me on a rather thought-provoking ride! 👏👏
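To make the pose example above concrete, here is a rough sketch of that latent traversal (the generator here is a stand-in linear layer, and `pose_dim` is a hypothetical index; in a real trained GAN you would have to search for such a dimension):

```python
import torch

n, pose_dim = 128, 7                       # illustrative sizes, not from the comment
generator = torch.nn.Linear(n, 28 * 28)    # stand-in for a real trained GAN generator

z = torch.randn(1, n)                      # fix one random latent code
images = []
for value in torch.linspace(-3, 3, steps=9):
    z_moved = z.clone()
    z_moved[0, pose_dim] = value           # vary only the hypothesized "pose" dimension
    images.append(generator(z_moved))      # other attributes should stay roughly constant
```

In a well-trained GAN, sweeping `pose_dim` like this would produce a smooth left-to-right tilt; in the noise-annealed setup described above, no single input dimension is guaranteed to carry any such attribute.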
@kshiteejsheth9416 7 years ago
At 22:25, why do we have to union bound? Idea 3 holds for any fixed discriminator, which means it holds for the best discriminator, and that's all we want.
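For what it's worth, a sketch of the standard reason a union bound appears in arguments like this, assuming the bound in the talk has the usual high-probability form (I haven't checked the exact statement at 22:25):

```latex
% For one fixed discriminator D, concentration gives
%   P(|\hat{E}[D] - E[D]| > \varepsilon) \le \delta.
% But the "best" discriminator is selected after seeing the samples,
% so the fixed-D bound cannot be applied to it directly; the bound
% must hold for all D simultaneously. For a finite class \mathcal{D}:
P\left( \exists\, D \in \mathcal{D} :
        \left| \hat{E}[D] - E[D] \right| > \varepsilon \right)
\;\le\; \sum_{D \in \mathcal{D}} \delta
\;=\; |\mathcal{D}|\, \delta .
```

If that reading is right, the step from "any fixed discriminator" to "the best discriminator" is exactly where the union bound does its work, because "best" is data-dependent.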
@emileduvernois6680 3 years ago
What is a "held out" example? (Phrasal verbs are a pain in the neck.)