16-DCGAN from Scratch with Tensorflow - Create Fake Images from CELEB-A Dataset | Deep Learning

5,259 views

Rohan-Paul-AI

A day ago

Comments: 14
@alexanderkempen9063 2 years ago
Great video! But why are you using 4*4*512 in the first Dense layer at the Generator step?
@RohanPaul-AI 2 years ago
Thanks Alexander, and on your question: the idea of the DCGAN generator for the CELEB-A dataset is that, by the end of the model, I want to decrease the channels to 3 and increase the width and height to the original image size (64 * 64). Hence I start with a Dense layer that takes the seed as input, then upsample several times until I reach the desired image size of **64x64x3**. So what I am doing here is having the first hidden (Dense) layer project the data to (4 * 4 * 512) >> then keep adjusting the filters in the subsequent layers >> and in the final Conv2DTranspose, set the filters to 3. That way the final output is (64 * 64 * 3). In other words, the generator G needs to be designed to map the latent vector (z of shape 100) to the required data space (64x64x3 in this case). If I were applying a DCGAN to the MNIST dataset (which is shaped 28 * 28 * 1), I would choose a Dense layer that takes the seed as input and then upsample several times until reaching the desired image size of 28x28x1. Hoping that made sense 🙂 and also thanks for subscribing 🙂
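To make the layer-by-layer idea above concrete, here is a minimal Keras sketch of such a generator. The filter counts (256/128/64), kernel size 4, tanh output, and BatchNormalization placement are my own assumptions for illustration; the video's exact hyperparameters may differ.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_generator(latent_dim=100):
    # Project the noise vector to 4x4x512, then upsample 4 -> 8 -> 16 -> 32 -> 64
    # while shrinking channels 512 -> 256 -> 128 -> 64 -> 3.
    return tf.keras.Sequential([
        tf.keras.Input(shape=(latent_dim,)),
        layers.Dense(4 * 4 * 512),
        layers.Reshape((4, 4, 512)),
        layers.Conv2DTranspose(256, 4, strides=2, padding='same', activation='relu'),
        layers.BatchNormalization(),
        layers.Conv2DTranspose(128, 4, strides=2, padding='same', activation='relu'),
        layers.BatchNormalization(),
        layers.Conv2DTranspose(64, 4, strides=2, padding='same', activation='relu'),
        layers.BatchNormalization(),
        layers.Conv2DTranspose(3, 4, strides=2, padding='same', activation='tanh'),
    ])

gen = build_generator()
noise = tf.random.normal((2, 100))
print(gen(noise).shape)  # (2, 64, 64, 3)
```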
@alexanderkempen9063 2 years ago
@@RohanPaul-AI Thanks for your quick reply and insightful answer! However, I am still wondering what (4*4*512) means. Does it mean you start with 512 and downsample 4 times through the Conv2D towards 64 (which is your desired picture size)? So, let's say my images are (337, 450, 3): what would the Dense input be (and the values for reshaping)? I am still a bit confused about how to determine that value. It would mean a lot to me if you could help me understand this!
@RohanPaul-AI 2 years ago
@@alexanderkempen9063 Your 1st Q - the meaning of (4*4*512):
The Dense layer projects the 100-element noise vector onto 4*4*512 = 8,192 units, i.e. 512 filters over a 4x4 spatial grid. That gives 512*4*4*100 = 819,200 weight parameters, and adding the bias terms (512*4*4 = 8,192) totals 827,392 parameters. **Dense** is pretty much Keras's way of saying matrix multiplication. As a separate example, whenever we write `Dense(512, activation='relu', input_shape=(32, 32, 3))`, what we are really saying is: perform a matrix multiplication that results in an output whose last dimension is 512.
***********************
Your 2nd Q - does it mean you start with 512?
Here, for my generator function, I started with a spatial size of 4 * 4 and then applied `Conv2DTranspose` 4 times, i.e. a starting value of 4 multiplied by 2 * 2 * 2 * 2, giving 64 at the end. That is because, by the end of the network, I had to reach the original image size (64 * 64).
***********************
Your 3rd Q - for your images of (337, 450, 3), what would the Dense input and the reshape values be?
I am assuming you are training a DCGAN on that set of images. In the generator function, the "input_shape" parameter of the Dense() layer will continue to be [100], as that is the starting noise vector for generating images with a DCGAN. Then, for deciding on the reshape, you have to make decisions on the following:
a) the number of units - the first parameter of the Dense() function;
b) the number of filters - the first parameter of the Conv2DTranspose() function;
c) the kernel_size (filter size) - the second parameter of the Conv2DTranspose() function;
d) the number of strides for your Conv2DTranspose().
All of these are hyper-parameters that can be tuned. So theoretically, for your image dataset, there is nothing stopping you from starting with (4 * 4 * 512) itself.
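The parameter arithmetic above can be checked with plain Python (the numbers are exactly the reply's own figures; nothing new is assumed):

```python
# Dense(4*4*512) applied to a 100-element noise vector:
units = 4 * 4 * 512            # 8,192 output units (reshaped later to 4x4x512)
input_dim = 100                # length of the latent noise vector z
weights = units * input_dim    # 512*4*4*100 = 819,200 weight parameters
biases = units                 # one bias per unit = 8,192
print(weights + biases)        # 827392
```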
Just that after starting with (4 * 4 * 512), you have to design and keep adding your layers in such a way that, by the end of the network, you reach the original size of the images, i.e. your generator function generates original-size images.
Now the obvious question you may ask is whether there is any ideal number, for example, for the number of filters (i.e. the first parameter of Conv2DTranspose). Short answer - there is no direct method to know the number of filters to use for your model. However, you can test values like 16, 32, 64, 128, 256, 512...
Similarly, you may ask how you should choose your kernel_size (i.e. filter size). Again, there is no single right answer, but the most popular choice among deep learning practitioners is a 3x3 kernel. A general rule of thumb for selecting kernel_size: if your input images are larger than 128x128, consider using a 5x5 or 7x7 kernel to learn larger features; if your images are smaller than 128x128, you may want to start with strictly 3x3 or 4x4 filters.
----------------
An example: let's say, for a DCGAN project, I have an original image size of (200, 200), meaning I want to generate images of dimension (200, 200), and I start with a Dense layer of (25 * 25 * 432). To get the number 200 I have to do 25 x 2 x 2 x 2 (i.e. 200), which means I have to implement a network that deconvolves 3 times, like below:
```py
cnn.add(Dense(25*25*432, input_dim=latent_size, activation='relu'))
cnn.add(Reshape((25, 25, 432)))
# And then deconvolve 3 times: 25 x 2 x 2 x 2 = 200
cnn.add(Conv2DTranspose(192, 2, strides=2, padding='valid', activation='relu', kernel_initializer='glorot_normal'))
cnn.add(BatchNormalization())
cnn.add(Conv2DTranspose(96, 2, strides=2, padding='valid', activation='relu', kernel_initializer='glorot_normal'))
cnn.add(BatchNormalization())
cnn.add(Conv2DTranspose(3, 2, strides=2, padding='valid', activation='relu', kernel_initializer='glorot_normal'))
cnn.add(BatchNormalization())
```
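When planning such a stack for a new image size, a tiny helper (hypothetical, not from the video) makes the stride arithmetic explicit: each stride-2 Conv2DTranspose doubles the spatial size, so you pick a starting size and a number of deconvolutions whose doublings land exactly on the target.

```python
def upsampled_size(start, num_deconvs, stride=2):
    """Spatial size after `num_deconvs` stride-`stride` transposed convolutions
    (each one multiplies the spatial size by `stride`, as in the examples above)."""
    size = start
    for _ in range(num_deconvs):
        size *= stride
    return size

print(upsampled_size(4, 4))   # 64  -> the CELEB-A generator (4x4 start, 4 deconvs)
print(upsampled_size(25, 3))  # 200 -> the (25 * 25 * 432) example above
```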
@LiquidMasti 3 years ago
Very informative video, learned so much!!
@RohanPaul-AI 3 years ago
Glad you liked it.
@jasonjennings8465 5 months ago
Is it possible to run this through a vSphere VM, using an NVIDIA vGPU to take advantage of the full file set?
@RohanPaul-AI 5 months ago
Good question. Honestly not sure - I have not used NVIDIA's vGPU.
@tensorguy6456 3 years ago
Thanks. This is great.
@Tady_lin 2 years ago
I have a problem. I tried to run the code and trained on 50,000 images, but the losses stay at D_real = 0.000, D_fake = 0.000, G = 0.000, and even by Epoch 300 I can't generate anything like a real face. I don't know what went wrong!!!
@Beltusams 2 years ago
I have this too. Did you manage to fix it?
@krp8225 2 years ago
Same here
@ryaniseverywhere 2 years ago
Can we generate an image with a specific label using DCGAN?
@RohanPaul-AI 2 years ago
To my knowledge, Conditional GAN (CGAN) and InfoGAN are specially designed for images with labels - they can take extra help from the labels to make the model perform better.
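As a rough illustration of how a Conditional GAN generator takes "extra help from the labels", here is a minimal Keras sketch (my own example, not from the video - the embedding width of 50 and the filter counts are arbitrary choices): the class label is embedded, concatenated with the noise vector, and fed through a DCGAN-style upsampling stack.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_conditional_generator(latent_dim=100, num_classes=10):
    noise = layers.Input(shape=(latent_dim,))
    label = layers.Input(shape=(1,), dtype='int32')
    # Turn the integer label into a dense vector and merge it with the noise.
    lab = layers.Flatten()(layers.Embedding(num_classes, 50)(label))  # (None, 50)
    x = layers.Concatenate()([noise, lab])                            # (None, 150)
    x = layers.Dense(4 * 4 * 512, activation='relu')(x)
    x = layers.Reshape((4, 4, 512))(x)
    for filters in (256, 128, 64):  # spatial size 4 -> 8 -> 16 -> 32
        x = layers.Conv2DTranspose(filters, 4, strides=2, padding='same', activation='relu')(x)
        x = layers.BatchNormalization()(x)
    img = layers.Conv2DTranspose(3, 4, strides=2, padding='same', activation='tanh')(x)  # 32 -> 64
    return tf.keras.Model([noise, label], img)

g = build_conditional_generator()
print(g([tf.random.normal((2, 100)), tf.constant([[1], [3]])]).shape)  # (2, 64, 64, 3)
```

At training time, the discriminator of a CGAN would receive the same label alongside the image, so the generator is pushed to produce a face matching the requested class.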