Just wanted to mention something I didn't bring up in the video. You should be a bit careful when adding transformations to your training pipeline: more is not always better. In some cases it might actually ruin your results. Take a simple scenario like the MNIST dataset: using RandomHorizontalFlip or RandomVerticalFlip might totally destroy your model, since the flip can change the actual digit in the image while the label stays the same, so the model ends up training on incorrect samples.
@InturnetHaetMachine 3 years ago
5:34 I don't know why, I just laughed out loud so much when that cat image popped up when you're talking about vertical flip.
@joshlazor6208 4 years ago
Two things: 1.) How would you calculate the mean and standard deviation of the images for Normalize? 2.) What does resizing to 256, 256 do? Does it change the image to 256x256 pixels?
@AladdinPersson 4 years ago
1) You would go through each image in your dataset and calculate the mean and standard deviation for each channel. Then you use those values with Normalize to get mean 0 and standard deviation of 1. 2) Yes
@joshlazor6208 4 years ago
@@AladdinPersson Do you have a video on finding the Mean and Standard Deviation of different Images? I'm not sure how to do that
@AladdinPersson 4 years ago
@@joshlazor6208 I don't have a video on that unfortunately, but I think you can work it out yourself if you spend some time on it ;) What you want to do is use torch.mean and torch.std on the input tensors. Try googling these things too, there are a lot of answers on the PyTorch forum and you can also ask questions there. Here's a thread that might be useful: discuss.pytorch.org/t/about-normalization-using-pre-trained-vgg16-networks/23560/6
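For anyone stuck on the same step, here is a minimal sketch of the per-channel computation in plain Python rather than PyTorch. It assumes each image is a list of channels and each channel is a flat list of pixel values already scaled to [0, 1]; the function name `channel_mean_std` and the toy data are made up for illustration. With real tensors you would accumulate the same quantities with tensor operations such as torch.mean and torch.std.

```python
import math

def channel_mean_std(images):
    """Return per-channel (means, stds) over a whole dataset.

    `images` is an iterable of images; each image is a list of channels,
    and each channel is a flat list of pixel values in [0, 1].
    """
    sums, sq_sums, count = None, None, 0
    for image in images:
        if sums is None:
            sums = [0.0] * len(image)
            sq_sums = [0.0] * len(image)
        for c, channel in enumerate(image):
            sums[c] += sum(channel)
            sq_sums[c] += sum(v * v for v in channel)
        count += len(image[0])  # pixels per channel in this image
    means = [s / count for s in sums]
    # Population std: sqrt(E[x^2] - E[x]^2), clamped at 0 against rounding.
    stds = [
        math.sqrt(max(sq / count - m * m, 0.0))
        for sq, m in zip(sq_sums, means)
    ]
    return means, stds

# Two tiny 3-channel "images" with 2 pixels per channel (toy data).
toy = [
    [[0.0, 1.0], [0.5, 0.5], [0.2, 0.4]],
    [[1.0, 0.0], [0.5, 0.5], [0.6, 0.8]],
]
means, stds = channel_mean_std(toy)
```

The resulting `means` and `stds` are the values you would hand to Normalize. Note that this sketch computes the population standard deviation, while torch.std applies Bessel's correction by default; for a dataset with many pixels the difference is negligible.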
@akarshrastogi3682 3 years ago
Hi, an important query: if we don't run the loop that saves more images to a folder, our images will essentially be "replaced" with changed versions, and not "added" to the list of already existing images, right? When we feed the transforms list to a DataLoader, does the number of images to train on increase manyfold, or does it stay the same, but with multiple changes applied to the same images according to those probabilities?
@zhitongchen7549 5 months ago
A question I ran into: for an image segmentation task, the labels are images as well. If we apply random augmentation to the images, it is not really possible to get the corresponding labels transformed into the same layout, right? With a random crop, for example, we cannot make sure that the label and the image are cropped at the same place.
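One common way around this is to sample the random parameters once and apply them to both the image and its mask. Below is a minimal sketch of that idea in plain Python, assuming both inputs are 2-D lists of the same size; the helper name `random_crop_pair` and the toy data are made up for illustration. In torchvision the same pattern is usually written by drawing the crop parameters once (for example with `transforms.RandomCrop.get_params`) and applying them to image and mask via `torchvision.transforms.functional`.

```python
import random

def random_crop_pair(image, mask, crop_h, crop_w, rng=random):
    """Crop `image` and `mask` (2-D lists of equal size) at the same random spot."""
    height, width = len(image), len(image[0])
    # Sample the crop offset once, then reuse it for both inputs.
    top = rng.randint(0, height - crop_h)
    left = rng.randint(0, width - crop_w)

    def crop(grid):
        return [row[left:left + crop_w] for row in grid[top:top + crop_h]]

    return crop(image), crop(mask)

# Toy 4x4 image; the mask marks pixels whose value is >= 10.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
mask = [[int(v >= 10) for v in row] for row in image]

img_crop, mask_crop = random_crop_pair(image, mask, 2, 2, rng=random.Random(0))
```

Because both crops share `top` and `left`, every pixel in `mask_crop` still labels the pixel at the same position in `img_crop`.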
@holandaraf 4 years ago
Hi Aladdin, great video. I am having trouble understanding something related to data augmentation itself. For example, in your video, you do something like this: suppose you have 20 images and you apply all of those transforms. You still end up with 20 images, but they are all rotated, grayscaled etc. Wouldn't it work better if you had your 20 original images + the new transformed images? You would have more data and more generalization, I suppose. Not sure if I was able to make myself clear, sorry. Thanks in advance!
@BeansGiveMeGas 3 years ago
scroll down to the last comment on this page, the answer is in detail
@tanmay8639 3 years ago
Can someone tell this?
@muhammedjaabir2609 2 years ago
@@tanmay8639 Imagine you have 256 images and your batch size is 16. In the first epoch, the first batch is fed into the model; say it gets the original images for those 16 samples through the __getitem__() method. After all the iterations finish and the second epoch starts, it fetches those 16 images again. Notice that there's a `p=` for the augmentations, so this time it might augment the images. So basically you are feeding new images to the model.
@tanmay8639 2 years ago
@@muhammedjaabir2609 Hi, so by this you mean the more epochs, the more variety of input images the model sees?
@caot681 3 years ago
I followed this video and the previous video "How to build custom Datasets for Text in pytorch" and noticed this: if we use transforms this way and then use random_split to create the train set and validation set, the validation set gets the same transforms as the train set (which is not what I want). I think the validation set should only get something like Resize and Normalize, but not RandomHorizontalFlip, RandomRotation... How can I apply different transforms to the train set and the validation set for the custom Datasets in your previous video "How to build custom Datasets for Text in pytorch"?
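One approach that is often suggested is to create the underlying dataset without any transform, split it, and then wrap each split in a small class that applies its own transform. The sketch below shows the idea in plain Python; `TransformedSubset` is a hypothetical helper (not part of PyTorch), and the lists and lambdas are stand-ins for the two splits and the two transform pipelines. The background is that random_split returns subsets that read from the same underlying dataset, so a transform set on that dataset applies to both.

```python
class TransformedSubset:
    """Wrap a subset and apply its own transform on access.

    Assumes the wrapped subset yields (sample, label) pairs and that the
    underlying dataset was created WITHOUT a transform.
    """

    def __init__(self, subset, transform=None):
        self.subset = subset
        self.transform = transform

    def __len__(self):
        return len(self.subset)

    def __getitem__(self, index):
        sample, label = self.subset[index]
        if self.transform is not None:
            sample = self.transform(sample)
        return sample, label

# Stand-ins for the two parts returned by a split.
train_part = [(1, "a"), (2, "b"), (3, "c")]
val_part = [(4, "d")]

# Stand-ins for an augmenting pipeline and a plain (resize/normalize only) pipeline.
train_set = TransformedSubset(train_part, transform=lambda x: x * 10)
val_set = TransformedSubset(val_part, transform=lambda x: x + 0)
```

Since the wrapper implements `__len__` and `__getitem__`, it should be usable wherever a map-style dataset is expected, with each split seeing only its own transforms.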
@actzful 4 years ago
Hello, thanks for the awesome video. I have one question: when using the DataLoader, how is the transformation being done? For example, say you have 2 transformations and 10 images. Does the DataLoader turn that into 30 images (including the original images), or not?
@AladdinPersson 4 years ago
Those are great questions, I'll try my best to explain. Each image you pass through gets mapped to one image that has been transformed. Using several transforms therefore doesn't extend the data; rather, over a couple of epochs the dataset gets augmented with a lot of variants of the passed images. When using as many transforms with probabilities as I did in the video, it's unlikely the model will ever see the original images; for each pass it will see new images that are just distorted. Effectively it augments the data if you pass an image many, many times, but over a single pass there's just a single variant of every image. Hopefully that was clear, it can be a bit confusing. Edit: Also, if you have several transformations (like two, as you mentioned), the image will pass through both of them, and the resulting image is what you get when both transformations have been applied.
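To make this concrete, here is a minimal plain-Python sketch of a dataset that applies a random transform on access. The class `AugmentedDataset`, the `jitter` function, and the numbers standing in for images are all made up for illustration; this is not how torchvision is implemented, only the same pattern in miniature.

```python
import random

class AugmentedDataset:
    """Apply a (random) transform each time a sample is fetched."""

    def __init__(self, samples, transform):
        self.samples = samples
        self.transform = transform

    def __len__(self):
        # The transform never changes how many samples there are.
        return len(self.samples)

    def __getitem__(self, index):
        # A fresh random variant is produced on every access.
        return self.transform(self.samples[index])

rng = random.Random(0)

def jitter(x):
    """Stand-in for a random augmentation: shift the value slightly."""
    return x + rng.uniform(-0.1, 0.1)

dataset = AugmentedDataset([1.0, 2.0, 3.0], jitter)
epoch_1 = [dataset[i] for i in range(len(dataset))]
epoch_2 = [dataset[i] for i in range(len(dataset))]
```

Both epochs contain exactly three samples, but the values differ between them: the dataset size stays fixed while the variants change from pass to pass.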
@actzful 4 years ago
@@AladdinPersson Thanks for the great explanation! Another question I have is: when you pass ToTensor() or Normalize in Compose, does the data get transformed immediately, or only when you iterate through epochs with the DataLoader? By "immediately" I mean that the resulting dataset would already be tensors and normalized in the first iteration of training. I guess I'm not quite clear on whether those transformations are applied sequentially or randomly. For example, with Compose([Resize, RandomCrop, ColorJitter, ToTensor, Normalize]): in epoch 1 it would train with Resize, in epoch 2 with RandomCrop, and so on. Does it work this way or differently? Is the purpose of the transform argument in ImageFolder essentially just to increase the number of data samples, or can it be used as a plain transformation of the data as well? Meaning, if the training data comes in all shapes and sizes and I want to Resize them all to 224 and turn them into Grayscale(3), will the resulting dataset always be 224 AND Grayscale(3), or will it sometimes be 224 with Grayscale(3) and sometimes without Grayscale(3)?
@AladdinPersson 4 years ago
Yeah, when you send in Compose([transform1, transform2, ...]) it will perform each transform sequentially. If you for example use two transforms, the first being Resize to 32x32 and the second converting to Grayscale, the resulting image will always be 32x32 grayscale. That is why you usually use Random transforms (like we did in the video), so that for each pass the images will (probably) differ from each other, thus augmenting the data over several epochs.
@mauriciog8158 2 years ago
I have a question: after the transformations are done, do the labels of the original images automatically carry over to the new data, or do I have to do something else to assign labels to the new images?
@RahulKumar-mh4bk 4 years ago
Thanks for the great video! I was wondering if we could make small cropped images/patches beforehand and pass them through the network. I am a complete beginner, could you provide some references or explain how I could make those patches? Also, in data augmentation, can we specify the number of output images after the transform?
@AladdinPersson 4 years ago
I'm not sure what exactly you want to do. Do you want to crop them and add them to the dataset? Why do you want to do it that way? In data augmentation the transforms are applied sequentially, and there is no way to say exactly how many augmented images there will be.
@nikolayandcards 4 years ago
What comes first: labeling or augmentation? For example, I can't find a dataset for a specific task and I have to make my own, but I am too lazy to label 4000 images manually. I thought that I could label only about 50 images and then use data augmentation to turn them into 250 in total (images + labels/segmentation masks). Is this correct, or do I have to perform data augmentation (from 50 to 250) first and then label all 250 images manually?
@AladdinPersson 4 years ago
Yea if you label 50 images use data augmentation (and a lot of it, check out: arxiv.org/abs/1909.13719) then you'll effectively get a much larger dataset. I would recommend using a pretrained network on ImageNet or something and you could probably achieve some decent results. Let me know how it goes for you
@jonesbbq307 4 years ago
How does RandomGrayscale work? If your input images are mixed with 3-channels and 1-channel how do you write your model to account for that?
@AladdinPersson 4 years ago
It's going to repeat the single grayscale channel over the RGB channels, such that R==G==B. You will still have 3 channels, but all of them will be identical.
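As a rough per-pixel illustration of that, here is a plain-Python sketch. The helper `to_gray_3ch` is hypothetical, and the weights are the common ITU-R 601 luma weights (the ones Pillow documents for its "L" conversion); the exact rounding torchvision uses may differ.

```python
def to_gray_3ch(pixel):
    """Convert one (R, G, B) pixel to grayscale, keeping 3 channels."""
    r, g, b = pixel
    # Weighted sum of the color channels (ITU-R 601 luma weights, assumed here).
    luma = round(0.299 * r + 0.587 * g + 0.114 * b)
    # Repeat the single gray value across all three channels.
    return (luma, luma, luma)

gray = to_gray_3ch((255, 0, 0))  # a pure red pixel
```

The output still has three channels, so the model's first layer does not need to change; the channels just carry identical values.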
@jonesbbq307 4 years ago
Aladdin Persson I see. Also I am having trouble figuring out how to save the images into different folders.
@skyli3945 4 years ago
Isn't the normalization already included in the ToTensor() function? I learnt from a Udemy course that ToTensor already handles transforming images to tensors (with the color channel first, whereas the image is originally stored with the color channel last) and normalizes them. Plus, I don't think we need ToPILImage() to start with for the image transformations to work?
@AladdinPersson 4 years ago
You're right that ToTensor also does some normalization. ToTensor divides by 255, and in that way puts every value in the range [0, 1]. We want the mean to be 0 and the standard deviation to be 1, so we do a bit of extra normalization. I'm quite sure we need ToPILImage; if I remember correctly we will get an error otherwise.
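The two steps are easy to write out for a single channel. The sketch below uses plain Python; the function names are made up, and `mean=0.5, std=0.5` are placeholder values chosen only for illustration, not statistics of any real dataset.

```python
def to_unit_range(pixel):
    """Scaling step: map a uint8 pixel value (0..255) to [0, 1]."""
    return pixel / 255

def normalize(value, mean, std):
    """Normalization step: subtract the channel mean, divide by the channel std."""
    return (value - mean) / std

raw = [0, 128, 255]
scaled = [to_unit_range(p) for p in raw]
# Placeholder statistics; with the dataset's real per-channel mean and std
# the result would have mean 0 and standard deviation 1 over the dataset.
normalized = [normalize(v, mean=0.5, std=0.5) for v in scaled]
```

With these placeholder values the output lands in [-1, 1]. Getting exactly mean 0 and standard deviation 1 requires the mean and std computed from your own data.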
@katyhessni 5 months ago
thanks
@hizircanbayram9898 4 years ago
Does transforms.Compose apply the transforms in the order they are listed? Thank you
@AladdinPersson 4 years ago
transforms.Compose is just there so that we can apply all the transformations we add to the list. Specifically, all of these will be applied sequentially, so if we have transforms.Compose([ transform1, transform2, ]) then transform1 will be applied first, then transform2
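The behaviour described here fits in a few lines. Below is a minimal plain-Python stand-in, not torchvision's actual class: `SimpleCompose`, `add_one`, and `double` are made-up names, and integers play the role of images.

```python
class SimpleCompose:
    """Apply a list of callables one after another, in list order."""

    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, x):
        for transform in self.transforms:
            x = transform(x)  # output of one step is input to the next
        return x

def add_one(x):
    return x + 1

def double(x):
    return x * 2

pipeline_a = SimpleCompose([add_one, double])  # computes (x + 1) * 2
pipeline_b = SimpleCompose([double, add_one])  # computes x * 2 + 1

result_a = pipeline_a(3)
result_b = pipeline_b(3)
```

Swapping the order changes the result, which is why the order of the list matters (for instance, transforms that expect a PIL image have to come before the conversion to a tensor).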
@mahussain1 4 years ago
Hi Aladdin, I have a question: does applying a random transformation (flip, crop, rotate, etc.) with p=1 mean it is applied to all tensors/PIL images that are passed? Or is it that with p=1 any randomly chosen PIL image/tensor will be transformed because it is now selected, while the skipped tensors/PIL images are preserved as they were before? Please help me out. Thanks in advance.
@AladdinPersson 4 years ago
If you set p=1 it's applied to all images that are getting passed/loaded
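A small plain-Python sketch of how such a probability typically works: one independent random draw per call, i.e. per image. The class `RandomFlipSketch` is hypothetical, and a list stands in for a row of pixels; it is not torchvision's implementation.

```python
import random

class RandomFlipSketch:
    """Reverse the input with probability p, decided separately on every call."""

    def __init__(self, p=0.5, rng=None):
        self.p = p
        self.rng = rng or random.Random()

    def __call__(self, row):
        # random() returns a value in [0, 1), so p=1.0 always flips
        # and p=0.0 never does.
        if self.rng.random() < self.p:
            return row[::-1]
        return row

row = [1, 2, 3]
always = RandomFlipSketch(p=1.0)
never = RandomFlipSketch(p=0.0)

flipped = [always(row) for _ in range(5)]
untouched = [never(row) for _ in range(5)]
```

With p=1.0 every one of the five calls returns the reversed row; nothing is "selected" or "skipped" ahead of time, each image simply gets its own coin flip.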
@mahussain1 4 years ago
@@AladdinPersson thank you so much!
@nerdygeek7425 4 years ago
@Aladdin Persson I have a query. I am working on a custom dataset of images with variable sizes. I tried to apply transforms.Resize(224,224) but got an error saying "Unknown resampling filter (224)". What has happened and how can I fix it? Thanks in advance
@nerdygeek7425 4 years ago
Lol. Figured it out. I didn't put parentheses around the size. transforms.Resize((224,224)) is the correct call. Previously the second 224 was taken as the "interpolation" parameter, which caused the resampling error.
@AladdinPersson 4 years ago
@@nerdygeek7425 Hehe was just about to say that :)
@zunwang1687 4 years ago
Hello, I'm a little confused about the normalization process. In linear regression, for each feature, we subtract its mean and divide by its std. With images, every image is an observation, so I think we have 3*224*224 features. So why don't we center and normalize the data for each pixel of each channel, instead of only doing this per channel?
@AladdinPersson 4 years ago
That's an interesting perspective! I'm not entirely sure, but just thinking about it I can imagine that if you normalized every pixel separately, that could destroy the correlation between different pixel values and hence ruin the structure of the original image. By subtracting the same value over the entire image, the shape of the pixel value distribution stays the same, and hence the structure and the main contents of the original image remain, although the image may be altered in brightness/color etc.
@zunwang1687 4 years ago
@@AladdinPersson Wow that's an awesome interpretation, thanks!
@hoangdz6558 4 years ago
I have strictly followed your videos (both customDataset and Transform) but my code could not generate any photos. Can you explain why? Thanks
@AladdinPersson 4 years ago
Perhaps you could share your code and in that way I could easier see what could've gone wrong. Actually I think it would be best if you could do a post on pytorch forum: discuss.pytorch.org/ And link the post and I can respond there (it's much easier to read code on there)
@narayanareddybattula5465 4 years ago
Did u check your folder structure like this ==> ./train/class_name/image_*.jpg
@venkatesanr9455 4 years ago
Great explanation and informative content. Any application with end-to-end processing, such as NLP tasks or others, would help a lot. Would you like to share your future plans?
@AladdinPersson 4 years ago
Thanks I appreciate you saying that, my recent videos have been towards NLP. I've done a couple videos on torchtext, is there anything more specifically you're looking for with nlp preprocessing?
@venkatesanr9455 4 years ago
@@AladdinPersson Yes, I have seen those and they are good to learn from. Do you have plans to do NLP projects like named entity recognition, a chatbot, or anything else?
@AladdinPersson 4 years ago
Yeah for sure, not sure in what order I'll be doing them as I have a few other concepts I want to explore also, like image captioning etc
@rajukurapati1639 4 years ago
Why isn't this playlist in a sequential order?
@AladdinPersson 4 years ago
What do you mean? Why it's not PyTorch tutorial 1, 2, etc.? I made the videos to be stand-alone, so some videos might build on another one, but they don't necessarily have to be watched in any particular order
@rajukurapati1639 4 years ago
@@AladdinPersson You were mentioning that "In the next video, we will be using this and this", so a little confused.
@AladdinPersson 4 years ago
@@rajukurapati1639 I understand, that is odd and was a mistake on my part. Perhaps I was confused at the time over how to structure the videos