PyTorch Datasets and DataLoaders - Training Set Exploration for Deep Learning and AI

Views: 61,893

deeplizard

Days ago

Comments: 132

@deeplizard 6 years ago

Check out the corresponding blog and other resources for this video at: deeplizard.com/learn/video/mUueSPmcOBc

@sulemanrasheed1634 5 years ago

The reason for using plt.imshow(np.transpose(grid, (1,2,0))): for a color image, plt.imshow expects the dimensions in the form [height, width, channels], while PyTorch uses [channels, height, width]. For compatibility we have to reorder the PyTorch dimensions so that the channels come last. The standard axis order of an array is (axis0, axis1, axis2), so we have to convert (0,1,2) to (1,2,0) to make it compatible with imshow.

@7WhiteSword 5 years ago

Thank you for your explanation, i was confused at first, about to go to the documentation, but then i saw your comment :)

@heller4196 5 years ago

just use grid.permute(1, 2, 0) instead of np.transpose
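As a minimal sketch of the axis reordering discussed in this thread, the block below uses a zero-filled NumPy array as a hypothetical stand-in for the image grid (the shape is an assumption chosen for illustration) and moves the channel axis to the end.

```python
import numpy as np

# Hypothetical stand-in for a grid of images in PyTorch's layout:
# (channels, height, width).
grid = np.zeros((3, 32, 302), dtype=np.float32)

# New axis order: old axis 1 (height), old axis 2 (width), old axis 0 (channels).
hwc = np.transpose(grid, (1, 2, 0))

print(grid.shape)  # (3, 32, 302)
print(hwc.shape)   # (32, 302, 3)
```

For a PyTorch tensor, grid.permute(1, 2, 0) performs the same reordering without going through NumPy.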

@friday1015 4 years ago

In plt.imshow(np.transpose(grid, (1,2,0))), why is there a color channel? Didn't we squeeze it out earlier on the single image sample?

@fahadmuntasir2336 4 years ago

Why does print(next(iter(train_set))) always result in an image with label 9? Printing it several times gives the same image.

@sulemanrasheed1634 4 years ago

@fahadmuntasir2336 Make sure shuffle=True is set on the DataLoader; iterating train_set directly always starts at the first sample.
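The behaviour asked about here can be sketched with a toy dataset. The tensors and label values below are hypothetical stand-ins (not Fashion-MNIST); the point is only that a fresh iterator over a dataset starts at index 0, while the shuffle option belongs to the DataLoader.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical toy data: 6 blank "images" with labels 9, 0, 1, 2, 3, 4.
images = torch.zeros(6, 1, 28, 28)
labels = torch.tensor([9, 0, 1, 2, 3, 4])
train_set = TensorDataset(images, labels)

# Each call to iter() creates a new iterator that starts at index 0,
# so next(iter(train_set)) returns the same sample every time.
first_labels = [next(iter(train_set))[1].item() for _ in range(3)]
print(first_labels)  # [9, 9, 9]

# Without shuffling, the first batch is simply the first samples.
unshuffled = DataLoader(train_set, batch_size=3, shuffle=False)
_, batch_labels = next(iter(unshuffled))
print(batch_labels.tolist())  # [9, 0, 1]

# With shuffle=True the order is re-drawn for every new iterator,
# so the labels in the first batch vary between runs.
shuffled = DataLoader(train_set, batch_size=3, shuffle=True)
_, shuffled_labels = next(iter(shuffled))
print(shuffled_labels.tolist())
```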

@sihanchen9099 5 years ago

This series is awesome so far. I don't know why there aren't more people watching it.

@brucemurdock5358 1 year ago

YT algorithm I guess. This is by far the best playlist honestly

@simoneparvizi775 2 years ago

OK ok ok ok. Wait a sec. WHY THE FUCK does this extremely well made video (deep explanations, step by step, great audio, great references to Jeremy Howard and to the paper briefly mentioned) have only 50k views...... jesus christ what an amazing video. Ty guys for the awesome work

@deeplizard 2 years ago

😆😅 Thank you, Simone! We're glad to hear that you've found value in the content and have appreciation for the style and level of detail for which we cover it.

@brainify6172 4 years ago

Best Explanation bro!!!. Video quality and the way you show typing and then you explain is also just awesome

@zhengguanwang4337 2 years ago

perfect!!!!!

@JimmyCheng 6 years ago

volume can be louder. Anxiously waiting for the next episode! Ending is magnificent!

@deeplizard 6 years ago

It's in the works! Thank you for mentioning the volume. 🙏 Very helpful! We are working on it.

@dippatel1739 6 years ago

Please keep making videos. Your content is great.

@deeplizard 6 years ago

Thank you dip! Will have more coming!

@drevolan 6 years ago

This video would've been amazingly useful last week when I struggled implementing my own DataLoaders/Sets :D There's not much information about this online so it's greatly appreciated. But please, normalize your audio, I had to crank my volume to be able to understand what was being said. But nonetheless good job, these videos are really appreciated.

@deeplizard 6 years ago

Hey Kesdan - Thank you. You are welcome! I'll double check the audio. Please let me know if you see an issue with it in the future.

@sundarsanthanam6147 4 years ago

I don't understand the pyplot configuration at 11:16 used to display the grid of images

@deeplizard 4 years ago

Hey Sundar - The torchvision.utils.make_grid function transforms the batch of ten images into a grid of images. The grid is no different from any other image we might plot. To understand the nature of the grid, try inspecting the shape of each tensor:
images.shape
grid.shape
grid.permute(1,2,0).shape
Note that the make_grid function pads the original images by 2. See the documentation here (padding): pytorch.org/docs/stable/_modules/torchvision/utils.html#make_grid
I hope this helps!
Chris

@sundarsanthanam6147 4 years ago

deeplizard thank you so much

@malamals 4 years ago

This video series gave me so much insight into deep learning. Thank you Deeplizard team for this amazing work. Can someone share the video of the paper being used here?

@deeplizard 4 years ago

Video of the paper?

@yeahorightbro 6 years ago

Best on YouTube. Well done and thank you. Wondering though if you'll do one for custom text datasets?

@deeplizard 6 years ago

Hey Daniel - You are welcome! I will put custom datasets on the list. We'll likely use a custom dataset in the next project.

@Vikram-wx4hg 4 years ago

For label.shape I get following error: 'int' object has no attribute 'shape'

@deeplizard 4 years ago

Hey Vikram - This is due to a change in the api. I've provided details for this update on the blog. See the "Updates" section here: deeplizard.com/learn/video/mUueSPmcOBc Chris

@Vikram-wx4hg 4 years ago

@deeplizard Thanks a lot for your reply, Chris. I sort of suspected that there might have been an API update.

@panchajanya91 3 years ago

while I was doing on my own, label was not a tensor rather it was an int object.

@hardikchawla4966 5 years ago

excellent explanation!!

@deeplizard 5 years ago

Thanks Hardik!

@aryamaansaha2951 4 years ago

In np.transpose(grid, (1,2,0)), what do 1, 2, 0 represent?

@daudasaniabdullahi4225 4 years ago

plt.imshow() requires the image to be in the format (height, width, channel), but PyTorch uses the format (channel, height, width). So you reorder the axes of the image. Remember the indexing: channel is 0, height is 1, width is 2. That's why you get (1, 2, 0). Thanks

@aryamaansaha2951 4 years ago

@daudasaniabdullahi4225 thanks!

@zhengguanwang4337 2 years ago

Do you have tutorial of hyperparameters for RNN.? That would be great!!!!

@yunhuaji3038 5 years ago

Is that how fast and accurately you normally type code, or is it just a sped-up replay?

@deeplizard 5 years ago

I can type fast but not that fast! 🤣 Yes. Sped-up replay.

@chronicfantastic 6 years ago

That's an interesting point about the effectiveness of oversampling - we had a similar issue with a very unbalanced dataset at work and the sklearn weight-parameter didn't seem to make much difference. It's nice to see some research on it.

@ashutoshshah864 3 years ago

why is the len(batch) = 2? there are 10 images with 10 labels in a batch, right? A bit confused here. I am thinking, for some reason, that a batch would be a list of 10 tuples: batch = ([image, label], [image, label],...,[10th image, label])

@SonGoku-lc1sb 5 years ago

image, label = sample
The image is a tensor of size [1, 28, 28], whereas the label is just the integer value 9 and not tensor(9). Why so?

@deeplizard 5 years ago

Hey Son Goku - This change was introduced in the torchvision version 0.2.2. Double check that this is your version like so: torchvision.__version__ You can see this change listed in the release notes here: github.com/pytorch/vision/releases Have a look (search for Cast MNIST) and you'll see it. In my opinion, they should have not made this change. They did it because it fixes another issue. Anyway, the dataloader, which is what we work with mostly still returns a tensor I think. Can you verify this? Thanks and hope this helps!

@deeplizard 5 years ago

Two underscores on each side (torchvision.__version__). KZbin is removing one of them.

@SonGoku-lc1sb 5 years ago

@deeplizard Yes, the DataLoader returns an object that, when iterated, gives you a batch: a list of tensors.
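The difference between a single sample and a batch described in this thread can be sketched with a small custom dataset. ToyFashionSet is a hypothetical stand-in that returns an image tensor and a plain int label, which is the behaviour reported above for newer torchvision versions. The DataLoader's default collation then stacks those ints into a tensor.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyFashionSet(Dataset):
    """Hypothetical stand-in: returns (image tensor, int label) pairs."""

    def __len__(self):
        return 20

    def __getitem__(self, index):
        return torch.zeros(1, 28, 28), index % 10


train_set = ToyFashionSet()

# A single sample: the label is a plain Python int, so it has no .shape.
image, label = train_set[0]
print(image.shape)   # torch.Size([1, 28, 28])
print(type(label))   # <class 'int'>

# A batch: a list of two tensors, [images, labels], not a list of ten pairs.
loader = DataLoader(train_set, batch_size=10)
batch = next(iter(loader))
images, labels = batch
print(len(batch))    # 2
print(images.shape)  # torch.Size([10, 1, 28, 28])
print(labels.shape)  # torch.Size([10])
```

If a shape is needed for a single label, wrapping it as torch.tensor(label) gives a zero-dimensional tensor.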

@vgranjinidevi 1 year ago

I have a doubt. I have just started to learn deep learning, but I see that when I start coding, in some places I have to use numpy, sometimes pandas, sometimes PyTorch, sometimes Matplotlib or seaborn, and in other places scikit-learn. It is like I start at one place and travel here and there, trying to learn it all. Is the journey like this? Or is there a streamlined way or course that teaches you these?

@deeplizard 1 year ago

Yes. It's completely normal. The journey is like this. Each tool excels at different tasks, almost like the gears on a bike helping you navigate different terrains. You'll find that as you gain experience, you'll get better at knowing when to go deep on a particular tool and when to just get the basics and move on. Remember, there's always another layer to peel back in this field, so you'll never run out of opportunities to go deeper. Hang in there, practice, and it will all click into place. Happy learning! 📚💡

@ratkush 6 years ago

Getting TypeError: object() takes no parameters when running next(iter(train_set))

@deeplizard 6 years ago

What happens if you try this instead? train_set[0]

@sytekd00d 6 years ago

I am getting the same error

@sytekd00d 6 years ago

I figured it out.... Check this line in your code: transform= transforms.Compose([transforms.ToTensor(), ]) Make sure you add parenthesis after 'ToTensor'. It should be 'ToTensor()'

@liucosette6091 2 years ago

@sytekd00d Works for me! Thanks! Could you please tell me why this error happened?
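One plausible explanation for the error, sketched in plain Python: a Compose-style pipeline calls each entry as t(img). If the entry is the class itself rather than an instance, that call tries to construct the class with the image as an argument and fails. ToTensorLike and ComposeLike below are hypothetical stand-ins written for illustration, not the torchvision classes.

```python
class ToTensorLike:
    """Hypothetical stand-in for a transform: instances are callable."""

    def __call__(self, img):
        return [float(p) for p in img]


class ComposeLike:
    """Hypothetical stand-in for a pipeline that applies each transform in turn."""

    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, img):
        for t in self.transforms:
            img = t(img)
        return img


good = ComposeLike([ToTensorLike()])  # an instance: t(img) runs __call__
bad = ComposeLike([ToTensorLike])     # the class: t(img) tries to build an object

result = good([1, 2, 3])
print(result)  # [1.0, 2.0, 3.0]

try:
    bad([1, 2, 3])
    error = None
except TypeError as exc:
    # The class defines no __init__, so passing it an argument is rejected.
    error = exc
print(type(error).__name__)  # TypeError
```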

@rameshthamizhselvan2458 5 years ago

Why are we transposing the grid in the image show function? Matplotlib accepts numpy arrays, right? Why don't we pass grid.numpy() instead of transposing? Correct me if I'm wrong.

@deeplizard 5 years ago

Hey Ramesh - The imshow function accepts (H,W,C) and PyTorch tensors are shaped like this (C,H,W). This is why we re-arrange the data. Note that using permute() is more straightforward. The site was updated with this: deeplizard.com/learn/video/mUueSPmcOBc

@xiangli1133 2 years ago

can you do a series like this for the custom dataset, thanks?

@sinaasadiyan 6 years ago

Hi thanks for your videos. which version of pytorch are you using in these videos?

@deeplizard 6 years ago

Hey Sina - You are welcome! We are using v0.4.1

@TheAnubhav27 5 years ago

How do i create batches for custom datasets(non-image data) that are not part of the torchvision package? Is there any resource to learn that?

@felipeguimaraes7565 5 years ago

What is the purpose of plt.figure?

@SamerSallam92 6 years ago

Thank you very much for your great series. I guess in the blog article you forgot to add image, label = sample

@deeplizard 6 years ago

Hey Samer - You are welcome! Sometimes the blog and the video will differ. Thanks for pointing that out.

@vitoroliveira4290 5 years ago

True, and I didn't get how the label's shape could be printed, since it's an "int". My IDE returns this error. But it's really not important anyway.

@AllenFangs 5 years ago

@vitoroliveira4290 Any solution for it? Same problem :(

@vitoroliveira4290 5 years ago

@AllenFangs Not really, sorry

@AllenFangs 5 years ago

@vitoroliveira4290 Just check the comments from deeplizard. It comes from a bug fix in PyTorch. I think it's not a big problem

@picumtg5631 2 years ago

Note that the new root should be ./data as it was changed in the fashionmnist

@jayachandra677 4 years ago

Use train_set.targets instead of train_set.train_labels if you get an error

@deeplizard 4 years ago

Hey Jay - Thanks for the information. This one was also posted in the updates section here: deeplizard.com/learn/video/jexkKugTg04

@datarachit 5 years ago

what is the logic behind transpose?

@deeplizard 5 years ago

The axis locations inside tensors are not standardized across libraries. Some libraries will switch these around. For example placing the channels at the last axis position. This is the case for the imshow function, so we have to move the axes around. Have a look at the top of the doc here (X : array-like or PIL image): matplotlib.org/api/_as_gen/matplotlib.pyplot.imshow.html

@xdcedar 6 years ago

How can you type so fast and so precisely?! Or is the truth that I am actually typing too slowly..

@careymain3036 5 years ago

I'm using my own image data with a PyTorch dataloader. I'm getting the error "cannot import name 'read_data_sets'". Have you seen this before? All I could find on Stack Overflow was: "if you have own file with name dataloader.py then it imports your file instead of module and it can't find read_data_sets in your file", but no explanation of how to fix that. Any idea?

@deeplizard 5 years ago

Hi Carey - Try changing the name of your dataloader.py file.

@careymain3036 5 years ago

@deeplizard I am running this in Jupyter. All three classes (the dataloader, class MRDataset(data.Dataset), the model, and train) are in the notebook, so I don't have a .py file for this project, just the notebook. The above was the only answer I could find on Stack Overflow

@deeplizard 5 years ago

What code is throwing the error and what is the full error?

@careymain3036 5 years ago

@deeplizard github.com/maincarey/ML/blob/master/MRI.ipynb

@careymain3036 5 years ago

ImportError Traceback (most recent call last)
in ()
     16 from tensorboardX import SummaryWriter
     17
---> 18 from dataloader import MRIDataset
     19 from dataloader import read_data_sets
     20 import model
/usr/local/lib/python3.6/dist-packages/dataloader/__init__.py in ()
----> 1 from dataloader import read_data_sets
ImportError: cannot import name 'read_data_sets'

@CoolDude911 5 years ago

I don't know if someone could clarify but I would worry that over-sampling an uncommon class that is actually uncommon in real samples will create a biased model and probably over-fitted to the smaller range of data in the uncommon class.

@deeplizard 5 years ago

Hey Barry - Good question. Let's look to the paper. The paper says the following: "For classical machine learning models it was shown that oversampling can cause overfitting, especially for minority classes [33]. As we repeat small number of examples multiple times, the trained model fits them too well. Thus, according to this prior knowledge undersampling would be a better choice. The results from our experiments do not confirm this conclusion for convolutional neural networks." This can be found in section 4.6: "Generalization of sampling methods". Link: arxiv.org/abs/1710.05381

@CoolDude911 5 years ago

@deeplizard Update: I encountered this kind of problem at work. It turns out that with a uniform common class and a variable uncommon class, you can get higher accuracy on real test data by training on data where the uncommon class has been augmented. Obviously too much augmentation will create a model that is too biased, but "too biased" may depend on the application. Over-fitting the small class is a separate problem. My guess as to why this happens is that a model can get stuck in a local optimum where it makes simple inferences on the common uniform class and has a much harder job learning anything about the variable class.
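For readers who want to try oversampling in PyTorch, one common approach (an assumption here, not necessarily the procedure used in the paper or in the comment above) is to draw samples with replacement using weights inversely proportional to class frequency. The label tensor below is hypothetical toy data with a 90/10 class imbalance.

```python
import torch
from torch.utils.data import WeightedRandomSampler

# Hypothetical imbalanced labels: 90 samples of class 0, 10 of class 1.
labels = torch.tensor([0] * 90 + [1] * 10)

# Weight each sample by the inverse of its class count, so both classes
# carry the same total probability mass.
class_counts = torch.bincount(labels)
sample_weights = 1.0 / class_counts[labels].float()

torch.manual_seed(0)
sampler = WeightedRandomSampler(
    sample_weights, num_samples=len(labels), replacement=True
)

# Indices drawn by the sampler; minority samples are repeated more often.
drawn = labels[list(sampler)]
print(class_counts.tolist())                      # [90, 10]
print(torch.bincount(drawn, minlength=2).tolist())  # roughly balanced, varies
```

The sampler can be passed to a DataLoader through its sampler argument in place of shuffle=True.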

@mamoonanisar6774 5 years ago

Can I use this process to load and train my 3 labeled images(brain tumor) folders ?

@deeplizard 5 years ago

Hi mamoona - The answer is yes. Use torchvision.datasets.ImageFolder() to create your dataset.

@SaimKhan-xj5um 6 years ago

God i love this channel 🤗

@deeplizard 6 years ago

Thank you Saim! 🙏

@lynnliu7520 5 years ago

Why is my label an int instead of a tensor.. D:

@deeplizard 5 years ago

Are you using your own dataset?

@lynnliu7520 5 years ago

@deeplizard No, I followed the steps in the video and used Fashion-MNIST. But the batch labels work fine as a tensor..

@deeplizard 5 years ago

What version of PyTorch are you running?

@lynnliu7520 5 years ago

@deeplizard 1.0.1

@deeplizard 5 years ago

Hey SHU LIU - I finally tracked this down. This change was introduced in the torchvision version 0.2.2. Double check that this is your version like so: torchvision.__version__ You can see this change listed in the release notes here: github.com/pytorch/vision/releases Have a look (search for Cast MNIST) and you'll see it. In my opinion, they should have not made this change. They did it because it fixes another issue. Anyway, the dataloader, which is what we work with mostly still returns a tensor. Thanks for verifying that. Hope this helps!

@HimothyOHooligan 4 years ago

Yo, is that sub-60Hz rumble even necessary during the typing sections? It's like 15dB above the speech level. I'm over here listening on headphones and thinking there's an earthquake happening

@deeplizard 4 years ago

I could remove it. However, what if I told you that the sound is a psychological trick that increases information retention by 50%. Would you then be down? 🧠

@HimothyOHooligan 4 years ago

@deeplizard I would read into that but would still prefer it to be not there or to be much lower in level. Info retention improvements go to zero if I feel like I have to skip through or mute.

@deeplizard 4 years ago

Hey Rudy - I'm jk about the improvement. It is possible, but the improvements would likely affect different people in a spectrum of ways. Thank you for your feedback on the matter. I think the effect is used less later in the course. Also, every video has a corresponding blog, so there are other options for learning. 😃

@willTryAgainTmrw 6 years ago

Waiting for next one....

@deeplizard 6 years ago

Thanks Pratham - Working on it now! Stay tuned!

@SamerSallam92 6 years ago

Also, I guess this line: "With shuffle=True, the first samples in the training set will be returned on the first call to next." should be: "With shuffle=False ..."

@deeplizard 6 years ago

You are correct!

@sunitakakkar8309 4 years ago

Sir, I tried to replicate your code but I am getting stuck when I try to get the shape of the labels. The error reads: AttributeError: 'int' object has no attribute 'shape'. From this I understand that while converting the data to tensors, the photo got converted to a tensor but the label is still an int. Can you please help? I am sharing the link to the Colab workbook: colab.research.google.com/drive/1WoZHmfr8g9prNOo75mGMGfiKiFXoa6IR?usp=sharing

5 years ago

You should look at PyTorch again. Its behaviour has changed a lot.

@deeplizard 5 years ago

Hey Selcuk - Please share details about what has changed. All changes are posted to the blog on the website. You should check there. Any changes are most certainly minor.

@SuperLuckyLad 5 years ago

The little lecture at the end highlights a problem with 'our' wonderful technology... what if you are a vegan taxi driver and you regard your computer as your 'friend'? .... not such bright future then is it?

@deeplizard 5 years ago

Yes. Good point. I take issue with the destination. He said that "the car already knows where your work is". I find it interesting to question whether we'll even have what we now call "work". Whether we currently identify as taxi drivers or programmers. As for the breakfast, the tech should be able to personalize it.

@SuperLuckyLad 5 years ago

@deeplizard .... maybe the tech did personalise it... ran a subroutine, did some deep learning of its own and decided it didn't like Vegans .... lol ..... and hey presto "Psycho Chip" is born.

@deeplizard 5 years ago

lol. 🤣 In all honesty, these are the types of things we'll need to be considering going forward. Thanks for commenting on it with some of your thoughts!

@NairodTheBeast 4 years ago

For some reason the audio from typing in this video bothers me more than other videos

@champnaman 6 years ago

Don't want any of the things that the guy mentioned @12:41

@deeplizard 6 years ago

Autonomous cars? 🚗

@soulfrench 5 years ago

why so many examples about images but not about something else..

@deeplizard 5 years ago

It's the classic example.

@ATULYADAV-jz9er 4 years ago

Hello, you have done excellent and impressive work. I am new to machine learning and was trying to run the code, but I ran into problems. I would be grateful if you could help me. I am trying to run this code and getting:
from model import ft_net, ft_net_dense, PCB
ModuleNotFoundError: No module named 'model'
Code link: github.com/Wanggcong/Spatial-Temporal-Re-identification

@abijithjkamath 3 years ago

This video has 784 likes. Illuminati confirmed.

@ianjiang3762 4 years ago

Too much useless sound effect or video effect in this video

@sdc5574 5 years ago

The explanation is so complex. Kindly take up easy examples. Also, make a video in an Indian accent. Your accent is very hard for an Indian to understand.

@deeplizard 5 years ago

Hi Souhardya - I will work on my Indian accent! In the mean time, you can try using the website. There is a text version of the content. I hope that will help.

@sdc5574 5 years ago

@deeplizard Please make a full project of something in PyCharm, please.

@sdc5574 5 years ago

@deeplizard kzbin.info/www/bejne/eneueZttlN-tgMU Something like this..

@deeplizard 5 years ago

Currently working out the details about which direction we'll go in terms of content. Thank you for the suggestion. Try different IDEs though. It will make you a stronger developer. Maybe do a project straight from the command line. 🤔

@sdc5574 5 years ago

@deeplizard Make a detailed project using PyTorch. It's easier to learn from a project than from separate videos.