PyTorch RNN Tutorial - Name Classification Using A Recurrent Neural Net

98,648 views

Patrick Loeber

Days ago

Comments: 136
@teetanrobotics5363 4 years ago
One of the best PyTorch channels I have ever encountered.
@patloeber 4 years ago
Thanks so much!
@MartyAckerman310 4 years ago
I came for the RNN implementation from fundamentals, and stayed for the training pipeline. Thanks for a great video, it's the clearest explanation of RNNs (in code) I've seen.
@patloeber 4 years ago
Thanks so much! Glad you like it
@sprobertson a month ago
Awesome to see this as a video - I am re-learning PyTorch after a long break and wondered if anyone had covered this tutorial.
@samannaghavi4477 a year ago
A perfect tutorial for understanding RNNs and how they can be implemented. Thank you so much.
@stancode7228 3 years ago
The best RNN video on YouTube! Thx for the effort!
@patloeber 3 years ago
Glad it was helpful!
@jaxx6712 6 months ago
Best video I've seen on RNNs thus far, thanks for the great video and keep up the good work 🎉
@navintiwari 3 years ago
Wow! This tutorial was really helpful for understanding and coding RNNs. Excellent!
@patloeber 3 years ago
Glad you enjoyed it!
@Borey567 5 months ago
Exactly what I needed! Thank you so much!
@terraflops 4 years ago
Can you believe I failed to hit the notification bell for this lovely channel? You rock!
@patloeber 4 years ago
Nice, thanks!
@mraihanafiandi 2 years ago
This is gold. I really hope you can create similar content on transformers.
@EasyCodesforKids 4 years ago
Thank you so much! This is a great tutorial on RNNs. You break down the hidden states really well!
@patloeber 4 years ago
Thank you :) glad you like it :)
@yildirimakbal6723 a year ago
Can a job really be done this well? You're the man.....
@coc2912 a year ago
Your demonstration is easy to understand, thank you!
@stonejiang1162 a year ago
It seems your RNN structure is different from the one in the PyTorch tutorial. In a typical RNN model, a .h2o() should be defined that takes 'hidden' as its input, as opposed to an .i2o() that calculates the output directly from 'combined'.
@Szymon-vp7tn a year ago
I noticed it as well. I don't think it's a correct implementation.
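For readers comparing the two: below is a minimal sketch of both cell variants (layer names i2h/i2o/h2o follow the naming in the video and the official char-RNN tutorial; sizes and details are illustrative, not the exact code from either source).

import torch
import torch.nn as nn

class VideoStyleRNN(nn.Module):
    # the video's variant: output computed straight from [input, hidden]
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.i2h = nn.Linear(input_size + hidden_size, hidden_size)
        self.i2o = nn.Linear(input_size + hidden_size, output_size)
        self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, x, hidden):
        combined = torch.cat((x, hidden), 1)
        hidden = self.i2h(combined)
        output = self.softmax(self.i2o(combined))
        return output, hidden

class TutorialStyleRNN(nn.Module):
    # tutorial-style variant: output computed from the updated hidden state
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.i2h = nn.Linear(input_size + hidden_size, hidden_size)
        self.h2o = nn.Linear(hidden_size, output_size)
        self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, x, hidden):
        combined = torch.cat((x, hidden), 1)
        hidden = torch.tanh(self.i2h(combined))
        output = self.softmax(self.h2o(hidden))
        return output, hidden

Both train; they just wire the readout differently, which is what this thread is pointing out.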
@IAmTheMainCharacter 4 years ago
Your channel really boomed after your freeCodeCamp videos, keep going strong💪
@patloeber 4 years ago
Yes :) I will!
@IAmTheMainCharacter 4 years ago
@@patloeber I am starting a YouTube channel myself; can I connect with you on LinkedIn?
@NehaJoshi-x5n 8 months ago
Excellent! This video is so good - amazing explanation and intuition!
@HieuTran-rt1mv 4 years ago
Could you talk about attention and transformers? Your tutorials are easy to understand.
@xinjin871 3 years ago
I am also interested in these topics!
@rubenpartono 3 years ago
Have you seen the explanations by Yannic Kilcher? I'm not knowledgeable enough to tell what audience Yannic's teaching is intended for, but it seems like the comments are all very positive!
@HieuTran-rt1mv 3 years ago
@@rubenpartono Excuse me, who is Yannic? :)))
@rubenpartono 3 years ago
@@HieuTran-rt1mv He's a recently graduated PhD who posts deep-learning content on his YouTube channel (Yannic Kilcher)! His video "Attention Is All You Need" is his take on discussing the paper. He has a lot of stuff related to transformers. I don't think he does any follow-along code, but he explains his intuitive understanding of cool deep learning papers.
@HieuTran-rt1mv 3 years ago
@@rubenpartono Thank you for the information, I will watch his videos now.
@affahrizain 2 years ago
This is a great explanation and example as well. Love it!
@alioraqsa a year ago
This video is so good!!!!
@user-or7ji5hv8y 4 years ago
This is a high-quality video.
@patloeber 4 years ago
Thanks!
@INGLERAJKAMALRAJENDRA 3 months ago
@4:24, I think video captioning would be a better-suited application for the rightmost architecture here than video classification.
@lifeislarge 3 years ago
Your tutorials are great, I learnt so much.
@ashwinprasad5180 4 years ago
Bro, you are the best. Looking forward to more PyTorch content from you. I would also definitely watch if you started a TensorFlow 2 series 👍
@patloeber 4 years ago
Thanks! Yes, TF will come soon.
@Jfantab a year ago
I noticed you concatenated the input and the hidden state and then fed that into the linear layer; however, in other implementations I see them multiply each by its own weight matrix and then add the results before feeding that through an activation layer. Is there any reason you concatenate the two?
@aakashdusane a year ago
That makes more sense. No idea why he concatenated.
@sprobertson a month ago
Multiplying is "standard", but the beauty of NNs is that you can connect things however you want (or however works best). When making the original tutorial I was experimenting with a bunch of ways to pass the state, and in this case concatenating just converged way faster.
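A side note on why both can work: a single linear layer over the concatenation is mathematically the same computation as multiplying input and hidden by two separate weight matrices and adding, because the layer's weight matrix simply splits column-wise. A quick sanity check (sizes are illustrative):

import torch
import torch.nn as nn

torch.manual_seed(0)
input_size, hidden_size = 57, 128
i2h = nn.Linear(input_size + hidden_size, hidden_size)

x = torch.randn(1, input_size)
h = torch.randn(1, hidden_size)

# one layer applied to the concatenation ...
out_concat = i2h(torch.cat((x, h), 1))

# ... equals two matrix products plus the shared bias
W = i2h.weight
out_split = x @ W[:, :input_size].T + h @ W[:, input_size:].T + i2h.bias

print(torch.allclose(out_concat, out_split, atol=1e-6))  # True

So the real difference between implementations is where the nonlinearity and the output readout sit, not the concatenation itself.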
@pangqiu4854 3 years ago
Thank you for this informative video!
@floriandonhauser2383 3 years ago
Your tutorial does a great job of showcasing RNNs. However, you forgot to mention a train-validate-test split to check whether the model actually generalizes (understands the concept) or just learns the names by heart. Your "testing" uses names that the model was trained on. In an example like this it would be very important to use a train-validate-test split, since without it you will never catch potential overfitting!
@patloeber 3 years ago
Thanks for the feedback, and yes, absolutely correct. The code was heavily inspired by the original PyTorch example, which didn't split the data :( In most of my other tutorials I make sure to do this.
@floriandonhauser2383 3 years ago
@@patloeber That's great :) I guess since RNNs are a more advanced topic, most viewers have hopefully seen another of your videos before, which includes the train-validate-test split :) I've also made mistakes before, like deciding on hyperparameters or the model architecture based on the test set instead of the validation set.
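For anyone who wants to add the split this thread is asking for, a minimal sketch (assuming the category_lines dict loaded in the video; the 80/10/10 ratio is just an example):

import random

# flatten the {language: [names]} dict into labeled pairs
pairs = [(name, lang) for lang, names in category_lines.items() for name in names]
random.shuffle(pairs)

n = len(pairs)
train_set = pairs[: int(0.8 * n)]
val_set = pairs[int(0.8 * n) : int(0.9 * n)]
test_set = pairs[int(0.9 * n) :]

Train only on train_set, tune on val_set, and report accuracy on test_set.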
@TechnGizmos 4 years ago
Based on your train() function, init_hidden() is called every time a new name from the dataset is used as input. Why is it required every time? What would be the consequence of only calling it once (before the training loop) and never again? I thought that the hidden state contains important representations that adjust with every new training example, but here it gets reset on every iteration. My understanding of what the hidden state represents may be flawed.
@patloeber 4 years ago
No, your thinking is good. You can indeed use an RNN that knows about the whole sentence, or even multiple sentences. In this video, to keep it simple, I used an RNN that was trained only word by word, and for each word we looked at one character at a time. So it should learn from the character sequences, and there is no need to store information from other words here.
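In code, that per-name reset looks roughly like this (a sketch of the video's train step; rnn, criterion and optimizer are assumed to be set up as in the video):

def train(line_tensor, category_tensor):
    # fresh hidden state for every name: characters within a name are
    # related, but separate names are independent samples
    hidden = rnn.init_hidden()

    for i in range(line_tensor.size()[0]):      # one character at a time
        output, hidden = rnn(line_tensor[i], hidden)

    loss = criterion(output, category_tensor)   # loss on the last output only
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return output, loss.item()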
@n000d13s 4 years ago
I am glad I subscribed. Understanding a concept is one thing; actually being able to teach it is another thing entirely. 😊
@patloeber 4 years ago
Thanks! Glad you like it :)
@jacksonrolando403 3 years ago
Other than using a for loop over the training data instead of vectorized multiplication, this was an extremely informative and well-made video. Thanks!
@ugestacoolie5998 7 months ago
Hi, how would you implement this using the torch.nn.RNN module?
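One possible way (a sketch, not code from the video): torch.nn.RNN consumes the whole character sequence at once, so the per-character Python loop disappears. Here line_tensor keeps the video's shape of (seq_len, 1, n_letters):

import torch
import torch.nn as nn

class CharRNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.rnn = nn.RNN(input_size, hidden_size)  # tanh nonlinearity by default
        self.fc = nn.Linear(hidden_size, output_size)
        self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, line_tensor):
        # out: (seq_len, batch, hidden); hidden: (num_layers, batch, hidden)
        out, hidden = self.rnn(line_tensor)         # hidden state defaults to zeros
        return self.softmax(self.fc(hidden[-1]))    # classify from the last hidden state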
@habeang304 3 years ago
Thank you. Can I build a model from this to detect whether a given sentence is a question?
@felix9x 2 years ago
I enjoyed this tutorial. I ran the code and added an accuracy calculation; I am getting about 60% accuracy using random data samples to form the training data. Russian has many more names on the list, so it's a little unbalanced (minor nitpick).
@olsay 4 months ago
Could you share the font family used in this tutorial?
@PUNERI_TONY a year ago
Can you tell me how I should get the accuracy/confidence of this model after each output?
@amindhahri2542 a year ago
I have a question: can't we use transformers for this problem? I'm stuck trying to use BERT.
@mohitgupta5000 11 months ago
Great video! However, I have a doubt. Shouldn't you keep the trained weights for the hidden state during inference instead of calling `init_hidden` in the `predict` function? Please clarify.
@mohitgupta5000 11 months ago
My bad, I got it. The weights are in fact the trained weights. It's just that the hidden state starts from zero for every new word, while the weights are shared across all `timesteps`. I am keeping the reply in case someone else makes the same mistake.
@alteshaus3149 3 years ago
Thank you very much for these tutorials. They are really helpful for getting further. Sorry for not being active on Patreon, I am still a student. Please keep going and show us more possibilities in deep learning. Maybe a series on NLP or time series forecasting?
@donfeto7636 2 months ago
Great content bro
@mmk4732 2 years ago
Hello, I'd appreciate your advice: I can't import all_letters from utils.
@sighedsighed8882 4 years ago
Hello. Great content as always. Could you please explain how the following call works? output, next_hidden = rnn(input_tensor, hidden_tensor) - it calls the forward method of the RNN class, but not explicitly.
@patloeber 4 years ago
Using the __call__ method on a PyTorch model will always execute the forward pass.
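In other words (a tiny sketch using the names from the video; nn.Module.__call__ also runs any registered hooks, which is why calling the model object is preferred over calling .forward() directly):

rnn = RNN(N_LETTERS, n_hidden, n_categories)

output, next_hidden = rnn(input_tensor, hidden_tensor)           # goes through __call__, which runs forward
output, next_hidden = rnn.forward(input_tensor, hidden_tensor)   # same computation, but skips hooks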
@VishnuRadhakrishnaPillai 4 years ago
Another awesome video. Thank you
@patloeber 4 years ago
Thanks!
@gonzalopolo2612 a year ago
In the RNN forward method, I think this is not standard. You are computing the output at t, O_t, as a function of the hidden state at t-1 and the input at t. As far as I have seen, the more common way is to compute h_t from h_{t-1} and x_t, and then use this h_t to compute O_t. Is this on purpose? Do you have any source where I can understand this implementation better? Thank you, and great tutorials!
@prabaldutta1935 4 years ago
@Python Engineer please keep posting more projects
@patloeber 4 years ago
Thanks! I definitely want to do more projects.
@stellamn 3 years ago
From your example I don't understand whether you run the sequences batch-wise, e.g. 50 words in one batch: the 1st letter from all words as one input, then the 2nd from all words, etc. That way one would save time. Or is that not possible?
@jjjw516 11 months ago
Why is super(RNN, ...) used in the init of RNN itself? Shouldn't it be super(nn.Module)? Please help me understand.
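For what it's worth, the first argument to super() is the class you are calling from, not the parent; Python finds the parent (nn.Module) from the method resolution order. A minimal illustration:

import torch.nn as nn

class RNN(nn.Module):
    def __init__(self):
        # Python 2-compatible form: name the *current* class and the instance
        super(RNN, self).__init__()
        # in Python 3 this is equivalent to simply:
        # super().__init__()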
@mingzhouzhu4668 3 years ago
This helped me do my homework lol, thanks!!
@wenqinliu6522 4 years ago
Hi, this is a really great tutorial on RNNs in PyTorch!! I have a question about the model: why is there no tanh in it? And why pass the concatenation of input_tensor and hidden_tensor to i2o - shouldn't it just pass the hidden_tensor to i2o? Could you please clarify? Thank you!

class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(RNN, self).__init__()
        self.hidden_size = hidden_size
        self.i2h = nn.Linear(input_size + hidden_size, hidden_size)
        self.tanh = nn.Tanh()
        self.i2o = nn.Linear(hidden_size, output_size)
        self.softmax = nn.LogSoftmax(dim=1)  # (1, 57): second dim

    def forward(self, input_tensor, hidden_tensor):
        combined = torch.cat((input_tensor, hidden_tensor), 1)
        hidden = self.tanh(self.i2h(combined))
        output = self.i2o(hidden)
        output = self.softmax(output)
        return output, hidden

@patloeber 4 years ago
It depends on the application. Yes, sometimes we only pass the hidden state to the next sequence step, but in this case the next step takes the previous hidden state and a new input, so it needs the combined one. And in this simple example I didn't need tanh, but you are correct that it is the default activation for RNNs.
@wenqinliu6522 4 years ago
@@patloeber Thanks!
@billy-cg1qq a year ago
Is concatenating the input and the hidden state the only way of doing this, or are there other methods for combining them to generate the new hidden state and output? I saw a method where they have two weight tensors: the first is multiplied by the input and the second by the previous hidden state; they add these together and apply the activation function to get the new hidden state, then multiply that by another weight tensor and apply another activation function to get the output.
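That description is the classic Elman formulation; in symbols, with three separate weight matrices:

h_t = \tanh(W_{xh} x_t + W_{hh} h_{t-1} + b_h)
y_t = \mathrm{softmax}(W_{hy} h_t + b_y)

Concatenating input and hidden and applying one linear layer computes the same first line (minus the tanh in the video's case), with W_{xh} and W_{hh} stored side by side in a single matrix.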
@yuriihalychanskyi8764 4 years ago
Love your videos
@patloeber 4 years ago
Thanks!
@sarahjamal86 4 years ago
Fantastic!
@patloeber 4 years ago
Thank you :)
@monishraju4789 a year ago
Hi sir, what are plot_steps and print_steps? Can you explain?
@anantshukla6092 4 years ago
I was entering names to check which country they belong to, and I found that if a name starts with a capital letter it is assigned a different country than when it starts with a lowercase letter. Is the RNN just memorizing the entire dataset and not generalizing (meaning: overfitting), or is it maybe because the dataset has no names starting with a capital letter? Your help is appreciated. Thank you.
@patloeber 4 years ago
Maybe all names should be transformed to lowercase first. And yes, overfitting is very likely in this example.
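A sketch of that normalization (apply it both when building the training pairs and to user input before line_to_tensor, the helper from the video's utils):

def normalize(name):
    # strip case information so "ALI", "Ali" and "ali" look identical to the model
    return name.lower()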
@gsom2000 4 years ago
At the end, where you were trying guesses, perhaps all of them are correct because the model has already seen those names?
@patloeber 4 years ago
Good point. Although I draw random training samples, it is very likely that with the large number of iterations it has seen all the names already. A better approach would of course be to separate the data into training and testing sets, but evaluating the model was not the point of this tutorial.
@gsom2000 4 years ago
@@patloeber Thanks for the reply. I'd like to ask: are you planning any tutorials about BERT models?
@DarkF4lcon 3 years ago
Hi, where would I find this program's code?
@n000d13s 4 years ago
I have a few questions, if you don't mind. Does this work on untrained data? Also, how can I implement an RNN to reduce noise in an audio file? Thanks. ☺
@sagnikrana7016 4 years ago
Top-notch video! Just a suggestion - the names of the dictionaries category_lines and all_categories could have been more intuitive. Nonetheless, you explained it extremely well.
@patloeber 4 years ago
Thanks! Yes, you are right :)
@bhar92 3 years ago
So I have some questions about the training procedure, if you don't mind! In the training loop, you give the model a random word from our dictionary 100,000 times. My first question: how do you know the model didn't see some words several times? For example, maybe it saw many Arabic or Chinese names repeatedly while never getting any Portuguese words. If that happened, wouldn't the low loss curve be misleading, since the model repeatedly saw the same words? Second question: wouldn't it be better to split the data into train-test-validation sets based on some percentage of our choosing, to make sure all the languages are represented accordingly in all of the sets? Why did you choose this approach with random words? Is it common practice with NLP and RNNs? Thank you very much!
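On the first point: sampling with replacement indeed gives no coverage guarantee. A common alternative (a sketch, assuming the category_lines dict from the video) is to shuffle all pairs each epoch, so every name is seen exactly once per pass:

import random

pairs = [(name, lang) for lang, names in category_lines.items() for name in names]

for epoch in range(5):
    random.shuffle(pairs)      # new order every epoch, full coverage of the data
    for name, lang in pairs:
        ...                    # run the train step on (name, lang)

For the imbalance across languages, per-class weights in the loss or a WeightedRandomSampler are the usual fixes.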
@tejamarneni 4 years ago
Best videos on PyTorch. Any plans on creating a TensorFlow playlist? @Python Engineer
@patloeber 4 years ago
Thanks! Yes, indeed. You can have a look at my community tab, where I just announced that a TensorFlow course is coming soon.
@muhammadkhattak4819 4 years ago
Thank you so much, sir. It's a request to keep making videos on PyTorch. Thank you.
@patloeber 4 years ago
Thanks! Yes I will :)
@sof_ai 4 years ago
Hello, thanks for your video. I have one question: why are you using this loop in the train function, at line 66?

for i in range(line_tensor.size()[0]):
    output, hidden = rnn(line_tensor[i], hidden)

Here output is rewritten for each letter in line_tensor, yet the loss is computed using only the last output. If I am wrong, please correct me.
@patloeber 4 years ago
Because we look at each character one by one.
@genetixx01 3 years ago
Hi, I would like to know: is it just me, or do English and French names get poor prediction performance? I tried playing with the learning rate a bit, but no luck.
@kitgary 4 years ago
Very good video! I have the next topic for you: reinforcement learning!
@patloeber 4 years ago
That's a complex topic, but it will definitely come sooner or later :)
@pallavisaxena7496 2 years ago
Please create a series on NLP in detail
@raminessalat9803 4 years ago
The interesting thing is that we don't have any nonlinearity between time steps: hidden states are mapped linearly from the new input and the previous hidden state to the next hidden state, and it's still so powerful!
@patloeber 4 years ago
ReLU or tanh are applied for non-linearity.
@raminessalat9803 4 years ago
@@patloeber I meant in this example that you showed, you didn't have a nonlinearity, right?
@pezosanta 4 years ago
Great tutorial, thanks! Could you please tell me which VS Code theme you use? It looks great!
@patloeber 4 years ago
Night Owl theme :)
@pezosanta 4 years ago
@@patloeber thanks :)
@pranilpatil4109 7 months ago
Is this a many-to-one RNN?
@chandranbose5804 4 years ago
@python_engineer sir, last week I posted a message in the Python intermediate course. The error below shows while executing the code; I was unable to reach the developer, who said it will run in Python 2.7. Can you help me? WindowsError: [Error 193] %1 is not a valid Win32 application
@chakra-ai 4 years ago
Hi, thanks for the video. It would be great if you could post a video on a TorchServe implementation of any of these models.
@patloeber 4 years ago
Thanks for the suggestion! Will add it to my list.
@shaikrasool1316 4 years ago
Sir, can we use triplet loss in image classification if we have 2000 image classes, or is normal cross-entropy with softmax enough?
@Небудьбараном-к1м 4 years ago
It seems like you are not using batch learning? Could you explain why?
@patloeber 4 years ago
Only to keep it simple in this tutorial. Also, the dataset was not very big...
@makannikmat5679 3 years ago
Very helpful
@patloeber 3 years ago
Glad to hear that
@pra8495 3 years ago
Is there any PyTorch certification like TensorFlow's?
@patloeber 3 years ago
Good question! Not that I know of...
@sergiozavota7099 4 years ago
Thanks a lot for your videos :) I am trying to run the training and prediction on the GPU, but it seems I forgot to pass some tensor to the device. In particular, I am passing to device:
- combined = torch.cat((input_tensor, hidden_tensor), 1).to(device)
- rnn = RNN(N_LETTERS, n_hidden, n_categories).to(device)
but it shows the following error: "All input tensors must be on the same device. Received cpu and cuda:0", so I suppose I'm missing something but can't figure out what.
@patloeber 4 years ago
Yes, all your tensors AND the model must be on the same device, so somewhere you are missing the .to(device) call.
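A likely culprit in code like this (a sketch; the exact fix depends on your version): the fresh hidden state from init_hidden is created on the CPU, so it needs its own .to(device), as does each input tensor inside the loop:

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

rnn = RNN(N_LETTERS, n_hidden, n_categories).to(device)  # moves all model weights
hidden = rnn.init_hidden().to(device)                    # the zeros tensor too

for i in range(line_tensor.size()[0]):
    output, hidden = rnn(line_tensor[i].to(device), hidden)  # ... and every input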
@manjunathjayam4895 4 years ago
Hey @pythonEngineer, I recently finished a Python course at my university, and now I'm feeling a little confident about Python. I watched this video but I'm not understanding most of the functions. What do you suggest? Is it enough if I watch your PyTorch tutorial?
@patloeber 4 years ago
Yes, you should watch my PyTorch beginner course first.
@manjunathjayam4895 4 years ago
@@patloeber Got it. Thank you.
@swarnalathavura1334 4 years ago
Hey, can you make a full Python course on freeCodeCamp? You only did the intermediate video, so could you please upload a full course video?
@patloeber 4 years ago
You mean an expert Python course?
@swarnalathavura1334 4 years ago
@@patloeber Yes
@patloeber 4 years ago
OK, I'll consider it :)
@chandradeepsingh.8661 4 years ago
Can you make a chatbot AI in Python using an RNN?
@patloeber 4 years ago
I already have a chatbot tutorial with PyTorch on my channel, but yes, I'll probably make a more advanced one in the future.
@razi_official 4 years ago
Very nice tutorial, sir. Could you make a model for signature verification? Please reply.
@user-or7ji5hv8y 4 years ago
Why is video classification many-to-many?
@patloeber 4 years ago
Only if you want to do a classification for each frame.
@GoForwardPs34 a year ago
Got an error trying to run this: ImportError: cannot import name 'load_data' from 'utils'. Does anyone know how to resolve this? @Gonzalo Polo @Patrick Loeber