Sign Language Detection using ACTION RECOGNITION with Python | LSTM Deep Learning Model

445,153 views

Nicholas Renotte

Comments: 1,100
@BINARAI-q4r 10 months ago
00:01 This video demonstrates sign language detection using action recognition with Python.
01:40 The video discusses the process of sign language detection using action recognition and LSTM deep learning model.
05:16 MediaPipe Holistic allows us to get key points from face, body, and hands
07:17 Setting up webcam access and rendering frames using OpenCV
11:06 The code captures frames from a webcam and displays them on the screen.
12:46 Setting up MediaPipe Holistics and creating variables for MediaPipe Holistic and MediaPipe Drawing Utilities
16:46 The video explains the process of color conversion in sign language detection.
18:32 The process involves detecting sign language using media pipe and a deep learning model.
21:59 The video discusses the different types of landmarks in sign language detection using action recognition.
23:23 The video explains how to detect and visualize different types of landmarks using MediaPipe.
27:05 The video discusses how landmarks in facial and body pose can be connected to each other.
28:37 Implementing sign language detection using LSTM deep learning model in Python
32:18 Landmarks are drawn and rendered in real time using image pass and cv2
33:55 You can customize the formatting of the dots and connections in Sign Language Detection using a landmark drawing spec and a connection drawing spec.
37:32 Updating pose and hand landmarks with different colors and parameters
39:31 Different models in action: left hand, right hand, face, and pose.
43:09 The code demonstrates how to extract landmark values using pose estimation.
45:04 The video explains how to reshape and convert landmarks into a single array.
48:27 Building a neural network and extracting key points using action recognition with Python
50:10 Setting up error handling and placeholder arrays for pose and face landmarks.
53:52 The video explains how to extract key points for sign language detection using LSTM deep learning model in Python.
55:31 Concatenating pose, face, left hand, and right hand keypoints for sign language detection.
59:11 Using LSTM Deep Learning Model to detect sign language actions
1:00:57 Creating folders to store data for different actions and sequences.
1:04:16 Creates a folder structure for sign language detection using action recognition with Python.
1:05:48 Collecting data using MediaPipe loop and capturing snapshots at each point in time.
1:10:14 The code is outputting text to the screen and taking a break at frame 0.
1:11:44 The first block of code prints starting collection in the middle of the screen and pauses.
1:15:10 The code collects key points by looping through actions, sequences, and frames.
1:16:39 Implementing sign language detection using action recognition with a LSTM deep learning model.
1:20:25 Sign language detection using action recognition with Python
1:23:20 Using MediaPipe to collect key points for sign language detection
1:26:55 Creating a dictionary to map labels to numeric ids
1:29:08 Sequences represent feature data and labels represent y data
1:32:26 Data preprocessing and training and testing partitioning are important steps in sign language detection using LSTM deep learning model.
1:34:12 Training LSTM neural network using TensorFlow and Keras.
1:38:14 The model uses LSTM layers for sign language detection.
1:39:55 The next three layers are dense layers using fully connected neurons.
1:43:16 The video discusses the process of formulating a neural network for sign language detection using action recognition and LSTM deep learning model.
1:44:58 Training the model with 2000 epochs
1:48:11 The training accuracy is high at 93.75% after 173 epochs.
1:49:37 The model has three LSTM layers and dense layers, with a small number of parameters to train.
1:53:13 Reloading a deleted model and evaluating its performance using scikit-learn.
1:55:03 Converting y test and y hat values to matrices and then evaluating the model performance using a confusion matrix and accuracy score.
1:58:18 Implementing prediction logic by concatenating data onto sequence and making detections when 30 frames of data are available.
2:00:30 Implement logic to grab the last 30 sets of key points for generating predictions.
2:04:18 Implementing visualization logic and checking result threshold and sentence length
2:06:45 The code checks if the current action matches the last sentence in the string.
2:10:05 Sign language detection using LSTM deep learning model
2:12:39 The video discusses sign language detection using action recognition with Python using an LSTM deep learning model.
2:17:13 The video discusses sign language detection using action recognition with Python
2:19:14 Sign Language Detection using Action Recognition with Python
2:22:29 To ensure accurate action detection, the last frame needs to be included in the sequence.
2:24:04 The code implementation adds stability by checking if the last 10 frames have the same prediction.
Crafted by Merlin AI.
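The keypoint steps in the summary above (43:09 to 55:31) can be sketched roughly as follows. This is a minimal sketch and not necessarily the exact code from the video. It assumes the `results` object exposes MediaPipe Holistic's `pose_landmarks`, `face_landmarks`, `left_hand_landmarks` and `right_hand_landmarks` attributes, and that the model yields 33 pose, 468 face and 21 per-hand landmarks. The helper name `_flatten` is made up for illustration.

```python
import numpy as np

# Landmark counts assumed for MediaPipe Holistic (without refined face landmarks).
POSE_POINTS, FACE_POINTS, HAND_POINTS = 33, 468, 21


def _flatten(landmark_list, count, fields):
    """Flatten one landmark group, or return zeros if it was not detected."""
    if landmark_list is None:
        return np.zeros(count * len(fields))
    rows = [[getattr(lm, name) for name in fields] for lm in landmark_list.landmark]
    return np.array(rows, dtype=float).flatten()


def extract_keypoints(results):
    """Concatenate pose, face and both hands into one flat feature vector."""
    xyz = ("x", "y", "z")
    pose = _flatten(results.pose_landmarks, POSE_POINTS, xyz + ("visibility",))
    face = _flatten(results.face_landmarks, FACE_POINTS, xyz)
    left = _flatten(results.left_hand_landmarks, HAND_POINTS, xyz)
    right = _flatten(results.right_hand_landmarks, HAND_POINTS, xyz)
    return np.concatenate([pose, face, left, right])
```

Under these assumptions each frame becomes 33*4 + 468*3 + 2*21*3 = 1662 values, and a missing hand or face simply contributes zeros, so every frame has the same length.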
@kanchanpatil9642 1 year ago
As someone who is following this in 2023, here are some changes; I'll be editing them in as they pop up while I go through the tutorial. 25:42 FACE_CONNECTIONS seems to have been renamed/replaced by FACEMESH_TESSELATION. And since we want just the outlines of the face, it's FACEMESH_CONTOURS that we need in this project.
@taredje4664 1 year ago
thanks, you save me
@VanderlanAlves7 1 year ago
wow! Thank you very much!!!
@interstellarstar3742 10 months ago
hey i can't collect data how i save .
@stinger9231 6 months ago
Thank You so much, got stuck there for a minute
@simranmehta2778 1 month ago
Thankyou soo much
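For anyone hitting the FACE_CONNECTIONS rename described at the top of this thread, one way to keep the same notebook working across MediaPipe versions is to look the constant up by name. A minimal sketch: the function name `pick_face_connections` is hypothetical, and the three constant names are the ones reported in the comments here.

```python
def pick_face_connections(holistic_module):
    """Return the first face-connection set the installed MediaPipe exposes."""
    # Preference order: contours (outline only), full mesh, then the legacy name.
    for name in ("FACEMESH_CONTOURS", "FACEMESH_TESSELATION", "FACE_CONNECTIONS"):
        connections = getattr(holistic_module, name, None)
        if connections is not None:
            return connections
    raise AttributeError("no known face connection constant found")


# Hypothetical usage inside the drawing step:
#   import mediapipe as mp
#   mp_holistic = mp.solutions.holistic
#   mp.solutions.drawing_utils.draw_landmarks(
#       image, results.face_landmarks, pick_face_connections(mp_holistic))
```

Trying FACEMESH_CONTOURS first matches the suggestion above that only the face outline is wanted; swap the order if the full mesh is preferred.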
@girishkemba3865 3 years ago
I remember some time ago requesting this type of video,but to see that its finally here brings me joy. Can't wait to do this and show to my sign language friends.
@NicholasRenotte 3 years ago
I know right, it's taken a while but finally it's here! Thanks for sharing!
@pradeepsaravanan7712 1 month ago
did you show to your friends ?
@savi-2084 10 months ago
I can not thank you enough for all the videos you create i was a noob in tech but the moment i started watching your videos its been a year now and i am so proud of you and myself for coming this far and this project works for me❤
@simranmehta2778 1 month ago
Just completed this project right now . Feeling extremely motivated to do more projects in future. I made couple of changes in this project such as I used LSTM but the activation function i used is tanh , I used Dropout Layer and l2 regularization to prevent overfitting . At the end I added audio feature so if the model predict 'hello' , hello audio will play.
@danishkhalid2027 26 days ago
hey Simran Did you increased more words? and is your model project detecting sentence level and converting into text and voice? I want to see your project currently wants to work on it Would me great 😀
@tech_voyager 1 month ago
Dude you are amazing i just almost completed my graduation project!!
@김미소2-z2w 1 year ago
Thank you so much! You are my best teacher in my college life!!!!
@SABEDIT2914 9 months ago
Did you made this project?
@Stacio6 3 years ago
Hi Nicholas thanks so much !!!! I am creating a model to help deaf people here in my country. Greetings from Guatemala !!!
@NicholasRenotte 3 years ago
Awesome stuff!!
@aminberjaouitahmaz4121 3 years ago
Thank you for these clear, practical, straight to the point tutorials! Looking forward to your future videos!
@NicholasRenotte 3 years ago
Cheers @Amin, so pumped you're enjoying them!
@aqsaqamar1634 1 year ago
@Nicholas Renotte can you tell me why this error is coming up: module 'mediapipe.python.solutions.holistic' has no attribute 'FACE_CONNECTIONS'
@charank2894 5 months ago
​@@aqsaqamar1634 replace FACE_CONNECTIONS with FACEMESH_CONTOURS
@engeerdanisme 1 year ago
Thank you @Nicholas Renotte I just passed my capstone project defense utilizing this deep learning model
@yohanessatria2220 3 years ago
Man, you are so underrated and deserve a lot more! thanks a lot for these awesome learning materials! I have learned a lot from you. Keep inspiring, man :)
@NicholasRenotte 3 years ago
Thanks so much @Yohanes! So glad you're enjoying them 🙏
@Rohan_is_discovering 6 months ago
Someone just completed his internship with the help of your code and also got a certificate from an IT company
@gaddesaishailesh2772 3 years ago
I was really waiting for this video!
@NicholasRenotte 3 years ago
IKR, it's taken a little while hey @Gadde Sai Shailesh!
@malice112 2 years ago
Nicholas is the best machine learning youtuber, his tutorials are interesting and fun.
@study_with_thor 3 years ago
that's amazing! I watched this video more than a month ago but it seemed difficult for me as a beginner. Then I've tried my best to finished Machine Learning/ Deep Learning/ Python / Tensorflow and some Data Science course within a month. Now watching this video again is like watching a movie! it's easy to follow! love it
@NicholasRenotte 3 years ago
YESSS! That's amazing that you stuck with it, great work man!!
@ruqaiyaali1645 2 years ago
you finished ML/DL/Python and Data science course within a month!! how is this possible man? I am having a hard time with these courses 🥲
@study_with_thor 2 years ago
@@ruqaiyaali1645 I think you must be familiar with python code. Make sure practice more than what you learn.
@nguyenvietthai5868 1 year ago
@@study_with_thor are you Vietnamese. I see your name. Can you give me some experience please? If so, please respond to me. Thanks a lot.
@study_with_thor 1 year ago
@@nguyenvietthai5868 Hi there, please let me know your concerns, I hope that I could help you too.
@rainymatch 7 months ago
It's so cool to see how happy Nicholas is when everything works in the end. That's the spirit! Amazing video, thanks a lot for your work man!
@danieladama8105 3 years ago
Can’t lie.. I have learnt a lot from Nicholas
@NicholasRenotte 3 years ago
My man! Thanks for checking in!
@PIKACHU-zn8fx 9 months ago
agreed still learning from him
@giteshpal405 1 month ago
did u add more actions of dataset?
@Cheese_Academia 1 year ago
Thanks for the amazing tutorials! absolutely life-saving. Just a reminder that the z value from mediapipe is with respect to the wrist landmark not the distance from the camera! I found out pretty late!
@leafiadias96 3 years ago
thanks for this amazing tutorial sir , we are working on a project that needed this section and your videos and explanation are being extremely helpful to me and my team ! thanks a lot
@fawwazhameed1104 2 years ago
Heyy leafia, could you tell me about your project?
@Nikos_prinio 2 years ago
Hi ! I'm impressed by the amazing clarity of your explanations. For one second I thought you must be a trained teacher robot....
@torstenknodt6866 2 years ago
Thanks, great videos. Would be great if you could elaborate into the differences of the used media pipe implementation, compared to the others you mentioned. I mean really a comparison of the underlying models/ networks and their training.
@stevecoxiscool 3 years ago
Great explanation on how to use LSTM with pose coordinates.
@ibrahimalizada381 3 years ago
Hi, Nicholas! These are great video series to watch and learn! Thank you very much! Can you please prepare a video applying CV on real-time sign language detection on the base of a ready dataset avaliable in Internet? It may be much more interesting if we can see ViT in action recognition as well.
@VarunAditTheGreat 1 year ago
Hey, I am trying to build a project with a bigger dataset for ASL. Did you find any dataset?
@asutoshpatro2865 1 year ago
@@VarunAditTheGreat I have found it, it's the WLASL dataset. Did you make it? Please share the code link.
@ruthogadina757 1 year ago
i'm learning about this, would you like to work on a project together?
@ishaanverma1969 1 year ago
This content is so underrated! Thank you so much!
@dinukii3332 2 years ago
Hi Nicholas! Thank you for your tutorial once again. Quick question, How can I change the code to access a folder that contains a dataset of videos without live capturing them? Really appreciate if you could give an answer :)
@NicholasRenotte 2 years ago
You could loop through each one of the videos by using os.listdir or the tensorflow dataset class then run it through the mp holistic pipeline!
@dinukii3332 2 years ago
@@NicholasRenotte Thank u:)
@HannahCynthia-mu4ct 2 years ago
Heyy. Do you know the exact code to loop through video dataset?
@riadhaoufi9452 2 years ago
@@HannahCynthia-mu4ct i'm looking for it too, i hope he gets to answer up thank you so much for the video brother @Nicholas Renotte
@riadhaoufi9452 2 years ago
@@NicholasRenotte i'm so lost brother :(
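Since several replies in this thread ask for the actual loop, here is a rough sketch of the `os.listdir` approach suggested above. It is not code from the video: the folder layout `data_dir/<action>/<clip>`, the extension list, and the function names `list_videos` and `read_frames` are all assumptions for illustration. OpenCV is imported lazily so the listing part runs without it.

```python
import os

VIDEO_EXTENSIONS = (".mp4", ".avi", ".mov", ".mkv")  # assumed set of clip formats


def list_videos(data_dir):
    """Return {action: [video paths]} for a layout like data_dir/<action>/<clip>."""
    videos = {}
    for action in sorted(os.listdir(data_dir)):
        action_dir = os.path.join(data_dir, action)
        if not os.path.isdir(action_dir):
            continue  # skip stray files next to the action folders
        clips = [
            os.path.join(action_dir, name)
            for name in sorted(os.listdir(action_dir))
            if name.lower().endswith(VIDEO_EXTENSIONS)
        ]
        if clips:
            videos[action] = clips
    return videos


def read_frames(video_path, max_frames=30):
    """Yield up to max_frames BGR frames from a video file."""
    import cv2  # imported here so list_videos works without OpenCV installed

    cap = cv2.VideoCapture(video_path)
    try:
        count = 0
        while count < max_frames:
            ok, frame = cap.read()
            if not ok:
                break  # end of file or unreadable frame
            yield frame
            count += 1
    finally:
        cap.release()
```

Each frame from `read_frames` would then go through the same MediaPipe Holistic detection and keypoint extraction used for the webcam feed, with the keypoints saved per frame instead of being captured live.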
@rabiraj1387 3 years ago
Awaited Video Nicholas hope to complete it and implement on my side.
@NicholasRenotte 3 years ago
I know, can't believe it's finally out! Let me know how you go @Rabi Raj!
@theethatanuraksoontorn1369 3 years ago
Hi Nicholas, been working on similar project. I believe this tutorial is done for simplicity so I would like to add a piece of my mind. When adding more action, the prediction on the realtime will be mix a lot due to frame overlap and wrong slicing of the frame. I would suggest to show some viz as start and end of the prediction. So the user can follow the start to the end frame. This way it is similar to the way it is collected and higher prediction accuracy.
@giteshpal405 1 month ago
hava u add more actions in your project ?
@yousseffarhan8901 3 years ago
You can't imagine how much you have helped me. Thank you so much 🙏🏼
@NicholasRenotte 3 years ago
🙏
@tigre1217 2 years ago
Hi nick! Nice tutorial on this sign language recognition program. I had faced some problems of the categorical accuracy staying the same when im trying to add more signs to the model rather than 3 like the ones you used in the video, is there any way to solve this issue? Thanks!
@rryann088 2 years ago
Hi, are you still working on it?
@labhjoshi3182 1 year ago
@@rryann088 same question
@harrylee97625 3 years ago
Nicholas certainly deserves more views.
@NicholasRenotte 3 years ago
Awww, thanks @Harry. Much appreciated man!
@rowlandgoddy-worlu3382 2 years ago
This is an amazing video! I have learned a lot following your tutorials. One question - What if you are trying to capture actions that are not of equal time duration. E.g if a sign language like "Good Morning" lasts for 5 seconds and another sign like "Welcome" lasts for 9 seconds. How can this be treated?
@032lovishkumar8 8 months ago
hey, i am getting error IndexError: list index out of range while running 2:00:10 , how can i resolve it ?
@ibrahimhameem1334 3 years ago
Super stuff Nicholas! Super grateful for your tutorials 🙌🏻. Keep up the great work!
@NicholasRenotte 3 years ago
Thanks so much @Ibrahim, soooo much more to come!
@akshatraj5952 1 year ago
Videos that you make is wonderful. Thank you for these practical and clear points in the tutorials.
@usamaejaz5264 1 year ago
The MP_Data folder is missing, so where do we get it from?
@piresflp 3 years ago
Hi Nicholas, thanks for the awesome tutorial! I've got 3 questions about the project, hope you don't mind helping me: 1. When training my model, i've 90%+ accuracy very quickly (150 epochs more or less), but all of sudden it dropped to 30% and kinda stabilized until the rest of the execution, how can I fix it? 2. If I want to add more signs after first training my model, I'll have to re-train it? Or can I train just those specific signs separately? How do I do it? 3. After the model is working just fine, it is possible to attach the real time script to an android app?
@howcircle5530 3 years ago
i also wanna know about you 3rd quection.🤓
@NicholasRenotte 3 years ago
1. So accuracy never went back up? Try adding more data for each class depending on what's performing well/not well. 2. You can apply transfer learning, drop the final layer, add a new layer which has the same number of classes as your new signs then retrain 3. Yes, I haven't shown it here as it's probably a whole other video though!
@tigre1217 2 years ago
@@NicholasRenotte Hi Nick, can you elaborate more on the 2nd point? I was quite confused since it is my first deep learning project. Thanks!
@adriamasitoribio 1 year ago
@@tigre1217 hey! diod you figure it out?
@giteshpal405 1 month ago
have u done it?
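For those asking about point 2 in the reply above (drop the final layer, add a new one, retrain), a rough Keras sketch might look like this. The layer sizes, activations and function names are assumptions for illustration, not the architecture from the video. Freezing the reused layers is one common choice rather than a requirement.

```python
from tensorflow.keras.layers import LSTM, Dense, Input
from tensorflow.keras.models import Sequential


def build_base_model(num_classes, frames=30, features=1662):
    """Small stand-in for an already trained sign model (sizes are illustrative)."""
    return Sequential([
        Input(shape=(frames, features)),
        LSTM(64, return_sequences=True),
        LSTM(64),
        Dense(32, activation="relu"),
        Dense(num_classes, activation="softmax"),
    ])


def extend_to_new_classes(base_model, num_classes):
    """Reuse every layer except the old output layer and attach a new softmax head."""
    new_model = Sequential()
    new_model.add(Input(shape=base_model.input_shape[1:]))
    for layer in base_model.layers[:-1]:
        layer.trainable = False  # keep the learned features fixed while the head trains
        new_model.add(layer)
    new_model.add(Dense(num_classes, activation="softmax", name="new_output"))
    new_model.compile(optimizer="Adam", loss="categorical_crossentropy",
                      metrics=["categorical_accuracy"])
    return new_model
```

Here `num_classes` is however many signs the new head should predict. The new model still has to be fitted on labelled sequences for those signs before it is useful.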
@rusticagenerica 2 years ago
Exceptional tutorial. Thank you from the bottom of my heart.
@MuhammadKamran-ow5vp 1 year ago
I have a question. Is it possible to feed video of arbitrary lengths (frames) instead of feeding an action of fixed length video? Because in real time, we perform sign language pretty fast and each action is of arbitrary length.
@latestdramas6351 5 months ago
Excuse ne
@T-She-Go 3 years ago
Thank you so much Nicholas 😌 This will help me with my project 🙌🏾
@yashas_hm 2 years ago
Hi Nicolas, Such an amazing video. Helped me a lot building a project. I am working on a different project in which I trained the model with around 20 signs from ASL but I am getting a categorical accuracy of only 0.05 on average in each epoch. can you tell me where I went wrong or anything to imporve it?
@martinposso2098 2 years ago
hey how you managed to fix that problem?
@phoque6 3 years ago
Thank you for a detailed and wonderful mediapipe tutorial :)
@NicholasRenotte 3 years ago
So glad you liked it!
@mervesisci4983 3 years ago
Hi Nicholas, Thank you for this amazing tutorial. If we use padding in this case (videos containing movements with different number of frames) how can we make predictions in real-time? In the tutorial you set a fixed length (30 frames) (sequence=sequence[:30] if len(sequence)==30), but in my case there are different frame sizes for each activity in real-time prediction.
@abhisekpanigrahi1033 2 years ago
Hello Nicholas I also have this question. Can you please answer this what if the dimensions are different each time we run
@nnamakah 2 years ago
Hi Nicholas, thanks for this project, it is incredible. How would you handle video files with varying number of frames? How can I possibly approach the situation?
@matteosacco00 6 months ago
Same question, anyone with suggestions?
@depallyyadaiahgoud750 1 year ago
That's way cooler one and your explanation was a ton easier 😉 Thanks Nick
@angelortiz3564 2 years ago
This is so awesome! You can theoretically do the same for the static letters in the ASL alphabet, right? Just make the dataset that contains each hand sign. The model would be train on the keypoints of each hand sign. Although I am not sure it for some hand sign letters, the keypoints would be accurate. What do you think?
@anshumanchoudhary4732 1 year ago
That model would be far more easier to achieve
@eswar7781 10 months ago
​​@@anshumanchoudhary4732which model
@arpanroy2892 3 years ago
Your every video slightly edited , directly goes in my cv 🤣🤣🤣🤣 , thanks for taking care of my future ❤❤❤
@NicholasRenotte 3 years ago
Hahaha, build that experience man and go getem!
@y.yuvraj 2 years ago
Hii Nicholas This is really an amazing tutorial I really appreciate it. But I am having an error at fitting the model and it is of 'ValueError' which is "Failed to convert a NumPy array to Tensor". I tried many things but it is not going away so please give me a hand on this.
@another.nikhil 2 years ago
check the datatype of the inputs in your model. The keras api only accepts numpy arrays.
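One frequent cause of "Failed to convert a NumPy array to Tensor" is a ragged dataset, where some clips have fewer frames or some frames have fewer values, so NumPy cannot build a regular float array. That is an assumption here, since the full traceback isn't shown. A small check like the following can narrow it down before calling `model.fit`. The function name `to_model_input` is made up for illustration.

```python
import numpy as np


def to_model_input(sequences):
    """Stack per-clip keypoint sequences into one float32 array, or explain why not."""
    frame_counts = {len(seq) for seq in sequences}
    if len(frame_counts) != 1:
        raise ValueError(f"clips have different frame counts: {sorted(frame_counts)}")
    feature_sizes = {np.size(frame) for seq in sequences for frame in seq}
    if len(feature_sizes) != 1:
        raise ValueError(f"frames have different feature sizes: {sorted(feature_sizes)}")
    # At this point every clip is (frames, features), so a dense array is safe.
    return np.asarray(sequences, dtype=np.float32)
```

If this raises, the printed sizes point at the clips or frames that were saved incompletely during collection.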
@soumendas2336 3 years ago
thank you Nicholas i have learned a lot of things from this video ....that I was looking for the past few months..
@NicholasRenotte 3 years ago
Yesss! So glad!
@WJ-zq3xo 3 years ago
Great tutorial as usual, Nick! Learning a lot from you :D Did anyone try to use a set of videos instead of recording their own videos? If yes, what did you change in the code base? Kudos
@shrirampareek 3 years ago
Hey! I used some set of videos(26) and was able to get 92% on test dataset however when I tried doing the same gestures using webcam, I get same 4 classes all the time
@amessit10 2 years ago
@@shrirampareek can we implement this project for 26 letters as i am getting error " list index out of range" when trying to do more that 3 actions
@neerajpatil7850 2 years ago
@@amessit10 Same for me ! Have you figured out the why the error ?
@amessit10 2 years ago
@@neerajpatil7850 No man, i closed this project coz i only need hand gestures not full body keypoints
@amessit10 2 years ago
hands occludes , so recognition fails
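The "list index out of range" error reported in this thread when going beyond 3 actions may come from the probability-bar visualization rather than from the model. That is an assumption, since no traceback is quoted: if the color list has only three entries and is indexed once per action, a fourth action runs past the end. A sketch of a guard, with a hypothetical palette and function names:

```python
# Illustrative BGR tuples; any palette of at least one color works.
BASE_COLORS = [(245, 117, 16), (117, 245, 16), (16, 117, 245)]


def color_for(index, palette=BASE_COLORS):
    """Pick a bar color for any action index by cycling through the palette."""
    return palette[index % len(palette)]


def colors_for_actions(actions, palette=BASE_COLORS):
    """Return one color per action, however many actions there are."""
    return [color_for(i, palette) for i in range(len(actions))]
```

With 26 letters this returns 26 colors, repeating every three, so indexing by action number no longer fails.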
@hamednasr3078 2 years ago
I wish you recorded all your videos with zoom and font size of 22:30, it is really great 🙂
@NicholasRenotte 2 years ago
Yeah I've gotta work out how to do it, I just can't code with that amount of zoom though @Hamed. Will see what I can do!
@mahmudanajnin9367 3 years ago
hey nick! this project is amazing! thank you for these awesome tutorials. You did sign language detection with tensorflow object detection which detects sign using single frame but here we're using multiple frames to detect it. So i was wondering how is this one better than tensorflow object detection?
@NicholasRenotte 3 years ago
Just depends on the use case, the OD model does it on a single frame, this does it for multiple frames (this one is better for signs with multiple phases)
@ericklasco 3 months ago
2024 and this is still useful, thank you Nicholas👍
@anirbansaha244 3 months ago
hey did you complete the project?
@ericklasco 3 months ago
@@anirbansaha244 yes
@whisplay 3 months ago
​@@ericklasco can you please share which version dependence you used for importing
@siva7702 2 months ago
Please provide versions bro, like what is python version you used
@whisplay 2 months ago
@@siva7702 Python 3.12.5 and import versions: !pip install tensorflow opencv-python mediapipe scikit-learn matplotlib !pip install --upgrade mediapipe note: use jupyter notebook! with this code camera can't be accessed on google colab or kaggle like ide's, if you want to use colab you need to add an additional Javascript code.
@theethatanuraksoontorn1369 3 years ago
Hey Nicholas, I am working on similar project. Just wondering when I test the model using your metric it does not reflect the same accuracy as the real-time test. I train the model accuracy to 80-90% but the real-time test barely capture any sign language. Do you have any thought?
@giteshpal405 1 month ago
did u find any solution?
@fatiha2413 3 years ago
Hi, Nicholas! I learned a lot from this video! Thank you very much!
@amessit10 2 years ago
hiii FATIMA , can we implement this project for 26 letters as i am getting error " list index out of range" when trying to do more that 3 actions
@estebanpozo8702 3 years ago
Hi Nicholas, thanks again for this great tutorial! I am writing this because I would like to learn more about how did you chose your architecture. As you mention, almost all the state-of-the-art papers use a combination of CNN and LSTM. So, I have two questions 1. Would it be possible to get a more detailed explanation on how you build this model? 2. So far, I have reviewed “LSTM: A Search Space Odyssey” by Greff (+ other papers) and the “Neural Network design” handbook by Hagan. Could you recommend me any documents regarding LSTM architectures?
@NicholasRenotte 3 years ago
This is how I normally build stuff: 1. Find a research paper that has implemented a similar model 2. Try building the code for that model 3. Fine tune and iterate (a lot) to get solid performance I wish I had a standard process but it is hyper iterative.
@estebanpozo8702 3 years ago
@@NicholasRenotte thanks! :)
@predoca46 7 months ago
31:06 Im making a project to my school and he's look like your project, and he's function is like your. But, im dont have knowledge sufficient to make this alone. So im watching your video to learn much and complete that, thanks for the video and sorry for my english haha. Send hello to Brazil 🇧🇷 😂
@latestdramas6351 5 months ago
Hey
@vasuarora_ 3 months ago
@@latestdramas6351school???
@Gabbosauro 3 years ago
Hi Nicholas, I've been working on my thesis project about the quality of body movements and I encountered a problem with keras. I see that you feed in the first layer a sequence of constant 30 frames (1 second of video/list of mediapipe landmark object). In my case I have a variable number of frames (i.e. a video containing movements that lasts some 2 seconds (60 frames), some 2.5 seconds (75 frames), some 3 seconds (90 frames), etc., hence with different number of frames), how can I solve this? I looked around and people say that I can apply the so called "padding and masking" which takes the largest number of frames (longest video) and then add a special value to the others (padding) and after that somehow ignore/filter the special number later (masking). But this can't be applied to my case because I would like to have the freedom of variable number of frames during prediction. I hope you understand what I want to ask, otherwise let me know and I will try to clarify it as much as I can. Thank you!
@NicholasRenotte 3 years ago
AFAIK it's the only way to do it, unless you look at something like a sequence to sequence model (I think, don't quote me on that though lol). Padding would be the easiest approach. Set a fixed max length and fill out the frames without detections with a numpy array with zeroes!
@Gabbosauro 3 years ago
@@NicholasRenotte Thank you for the reply! Will the padding influence much the classification? I mean if video1 with movementA lasts 3 seconds and video2 with movementA lasts 1 second + 2 seconds of zeroes, would that cause problems during prediction or do you think it will work well?
@NicholasRenotte 3 years ago
@@Gabbosauro I would prototype and see the impact first. Kinda hard to say without seeing benchmark results.
@Gabbosauro 3 years ago
@@NicholasRenotte Alright, I'll test it out. Thanks!
@Gabbosauro 3 years ago
@@NicholasRenotte What I did and it seems starting to do the training is setting input_shape=(None, number_of_features) so time_steps set as "None" instead of 30, and during model.fit() I give it a batch_generator. ( based on this reference: datascience.stackexchange.com/questions/48796/how-to-feed-lstm-with-different-input-array-sizes ) But sadly the accuracy chart doesn't look good, sometimes it is around 40-50%, sometimes it drops to 20%.
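The zero-padding idea suggested earlier in this thread (fixed maximum length, fill the rest with zeros) can be sketched with NumPy alone. The function name and the choice to keep the last `max_len` frames of longer clips are assumptions for illustration. The returned lengths are there in case the padded steps should be ignored later, for example with Keras's `Masking` layer set to the same pad value.

```python
import numpy as np


def pad_sequences_to_length(sequences, max_len, pad_value=0.0):
    """Pad (or truncate) variable-length keypoint sequences to max_len frames."""
    num_features = np.asarray(sequences[0]).shape[1]
    batch = np.full((len(sequences), max_len, num_features), pad_value, dtype=np.float32)
    lengths = []
    for i, seq in enumerate(sequences):
        seq = np.asarray(seq, dtype=np.float32)[-max_len:]  # keep the most recent frames
        batch[i, :len(seq)] = seq  # real frames first, padding after
        lengths.append(len(seq))
    return batch, np.array(lengths)
```

A 60-frame and a 90-frame clip padded to `max_len=90` give a `(2, 90, features)` array that a fixed-shape LSTM input can accept. Whether the padding hurts accuracy is something to benchmark, as noted above.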
@mehmety5012 2 years ago
Great Tutorial Nicholas. Thank you so much !
@pritishmair9577 3 years ago
Is there a dataset available for this, which has more signs than these 3. If so it will be really great if someone could share it
@vaibhav607 3 years ago
Please, can you reply on the status of this?
@ahmedkalair9862 9 months ago
@pritishmair9577 did u find it
@giteshpal405 1 month ago
did u find any dataset yet?
@gustavojuantorena 3 years ago
Wow! This is great Nick! 👏👏👏
@NicholasRenotte 3 years ago
Thanks a bunch Gustavo!!
@akshith.vbharadwaj2269 2 years ago
Greetings Hey man this is an awesome tutorial and I completely love the way u have explained the process step by step. It was an awesome tutorial and I completely loved it. I tried it on my own and I have encountered some problems it would be a great help if u could help me out with it. I have followed the same method that u have prescribed on the video these are the problems which came up. Even after getting overall categorical accuracy 95% and above accuracy on training datasets when I do the gesture recognition it is not recognising one gesture. And sometimes it shows the same gesture even though I am showing a different gesture. Sometimes even it is detecting 2 gestures even though I am not giving any gestures. I am always retraining the same data to get a higher accuracy before going to the gesture recognition part. I have also increased a layer in the LSTM model but the results are the same. Would greatly appreciate the help.
@NicholasRenotte 2 years ago
Start with the data, I would add more data of the underperforming classes then retrain. Remember bad data in will lead to bad outputs and vice versa, try adding 20-30 more samples for each underperforming class and give it a go!
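Besides adding data, spurious detections at inference time can be damped with the two checks the timestamp summary lists at 2:04:18 and 2:24:04: a confidence threshold and agreement across the last 10 predictions. A minimal sketch of that idea follows. The class name, default threshold and window size are assumptions, not the video's exact code.

```python
from collections import deque

import numpy as np


class StablePredictor:
    """Report an action only when recent predictions agree and are confident."""

    def __init__(self, actions, threshold=0.5, window=10):
        self.actions = list(actions)
        self.threshold = threshold
        self.recent = deque(maxlen=window)  # class ids of the latest predictions

    def update(self, probabilities):
        """Feed one softmax output; return an action name or None."""
        probabilities = np.asarray(probabilities)
        best = int(np.argmax(probabilities))
        self.recent.append(best)
        window_full = len(self.recent) == self.recent.maxlen
        all_same = len(set(self.recent)) == 1
        if window_full and all_same and probabilities[best] > self.threshold:
            return self.actions[best]
        return None
```

Calling `update` once per model prediction means a sign is only reported after ten consecutive frames agree, which trades a little latency for fewer flickering labels.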
@joshgibson539 3 years ago
I really hope you continue this project.
@NicholasRenotte 3 years ago
I don't think I'm ever going to give this one up until I truly nail it. I feel like we're maybe 50ish percent of the way there. Still a TON of work to do.
@joshgibson539 3 years ago
@@NicholasRenotte I know it requires a lot of data and work to do. Also a project like this that helps people is always a great thing to be working on. I'm glad to see you sticking to it. I really wish SignAll would just release their product instead of making it about money. Their database has I have heard over 300,000+ sign language hand symbol videos labelled. I guess businesses and schools can request the software. But I just know they won't let just anyone touch it otherwise. That just really depresses me to know. I have a cousin that I can never understand when he comes over yet he understands me due to his hearing aid implant. It just sucks... and I think the world needs a solution that's not locked away.
@joshgibson539 3 years ago
@@NicholasRenotte Try requesting data from How2Sign's Github 16,000 vocabulary words (srvk /how2-dataset). just be sure to read their licensing terms before requesting it though if you do. Sorry I don't know many good resources I just want to see the project flourish.
@T-She-Go 3 years ago
Update: I managed to get an accuracy of 98% by changing the activation functions of the LSTM and Dense layers. 😌 Hope that this helps y'all who might be stuck on this.
Hi Nic 😌 me again 😅 So I'm trying to use a new data set of gestures and I can't seem to get an accuracy >20%. I have tried to change the learning rate, the optimiser, etc., but none of these work 🙈 Is there something that I am missing? Thank you in advance 🌸
@NicholasRenotte 3 years ago
How many gestures and how many classes? For really similar classes I'd suggest adding way more data in order to produce a more accurate model. Also, what activations did you change, curious?
@T-She-Go 3 years ago
@@NicholasRenotte I used 5 gestures: 2 were based on hand movements and 3 on head movements. I think I should've added more data, because the model could not tell the head gestures apart x_x Also, I changed the ReLU activations to sigmoid.
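The thread does not explain why swapping ReLU for sigmoid helped. One general property that may be relevant is that ReLU is unbounded, so a state fed back through a recurrent layer can grow step after step, while sigmoid keeps it between 0 and 1. The toy recurrence below only illustrates that property; the weights, the 30-step length, and the function names are assumptions, and it is not the model from the video.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def run_recurrence(activation, steps=30):
    """Apply h <- activation(W @ h) repeatedly and return the final state."""
    W = 1.5 * np.eye(2)  # illustrative recurrent weights, chosen to amplify
    h = np.ones(2)
    for _ in range(steps):
        h = activation(W @ h)
    return h

relu_state = run_recurrence(relu)        # grows by a factor of 1.5 per step
sigmoid_state = run_recurrence(sigmoid)  # stays strictly between 0 and 1
print(relu_state.max(), sigmoid_state.max())
```

In Keras, the corresponding change would be the `activation` argument of the `LSTM` and `Dense` layers. Whether it helps on a given dataset is something to check empirically.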
@mahmudanajnin9367 3 years ago
thank you so much..using sigmoid function really worked for me!
@T-She-Go 3 years ago
@@mahmudanajnin9367 Yaaay :) I'm glad
@mahmudanajnin9367 3 years ago
@@T-She-Go Can you tell me how to find out how many labels the confusion matrix is for? I have 5 classes in my project and yhat = [1, 0, 1, 1, 2, 0, 1, 0, 4, 3]. My confusion matrix gives 5 sets of arrays... I'm really confused. Is it related to the yhat values?
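If the matrices come from something like scikit-learn's `multilabel_confusion_matrix` (an assumption, since the thread does not say), then getting 5 arrays for 5 classes is expected: there is one 2x2 matrix per class, laid out as [[TN, FP], [FN, TP]]. The sketch below reproduces that layout with NumPy. `yhat` is taken from the comment above, while `ytrue` and the function name are made up for illustration.

```python
import numpy as np

def per_class_confusion(y_true, y_pred, n_classes):
    """Return shape (n_classes, 2, 2): one [[TN, FP], [FN, TP]] per class."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    out = np.zeros((n_classes, 2, 2), dtype=int)
    for c in range(n_classes):
        is_true = y_true == c   # samples that really belong to class c
        is_pred = y_pred == c   # samples predicted as class c
        out[c, 0, 0] = np.sum(~is_true & ~is_pred)  # true negatives
        out[c, 0, 1] = np.sum(~is_true & is_pred)   # false positives
        out[c, 1, 0] = np.sum(is_true & ~is_pred)   # false negatives
        out[c, 1, 1] = np.sum(is_true & is_pred)    # true positives
    return out

yhat = [1, 0, 1, 1, 2, 0, 1, 0, 4, 3]   # predictions quoted in the comment
ytrue = [1, 0, 1, 2, 2, 0, 1, 0, 4, 3]  # hypothetical ground truth
matrices = per_class_confusion(ytrue, yhat, n_classes=5)
print(matrices.shape)
```

So the number of arrays tracks the number of classes, and `yhat` only determines the counts inside each one.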
@barithiachudhan3034 3 years ago
Hi Nicholas, it was such a wonderful implementation, and thanks for sharing it with us.
@NicholasRenotte 3 years ago
So glad you enjoyed it!!
@amitdutta3875 3 years ago
you are great.
@NicholasRenotte 3 years ago
Thanks so much @Amit!
@udoysaha3086 8 months ago
Helped a lot.. Everything explained really well.. Thank you so much!
@girisathvikavpragatiengine309 1 year ago
Hey Nicholas, installing TensorFlow 2.4.1 is showing an error. It says "Could not find a version that satisfies the requirement tensorflow==2.4.1". Please help me out.
@alissiazaidi2631 10 months ago
Hey, did you find the solution? Actually, I have the same error...
@pareshgupta3288 8 months ago
@@alissiazaidi2631 Just change the version. On Windows, use pip install tensorflow==2.10.0; on Linux, use pip install tensorflow==2.16.0
@pavansai2838 3 months ago
Hey, did you find the solution for it?
@siva7702 2 months ago
Downgrade your Python version. TensorFlow 2.4.1 supports only Python 3.6-3.8.
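A quick way to check whether the running interpreter falls in that range before attempting the install is sketched below. The 3.6-3.8 range is taken from the comment above; the function and constant names are made up for illustration.

```python
import sys

# Supported range as stated in the comment above: Python 3.6 through 3.8.
TF_241_PYTHON_RANGE = ((3, 6), (3, 8))

def python_supports_tf_241(version=None):
    """Return True if the (major, minor) version falls in the stated range."""
    major_minor = tuple(version or sys.version_info)[:2]
    low, high = TF_241_PYTHON_RANGE
    return low <= major_minor <= high

print(python_supports_tf_241())         # checks the interpreter running this
print(python_supports_tf_241((3, 7)))   # inside the range
print(python_supports_tf_241((3, 11)))  # outside the range
```

If the check fails, the options raised in this thread are either creating an environment with an older Python or installing a newer TensorFlow release.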
@MuhammadKamran-ow5vp 1 year ago
It was really a great tutorial on real time sign language detection.
@knd3846 2 years ago
Hi, first of all, thanks for sharing your code for this brilliant work. Second, I am a beginner in Python, yet I have come quite far in running your code. At step 11 I am facing an error that keeps appearing, and I am exhausted because I have spent my whole day trying to find a solution. It keeps showing "TypeError: only size-1 arrays can be converted to Python scalars" after running the plt.imshow line. Please, I need help...
@xboxgaming4307 2 years ago
Facing the same issue, even though I followed all of the same steps. Seriously, I need help too.
@safamunir1510 2 years ago
I'm having the same issue in the code. Please help us fix this error.
@harryfeng4199 2 years ago
Did you manage to figure it out?
@knd3846 2 years ago
@@harryfeng4199 Nope. I have tried so many different things, but it's all in vain. I am at my last step, though.
@sowmyacheguri21 2 years ago
Hey! Did you figure it out?
@redabenlekehal7271 3 years ago
Brilliant as expected
@idkidk1774 3 years ago
finally it worked
@idkidk1774 3 years ago
Sir, how can I increase the accuracy?
@mrmoody915 1 year ago
@@idkidk1774 Create a for loop that trains the model on each iteration and then checks the accuracy. If it is higher than the previous highest accuracy, save the model and update the highest accuracy.
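The loop described above can be sketched as follows. This is a framework-agnostic outline, not code from the video: `train_and_evaluate` and `save` are hypothetical callables, and the stub below stands in for real training so the example runs on its own.

```python
def keep_best(train_and_evaluate, save, n_runs):
    """Train n_runs times; call save(model) whenever accuracy improves."""
    best_acc = float("-inf")
    for run in range(n_runs):
        model, acc = train_and_evaluate(run)
        if acc > best_acc:
            best_acc = acc
            save(model)
    return best_acc

# Stub standing in for real training: returns a fake model and accuracy.
fake_accuracies = [0.62, 0.81, 0.74]

def fake_train(run):
    return f"model-{run}", fake_accuracies[run]

saved = []  # records every model that was "saved"
best = keep_best(fake_train, saved.append, n_runs=3)
print(best, saved)
```

Comparing runs on a validation split rather than the test set avoids picking a model that merely got lucky on the test data. Keras also offers a `ModelCheckpoint` callback with `save_best_only=True` that does similar bookkeeping within a single training run.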
@mrmoody915 1 year ago
@@idkidk1774 Also, just add more data.
@aqsaqamar1634 1 year ago
@@mrmoody915 Can you please solve my error?
@mrmoody915 1 year ago
@@aqsaqamar1634 Which is?
@masterank2005 8 months ago
Can I use the method shown in this video, with a little alteration, for static gesture recognition, like single-frame hand gesture recognition? I know there is a video posted about that, but I didn't really like the labelling method shown in that video, which involved manually labelling each image.
@jeanpierrebravomendoza6470 9 months ago
I'm deaf help
@satyaranjansahoo8431 8 months ago
Use caption
@entertain7 7 months ago
Thanks for this amazing tutorial. I have a question: how do we build the reverse, meaning a text-to-sign-language translator?
@TheDreamsandTears 6 months ago
I want to know that too!!
@lesterhsu 1 month ago
Love this video. I can't believe that I just completed it.
@andy_rocky 20 days ago
Hey, can you please help me with this?
@lesterhsu 20 days ago
@@andy_rocky Sure. What's the problem you're having? The dependencies I use, set up with conda: python 3.12.5, tensorflow 2.18.0, mediapipe 0.10.14. Following the code in the video and the popular comments, you can fully reproduce the result shown in the video (the downloaded code may differ slightly from the one presented in the video).
@anirudhxmishra 12 days ago
@@lesterhsu I'm having a problem with my dependencies, and I'm unable to run this on Jupyter Notebook. What's your GitHub?
@LucasEloi 3 years ago
Nice work, thank you for the wonderful video!
@NicholasRenotte 3 years ago
Cheers @Lucas!
@ronakdubey581 1 year ago
Thanks a lot for this, man. The code seems to be working fine with a few changes. I have even added a speech function that speaks out the predictions, and it works pretty well.
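The comment does not show how the speech function was built, so the following is only one possible sketch. It assumes the prediction loop yields a label and a confidence for each frame, and it speaks a label only when it is confident and different from the last one spoken, so the same word is not repeated on every frame. `make_announcer`, the 0.8 threshold, and the example labels are all hypothetical.

```python
def make_announcer(speak, threshold=0.8):
    """Return a function that speaks a label only when it changes and is confident."""
    last_spoken = None

    def announce(label, confidence):
        nonlocal last_spoken
        if confidence < threshold or label == last_spoken:
            return False  # low confidence, or already said this label
        speak(label)
        last_spoken = label
        return True

    return announce

# Here `speak` just records the text; a real text-to-speech call would go there.
spoken = []
announce = make_announcer(spoken.append)
for label, conf in [("hello", 0.95), ("hello", 0.97), ("thanks", 0.5), ("thanks", 0.9)]:
    announce(label, conf)
print(spoken)  # ['hello', 'thanks']
```

Passing `speak` in as a parameter keeps the debouncing logic separate from the audio backend. A text-to-speech library such as pyttsx3 could fill that role, for example with a small wrapper around `engine.say(text)` followed by `engine.runAndWait()`.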
@VanderlanAlves7 1 year ago
How did you do that? I want to do it too, but I am a beginner.
@unnathi8796 8 months ago
@@VanderlanAlves7 Did you do it? How do you do it?
@TheDreamsandTears 6 months ago
How did you do it?
@TheDreamsandTears 6 months ago
Can you share?
@matts2581 1 year ago
Excellent instruction! TY very much for sharing! :)
@mcbacker 3 years ago
Nicholas, first I would like to thank you for all the content you produce. I've learned a lot from it! Tell me, why did you choose 64/128/64 units for the LSTM layers? If my data doesn't have the face keypoints, do you think fewer units would yield better accuracy?
@NicholasRenotte 3 years ago
I just threw a small network together, choosing an appropriate number of units is part of the process of building a sufficiently performant model. I'd keep the same number of units and try passing through your data as is!
@mcbacker 3 years ago
@@NicholasRenotte Hi Nicholas, thanks for replying. I'm new to ML and I'm struggling to find a "formula" or some kind of guidance to help pick the best parameters for an LSTM. Is trial and error the best approach?
@Sutirtha 3 years ago
@@mcbacker I think you can use hyperparameter tuning to get the best parameters. You can use Ray Tune.
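Tuning libraries automate a search that can be sketched by hand as a plain grid search: try each combination of settings, score it, and keep the best. The outline below is not tied to any library. `evaluate` is a hypothetical function that would train a model with the given settings and return its validation accuracy; the stub and the grid values are made up so the example runs.

```python
from itertools import product

def grid_search(evaluate, grid):
    """Try every combination in grid and return (best_params, best_score)."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Stub scorer standing in for "train the LSTM and measure validation accuracy".
def fake_val_accuracy(params):
    return 1.0 - abs(params["lstm_units"] - 64) / 256 - abs(params["learning_rate"] - 1e-3)

grid = {"lstm_units": [32, 64, 128], "learning_rate": [1e-2, 1e-3]}
best_params, best_score = grid_search(fake_val_accuracy, grid)
print(best_params, best_score)
```

With real training inside `evaluate`, each combination costs a full training run, which is why dedicated tuners that stop weak trials early tend to be preferred over an exhaustive grid.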
@chamangupta4624 3 years ago
Very good project, very well implemented.
@sahanahiremath8945 2 years ago
This helped me sooo muchhhh! Thanks.
@muktabhushan7068 2 years ago
Hey Nick, at 4:12 in your video you got an error. How did you resolve it? I am getting the same error.
@ashurroganathan8632 3 years ago
Always Great Videos :). I have learned many Things from you. Thx
@NicholasRenotte 3 years ago
Thanks so much @Ashur! So glad you're enjoying it!
@Sutirtha 3 years ago
Thank you so much for the video. The x, y, z values change based on the position of the person and the camera. How can we transform the keypoints so that the relative body coordinates stay the same regardless of how we move within the camera frame?
@NicholasRenotte 3 years ago
I'm not sure I follow. The keypoints will always be different as they're tracking the person (if the person moves, so do the keypoints). Got a use case for me?
@OsazeOgedegbe 1 year ago
Hello Nicholas, I really enjoyed this tutorial. I wanted to ask if there is a way to normalize the x, y and z coordinates so they are not dependent on their position in the frame.
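One common way to reduce this dependence is to express each set of landmarks relative to a reference landmark and divide by the spread of the points. The sketch below is not from the video. It assumes landmarks arrive as an (n_landmarks, 3) array of x, y, z values, uses index 0 as the reference (the wrist, in MediaPipe's hand model), and returns zeros for the all-zero placeholder used when nothing was detected.

```python
import numpy as np

def normalize_landmarks(points, ref_index=0):
    """Center on a reference landmark and scale by the farthest point from it."""
    points = np.asarray(points, dtype=float)
    centered = points - points[ref_index]
    scale = np.linalg.norm(centered, axis=1).max()
    if scale == 0:
        # All points coincide, e.g. an all-zero placeholder for a missing hand.
        return np.zeros_like(points)
    return centered / scale

# Illustrative three-point "hand", then the same shape shifted and enlarged.
hand = np.array([[0.5, 0.5, 0.0], [0.6, 0.4, 0.1], [0.7, 0.3, 0.2]])
moved = hand * 2.0 + 0.25

print(np.allclose(normalize_landmarks(hand), normalize_landmarks(moved)))
```

The trade-off is that normalizing each hand on its own discards where the hand sits relative to the face or body, which may matter for some signs. Normalizing the whole pose against a shared reference point is one alternative.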
@budiutomospd5803 2 years ago
Super, super. Hi Nick, it's me, Budi Conrad Bali.
@khaledchikh90 1 year ago
What if the sequence length varies? E.g., each video contains a different number of frames (not always 30; one video might contain 10 frames, another 50).
@meetvardoriya2550 3 years ago
Another biggeeeeeee on the heap!,amazing sir❤️🙏
@NicholasRenotte 3 years ago
YESSS! The big videos are quickly becoming my fav to make, lmk what you think @Meet Vardoriya!
@dantealonso7174 3 years ago
Thanks a lot for this content, I've been learning a lot, you are a god :)
@NicholasRenotte 3 years ago
Keep on learning my guy! Love that you're smashing them!
@abhishripatil791 3 years ago
Thank you for this. It helped me so much with my project, especially making the dataset.
@alexandregagne4151 3 years ago
Very good video. Thank you for your hard work :) New subs
@anusornchaikaew2567 2 years ago
Nicholas Renotte! Thank you so much.
@kiddicode6897 2 years ago
Wow, thank you. I like all your videos. You are very intelligent.
@joshgibson539 3 years ago
There's a dataset called MS-ASL with 1,000 different sign (word) classes from video. Maybe you can use that, or some other extensive dataset beyond the letters and numbers everyone else uses.
@NicholasRenotte 3 years ago
Heya @Josh, will take a look. Sounds promising!
@joshgibson539 3 years ago
@@NicholasRenotte There's also the American Sign Language Lexicon Video Dataset (ASLLVD), consisting of videos of >3,300 ASL signs in citation form, each produced by 1-6 native ASL signers, for a total of almost 9,800 tokens. This dataset also includes multiple synchronized videos showing the signing from different angles.
@joshgibson539 3 years ago
@@NicholasRenotte Also, there's ASL-LEX, a database of lexical and phonological properties of American Sign Language signs. It was first released in 2016 with nearly 1,000 signs. ASL-LEX was updated in Fall 2020 with greatly expanded information and an increased size of 2,723 signs. ASL-LEX is available as a searchable web interface and as raw data in spreadsheet form. This website hosts the web visualization and provides instructions for how to use and download the database.
@joshgibson539 3 years ago
@@NicholasRenotte Actually, never mind about those last two; they don't seem to be publicly available to me. I can only find searchable video databases such as SigningSavvy, ASLSignBank, HandSpeak, and SpreadTheSign, which all come up as "ASL Sign Language Video Dictionary" when you Google them. Each word needs to be searched individually, though. If the websites' videos were scraped along with their metadata, I guess it would be quicker. It's nice that you can get multiple videos of the same word from the different databases, although it's probably still not the best way of doing it. It just sucks that, in general, they don't make the data publicly downloadable as a single file. There's also software called SignAll SDK, which seems to be locked away too. Way to hold back humanity, everyone.
@shrirampareek 3 years ago
Hey, I am currently implementing the same thing. I am using publicly available ISL videos covering around 4,000 everyday word gestures. I tried it on a subset of 26 words relating to countries and continents, tweaked some model parameters, and am able to get good accuracy on the test data. But when I run the same model with a webcam, I keep getting the same 4 classes, and this has been bothering me. I would really appreciate any help you can provide.
@adityashinde6265 1 year ago
Wow!! Such a helpful video. Thank you very much.
@lincoln169 3 years ago
I love your videos Nicholas 🙂💙
@NicholasRenotte 3 years ago
Thanks a bunch!
@DarkOceanShark 3 years ago
Thank you so much, Nick! Your video is fantastic, and I have to say your method of teaching is top-notch, mate. I am using your video for my project to interpret all 26 letter signs in ASL. Could you please tell me how to train the model using an already available dataset instead of creating one ourselves, as is done in the video? Your help will be much appreciated. Even a pointer to one of your videos where you use an existing dataset will suffice.
@amessit10 2 years ago
Hi, were you able to do it for 26 letters? Can you help me?
@sazidshaik4577 3 years ago
Thanks For Considering My Comments And Did with LSTM Love You and Really Good
@NicholasRenotte 3 years ago
Anytime, it was a long time coming but it's here!!
@AnhLe-hc8qm 3 years ago
most useful video i've seen
@NicholasRenotte 3 years ago
Oh thank you SOOOOO much! So glad you liked it!
@jupjyotkhosla4640 2 years ago
Great video, Nicholas! I've been following your videos since the start of the Sign Language Detection series. I'm working on a similar project for my college, and my questions are:
1. Can I use this model for detecting moving/dynamic gestures?
2. If so, how can we make sure that our model detects the necessary frames to interpret the moving sign?
@NicholasRenotte 2 years ago
1. Yep.
2. If the sign is longer than what's passed through as the input, you can always increase the max sequence length and pad any blank frames.
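The padding idea can be sketched as follows, assuming each sequence is an array of shape (n_frames, n_features). This is an illustrative helper rather than code from the video: shorter sequences get zero-filled frames appended, and longer ones are truncated to `max_len`. The 4-feature frames are placeholders for the much longer keypoint vectors.

```python
import numpy as np

def pad_sequence(frames, max_len):
    """Zero-pad or truncate a (n_frames, n_features) array to max_len frames."""
    frames = np.asarray(frames, dtype=float)
    n_frames, n_features = frames.shape
    if n_frames >= max_len:
        return frames[:max_len]  # keep only the first max_len frames
    padding = np.zeros((max_len - n_frames, n_features))
    return np.vstack([frames, padding])

short_clip = np.ones((10, 4))  # hypothetical 10-frame sequence
long_clip = np.ones((50, 4))   # hypothetical 50-frame sequence

padded = pad_sequence(short_clip, max_len=30)
trimmed = pad_sequence(long_clip, max_len=30)
print(padded.shape, trimmed.shape)
```

Keras ships a `pad_sequences` utility for the same job, and a `Masking` layer placed before the LSTM can tell the network to skip the all-zero frames. Whether truncating or resampling longer clips works better depends on where the informative part of the sign falls.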
@jupjyotkhosla4640 2 years ago
@@NicholasRenotte Thank you so much!❤
@memsofgamers9479 3 years ago
Best lecture 😍 Sir, will you please make a full video for beginners?
@yusufcan1304 6 months ago
Thanks, dude. What is the Python version?
@ramanandr7562 10 months ago
Hi, can you do a video on how to integrate this model into a Flutter project... 🙂