00:01 This video demonstrates sign language detection using action recognition with Python.
01:40 The video discusses the process of sign language detection using action recognition and an LSTM deep learning model.
05:16 MediaPipe Holistic allows us to get keypoints from the face, body, and hands.
07:17 Setting up webcam access and rendering frames using OpenCV.
11:06 The code captures frames from a webcam and displays them on the screen.
12:46 Setting up MediaPipe Holistic and creating variables for MediaPipe Holistic and MediaPipe Drawing Utilities.
16:46 The video explains the process of color conversion in sign language detection.
18:32 The process involves detecting sign language using MediaPipe and a deep learning model.
21:59 The video discusses the different types of landmarks in sign language detection using action recognition.
23:23 The video explains how to detect and visualize different types of landmarks using MediaPipe.
27:05 The video discusses how landmarks in the facial and body pose can be connected to each other.
28:37 Implementing sign language detection using an LSTM deep learning model in Python.
32:18 Landmarks are drawn and rendered in real time using the image passed in and cv2.
33:55 You can customize the formatting of the dots and connections using a landmark drawing spec and a connection drawing spec.
37:32 Updating pose and hand landmarks with different colors and parameters.
39:31 Different models in action: left hand, right hand, face, and pose.
43:09 The code demonstrates how to extract landmark values using pose estimation.
45:04 The video explains how to reshape and convert landmarks into a single array.
48:27 Building a neural network and extracting keypoints using action recognition with Python.
50:10 Setting up error handling and placeholder arrays for pose and face landmarks.
53:52 The video explains how to extract keypoints for sign language detection using an LSTM deep learning model in Python.
55:31 Concatenating pose, face, left hand, and right hand keypoints for sign language detection.
59:11 Using an LSTM deep learning model to detect sign language actions.
1:00:57 Creating folders to store data for different actions and sequences.
1:04:16 Creates a folder structure for sign language detection using action recognition with Python.
1:05:48 Collecting data using the MediaPipe loop and capturing snapshots at each point in time.
1:10:14 The code outputs text to the screen and takes a break at frame 0.
1:11:44 The first block of code prints "starting collection" in the middle of the screen and pauses.
1:15:10 The code collects keypoints by looping through actions, sequences, and frames.
1:16:39 Implementing sign language detection using action recognition with an LSTM deep learning model.
1:20:25 Sign language detection using action recognition with Python.
1:23:20 Using MediaPipe to collect keypoints for sign language detection.
1:26:55 Creating a dictionary to map labels to numeric IDs.
1:29:08 Sequences represent the feature data and labels represent the y data.
1:32:26 Data preprocessing and train/test partitioning are important steps in sign language detection using an LSTM deep learning model.
1:34:12 Training the LSTM neural network using TensorFlow and Keras.
1:38:14 The model uses LSTM layers for sign language detection.
1:39:55 The next three layers are dense layers using fully connected neurons.
1:43:16 The video discusses the process of formulating a neural network for sign language detection using action recognition and an LSTM deep learning model.
1:44:58 Training the model with 2000 epochs.
1:48:11 The training accuracy is high at 93.75% after 173 epochs.
1:49:37 The model has three LSTM layers and dense layers, with a small number of parameters to train.
1:53:13 Reloading a deleted model and evaluating its performance using scikit-learn.
1:55:03 Converting y test and y hat values to matrices and then evaluating the model performance using a confusion matrix and accuracy score.
1:58:18 Implementing prediction logic by concatenating data onto the sequence and making detections when 30 frames of data are available.
2:00:30 Implement logic to grab the last 30 sets of keypoints for generating predictions.
2:04:18 Implementing visualization logic and checking the result threshold and sentence length.
2:06:45 The code checks if the current action matches the last sentence in the string.
2:10:05 Sign language detection using an LSTM deep learning model.
2:12:39 The video discusses sign language detection using action recognition with Python using an LSTM deep learning model.
2:17:13 The video discusses sign language detection using action recognition with Python.
2:19:14 Sign language detection using action recognition with Python.
2:22:29 To ensure accurate action detection, the last frame needs to be included in the sequence.
2:24:04 The code implementation adds stability by checking if the last 10 frames have the same prediction.
@kanchanpatil9642 Жыл бұрын
as someone who is following this in 2023, here are some changes... I'll be editing them in as they pop up while I go through the tutorial. 25:42 FACE_CONNECTIONS seems to be renamed/replaced by FACEMESH_TESSELATION. And since we want just the outlines of the face, it's FACEMESH_CONTOURS that we need in this project.
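A minimal sketch of that rename fix, assuming `image` and `results` come from the tutorial's `mediapipe_detection` helper:

```python
import mediapipe as mp

mp_holistic = mp.solutions.holistic      # holistic model
mp_drawing = mp.solutions.drawing_utils  # drawing helpers

# Newer MediaPipe releases removed FACE_CONNECTIONS; use FACEMESH_CONTOURS for the
# face outline (or FACEMESH_TESSELATION for the full mesh) instead.
mp_drawing.draw_landmarks(image, results.face_landmarks, mp_holistic.FACEMESH_CONTOURS)
```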
@taredje4664 Жыл бұрын
thanks, you saved me
@VanderlanAlves7 Жыл бұрын
wow! Thank you very much!!!
@interstellarstar374210 ай бұрын
hey, I can't collect the data. How do I save it?
@stinger92316 ай бұрын
Thank You so much, got stuck there for a minute
@simranmehta2778Ай бұрын
Thankyou soo much
@girishkemba38653 жыл бұрын
I remember some time ago requesting this type of video, but to see that it's finally here brings me joy. Can't wait to do this and show it to my sign language friends.
@NicholasRenotte3 жыл бұрын
I know right, it's taken a while but finally it's here! Thanks for sharing!
@pradeepsaravanan7712Ай бұрын
did you show to your friends ?
@savi-208410 ай бұрын
I cannot thank you enough for all the videos you create. I was a noob in tech, but I started watching your videos, it's been a year now, and I am so proud of you and myself for coming this far. And this project works for me ❤
@simranmehta2778Ай бұрын
Just completed this project right now. Feeling extremely motivated to do more projects in the future. I made a couple of changes in this project: I still used LSTM layers but with the tanh activation function, and I added a Dropout layer and L2 regularization to prevent overfitting. At the end I added an audio feature, so if the model predicts 'hello', a hello audio clip will play.
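A rough sketch of the kind of changes described in the comment above, built on the tutorial's 30-frame x 1662-keypoint input; the exact layer sizes, dropout rate, and the playsound hook are assumptions, not the commenter's actual code:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
from tensorflow.keras.regularizers import l2

actions = np.array(['hello', 'thanks', 'iloveyou'])  # the tutorial's three signs

model = Sequential([
    LSTM(64, return_sequences=True, activation='tanh', input_shape=(30, 1662)),
    Dropout(0.2),
    LSTM(128, return_sequences=True, activation='tanh', kernel_regularizer=l2(0.001)),
    Dropout(0.2),
    LSTM(64, return_sequences=False, activation='tanh'),
    Dense(64, activation='relu', kernel_regularizer=l2(0.001)),
    Dense(32, activation='relu'),
    Dense(len(actions), activation='softmax'),
])
model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['categorical_accuracy'])

# Hypothetical audio hook inside the real-time loop:
# if actions[np.argmax(res)] == 'hello':
#     playsound('hello.mp3')   # requires the playsound package and a local audio file
```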
@danishkhalid202726 күн бұрын
hey Simran, did you add more words? And is your project detecting at sentence level and converting it into text and voice? I want to see your project; I currently want to work on it. Would be great 😀
@tech_voyagerАй бұрын
Dude you are amazing i just almost completed my graduation project!!
@김미소2-z2w Жыл бұрын
Thank you so much! You are my best teacher in my college life!!!!
@SABEDIT29149 ай бұрын
Did you make this project?
@Stacio63 жыл бұрын
Hi Nicholas thanks so much !!!! I am creating a model to help deaf people here in my country. Greetings from Guatemala !!!
@NicholasRenotte3 жыл бұрын
Awesome stuff!!
@aminberjaouitahmaz41213 жыл бұрын
Thank you for these clear, practical, straight to the point tutorials! Looking forward to your future videos!
@NicholasRenotte3 жыл бұрын
Cheers @Amin, so pumped you're enjoying them!
@aqsaqamar1634 Жыл бұрын
@Nicholas Renotte can you tell me why I am getting the error: module 'mediapipe.python.solutions.holistic' has no attribute 'FACE_CONNECTIONS'?
@charank28945 ай бұрын
@@aqsaqamar1634 replace FACE_CONNECTIONS with FACEMESH_CONTOURS
@engeerdanisme Жыл бұрын
Thank you @Nicholas Renotte I just passed my capstone project defense utilizing this deep learning model
@yohanessatria22203 жыл бұрын
Man, you are so underrated and deserve a lot more! thanks a lot for these awesome learning materials! I have learned a lot from you. Keep inspiring, man :)
@NicholasRenotte3 жыл бұрын
Thanks so much @Yohanes! So glad you're enjoying them 🙏
@Rohan_is_discovering6 ай бұрын
Someone just completed his internship with the help of your code and also got a certificate from an IT company
@gaddesaishailesh27723 жыл бұрын
I was really waiting for this video!
@NicholasRenotte3 жыл бұрын
IKR, it's taken a little while hey @Gadde Sai Shailesh!
@malice1122 жыл бұрын
Nicholas is the best machine learning youtuber, his tutorials are interesting and fun.
@study_with_thor3 жыл бұрын
that's amazing! I watched this video more than a month ago but it seemed difficult for me as a beginner. Then I tried my best to finish Machine Learning / Deep Learning / Python / TensorFlow and some Data Science courses within a month. Now watching this video again is like watching a movie! It's easy to follow! Love it
@NicholasRenotte3 жыл бұрын
YESSS! That's amazing that you stuck with it, great work man!!
@ruqaiyaali16452 жыл бұрын
you finished ML/DL/Python and Data science course within a month!! how is this possible man? I am having a hard time with these courses 🥲
@study_with_thor2 жыл бұрын
@@ruqaiyaali1645 I think you must be familiar with python code. Make sure practice more than what you learn.
@nguyenvietthai5868 Жыл бұрын
@@study_with_thor are you Vietnamese. I see your name. Can you give me some experience please? If so, please respond to me. Thanks a lot.
@study_with_thor Жыл бұрын
@@nguyenvietthai5868 Hi there, please let me know your concerns, I hope that I could help you too.
@rainymatch7 ай бұрын
It's so cool to see how happy Nicholas is when everything works in the end. That's the spirit! Amazing video, thanks a lot for your work man!
@danieladama81053 жыл бұрын
Can’t lie.. I have learnt a lot from Nicholas
@NicholasRenotte3 жыл бұрын
My man! Thanks for checking in!
@PIKACHU-zn8fx9 ай бұрын
agreed still learning from him
@giteshpal405Ай бұрын
did u add more actions of dataset?
@Cheese_Academia Жыл бұрын
Thanks for the amazing tutorials! absolutely life-saving. Just a reminder that the z value from mediapipe is with respect to the wrist landmark not the distance from the camera! I found out pretty late!
@leafiadias963 жыл бұрын
thanks for this amazing tutorial sir , we are working on a project that needed this section and your videos and explanation are being extremely helpful to me and my team ! thanks a lot
@fawwazhameed11042 жыл бұрын
Heyy leafia, could you tell me about your project?
@Nikos_prinio2 жыл бұрын
Hi ! I'm impressed by the amazing clarity of your explanations. For one second I thought you must be a trained teacher robot....
@torstenknodt68662 жыл бұрын
Thanks, great videos. Would be great if you could elaborate into the differences of the used media pipe implementation, compared to the others you mentioned. I mean really a comparison of the underlying models/ networks and their training.
@stevecoxiscool3 жыл бұрын
Great explanation on how to use LSTM with pose coordinates.
@ibrahimalizada3813 жыл бұрын
Hi, Nicholas! These are great video series to watch and learn from! Thank you very much! Can you please prepare a video applying CV to real-time sign language detection based on a ready-made dataset available on the Internet? It may be much more interesting if we can see ViT in action recognition as well.
@VarunAditTheGreat Жыл бұрын
Hey, I am trying to build a project with a bigger dataset for ASL. Did you find any dataset?
@asutoshpatro2865 Жыл бұрын
@@VarunAditTheGreat I have found it, it's the WLASL dataset. Did you make it? Please share the code link.
@ruthogadina757 Жыл бұрын
i'm learning about this, would you like to work on a project together?
@ishaanverma1969 Жыл бұрын
This content is so underrated! Thank you so much!
@dinukii33322 жыл бұрын
Hi Nicholas! Thank you for your tutorial once again. Quick question, How can I change the code to access a folder that contains a dataset of videos without live capturing them? Really appreciate if you could give an answer :)
@NicholasRenotte2 жыл бұрын
You could loop through each one of the videos by using os.listdir or the tensorflow dataset class then run it through the mp holistic pipeline!
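A minimal sketch of what that reply describes, looping pre-recorded clips through the holistic pipeline instead of the webcam; the folder names are hypothetical, and `mediapipe_detection` / `extract_keypoints` are the helper functions from the tutorial:

```python
import os
import cv2
import numpy as np
import mediapipe as mp

mp_holistic = mp.solutions.holistic
VIDEO_DIR = 'videos/hello'   # hypothetical folder of pre-recorded clips for one action
SAVE_DIR = 'MP_Data/hello'   # where per-frame keypoint arrays get written

with mp_holistic.Holistic(min_detection_confidence=0.5, min_tracking_confidence=0.5) as holistic:
    for seq, video_file in enumerate(os.listdir(VIDEO_DIR)):
        cap = cv2.VideoCapture(os.path.join(VIDEO_DIR, video_file))
        frame_num = 0
        os.makedirs(os.path.join(SAVE_DIR, str(seq)), exist_ok=True)
        while cap.isOpened():
            ret, frame = cap.read()
            if not ret:
                break
            image, results = mediapipe_detection(frame, holistic)  # tutorial helper
            keypoints = extract_keypoints(results)                 # tutorial helper
            np.save(os.path.join(SAVE_DIR, str(seq), str(frame_num)), keypoints)
            frame_num += 1
        cap.release()
```

Since clips rarely contain exactly 30 frames, each saved sequence would still need trimming or padding to a fixed length before training.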
@dinukii33322 жыл бұрын
@@NicholasRenotte Thank u:)
@HannahCynthia-mu4ct2 жыл бұрын
Heyy. Do you know the exact code to loop through video dataset?
@riadhaoufi94522 жыл бұрын
@@HannahCynthia-mu4ct i'm looking for it too, i hope he gets to answer up thank you so much for the video brother @Nicholas Renotte
@riadhaoufi94522 жыл бұрын
@@NicholasRenotte i'm so lost brother :(
@rabiraj13873 жыл бұрын
Awaited Video Nicholas hope to complete it and implement on my side.
@NicholasRenotte3 жыл бұрын
I know, can't believe it's finally out! Let me know how you go @Rabi Raj!
@theethatanuraksoontorn13693 жыл бұрын
Hi Nicholas, been working on a similar project. I believe this tutorial is kept simple on purpose, so I would like to add a piece of my mind. When adding more actions, the real-time predictions get mixed up a lot due to frame overlap and wrong slicing of the frames. I would suggest showing some visualization for the start and end of the prediction, so the user can follow from the start frame to the end frame. This way it is similar to the way the data was collected, and prediction accuracy is higher.
@giteshpal405Ай бұрын
hava u add more actions in your project ?
@yousseffarhan89013 жыл бұрын
You can't imagine how much you helped me. Thank you very much 🙏🏼
@NicholasRenotte3 жыл бұрын
🙏
@tigre12172 жыл бұрын
Hi nick! Nice tutorial on this sign language recognition program. I faced a problem where the categorical accuracy stays the same when I try to add more signs to the model than the 3 you used in the video. Is there any way to solve this issue? Thanks!
@rryann0882 жыл бұрын
Hi, are you still working on it?
@labhjoshi3182 Жыл бұрын
@@rryann088 same question
@harrylee976253 жыл бұрын
Nicholas certainly deserves more views.
@NicholasRenotte3 жыл бұрын
Awww, thanks @Harry. Much appreciated man!
@rowlandgoddy-worlu33822 жыл бұрын
This is an amazing video! I have learned a lot following your tutorials. One question - what if you are trying to capture actions that are not of equal duration? E.g. if a sign like "Good Morning" lasts for 5 seconds and another sign like "Welcome" lasts for 9 seconds. How can this be treated?
@032lovishkumar88 ай бұрын
hey, i am getting error IndexError: list index out of range while running 2:00:10 , how can i resolve it ?
@ibrahimhameem13343 жыл бұрын
Super stuff Nicholas! Super grateful for your tutorials 🙌🏻. Keep up the great work!
@NicholasRenotte3 жыл бұрын
Thanks so much @Ibrahim, soooo much more to come!
@akshatraj5952 Жыл бұрын
Videos that you make is wonderful. Thank you for these practical and clear points in the tutorials.
@usamaejaz5264 Жыл бұрын
The MP_Data folder is missing, so where do we get it from?
@piresflp3 жыл бұрын
Hi Nicholas, thanks for the awesome tutorial! I've got 3 questions about the project, hope you don't mind helping me: 1. When training my model, i've 90%+ accuracy very quickly (150 epochs more or less), but all of sudden it dropped to 30% and kinda stabilized until the rest of the execution, how can I fix it? 2. If I want to add more signs after first training my model, I'll have to re-train it? Or can I train just those specific signs separately? How do I do it? 3. After the model is working just fine, it is possible to attach the real time script to an android app?
@howcircle55303 жыл бұрын
i also wanna know about your 3rd question. 🤓
@NicholasRenotte3 жыл бұрын
1. So accuracy never went back up? Try adding more data for each class depending on what's performing well/not well. 2. You can apply transfer learning, drop the final layer, add a new layer which has the same number of classes as your new signs then retrain 3. Yes, I haven't shown it here as it's probably a whole other video though!
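A hedged sketch of point 2 above (drop the old output layer, attach a new head for the new signs, retrain); the file name, class count, and the choice to freeze earlier layers are assumptions:

```python
from tensorflow.keras.models import Model, load_model
from tensorflow.keras.layers import Dense

base = load_model('action.h5')            # assumes the model was saved as in the tutorial

new_num_classes = 5                        # assumption: the new sign list has 5 classes
x = base.layers[-2].output                 # output of the layer just before the old softmax
new_output = Dense(new_num_classes, activation='softmax', name='new_output')(x)
model = Model(inputs=base.input, outputs=new_output)

# Optionally freeze the earlier layers so only the new head is trained at first
for layer in model.layers[:-1]:
    layer.trainable = False

model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['categorical_accuracy'])
# model.fit(X_new, y_new, epochs=200)      # X_new / y_new built the same way as in the tutorial
```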
@tigre12172 жыл бұрын
@@NicholasRenotte Hi Nick, can you elaborate more on the 2nd point? I was quite confused since it is my first deep learning project. Thanks!
@adriamasitoribio Жыл бұрын
@@tigre1217 hey! did you figure it out?
@giteshpal405Ай бұрын
have u done it?
@rusticagenerica2 жыл бұрын
Exceptional tutorial. Thank you from the bottom of my heart.
@MuhammadKamran-ow5vp Жыл бұрын
I have a question. Is it possible to feed video of arbitrary lengths (frames) instead of feeding an action of fixed length video? Because in real time, we perform sign language pretty fast and each action is of arbitrary length.
@latestdramas63515 ай бұрын
Excuse me
@T-She-Go3 жыл бұрын
Thank you so much Nicholas 😌 This will help me with my project 🙌🏾
@yashas_hm2 жыл бұрын
Hi Nicolas, such an amazing video. Helped me a lot building a project. I am working on a different project in which I trained the model with around 20 signs from ASL, but I am getting a categorical accuracy of only 0.05 on average in each epoch. Can you tell me where I went wrong or anything to improve it?
@martinposso20982 жыл бұрын
hey how you managed to fix that problem?
@phoque63 жыл бұрын
Thank you for a detailed and wonderful mediapipe tutorial :)
@NicholasRenotte3 жыл бұрын
So glad you liked it!
@mervesisci49833 жыл бұрын
Hi Nicholas, Thank you for this amazing tutorial. If we use padding in this case (videos containing movements with different number of frames) how can we make predictions in real-time? In the tutorial you set a fixed length (30 frames) (sequence=sequence[:30] if len(sequence)==30), but in my case there are different frame sizes for each activity in real-time prediction.
@abhisekpanigrahi10332 жыл бұрын
Hello Nicholas I also have this question. Can you please answer this what if the dimensions are different each time we run
@nnamakah2 жыл бұрын
Hi Nicholas, thanks for this project, it is incredible. How would you handle video files with varying number of frames? How can I possibly approach the situation?
@matteosacco006 ай бұрын
Same question, anyone with suggestions?
@depallyyadaiahgoud750 Жыл бұрын
That's way cooler one and your explanation was a ton easier 😉 Thanks Nick
@angelortiz35642 жыл бұрын
This is so awesome! You can theoretically do the same for the static letters in the ASL alphabet, right? Just make a dataset that contains each hand sign. The model would be trained on the keypoints of each hand sign. Although I am not sure that for some hand-sign letters the keypoints would be accurate. What do you think?
@anshumanchoudhary4732 Жыл бұрын
That model would be far more easier to achieve
@eswar778110 ай бұрын
@@anshumanchoudhary4732 which model?
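For the static-letter idea discussed above, a minimal sketch of a keypoint-based classifier; the 63-value input assumes one hand (21 landmarks x 3 coordinates), and the data arrays are hypothetical:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

num_letters = 26          # A-Z, assuming static fingerspelling only
model = Sequential([
    Dense(128, activation='relu', input_shape=(63,)),  # 21 hand landmarks x (x, y, z)
    Dense(64, activation='relu'),
    Dense(num_letters, activation='softmax'),
])
model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['categorical_accuracy'])
# model.fit(X_letters, y_letters, epochs=200)   # X_letters: (n_samples, 63), y_letters: one-hot
```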
@arpanroy28923 жыл бұрын
Every video of yours, slightly edited, goes directly into my CV 🤣🤣🤣🤣, thanks for taking care of my future ❤❤❤
@NicholasRenotte3 жыл бұрын
Hahaha, build that experience man and go getem!
@y.yuvraj2 жыл бұрын
Hii Nicholas, this is really an amazing tutorial, I really appreciate it. But I am getting a ValueError when fitting the model: "Failed to convert a NumPy array to Tensor". I tried many things but it is not going away, so please give me a hand with this.
@another.nikhil2 жыл бұрын
check the datatype of the inputs in your model. The keras api only accepts numpy arrays.
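A quick sketch of that check, assuming `sequences` and `labels` are built as in the tutorial:

```python
import numpy as np
from tensorflow.keras.utils import to_categorical

X = np.array(sequences)                 # expected shape: (num_sequences, 30, 1662)
y = to_categorical(labels).astype(int)  # one-hot labels, as in the tutorial
print(X.shape, X.dtype, y.shape)        # dtype=object (or an error on the line above) usually
                                        # means one sequence has a different number of frames
X = X.astype(np.float32)                # hand Keras a plain float array
```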
@soumendas23363 жыл бұрын
thank you Nicholas i have learned a lot of things from this video ....that I was looking for the past few months..
@NicholasRenotte3 жыл бұрын
Yesss! So glad!
@WJ-zq3xo3 жыл бұрын
Great tutorial as usual, Nick! Learning a lot from you :D Did anyone try to use a set of videos instead of recording their own videos? If yes, what did you change in the code base? Kudos
@shrirampareek3 жыл бұрын
Hey! I used some set of videos(26) and was able to get 92% on test dataset however when I tried doing the same gestures using webcam, I get same 4 classes all the time
@amessit102 жыл бұрын
@@shrirampareek can we implement this project for 26 letters? I am getting the error "list index out of range" when trying to do more than 3 actions
@neerajpatil78502 жыл бұрын
@@amessit10 Same for me! Have you figured out why the error happens?
@amessit102 жыл бұрын
@@neerajpatil7850 No man, i closed this project coz i only need hand gestures not full body keypoints
@amessit102 жыл бұрын
hands occlude each other, so recognition fails
@hamednasr30782 жыл бұрын
I wish you recorded all your videos with zoom and font size of 22:30, it is really great 🙂
@NicholasRenotte2 жыл бұрын
Yeah I've gotta work out how to do it, I just can't code with that amount of zoom though @Hamed. Will see what I can do!
@mahmudanajnin93673 жыл бұрын
hey nick! this project is amazing! thank you for these awesome tutorials. You did sign language detection with tensorflow object detection which detects sign using single frame but here we're using multiple frames to detect it. So i was wondering how is this one better than tensorflow object detection?
@NicholasRenotte3 жыл бұрын
Just depends on the use case, the OD model does it on a single frame, this does it for multiple frames (this one is better for signs with multiple phases)
@ericklasco3 ай бұрын
2024 and this is still useful, thank you Nicholas👍
@anirbansaha2443 ай бұрын
hey did you complete the project?
@ericklasco3 ай бұрын
@@anirbansaha244 yes
@whisplay3 ай бұрын
@@ericklasco can you please share which version dependence you used for importing
@siva77022 ай бұрын
Please provide versions bro, like what is python version you used
@whisplay2 ай бұрын
@@siva7702 Python 3.12.5, with these installs: !pip install tensorflow opencv-python mediapipe scikit-learn matplotlib and !pip install --upgrade mediapipe. Note: use Jupyter Notebook! With this code the camera can't be accessed on Google Colab or Kaggle-like IDEs; if you want to use Colab you need to add additional JavaScript code.
@theethatanuraksoontorn13693 жыл бұрын
Hey Nicholas, I am working on similar project. Just wondering when I test the model using your metric it does not reflect the same accuracy as the real-time test. I train the model accuracy to 80-90% but the real-time test barely capture any sign language. Do you have any thought?
@giteshpal405Ай бұрын
did u find any solution?
@fatiha24133 жыл бұрын
Hi, Nicholas! I learned a lot from this video! Thank you very much!
@amessit102 жыл бұрын
hiii FATIMA , can we implement this project for 26 letters as i am getting error " list index out of range" when trying to do more that 3 actions
@estebanpozo87023 жыл бұрын
Hi Nicholas, thanks again for this great tutorial! I am writing this because I would like to learn more about how did you chose your architecture. As you mention, almost all the state-of-the-art papers use a combination of CNN and LSTM. So, I have two questions 1. Would it be possible to get a more detailed explanation on how you build this model? 2. So far, I have reviewed “LSTM: A Search Space Odyssey” by Greff (+ other papers) and the “Neural Network design” handbook by Hagan. Could you recommend me any documents regarding LSTM architectures?
@NicholasRenotte3 жыл бұрын
This is how I normally build stuff: 1. Find a research paper that has implemented a similar model 2. Try building the code for that model 3. Fine tune and iterate (a lot) to get solid performance I wish I had a standard process but it is hyper iterative.
@estebanpozo87023 жыл бұрын
@@NicholasRenotte thanks! :)
@predoca467 ай бұрын
31:06 I'm making a project for my school and it looks like your project, and its function is like yours. But I don't have sufficient knowledge to make this alone, so I'm watching your video to learn a lot and complete it. Thanks for the video and sorry for my English haha. Sending hello to Brazil 🇧🇷 😂
@latestdramas63515 ай бұрын
Hey
@vasuarora_3 ай бұрын
@@latestdramas6351school???
@Gabbosauro3 жыл бұрын
Hi Nicholas, I've been working on my thesis project about the quality of body movements and I encountered a problem with keras. I see that you feed in the first layer a sequence of constant 30 frames (1 second of video/list of mediapipe landmark object). In my case I have a variable number of frames (i.e. a video containing movements that lasts some 2 seconds (60 frames), some 2.5 seconds (75 frames), some 3 seconds (90 frames), etc., hence with different number of frames), how can I solve this? I looked around and people say that I can apply the so called "padding and masking" which takes the largest number of frames (longest video) and then add a special value to the others (padding) and after that somehow ignore/filter the special number later (masking). But this can't be applied to my case because I would like to have the freedom of variable number of frames during prediction. I hope you understand what I want to ask, otherwise let me know and I will try to clarify it as much as I can. Thank you!
@NicholasRenotte3 жыл бұрын
AFAIK it's the only way to do it, unless you look at something like a sequence to sequence model (I think, don't quote me on that though lol). Padding would be the easiest approach. Set a fixed max length and fill out the frames without detections with a numpy array with zeroes!
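A minimal sketch of that padding approach, with an assumed maximum length and the tutorial's 1662-value keypoint vectors; the Masking layer is an optional extra so padded frames are ignored:

```python
import numpy as np
from tensorflow.keras.layers import Masking

MAX_FRAMES = 90        # assumption: the longest clip you expect
NUM_KEYPOINTS = 1662   # pose + face + both hands, as in the tutorial

def pad_sequence(frames):
    """Pad (or truncate) a variable-length list of per-frame keypoint arrays to MAX_FRAMES."""
    frames = np.array(frames)[:MAX_FRAMES]
    padding = np.zeros((MAX_FRAMES - len(frames), NUM_KEYPOINTS))
    return np.concatenate([frames, padding])

# Optionally put this in front of the first LSTM so zero-padded frames are skipped:
# model.add(Masking(mask_value=0.0, input_shape=(MAX_FRAMES, NUM_KEYPOINTS)))
```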
@Gabbosauro3 жыл бұрын
@@NicholasRenotte Thank you for the reply! Will the padding influence much the classification? I mean if video1 with movementA lasts 3 seconds and video2 with movementA lasts 1 second + 2 seconds of zeroes, would that cause problems during prediction or do you think it will work well?
@NicholasRenotte3 жыл бұрын
@@Gabbosauro I would prototype and see the impact first. Kinda hard to say without seeing benchmark results.
@Gabbosauro3 жыл бұрын
@@NicholasRenotte Alright, I'll test it out. Thanks!
@Gabbosauro3 жыл бұрын
@@NicholasRenotte What I did, and it seems to start training, is setting input_shape=(None, number_of_features), so time_steps is set to "None" instead of 30, and during model.fit() I give it a batch generator. (Based on this reference: datascience.stackexchange.com/questions/48796/how-to-feed-lstm-with-different-input-array-sizes ) But sadly the accuracy chart doesn't look good; sometimes it is around 40-50%, sometimes it drops to 20%.
@mehmety50122 жыл бұрын
Great Tutorial Nicholas. Thank you so much !
@pritishmair95773 жыл бұрын
Is there a dataset available for this, which has more signs than these 3. If so it will be really great if someone could share it
@vaibhav6073 жыл бұрын
Please, can you reply on the status of this?
@ahmedkalair98629 ай бұрын
@pritishmair9577 did u find it
@giteshpal405Ай бұрын
did u find any dataset yet?
@gustavojuantorena3 жыл бұрын
Wow! This is great Nick! 👏👏👏
@NicholasRenotte3 жыл бұрын
Thanks a bunch Gustavo!!
@akshith.vbharadwaj22692 жыл бұрын
Greetings! Hey man, this is an awesome tutorial and I completely love the way you have explained the process step by step. I tried it on my own and encountered some problems; it would be a great help if you could help me out with them. I followed the same method you prescribed in the video and these are the problems that came up. Even after getting overall categorical accuracy of 95% and above on the training data, when I do the gesture recognition it is not recognising one gesture. And sometimes it shows the same gesture even though I am showing a different gesture. Sometimes it even detects 2 gestures even though I am not giving any gesture. I always retrain on the same data to get a higher accuracy before going to the gesture recognition part. I have also added a layer to the LSTM model but the results are the same. Would greatly appreciate the help.
@NicholasRenotte2 жыл бұрын
Start with the data, I would add more data of the underperforming classes then retrain. Remember bad data in will lead to bad outputs and vice versa, try adding 20-30 more samples for each underperforming class and give it a go!
@joshgibson5393 жыл бұрын
I really hope you continue this project.
@NicholasRenotte3 жыл бұрын
I don't think I'm ever going to give this one up until I truly nail it. I feel like we're maybe 50ish percent of the way there. Still a TON of work to do.
@joshgibson5393 жыл бұрын
@@NicholasRenotte I know it requires a lot of data and work to do. Also a project like this that helps people is always a great thing to be working on. I'm glad to see you sticking to it. I really wish SignAll would just release their product instead of making it about money. Their database has I have heard over 300,000+ sign language hand symbol videos labelled. I guess businesses and schools can request the software. But I just know they won't let just anyone touch it otherwise. That just really depresses me to know. I have a cousin that I can never understand when he comes over yet he understands me due to his hearing aid implant. It just sucks... and I think the world needs a solution that's not locked away.
@joshgibson5393 жыл бұрын
@@NicholasRenotte Try requesting data from How2Sign's Github 16,000 vocabulary words (srvk /how2-dataset). just be sure to read their licensing terms before requesting it though if you do. Sorry I don't know many good resources I just want to see the project flourish.
@T-She-Go3 жыл бұрын
Update: I managed to get an accuracy of 98% by changing the activation functions of the LSTM and Dense layers. 😌 Hope this helps y'all who might be stuck on this. Hi Nic 😌 me again 😅 So I'm trying to use a new dataset of gestures and I can't seem to get an accuracy >20%. I have tried changing the learning rate, the optimiser, etc., but none of these work 🙈 Is there something I am missing? Thank you in advance 🌸
@NicholasRenotte3 жыл бұрын
How many gestures and how many classes? For really similar classes I'd suggest adding way more data in order to produce a more accurate model. Also, what activations did you change, curious?
@T-She-Go3 жыл бұрын
@@NicholasRenotte I used 5 gestures, 2 were based on hand movements and 3 were based on head movements. I think I should've added more data because the prediction model could not tell the difference between all the head gestures x_x Also, I changed the ReLu activations to Sigmoid
@mahmudanajnin93673 жыл бұрын
thank you so much..using sigmoid function really worked for me!
@T-She-Go3 жыл бұрын
@@mahmudanajnin9367 Yaaay :) I'm glad
@mahmudanajnin93673 жыл бұрын
@@T-She-Go can you tell me how to find out how many labels the confusion matrix is for?? i have 5 classes in my project and yhat = [1, 0, 1, 1, 2, 0, 1, 0, 4, 3]. My confusion matrix gives 5set of arrays..I'm really confused. Is it related to yhat value?
@barithiachudhan30343 жыл бұрын
Hai nicholas it was such a wonderful implementation and thanks for sharing it with us
@NicholasRenotte3 жыл бұрын
So glad you enjoyed it!!
@amitdutta38753 жыл бұрын
you are great.
@NicholasRenotte3 жыл бұрын
Thanks so much @Amit!
@udoysaha30868 ай бұрын
Helped a lot.. Everything explained really well.. Thank you so much!
@girisathvikavpragatiengine309 Жыл бұрын
Hey Nicholas, the tensorflow version of 2.4.1 is showing an error. It says " Could not find a version that satisfies the requirement tensorflow==2.4.1" please help me out
@alissiazaidi263110 ай бұрын
hey, did you find the solution ? Actually, I have the same error...
@pareshgupta32888 ай бұрын
@@alissiazaidi2631 just change the version; if it's Windows use: pip install tensorflow==2.10.0, if Linux: pip install tensorflow==2.16.0
@pavansai28383 ай бұрын
heyy did find the solution for it?
@siva77022 ай бұрын
Downgrade your Python version; TensorFlow 2.4.1 only supports Python 3.6-3.8
@MuhammadKamran-ow5vp Жыл бұрын
It was really a great tutorial on real time sign language detection.
@knd38462 жыл бұрын
hi.. first of all, thanks for your free code for this brilliant work. Second, I am a beginner in using Python, yet I have come quite far in running your code. At step 11 I am facing an error that keeps appearing and I am exhausted right now because I have spent my whole day looking for a proper solution for it. It keeps showing TypeError: only size-1 arrays can be converted to Python scalars after running the plt.imshow line..... plz plz need help...
@xboxgaming43072 жыл бұрын
Facing same issue .. even i follow all of the same steps ... srsly i need help too
@safamunir15102 жыл бұрын
I'm having same issue in the coding ... please help us removing this error
@harryfeng41992 жыл бұрын
did u manage to figure it out?
@knd38462 жыл бұрын
@@harryfeng4199 nope.. I have tried so many different things but it's all in vain.. I am at my last step though..
@sowmyacheguri212 жыл бұрын
Hey! Did u figure it out?
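For the plt.imshow error discussed in this thread the exact cause can vary, but a known-good pattern from the tutorial (assuming `frame` holds the last BGR image read from the webcam) looks like this; if what you pass to plt.imshow is not a (height, width, 3) NumPy array, Matplotlib can raise errors like the one above:

```python
import cv2
from matplotlib import pyplot as plt

print(type(frame), getattr(frame, 'shape', None))   # sanity check: expect a (H, W, 3) uint8 array
plt.imshow(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))  # convert BGR (OpenCV) to RGB (Matplotlib)
plt.show()
```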
@redabenlekehal72713 жыл бұрын
Brilliant as expected
@idkidk17743 жыл бұрын
finally it worked
@idkidk17743 жыл бұрын
Sir how to increase accuracy
@mrmoody915 Жыл бұрын
@@idkidk1774 create a loop that trains the model each time, then checks accuracy; if it is higher than the previous highest accuracy, the model is saved and the new highest accuracy is recorded
@mrmoody915 Жыл бұрын
@@idkidk1774 also just increase data sets
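A minimal sketch of that retrain-and-keep-best loop; `build_model` is a hypothetical helper that rebuilds the tutorial's LSTM network, and the Keras ModelCheckpoint callback is an alternative that does the per-epoch version of this automatically:

```python
best_acc = 0.0
for run in range(5):                                     # retrain a handful of times
    model = build_model()                                # hypothetical: recreate the tutorial's model
    model.fit(X_train, y_train, epochs=200, verbose=0)
    _, acc = model.evaluate(X_test, y_test, verbose=0)   # returns [loss, categorical_accuracy]
    if acc > best_acc:
        best_acc = acc
        model.save('best_action.h5')                     # keep only the best-performing weights
print('best accuracy:', best_acc)
```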
@aqsaqamar1634 Жыл бұрын
@@mrmoody915 can you please solve my error
@mrmoody915 Жыл бұрын
@@aqsaqamar1634 which is
@masterank20058 ай бұрын
Can we use the method shown in this video, with a little alteration, to do static gesture recognition, like single-frame hand gesture recognition? I know there is a video posted about that, but I didn't really like the labelling method shown in that video of manually labelling each image.
@jeanpierrebravomendoza64709 ай бұрын
I'm deaf help
@satyaranjansahoo84318 ай бұрын
Use caption
@entertain77 ай бұрын
Thanks for this amazing tutorial. I have a question: how do we create the reverse, i.e. a text-to-sign-language translator?
@TheDreamsandTears6 ай бұрын
I want to know that too!!
@lesterhsuАй бұрын
Love this video. I can't believe that i just completed it.
@andy_rocky20 күн бұрын
hey can you please help me with this
@lesterhsu20 күн бұрын
@@andy_rocky Sure. What's the problem you're having? The dependencies I use: a conda environment with Python 3.12.5, tensorflow 2.18.0, mediapipe 0.10.14. Following the code in the video and the popular comments, you can get the same result as in the video (the downloaded code may differ slightly from the one presented in the video).
@anirudhxmishra12 күн бұрын
@@lesterhsu I'm having a problem with my dependencies, and I'm unable to run this on Jupyter Notebook. What's your GitHub?
@LucasEloi3 жыл бұрын
Nice work, thank you for the wonderful video!
@NicholasRenotte3 жыл бұрын
Cheers @Lucas!
@ronakdubey581 Жыл бұрын
Thanks a lot for this, man. The code seems to be working fine with a few small changes; I have even added a speech function which speaks out the predictions, and it works pretty well
@VanderlanAlves7 Жыл бұрын
how did you do that? I want to do it too but I am a beginner
@unnathi87968 ай бұрын
@@VanderlanAlves7 did you do it? how to do it?
@TheDreamsandTears6 ай бұрын
How did you do it?
@TheDreamsandTears6 ай бұрын
Can you share?
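One possible way to add the speech feature mentioned above, sketched with pyttsx3; the `sentence` list is the one built in the tutorial's real-time loop, and the `spoken` list is an added assumption so each word is only voiced once:

```python
import pyttsx3

engine = pyttsx3.init()
spoken = []

def speak(text):
    engine.say(text)
    engine.runAndWait()

# Hypothetical hook inside the real-time prediction loop, after `sentence` is updated:
if sentence and (not spoken or sentence[-1] != spoken[-1]):
    speak(sentence[-1])
    spoken.append(sentence[-1])
```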
@matts2581 Жыл бұрын
Excellent instruction! TY very much for sharing! :)
@mcbacker3 жыл бұрын
Nicholas, first I would like to thank you for all the content you produce. I've learned a lot from it! Tell me, why did you choose 64/128/64 units for the LSTM layers? If my data doesn't have the face keypoints do you think less units would yield better accuracy?
@NicholasRenotte3 жыл бұрын
I just threw a small network together, choosing an appropriate number of units is part of the process of building a sufficiently performant model. I'd keep the same number of units and try passing through your data as is!
@mcbacker3 жыл бұрын
@@NicholasRenotte Hi Nicholas, thanks for replying back. I'm new to ML and I'm struggling to find a "formula" or some kind of guidance to help to pick the best parameters for LSTM. Is Trial and error the best approach?
@Sutirtha3 жыл бұрын
@@mcbacker I think you can use hyperparameter tuning to get the best parameters. You can use Ray Tune
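The reply above mentions Ray Tune; the same idea sketched with KerasTuner (another option) looks roughly like this, where the search ranges and the 3-class output are assumptions:

```python
import keras_tuner as kt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

def build_model(hp):
    model = Sequential([
        LSTM(hp.Int('units_1', 32, 128, step=32), return_sequences=True,
             activation='relu', input_shape=(30, 1662)),
        LSTM(hp.Int('units_2', 64, 256, step=64), return_sequences=False, activation='relu'),
        Dense(hp.Int('dense', 32, 128, step=32), activation='relu'),
        Dense(3, activation='softmax'),    # assumption: 3 actions as in the tutorial
    ])
    model.compile(optimizer='Adam', loss='categorical_crossentropy',
                  metrics=['categorical_accuracy'])
    return model

tuner = kt.RandomSearch(build_model, objective='val_categorical_accuracy', max_trials=10)
tuner.search(X_train, y_train, epochs=100, validation_data=(X_test, y_test))
best_model = tuner.get_best_models(1)[0]
```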
@chamangupta46243 жыл бұрын
Very good prjoect , very well implemented ,
@sahanahiremath89452 жыл бұрын
This helped me sooo muchhhh! Thanks.
@muktabhushan70682 жыл бұрын
hey nick, at 4:12 in your video you got an error how you resolved that coz I am getting the same error
@ashurroganathan86323 жыл бұрын
Always Great Videos :). I have learned many Things from you. Thx
@NicholasRenotte3 жыл бұрын
Thanks so much @Ashur! So glad you're enjoying it!
@Sutirtha3 жыл бұрын
Thank you so much for the video. The x, y, z values change based on the position of the person and the camera; how can we transform the keypoints so that, irrespective of the person moving, the relative body coordinates stay consistent?
@NicholasRenotte3 жыл бұрын
I'm not sure I follow, the keypoints will always be different as they're tracking the person (if the person moves, so do they keypoints). Got a use case for me?
@OsazeOgedegbe Жыл бұрын
Hello Nicholas, I really enjoyed this tutorial. I wanted to ask if there was a way to normalize the x, y and z coordinates so they are not dependent on their position in the frame.
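A very rough sketch of one way to do that normalization, assuming the tutorial's keypoint layout (the first 33x4 values are the pose landmarks); using the nose as the origin and shoulder width as the scale is an assumption, not something from the video:

```python
import numpy as np

def normalize_keypoints(keypoints):
    """Express pose keypoints relative to the nose and scale by shoulder width,
    so the values no longer depend on where the person stands in the frame."""
    pose = keypoints[:33 * 4].reshape(33, 4).copy()   # (x, y, z, visibility) per landmark
    nose = pose[0, :3]                                # landmark 0 is the nose
    shoulder_width = np.linalg.norm(pose[11, :3] - pose[12, :3]) + 1e-6  # landmarks 11/12 are the shoulders
    pose[:, :3] = (pose[:, :3] - nose) / shoulder_width
    out = keypoints.copy()
    out[:33 * 4] = pose.flatten()
    return out
```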
@budiutomospd58032 жыл бұрын
Super..super. Hai Nick..its me Budi Conrad Bali
@khaledchikh90 Жыл бұрын
What if we have variable lengths, e.g. each video contains a different number of frames (not all equal to 30; sometimes a video contains 10 frames, another would have 50 frames)?
@meetvardoriya25503 жыл бұрын
Another biggeeeeeee on the heap!,amazing sir❤️🙏
@NicholasRenotte3 жыл бұрын
YESSS! The big videos are quickly becoming my fav to make, lmk what you think @Meet Vardoriya!
@dantealonso71743 жыл бұрын
Thanks a lot for this content, I've been learning a lot, you are a god :)
@NicholasRenotte3 жыл бұрын
Keep on learning my guy! Love that you're smashing them!
@abhishripatil7913 жыл бұрын
Thank you for this this helped me so much with my project esp making the dataset
@alexandregagne41513 жыл бұрын
Very good video. Thank you for your hard work :) New subs
@anusornchaikaew25672 жыл бұрын
Nicholas Renotte! Thank you so much.
@kiddicode68972 жыл бұрын
Wow, Thank you. I like all your video. You are very intelligent.
@joshgibson5393 жыл бұрын
There's a dataset called MS-ASL with 1,000 different sign (word) classes from video maybe you can implement that or some extensive dataset other than letters and numbers like everyone else uses.
@NicholasRenotte3 жыл бұрын
Heya @Josh, will take a look. Sounds promising!
@joshgibson5393 жыл бұрын
@@NicholasRenotte Theres also The American Sign Language Lexicon Video Dataset (ASLLVD) consisting of videos of >3,300 ASL signs in citation form, each produced by 1-6 native ASL signers, for a total of almost 9,800 tokens. This dataset includes multiple synchronized videos showing the signing from different angles as well.
@joshgibson5393 жыл бұрын
@@NicholasRenotte Also there's the ASL-LEX is a database of lexical and phonological properties of American Sign Language signs. It was first released in 2016 with nearly 1,000 signs. ASL-LEX was updated in Fall 2020 with greatly expanded information and an increased size of 2,723 signs. ASL-LEX is available as a searchable web interface and as raw data in spreadsheet form. This website hosts the web visualization and provides instructions for how to use and download the database.
@joshgibson5393 жыл бұрын
@@NicholasRenotte Actually nevermind about those last two they don't seem to be publicly available to me. I can only seem to find searchable video databases. Such as SigningSavvy, ASLSignBank, HandSpeak, and SpreadTheSign. Which are all just known as "ASL Sign Language Video Dictionary" when you Google them. Each word needs searched individually though. If the websites video were scraped and you got the metadata for them it would I guess be quicker. It's nice though that there's multiple videos by using the word from different video databases. Although it's probably still not the best way of doing it. It just sucks that they in general don't make the data publically downloadable as a single file. There's a software called SIgnAll SDK which also seems to be locked away. Way to hold back humanity everyone.
@shrirampareek3 жыл бұрын
Hey, Currently I am also implementing the same, I am using ISL videos which are available publicly and contains around 4000 word gestures used everyday and for the same I tried implementing on a subset of 26 words relating to countries and continents and tweaked some model params and am able to get good accuracy on test data but when I implement the same using webcam I get the same 4 classes, and this has been bothering me, would definitely appreciate if you can provide any help
@adityashinde6265 Жыл бұрын
Wow!! such a helpful video. Thankyou very much
@lincoln1693 жыл бұрын
I love your videos Nicholas 🙂💙
@NicholasRenotte3 жыл бұрын
Thanks a bunch!
@DarkOceanShark3 жыл бұрын
Thank you so much, Nick! Your video is fantastic and I have to say your method of teaching is top-notch mate. I am using you video for my project to interpret all the 26 letter signs in ASL. Could you please do me a favor of telling me how to train the model using an already availble dataset instead of creating it ourselves, like how it's done in the video? Your help will be much appreciated. Even the suggestion of one of your videos where you use a dataset will suffice.
@amessit102 жыл бұрын
hiii can u able to do it for 26 letters ????? can u help me ?
@sazidshaik45773 жыл бұрын
Thanks For Considering My Comments And Did with LSTM Love You and Really Good
@NicholasRenotte3 жыл бұрын
Anytime, it was a long time coming but it's here!!
@AnhLe-hc8qm3 жыл бұрын
most useful video i've seen
@NicholasRenotte3 жыл бұрын
Oh thank you SOOOOO much! So glad you liked it!
@jupjyotkhosla46402 жыл бұрын
Great Video Nicholas! I've been following your videos since the start of Sign Language Detection Series. I'm working on a similar project for my College and my question is: 1. Can I use this Model for detecting moving/dynamic gestures? 2. If it does, how can we make sure that our model detects the necessary frames to interpret the moving sign?
@NicholasRenotte2 жыл бұрын
1. Yep 2. If the sign is longer than what's passed through as the input you can always increase the max sequence length and pad any blank frames.
@jupjyotkhosla46402 жыл бұрын
@@NicholasRenotte Thank you so much!❤
@memsofgamers94793 жыл бұрын
Best lecture 😍 sir will please make a full video for beginners
@yusufcan13046 ай бұрын
thanks dude. What is the Python version?
@ramanandr756210 ай бұрын
Hi can you do a video on how to integrate this model on a flutter project... 🙂