It's been 1.5 months since I started learning ML. I was looking for good quality free resources to clear my concepts and boost my skills and found this fantastic channel. Thank you Aman sir for providing concise playlists with excellent explanations, it saves our time and we learn the concepts faster.
@pramodyadav4422 3 years ago
After a long time, I clearly understood TF-IDF. I should have seen this video last year...
@sargamagarwal4544 2 years ago
Best video on YT for this 🔥🔥
@UnfoldDataScience 2 years ago
Thanks Sargam. Please share with friends as well.
@preranatiwary7690 4 years ago
Good video on feature extraction technique
@UnfoldDataScience 4 years ago
Thanks for watching :)
@sameertemkar 4 years ago
TF-IDF explained very nicely
@UnfoldDataScience 4 years ago
Thanks Sam. Happy learning. Take care.
@shreedharchavan7033 3 years ago
Excellent explanation
@UnfoldDataScience 3 years ago
Thanks Shreedhar.
@TJ-wo1xt 3 years ago
One of the best explanations ever. Thanks a lot.
@UnfoldDataScience 3 years ago
Most welcome!
@exuberantyouth8765 1 year ago
Great explanation Aman
@sandipansarkar9211 3 years ago
Great explanation
@UnfoldDataScience 3 years ago
Thanks Sandipan again :)
@sanahnahk7312 2 years ago
Thank you buddy, it was the best explanation I have come across so far.
@UnfoldDataScience 2 years ago
Glad it helped
@quanbui1670 3 years ago
That was a very good lecture. The way you explained hard concepts is very systematic and easy to understand. Thanks Aman
@UnfoldDataScience 3 years ago
Most welcome.
@ehimareokosun384 2 years ago
Excellent explanation, cheers mate.
@raicheldavid3092 1 year ago
Thank you sir. I was looking for a lecture video to understand TF-IDF, and now I have great clarity about it.
@UnfoldDataScience 1 year ago
Most welcome
@cryforwind1309 2 years ago
This video really made understanding easier. Thank you.
@UnfoldDataScience 2 years ago
Glad to hear that!
@shemamilton7326 4 years ago
I have used R to extract the sentiment words from 1000 tweets. Nice explanation. There are so many built-in libraries that do this work without you knowing anything about what happens behind them, but now I know the process behind those library functions. Thank you so much. Next, can you explain the naive Bayes algorithm?
@UnfoldDataScience 4 years ago
Thanks Shema. Sure.
@sathyag2608 2 years ago
Very good explanation
@blankboy-ww7jt 1 year ago
Very good sir, thanks for the lesson
@mjmj4515 4 years ago
Great. Your future, as well as the future of your students, is bright. Good knowledge, good representation.
@UnfoldDataScience 4 years ago
Thanks a lot Meraj. Happy learning. Take care.
@mjmj4515 4 years ago
@UnfoldDataScience I would like a good discussion on LSTM and BERT. Thank you. If you don't mind, I want to add more :-) word2vec, softmax, bigrams. Thanks again
@reachDeepNeuron 4 years ago
😂🤣
@anirbansarkar6306 3 years ago
That was just an awesome educational video. Thanks Aman for sharing such easy-to-understand content.
@UnfoldDataScience 3 years ago
My pleasure Anirban.
@kumarpiyush1643 4 years ago
I have been learning ML and NLP for the last 3 months, and was a bit confused by some concepts like XGBoost and especially NLP. But your tutorial videos have cleared up a lot of my confusion, even on XGBoost. Great tutorial, Aman! The only con I find is your whiteboard size. With a bigger board you could link multiple pieces of information on the same page. But you still have the best grasp of ML and NLP concepts!
@UnfoldDataScience 4 years ago
Glad it was helpful. Thanks Piyush for your feedback. I always love your comments; they motivate me. Will look into it :)
@dorgeswati 3 years ago
Keep it up and bring some more advanced topics. You are very clear on the concepts and make them simple for others.
@UnfoldDataScience 3 years ago
Thank you dorgeswati, I will
@shashankbarai1809 8 months ago
There are numerous channels dedicated to data science, but to me, all others seem like outliers. If someone is learning from the Unfold Data Science channel, it implies that we all excel in data analysis because finding a good tutor among so many options is not an easy task.
@UnfoldDataScience 8 months ago
Thanks Shashank. Means a lot
@RAZZKIRAN 4 years ago
Great
@UnfoldDataScience 4 years ago
Thank you.
@vigneshnagaraj7137 2 years ago
What numerical TF-IDF value should a word have for us to say it is important or unimportant for a particular target variable? This would be helpful.
@akd9977 4 years ago
Very good explanation. Can you upload at least one video every week to cover NLP? I found your video today while searching for NLP content. Good job man!
@UnfoldDataScience 4 years ago
Thanks a lot, please watch my complete NLP playlist here. Many videos are planned for the future as well: kzbin.info/aero/PLmPJQXJiMoUUSqSV7jcqGiiypGmQ_ogtb
@kandukuriprathimasaran4072 2 years ago
Thank you for the video sir
@UnfoldDataScience 2 years ago
You're welcome 🙂
@alexandregavaza3882 4 years ago
This is amazing, Aman! I started watching your videos yesterday and have already learned a lot. Thanks for the very good explanation. Also, I left a question on the Normalization video. Please assist: my text is in Portuguese, but when I apply it, it converts words to English. Is there a parameter to set the language to Portuguese?
@UnfoldDataScience 4 years ago
Answered.
@manideepgupta2433 4 years ago
Excellent explanation! If possible, I would like videos on LSTM, BERT, word2vec, softmax, and bigrams. Thank you.
@UnfoldDataScience 4 years ago
As soon as possible Mittapalii.
@riteshbisht6247 3 years ago
Good explanation
@UnfoldDataScience 3 years ago
Thanks for liking Ritesh
@vivekdixit2781 1 year ago
Hi Aman, you mentioned this method reduces the sparseness of the dataset. How does it do that?
@sandipansarkar9211 2 years ago
Finished watching
@krispaul7752 4 years ago
Great video.
@UnfoldDataScience 4 years ago
Glad you liked it Kris. Happy learning!
@edrisayesmaeil4112 4 years ago
Very nice explanation, thanks
@UnfoldDataScience 4 years ago
You are welcome Edrisay!
@thamizhansudip6644 4 years ago
Much, much better than Krish Naik. Outnumbered Krish
@UnfoldDataScience 4 years ago
Thanks Sudip :)
@arpanduttachowdhury5752 4 years ago
Great explanation!
@UnfoldDataScience 4 years ago
Glad it was helpful Arpan :)
@sumanmanandhar4124 2 years ago
While creating decision trees for text classification, since words are the features, how do we use this feature?
@anilmudgal4405 4 years ago
Good job dude. Nice explanation
@UnfoldDataScience 4 years ago
Thanks Anil. Happy learning. Keep watching and take care!
@nareshjadhav4962 3 years ago
Very excellent explanation Aman! Please tell me the approach for a question asked in an interview: if we have 10 lakh restaurant reviews, how many input neurons should the neural network have? How many hidden layers should there be? (Assuming we are using a neural network for classification with TF-IDF.)
@UnfoldDataScience 3 years ago
Hi Naresh, "how many input neurons": equal to the number of features. "How many hidden layers": again, it depends on how many features we have and a few other things. One thing to understand here is that it will not depend on how many reviews you have. It will depend on how many features you have after converting text to numbers, whichever way you do it: TF-IDF, bag of words, word2vec, etc.
@nareshjadhav4962 3 years ago
@UnfoldDataScience thanks a lot, God bless you always
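A minimal sketch of the point in the reply above, assuming a plain whitespace tokenizer and one input neuron per vocabulary word. The reviews and the helper name `build_vocabulary` are invented for illustration; they do not come from the video. It shows that adding more reviews does not change the input size unless they bring new words.

```python
def build_vocabulary(documents):
    """Return the sorted list of unique lowercase tokens across all documents."""
    vocabulary = set()
    for text in documents:
        vocabulary.update(text.lower().split())
    return sorted(vocabulary)


# Hypothetical reviews, invented for illustration only.
few_reviews = ["good food", "bad service", "good service"]
# Repeating the same reviews adds documents but no new words.
many_reviews = few_reviews * 1000

# One input neuron per feature, i.e. per vocabulary word (an assumption here).
n_inputs_few = len(build_vocabulary(few_reviews))
n_inputs_many = len(build_vocabulary(many_reviews))
print(n_inputs_few, n_inputs_many)
```

In a real corpus, more reviews usually do bring more distinct words, so the vocabulary is often capped or pruned. The input size still follows the vocabulary, not the review count.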
@tedmosbey6548 2 years ago
Hi, how do I count words based on their stem? I mean, treat "student" and "students" as the same word and return one count.
@md.shafaatjamilrokon8587 2 years ago
Thanks
@EngRiadAlmadani 4 years ago
Good job
@UnfoldDataScience 4 years ago
Thank you.
@cicisuhaeni3523 3 years ago
Thanks for the great explanation. I have a question: what does the term "feature" mean for text data? Is it a word?
@UnfoldDataScience 3 years ago
Yes, after converting "text to number".
@cicisuhaeni3523 3 years ago
@UnfoldDataScience so, a feature is the numerical value of an attribute/variable?
@dhruvsingla2212 2 years ago
If we divide each word's count by the total number of words in a document, it only scales the count of each word to between 0 and 1. How would it suppress the effect of a word that occurs many times in that same document? Like you said, normalising would reduce the effect of the word _cricket_ in the document _cricket_. But other words are also scaled down, right? So _cricket's_ effect on other words is not nullified, I guess?
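A small sketch related to the question above, assuming term frequency is defined as a word's count divided by the total number of tokens in the document. The documents and the function name `term_frequency` are made up for illustration. It suggests that this normalization mainly removes the effect of document length, while the ranking of words inside one document stays the same.

```python
from collections import Counter


def term_frequency(tokens):
    """Map each word to count / total number of tokens in the document."""
    counts = Counter(tokens)
    total = len(tokens)
    return {word: count / total for word, count in counts.items()}


# Hypothetical documents: the long one is the short one repeated 10 times.
short_doc = ["cricket", "is", "fun"]
long_doc = short_doc * 10

raw_short = Counter(short_doc)["cricket"]
raw_long = Counter(long_doc)["cricket"]
tf_short = term_frequency(short_doc)["cricket"]
tf_long = term_frequency(long_doc)["cricket"]

# Raw counts differ by a factor of 10, normalized values are the same.
print(raw_short, raw_long)
print(tf_short, tf_long)
```

Under this definition, down-weighting a word relative to other words comes from the IDF part, not from the division by document length.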
@barax9462 3 years ago
By unique words in the corpus, do you mean a set (no repetition, each word counted only once)? Does that include n-grams too?
@UnfoldDataScience 3 years ago
Yes like "Set"
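A short sketch of the "set" idea, with the n-gram part of the question treated as a configuration choice rather than something the video states. The corpus and the helper `ngrams` are hypothetical. Unigrams and bigrams are collected into separate sets, and whether bigrams join the vocabulary depends on how you set up the tokenizer.

```python
def ngrams(tokens, n):
    """Return all contiguous n-token sequences, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical corpus, invented for illustration only.
corpus = ["india won the match", "india lost the match"]

unigram_vocab = set()
bigram_vocab = set()
for text in corpus:
    tokens = text.split()
    unigram_vocab.update(tokens)            # each word is kept once
    bigram_vocab.update(ngrams(tokens, 2))  # each word pair is kept once

print(sorted(unigram_vocab))
print(sorted(bigram_vocab))
```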
@ShubhamPatil-ot7hf 3 years ago
How can we extract exact data from text? For example, I have text from which I need to extract the invoice number that comes after the keyword "Invoice#". What approach should I follow for this type of extraction?
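One possible approach to the question above is a regular expression rather than TF-IDF. This is a sketch under assumptions: the invoice number is taken to be a run of letters, digits, and hyphens right after "Invoice#", and the sample text is invented. A different number format would need a different pattern.

```python
import re

# Assumed format: "Invoice#", optional spaces, then letters/digits/hyphens.
INVOICE_PATTERN = re.compile(r"Invoice#\s*([A-Za-z0-9-]+)")


def extract_invoice_numbers(text):
    """Return every invoice number that follows the keyword 'Invoice#'."""
    return INVOICE_PATTERN.findall(text)


# Hypothetical sample text, invented for illustration only.
sample = "Please pay Invoice# INV-1042 by Friday. Invoice#7731 is already settled."
invoice_numbers = extract_invoice_numbers(sample)
print(invoice_numbers)
```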
@soumikdey1456 1 year ago
Wow
@ajitchavan5479 2 years ago
Sir, can you please upload videos on how to extract text from multiple PDFs?
@UnfoldDataScience 2 years ago
Run a PDF text extractor in a loop over the files.
@sudhanshusoni1524 3 years ago
Sir, a few questions: 1. How do I implement one-hot encoding instead of BoW? 2. If I apply BoW and TF-IDF to the same corpus, since both use the unique words, will the number of columns in the vector be the same for both methods?
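A rough sketch touching both questions, assuming "one-hot" here means a binary present/absent vector per document and that every method shares the same tokenizer and vocabulary. The corpus and function names are hypothetical. Under those assumptions the vector length equals the vocabulary size, so count vectors, binary vectors, and TF-IDF vectors would all have the same number of columns.

```python
from collections import Counter

# Hypothetical corpus, invented for illustration only.
corpus = ["good good food", "bad food"]
vocabulary = sorted({word for text in corpus for word in text.split()})


def count_vector(text):
    """Bag of words: how many times each vocabulary word occurs."""
    counts = Counter(text.split())
    return [counts[word] for word in vocabulary]


def binary_vector(text):
    """Binary variant: 1 if the vocabulary word occurs at all, else 0."""
    present = set(text.split())
    return [1 if word in present else 0 for word in vocabulary]


bow = count_vector(corpus[0])
binary = binary_vector(corpus[0])
print(vocabulary)
print(bow, binary)
```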
@reachDeepNeuron 4 years ago
My understanding is that TF-IDF is the best option for feature extraction from text. Do you recommend removing stop words using TF-IDF, or removing them first (by storing the stop words in a variable and excluding any text that matches) and then applying TF-IDF? I believe TF-IDF is a feature extraction and selection technique, am I correct? Why is the log required? Apart from TF-IDF, is there any superior technique available for feature selection in text data? Also, can TF-IDF lead to information loss?
@UnfoldDataScience 4 years ago
TF-IDF creates term frequency-inverse document frequency numbers from text data. You can call it feature engineering or input data for model training. This is what TF-IDF does. Data cleaning is a separate topic altogether, which may include stop word removal, punctuation removal, etc.
@reachDeepNeuron 4 years ago
@UnfoldDataScience thanks, but how do we perform feature selection in text data?
@talhajalil8674 1 year ago
You said that by using TF-IDF we increased the value of CRICKET. Is that a good thing for the model?
@vineethgogu2309 3 years ago
Hello sir, do you have a video that applies TF-IDF to a dataset and solves a text classification problem using this approach?
@UnfoldDataScience 3 years ago
Sure Vineeth.
@deenasiva2829 4 years ago
How to extract the important words using word2vec?
@UnfoldDataScience 4 years ago
Hi Deena, for extracting important words you should use a frequency-based model like TF-IDF or word counts.
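A minimal sketch of the suggestion above, assuming the textbook definitions TF = count / document length and IDF = natural log of (number of documents / number of documents containing the word). The documents and the function name `tf_idf` are invented for illustration. The word with the highest score in each document is taken as its "important" word.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {word: tf-idf score} dict per document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each word appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            word: (count / total) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return scores


# Hypothetical documents, invented for illustration only.
docs = [
    "cricket is a popular game",
    "football is a popular game",
    "tennis is a popular game",
]
scores = tf_idf(docs)
top_words = [max(doc_scores, key=doc_scores.get) for doc_scores in scores]
print(top_words)
```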
@jitenderbishnoi2016 3 years ago
How can I make a "fake news detection" model using this, one that takes input from the user to check a news item? Please make a video on this, as it would be a practical implementation of the discussed topic.
@UnfoldDataScience 3 years ago
Yes Jitendra, there are advanced frameworks for this kind of task. I will discuss those as well. Thanks for suggesting.
@BhaumikKhamar 2 years ago
If the word w occurs in all 3 documents, then IDF will be log(3/3) = 0, which makes the TF-IDF value 0 irrespective of the TF for w. Is TF-IDF usable in such a case? I think it will neglect some useful words this way.
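The arithmetic in the comment above can be checked directly. The second function is a smoothed variant, ln((1 + N) / (1 + df)) + 1, which to my knowledge matches the default in scikit-learn's `TfidfVectorizer`. Treat that attribution as an assumption and check the library documentation. With smoothing, a word that appears in every document keeps a small non-zero weight.

```python
import math


def idf_plain(n_docs, doc_freq):
    """Textbook IDF: log of (documents / documents containing the word)."""
    return math.log(n_docs / doc_freq)


def idf_smoothed(n_docs, doc_freq):
    """Smoothed IDF: add one to both counts, then add 1 to the log."""
    return math.log((1 + n_docs) / (1 + doc_freq)) + 1


print(idf_plain(3, 3))     # word in all 3 documents -> 0.0
print(idf_smoothed(3, 3))  # same word with smoothing -> 1.0
print(idf_plain(3, 1) > idf_plain(3, 3))  # rarer word still scores higher
```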
@chandinisaikumar2736 4 years ago
Very good explanation sir. I want to predict a continuous value from a column consisting of names. Can you please let me know the right way to do that? Thanks in advance.
@UnfoldDataScience 4 years ago
If your independent column is categorical, you can either create dummy variables or use a bag-of-words/TF-IDF model.
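A small sketch of the dummy-variable option mentioned above, in plain Python so that no library API is assumed. The column values and the helper `one_hot_encode` are hypothetical. Each distinct name becomes one 0/1 column.

```python
def one_hot_encode(values):
    """Return (sorted categories, one 0/1 row per input value)."""
    categories = sorted(set(values))
    rows = [
        [1 if value == category else 0 for category in categories]
        for value in values
    ]
    return categories, rows


# Hypothetical name column, invented for illustration only.
names = ["pizza", "burger", "pizza", "salad"]
categories, rows = one_hot_encode(names)
print(categories)
for row in rows:
    print(row)
```

With many distinct names this produces many columns, which is one reason a bag-of-words or TF-IDF representation may be preferred for free-text names.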
@ahmedhamzajandoubi825 3 years ago
Excellent explanation, thanks! Can you help me choose between TF-IDF and word2vec for a fake news detection project?
@UnfoldDataScience 3 years ago
Thanks Ahmed. Sure.
@sayarulhassan5868 3 years ago
If we use machine learning techniques for text classification on a dataset with two attributes and have already applied feature extraction, is it compulsory to apply feature selection as well? And what are the feature selection techniques for this type of dataset?
@UnfoldDataScience 3 years ago
Not always necessary if you have a limited number of features.
@onyemelukwechigozie1434 2 years ago
I need additional training on this. I have sent an email to your inbox. Thanks