This is so relevant in 2023, timeless explanation!
@sau002 6 years ago
You spent a considerable number of minutes explaining the nature of the problem before presenting the solution - I like that approach.
@amreshgiri4933 6 years ago
You're a genius. I was struggling to understand the word embedding concept through Stanford University videos. Your explanation and pace are much better. Thanks. Keep making such videos.
@myfolder4561 1 year ago
Glad to have come across this while looking for materials on word embeddings, to understand how text prompts work in Stable Diffusion text-to-image models in 2023. You're a great teacher. A lot of videos on this topic across YT are full of jargon without clear explanation. In 2023 this video is still highly relevant to the current state of the technology.
@panchoingham 7 years ago
Dude, seriously, thank you. I am not a CS major but I got absurdly interested in AI during my last years of college. It's inspiring to see your self-learnt enthusiasm and it gives me strength to follow my interest. Keep up the good work and thank you! Cheers from Buenos Aires, Argentina
@wibulord926 2 years ago
cc
@meijiishin5650 1 year ago
Fun fact: This guy went on to work at OpenAI and is one of the creators of DALL-E 2.
@GodofStories 1 year ago
Haha nice. As soon as I saw him speak for 5 seconds and saw the timestamp of 5 years ago, I typed this - "If this guy isn't already a founder of a leading company in this AI wave, I'll be disappointed. But hey, most of the smartest people don't always see success. And fame and money aren't everything." Glad to see I was right haha. I figured there was a high probability, considering this was 5 years ago and assuming nothing else in the Universe interfered with this guy's life trajectory: the way he talks and looks basically shouts young, motivated, and hungry, which should mean he is a big-time guy now.
@LokeshSharma-me5pg 1 year ago
no wonder a man like him can do the job...
@longship44 4 years ago
You are very good at explaining pretty complex concepts. I appreciate the time you took to do this, it was very informative.
@salutoitoi 4 years ago
I recently started learning NLP, and the part of word embeddings was just not clear at all. It makes sense now. Thank you a lot ! You won a subscriber
@sreeramv112 6 years ago
For someone who knows nothing and wants to know everything in ML, this is simply an awesome explanation.
@dearyawen 2 years ago
True genius knows how to explain a complex concept in a really simple and intuitive way. Salute.
@artemkoren9582 6 years ago
I've gone through several embedding explanations until I arrived at this one. Well done, finally all pieces make sense. Thanks!
@AshwinVel 7 years ago
I honestly think this is a really good explanation of word embeddings. It breaks down the nitty-gritty involved in word2vec and co-occurrence. I've read a couple of articles and watched a few videos, but yours is by far the easiest to comprehend. Thank you so much. Cheers from Malaysia!
@ritik84629 2 years ago
Temperature: 5 years ago Feels like 15 years ago
@MSFTSTRIO 5 years ago
Consider changing the tags and title of this video to something more like "Word2Vec uses" or something along those lines, because this is the specific video I searched for but it was quite far down in the search results.
@12sandy345 7 years ago
An exceptional lecture! What I loved about it is its focus on implementation and results, which really helps build good intuition and garners interest in digging deeper into the math details later; most of us get bogged down in those before knowing how powerful and useful the results are. Thank you again.
@lakshmisairamthubati9080 6 years ago
Probably the most clear explanation of word2vec. Thanks for the video.
@jabusch24 6 years ago
This is really well explained. Best word2vec explanation ive seen on youtube so far.
@bean_TM 1 year ago
I haven't seen a better explanation on this. Thank you. This was really good.
@naveenkalhan95 4 years ago
Amazing, man... I was going through tens and tens of online resources to understand what word embeddings are! The way you explained it made me subscribe to your channel right away... very well done. Thank you very much.
@govinda1993 6 years ago
i really appreciated the emphasis you gave on word embeddings rather than on word2vec.
@charlieangkor8649 3 years ago
good lecture, includes important information like what he finds cool, that it’s the best example what can be done etc. that allows the listener to organize information hierarchically. not like some university lectures, where just a monolithic dump of text is flowing out and we don’t know what is important to remember and what not.
@coolshoos 7 years ago
Glad to see a relatively new video from you guys. I'm an old-time fan. And this is exactly what I'm attempting to learn right now.
@blancheporter1289 4 years ago
one word, awesome. Thanks a lot for the video.
@bloody_albatross 1 year ago
8:08 Thank you! This explained the missing piece to me. Multiple other videos on this topic were missing this nice and easy-to-understand diagram.
@joshuafishman9002 7 years ago
I'm glad you made this video. Now I don't have to download the vector for every word on twitter.
@sufyanqadeer2705 7 years ago
Hello, friend. I need the word file that was available on this site. The link is no longer working. Please help me and send me the word vector file. Link: embeddings.macheads101.com/
@sufyanqadeer2705 7 years ago
My Email : sufyan.ali7272@gmail.com
@johnandersontorresmosquera1156 4 years ago
Awesome explanation , after hours of looking for good stuff to understand the word embeddings. Thanks !
@kenchu764 3 years ago
I just started learning ML concepts, and this video helped tremendously with word embeddings. You got another subscriber. I do have a question though. In your example, how did you decide on 64 as the other dimension of your factored matrices? Would a larger number there give you a better word embedding?
@ghazibenyoussef8424 1 year ago
I'm new to AI, but impressed by what is happening now and by the hype around NLP and deep learning. I'm teaching myself all of this. I really like what you have done with downloading tweets and word embeddings. Is it possible to access your source code? Thanks.
@Alkis05 4 years ago
More generally, word2vec is nothing more than graph2vec. Sentences can be seen as random walks in the English-language graph, in which each word is a node and every word is connected to other words. The strength of these connections depends on how frequently the words appear in the same context. Seeing this as a graph allows you to run network analysis and see what other kinds of information you can extract from it. Done right, you might even be able to estimate the connections for words that didn't appear in the training set and update the model to make it better. Or use the word embeddings to try to embed sentences and see how that goes.
@haridotvenkat 6 years ago
Excellent work. I was looking for such an explanation on word embeddings & I am happy that I found this. Thanks.
@poonritchie 6 years ago
Hi Macheads, I have run into so many hopelessly disappointing video presenters and live "trainers" who just talk to themselves. Even worse, they leave you confused about areas you already know, haha (even ones from the top IT corporations). I hope you can speak at our upcoming training session, just to demonstrate what a quality presentation of tech ideas looks like. You have an inborn ability to explain and motivate.
@jessicas2978 4 years ago
Thank you so much for your video! It's the best learning material I can find on youtube.
@partheshsoni1905 5 years ago
I liked the way you explained...crisp and clear!
@Aviator168 5 years ago
Great video. I was having difficulty understanding 'context'. You explained it clearly. Thank you.
@anamqureshi5263 3 years ago
Great Explanation!
@AmeerHamza-jy5ml 3 years ago
Hope you are doing well. I'm interested in doing this for the Urdu language. I hope you can collaborate with me on it. Thanks.
@TestTest-tj4nt 1 year ago
The app is still up, impressive.
@sadeebahsan4804 6 years ago
this is really intuitive. most places I got answers like representing words with vectors which wasn't helpful. now i think i have a proper idea.
@vijeta268 4 years ago
Your explanation was very clear and simple, thanks for making this video.
@Nova-Rift 2 years ago
very well explained imo
@vaibhavvaghela6234 7 years ago
Why have you stopped making videos man. Miss your vids
@dr_ici 6 years ago
Just starting to venture into this world. Thanks for the explanation. I have a medical background, but no background in computer science. So this gives me a little bit of hope about learning something totally new.
@nimeshsingh9271 4 years ago
This is much better explanation than that available on some of the paid courses.
@darsh_shukla 7 years ago
Man you are my teacher from now onwards.
@piyalikarmakar5979 3 years ago
Sir, I have one question: what exactly does the output layer predict? The embedding of the input word, or the context of the input word?
@CarlJohnson-jj9ic 6 years ago
Thank you! What else can i do to learn and engage?
@yeahorightbro 7 years ago
Do you plan on doing a video on ConvNets?
@anandrikka 4 years ago
Nice explanation brother. Can you share your code if possible?
@yeahorightbro 7 years ago
Just checked out the web app and was wondering how you put that together? Django? It is brilliant!
@user-fy5go3rh8p 4 years ago
I don't get why there are so many dislikes; the explanation is great.
@poonritchie 6 years ago
I am just learning about embedding layers and luckily I ran into this video.
@yoniziv 4 years ago
This is gold! thank you (from the future :-))
@garbour456 6 years ago
Awesome video man. Extremely well presented. I'm impressed with your presentation skills. thanks for the video
@ugurkaraaslan9285 4 years ago
Thank you very much. How do we decide the number of features in the embedding matrix? When you deal with colors you have 3 features (R, G, B), but how can I define features for a 10,000-word corpus? Thanks in advance.
@cristianjuarez1086 2 years ago
I wish I could understand word embeddings as well as you do. I'm still a beginner for now, but this is what I want to become. I also share your love for word embeddings, especially because I want to develop an NLP or language model that generates answers, but that's too ambitious at the moment.
@jaradcollier2677 7 years ago
Is this really how word embeddings work? Like, I'm blown away that this complex idea I had of word embeddings is just a word co-occurrence matrix decomposed into a smaller matrix. Is there more to it than that, or is that the gist? More specifically: take text (documents) and transform it into a tfidf matrix of 1-word tokens. Take the dot product of the tfidf matrix with its transpose to get a square matrix (the co-occurrence matrix). Take that and decompose it into, say, 64 components. Is each of those rows a word vector? And is the entire matrix the word embedding at that point?
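As a rough illustration of the first step described in the question above, here is a minimal sketch that builds a word co-occurrence matrix from a few toy documents. It assumes raw counts of words appearing in the same document rather than tf-idf weights, and the `docs` list is made up for the example; the exact weighting used in the video is not stated in this thread.

```python
from collections import Counter
from itertools import combinations

# Toy "tweets" (hypothetical data); each one is treated as a single context.
docs = [
    "i love new york",
    "i love los angeles",
    "new york is big",
]

vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}

# counts[(a, b)] = number of documents in which words a and b both appear.
counts = Counter()
for d in docs:
    words = sorted(set(d.split()))
    for a, b in combinations(words, 2):
        counts[(a, b)] += 1
        counts[(b, a)] += 1

# Dense symmetric co-occurrence matrix as a list of rows, ordered by vocab.
cooc = [[counts[(a, b)] for b in vocab] for a in vocab]
```

Each row of `cooc` describes one word by the company it keeps; factoring this matrix into a much narrower one is what would turn those rows into short word vectors.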
@smritidey2942 5 years ago
Oh Man u are excellent in explaining word2vec, hoping to see some more in text NLP.
@Skandawin78 7 years ago
what hardware do you have to train these models..
@rocky80085 5 years ago
Did you paint the bench blank?
@varunverma744 5 years ago
That was an amazing explanation. Thank you!
@mahdip.4674 6 years ago
Thanks for the video. I have seen GloVe models that contain stop words, which basically means that they at least do not remove stop words. I assume one could remove them and create two different models or sets of vectors. If so, I assume there is not much to say about the precision of the two approaches. Right? The other thing is that with embeddings we do not apply stemming or similar techniques, since the process works largely at the context level. Right?
@yuxiang3147 2 years ago
Can you just decompose a square matrix like this? The square matrix has 10^10 entries but the two decomposed matrices only have in total 1.28*10^7 unknowns, so this means you have 10^10 equations to solve for 1.28*10^7 unknowns, and you won't get a solution to fulfill all entries in the original matrix. How do you deal with this?
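One common way to read the question above: the factorization is not expected to reproduce every entry, only to approximate the matrix as closely as possible. The sketch below shows this with a truncated SVD on a small random matrix, which gives the best rank-k fit in the least-squares sense. The sizes `n` and `k` and the random data are assumptions for illustration, and the video may well fit its factors differently (for example by gradient descent on another loss).

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 4  # toy sizes; the comment's case is n = 100,000 and k = 64

# A symmetric toy "co-occurrence" matrix that is not exactly low rank.
a = rng.random((n, n))
m = a + a.T

# Truncated SVD: keep only the k largest singular values.
u, s, vt = np.linalg.svd(m)
left = u[:, :k] * s[:k]   # n x k factor
right = vt[:k, :]         # k x n factor
approx = left @ right

# Far fewer unknowns than matrix entries, so the fit cannot be exact.
unknowns = left.size + right.size
residual = np.linalg.norm(m - approx)  # Frobenius norm of the leftover error
```

The residual is nonzero, which is the point: the factors are chosen to make the overall error small, not to satisfy all the equations at once.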
@anupkodlekere3633 3 years ago
Can I ask for the source code?
@fiddlepants5947 5 years ago
Humble, concise, brilliant... Subscribed!
@azai.mp4 6 years ago
I'm wondering if something similar to Disentangled Variational Autoencoding could be used to improve a word2vec embedding. I'm not quite sure of the details, but it seems DVA has an effect similar to things like factor analysis and principal component analysis, producing a latent space whose dimensions are more akin to real "separate" dimensions, e.g. producing separate dimensions for the italic-ness and boldness of a written digit, as seen in the paper Disentangled Variational Auto-Encoder for Semi-supervised Learning by Yang Li et al. (I would link it but YouTube has a history of assuming comments with links in them are spam.) If that technique translates well to word vectors, it could for example result in a model where "maleness" is its own dimension, i.e. "man" - "woman" ~= "king" - "queen" ~= (0, 0, 0, 1, 0, 0, ...) (a large vector that is parallel to one of the dimensions). Another interesting venture would be to pre-process the data using an NLP library, so that different forms of the same lemma are already grouped together by default, and so that homographs can be separated. It could also expose information that a skip-gram or bag-of-words model would miss, such as dependency / sentence structure. I really ought to get my hands dirty some time instead of just thinking about this stuff in my head...
@gabriellevaillant5153 4 years ago
If I understood correctly, you have to multiply the first weight matrix (between the input layer and the hidden layer) by every one-hot word vector (composed of 0s and 1s) to obtain the embedding matrix? (With the weight matrix obtained through backpropagation, etc.?) Thanks :)
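A small sketch may help with the question above. Multiplying a one-hot vector by the input weight matrix simply selects one row of that matrix, so the trained matrix already is the embedding table and no extra multiplication pass is needed. The vocabulary and the values in `W` below are made up for illustration.

```python
# Hypothetical 4-word vocabulary and a 4 x 3 input weight matrix W
# (in a real model W would be learned by backpropagation).
vocab = ["cat", "dog", "car", "bus"]
W = [
    [0.1, 0.2, 0.3],
    [0.4, 0.5, 0.6],
    [0.7, 0.8, 0.9],
    [1.0, 1.1, 1.2],
]

def one_hot(i, n):
    """Vector of length n with a 1.0 at position i and 0.0 elsewhere."""
    return [1.0 if j == i else 0.0 for j in range(n)]

def vec_mat(x, mat):
    """Row vector times matrix."""
    return [sum(x[r] * mat[r][c] for r in range(len(mat)))
            for c in range(len(mat[0]))]

i = vocab.index("car")
hidden = vec_mat(one_hot(i, len(vocab)), W)  # hidden-layer activation for "car"
embedding = W[i]                             # the same numbers, by direct lookup
```

In practice libraries skip the multiplication and index the row directly, which is why the first layer is often described as a lookup table.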
@alfital2 3 years ago
Awesome explanation, thanks.
@juleswombat5309 6 years ago
That was pretty awesome. It would be nice to see some code.
@vishnukanthbonthala974 5 years ago
Hi, that was a great transfer of knowledge. I am interested in the Twitter project you have done and feel it would be really great to work with. Could you please share the code with me so I can go through it? Thanks...
@magus696 7 years ago
how do you pick up / find the nearest word from all the words possible efficiently?
@macheads101 7 years ago
It just searches every word in the dictionary. This can be done efficiently using a vector library, since you can compute dot products in batch as a matrix-vector product.
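The brute-force search described in the reply above can be sketched in a few lines with NumPy. The words and vectors here are toy values, and normalizing the rows so that the dot product becomes cosine similarity is an assumption; the reply only says that dot products are computed in batch.

```python
import numpy as np

# Hypothetical toy embeddings; real ones would come from a trained model.
words = ["king", "queen", "apple", "banana"]
vectors = np.array([
    [0.9, 0.8, 0.1],
    [0.8, 0.9, 0.2],
    [0.1, 0.2, 0.9],
    [0.2, 0.1, 0.8],
])

# Normalize rows so that a dot product equals cosine similarity.
unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)

def nearest(word, top=1):
    """Return the `top` words most similar to `word`, excluding itself."""
    i = words.index(word)
    scores = unit @ unit[i]   # one matrix-vector product scores every word
    scores[i] = -np.inf       # never return the query word itself
    best = np.argsort(-scores)[:top]
    return [words[j] for j in best]
```

For example, `nearest("king")` returns `["queen"]` with these toy vectors. The cost grows linearly with vocabulary size, which is usually acceptable for a dictionary of this scale.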
@vanbap 2 years ago
I really appreciate this video sir !
@Nega_Samurai 3 years ago
Thanks for sharing! It helped a lot!
@nishankbani3257 6 years ago
Informative, interesting. Raised my interest in the topic of word embedding
@BlockDesignz 4 years ago
This was brilliant. Keep on creating.
@bilalchandiabaloch8464 5 years ago
But I am confused about the weight matrices: where and how do we get the weights W and W'?
@sniperas96 2 years ago
Still in 2022, a much clearer explanation than my professor's in my master's program.
@liam7935 7 years ago
is it possible to port a mac app to swift? and can you do it?
@rasyaramesh7433 5 years ago
omg you look kinda like hiccup from how to train your dragon xD also thank you so much you literally just saved my life
@jaysaha1967 4 years ago
The website is really cool🔥
@pixel7038 5 years ago
Out of curiosity how did you collect the twitter data? Coding wise
@nssSmooge 6 years ago
So far I have only been able to use a plain tf-idf DTM to compare speeches given at the UN, using text2vec in R. Not sure if I used it correctly though. It has an option for GloVe too, which I want to try out because I am a beginner and started in R. Python is so confusing for me, and don't get me started on the gensim package and docs.
@antoinecompagnie6640 7 years ago
Are word vectors sentences ?
@adammarsonoputra4286 4 years ago
8:20 How does the neural network optimize the weights of its neurons? I mean, there's no label on the right side of the network. The network assigns weights based only on how close the contexts of different words are, doesn't it? So does that mean that learning word embeddings is a kind of unsupervised process? Thank you
@timonix2 3 years ago
The right side is a list of all the other words that were in the same tweet. So it is automatically labeled, since it is part of your training data. It does mean that word embeddings are an unsupervised process. "I was in 'hollywood' today" Left side - "hollywood" Right side - "I","was","in","today".
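The labeling scheme in the reply above can be written out as a short sketch: every word in a tweet becomes an input once, and the remaining words in that tweet become its targets. The function name `training_pairs`, the lowercasing, and the whitespace tokenization are illustrative assumptions, not details confirmed by the video.

```python
def training_pairs(tweet):
    """Pair each word with all the other words from the same tweet."""
    words = tweet.lower().split()
    pairs = []
    for i, center in enumerate(words):
        context = words[:i] + words[i + 1:]  # everything except position i
        pairs.append((center, context))
    return pairs

pairs = training_pairs("I was in hollywood today")
```

No human ever writes a label here; the targets fall out of the text itself, which is why the process needs nothing beyond a large pile of raw tweets.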
@emenikeanigbogu9368 4 years ago
Amazing man. Thank you for your time!
@techynerdz9566 6 years ago
Hey how long did it take to download all that twitter data? Also, did you run it in the cloud or directly on your mac and if you ran it in the cloud, what service provider did u use? Thanks for your videos
@viharipinnenti3798 5 years ago
Considering a sentence like "The price of kilo apples is 200rs", can we obtain (apple, 1kg, 200rs) as the output for that sentence? Whatever sentence I give, would word embeddings give output in (item, quantity, price) format? Kindly help me with this project. Thanks in advance.
@Aviator168 5 years ago
If you designate kilo apples 200rs in the same context and give it enough samples for learning, it will do exactly what you said. You can experiment in google's dialogflow.
@viharipinnenti3798 5 years ago
@@Aviator168 Hey, thanks for replying, man. I need to create a dataset with enough samples, and I will try it in Dialogflow. If I need any help, kindly reply, brother. Thanks in advance.
6 years ago
Is there a realistic approach to use this with data without a natural language character (e.g. a list of products with single words), where you can not extract context information of a "sentence"? Thanks a lot!
@chrisfalter5183 6 years ago
Antonio, you can use this approach with any data where words appear together in an example. @macheads101 applied word2vec to maybe a billion tweets. You could apply it to a very large set of product lists: each product list would be like a tweet (i.e., each product list = one example).
6 years ago
@@chrisfalter5183 Thank you, Chris! In fact the product list is not that large and, as far as I have read, the word2vec approach would lose reliability with my amount of data. Do you know of other word embedding approaches that do not need at least a couple of million words to unleash their power? Thanks a lot!
@chrisfalter5183 6 years ago
@ Naive Bayes (NB) works well for classification tasks with small to moderate amounts of data. Moreover, NB is not computationally intensive. If you could pose your questions as classification tasks, you could use NB.
@NahinAndroid 3 years ago
Beautiful, great work
@mikkaruru 6 years ago
Cool! Do you have this code on github?
@keres993 5 years ago
Brilliant explanation! Thank you!
@barteksielicki7276 6 years ago
Great explanation!
@RP-fe8xo 7 years ago
When are your new videos coming about ml
@WenjunLv 5 years ago
How could embeddings be applied to other fields? Are embeddings applicable to sparse features?
@lenant 6 years ago
Very nice explanation, thanks
@KallMeChris 7 years ago
Hey, this is totally off topic, but I'm new to this channel and I was looking at your "How I'm learning AI and Machine Learning" video. I am interested in all the ideas of machine learning, but I'm having difficulty understanding linear algebra and Gauss's law, so I wanted to know if you could start making tutorials on linear algebra.
@adage3256 6 years ago
Awesome recap !
@vidurwadhwa6897 7 years ago
Great explanation!! Thanks a lot
@deniscandido4116 7 years ago
Hello, did you invest time in learning all the calculus, like doing partial derivatives by hand, or can you abstract it away? I kind of slip when I see mathematical content... but I'm able to build a CNN in TensorFlow without problems. Is this a painful way to go about it?
@rushic24 6 years ago
Hi, can you please share that Twitter data?
@medhj9679 5 years ago
Thanks man ! good explanation
@werthersoriginal 7 years ago
Do you have the source code for this uploaded to GitHub, or perhaps any other repository? Thank you for uploading this video, it came out really well.