Just wanna say that your explanations are awesome. Really helped me understand NLP better than reading a book.
@NormalizedNerd 3 years ago
Thanks!! :D
@javitxuaxtrimaxion4526 3 years ago
Awesome video!! I've just arrived here after reading the GloVe paper and your explanation is utterly perfect. I'll surely come back to your channel whenever I have doubts about Machine Learning or NLP. Good job!
@alh7839 2 years ago
Man, your video is great! Best explanation on the whole internet!
@addoul99 2 years ago
Fantastic summary of the paper. I just read it and I am pleasantly surprised at how much of the paper's math you covered in detail in this video! Great!
@sachinsarathe1143 3 years ago
Very nicely explained, buddy... I was going through many articles but was not able to understand the math behind it. Your video certainly helped. Keep up the good work.
@NormalizedNerd 3 years ago
Happy to help man!
@revolutionarydefeatism 3 years ago
Perfect! Thanks, there are not many useful videos like this on YouTube.
@sasna8800 3 years ago
This is the best explanation I have seen for GloVe, thank you a million times.
@NormalizedNerd 3 years ago
❤️❤️
@riskygamiing 3 years ago
I was reading the paper and somewhat struggling to understand what certain parts of the derivation were or why we needed them, but this video is great. Thanks so much!
@bhargavaiitb 4 years ago
Thanks for the explanation. Feels like you explained better than the paper itself.
@NormalizedNerd 4 years ago
Thanks a lot man!!
@kindaeasy9797 4 months ago
10:48 No, we don't have a vector on one side of the equation; we have scalar values on both sides. Basic math.
@popamaji 1 year ago
This is excellent, but I wish you had also covered the training steps: what exactly are the input and output tensors, and what shapes do they have?
@vitalymegabyte 3 years ago
Guy, thank you very much, it was a fucking masterpiece that made my 22 minutes at the railway station really profitable :)
@kavinvignesh2832 4 months ago
Based on what algorithm or model is the GloVe model trained with this cost function? Linear regression?
@parukadli 1 year ago
Is the embedding for a word fixed in GloVe, or is it generated every time depending on the dataset used for training the model?
@dodoblasters 3 years ago
5:50 2+1+1=3?
@karimdandachi9200 2 years ago
He meant 4.
@parukadli 4 years ago
Nice explanation... which is better, GloVe or word2vec?
@NormalizedNerd 4 years ago
That depends on the dataset. I recommend trying both.
@arunimachakraborty1175 3 years ago
Very good explanation. Thank you :)
@NormalizedNerd 3 years ago
Thanks a lot!
@ijeffking 4 years ago
Very well explained. Keep it up! Thank you.
@NormalizedNerd 4 years ago
Thank you! More videos are coming :)
@ijeffking 4 years ago
@@NormalizedNerd Looking forward to it...
@sarsoura716 3 years ago
Good video, thanks for your efforts. I wish it had spent less time on the cost function of the GloVe model and more on testing word similarity with the trained GloVe model.
@NormalizedNerd 3 years ago
You can copy the code and test it more ;)
@sujeevan9047 3 years ago
Can you do a video on the BERT word embedding model? It is also important.
@edwardrouth 4 years ago
Nice work! Just subscribed (y) :) Just a quick question out of curiosity: are "GloVe" and "Poincaré GloVe" the same model? All the best for your channel.
@NormalizedNerd 4 years ago
Thank you, man! No, they are different. Poincaré GloVe is a more advanced approach. In normal GloVe, the words are embedded in Euclidean space, but in Poincaré GloVe, the words are embedded in hyperbolic space! That said, the latter builds on the basic concepts of the original GloVe.
@edwardrouth 4 years ago
@@NormalizedNerd It's totally worth subscribing to your channel. Looking forward to new videos from you on DS. Btw, I am also from West Bengal, currently in Germany ;)
@NormalizedNerd 4 years ago
@@edwardrouth Oh great! Nice to meet you. More interesting videos are coming ❤️
@khadidjatahri7428 3 years ago
Thanks for this well-explained video. I have one question: can you please explain why you take only the numerator portion F(w_i · w_k) and ignore the denominator?
@revolutionarydefeatism 3 years ago
You can take the denominator instead! We need just one of them.
@bhrzali 3 years ago
Wonderful explanation! Just a question: why do we calculate the ratio p(k|ice)/p(k|steam)?
@NormalizedNerd 3 years ago
The ratio is better at distinguishing relevant words from irrelevant words than the raw probabilities, and it also discriminates between two relevant words. If we didn't take the ratio and worked with raw probabilities instead, the numbers would be too small.
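Here's a quick numerical illustration (a minimal Python sketch using the probabilities from Table 1 of the GloVe paper; the printed ratios differ slightly from the paper's ratio row because the table values are rounded):
p_ice   = {"solid": 1.9e-4, "gas": 6.6e-5, "water": 3.0e-3, "fashion": 1.7e-5}  # P(k|ice)
p_steam = {"solid": 2.2e-5, "gas": 7.8e-4, "water": 2.2e-3, "fashion": 1.8e-5}  # P(k|steam)
for k in p_ice:
    print(f"P({k}|ice) / P({k}|steam) = {p_ice[k] / p_steam[k]:.2f}")
# solid   -> ~8.64 (large: related to ice, not steam)
# gas     -> ~0.08 (small: related to steam, not ice)
# water   -> ~1.36 (near 1: related to both)
# fashion -> ~0.94 (near 1: related to neither)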
@83vbond 4 years ago
Good explanation. It got too technical for me after the middle, but then the code and the graph clarified things. Just one thing: you keep calling the pipe | symbol 'slash' ("j slash i", "k slash ice", etc.), which isn't accurate (I think you would know this if you have studied all this). It's better to use 'given' ("j given i"), as it's actually read, or just say 'pipe' after explaining the first time that this is what the symbol is called. 'Slash' is used to mean division, and also 'one or the other', neither of which is applicable here, and the symbol isn't a slash anyway. This can cause confusion for some viewers.
@NormalizedNerd 4 years ago
Yes, pipe would be a better choice.
@jibbygonewrong2458 3 years ago
It's Bayes. Anyone exposed to stats understands w/o the verbiage.
@TNTsundar 3 years ago
You should read that as “probability of i GIVEN j”. The pipe symbol is read as ‘given’.
@maximuskumar502 4 years ago
Nice explanation 👍 One quick question about your video: which software and hardware are you using for the digital board?
@NormalizedNerd 4 years ago
Thank you. I use Microsoft OneNote and a basic pen tablet. Keep supporting!
@longhoang5137 3 years ago
I laughed when you said 2+1+1=3 xD
@NormalizedNerd 3 years ago
LOL XD
@alh7839 2 years ago
I was looking for this comment ^^
@gosleeeeep 2 years ago
same here lol
@rumaisalateef784 4 years ago
Beautifully explained, thank you!
@NormalizedNerd 4 years ago
Happy to hear. Keep supporting :D
@trieunguyenhai49 4 years ago
Thank you so much, but isn't X_{love} equal to 4, not 3?
@NormalizedNerd 4 years ago
@TRIỀU NGUYỄN HẢI Thanks for pointing this out. Yes, X_{love} = 4.
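If you want to verify the count yourself, here's a tiny Python sketch. The sentence is the example corpus from the video, but the context window of 1 is my assumption about the setup, so adjust it if needed:
from collections import Counter

corpus = "i love nlp and i love to make videos".split()
window = 1  # assumed context window; not confirmed from the video
X = Counter()
for i, w in enumerate(corpus):
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if j != i:
            X[(w, corpus[j])] += 1

love_counts = {pair: c for pair, c in X.items() if pair[0] == "love"}
print(love_counts)                # {('love', 'i'): 2, ('love', 'nlp'): 1, ('love', 'to'): 1}
print(sum(love_counts.values()))  # 2 + 1 + 1 = 4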
@ToniSkit 7 months ago
This was great
@momona4170 3 years ago
I still don't quite understand the part where ln(X_i) was absorbed by the biases; please enlighten me.
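Edit: after re-reading the paper, I think the step is this: with F = exp, F(w_i · w_k) = P_ik = X_ik / X_i gives w_i · w_k = ln(X_ik) - ln(X_i). Since ln(X_i) does not depend on k, it can be folded into a per-word bias b_i; a second bias b_k is then added just to keep the equation symmetric in i and k, giving w_i · w_k + b_i + b_k = ln(X_ik).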
@Sarah-ik8tt 3 years ago
Hello, thank you for your explanation. Can you please share the Google Colab link asap?
@SAINIVEDH 4 years ago
@ 19:13: that is a weighting function because log(X_ij) may become zero and the equation goes crazy. More details at towardsdatascience.com/light-on-math-ml-intuitive-guide-to-understanding-glove-embeddings-b13b4f19c010
@NormalizedNerd 4 years ago
The article says f(X_ij) prevents log(X_ij) from being NaN, which is not true. f(X_ij) actually puts an upper limit on the weight given to high co-occurrence frequencies.
@robinshort6430 2 years ago
Often X_ij is zero, and in these cases ln(X_ij) goes to minus infinity. How do you treat this issue?
@NormalizedNerd 2 years ago
Good point. Here's how they tackled the problem. They defined the weighting function f like this:
f(X_ij) = (X_ij / X_max)^alpha   if X_ij < X_max
f(X_ij) = 1                      otherwise
So you see, when X_ij = 0, f(X_ij) is 0, which means the whole cost term becomes 0; we don't even need to compute ln(X_ij) in this case. The function f addresses two problems: 1) not giving too much importance to word pairs that co-occur very frequently, and 2) avoiding ln(0). I hope this makes sense. Please tell me if anything is not clear.
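In code, that term-skipping looks something like this (a minimal Python sketch, not the exact implementation from the video; X_max = 100 and alpha = 0.75 are the defaults from the paper):
import math

def weight(x, x_max=100.0, alpha=0.75):
    # GloVe weighting function: 0 at x == 0, capped at 1 for x >= x_max
    return (x / x_max) ** alpha if x < x_max else 1.0

def cost_term(x_ij, dot, b_i, b_j):
    w = weight(x_ij)
    if w == 0.0:
        return 0.0  # X_ij == 0: the term vanishes, so ln(0) is never evaluated
    return w * (dot + b_i + b_j - math.log(x_ij)) ** 2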
@robinshort6430 2 years ago
@Normalized Nerd This is true only assuming that zero times infinity is zero! Just kidding, I just want to point out that in code zero times infinity gives (rightly) an error in numpy, so I have to write this as an if condition. Everything else is clear, thank you very much for your great work and for your answer!
@robinshort6430 2 years ago
@@NormalizedNerd Is X_max a hyperparameter?
@WahranRai 2 years ago
Your examples are not related: "I love NLP..." and P(k|ice), etc. It would be more useful to use the same sentences throughout...
@md.tarekhasan2206 3 years ago
Can you please make videos on ELMo, fastText, and BERT also? It would be helpful.
@NormalizedNerd 3 years ago
I'll try in the future :)
@psic-protosysintegratedcyb2422 4 years ago
Good introduction!
@NormalizedNerd 4 years ago
Glad it was helpful!
@fezkhanna6900 4 years ago
Fantastic video
@NormalizedNerd 4 years ago
Thanks!
@CodeAshing 4 years ago
Bruh you explained well
@NormalizedNerd 4 years ago
Thanks man!!
@kindaeasy9797 4 months ago
Well, I think by "corpus" you mean "document". But let me tell you, a corpus has repeated words as well; to form a corpus you just join all the documents.
@Nextalia 3 years ago
I fail to see where the vectors come from... :-( I follow the whole explanation without any problem, but once you define J, where do the vectors come from? Is there any neural network involved? I have the same problem when reading the article or any other explanations: they all try to explain where that J function comes from, and then, magically, we have vectors we can compare to each other :-( Any help on that would be greatly appreciated. Thanks!
@NormalizedNerd 3 years ago
The authors introduced the word vectors very subtly. Here's the deal: at 9:50, we assume that there exists a function F which takes the word vectors and produces a scalar quantity! And no, we don't have neural networks here. Everything is based on the co-occurrence matrix.
@Nextalia 3 years ago
@@NormalizedNerd Thanks for your answer. I found a publication that explains very well what to do after "discovering" that function: thesis.eur.nl/pub/47697/Verstegen.pdf I was somehow sure that GloVe was based on neural networks (as word2vec is), but that is not the case. However, it is a bit like a neural network, since the vectors are created the same way the weights of a NN are trained: stochastic gradient descent.
@SwapravaNath 2 years ago
The vectors are actually the parameters that one is optimizing over. The objective function J should really have been written with its arguments being the vector representations of the words, which are the optimization variables. For certain choices of the function F, e.g., softmax, the optimization becomes mathematically easy. Then it is just a multivariable optimization problem, and a natural algorithm to solve it is gradient descent (and variants). Ref: kzbin.info/www/bejne/e4PMk6qnqJ6jaZo [Stanford course on NLP]
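To make the "optimizing over the vectors" point concrete, here is a bare-bones Python/NumPy sketch. It is a toy SGD version (the paper itself uses AdaGrad), and the counts, dimensions, and learning rate below are made up for illustration:
import numpy as np

# Toy setup (made-up numbers; in practice X comes from a real corpus)
V, d = 5, 10                               # vocabulary size, embedding dimension
rng = np.random.default_rng(0)
X = rng.integers(0, 20, size=(V, V)).astype(float)   # co-occurrence counts

W  = 0.1 * rng.standard_normal((V, d))     # word vectors (optimization variables)
Wc = 0.1 * rng.standard_normal((V, d))     # context vectors
b, bc = np.zeros(V), np.zeros(V)           # word and context biases

def f(x, x_max=100.0, alpha=0.75):
    return (x / x_max) ** alpha if x < x_max else 1.0

lr = 0.05
for epoch in range(50):
    for i in range(V):
        for j in range(V):
            if X[i, j] == 0:
                continue                   # zero counts contribute nothing to J
            diff = W[i] @ Wc[j] + b[i] + bc[j] - np.log(X[i, j])
            g = 2 * f(X[i, j]) * diff      # common gradient factor
            W[i], Wc[j] = W[i] - lr * g * Wc[j], Wc[j] - lr * g * W[i]
            b[i]  -= lr * g
            bc[j] -= lr * g
# After training, W (or W + Wc, as the paper suggests) gives the embeddings.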
@ccuuttww 1 year ago
p(Love, I) = 2/3?
@TheR4Z0R996 4 years ago
Great explanation, thanks a lot my friend :)
@NormalizedNerd 4 years ago
Glad that it helped :D...keep supporting!
@eljangoolak 2 years ago
"Quackuarance metrics"? I don't understand what that is.
@bikideka7880 4 years ago
Good explanation, but please use a bigger cursor; a lot of YouTubers miss this.
@NormalizedNerd 4 years ago
thanks for the suggestion :D
@kekecoo5681 4 years ago
Where did e come from?
@NormalizedNerd 4 years ago
e^x satisfies our condition: e^(a-b) = e^a / e^b
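To spell it out: we need F((w_i - w_j) · w_k) = F(w_i · w_k) / F(w_j · w_k), i.e., F must turn a difference in its argument into a ratio of its outputs. F(x) = e^x does exactly that: e^((w_i - w_j) · w_k) = e^(w_i · w_k - w_j · w_k) = e^(w_i · w_k) / e^(w_j · w_k).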
@u_luana.j 3 years ago
5:50 ..?
@sakibahmed2373 4 years ago
Hello there, first of all thank you for adding such informative videos to help beginners in the DS field. I am trying to reproduce the code from GitHub for the Stanford GloVe model, link ---> github.com/stanfordnlp/GloVe The problem is, if I execute all the statements as mentioned in the "Readme", I get the respective files it should provide me, "cooccur.bin" & "vocab.txt". The latter does have the list of words with frequencies, but the former is empty, and no error is reported in the console either. For me it's very weird and I don't understand what I am doing wrong. Could you please help me with this? N.B.: I am new to ML and still learning! Best regards.
@NormalizedNerd 4 years ago
"cooccurrence.bin" should contain the co-occurrence records. Make sure that the training actually started. You should see logs like:
vector size: 50
vocab size: 71290
x_max: 10.000000
alpha: 0.750000
05/08/20 - 06:02.16AM, iter: 001, cost: 0.071222
05/08/20 - 06:02.45AM, iter: 002, cost: 0.052683
05/08/20 - 06:03.14AM, iter: 003, cost: 0.046717
...
I'd suggest you try this on Google Colab once.
@sakibahmed2373 4 years ago
@@NormalizedNerd Hi, thank you for your response. I never tried Colab before. But what I noticed in Colab is that I have to upload notebook files, which I can't see in the GloVe project that I cloned. However, I am using an online editor, "repl.it". First I ran the "make" command, which created the "build" folder, and subsequently "./demo.sh". Running this script creates a "cooccurrence.bin" file, but as I mentioned earlier it's empty. Did I miss something here? I am sure I'm missing something very small and important 😒 Below are the logs from the terminal:
make
mkdir -p build
gcc -c src/vocab_count.c -o build/vocab_count.o -lm -pthread -O3 -march=native -funroll-loops -Wall -Wextra -Wpedantic
gcc -c src/cooccur.c -o build/cooccur.o -lm -pthread -O3 -march=native -funroll-loops -Wall -Wextra -Wpedantic
src/cooccur.c: In function ‘merge_files’:
src/cooccur.c:180:9: warning: ignoring return value of ‘fread’, declared with attribute warn_unused_result [-Wunused-result]
  fread(&new, sizeof(CREC), 1, fid[i]);
src/cooccur.c:190:5: warning: ignoring return value of ‘fread’, declared with attribute warn_unused_result [-Wunused-result]
  fread(&new, sizeof(CREC), 1, fid[i]);
src/cooccur.c:203:9: warning: ignoring return value of ‘fread’, declared with attribute warn_unused_result [-Wunused-result]
  fread(&new, sizeof(CREC), 1, fid[i]);
gcc -c src/shuffle.c -o build/shuffle.o -lm -pthread -O3 -march=native -funroll-loops -Wall -Wextra -Wpedantic
src/shuffle.c: In function ‘shuffle_merge’:
src/shuffle.c:96:17: warning: ignoring return value of ‘fread’, declared with attribute warn_unused_result [-Wunused-result]
  fread(&array[i], sizeof(CREC), 1, fid[j]);
src/shuffle.c: In function ‘shuffle_by_chunks’:
src/shuffle.c:161:9: warning: ignoring return value of ‘fread’, declared with attribute warn_unused_result [-Wunused-result]
  fread(&array[i], sizeof(CREC), 1, fin);
gcc -c src/glove.c -o build/glove.o -lm -pthread -O3 -march=native -funroll-loops -Wall -Wextra -Wpedantic
src/glove.c: In function ‘load_init_file’:
src/glove.c:86:9: warning: ignoring return value of ‘fread’, declared with attribute warn_unused_result [-Wunused-result]
  fread(&array[a], sizeof(real), 1, fin);
src/glove.c: In function ‘glove_thread’:
src/glove.c:182:9: warning: ignoring return value of ‘fread’, declared with attribute warn_unused_result [-Wunused-result]
  fread(&cr, sizeof(CREC), 1, fin);
gcc -c src/common.c -o build/common.o -lm -pthread -O3 -march=native -funroll-loops -Wall -Wextra -Wpedantic
gcc build/vocab_count.o build/common.o -o build/vocab_count -lm -pthread -O3 -march=native -funroll-loops -Wall -Wextra -Wpedantic
gcc build/cooccur.o build/common.o -o build/cooccur -lm -pthread -O3 -march=native -funroll-loops -Wall -Wextra -Wpedantic
gcc build/shuffle.o build/common.o -o build/shuffle -lm -pthread -O3 -march=native -funroll-loops -Wall -Wextra -Wpedantic
gcc build/glove.o build/common.o -o build/glove -lm -pthread -O3 -march=native -funroll-loops -Wall -Wextra -Wpedantic
./demo.sh
mkdir -p build
--2020-05-08 17:04:13-- mattmahoney.net/dc/text8.zip
Resolving mattmahoney.net (mattmahoney.net)... 67.195.197.75
Connecting to mattmahoney.net (mattmahoney.net)|67.195.197.75|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 31344016 (30M) [application/zip]
Saving to: ‘text8.zip’
text8.zip 100%[======>] 29.89M 1.97MB/s in 15s
2020-05-08 17:04:29 (1.95 MB/s) - ‘text8.zip’ saved [31344016/31344016]
Archive: text8.zip
inflating: text8
$ build/vocab_count -min-count 5 -verbose 2 < text8 > vocab.txt
BUILDING VOCABULARY
Processed 17005207 tokens.
Counted 253854 unique words.
Truncating vocabulary at min count 5.
Using vocabulary of size 71290.
$ build/cooccur -memory 4.0 -vocab-file vocab.txt -verbose 2 -window-size 15 < text8 > cooccurrence.bin
COUNTING COOCCURRENCES
window size: 15
context: symmetric
max product: 13752509
overflow length: 38028356
Reading vocab from file "vocab.txt"...loaded 71290 words.
Building lookup table...table contains 94990279 elements.
Processing token: 200000
./demo.sh: line 43: 114 Killed $BUILDDIR/cooccur -memory $MEMORY -vocab-file $VOCAB_FILE -verbose $VERBOSE -window-size $WINDOW_SIZE < $CORPUS > $COOCCURRENCE_FILE
@NormalizedNerd 4 years ago
@Sakib Ahmed repl is probably not a good idea for DL stuff. Try to use Colab/Kaggle; you can directly clone the GitHub repo in Colab. I've created a Colab notebook. Run it yourself. It works perfectly! colab.research.google.com/drive/1BA-GRHQOsXrYwmkalQyejsnVE8zmoyH2?usp=sharing
@sakibahmed2373 4 years ago
@@NormalizedNerd Thank you so much! It really worked... 😊 (y)
@NormalizedNerd 4 years ago
@@sakibahmed2373 Do share this channel with your friends :D Enjoy machine learning!
@atomic7680 4 years ago
G-Love 😂
@NormalizedNerd 4 years ago
Haha...Exactly what I thought when I learned the word for the first time!
@TheMurasaki1 4 years ago
"I love to make videos"... sorry to say this, but is it correct English?
@kaustavdatta4748 4 years ago
Not the best English. But the model doesn't care as it will learn whatever you (or the dataset) teach it. The author's English doesn't impact the explanation of the model's workings.
@Schaelpy 1 year ago
Good video, but the wrong pronunciation of GloVe is killing me, man.
@ToniSkit 7 months ago
You mean the right ❤
@harshitatiwari8019 4 years ago
Reduce the number of ads. An ad like every minute. Google has made YouTube a money-sucking machine. So irritating.