Ali Ghodsi, Lec [3,2]: Deep Learning, Word2vec

14,152 views

Data Science Courses

1 day ago

Comments: 17
@jerry11111 8 years ago
Best lecture on word2vec. It covers everything the papers leave ambiguous: the notation, what to optimize, and why.
@autripat 8 years ago
The Skip-gram model discussion starts at 17:20 (we transition away from the "intractable" continuous bag of words model). The Skip-gram training objective is to learn word vector representations that are good at predicting nearby words (context). The GloVe (Global Vectors for Word Representation) model starts at 54:36.
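For anyone cross-referencing the papers: the training objective described in the comment above is usually written as follows. This uses the u/v (output-vector/input-vector) convention from the Mikolov et al. papers, which may differ from the symbols on the lecture slides:

```latex
% Skip-gram: maximize the average log-probability of context words
% within a window of radius m around each position t.
\frac{1}{T} \sum_{t=1}^{T} \sum_{-m \le j \le m,\; j \ne 0}
    \log p(w_{t+j} \mid w_t),
\qquad
p(o \mid c) = \frac{\exp(u_o^{\top} v_c)}{\sum_{w=1}^{V} \exp(u_w^{\top} v_c)}
```

The softmax denominator sums over the whole vocabulary, which is what makes the plain model expensive and motivates the negative-sampling discussion later in the lecture.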
@niteshroyal30 8 years ago
Thanks, Professor, for such a wonderful lecture on word2vec.
@wanminghuang1722 8 years ago
Thank you so much. Much easier to understand.
@paolofreuli1686 7 years ago
Awesome lecture!
@m.farahmand7440 8 years ago
Thanks for the informative lecture. At 7:26, shouldn't it be gradient ascent? After all, we are trying to maximize the likelihood function.
@yangli7741 6 years ago
I think 7:26 is just gradient descent, and the commenter who said the sigma sign shouldn't be there actually misunderstood it, because Prof. Ghodsi may have used confusing notation. In the log-likelihood, the summation over "w" runs over every word in the training set (the word to predict given context "c"); however, when taking the derivative, the vector we differentiate with respect to can be any word in the vocabulary, i.e., any column of the weight matrix W' to be learned. So we should use a different notation, e.g., w*, and take the partial derivative with respect to v_{w*}. Accordingly, the summation over w should exist in the first place, because w* and w are not the same thing. The later removal of the summation in the update rule, v_{w*} \leftarrow v_{w*} - \eta \, (1 - p(w \mid c)) \, \frac{\partial v_c^{\top} v_w}{\partial v_{w*}}, can be seen as switching from gradient descent to stochastic gradient descent. The only reason the final result doesn't go wrong is that \frac{\partial v_c^{\top} v_w}{\partial v_{w*}} is zero whenever w* \neq w; that is, through that term, only v_w is updated during SGD.
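To make the point in the comment above concrete, here is the gradient of the softmax log-likelihood worked out in full (my own derivation and notation, not a transcription of the lecture):

```latex
\log p(w \mid c) = v_c^{\top} v_w - \log \sum_{w'} \exp\!\big(v_c^{\top} v_{w'}\big)

% Differentiating with respect to an arbitrary output vector v_{w^*}:
\frac{\partial \log p(w \mid c)}{\partial v_{w^*}} =
\begin{cases}
\big(1 - p(w^* \mid c)\big)\, v_c & \text{if } w^* = w,\\[4pt]
-\,p(w^* \mid c)\, v_c & \text{if } w^* \ne w.
\end{cases}
```

The dot-product term v_c^{\top} v_w contributes only when w* = w, which is exactly why dropping the summation is harmless for that term; the log-normalizer, however, touches every output vector.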
@cem9927 6 years ago
If we have 4 words in the dictionary, we will have 4 v_w values, and in the gradient descent update we will update each v_w separately, right?
@tejasduseja 4 years ago
@yangli7741 Thanks, I had the same confusion in mind.
@imanshojaei7784 4 years ago
@yangli7741 Aren't the labels (i.e., the empirical probabilities) also missing from the formulation?
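Regarding the 4-word-vocabulary question above: yes, under the full softmax every output vector receives its own gradient at each step. A minimal numpy sketch of one full-softmax SGD step on a toy 4-word vocabulary (the variable names, learning rate, and shapes are my own illustration, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)
V, d = 4, 3                          # toy vocabulary size, embedding dim
W_in = rng.normal(size=(V, d))       # input/context vectors v_c
W_out = rng.normal(size=(V, d))      # output vectors v_w
lr = 0.1

c, w = 1, 3                          # one observed (context, target) pair
v_c = W_in[c].copy()                 # copy so the update below can't alias it

scores = W_out @ v_c                 # v_c . v_w' for every word w'
p = np.exp(scores - scores.max())    # subtract max for numerical stability
p /= p.sum()                         # softmax p(w' | c)

# d(-log p(w|c)) / d v_w' = (p(w'|c) - 1[w' == w]) * v_c
# -> nonzero for every w', so all four output vectors move.
err = p.copy()
err[w] -= 1.0
W_in[c] -= lr * (W_out.T @ err)      # update the context vector
W_out -= lr * np.outer(err, v_c)     # update all V output vectors at once
```

With negative sampling (discussed later in the lecture), only the target word and the k sampled negatives would be updated instead of all V output vectors.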
@stolzenable 8 years ago
Thank you for this lecture! It is very easy to follow. I wonder if the slides from this lecture are available somewhere?
@stolzenable 8 years ago
+Alexey Grigorev With a bit of googling, I found them here: uwaterloo.ca/data-science/deep-learning
@aseefzahir3977 6 years ago
It says "page not found".
@srujohn652 3 years ago
@Alexey Grigorev The page is still not found.
@rajupowers 8 years ago
@7:20 - How can we factor out v_c? There is no summation in the right-hand term.
@rajupowers 8 years ago
@16:40
@rajupowers 8 years ago
@19:00 - Negative sampling
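For reference, the negative-sampling objective introduced around this timestamp replaces the full softmax with the following per-pair term (standard form from the Mikolov et al. paper, not a transcription of the board; \sigma is the logistic sigmoid and the k negative words w_i are drawn from a noise distribution P_n):

```latex
\log \sigma\!\big(u_w^{\top} v_c\big)
  + \sum_{i=1}^{k} \mathbb{E}_{w_i \sim P_n(w)}
      \Big[ \log \sigma\!\big(-u_{w_i}^{\top} v_c\big) \Big]
```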