
Word Embeddings - EXPLAINED!

  13,887 views

CodeEmporium


1 day ago

Comments: 22
@user-in4ij8iq4c
@user-in4ij8iq4c 1 year ago
Best explanation of embeddings so far out of the videos I've watched on YouTube. Thanks, and subscribed.
@Jonathan-rm6kt
@Jonathan-rm6kt 9 months ago
Thank you! This is the perfect level of summary I was looking for. I'm trying to figure out a certain use case; maybe someone reading this can point me in the right direction. How can one create embeddings that retain an imposed vector/parameter representing a word chunk's semantic location in a document? I.e., a phrase occurring in chapter 2 is meaningfully different from the same phrase in chapter 4. This seems to be achievable by parsing the document by hand and inserting metadata, but it feels like there should be a more automatic way of doing it.
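One way to make the manual-metadata idea mentioned above concrete is to fold the location (e.g., the chapter number) into the text before encoding it. This is only a sketch of that idea, not an answer from the video; the library and model name are assumptions for illustration:

```python
# Sketch: prepend location metadata to each chunk so the embedding reflects
# where the phrase occurs. Library and model name are assumptions.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

chunks = [
    {"chapter": 2, "text": "The protagonist returns home."},
    {"chapter": 4, "text": "The protagonist returns home."},
]

# Same phrase, different chapters -> different (but related) embeddings,
# because the chapter tag is part of the encoded string.
texts = [f"[Chapter {c['chapter']}] {c['text']}" for c in chunks]
embeddings = model.encode(texts)
print(embeddings.shape)  # (2, 384) for this model
```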
@sgrimm7346
@sgrimm7346 1 month ago
Good video... it explains things from a very high level, very well. But I'm trying to figure out why/how a single word would result in a large vector. Are the meanings of the word encoded into the vector? As an example, would 'cat' have 'fur, claws, mammal, kitten, animal...' etc., resulting in a vector of, say, 100 elements? Even if the vector is generated by the computer, which it obviously is, each element in the vector has to represent something. I can't seem to get past this point. I understand what word2vec does, I just don't know why it does it. Any help? Thanks.
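A small sketch (not from the video; the library and pretrained model are assumptions) showing that the individual elements of a word vector are learned coordinates rather than hand-labelled features like 'fur' or 'claws'; the meaning shows up in distances between whole vectors, not in any single element:

```python
# Sketch: inspect pretrained word vectors. No element maps to a named feature;
# similarity between whole vectors is what carries meaning.
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-50")      # downloads the model on first use

print(vectors["cat"][:5])                          # 5 of 50 learned coordinates
print(vectors.similarity("cat", "kitten"))         # high: related words are close
print(vectors.similarity("cat", "democracy"))      # low: unrelated words are far apart
```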
@larrybird3729
@larrybird3729 1 year ago
Great video, but I'm still a bit confused about what is currently being used for embeddings. Are you saying BERT is the next word2vec for embeddings? Is that what ChatGPT-4 uses? Sorry if I didn't understand!
@markomilenkovic2714
@markomilenkovic2714 1 year ago
I still don't understand how to convert words into numbers
@bofloa
@bofloa 1 year ago
You first turn the text into a corpus: words separated by spaces and grouped into sentences. Then you decide the vector size, which is a hyperparameter, and generate a random vector of that size for each word. All of this is stored in a 2-dimensional array or a dictionary where the word is the key used to access its vector. You also have to account for word co-occurrence, or rather word frequencies, in the corpus, so you know how many times a particular word occurs. Once this is done, you can decide whether to use CBOW or Skip-gram. The purpose of these two methods is to create training data: in CBOW you use the context words as input and the target word as output, while Skip-gram is the opposite, using the target word as input and the context words as output. Then you train the model in a supervised / self-supervised fashion...
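As an illustration of the pipeline described in that comment, here is a minimal sketch using gensim's Word2Vec; the library choice, the toy corpus, and the parameter values are assumptions for demonstration, not taken from the video or the comment:

```python
# Sketch: corpus -> Word2Vec training -> per-word vectors.
from gensim.models import Word2Vec

# Corpus: sentences as lists of tokens.
corpus = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "kitten", "played", "with", "the", "cat"],
]

model = Word2Vec(
    sentences=corpus,
    vector_size=100,  # hyperparameter: embedding dimension
    window=2,         # context window around each target word
    min_count=1,      # keep all words (tiny corpus)
    sg=1,             # 1 = Skip-gram, 0 = CBOW
)

print(model.wv["cat"].shape)         # (100,)
print(model.wv.most_similar("cat"))  # nearest words by cosine similarity
```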
@_seeker423
@_seeker423 6 months ago
Can you explain how, after training CBOW / Skip-gram models, you generate embeddings at inference time? With Skip-gram it is somewhat intuitive that you would one-hot encode the word and extract the output of the embedding layer. I'm not sure how it works with CBOW, where the input is a set of context words.
@_seeker423
@_seeker423 4 months ago
I think I saw in another video that while the problem formulation is different in CBOW vs. Skip-gram, ultimately the training setup is reduced to pairs of words.
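In either architecture the word vectors end up living in the input-side weight matrix, so a rough sketch of "inference" (an illustration under that assumption, not code from the video) is just a row lookup, which is equivalent to multiplying a one-hot vector by that matrix:

```python
# Sketch: after training, the learned input->hidden weight matrix is the
# embedding table; looking up a word's vector is a single row selection.
import numpy as np

vocab = {"cat": 0, "kitten": 1, "mat": 2}
embedding_dim = 100

# Stand-in for the trained input-side weights, shape (vocab_size, dim).
W_in = np.random.randn(len(vocab), embedding_dim)

def embed(word: str) -> np.ndarray:
    one_hot = np.zeros(len(vocab))
    one_hot[vocab[word]] = 1.0
    return one_hot @ W_in  # identical to W_in[vocab[word]]

print(np.allclose(embed("cat"), W_in[vocab["cat"]]))  # True
```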
@RobertOSullivan
@RobertOSullivan 1 year ago
This was so helpful. Subscribed
@CodeEmporium
@CodeEmporium 1 year ago
Thank you so much! And super glad this was helpful
@MannyBernabe
@MannyBernabe 6 months ago
really good. thx.
@lorenzowottrich467
@lorenzowottrich467 1 year ago
Excellent video, you're a great teacher.
@CodeEmporium
@CodeEmporium 1 year ago
Thanks a lot for the kind words :)
@thekarthikbharadwaj
@thekarthikbharadwaj 1 year ago
As always, well explained 😊
@CodeEmporium
@CodeEmporium 1 year ago
Thanks a ton :)
@edwinmathenge2178
@edwinmathenge2178 1 year ago
That's some great gem right here...
@CodeEmporium
@CodeEmporium 1 year ago
Thanks so much for watching :)
@creativeuser9086
@creativeuser9086 1 year ago
It's a little confusing because in many examples a full chunk of text is converted into one embedding vector instead of multiple embedding vectors (one for each token of that chunk). Can you explain that?
@CodeEmporium
@CodeEmporium 1 year ago
Yeah. There are versions that produce sentence embeddings as well. For example, Sentence Transformers use BERT at their core to aggregate word vectors into sentence vectors that preserve meaning. Not all of these sentence-to-vector frameworks work the same way, though. For example, a TF-IDF vector is constructed from word co-occurrence across different documents; unlike sentence transformers, it is not a continuous dense vector representation. Both are worth checking out.
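For anyone who wants to try both routes mentioned in that reply, here is a hedged sketch contrasting a dense Sentence Transformers embedding with a sparse TF-IDF vector; the specific model name and the scikit-learn usage are assumptions for illustration, not from the video:

```python
# Sketch: dense sentence embeddings vs. sparse TF-IDF vectors.
from sentence_transformers import SentenceTransformer
from sklearn.feature_extraction.text import TfidfVectorizer

sentences = ["The cat sat on the mat.", "A kitten is a young cat."]

# Dense: a BERT-based model aggregates token vectors into one sentence vector.
model = SentenceTransformer("all-MiniLM-L6-v2")
dense = model.encode(sentences)                      # shape (2, 384), continuous values

# Sparse: one dimension per vocabulary term, mostly zeros.
tfidf = TfidfVectorizer().fit_transform(sentences)   # shape (2, vocab_size)

print(dense.shape, tfidf.shape)
```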
@creativeuser9086
@creativeuser9086 1 year ago
Are embedding models part of the base LLMs, or are they a completely different model with different weights? And what does the training of embedding models look like?
@CodeEmporium
@CodeEmporium 1 year ago
LLMs = large language models: models trained to perform language modeling (predicting the next token given context). Aside from BERT and GPT, the embedding models discussed here are not language models, since they don't solve for that objective. So while these models may learn some way to represent words as vectors, not all of them are language models. The training of each depends on the model; I have individual videos called "BERT explained" and "GPT explained" on the channel with details on those. For the other cases, like word2vec models, I'll hopefully make a video next week outlining the process more clearly.
@VishalKumar-su2yc
@VishalKumar-su2yc 6 months ago
hi
Word2Vec, GloVe, FastText- EXPLAINED!
13:20
CodeEmporium
20K views
Vectoring Words (Word Embeddings) - Computerphile
16:56
Computerphile
288K views
How to train a model to generate image embeddings from scratch
51:44
How does ChatGPT work? Explained by Deep-Fake Ryan Gosling.
8:31
Word Embedding and Word2Vec, Clearly Explained!!!
16:12
StatQuest with Josh Starmer
296K views
Positional Encoding in Transformer Neural Networks Explained
11:54
CodeEmporium
40K views
Word Embeddings
14:28
macheads101
156K views
Convolution in NLP
19:05
CodeEmporium
4.8K views
A Complete Overview of Word Embeddings
17:17
AssemblyAI
105K views
How AI 'Understands' Images (CLIP) - Computerphile
18:05
Computerphile
196K views
Embeddings - EXPLAINED!
12:58
CodeEmporium
7K views
NLP - EXPLAINED!
15:59
CodeEmporium
10K views