Inside the LLM: Visualizing the Embeddings Layer of Mistral-7B and Gemma-2B

6,470 views

Chris Hay

days ago

Comments: 33
@chrishayuk 8 months ago
this is the github repo: github.com/chrishayuk/embeddings
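The video's core idea — pulling rows out of a model's static embedding matrix and comparing them — can be sketched with a toy matrix. This is illustrative numpy only, not code from the linked repo; in a real model the matrix would come from something like `model.get_input_embeddings().weight` (roughly 32000 × 4096 for Mistral-7B).

```python
import numpy as np

# Toy "embedding layer": 5-token vocabulary, 4-dimensional vectors.
vocab = {"cat": 0, "dog": 1, "car": 2, "kitten": 3, "truck": 4}
rng = np.random.default_rng(0)
E = rng.normal(size=(len(vocab), 4))

def embed(token):
    """Static embedding lookup: one row of the embedding matrix."""
    return E[vocab[token]]

def cosine(a, b):
    """Cosine similarity, the usual way to compare embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(embed("cat"), embed("kitten")))
print(cosine(embed("cat"), embed("truck")))
```

With trained weights (rather than this random toy matrix), semantically related tokens tend to score higher than unrelated ones.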
@sumandawnmobile 8 months ago
It's a great video for understanding the internals via the visualization. Thanks Chris.
@NERDDISCO 8 months ago
This came at exactly the right time! Thank you very much! I was just trying to understand this. Now I know how it works ❤
@chrishayuk 8 months ago
Glad it was helpful!
@rajneesh31 5 months ago
Damn, thank you YouTube for recommending this channel. @chrishayuk is a gun. Thanks Chris
@chrishayuk 5 months ago
Very kind, glad you like the channel
@scitechtalktv9742 8 months ago
Fantastic video! I am wondering: it would also be very interesting to visualize not only the static embeddings you already covered, but also the so-called contextualized embeddings in a later layer of the model. These are the embeddings exposed to the attention mechanism, which is why they are also called dynamic embeddings. That adds another layer of abstraction, but they are better embeddings because they can distinguish between homonyms: words that are spelled the same but have completely different meanings in different contexts. A good example is the word "bank", which has several different meanings depending on context (for example, financial institution or river bank, among others). As a consequence, the word "bank" is represented by several different vectors in embedding space, depending on the context it is used in. This is the basis of Word Sense Disambiguation (WSD). Would it be possible to visualize that too? I am curious…
@chrishayuk 8 months ago
yep, you got what i'm doing... i'm literally walking the stack
@chrishayuk 8 months ago
so those videos will be coming
@scitechtalktv9742 8 months ago
@chrishayuk Fantastic! Those embeddings are crucially important for the workings of Large Language Models!
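The static-versus-contextual distinction discussed above can be sketched with a toy stand-in for attention. This is illustrative only: real contextual vectors are a transformer layer's hidden states, not this crude mean-of-neighbours mix.

```python
import numpy as np

rng = np.random.default_rng(1)
vocab = {"bank": 0, "river": 1, "money": 2, "the": 3}
E = rng.normal(size=(len(vocab), 8))  # toy static embedding matrix

def contextual(tokens, target):
    """Crude stand-in for attention: blend the target's static vector
    with the mean of all the context tokens' static vectors."""
    vecs = np.stack([E[vocab[t]] for t in tokens])
    return 0.5 * E[vocab[target]] + 0.5 * vecs.mean(axis=0)

# Same word "bank", two different contexts -> two different vectors,
# whereas the static lookup E[vocab["bank"]] is identical in both.
v_river = contextual(["the", "river", "bank"], "bank")
v_money = contextual(["the", "money", "bank"], "bank")
print(np.linalg.norm(v_river - v_money))
```

That difference in vectors, plotted per layer, is exactly what a contextual-embedding visualization would show.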
@johntdavies 8 months ago
Great insight, thanks for posting this. It would be interesting to show how a fine-tuned model differs in similarities and "vocabulary". I'm also curious about the effects of quantisation, i.e. Q4, Q6, Q8, fp16, etc., on the internal "workings" of the LLM. Thanks again.
@chrishayuk 8 months ago
It’s almost like you’re reading my roadmap
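The quantisation question above is easy to make concrete: here is a minimal sketch of symmetric int8 quantisation (in the spirit of a Q8-style format, though real formats like llama.cpp's Q8_0 quantise in small blocks, each with its own scale), showing the error it introduces into a weight row.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantisation: scale into [-127, 127],
    round to integers, and keep the scale for dequantisation."""
    scale = float(np.abs(w).max()) / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(scale=0.02, size=4096).astype(np.float32)  # one toy weight row
q, s = quantize_int8(w)
err = float(np.abs(dequantize(q, s) - w).max())
print(err)  # small but nonzero: every embedding vector shifts slightly
```

Rounding bounds the per-weight error at half a scale step, which is why visualizing similarities before and after quantisation would show mostly preserved, slightly perturbed structure.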
@guaranamedia 5 months ago
Excellent explanation. Thanks for making these examples.
@chrishayuk 5 months ago
You're very welcome!
@Memes_uploader 8 months ago
Thank you so much! Thank you, YouTube algorithm, for showing such a great video!
@chrishayuk 8 months ago
Glad you enjoyed it!
@khalilbenzineb 8 months ago
I was playing a bit with fine-tuning to force an output schema for some 7B models, but lately I discovered schema grammars, which let you dynamically constrain generation by limiting the next tokens (including EOS) to a specific allowed set, so you get exactly the output you want. This is very stable and far more efficient for many cases we might think require fine-tuning. For me it felt like a new dimension for keeping the model's intentions in line. I love the unique and efficient way you create your videos, so I wanted to ask if you could make a video about this for us; I feel it's very important.
@chrishayuk 8 months ago
that's a good shout
@khalilbenzineb 8 months ago
Thanks @chrishayuk
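The grammar-constrained decoding idea in the thread above can be shown in miniature: mask the logits so only tokens the grammar currently allows can ever be sampled. The vocabulary and "grammar" here are invented for illustration; real implementations (e.g. GBNF grammars in llama.cpp) derive the allowed set from a parser state at each step.

```python
import numpy as np

# Toy vocabulary for emitting a fragment of JSON-like output.
vocab = ["{", "}", '"key"', ":", '"value"', "<eos>"]

def constrain(logits, allowed):
    """Grammar-constrained decoding in miniature: set every disallowed
    token's logit to -inf so it can never win sampling or argmax."""
    masked = np.full_like(logits, -np.inf)
    idx = list(allowed)
    masked[idx] = logits[idx]
    return masked

rng = np.random.default_rng(0)
logits = rng.normal(size=len(vocab))  # pretend these came from the model
allowed = {0}                         # grammar says: output must open with "{"
choice = int(np.argmax(constrain(logits, allowed)))
print(vocab[choice])  # always "{", regardless of the raw logits
```

Because the mask is applied before sampling, the model physically cannot emit an out-of-schema token — which is why this is more reliable than fine-tuning for output formatting.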
@kenchang3456 8 months ago
Thanks, the visualization really helped me.
@chrishayuk 8 months ago
so glad, seeing it at a lower level really demystifies what's going on
@enlightenment5d 7 months ago
Good! Where can I find your programs?
@chrishayuk 6 months ago
in my github repo github.com/chrishayuk
@andypai 8 months ago
Thank you! Great video!
@chrishayuk 6 months ago
thank you, glad it was useful
@lfzuniga31 8 months ago
based
@gregherringer7700 8 months ago
This helps, thanks!
@chrishayuk 8 months ago
Glad it helped! :)