Embeddings Walkthrough (Part 1) - Bag of Words to word2vec to Transformer contextual embeddings

703 views

John Tan Chong Min


A day ago

How did Large Language Models (LLMs) become so good at capturing the essence of text?
Key idea: Embeddings
Mark my words - embeddings and context-dependent embeddings are the key reason why transformers work so well!
We'll go through a historical overview of how embeddings are derived, from Bag of Words to word2vec to Transformers, and then to some of my more recent experiments on context-dependent embeddings and multiple abstraction spaces!
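(Not from the video itself - just a minimal sketch of the starting point, with a toy corpus of my own.) In a bag-of-words representation, every occurrence of a word maps to the same fixed dimension, regardless of context:

```python
# Minimal bag-of-words sketch (toy example).
# Each sentence becomes a vector of word counts over a shared vocabulary,
# so the word "bank" gets the same slot whether it means a riverbank or a bank.

from collections import Counter

sentences = [
    "the bank raised interest rates",
    "we sat on the river bank",
]

# Build the vocabulary: one dimension per unique word.
vocab = sorted({word for s in sentences for word in s.split()})

def bag_of_words(sentence):
    """Count how often each vocabulary word appears in the sentence."""
    counts = Counter(sentence.split())
    return [counts[word] for word in vocab]

for s in sentences:
    print(s, "->", bag_of_words(s))
```

Both occurrences of "bank" land in the same slot even though they mean different things - the limitation that word2vec's dense vectors and, later, the Transformer's context-dependent embeddings progressively address.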
~~~~
Part 2 here: kzbin.info/www/bejne/j4u3hZuihcxjqLc
My slides: github.com/tanchongmin/strictjson/blob/main/Experiments/Embeddings%20Walkthrough.pdf
Bag of words: www.researchgate.net/publication/338511771_An_Overview_of_Bag_of_WordsImportance_Implementation_Applications_and_Challenges
Word2vec: courses.cs.washington.edu/courses/csep517/20wi/slides/csep517wi20-WordEmbeddings.pdf
word2vec paper: arxiv.org/abs/1301.3781 (a quick skip-gram sketch follows this list)
Transformer paper: arxiv.org/abs/1706.03762
Vision Transformer paper: arxiv.org/abs/2010.11929
Memorising Transformer paper (for that nice token prediction visualisation): arxiv.org/abs/2203.08913
Text and Code Embeddings by Contrastive Pre-training (OpenAI embeddings paper): arxiv.org/abs/2201.10005
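(My own rough sketch, not material from the video or slides.) For the word2vec paper linked above, the skip-gram idea - predict the surrounding words from the centre word - can be tried out with gensim; the toy corpus and hyperparameters here are arbitrary:

```python
# Rough word2vec (skip-gram) sketch using gensim; toy corpus, arbitrary settings.
from gensim.models import Word2Vec

corpus = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "kingdom"],
    ["the", "dog", "chases", "the", "ball"],
]

# sg=1 selects the skip-gram objective: predict context words from the centre word.
model = Word2Vec(sentences=corpus, vector_size=16, window=2, min_count=1, sg=1, epochs=50)

print(model.wv["king"])                      # a dense 16-dimensional vector
print(model.wv.similarity("king", "queen"))  # cosine similarity between the two vectors
```

Words that occur in similar contexts end up with similar dense vectors, which is what the cosine-similarity comparison at 55:47 builds on.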
~~~~
0:00 Introduction
1:30 Bag of words
5:55 Continuous vectors for embedding
8:42 word2vec
20:24 Next-token prediction
24:43 Transformer embeddings
43:17 Comparison: Image token embeddings
51:52 Recap on Transformer embeddings
55:47 Cosine Similarity
59:42 Sentence Embeddings
1:07:39 Why Contrastive Learning is Bad
1:11:11 Mismatch between next-token prediction and sentence meaning embedding prediction
1:19:18 Insight: Multiple Abstraction Space Prediction for Embeddings
1:27:03 Discussion
~~~~~~~~~~~~
AI and ML enthusiast. Likes to think about the essences behind breakthroughs in AI and explain them in a simple and relatable way. Also an avid game creator.
Discord: / discord
LinkedIn: / chong-min-tan-94652288
Online AI blog: delvingintotech.wordpress.com/
Twitter: / johntanchongmin
Try out my games here: simmer.io/@chongmin

Comments: 7
@snehotoshbanerjee1938 · 5 months ago
Knowledge-packed, with great insight on embeddings and their current scope for improvement.
@johntanchongmin · 6 months ago
Correction to 43:17: JFT-300M has 300M images, while ImageNet has 14M, so it is about 21x the size. Performance on ImageNet is about 88% zero-shot when pre-trained on JFT-300M. However, the point still stands that Vision Transformers are inefficient and require a large dataset to overcome their lack of the representational biases common for images, such as translational invariance/equivariance.
@johntanchongmin · 6 months ago
Key moments:
51:52 - 55:47 Recap on Transformer embeddings
1:07:39 - 1:11:11 Why Contrastive Learning is Bad
1:19:18 - 1:27:03 Insight: Multiple Abstraction Space Prediction for Embeddings
@johntanchongmin · 6 months ago
My slides: github.com/tanchongmin/strictjson/blob/main/Experiments/Embeddings%20Walkthrough.pdf
Other useful resources:
Bag of words: www.researchgate.net/publication/338511771_An_Overview_of_Bag_of_WordsImportance_Implementation_Applications_and_Challenges
Word2vec: courses.cs.washington.edu/courses/csep517/20wi/slides/csep517wi20-WordEmbeddings.pdf
word2vec paper: arxiv.org/abs/1301.3781
Transformer paper: arxiv.org/abs/1706.03762
Vision Transformer paper: arxiv.org/abs/2010.11929
Memorising Transformer paper (for that nice token prediction visualisation): arxiv.org/abs/2203.08913
Text and Code Embeddings by Contrastive Pre-training (OpenAI embeddings paper): arxiv.org/abs/2201.10005
@johntanchongmin · 5 months ago
Part 2 here: kzbin.info/www/bejne/j4u3hZuihcxjqLc
@seankruzel8889 · 6 months ago
@johntanchongmin At around 1:23, you talk a lot about how sentence embeddings are not really solved. Do you know of any work on entity embeddings? For example, how do you take the context-dependent embeddings for the tokens of a word like "Michael Jordan" and create an entity-specific embedding?
@johntanchongmin · 6 months ago
Happy to hear if you know of any. So far, the entity embeddings I know of are the same as sentence embeddings - just embed the entity directly.
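(A hedged sketch of the idea discussed in this thread - the model choice, example sentence, and mean-pooling are my own assumptions, not something prescribed in the video.) One common recipe is to run the full sentence through a Transformer and mean-pool the contextual token embeddings that cover the entity's span:

```python
# Sketch: entity embedding from context-dependent token embeddings (assumes
# Hugging Face transformers + torch installed; bert-base-uncased is an arbitrary choice).
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

sentence = "Michael Jordan played basketball for the Chicago Bulls."
entity = "Michael Jordan"

# Contextual embeddings (last hidden states) for every token in the sentence.
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state[0]  # shape: (seq_len, hidden_dim)

# Find which token positions belong to the entity via character offsets.
offsets = tokenizer(sentence, return_offsets_mapping=True)["offset_mapping"]
start, end = sentence.index(entity), sentence.index(entity) + len(entity)
entity_positions = [i for i, (s, e) in enumerate(offsets) if s < end and e > start and e > s]

# Mean-pool the entity's token embeddings into a single entity vector.
entity_embedding = hidden[entity_positions].mean(dim=0)
print(entity_embedding.shape)  # torch.Size([768])
```

Because the pooled vector comes from context-dependent token embeddings, "Michael Jordan" in a basketball sentence would differ from "Michael Jordan" in a machine-learning sentence, which is one way to go beyond embedding the entity string directly.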