Byte Latent Transformer - BLT explained (Entropy of Next Byte, META)

  5,853 views

Discover AI

1 day ago

Comments: 15
@code4AI
@code4AI 1 month ago
Please note: with the automatic dubbing from YouTube/Google you hear a synthetic voice in your regional language. To hear my original voice in English, switch to "Default" or "English" in the settings. Thank you.
@mrpocock
@mrpocock 1 month ago
Byte-level LLMs are obviously the way forward for that first round of training where you're predicting 1..n tokens given the prefix, particularly for multi-language models. Tokenization is clearly a hack, like in the dark ages of image neural networks, where we would hand-craft feature detection kernels.
@ProgrammingWIthRiley
@ProgrammingWIthRiley 1 month ago
Brother, you are amazing. Thank you for doing this.
@williamervin3272
@williamervin3272 21 days ago
I would love to see a follow-up paper that explores adding another layer to create patches of patches. Then maybe the "Large Concept Model" idea can finally be realized with good performance. Fun to think about!
@wwkk4964
@wwkk4964 1 month ago
Thank you so much for covering this paper! I had been thinking about this specific implementation for a year, and I believe it's a significant step towards a truly general learning architecture that minimizes hand-crafted human priors.
@TalsBadKidney
@TalsBadKidney 1 month ago
very very cool
@themax2go
@themax2go 1 month ago
I'm having a plant-based BLT right now
@thanhhuynh1139
@thanhhuynh1139 29 days ago
I think the entropy formula should be p_x*log(1/p_x) = - p_x*log(p_x). Where did the ‘-’ go?
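For reference, the commenter's sign is right: the standard Shannon entropy of the next-byte distribution carries the minus sign, and the two forms are the same quantity. Written in generic notation (not necessarily the paper's exact symbols):

$$
H(x_i) \;=\; \sum_{b} p(b \mid x_{<i}) \,\log \frac{1}{p(b \mid x_{<i})} \;=\; -\sum_{b} p(b \mid x_{<i}) \,\log p(b \mid x_{<i})
$$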
@davidwynter6856
@davidwynter6856 1 month ago
Can you clarify whether pre-training will have to use the BLT embeddings? I.e., unless models pre-trained using BLT start appearing on huggingface or elsewhere, we mere mortals will not be able to take advantage of this new method?
@pabloescobar2738
@pabloescobar2738 1 month ago
Amen
@Swooshii-u4e
@Swooshii-u4e 28 days ago
What do you mean? I can't seem to make sense of your comment
@JeomonGeorge
@JeomonGeorge 1 month ago
Does the small transformer use BPE? And then, in H(x_i), is it computing the cross entropy? 26:13
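In the BLT setup discussed in the video, the small model is a byte-level transformer (no BPE vocabulary), and H(x_i) is the entropy of its predicted next-byte distribution, used to decide where a new patch should start. Below is a minimal sketch of that entropy-based patching idea; the helper names (`next_byte_entropy`, `entropy_patch_boundaries`, `small_lm`) and the threshold value are illustrative assumptions, not the paper's actual code.

```python
import math

def next_byte_entropy(probs):
    """Shannon entropy H(x_i) = -sum_b p(b) * log p(b) of a next-byte
    distribution `probs` (256 probabilities, one per byte value)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def entropy_patch_boundaries(byte_seq, small_lm, threshold=1.5):
    """Start a new patch wherever the small byte-level LM is uncertain,
    i.e. its next-byte entropy exceeds a global threshold.
    `small_lm(prefix)` is assumed to return a 256-way probability list."""
    boundaries = [0]                          # a patch always starts at position 0
    for i in range(1, len(byte_seq)):
        probs = small_lm(byte_seq[:i])        # predicted distribution for the byte at position i
        if next_byte_entropy(probs) > threshold:
            boundaries.append(i)              # high entropy -> a new patch begins here
    return boundaries

if __name__ == "__main__":
    # Toy stand-in for the trained byte-level LM: a uniform next-byte distribution
    # (entropy ln(256) ~ 5.5, so every position crosses the threshold below).
    uniform_lm = lambda prefix: [1.0 / 256] * 256
    print(entropy_patch_boundaries(b"hello world", uniform_lm, threshold=5.0))
```

With a real trained byte-level LM, entropy stays low inside predictable spans (e.g. the rest of a common word) and spikes at hard-to-predict positions, which is what makes the patches content-adaptive rather than fixed-size.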
@King_Deundel
@King_Deundel 1 month ago
BLT seems the way to go in an ideal world, but there are definitely problems with it. I think tokenizers have accomplished tremendous work, and we got to this state thanks to improving vocab sizes and tokenization mechanisms, but from this point we may have the technology and resources to try BLT on a model (I still don't think it would work that much better).
@augmentos
@augmentos 1 month ago
Can you expand on the "definitely problems" with it?
@ivangoncharuk607
@ivangoncharuk607 1 month ago
Bacon Lettuce Tomato