Byte Latent Transformer - BLT explained (Entropy of Next Byte, META)

Views: 3,014

Discover AI

1 day ago

Comments: 10
@code4AI · 1 day ago
Please note: with the automatic dubbing from YouTube/Google, you hear a synthetic voice in your regional language. To hear my original voice in English, switch to "Default" or "English" in the settings. Thank you.
@mrpocock · 1 day ago
Byte-level LLMs are obviously the way forward for that first round of training, where you're predicting the next 1..n tokens given the prefix, particularly for multi-language models. Tokenization is clearly a hack, like the dark ages of image neural networks, when we hand-crafted feature-detection kernels.
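[Editor's note: to make the "byte-level" point concrete, here is a minimal sketch; it is an illustration of the comment's framing, not code from the video or the paper. UTF-8 already gives a fixed 256-symbol vocabulary, so the learned tokenizer disappears.]

```python
# Sketch (editor's illustration): "byte-level" means the input vocabulary is
# just the 256 possible byte values, so no learned tokenizer (BPE merges,
# hand-crafted rules) is needed for any language.
text = "Byte Latent Transformer"
byte_ids = list(text.encode("utf-8"))   # every id is in range(256)
print(byte_ids[:8])                     # [66, 121, 116, 101, 32, 76, 97, 116]

# The pre-training objective the comment describes: predict the next byte
# given the prefix of bytes seen so far.
prefix_target_pairs = list(zip(byte_ids[:-1], byte_ids[1:]))
```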
@ProgrammingWIthRiley · 18 hours ago
Brother, you are amazing. Thank you for doing this.
@wwkk4964 · 1 day ago
Thank you so much for covering this paper! I had been thinking about this specific implementation for a year, and I believe it's a significant step towards a truly general learning architecture that minimizes hand-crafted human priors.
@davidwynter6856 · 1 day ago
Can you clarify that pre-training will have to use the BLT embeddings? I.e., unless models pre-trained using BLT start appearing on Hugging Face or elsewhere, we mere mortals will not be able to take advantage of this new method?
@King_Deundel · 1 day ago
BLT seems the way to go in an ideal world, but there are definitely problems with it. I think tokenizers have accomplished tremendous work, and we got to this state thanks to growing vocab sizes and better tokenization mechanisms; but from this point we may have the technology and resources to try BLT on a model (I still don't think it would work that much better).
@themax2go · 1 day ago
I'm having a plant-based BLT right now
@TalsBadKidney · 1 day ago
very very cool
@JeomonGeorge · 1 day ago
Does the small transformer use BPE? And for H(x_i), is it computing the cross-entropy? 26:13
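[Editor's note, for context from the BLT paper: the small transformer runs on raw bytes, not BPE, and H(x_i) is the Shannon entropy of its next-byte distribution, H(x_i) = -Σ_v p(v | x_<i) log p(v | x_<i), rather than a cross-entropy against labels; patch boundaries are placed where this entropy exceeds a threshold. A minimal sketch, assuming a hypothetical interface where `byte_probs_per_step[i]` holds the model's 256-way distribution over the next byte:]

```python
import math

def next_byte_entropy(probs):
    """Shannon entropy H(x_i) = -sum_v p(v) * log p(v) over the 256 byte values."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def entropy_patch_boundaries(byte_probs_per_step, threshold):
    """Start a new patch wherever the next-byte entropy exceeds a global
    threshold -- the entropy-based patching idea described in the BLT paper.
    `byte_probs_per_step[i]` is a hypothetical per-step distribution over
    the next byte, used here purely for illustration."""
    boundaries = [0]
    for i, probs in enumerate(byte_probs_per_step):
        if next_byte_entropy(probs) > threshold:
            boundaries.append(i + 1)
    return boundaries
```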
@ivangoncharuk607 · 1 day ago
Bacon Lettuce Tomato