Please note: with the automatic dubbing from YouTube/Google you hear a synthetic voice in your regional language. To hear my original voice in English, switch to "Default" or "English" in the settings. Thank you.
@mrpocock (a day ago)
Byte-level LLMs are obviously the way forward for that first round of training where you're predicting 1..n tokens given the prefix, particularly for multi-language models. Tokenization is clearly a hack, like in the dark ages of image neural networks, where we would hand-craft feature detection kernels.
@ProgrammingWIthRiley (18 hours ago)
Brother, you are amazing. Thank you for doing this.
@wwkk4964 (a day ago)
Thank you so much for covering this paper! I had been thinking about this specific implementation for a year, and I believe it's a significant step towards a truly general learning architecture that minimizes hand-crafted human priors.
@davidwynter6856 (a day ago)
Can you clarify whether pre-training will have to use the BLT embeddings? I.e., unless models pre-trained with BLT start appearing on Hugging Face or elsewhere, we mere mortals will not be able to take advantage of this new method?
@King_Deundel (a day ago)
BLT seems the way to go in an ideal world, but there are definitely problems with it. I think tokenizers have accomplished tremendous work, and we got to this state thanks to growing vocab sizes and better tokenization mechanisms, but from this point we may have the technology and resources to try BLT on a model (I still don't think it would work that much better).
@themax2go (a day ago)
I'm having a plant-based BLT right now.
@TalsBadKidney (a day ago)
very very cool
@JeomonGeorge (a day ago)
Does the small transformer use BPE? And in H(x_i), is it computing the cross-entropy? 26:13
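[For reference, my reading of the BLT paper rather than anything stated in this comment: the small transformer is itself byte-level, so no BPE is involved; it predicts a distribution over the next byte, and H(x_i) is the Shannon entropy of that distribution, which the patcher uses to decide where patch boundaries go. A minimal sketch of that per-byte entropy, in standard notation (the video/paper may write it slightly differently):

    % Entropy of the small model's next-byte distribution p_e,
    % taken over the byte vocabulary V (256 values).
    H(x_i) = -\sum_{v \in \mathcal{V}} p_e(x_i = v \mid x_{<i}) \, \log p_e(x_i = v \mid x_{<i})

Cross-entropy against the true next byte is what the small model is trained to minimize; at patching time only the entropy H(x_i) is needed, with a new patch starting when it exceeds a threshold.]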