Understanding the Meta Llama 3 Tokenizer | Llama for Developers

2,853 views

AI at Meta

1 day ago

Download Meta Llama 3 ➡️ go. kbpn54
Aston Zhang, a research scientist working on Llama at Meta, discusses the new tokenizer in Meta Llama 3 and the improvements it brings over Llama 2. The new tokenizer is built on tiktoken instead of SentencePiece and has a larger vocabulary of 128K tokens, resulting in better performance on coding, reasoning, and more. The larger vocabulary allows inputs to be encoded more specifically and with more nuance, while the higher compression ratio reduces the number of tokens required to represent a given input. Grouped-query attention (GQA) helps offset the increased memory and compute needs, so the model can process larger batches without increasing latency.
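To make the vocabulary-size and compression-ratio point concrete, here is a minimal sketch using two off-the-shelf tiktoken encodings as stand-ins (these are not the Llama 2/3 tokenizers, and the sample text is made up): a larger vocabulary generally encodes the same text into fewer tokens.

```python
# Stand-in illustration: tiktoken's "gpt2" encoding has a ~50K vocabulary
# and "cl100k_base" a ~100K vocabulary, so the same text usually needs
# fewer tokens under the larger vocabulary.
import tiktoken

text = (
    "Meta Llama 3 uses a tiktoken-based tokenizer with a 128K-token "
    "vocabulary, which encodes typical inputs into fewer tokens."
)

small = tiktoken.get_encoding("gpt2")         # ~50K vocabulary
large = tiktoken.get_encoding("cl100k_base")  # ~100K vocabulary

n_small = len(small.encode(text))
n_large = len(large.encode(text))

print(f"gpt2 tokens:        {n_small}")
print(f"cl100k_base tokens: {n_large}")
# Treat characters-per-token as the compression ratio: higher is better.
print(f"chars/token (gpt2):        {len(text) / n_small:.2f}")
print(f"chars/token (cl100k_base): {len(text) / n_large:.2f}")
```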
Timestamps
00:00 Introduction
00:25 What's new in the Llama 3 tokenizer?
01:58 Vocabulary size and compression ratio
13:01 Performance, efficiency and improving costs
17:46 Recap and resources
Additional Resources
• Dive into Deep Learning ebook: go. ao405f
• Getting Started Guide: go. xucc2m
#llama3 #llm #opensource
- - -
Subscribe: kzbin.info?sub_...
Learn more about our work: ai.meta.com
Follow us on social media
Follow us on Twitter: / aiatmeta
Follow us on LinkedIn: / aiatmeta
Follow us on Threads: threads.net/aiatmeta
Follow us on Facebook: / aiatmeta
Meta AI focuses on bringing the world together by advancing AI, powering meaningful and safe experiences, and conducting open research.

Comments: 8
@anirbansen7132 6 days ago
Informative
@loabrasumente2283 13 days ago
TLDR:
- from Llama 2 to Llama 3 they switched from SentencePiece to tiktoken
- vocab size 32K -> 128K
- ~15% fewer tokens for English, ~50% fewer for "some other languages"
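For anyone who wants to inspect the new tokenizer directly, below is a minimal sketch of loading Llama 3's tokenizer.model with tiktoken, modeled on the Tokenizer class in the meta-llama/llama3 repo; the file path is a placeholder, the split pattern should be verified against the repo, and only one of the repo's 256 special tokens is registered here.

```python
# Minimal sketch: build a tiktoken Encoding from Llama 3's BPE file,
# following the approach of meta-llama/llama3's llama/tokenizer.py.
import tiktoken
from tiktoken.load import load_tiktoken_bpe

model_path = "Meta-Llama-3-8B/tokenizer.model"   # placeholder: downloaded weights
mergeable_ranks = load_tiktoken_bpe(model_path)  # 128K base BPE ranks

enc = tiktoken.Encoding(
    name="llama3",
    # Split regex as published in the meta-llama/llama3 repo (verify there).
    pat_str=r"(?i:'s|'t|'re|'ve|'m|'ll|'d)|[^\r\n\p{L}\p{N}]?\p{L}+|\p{N}{1,3}| ?[^\s\p{L}\p{N}]+[\r\n]*|\s*[\r\n]+|\s+(?!\S)|\s+",
    mergeable_ranks=mergeable_ranks,
    # The repo registers 256 special tokens starting at id 128000;
    # only the first is shown here.
    special_tokens={"<|begin_of_text|>": len(mergeable_ranks)},
)

print(enc.n_vocab)                    # base vocab + registered special tokens
print(enc.encode("Hello, Llama 3!"))  # token ids under the Llama 3 vocabulary
```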
@parvesh-rana 17 days ago
Aston, please explain the attention mechanism. I'm stuck on the "Attention and transformer" chapter of your book, d2l.
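(Not from the Meta team, but for readers stuck at the same point: below is a toy numpy sketch of scaled dot-product attention, extended with the grouped-query idea mentioned in the video, where several query heads share one key/value head; all shapes and values are made up.)

```python
# Toy sketch of scaled dot-product attention plus grouped-query attention
# (GQA): 8 query heads share 2 key/value heads, shrinking the KV cache 4x.
import numpy as np

def attention(q, k, v):
    # q: (n_q, d); k, v: (n_k, d). Softmax over keys for each query row.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
seq, d = 4, 8
n_q_heads, n_kv_heads = 8, 2                   # GQA: 4 query heads per KV head

q = rng.normal(size=(n_q_heads, seq, d))
kv = rng.normal(size=(n_kv_heads, 2, seq, d))  # stacked (k, v) per KV head

outputs = []
for h in range(n_q_heads):
    k, v = kv[h * n_kv_heads // n_q_heads]     # map query head -> shared KV head
    outputs.append(attention(q[h], k, v))

print(np.stack(outputs).shape)  # (8, 4, 8): one attention output per query head
```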
@stephennfernandes 9 days ago
Could someone from the Meta Llama 3 team please explain how to train my very own tiktoken tokenizer like you did for Llama 3? There are no open-source steps for recreating this.
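tiktoken itself does not ship a public training API, so there is no drop-in way to reproduce Meta's exact recipe; one plausible substitute is training a byte-level BPE vocabulary with the Hugging Face tokenizers library, sketched below with a hypothetical corpus file and a Llama 3-scale vocab size.

```python
# One plausible substitute (not Meta's actual recipe): train a byte-level
# BPE tokenizer with Hugging Face "tokenizers" at a Llama 3-scale vocab.
from tokenizers import Tokenizer, decoders, models, pre_tokenizers, trainers

tokenizer = Tokenizer(models.BPE())
tokenizer.pre_tokenizer = pre_tokenizers.ByteLevel(add_prefix_space=False)
tokenizer.decoder = decoders.ByteLevel()

trainer = trainers.BpeTrainer(
    vocab_size=128_000,                                    # Llama 3-scale
    special_tokens=["<|begin_of_text|>", "<|end_of_text|>"],
    initial_alphabet=pre_tokenizers.ByteLevel.alphabet(),  # cover all bytes
)
tokenizer.train(files=["corpus.txt"], trainer=trainer)     # hypothetical corpus
tokenizer.save("my_bpe_tokenizer.json")
```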
@maksymkyiv1111 15 days ago
ok.
@Windowsmakes 2 days ago
x
@user-wr4yl7tx3w 15 days ago
I don't think this format works unless the intent is to discuss at a high level.