Are Claude 3.5 Sonnet, Llama-3 and Gemini choosing speed over quality?

853 views

Chris Hay

Days ago

In this video, Chris looks at how model providers are trending towards grouped query attention (GQA) over traditional multi-head attention (MHA) in transformer models, and how this affects output in areas such as summarization. He shows that models such as Llama-2 or Claude 3 Opus produce more coherent output than newer models such as Llama-3, Gemini, or Gemma. In the end, in certain scenarios such as summarization or generative content, GPT-4o still beats Sonnet.
repo
github.com/chrishayuk/mha_gqa...
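
For readers who want a concrete picture of the MHA vs GQA difference mentioned in the description, here is a minimal pure-Python sketch. It is not taken from the linked repo. The function names (`attend`, `grouped_query_attention`, `kv_cache_floats`) and the head counts in the usage note are hypothetical and chosen only for illustration. The idea it shows: in GQA, several query heads share one key/value head, which shrinks the key/value cache. With as many key/value heads as query heads, the same code reduces to ordinary multi-head attention.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(q, keys, values):
    """Scaled dot-product attention for one query vector over a sequence."""
    scale = math.sqrt(len(q))
    scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


def grouped_query_attention(queries, kv_heads):
    """Illustrative GQA: each group of query heads shares one key/value head.

    queries:  one query vector per query head.
    kv_heads: one (keys, values) pair per key/value head.
    If len(kv_heads) == len(queries), this is plain multi-head attention.
    """
    n_q, n_kv = len(queries), len(kv_heads)
    if n_q % n_kv != 0:
        raise ValueError("query heads must divide evenly into key/value heads")
    group_size = n_q // n_kv
    outputs = []
    for head, q in enumerate(queries):
        # Query heads in the same group read the same keys and values.
        keys, values = kv_heads[head // group_size]
        outputs.append(attend(q, keys, values))
    return outputs


def kv_cache_floats(n_kv_heads, head_dim, seq_len):
    """Floats cached per layer: keys plus values for every position."""
    return 2 * n_kv_heads * head_dim * seq_len
```

As a usage example under assumed sizes, `kv_cache_floats(32, 128, 1000)` is four times `kv_cache_floats(8, 128, 1000)`, which is the kind of memory and speed saving that motivates GQA. The trade-off the video discusses is that the grouped query heads no longer see independent keys and values.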

Comments: 14
@makepeace88 9 days ago
I just attended a detailed anatomy of an LLM session... and it's just wow! Nobody's telling these details. Thanks very much Chris ❤
@chrishayuk 9 days ago
Glad it was useful, I skipped a lot of details, as I wanted to keep the focus on MHA vs GQA. I will probs do some other videos on some of the other details
@trsd8640 9 days ago
Great video! I don't understand it fully and had to watch it again, but I'm getting an idea of what is happening! Thank you!
@chrishayuk 9 days ago
it was quite a tough one to record, as i'm trying to avoid explaining the entire transformer architecture and attention fully (i'll do that in another video), but do enough to show how this architectural change is affecting model output. it was a weird balance, and apologies that i didn't explain it enough
@danielhenderson7050 9 days ago
This was very interesting
@chrishayuk 9 days ago
Glad you enjoyed, definitely a fun rabbit hole
@everyhandletaken 9 days ago
Interesting! Claude 3.5 Sonnet is definitely great for code, much better than GPT-4o, and has really helped me solve things that are well beyond my brain capacity in the last few days.
@chrishayuk 9 days ago
totally agree, much better for code than gpt-4o
@Leo-ph7ow 10 days ago
Excellent content! Thanks!
@chrishayuk 10 days ago
Glad you liked it!
@seanknowles9985 10 days ago
Intel agencies are having their fill first. It's obviously being slowed down so three-letter agencies can get ahead of this.
@chrishayuk 9 days ago
lol, i'm sure 3 letter agencies are having their say but i suspect it's not on MHA vs GQA but would love to hear that conversation if they were