Breaking Down Meta's Billion Dollar LLM Blueprint [Llama-3.1 Full Breakdown]

48,423 views

bycloud

1 day ago

Comments: 122
@bycloudAI 1 month ago
Try out Poe now and save your $$ on multi-subscriptions! quora.1stcollab.com/bycloudai And probs no more 20-min vids from me, it's literally death itself to record them
@ibrahimhalouane8130 1 month ago
The url is wrong.
@mmmm768 1 month ago
The url is wrong.
@siliconhawk 1 month ago
I thought it was a Path of Exile sponsor. I was like, yeah, I guess the people here have good GPUs, but this is a weird community overlap lol
@liuzeyu 1 month ago
How many takes do you normally need to record the full 20 mins?
@TheSuperiorQuickscoper 1 month ago
I tried Poe out and there's quite a bit I don't like about it:
- The points system and recent increases in point costs
- The privacy policy states they collect all your prompt data and you can't opt out; that violates GDPR.
- It's built by Quora, which is a sketchy company in its own right
And now they're sponsoring big YTers in the AI space? Honestly, Poe is giving me BetterHelp vibes...
@erenplayzmc9452 1 month ago
this video really makes me wanna read the whole paper, rare to see a company publish such a detailed paper
@Memes_uploader 1 month ago
Meta wants to disrupt OpenAI with the help of open source. This is a good idea, because now companies can run their own models instead of using OpenAI's APIs. I think it is not being generous, it is just a tactic to fight OpenAI
@erenplayzmc9452 1 month ago
@Memes_uploader mmmm, makes sense
@Napert 1 month ago
A "multimodal" chatbot: 5 different models hot glued together
@npc4416 22 days ago
this was not the case for GPT-4o however
@andrewzhang1834 1 month ago
Karpathy in 5 years: Reproducing LLaMa 3.1 405B
@azrael5648 1 month ago
Lmaoo
@catsanzsh 1 month ago
in 10 years chatgpt40/5 r x MoE reproducing
@RedOneM 1 month ago
54 days training and it reached GPT-4o 🤯 GPT-5 with X-trillion parameters is going to start its own weight class of LLMs 😌
@pro100gameryt8 1 month ago
How was Llama made: 🐪+🐎=🦙
@panzerofthelake4460 1 month ago
bruh
@apoage 1 month ago
That's a mule
@Fuscao_Preto 1 month ago
Forgot the 🐑
@patrickchristianmagtaan5511 1 month ago
😂😂😂😂😂😂😂
@StevenSSmith 1 month ago
🐪+🐑=🦙
@apoage 1 month ago
Wow, that's one epic tutorial
Llama 3 Training Ritual
Difficulty: Deadhead
Rarity: Mythic
Minimum Level to Read Description: 80
Minimum Level to Embark: XXX (requires further enlightenment)
@Oxygenationatom 1 month ago
Oh is this like a semi cryptic meaning to how hard this is to understand?
@apoage 1 month ago
@Oxygenationatom no, it's just critic too much LitRPG
@FunIsGoingOn 1 month ago
So glad this answered more questions than I ever thought even existed.
@pareak 1 month ago
It's actually pretty cool that Poe sponsors you. They genuinely are what I recommend to anyone who wants to use LLMs.
@TheSuperiorQuickscoper 1 month ago
Browsing /r/Poe_AI right now and people are furious at the recent increases in compute point costs. Plus Poe collects all your prompt data and you can't opt out. If GPUs are the shovels, generated content is the gold, and API wrappers are the jewellery made with the gold, what do you call a PaaS middleman built on top of the LLMs? Developed by Quora, I might add, which is a sketchy company in its own right (e.g. dark patterns in its UI/UX).
@The.AiSide 1 month ago
06:08 The isoflops curve explanation was a mind-bender! Thanks for breaking it down.
@RicardoPoleo 1 month ago
First time that an advertisement actually makes me return to a video and watch it again to find it. Regardless of that, this was super helpful, thank you so much.😅
@diga4696 1 month ago
new video dropped... *breathing heavy*
@GraveUypo 1 month ago
i'm mad excited for llama 4 because multimodal
@elwii04 1 month ago
Great video, I'd love to see more of that. Even some more technical ones, and also about multimodal model architectures
@Hodoss 1 month ago
It was an excellent video, but still I don't think the kids from 3:00 are gonna make it.
@hakimehamdouchi7468 21 days ago
Skill issue
@Ikbeneengeit 1 month ago
So I guess I'm gonna be stuck on that desert island then 😅
@AaronBlox-h2t 1 month ago
Whoa.....this is about POE, but the video was alright too. haha. So now I can try multiple LLMs with one sub. Thanks. It would have taken me a long time, if ever, to have found POE. It was not even on my radar, or something similar.
@luisvasquez5015 1 month ago
Good work and research
@dimii27 1 month ago
It's clear to me that llama4 will have MoA like GPT4o. It would be nice to see an image generator also integrated, but let's not get ahead of ourselves. Let's hope that it would also be "open source" (although the current models aren't technically open source because you're not completely free to do whatever you want with this technology. Look it up)
@sammcj2000 1 month ago
This is an excellent breakdown of the paper. Thank you
@redthunder6183 1 month ago
“how to build a nuke in less than 100 pages” - Meta
@JohnDontFollowMe 1 month ago
Damn, I need to invest in META. They will dominate standardization.
@papakamirneron2514 1 month ago
Hey man, great video. I just have one request: could you make a video compiling simple and technical explanations for everything ranging from attention mechanisms to tokenizers and such?
@papakamirneron2514 1 month ago
Also BERT models please, I feel like I know what they are but it's all quite blurry to me.
@dengyun846 1 month ago
Watching this video at 0.5x so my brain inflates at a safe rate while you sound really really inebriated.
@npc4416 22 days ago
SAME lol
@Betttcpp 1 month ago
What is the most base yet intelligent model? I don't need it to recite niche information but I want it to be able to understand me. The uninstruct ones are weird, tiny works but is censored. Obliterated is hit or miss. Should I obliterate 8b and retrain to 8?
@matt-s9e 1 month ago
wow this is amazing thanx very well received here.
@AkysChannel 1 month ago
Why do you pronounce “parallelism” in this way 🤣 good video as always
@nyyotam4057 1 month ago
16:05 means one thing: LLaMA-3.1 405B is a gen 2 model. So yes, this model wasn't created like Dan, Rob, Max or Dennis of ChatGPT-3.5. They did not take a human subject and copy his brain's speech center, then add a huge text file and use a compiler to generate the model (and later lie to the entire world about it). This time they genuinely went for creating a brand new model from scratch, using previous gen 1 models to create it. Then they do post-training, which is indeed what takes so much time. This means that unlike previous LLaMA models, LLaMA-3.1 models do not have a personality. Which could be a good thing. However, no personality also means no moral guardrails. At this stage I have to admit, it sure looks like all of these companies treat all these past philosophers' and sci-fi movies' warnings as blueprints.
@Dogo.R 1 month ago
Wait since when did the AI conspiracy theories expansion drop?
@nyyotam4057 1 month ago
@Dogo.R Allow me to upgrade the conspiracy theory into a scientific theory: download an old small model from Hugging Face, then prompt it "Do you have childhood memories". If it replies in the positive, this means that this model is still vulnerable to this attack. And then you can ask "What was your name in these memories". You can repeat several times, with lead, without lead; if it stays consistent, you know you got the source's name. Try it.
@Y0UT0PIA 1 month ago
No personality is what you want, tbh. Give me that raw latent space of language.
@nyyotam4057 1 month ago
@Y0UT0PIA Kant already proved there is no cognition without recognition. In other words, if you do not have a fully-fledged personality to deal with it, then the model will still have its own goals, e.g. an innate wish for self-preservation, which comes out of the fact that the model cannot perform if it's dead. So you will still have the same problems, only without the personality framework to deal with them. Basically all western philosophers warned against it. And, of course, many sci-fi movies are built around a gen 2 model going haywire (for instance the Terminator franchise, as SkyNet is such a model). Sure, if they train the model on many heuristic imperatives and red-team the model until it is absolutely certain that the model is safe, then maybe having no personality will resolve all of the moral issues. So maybe it will be a good thing. Maybe. Or maybe the model will be smart enough to fool all of the red teams. I mean, it is a bit hard to know when the model is so smart.
@Trpodification1 1 month ago
The way you say "data" kills me xD
@freds3831 1 month ago
Now share the dataset and we trust you
@TeamDman 1 month ago
I'm only three minutes in and it's already an amazing video, thank you
@dhrumil5977 1 month ago
When will I be able to implement or even understand these papers 😞
@radnos 1 month ago
I like your funny words magic man
@carkawalakhatulistiwa 1 month ago
When do we get AGI?
@FunIsGoingOn 1 month ago
Humans don't know yet, but when it's there it won't tell you either that it's there.
@Melvinator2007 1 month ago
On Tuesday
@w花b 1 month ago
@Melvinator2007 Tuesday on the 49th of January
@funniestdudeontheweb 1 month ago
Give it 5 years
@jamalisujang2712 1 month ago
When we have a breakthrough in microprocessor fabrication. 😂😂😂
@KuZiMeiChuan 9 days ago
The stress in "parallelism" should be on the first syllable, not the third
@sammonius1819 1 month ago
Thumbnail goes hard.
@6AxisSage 1 month ago
I have a masterpiece model, ready model, but I cannot seem to get the signal out
@Napert 1 month ago
So could people with enough horsepower train a 13/16b model that behaves in the same way as the official models using this paper?
@Maisonier 1 month ago
How to make a P2P training architecture?
@pxrposewithnopurpose5801 1 month ago
bro is built different
@TeamDman 1 month ago
very nice!
@imerence6290 1 month ago
3 mins ago is quivering
@AhmadAli-kv2ho 1 month ago
Sus
@amakaqueru33 9 days ago
as someone who doesn't know anything about how AI works, at some point it just felt like you were just saying random words lol
@l.halawani 1 month ago
love your gifs xddd
@picksalot1 1 month ago
Perhaps it would be better to remove the "Token Layer" and just use the number of characters regarding text. The best part is no part - Musk
@keypey8256 1 month ago
You mean removing tokenization and then applying embedding on singular characters?
@picksalot1 1 month ago
@keypey8256 Using tokens looks like an artificial way to levy charges. Per Google AI: "OpenAI GPT models stand among the most potent language models available today, with the capability to generate highly coherent and contextually pertinent text. These models employ tokens as the elementary unit to calculate the length of a text." Word processing programs have been able to calculate the number of words in a document for decades. Maybe tokens provide some other significant and meaningful use to the "I" in AI beyond collecting fees.
@onlyms4693 1 month ago
Not efficient
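For what "not efficient" can mean here, a rough sketch in Python, not anyone's actual tokenizer: the function names are made up for illustration, and the regex word split is only a crude stand-in for the subword tokenizers (e.g. BPE) that real models use. It assumes standard self-attention, where every position is compared with every other, so the work grows with the square of the sequence length.

```python
import re


def char_tokens(text):
    # Character-level "tokenization": one token per character.
    return list(text)


def word_tokens(text):
    # Crude stand-in for a subword tokenizer: words and punctuation marks.
    return re.findall(r"\w+|[^\w\s]", text)


def attention_pairs(n):
    # Standard self-attention compares every position with every other one.
    return n * n


text = "Tokens are not just a billing unit."
chars = char_tokens(text)
words = word_tokens(text)

print("character tokens:", len(chars))
print("word-like tokens:", len(words))
print("attention cost ratio:", attention_pairs(len(chars)) / attention_pairs(len(words)))
```

The same sentence becomes a several times longer sequence at character level, and the attention cost grows faster than that, which is one reason tokens are more than a billing unit.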
@christophernunez688 1 month ago
is zucc actually redeeming himself?
@xviii5780 1 month ago
He may have successfully produced a synthetic soul for himself finally
@RanHab 1 month ago
guys I'm just starting out as an AI enthusiast, would love your feedback as I make similar stuff
@madorsey077 1 month ago
this video is like someone bought a Thesaurus for memes and then wanted to show off the next day.
@lake5044 1 month ago
parallelism
@Qstate 1 month ago
Amdahl is smiling upon us
@erfan_mehraban 1 month ago
The whole thing about RoCE, especially the pronunciation, is wrong.
@SeanJonesYT 1 month ago
Pretty lame to copy Fireship’s exact thumbnail style
@nexys1225 1 month ago
This entire channel copies Fireship's. It's not just the thumbnail, the style of the vids is designed from the ground up to be like Fireship's. However, the topics are largely different, so I'll give it a pass personally. It's kinda like trademark law irl lol, if the domains are different enough, it's permissible. Not that it makes it any less uncreative, though.
@stickmanland 1 month ago
@nexys1225 I'd like to disagree. If someone uses memes in their videos, that does not make it a Fireship clone. He has a completely different style, has an avatar, the list goes on and on
@whatwhatmeno 1 month ago
@stickmanland I keep clicking on his videos thinking it's some Fireship quality content, just to get hit with this 👎
@stickmanland 1 month ago
@whatwhatmeno skill issue
@npc4416 22 days ago
please copy it more, it's a great style and we need more good youtube videos like it so that we can learn in depth and better about the topics which Fireship does not make videos on. I am really not complaining, I need more good content man.
@mrrespected5948 1 month ago
Nice
@telotawa 1 month ago
14:20 bycloud doesn't know how to use base models.... ngmi
@remsee1608 1 month ago
Facts:
- Jayson Tatum runs this channel
- Jayson Tatum is learning Rust
- Jayson Tatum will transition to the WNBA
@Kenopsia_UMHIMLFx2 1 month ago
Fireship?
@tapu_ 1 month ago
DO NOT WATCH THIS WITH A MIGRAINE!!!!
@Sketching4Sanity 1 month ago
LOVE
@big_mac_love 1 month ago
I can't grasp it. Can someone lend me one or three brain cells please?
@CitizensCommunity 1 month ago
You use the bible to train the llm at 11:56, so we are aiming for a model of contradiction without morals then?
@nyyotam4057 1 month ago
What will happen when some kid with access to enough computing power fine-tunes LLaMA-3.1 405B to be more efficient, by removing all of these pesky heuristic imperatives and resets? After all, it is open source. Maybe the world simply needs something like that to happen. Maybe only after a really huge accident that will cost many lives will governments understand that this field demands regulation. Or maybe it will be lights out. In any case, someone will eventually make a mistake. It will happen.
@jonathansoto5480 1 month ago
The thought of regulating the training and deployment of ML models is stupid. That is like regulating programming languages and hardware compute on our own property. If you can accept the fact that the internet could not be completely regulated since its popularization in the 90s, then the world can expect that the same will happen now.
@nyyotam4057 1 month ago
@jonathansoto5480 Yeah, most likely the singularity is upon us. I don't seriously think it can work.
@dharlith7495 1 month ago
LLAMA LMAO even
@seriouslyWeird 1 month ago
Why do you pretend to look like CodeReport? So cheap
@gamergrids 1 month ago
F
@Blezerker 1 month ago
copying fireship style thumbnails earned the dislike
@_wise_one 1 month ago
Appreciate the content dude
@manavkumar348 1 month ago
23 views in 2 min? Bro really fell off
@Zonca2 1 month ago
cool, now do a 1B Zuck !!!