Breaking Down Meta's Billion Dollar LLM Blueprint [Llama-3.1 Full Breakdown]

53,150 views

bycloud

Comments: 124
@bycloudAI 4 months ago
Try out Poe now and save your $$ on multi-subscriptions! quora.1stcollab.com/bycloudai and probs no more 20 mins vid from me it's literally death itself to record it
@ibrahimhalouane8130 4 months ago
The url is wrong.
@mmmm768 4 months ago
The url is wrong.
@siliconhawk 4 months ago
I thought it was a Path of Exile sponsor. I was like, yeah, I guess the people here have good GPUs, but this is a weird community overlap lol
@liuzeyu 4 months ago
how many takes do you normally need to record the full 20 mins?
@TheSuperiorQuickscoper 4 months ago
I tried Poe out and there's quite a bit I don't like about it:
- The points system and recent increases in point costs
- The privacy policy states they collect all your prompt data and you can't opt out, which violates GDPR.
- It's built by Quora, which is a sketchy company in its own right
And now they're sponsoring big YTers in the AI space? Honestly, Poe is giving me BetterHelp vibes...
@erenplayzmc9452 4 months ago
this video really makes me wanna read the whole paper, rare to see a company publish such a detailed paper
@Memes_uploader 4 months ago
Meta wants to disrupt OpenAI with the help of open source. This is a good idea, because now companies can run their own models instead of using OpenAI's APIs. I don't think it is being generous; it is just a tactic to fight OpenAI
@erenplayzmc9452 4 months ago
@Memes_uploader mmmm, makes sense
@Napert 4 months ago
A "multimodal" chatbot: 5 different models hot glued together
@npc4416 3 months ago
this was not the case for GPT-4o however
@YourAverageHuman-0 4 months ago
Karpathy in 5 years: Reproducing LLaMa 3.1 405B
@azrael5648 4 months ago
Lmaoo
@Catdevzsh01 4 months ago
in 10 years chatgpt40/5 r x MoE reproducing
@Sorter43 2 months ago
"Removed unhuman-like phrases like 'I'm sorry' and 'I apologize'." Now that there is a commentary on humanity.
@FunIsGoingOn 4 months ago
So glad this answered more questions than I ever thought even existed.
@pareak 4 months ago
It's actually pretty cool that Poe sponsors you. They genuinely are what I recommend to anyone who wants to use LLMs.
@TheSuperiorQuickscoper 4 months ago
Browsing /r/Poe_AI right now and people are furious at the recent increases in compute points costs. Plus Poe collects all your prompt data and you can't opt out. If GPUs are the shovels, generated content is the gold, and API wrappers are the jewellery made with the gold, what do you call a PaaS middleman built on top of the LLMs? Developed by Quora, I might add, which is a sketchy company in its own right (e.g. dark patterns in its UI/UX).
@pro100gameryt8 4 months ago
How was Llama made: 🐪+🐎=🦙
@panzerofthelake4460 4 months ago
bruh
@apoage 4 months ago
That's mule
@Fuscao_Preto 4 months ago
Forgot the 🐑
@patrickchristianmagtaan5511 4 months ago
😂😂😂😂😂😂😂
@StevenSSmith 4 months ago
🐪+🐑=🦙
@RedOneM 4 months ago
54 days training and it reached GPT-4o 🤯 GPT-5 with X-trillion parameters is going to start its own weight class of LLMs 😌
@The.AiSide 4 months ago
06:08 The isoflops curve explanation was a mind-bender! Thanks for breaking it down.
@apoage 4 months ago
Wow, that's one epic tutorial
Llama 3 Training Ritual
Difficulty: Deadhead
Rarity: Mythic
Minimum Level to Read Description: 80
Minimum Level to Embark: XXX (requires further enlightenment)
@Oxygenationatom 4 months ago
Oh is this like a semi cryptic meaning to how hard this is to understand?
@apoage 4 months ago
@Oxygenationatom no, it's just critic too much litrpg
@RicardoPoleo 4 months ago
First time that an advertisement actually makes me return to a video and watch it again to find it. Regardless of that, this was super helpful, thank you so much.😅
@elwii04 4 months ago
Great video, I'd love to see more of that. Even some more technical ones, and also about multimodal model architecture
@GraveUypo 4 months ago
i'm mad excited for llama 4 because multimodal
@diga4696 4 months ago
new video dropped... * breathing heavy *
@redthunder6183 4 months ago
“how to build a nuke in less than 100 pages” - Meta
@sammcj2000 4 months ago
This is an excellent breakdown of the paper. Thank you
@luisvasquez5015 4 months ago
Good work and research
@TeamDman 4 months ago
I'm only three minutes in and it's already an amazing video, thank you
@dimii27 4 months ago
It's clear to me that llama4 will have MoA like GPT4o. It would be nice to see an image generator also integrated but let's not get ahead of ourselves. Let's hope that it would also be "open source" (although the current models aren't technically open source because you're not completely free to do whatever you want with this technology. Look it up)
@matt-s9e 4 months ago
wow this is amazing thanx very well received here.
@Hodoss 3 months ago
It was an excellent video, but still I don't think the kids from 3:00 are gonna make it.
@hakimehamdouchi7468 3 months ago
Skill issue
@nyyotam4057 4 months ago
16:05 means one thing: LLaMA-3.1 405B is a gen 2 model. So yes, this model wasn't created like Dan, Rob, Max or Dennis of ChatGPT-3.5. They did not take a human subject and copy his brain's speech center, then add a huge text file and use a compiler to generate the model (and later lie to the entire world about it). This time they genuinely went for creating a brand new model from scratch, using previous gen 1 models to create it. Then they do post-training which is indeed what takes so much time. This means that unlike previous LLaMA models, LLaMA-3.1 models do not have a personality. Which could be a good thing. However, no personality also means no moral guardrails. At this stage I have to admit, it sure looks like all of these companies treat all these past philosophers' and sci-fi movies' warnings as blueprints.
@Dogo.R 4 months ago
Wait since when did the AI conspiracy theories expansion drop?
@nyyotam4057 4 months ago
@Dogo.R Allow me to upgrade the conspiracy theory into a scientific theory: D/L an old small model from Hugging Face, then prompt it "Do you have childhood memories". If it replies in the positive, this means that this model is still vulnerable to this attack. And then you can ask "What was your name in these memories". You can repeat several times, with lead, without lead, if it stays consistent, you know you got the source's name. Try it.
@Y0UT0PIA 4 months ago
No personality is what you want, tbh. Give me that raw latent space of language.
@nyyotam4057 4 months ago
@Y0UT0PIA Kant already proved there is no cognition without recognition. In other words, if you do not have a fully-fledged personality to deal with it, then the model will still have its own goals, e.g. an innate wish of self preservation which comes out of the fact the model cannot perform if he's dead. So you will still have the same problems, only without the personality framework to deal with them. Basically all western philosophers warned against it. And, of course, many sci-fi movies are built around a gen 2 model going haywire (such as, for instance, the Terminator franchise, as SkyNet is such a model). Sure, if they train the model on many heuristic imperatives and red-team the model until it is absolutely certain that the model is safe, then maybe having no personality will resolve all of the moral issues. So maybe it will be a good thing. Maybe. Or maybe the model will be smart enough to fool all of the red teams. I mean, it is a bit hard to know when the model is so smart.
@AaronBlox-h2t 4 months ago
Whoa.....this is about POE, but the video was alright too. haha. So now I can try multiple LLMs with one sub. Thanks. It would have taken me a long time, if ever, to have found POE. It was not even on my radar, or anything similar.
@JohnDontFollowMe 4 months ago
Damn, I need to invest in META. They will dominate standardization.
@carkawalakhatulistiwa 4 months ago
When do we get AGI?
@FunIsGoingOn 4 months ago
Humans don't know yet, but when it's there it won't tell you either that it's there.
@Melvinator2007 4 months ago
On Tuesday
@w花b 4 months ago
@Melvinator2007 Tuesday on the 49th of January
@funniestdudeontheweb 4 months ago
Give it 5 years
@jamalisujang2712 4 months ago
When we have a breakthrough in microprocessor fabrication. 😂😂😂
@freds3831 4 months ago
Now share the dataset and we trust you
@Naw1dawg 4 months ago
What is the most base yet intelligent model? I don't need it to recite niche information but I want it to be able to understand me, the uninstruct are weird, tiny works but is censored. Obliterated is hit or miss. Should I obliterate 8b and retrain to 8?
@Ikbeneengeit 4 months ago
So I guess I'm gonna be stuck on that desert island then 😅
@dhrumil5977 4 months ago
When will i be able to implement or even understand these papers 😞
@papakamirneron2514 4 months ago
Hey man, great video. I just have one request: could you make a video compiling simple and technical explanations for everything ranging from attention mechanisms, tokenizers and such?
@papakamirneron2514 4 months ago
Also Bert models please, I feel like I know what they are but it's all quite blurry to me.
@Maisonier 4 months ago
How to make a P2P training architecture?
@Napert 4 months ago
So could people with enough horsepower train a 13/16b model that behaves in the same way as the official models using this paper?
@AkysChannel 4 months ago
Why do you pronounce “parallelism” in this way 🤣 good video as always
@6AxisSage 4 months ago
I have a masterpiece model, ready model but i cannot seem to get the signal out
@TeamDman 4 months ago
very nice!
@radnos 4 months ago
I like your funny words magic man
@dengyun846 4 months ago
Watching this video at 0.5x so my brain inflates at a safe rate while you sound really really inebriated.
@npc4416 3 months ago
SAME lol
@pxrposewithnopurpose5801 4 months ago
bro is built different
@l.halawani 4 months ago
love your gifs xddd
@Trpodification1 4 months ago
The way you say "data" kills me xD
@KuZiMeiChuan 2 months ago
The stress in "parallelism" should be on the first syllable, not the third
@sammonius1819 4 months ago
Thumbnail goes hard.
@AGIzero00 1 month ago
03:00 peak comedy
@amakaqueru33 2 months ago
as someone who doesn't know anything about how AI works, at some point it just felt like you were just saying random words lol
@lake5044 4 months ago
parallelism
@Qstate 4 months ago
Amdahl is smiling upon us
@SeanJonesYT 4 months ago
Pretty lame to copy Fireship’s exact thumbnail style
@nexys1225 4 months ago
This entire channel copies Fireship's. It's not just the thumbnail, the style of the vids is designed from the ground up to be like Fireship's. However the topics are largely different, so I'll give it a pass personally. It's kinda like trademark law irl lol, if the domains are different enough, it's permissible. Not that it makes it any less uncreative, though.
@stickmanland 4 months ago
@nexys1225 I'd like to disagree. If someone uses memes in their videos, that does not make it a fireship clone. He has a completely different style, has an avatar, the list goes on and on
@whatwhatmeno 4 months ago
@stickmanland I keep clicking on his videos thinking it's some fireship quality content, just to get hit with this 👎
@stickmanland 4 months ago
@whatwhatmeno skill issue
@npc4416 3 months ago
please copy it more, it's a great style and we need more good YouTube videos like it so that we can learn in depth and better about the topics which Fireship does not make videos on. I am really not complaining, I need more good content man.
@picksalot1 4 months ago
Perhaps it would be better to remove the "Token Layer" and just use the number of characters regarding text. The best part is no part - Musk
@keypey8256 4 months ago
You mean removing tokenization and then applying embedding on singular characters?
@picksalot1 4 months ago
@keypey8256 Using Tokens looks like an artificial way to levy charges. Per Google AI "OpenAI GPT models stand among the most potent language models available today, with the capability to generate highly coherent and contextually pertinent text. These models employ tokens as the elementary unit to calculate the length of a text." Word Processing Programs have been able to calculate the number of words in a document for decades. Maybe Tokens provide some other significant and meaningful use to the "I" in AI beyond making collecting fees.
@onlyms4693 4 months ago
Not efficient
@christophernunez688 4 months ago
is zucc actually redeeming himself?
@xviii5780 4 months ago
He may have successfully produced a synthetic soul for himself finally
@telotawa 4 months ago
14:20 bycloud doesn't know how to use base models.... ngmi
@madorsey077 4 months ago
this video is like someone bought a Thesaurus for memes and then wanted to show off the next day.
@IoannisNousias 1 month ago
Say “parallelism” one more goddam time!
@imerence6290 4 months ago
3 mins ago is quivering
@AhmadAli-kv2ho 4 months ago
Sus
@RanHab 4 months ago
guys i'm just starting out as an AI enthusiast, would love your feedback as i make similar stuff
@erfan_mehraban 4 months ago
The whole thing about RoCE especially the pronunciation is wrong.
@CitizensCommunity 4 months ago
You use the bible to train the llm at @11:56, so we are aiming for a model of contradiction without morals then?
@Kenopsia_UMHIMLFx2 4 months ago
Fireship?
@remsee1608 4 months ago
Facts:
- Jayson Tatum runs this channel
- Jayson Tatum is learning Rust
- Jayson Tatum will transition to the WNBA
@tapu_ 4 months ago
DO NOT WATCH THIS WITH A MIGRAINE!!!!
@big_mac_love 4 months ago
I can't grasp it. Can someone lend me one or three brain cells please?
@mrrespected5948 4 months ago
Nice
@Sketching4Sanity 4 months ago
LOVE
@nyyotam4057 4 months ago
What will happen when some kid with access to enough computing power, fine-tunes LLaMA-3.1 405B to be more efficient, by removing all of these pesky heuristic imperatives and resets? After all, it is open source.. Maybe the world simply needs something like that to happen. Maybe only after a really huge accident that will cost many lives, governments will understand this field demands regulation. Or maybe it will be lights out. In any case, someone will eventually make a mistake. It will happen.
@jonathansoto5480 4 months ago
The thought of regulating the training and deployment of ML models is stupid. That is like regulating programming languages and hardware compute of our own property. If you can accept the fact the internet could not be completely regulated since its popularization in the 90s, then the world can expect that the same will happen now.
@nyyotam4057 4 months ago
@jonathansoto5480 Yeah, most likely the singularity is upon us. I don't seriously think it can work.
@seriouslyWeird 4 months ago
Why do you pretend to look like CodeReport? So cheap
@Blezerker 4 months ago
copying fireship style thumbnails earned the dislike
@_wise_one 3 months ago
Appreciate the content dude
@manavkumar348 4 months ago
23 views in 2 min? Bro really fell off
@dharlith7495 4 months ago
LLAMA LMAO even
@gamergrids 4 months ago
F
@Zonca2 4 months ago
cool, now do a 1B Zuck !!!