Mamba Might Just Make LLMs 1000x Cheaper...

Views: 133,492

Comments: 314

@bycloudAI 10 months ago

Check out HubSpot's ChatGPT at work bundle! clickhubspot.com/twc and I missed out a paper that combined mamba & attention, you can find it here: arxiv.org/abs/2402.04248 what an interesting timeline this is

@eleven-nine 10 months ago

JJK edit after joe mama joke. This video is a masterpiece.

@MisterDivineAdVenture 10 months ago

Somebody (I nominate you) needs to chart knowledge half-life and relate to the landfill and SciFi and "human uselessness". Somehow this deserves at least multiple theses. If there was time.... I think that would be more telling than Kurzweil's Exponential Tech Acceleration chart. "The Matrix is a place to hide and play games...."

@skyhappy 10 months ago

Can you share the video assets for the anime edit? I wanna try using it in the future. Also how did you get the voice?

@Nakamako1 9 months ago

mark your vids where the examples are. i for example always skip straight to examples when im learning

@xthesayuri5756 10 months ago

Bro dropped the hardest LLM anime edit and thought we wouldnt notice

@a_soulspark 10 months ago

7:24 if you want to experience AI lobotomy

@zolilio 10 months ago

Wasn't expecting that AT ALL 🤣🤣🤣

@JacksonMarshal 10 months ago

This why he got a new sub 😂

@JorgetePanete 10 months ago

wouldn't*

@C2WasAlreadyTaken 10 months ago

Bruv I know. I made a clip and immediately shared. I almost feel like all technical information should be conveyed this way.

@sleepingArtist 10 months ago

😂 I did not expect The JJK EDIT and died laughing

@Itachi_Uchia1 10 months ago

Whats JJK??

@guiAI 10 months ago

@@Itachi_Uchia1 Jujutsu Kaisen or smth like that

@tomerhorowitz4779 10 months ago

@@Itachi_Uchia1 Jujutsu Kaisen

@AkysChannel 9 months ago

Yes 🤣 So on point

@635574 10 months ago

7:20 the greatest LLM anime of all time begins(JJK is within 2 letters of LLM)

@WoolyCow 10 months ago

and by lack of competition...sadly the worst as well

@revimfadli4666 5 months ago

LuLutsu Maisen

@Shivumgrover 1 month ago

@@revimfadli4666

@OxidoPEZON 10 months ago

That anime edit was one of the sickest media pieces I've seen, but unfortunately I have no friends in the intersection of jujutsu enjoyers and AI research connoisseurs who would appreciate it wholly

@maoulegna 10 months ago

Yeah, when I saw it I was like: I need to show it to... Wait who will ever understand it among my friends? No one

@RllXeron 10 months ago

But U have Us in comments section so we can laugh together 😂

@AshT8524 10 months ago

No one: Mamba: Nah! I'd win.

@wlockuz4467 10 months ago

"Stand proud Transformer, you are strong." - Mamba

@JoshuaEworo 10 months ago

didn’t expect lobotomy kaisen to make its way to the LLM and AI space😭😭 best thing ever

@awesomebearaudiobooks 10 months ago

In the early 2000s, here in Russia, Mamba was a very popular dating site. Good to hear they are now at the frontier of AI development!!!

@MilkGlue-xg5vj 10 months ago

Just exactly like how YouTube used to be a dating site too! The story repeats itself.

@16876 10 months ago

also 'cope' (the word in the thumb) is 2019-ish 4chan troll word; this video is nostalgic in many aspects!

@ponponpatapon9670 10 months ago

@@16876 isn't it ironic how Twitter started abusing the FUCK out of 4chan lingo

@MilkGlue-xg5vj 10 months ago

@@16876 No one asked nerd

@Spyblox007 10 months ago

I love seeing this trend of LLMs getting quicker and using less resources. I think we are only a few breakthroughs away from a point where LLMs can begin running on mobile devices locally at reasonable speeds. Right now companies are spending major resources on making the models smarter through the models themselves. However, make the model small and quick enough, and you could run it multiple times, prompted by hard-coded logic, to possibly accomplish the same things as the larger models without the need for as much power or space (at the cost of time). This could allow an LLM to exist on robots without being connected to a service. The technology is in the works for quick instruction following for robots, so an LLM being able to feed the robot instructions makes the robot self guiding, which would be a sight.

@lelouch1722 10 months ago

"Exponentially" should stop being misused for everything that is bigger than linear... Quadratic != exponential

@losttale1 10 months ago

what is x^3?

@varungawande9321 10 months ago

It isn't wrong to call a square a rectangle.

@nanubalagnanasai3006 10 months ago

@@WoolyCow Quadratic is polynomial, Exponential is exponential.

@WoolyCow 10 months ago

@@nanubalagnanasai3006 yeah mb, i must be stupid lol

@jonathanduck5333 10 months ago

@@losttale1 cubic

@lordm31 10 months ago

throughout youtube clickbait and interesting facts you are the honored one

@razieren7025 10 months ago

this is crazy, everytime i click i think im going to watch a fireship vid

@pattyspanker8955 10 months ago

bait

@ambinintsoahasina 10 months ago

The Gojo reference made me shout out loud like a little fangirl :')

@XashA12Musk 10 months ago

can you give me the original source of that anime edit ?

@ambinintsoahasina 10 months ago

@@XashA12Musk the manga is Jujutsu kaisen. He took from multiple parts. You can check chapter 75 for the "throughout the heavens and the earth" and chapter 221 for the "nah, I'd win"

@xthesayuri5756 10 months ago

it scales quadratically not exponentially

@bycloudAI 10 months ago

oops i meant it metaphorically that was a bad word choice lol

@Alice_Fumo 10 months ago

I guess ² is still an exponent by which we're scaling here. I'm sure anyone watching this video will know what was meant. A correction should probably still be made.

@LeoVital 10 months ago

@@Alice_Fumo But exponential means that it scales according to the nth power. n² is polynomial, better than linear but not exponential like 2ⁿ.

@triforce42 10 months ago

Came here to say this. Quadratic and exponential is a huge difference.

@xxlvulkann6743 9 months ago

@@LeoVital *worse than linear
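
For anyone following the quadratic vs exponential back-and-forth in this thread, here is a quick sketch of how far apart the growth rates really are. The sequence lengths are arbitrary picks for illustration, and the "cost" is just a unitless operation count, not a benchmark of any real model.

```python
# Unitless "operation counts" for three growth rates.
# The sequence lengths below are arbitrary values chosen for illustration.

def linear(n: int) -> int:
    return n

def quadratic(n: int) -> int:
    return n ** 2   # polynomial growth (n squared)

def exponential(n: int) -> int:
    return 2 ** n   # exponential growth, a different class entirely

for n in (10, 20, 40):
    print(f"n={n}  linear={linear(n)}  quadratic={quadratic(n)}  exponential={exponential(n)}")
```

At n=40 the quadratic count is 1,600 while the exponential count is already over a trillion, which is why the two words shouldn't be used interchangeably.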

@nawabifaissal9625 10 months ago

naaah lobotomy kaisen is taking over everything i swear 💀💀💀💀

@zrakonthekrakon494 10 months ago

Just for a few months, then all the lobotomies will forgor

@avizi_ 10 months ago

The last thing I expected was a jjk edit

@Artorias920 10 months ago

awesome breakdown. When the other AI hype channels asked bycloud if he could go head to head with their surface level analysis, bycloud responded "Nah, I'd win" (DEEPFRIED BASS)

@david_n_nettey 10 months ago

I was not expecting to get a lobotomy while watching an LLM news video today...

@rileyretzloff8778 10 months ago

this was the most entertaining and somehow equally educational llm/ai video i’ve ever watched

@MemesnShet 10 months ago

Yes, but how much would it cost to port something like GPT-4 to Mamba? Can they even do that, or would they have to start from scratch? It probably won't be the only architecture to come out, so I imagine OpenAI is waiting for something that is very clearly way better than transformers in almost all categories

@wakannnai1 10 months ago

They'd probably have to start from scratch. However it could take much less time to train. If it takes 1/1000 of the time to train with similar real world performance, it might become worth it. Transformers are proven while novel architectures like mamba are not. OpenAI is selling chatgpt after all so it may not be worth it for them.

@JazevoAudiosurf 10 months ago

i have some hope for byte mamba but the architecture has drawbacks and seems more like an intermediary step before something greater that builds on it

@NickMak-m2c 10 months ago

My greatest hope is that it can really run 6B models like they're 2B. Was that for training or for actually running them? If it's for running them, then even the 40B param issue won't matter for local models; most consumer computers would gladly take 40B models that run like they're 20B

@dualasus12 10 months ago

Ngl the thumbnail tricked me, thought it was a fireship video, but it worked lol and I’m still watching.

@sarveshpadav2881 10 months ago

same!!

@bazookaman1353 10 months ago

I'm not sure if everyone reading this knows, though with cloud's audience it's probably most. Going from something exponential to something linear is a GODSEND. The title says 1000x, take that as you will, but even if it's just 10x, due to how exponentials work this would still save way more than 1000x in the future, because if computational costs increased exponentially they would go way out of control, but with linear it's way WAY more manageable. If this is real, it will be a complete game changer.

@tyler.walker 10 months ago

Technically, LLM context length increases quadratically, not exponentially.

@verigumetin4291 10 months ago

don't get your panties wet. It might just be smoke until tested to see if it works with extremely large models. But the thing about it understanding vision better, can't they just codify data into visuals and then have the LLM train on that, then you build another LLM that can translate the input and output and the problem is solved? But then again, even with the vision based MAMBA, large models still haven't been tested so who knows.

@ckpioo 10 months ago

@@verigumetin4291 the thing with that method is that yes, it's possible, but the problem is similar to MoE

@hoodie_cat 10 months ago

You gained a like due to the high-quality of your video, however, when the JJK edit dropped, you gained my subscription and my worship. 🛐

@jaywessex5818 9 months ago

Dude that was such a sick anime cut in clip. How did you make that? D that script? All ai?

@tannenbaumxy 4 months ago

I come back to this video regularly just because of that sick JJK edit! Oh and also because it has the perfect balance between technical explanation and entertainment.

@Taddy_Mason 10 months ago

Ngl, my bro is the Jay-Z of LLM education...out here dropping bangers.

@andrewshort6440 10 months ago

Such a great video, and FUNNY! Glad you're making these

@Cdaprod 10 months ago

Thanks for the papers in the description, I just put them in a urls=[] and hydrated my s3 and vdb with them 5:42 😎

@EvertvanBrussel 10 months ago

I just want to see a 7B mamba model trained on the same data as a 7B transformer model and get to try them both and test them on certain abilities.

@megachelick 10 months ago

many of us want it too

@chamba149 10 months ago

If you were one of the professors at my school I would never miss a class lol. You are great at breaking down concepts and making it funny. Keep up the good work!

@arjundeshmukh8773 10 months ago

I just wish to be this talented- amazing video

@mati_5555 10 months ago

Just wait until they discover Mamba No.5. There will be no going back...

@KodandocomFaria 10 months ago

What if there is a 70B Mamba? Could it surpass existing ones? I haven't seen any comparison of a Mamba 70B with any other big model. Perhaps that would be a decisive analysis to see how it performs

@Spyblox007 10 months ago

I assume that bigger models are still in the works. Most attention is on transformers-based models, so the money and resources to train a 70b model for Mamba are taking longer to gather. I'm definitely looking forward to seeing what becomes of it though.

@Steamrick 10 months ago

Dangit, now I have Lou Bega's Mambo No. 5 stuck in my head!

@sla8tful 10 months ago

I have no idea how LLMs work and I still understood some of it and the implications. Which is to say, this is a great video my dude.

@pareak 10 months ago

Your videos are just the best. Humor and knowledge in its best combination

@janniksco 10 months ago

Thanks!

@AlanMeigs 10 months ago

Woooooowwww, 8ish minutes in was a mic drop I didn't expect. First time here, not the last.

@gemstone7818 10 months ago

that was a great part at 7:21

@voidnull4282 10 months ago

Straight up stealing fireship viewers with these thumbnails ☠️

@felipearrais5415 9 months ago

I completely agree but at least the context doesn't let down

@princejona 10 months ago

GOD LEVEL EDIT, WHO IS THIS YOUTUBER?

@MeowEngineer 10 months ago

Hardest AI channel 🪨 🤘 subbed...

@bluemamba5317 9 months ago

00:27 🔍 Self-attention mechanism in Transformers enables advanced text completion but struggles with basic arithmetic; companies integrate calculators to mitigate issues.
02:29 🔄 Mamba, a new model, addresses Transformer inefficiencies by scaling linearly, not quadratically, and doesn't rely on the attention mechanism.
05:32 🚀 Mamba potentially offers 1000x cheaper scaling than GPT-4, with quadratic scaling improvement and faster calculations, revolutionizing AI chatbots.
07:10 🧬 Mamba's long sequence handling benefits DNA modeling, audio synthesis, and analyzing high-resolution images or long-form videos.
10:29 💻 MambaByte model learns directly from raw bytes, eliminating tokenization biases, enhancing long sequence comprehension, and potentially enabling true multimodal models.
12:48 ⚠ Potential downside of Mamba: "Lost in the Middle" issue may lead to permanent loss of information in very long contexts; further research needed to address this.

@Deductive 10 months ago

This video is where I will start with my thesis.

@yash1152 10 months ago

5:02 i am all in for non-sound memes. at least it doesnt make me a weirdo when watching without earphones in an open space.

@chanxo643 10 months ago

the JJK edit was INSANELY funny!

@Dr_Birthday 10 months ago

The LLM anime edit earned my subscription

@ChinchillaBONK 10 months ago

07:24 : This should be clipped and meme-d until all of mankind's Mamba stood erect and proud

@s3r0tav 10 months ago

the jjk edit killed me, bro i love you

@berkeokur99 10 months ago

Bro the JJK edit is superb

@lilstringcheese 10 months ago

I cannot explain how much I enjoyed that edit

@RaaynML 2 months ago

11:45 why can't we force it to process numbers as sequences of digits in certain contexts?

@phizc 10 months ago

5:39 It's only 1000 times cheaper because the price was per 1000 tokens. If it was per 4000 tokens, it would get 4000 times cheaper, and so on. 😊
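
The arithmetic behind this point, as a rough sketch: if one approach costs on the order of n² and the other on the order of n, the ratio between them is just n, so the "x times cheaper" figure depends entirely on the sequence length you pick. Constant factors are ignored here, which is an assumption; real costs won't be this clean, and the function name is made up for illustration.

```python
def cost_ratio(n_tokens: int) -> float:
    """Ratio of a quadratic cost (n^2) to a linear cost (n), ignoring constants."""
    quadratic_cost = n_tokens ** 2
    linear_cost = n_tokens
    return quadratic_cost / linear_cost

# Sequence lengths chosen only to illustrate the trend.
for n in (1_000, 4_000, 32_000):
    print(n, cost_ratio(n))
```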

@absence9443 10 months ago

The edit made me understand better than without it :D

@zainkhalid3670 10 months ago

7:22 Fire 🔥🔥🔥🔥🔥

@qwertyuuytrewq825 10 months ago

finally understood why AI cannot tell how many letters N the word banana contains )

@Hollowed2wiz 10 months ago

damn, I just tested it with gpt4, and it said that there are 3 n in banana. It's so funny 🤣

@qwertyuuytrewq825 10 months ago

@@Hollowed2wiz what is interesting is that gpt4 can spell this word one letter at a time if asked and then give the right answer about the letter N count. So it seems that despite tokenization gpt4 knows something about spelling...
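
A small sketch of why the two views in this thread differ. At the character level, counting letters is trivial. A token-based model instead sees chunks, and the split shown here is hypothetical (made up for illustration, not the output of any real tokenizer), but it shows how individual letters can end up hidden inside larger units.

```python
word = "banana"

# Character-level view: every letter is visible, so counting is trivial.
letters = list(word)
n_count = letters.count("n")

# Token-level view. This split is HYPOTHETICAL, invented for illustration;
# a real tokenizer may split the word differently or not split it at all.
hypothetical_tokens = ["ban", "ana"]

print(letters, n_count)
print(hypothetical_tokens)
```

Spelling the word out one letter at a time, as described above, effectively hands the model the character-level view.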

@pablitocodes 8 months ago

I've been attending parties with the mamba method forever. I may have been doing that wrong.

@emmanuelikhide8998 6 months ago

Yoo that LLM'S Anime edit hit hardddd😂😂

@synthclub 10 months ago

7:25mins in woah!!! I want all of my tech news delivered in this format!! excitingly eccentric, suspense filled, noir comic animations with deep, rich & sexy actor voice overs.. so cool! @bycloudAI

@jarkkop1004 10 months ago

VMamba has updated the scaling chart at 9:35. Performance keeps increasing with increased model size, but not much

@huangjason6557 10 months ago

Didn't expect to see some jojo and gojo references in an AI model video, awesome!

@lordkacke6220 7 months ago

Bro. You make these LLM videos so interesting and funny. How do you come up with these? Keep it up

@jasonhemphill8525 10 months ago

13:15 that’s the most passive paper name I’ve ever seen

@maxvg9161 10 months ago

Great video, I already heard about Mamba, but didn’t get into it myself! The lobotomy Kaisen edit went really hard haha. Any chance you will be making a video about Liquid Neural Networks? Keep up the good work :)

@pebre79 10 months ago

Great content. Keep up the great work! I subbed

@SianaGearz 10 months ago

As an electrical engineer: REAL TRANSFORMERS HAVE WINDINGS.

@bloodcraver9183 10 months ago

I would not have commented if it wasn't for that "nah I'd win" Mamba edit

@rallyworld3417 10 months ago

Plz tell me how you select gifs for videos, it's so precise that I doubt it's made by chatgpt, no human

@Gatrehs 10 months ago

This and the LPU that Groq uses are going to be insane together.

@julianvandenberg2002 10 months ago

That thing was still created purely for a Transformers style model. They will need to make a new arch

@nullbeyondo 10 months ago

I expected "Are your hidden states linear because you're a State Space Model? Because I can't seem to figure out your next move." :D

@ln2deep 10 months ago

Dude... We are dangerously close to a bearish breakout.

@BrianMosleyUK 10 months ago

Wonder what Google are doing with 1M - 10M+ context Gemini Pro 1.5?

@anren7445 7 months ago

the LLM anime edit was fucking gold

@ginqus 10 months ago

These comments make me feel stupid 😭 All I understood is that mamba is faster than traditional transformer thing (the party explanation was awesome, thank you), and that mamba abandons the tokenization and uses... something else instead... I kinda wish you would summarize the video at the end for silly creatures like me But for now it's time to rewatch everything!! 🥳

@joey199412 10 months ago

I wonder if OpenAI switches to Mamba architecture if they will drop the "GPT" branding since technically the T will not apply anymore. I wonder if "GPT" will be like how Boomers would call every game console a "nintendo" and just used by the mainstream to mean every LLM, no matter the underlying architecture.

@LowestofheDead 10 months ago

Lol Nintendo, but doesn't GPT = General Pre-Trained, not "-Transformer"?

@Kinatera. 10 months ago

the video was good and i liked your style, but then the JJK edit dropped actual fire 🔥 i hope you don't mind me reposting the JJK edit section of your video tell me if you want me to take it down

@joannot6706 10 months ago

I alone am the subquadratic one.

@aminzarei1557 10 months ago

Sooner or later we would need a bytes-level model architecture for multi-modality. Hope the test result for this one be good🙏 . Cool video btw 👌

@tristotech 10 months ago

I do love Mamba for faster token/sec. But there's still a long road to make it able to extract key information from long text. For now it still feels like Bart or Gemma 2b for short prompt

@richardnunziata3221 10 months ago

Without a seriously upscaled Mamba it will be going nowhere except for niche areas

@GeekOverdose 10 months ago

Lobotomy kaisen edit was PEAK

@danisekoi 10 months ago

that gojo edit has to be the hardest thing ive seen in years wtf were you on when that came to your mind

@question_mark 10 months ago

bro that anime thing was just incredible

@cefcephatus 10 months ago

I imagine Transformer guys are seeing MAMBA as a big 2 way transformer and yank it in the Transformer architecture forming multi-architecture transformer model.

@jichaelmorgan3796 10 months ago

Imagine if the middle eastern investors invested trillions in obsolete technology

@codersanInHoodie 10 months ago

bro the content on this channel is for such a niche audience only 5 people on YouTube will get all the memes and understand the science

@jefersonlemos4135 10 months ago

yeah, I understand you mamba, I too have a "lost in the middle" problem

@chfr 10 months ago

Cool reference to the router edit

@BosonCollider 10 months ago

Even if this doesn't replace transformers, this looks like a very promising way to replace tokenization/word vectors by having a layer read the bytes and output vector tokens
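
To make "read the bytes" concrete, here is a minimal sketch of what byte-level input looks like: text becomes integers in the range 0..255 with no tokenizer vocabulary involved. It only covers the input side. The helper name is made up for illustration, and how a model like MambaByte actually embeds or processes these values is not shown here.

```python
def to_byte_ids(text: str) -> list[int]:
    """Turn text into a list of integers in 0..255 via UTF-8 encoding."""
    return list(text.encode("utf-8"))

print(to_byte_ids("banana"))  # one byte per ASCII letter
print(to_byte_ids("é"))       # a non-ASCII character takes more than one byte
```

The trade-off is that sequences get longer than with tokens, which is where an architecture that scales well with length would matter.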

@nosferiazafora 10 months ago

This whole video is Gold.

@jeffg4686 9 months ago

Nice ! I wonder if this would be compatible with Groq's chip.

@akvartz 10 months ago

tldr by Perplexity: The video titled "Mamba Might Just Make LLMs 1000x Cheaper..." discusses a new AI architecture known as Mamba, which aims to significantly reduce the cost and improve the efficiency of Large Language Models (LLMs). Mamba differentiates itself from traditional Transformer models by utilizing a State Space Sequence Model (S4) and a selective mechanism, which allows for linear scaling and faster inference times. This architecture shows promise not only in language tasks but also in vision tasks, indicating its potential as a versatile tool in AI development. Mamba's approach to learning directly from raw byte patterns rather than tokenized text addresses the challenges posed by tokenization, such as distortion in text representation. This method enables Mamba to generate more accurate and coherent text, especially for long or complex sequences. Despite its advantages, Mamba faces challenges like information loss in long contexts. However, its introduction represents a significant advancement in AI architectures, potentially challenging the dominance of Transformers.
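
For the "linear scaling" part of that summary, a toy sketch of a generic linear state-space recurrence: the sequence is processed in a single pass while carrying a fixed-size state, so the work grows linearly with length. This is a scalar toy with fixed, made-up parameters (`a`, `b`, `c`), not Mamba itself; the selective mechanism the summary mentions makes parameters depend on the input, and that is not modeled here.

```python
def ssm_scan(xs, a=0.5, b=1.0, c=1.0):
    """Run h_t = a*h_{t-1} + b*x_t and y_t = c*h_t over a sequence in one pass.

    a, b, c are illustrative constants, not values from any real model.
    """
    h = 0.0            # fixed-size state carried between steps
    ys = []
    for x in xs:       # one step per input, so cost is linear in len(xs)
        h = a * h + b * x
        ys.append(c * h)
    return ys

# An impulse followed by zeros: the state decays by a factor of a each step.
print(ssm_scan([1.0, 0.0, 0.0]))
```

Because everything has to fit through that fixed-size state, it is also a reasonable intuition for why information from far back in a long context can fade.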

@jarrod752 10 months ago

So this is like when the LLM learns the 5 finger touch of death?

@EobardUchihaThawne 10 months ago

what about ring and flash attention? don't gemini or sora use different attention than mh attention to get 7M tokens?

@Said-n7o 10 months ago

So they need a model where the AI, instead of paying attention to the whole page, pays attention to the most important parts. For example, it summarizes a page for itself and discards the excess text, retaining only the summary? Is my guess even in the same universe as what this video is intending to say?