Join the waitlist now for exclusive access to the OnDemand platform: on-demand.io/contact. Let me know if you like these kinds of research breakdowns too! My newsletter: mail.bycloud.ai/
@khla.mp4 · 7 months ago
I'm glad we have someone to translate research language into memes and references for us
@animalnt · 7 months ago
Just need the soundboard and temple run in the corner and we skibidi
@Wagner-uv6yp · 7 months ago
@@animalnt Yeah like what is this guy's background lol he sounds like a college kid not a PhD person.
@animalnt · 7 months ago
@@Wagner-uv6yp "not a PhD person" 💀
@thebrownfrog · 7 months ago
Good thing I'm 15. Perfect channel for me
@revimfadli4666 · 7 months ago
Other than bycloud and fireship (and maybe 2 minute papers and yannic kilcher), is there another?
@SmallLanguageModel · 7 months ago
I heard people in the RWKV and EleutherAI Discords complain that they used the wrong hyperparameters for some of the other architectures, while using the optimal hyperparameters for xLSTM. So the results are not entirely honest and they're trying to hype up their own architecture, but what else is new...
@bycloudAI · 7 months ago
I was a tiny bit suspicious when xLSTM didn't publish code, but damn, okay
@w花b · 7 months ago
When money ruins research... But hey, it's been a thing for as long as research has existed.
@adamrak7560 · 7 months ago
The context extension comparison is practically wrong too, and highly misleading. If you train a transformer for a small context length without any extra extension tricks, it will perform really badly at large context lengths, by design. Comparing against that is simply not useful and misleads most readers/viewers. There are context extension tricks you can use that avoid the perplexity explosion for transformers; those would change the graph very significantly.
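For anyone wondering what such a context extension trick looks like, here is a minimal sketch of linear RoPE position interpolation (one common approach); the function names, the 2048-token training context, and the 8192-token target length are illustrative assumptions, not settings from the video or the paper:

```python
import torch

def rope_inverse_frequencies(head_dim: int, base: float = 10000.0) -> torch.Tensor:
    # Standard RoPE inverse frequencies for an attention head of size head_dim
    return 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))

def rope_angles(seq_len: int, head_dim: int, train_ctx: int = 2048, target_ctx: int = 8192) -> torch.Tensor:
    # Linear position interpolation: squeeze positions beyond the training
    # context back into the range the model saw during training, instead of
    # extrapolating to unseen positions (which is what blows up perplexity).
    positions = torch.arange(seq_len).float()
    if target_ctx > train_ctx:
        positions = positions * (train_ctx / target_ctx)
    return torch.outer(positions, rope_inverse_frequencies(head_dim))

# The resulting angles feed the usual cos/sin rotation of queries and keys.
angles = rope_angles(seq_len=8192, head_dim=64)
print(angles.shape)  # torch.Size([8192, 32])
```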
@nnnik3595 · 7 months ago
Also, when you actually look at the graph at 3:54 in detail, you can see that the presented combination of both, [1, 1], is worse than either of the previously existing methods on almost all metrics.
@DrW1ne · 7 months ago
Every video of yours is like my own Christmas! Keep it up.
@user-cg7gd5pw5b · 7 months ago
So many memes per second, my brain can't process everything.
@benjamineidam · 7 months ago
(> complex topic = > memes / s) = AWESOME!
@bobspianosbffl · 7 months ago
This video was so good, what the hell. Thank you for delivering this complex info in an easily consumable way. Subscribed
@ij9375 · 7 months ago
I am a simple man, I see King Baldwin, I read AI, I click thumbnail😂
@sanderbos4243 · 7 months ago
Kingdom of Heaven, my love
@whoami6107 · 7 months ago
Really like the way you make these videos, I couldn't have understood these things otherwise
@Meleeman011 · 7 months ago
I have first-hand experience with LSTMs and they are amazing. I can't wait to try this or architect a solution with it
@literailly · 7 months ago
Editing is amazing lol. Nice job, bycloud
@KW-jj9uy · 7 months ago
I wonder if this is good for real time robotics, where you get fast real time data in tiny chunks, and you need a fast model with memory
@OnDemandAI · 7 months ago
This OnDemand looks like quite the hoot!
@Lokmanne · A month ago
Impressive, xLSTM might actually be worth switching to from Mamba. Can you please make a sequel video to this comparing xLSTM, Mamba, Transformers, Jamba and Mamba-2? I am so desperate to know.
@initialsjd5867 · 7 months ago
I find this all extremely interesting, but I'm having a hard time finding the right way into understanding these topics. Does anyone have a suggestion on where to start?
@Detril2000 · 7 months ago
Start with a Neural Networks introduction course. Beyond that, currently your only options are to study this on your own or enter a Master's program in Computer Science, as all "courses" on LLMs are currently extremely dumbed down and mostly just go over how to type into ChatGPT.
@NicitoStaAna · 3 months ago
3b1b is a great start. You're going to need to understand matrix multiplication (okay, fine, it's just SUMPRODUCT in Excel). StatQuest is great for an overview/deep dive to clear up any false assumptions.
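If the SUMPRODUCT framing helps, here is a tiny, self-contained illustration of it; the numbers are arbitrary examples, not anything from the video:

```python
# Each entry C[i][j] of a matrix product is a sum of products ("SUMPRODUCT")
# of row i of A with column j of B.
A = [[1, 2],
     [3, 4]]
B = [[5, 6],
     [7, 8]]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

print(matmul(A, B))  # [[19, 22], [43, 50]]
```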
@aykutakguen3498 · 7 months ago
Interesting, thanks for the le vid
@alkeryn1700 · 7 months ago
All these cool architectures never actually get used in open models.
@pallharaldsson9015 · 7 months ago
3:33 The first line there is slightly wrong; the former sLSTM should apparently be mLSTM (no harm done, since it's 0). But more importantly, the ratio could, I guess, be e.g. [2:3], similar to hybrid Mamba/transformer models that mix different numbers of Mamba layers and transformer/attention layers. It would be interesting to know the optimal [x:y] ratio, and even whether it makes sense to also mix e.g. Mamba into this... As seen at 3:28, none of the ratios seem optimal. mLSTM has a d x d matrix, and that d could also be tuned, I suppose.
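To make the ratio idea concrete, here is a minimal sketch of how an [m:s] block ratio could be turned into a layer schedule; the function, the block names as strings, and the (2, 3) example are purely illustrative and not taken from the xLSTM paper's code:

```python
def block_schedule(num_blocks: int, ratio: tuple[int, int]) -> list[str]:
    # Repeat an [m:s] pattern of mLSTM and sLSTM blocks until the stack is full,
    # e.g. ratio=(2, 3) gives m m s s s m m s s s ...
    m, s = ratio
    pattern = ["mLSTM"] * m + ["sLSTM"] * s
    return [pattern[i % len(pattern)] for i in range(num_blocks)]

print(block_schedule(10, (2, 3)))
# ['mLSTM', 'mLSTM', 'sLSTM', 'sLSTM', 'sLSTM', 'mLSTM', 'mLSTM', 'sLSTM', 'sLSTM', 'sLSTM']
```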
@dominiksvestka1587 · 3 months ago
So could using only sLSTM blocks be superior to using LSTM blocks for predicting time series?
@Ludifant · 7 months ago
We have seen time and again that modeling things on what we see in humans helps. Now we are venturing into macroscopic structures like memory, where we actually have working theories of how people do it, which can give clues for smart architecture. GPT is brute forcing, LSTM is social engineering. It's not like either is guaranteed to get you in, but it is good to have both tools in the toolbelt, or both areas of expertise and viewpoints on your team. As the landscape keeps changing, the winds of fashion favour one or the other. When motion pictures came out, they didn't stop painting or sculpting; but when talkies came out, silent movies were left behind. We need to get past this obsession with 'better' and see that, as long as they differ enough, these are all valid strategies and the choice of tool depends on the nature of the problem. It helps if you understand your tools and your problem.
@ccash3290 · 7 months ago
Put some clouds in the background or something so people aren't confused by thumbnails that look similar to other creators'
@gualcasas528 · 7 months ago
that is the goal, for people to get confused and click on the video
@jonatan01i · 7 months ago
outperforms mamba AND TRANSFORMERS?!?!?!
@monad_tcp · 7 months ago
nothing outperforms Decepticons though
@AMA14700 · 7 months ago
Well, at this point companies should just stop training their AIs for a while to wait for the best architecture 😂
@w花b · 7 months ago
Reminds me of web development where there's always a new best framework and everything becomes "obsolete" every 2 months
@mfpears · 7 months ago
I wish I had time to research this stuff myself. This is exactly what I would be researching. Transformers seem limited in some fundamental way to me. I want to see something more recursive and dynamic. But these are impressions only since I haven't had money/time to really dive in yet.
@Alpha_GameDev-wq5cc · 7 months ago
LSTMs were all the hype before the Transformers dropped in 2017… Love to see the prodigal son return
@AntoshaPushkin · 7 months ago
"xLSTM" sounds so lame. Should have been "2 LS 2 TM" or maybe "LSTM: Tokyo Drift"
@nathanpotter1334 · 7 months ago
Back from 2017
@tom-et-jerry · 7 months ago
Very nice music!
@beyse101 · 7 months ago
I believe I just got Schmidhubered
@mawungeteye6609 · 7 months ago
Up next: xLSTM-Mamba hybrid
@olegpetrov2624 · 7 months ago
Superior meme taste. Thanks bycloud.
@cdkw2 · 7 months ago
With enough copium we sure can 🔥🔥🔥
@catoleg · 7 months ago
This amount of memes per second reminds me of the Bad Gear channel
@SamArmstrong-DrSammyD · 7 months ago
Hell yeah, sigmoid FTW!
@TheGiovany82 · 7 months ago
My brain ran out of RAM with this one 😂😂😂😂
@ZenchantLive · 7 months ago
Love your B Rolls lol
@anshulsingh8326 · 7 months ago
Ok, so how do I get started with AI development?
@BoHorror · 7 months ago
Try and predict the stock market
@mrrespected5948 · 7 months ago
Nice
@Kenneth_James · 7 months ago
What happened to the videos that showed examples of the best AI-generated images and video and gave a dummy like me an idea of what could be done with the current 'best'?
@JayDee-b5u · 7 months ago
A person really can't 'shill' their own thing. That's not the right word.
@andyizawsome · 7 months ago
i know some of these words
@renanmonteirobarbosa8129 · 7 months ago
Hopfield NN is the real GOAT
@cdkw2 · 7 months ago
The newsletter is fire though bro, thanks for that
@fmmmtmm · 7 months ago
What's an "architecture"?
@drdca8263 · 7 months ago
Structure of the network
@crackwitz · 7 months ago
Hoch-rei-ter Hoooch-reeeiii-teeer
@agenticmark · 7 months ago
funny. subbed
@butterbee2384 · 7 months ago
02:40 "beated" lmao
@JhonSabowsla · 7 months ago
bro the memes got me crying 💀😆
@MartinDxt · 7 months ago
To whoever thought or is thinking the AI revolution is dead: this is just the tip of the iceberg :D
@prenomnom2686 · 7 months ago
Pleeeeeease talk about YOCO 😢
@Ludifant · 7 months ago
Hoch-reiter means High Rider... Sooo he might be high on copium..
@newbie8051 · 7 months ago
Fucking hate language models, they have single-handedly shat on the entire community by making everyone focus on chatbots. Gone are the days when people used to showcase their work on some image data. Meh, I guess I just don't like language models that much. Transformers as an idea are really amazing; images are more intuitive for me, I can't grasp much from "embeddings for words". I'll try xLSTMs as regressors, it will definitely make a good project. Thanks for the video buddy 💖💖
@8jhj345gg · 7 months ago
Fireship clone?
@TwiceVisible · 7 months ago
Everyone copying the Fireship thumbnail style these days
@Adventure1844 · 7 months ago
The entire social media world is based solely on copy & paste style.
@Amasglobulaires · 7 months ago
Good video, but the music is badly mixed with the voice
@Pepcen · 7 months ago
Hmmm I sure wonder where he got the idea for this thumbnail format
@ucngominh3354 · 7 months ago
hi
@newyorthtimes4496 · 7 months ago
The way I filter papers nowadays is through code. If they come with a GitHub repo, I check them out. If not, they go straight in the bin.
@jasonhemphill8525 · 7 months ago
2nd
@lulboiking5806 · 7 months ago
Noice!👍
@JosephJair97 · 7 months ago
xd
@JAD3N · 7 months ago
This guy, copying Fireship thumbnails 😂
@andiiacob1627 · 7 months ago
That off-key, Minecraft-sounding music in the background is really distracting
@ScottSummerill · 7 months ago
Not watchable with all the ridiculous movie clips. Leaving.
@FaultyTwo · 7 months ago
Ah yes. LSTM: EXTREME EDITION! 🔥🔥🔥 HIDE YO KIDS, HIDE YO DATA. THIS RNN IS OUT THERE FOR BLOOD 🩸🩸
@monstercameron · 7 months ago
xLSTM BYTE when?
@xv0047 · A month ago
Too many GIFs; they make the video too distracting.
@jonnylukejs · 7 months ago
wtf, that's mine, I made the block matrix LSTM. That's lame that they made a paper and just claimed it.