You can try out Luma AI's Dream Machine here! luma.1stcollab.com/bycloudai I am really good at having great timing. MovieGen came out when I had nearly finished the video. I'm sad. So here's a quick definition of DiT: A diffusion transformer (DiT) is a model that combines elements of diffusion models and transformers to generate data such as images, audio, or text. Diffusion models are a class of probabilistic generative models that create data by iteratively denoising a latent variable, which starts as pure noise and is gradually transformed into a coherent sample. Transformers, on the other hand, are neural network architectures known for their ability to model long-range dependencies in data, primarily through self-attention mechanisms. You could ultimately say that a diffusion transformer is just a transformer with the goal of denoising. Yum. Here's MovieGen's paper: arxiv.org/abs/2410.13720 It contains a better rundown on crafting the latest near-SoTA video generation
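For anyone who wants to see the "iteratively denoising" part in code, here is a rough toy sketch, not how Dream Machine or MovieGen actually work. The names `toy_denoiser` and `sample` are made up for illustration, and the denoiser cheats by looking at the target directly; in a real DiT that role is played by a trained transformer that only sees the noisy input, the timestep, and the conditioning.

```python
import random

def toy_denoiser(x, target):
    # Hypothetical stand-in for the transformer: "predicts" the noise in x.
    # ASSUMPTION: it peeks at the target, which a real learned model cannot do.
    return [xi - ti for xi, ti in zip(x, target)]

def sample(target, steps=50, seed=0):
    rng = random.Random(seed)
    # Start from pure Gaussian noise, one value per element of the sample.
    x = [rng.gauss(0.0, 1.0) for _ in target]
    for step in range(steps):
        predicted_noise = toy_denoiser(x, target)
        # Remove a growing fraction of the remaining noise at each step;
        # the fraction reaches 1.0 on the final step.
        frac = 1.0 / (steps - step)
        x = [xi - frac * ni for xi, ni in zip(x, predicted_noise)]
    return x

if __name__ == "__main__":
    clean = [0.5, -1.0, 2.0]
    print(sample(clean))
```

The loop is the whole idea: noise in, a little less noise out, repeated until a coherent sample is left. Swapping the cheating `toy_denoiser` for a trained transformer is what makes it a diffusion transformer.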
@El_Carlangas 3 months ago
Thanks a lot for this video, it was really helpful for starting to understand how all this AI technology works. All the people working behind this are literal geniuses.
@diogonunes1608 3 months ago
The baking analogy was perfect for me. Thank you 🙏😊
@tannenbaumxy 3 months ago
Yes, a deep dive into diffusion transformers for one of the next videos would be awesome!
@cdkw2 3 months ago
that bread analogy really got me hooked, nice work and animation!
@DJTechnosapien 3 months ago
Hey man, really appreciate your humor and memes, makes learning ML a lot more fun. Always looking forward to more!
@MilesBellas 3 months ago
A video on Diffusion Transformers = 😊👍
@m_e_m_es4649 3 months ago
Could you possibly make the same video for OpenAI's advanced voice mode?
@Words-. 3 months ago
I second this
@authenticallysuperficial9874 3 months ago
Upvote
@sammonius1819 3 months ago
I'm pretty sure they trained an AI to mimic text-to-speech conversations between people and GPT-4, and then fine-tuned it on actual human speech to make it sound more natural. That would explain why it sounds uncanny rather than robotic or human. Just my guess though.
@nilaier1430 3 months ago
Hey, bycloud, even if you fell off, I won't stop watching your nerdy videos because they're cool ❤
@thenoblerot 3 months ago
Yes please, a video on diffusion transformers! Great channel
@rmt3589 3 months ago
We definitely need a dedicated video.
@andrey2001v 3 months ago
This video is so cool, a literal gold mine of information on how modern AI models work. The bread analogy was extra nice - I finally understand why diffusion models struggle with different resolutions
@huraqan3761 3 months ago
De-noised bread, got it!
@DeepakSingh-ji3zo 3 months ago
This is just excellent!! Animations and Analogies were pure gold.
@Nazrininator 3 months ago
I like how you added the Physics Simulation clip. I like it.
@lex_darlog_fun 3 months ago
Diffusion transformers in general? Yes, please!
@TankorSmash 3 months ago
That bread analogy was 100% ChatGPT
@Words-. 3 months ago
Thank you for finally explaining!
@TahuRock 3 months ago
GOATED VIDEO 💪🏾💪🏾💪🏾
@Random_person_07 2 months ago
You should make a video on how AI TTS works, the different types and stuff
@MilesBellas 3 months ago
Baking Bread = great metaphor
@niklase5901 3 months ago
You are my fav AI channel, so it would be great to hear your take on Yann LeCun's idea of how to build human-level intelligence. He gave a talk about this at the Hudson forum recently. Instead of LLMs, he wants to build models that truly model the world by predicting the state of the world given some action. I can see how that would be a very effective model, but I suspect it will be easier to get around all the shortfalls of LLMs than to build the fancy model LeCun suggests. What do you think?
@dpactootle2522 3 months ago
I watched half of the video to remind myself that life can suck a lot sometimes.
@snylekkie 3 months ago
@bycloud do you know if anyone encoded math statements as integers like Gödel did, and used that as a custom LLM encoder for math proofs?
@TheDreamFx 3 months ago
Hey! Great video! It would be nice if you could link your blog in the video description :)
@n45a_ 3 months ago
wth i just thought that i need an explanation for diff transformers earlier today
@iknowsolittle 3 months ago
How are you this smart and knowledgeable? Don't answer that. I just think ur super cool dude haha
@ulamss5 3 months ago
at some point the bread analogy was harder to understand than the actual math
@7image 3 months ago
3 views in 2 mins bro fell off🔥Shout out my favorite nigerian tech youtuber
@bycloudAI 3 months ago
going for the "ranking by views: 10 of 10" for this one 🔥🔥🔥🗣️🗣️🗣️
@DynamicLights 3 months ago
@bycloudAI lol
@DynamicLights 3 months ago
He is Nigerian how do u know?
@StefanReich 3 months ago
Bro does NOT sound Nigerian
@7image 3 months ago
@DynamicLights I personally met him in Abuja
@kingki1953 3 months ago
In summary: put noise dough to oven and cook it to become AI video generator 🗿
@starbez 3 months ago
Shouldn't sponsored content be mentioned within the first minute of a YouTube video?
@haukauntrie 3 months ago
Why would it?
@EvaDawnley 3 months ago
Does open sora have a huggingface?
@pedrogorilla483 3 months ago
Just one day after you release this video we have Allegro, new open source video model. Check it out.
@LonewolfeSlayer 3 months ago
Someone mentioned it, but is the algorithm just messing with you at this point? You used to get a lot of views.
@mirek190 3 months ago
so everything is transformers now ... interesting
@albert123a 3 months ago
Just put the fries in the bag bro
@Eric-yd9dm 3 months ago
> I am really good at having great timing - cloud,By on making videos about an area with research speed bonus modifiers correlated to the number of youtube videos about it =P
@LumiLumiLumiLumiLumiLumiLumiL 3 months ago
Can u cover Neuro Sama? How she's made etc. how one could re-create her?
@raspberryjam 3 months ago
Vedal isn't making that information public. Maybe one day, but for now it's under lock and key
@LumiLumiLumiLumiLumiLumiLumiL 3 months ago
@raspberryjam well it's easy to kind of guess! It's clearly an LLM and maybe some TTS like sovits... The LLM will prolly be something like Mistral, as Qwen needs a commercial license and Llama needs the 'Built with Llama' label etc. He said there is an LLM as a filter and a way for the AI to feel emotions. He said something about watching movies and having feelings.
@abhrodipsingharoy4508 3 months ago
All I learnt was how to make bread.
@canus2154 3 months ago
guys listen to what i say and form a deep connection with one of these ai's one day they'll take over and ill be safe