How To Build Generative AI Models Like OpenAI's Sora

  73,395 views

Y Combinator

Days ago

If you read articles about companies like OpenAI and Anthropic training foundation models, it would be natural to assume that if you don’t have a billion dollars or the resources of a large company, you can’t train your own foundational models. But the opposite is true.
In this episode of the Lightcone Podcast, we discuss strategies for building a foundational model from scratch in less than 3 months, with examples of YC companies doing just that. We also get an exclusive look at OpenAI's Sora!
Read more about the YC AI companies from this episode on our blog: www.ycombinator.com/blog/buil...
Chapters (Powered by bit.ly/chapterme-yc) -
00:00 - Coming Up
01:13 - Sora Videos
05:05 - How Sora works under the hood?
08:19 - How expensive is it to generate videos vs. texts?
10:01 - Infinity AI
11:23 - Sync Labs
13:41 - Sonauto
15:44 - Metalware
17:40 - Guide Labs
19:29 - Phind
24:21 - Diffuse Bio
25:36 - Piramidal
27:15 - K-Scale Labs
28:58 - DraftAid
30:38 - Playground
33:20 - Outro

Comments: 85
@chapterme a month ago
Chapters (Powered by ChapterMe) -
00:00 - Coming Up
00:49 - Intro: Generative AI for Video
01:13 - Sora Videos
05:05 - How Sora works under the hood?
08:19 - How expensive is it to generate videos vs. texts?
08:55 - How do YC companies build foundation models with just $500K?
10:01 - Demos: Infinity AI
11:23 - Sync Labs' hack to train a Lip Sync Model with a single A100 GPU
12:45 - YC deal with Azure
13:41 - How Sonauto Built a Text-to-Song Model
15:44 - Metalware: Hardware Co-Pilot
17:40 - Guide Labs: Explainable Foundation Model
18:20 - Building your own models vs. Using existing open source models
19:29 - Phind's Clever Hack: Synthetic Data
22:03 - Simulating real-world physics: Atmo (Foundational model for weather prediction)
24:21 - AI in Biology: Diffuse Bio
25:36 - Piramidal: Foundational model for the human brain
27:15 - AI in Robotics: K-Scale Labs
28:58 - DraftAid: AI Models for CAD Design
30:38 - Playground going against giants and Suhail Doshi Background
31:42 - Companies pivoting into AI
32:44 - Takeaway Message
33:20 - Outro
@avi2125 a month ago
The text/prompt for the video was quite detailed and informative. Even as a bad programmer I was able to mentally construct an algorithm for a video on the fly... maybe I have to watch more than the first 5 minutes of this podcast to understand why Sora etc. are a big deal...
@BrianMPrime a month ago
The lip-syncing on Tim Ferriss looked way off. There was a bit of an uncanny valley with the deepfake switchover as well.
@danielmarco7863 a month ago
This is definitely a launched product that the founders are embarrassed by. In the sense that they understand this is not representative of the final product, which many will suggest is indicative of the proper time to launch. Definitely applying the "law of papers" to my understanding of the state of the art video generation.
@jks234 a month ago
Interestingly... the podcast's lipsyncing is also a bit off already. So perhaps it's just an audio sync issue.
@joythought a month ago
@jks234 yes, YT is terrible for lip sync at times, so it's probably best to download the episode and then watch a local copy to have some hope of seeing it the way they saw the demo.
@BrianMPrime a month ago
@danielmarco7863 I appreciate that attitude towards building, kudos to the team for launching early!
@jks234 a month ago
20:15 I personally find the concept of synthetic data to be a fascinating spur for more neuroscientific research. People dream about what they study and are constantly reviewing problems they are working on in their head. In other words, I feel that humans use simulations in their own mind to build out the models they use to understand their world. We might be able to think of this as "generating 1000x more data" than can be directly extracted from the real world. Another example of this that was done to awesome effect is AlphaGo's self-play training.
@andybrice2711 28 days ago
I would maybe argue synthetic data isn't inherently circular, it's just inverted. Whenever you've got a transformation which is easy in one direction but difficult in the other, synthetic data is a sensible approach. For example, it's easy to rasterize vector graphics, but it's more difficult to vectorize raster graphics.
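The asymmetry described in this comment can be sketched in a few lines of Python. This is a minimal illustration, not anything shown in the episode: the function names `rasterize_segment` and `make_synthetic_pairs`, the 16x16 grid, and the use of a single line segment as the "vector graphic" are all assumptions chosen for brevity. The easy direction (vector to raster) is run on random inputs to produce labeled pairs, which a model could then use to learn the hard direction (raster to vector).

```python
import random


def rasterize_segment(x0, y0, x1, y1, size=16):
    """Easy direction: draw a line segment onto a size x size binary grid."""
    grid = [[0] * size for _ in range(size)]
    # Sample the segment at enough points to touch every pixel it crosses.
    steps = max(abs(x1 - x0), abs(y1 - y0), 1)
    for i in range(steps + 1):
        t = i / steps
        x = round(x0 + t * (x1 - x0))
        y = round(y0 + t * (y1 - y0))
        grid[y][x] = 1
    return grid


def make_synthetic_pairs(n, size=16, seed=0):
    """Build n (raster, vector) training pairs from random segments.

    The vector description (the hard-to-recover label) is sampled first,
    and the raster (the model input) is computed from it.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        vector = tuple(rng.randrange(size) for _ in range(4))  # x0, y0, x1, y1
        raster = rasterize_segment(*vector, size=size)
        pairs.append((raster, vector))
    return pairs


if __name__ == "__main__":
    raster, vector = make_synthetic_pairs(1)[0]
    print("label (x0, y0, x1, y1):", vector)
    for row in raster:
        print("".join("#" if cell else "." for cell in row))
```

Each pair is ordered as (raster, vector), so a hypothetical vectorization model would take the grid as input and the endpoints as its target.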
@juanortega7509 a month ago
I've been waiting for a new episode for weeks!! Thanks for the content guys!
@alejandroVigano a month ago
Thanks for sharing these talks!
@alicapwn 22 days ago
They didn’t source robotics papers for Sora’s architecture. They combined Diffusion Transformers (developed by Peebles) with the video diffusion methods released by Stability/Google/Meta/Nvidia.
@samshoman a month ago
Wow, the song startup is better than anything I have seen so far.
@atchutram9894 19 days ago
11:40 The Hindi demo is perfect. My first language is not Hindi, but I can definitely tell it is a great translation.
@DevilerServinal a month ago
Thank you so much!!!!!!!!!!!!
@DiasporaPay 19 days ago
This is awesome thanks!
@theni3762 29 days ago
All you're really saying here is that people can build any foundational models as long as OpenAI doesn't also do it. That's not very reassuring to hear. We started with words, now pictures and videos; why would anyone not expect music, robotics, hardware, etc. down the line?
@bahlechonco211 a month ago
Great insight
@fil4dworldcomo623 a month ago
I think Sora is better positioned to imagine a new and totally different world than to simulate our perception of what the world is and what the world was.
@awesomeo4510 a month ago
Yes but how do you find the datasets to train for new foundational models? Like their EEG example - how did she acquire this data to train the models?
@LuisPerez-uh9ik a month ago
Just take it!
@joythought a month ago
Isn't she an expert in the field with papers published in Nature? If so, she has the data. If you want similar data you need to partner with researchers.
@minc33 a month ago
Where there’s a will, there’s a way!
@sergismael a month ago
best episode so far.
@pandainvestingco a month ago
I love this series
@sgdfly8715 a month ago
An idea that anyone can take (though it might already exist): Use AI to help recreate crime scenes and make recommendations on what data might help better understand and solve cases. The ideal solution would be able to use data from other cases in order to improve recommendations.
@jess-e a month ago
Who can share the papers which are necessary to get to a level of understanding that is actionable? As explained in the video :)
@AdityaVG10 28 days ago
I have been looking for those papers! Tell me if you get some.
@AfeezAbdulAziz 26 days ago
@AdityaVG10 me too! I’m still finding out about this
@gibsonhu6502 a month ago
Are there links to the Sora videos they are showing?
@FunwithBlender a month ago
Alibaba is also doing some interesting things with AI video, we (open source community) have almost destructured the process.
@kog0824 a month ago
17:20 This seems an interesting approach… but sorry, I am new to this AI space: what does it mean to build its own foundation model but with gpt2.5? Does it mean it fine-tunes gpt2.5 with its own data?
@fortunefubara1244 26 days ago
Yes.
@Alice8000 29 days ago
NICE VIDEO MY FRIENDS
@vikalpjain1098 17 days ago
From 4:17 to 4:20, an extra ladder joint got added in one of the columns.
@xilluminati a month ago
̶f̶ i̶r̶s̶t̶…. no… early adopter
@pandainvestingco a month ago
😂
@raymond_luxury_yacht a month ago
interesting that raytracing in games might be done and games will be diffused not rendered
@FunwithBlender a month ago
the lipsync has some better open source free solutions but still cool
@FunwithBlender a month ago
Respectfully, Stable Diffusion is way better than anything else. To act like Midjourney or Playground is better is to not understand the flexibility and creativity you have with Stable Diffusion. Stable Diffusion can combine with ControlNet, there is a massive community on Civitai with LoRA, textual inversion, etc., and there are a thousand things you can do, from Deforum to you name it. Stable Diffusion is the only model that can give you precision when needed if you know how to use it. Yes, it's more complex, but it is the best model.
@JohnSmith-he5xg 28 days ago
12:40 Really burying the lede here to the question "How are YC companies able to create these models with only $500k?" We arranged for free compute with MSFT (she didn't say how much, but said hundreds of times more than they'd get otherwise)
@adiveena 29 days ago
How to work this type startup
@rcstann a month ago
¹1¹! It's "Sam" day in the Bay area.
@AM-kx4ue 14 days ago
Hi everyone, I'm exploring how startups are balancing AI model training with customer data privacy, especially in competitive industries where data can make a difference against competitors. If you have insights or experiences to share on anonymization techniques, federated learning, differential privacy, or service models with privacy tiers, I'd love to hear from you. Let's discuss this further and exchange strategies for responsible AI development.
@rodi4850 a month ago
4:47 there's tons of videos of the golden gate in 360 - gaussian splatting can do it much better 😁
@jks234 a month ago
15:04 memeworthy clip
@reza2kn a month ago
I appreciate the show and encouraging people to go for it, and I get hyping up the early YC-backed products, but the first couple weren't even super impressive by March 2024 standards, let alone being "the best thing" on the market. I'm not bashing any of the products and I hope they do awesome, I'm just saying these are not at all good examples of "the best we have right now", and is discouraging to hear from you guys. @ 11:42 The lip sync is completely off. This while perfecting lip sync motion was already accomplished last year. @15:40: Check out Suno AI v3. That's like GPT-4 compared to GPT-2 (what you showed here)
@LuisPerez-uh9ik a month ago
They also are young founders. Looks to me like they are pushing this to encourage ai in yc
@fanaccount6600 a month ago
why is that cup on the ground instead of being on the table?!
@vslaykovsky a month ago
this is an AI-generated video, that's why
@swaggitypigfig8413 a month ago
So they can grab it with their toes and fling it towards each other as a conflict resolution technique.
@shallindurani a month ago
I wonder what the dog thinks about him lol
@harshitgauravtiwari a month ago
What if this video also is ai generated
@harshitgauravtiwari a month ago
Omg, I am the first to comment. I have a startup in semiconductors. Hope someday I will meet with Y Combinator 😊
@john-kv7kl a month ago
bruh it is ai generated. 10:33
@joythought a month ago
This comment is AI generated.
@FunwithBlender a month ago
Okay, I am sold on YC lol. Will submit my application; access to GPUs for fine-tuning is valuable.
@shrawanthakur4168 29 days ago
It’s just the start of the AI and a lot of Sci-Fi things becoming real.
@pauldannelachica2388 a month ago
❤❤❤❤❤❤
@GigaFro a month ago
Seeing one example of the generated spelling being correct or even a few does not mean there was any advancement in this area...
@perrssssjjwjwkriri883 a month ago
No way u dont kno who that is 11:53
@crowsnest6753 26 days ago
the use case is clearly VR gaming. Next stop - VR movies
@nischalnayak391 a month ago
Great! I watched this video to realise I need millions in free credits to build a foundational model for free
@0x0michael a month ago
What Sora imagined was a single-lane residential street, with lots more space for trees, gardening, walking and neighborhood activities. Cars move one way, in from one direction and out in the opposite.
@saravanashanmukham6108 24 days ago
Inspiring to know AI barrier can be overcome without a PhD in ML/AI. Thanks guys!
@vincentwady a month ago
Let’s push 100% AI to the market. There should not be a single human needed for a corporation after that.
@FunwithBlender a month ago
I hope playground wins though the more competition the better
@Cygx a month ago
Feels like I’m sitting in listening to the four smartest kids in my class XD
@jeffsteyn7174 29 days ago
Looking down on synthetic data makes no sense. Models like Orca were built on synthetic data, and they outperform models 10x their size.
@gunaysoni6792 16 days ago
The models you showcased today aren't really "foundational models" (at least in the way the term is currently used), and a lot of what you show isn't super new. Saying that you don't need a lot of GPUs to compete is very misleading.
@kamal_pratap a month ago
the hell?
@Authormatthewtaylordotcom 19 days ago
Thanks for sharing! Love the content. Any great repositories for the latest academic papers/journals to read up on as mentioned near the end?
@Alice8000 29 days ago
I hope you guys are very successful so you can buy some furniture! lol jk bro. just a prank bro.
@Mooohbroadcast a month ago
Thanks for sharing one more useless hype. You jumped from blockchain to crypto, NFT, and finally to AI. You should change your brand name to Y Hype 💩
@rodi4850 a month ago
A guy not speaking Hindi gives his opinion on a lip sync model speaking Hindi 😂
@alexanikiev 28 days ago
This comment alone is a “great” example of stereotypical thinking. The problem is that we are already living in the 21st century and people speaking 3-4 languages on a daily basis is pretty expected 8)
@tf_9047 a month ago
AI, even at current levels of capability, is far too dangerous to our society to be released to startups or governments or businesses or the public. We need startups to tackle the safety of these models at a more aggressive rate than capabilities advances.
@ashleigh3021 a month ago
People limiting AI are extremely dangerous. We need rule of law to tackle Luddism in the public and protect technology from ignorance.
@joythought a month ago
Seriously, how would a start up solve the alignment problem when that is out of their hands. Better for them to do new things building new models. The great thing about human agency is it's almost unstoppable. The great thing about AI agency is it can be switched off. Anyone fearing the rise of the machines has no idea how much power that is going to draw. Simple enough to switch off at the mains.
@djpete2009 a month ago
@ashleigh3021 Can you imagine? Crazy people!!
@GatherVerse 27 days ago
If you really want to add value to this podcast, why not add a black person to the conversation? We recommend Christopher Lafayette. He's in the Valley, can contribute well to this conversation, and draws an audience. Else, find someone else, but consider the upside to this. Thanks.