Understood a little bit this time, and it's impressive how the parallax and the robotic arm control work.
@AICoffeeBreak 8 months ago
I love your perseverance!
@DeepFindr 8 months ago
So we can expect a coffee bean game soon? ;-)
@AICoffeeBreak 8 months ago
That would be so cool! DeepMind, give us access!
@sasankpotluri4422 8 months ago
Thank you very much for an amazing video. This looked very close to the V-JEPA paper.
@AICoffeeBreak 8 months ago
Yes, it totally reminds me of what Yann LeCun says about the importance of pure observation (no embodiment or RL).
@harumambaru 8 months ago
What an amazing time to be alive! :)
8 months ago
I have both a video game and a machine learning background, so this research is extra curious to me. But I would take a more granular approach rather than try to generate a whole game from little data: I would instead convert a standard game creation pipeline into multiple generation and revision stages. What do you think, Letitia?
@AICoffeeBreak 8 months ago
Did you casually call 300,000 hours of YouTube data "little data"? I do not know if I understood that right, just wondering while being amused. :)
Now to the very good point you raised. Sure, game development is a pipeline, and instead of trying to generate everything at once (image, actions -> frames), one could help individual steps in that pipeline. For example, NVIDIA has neural nets that predict plausible object or game character animations. I am not advocating against deep learning helping individual steps in game development; for the present day, I think that is the way to go. But the aim for the long term would be to generate everything in one go (Genie style). It is hard and requires a lot of data, but when this succeeds (and I do not expect it to succeed within just a few years), it spares a lot of pipeline engineering and having to get things right at each step. This matters especially because pipelines suffer from error propagation: if the first step is wrong, the whole pipeline is wrong and hard to recover from. Imagine if we thought about image generation as a pipeline too: get the coloring right, then the lighting, then the textures, all in stages. With enough data, we do not need pipelines to break problems down into smaller ones. In the same way that diffusion models generate images or videos in one go, Genie could too, with actions as prompts.
Btw, Genie also makes the point Yann LeCun keeps making, namely that observing data alone (no RL or embodiment) can really discover important features, such as actions.
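To make the "image, actions -> frames" idea above concrete, here is a minimal PyTorch-style sketch of a Genie-like interface: a latent action model that infers a discrete action from two consecutive frames (no action labels needed), and a dynamics model that predicts the next frame from the frame history plus that action. All module names, layer choices, sizes, and the 8-action codebook are illustrative assumptions, not DeepMind's actual architecture (which uses spatiotemporal transformers over video tokens).

```python
import torch
import torch.nn as nn

# Sketch only: the real Genie components are ST-transformers over video tokens;
# the layers below are simple stand-ins with made-up sizes.

class LatentActionModel(nn.Module):
    """Infers a discrete latent action from two consecutive frame embeddings."""
    def __init__(self, frame_dim=512, num_actions=8):
        super().__init__()
        self.to_logits = nn.Linear(2 * frame_dim, num_actions)

    def forward(self, frame_t, frame_t1):
        logits = self.to_logits(torch.cat([frame_t, frame_t1], dim=-1))
        return logits.argmax(dim=-1)  # index into a small codebook of latent actions


class DynamicsModel(nn.Module):
    """Predicts the next frame embedding from the frame history and a latent action."""
    def __init__(self, frame_dim=512, num_actions=8):
        super().__init__()
        self.action_embed = nn.Embedding(num_actions, frame_dim)
        self.core = nn.GRU(frame_dim, frame_dim, batch_first=True)
        self.head = nn.Linear(frame_dim, frame_dim)

    def forward(self, frame_history, action):
        # frame_history: (batch, time, frame_dim); action: (batch,) integer latent actions
        x = frame_history + self.action_embed(action).unsqueeze(1)
        out, _ = self.core(x)
        return self.head(out[:, -1])


# "Playing" = prompting with a start frame and feeding latent actions step by step.
dynamics = DynamicsModel()
frames = torch.randn(1, 1, 512)        # a single prompt-frame embedding
for a in [0, 3, 3, 1]:                 # player-chosen latent actions
    next_frame = dynamics(frames, torch.tensor([a]))
    frames = torch.cat([frames, next_frame.unsqueeze(1)], dim=1)
```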
@AICoffeeBreak 8 months ago
I know that Genie's outputs today are very low resolution and extremely short. 😅 I just think of them as the first silly GAN-generated faces from 10 years ago. But wait 10 more years and this could really go somewhere.
8 months ago
@@AICoffeeBreak I meant little "input" data, not "training" data, of course. The less you provide, the less control you have over minute details. If I want to make an impactful game, I want lots of dials to turn. Just providing a prompt with a video would not be enough for commercial games for now. But in the future, maybe there will be a greater variety of inputs, and prompts might become like GDDs (game design documents); then it could generate hyper-casual games.
@AICoffeeBreak 8 months ago
Haha, now this makes sense. 😅
@DerPylz 8 months ago
Thanks so much for the explanation! I'm a bit sceptical about this approach and the use cases, but I'm excited to be surprised by where this will go in the future.
@AICoffeeBreak 8 months ago
What are you sceptical about exactly?
@DerPylz 8 months ago
@@AICoffeeBreak I think AI can work great for the procedural generation of background stuff in large open worlds, but I don't quite foresee it actually creating gameplay. Small games are usually either all about a specific gameplay idea or about telling a story (oftentimes with environmental storytelling), so letting an AI take over there would mean losing a lot of control (as even the best models today often still just ignore parts of the prompt at random). But maybe I'm a bit too narrow-minded here and I'll be surprised :D
@AICoffeeBreak 8 months ago
@@DerPylz Maybe gameplay is more promptable from text than we think. Of course, the source of entropy must come from somewhere, and we need human prompters to nudge it a bit in the direction it should go. But I estimate that a lot of the details would just be placed by the AI.
@Micetticat 8 months ago
This is a Genie-al architecture!
@AICoffeeBreak 8 months ago
@MachineLearningStreetTalk 8 months ago
🧠🤌💪
@AICoffeeBreak 8 months ago
@hannesstark5024 8 months ago
Thanks
@AICoffeeBreak 8 months ago
@tomoki-v6o 8 months ago
Reminds me of deep Q-learning. I don't know why.
@zenithparsec 8 months ago
It's a Game Genie. Or is that too obvious? Or too obtuse?