Thank you for also explaining Phenaki! I was curious about a non-diffusion model for video generation! 🎊
@davidyang102 2 years ago
Because it resumes generation from only a few frames, it loses context. Imagine generating a paragraph and then writing the next one using only the last word you generated. Luckily, images capture a lot of information, so it's not that obvious. But, for example, you can't make a video that looks around 360 degrees if it's generated in two iterations. Very dreamlike.
@federicolusiani7753 2 years ago
Thank you for your video, great content as always! One question: in the video, you say that the video encoder is auto-regressive, so that it can be used on an arbitrary number of video patches. But aren't standard transformer encoders already able to process inputs of arbitrary length? Usually the auto-regressive architecture is used in the decoder, because at inference time we need it to generate the output causally. Am I missing something?
@AICoffeeBreak 2 years ago
Thanks for this great question. Transformer sequence length is an interesting topic, which we've discussed here already: kzbin.info/www/bejne/jqnXpGSfqc2opqs Basically, even if it can generate / take in variable-length input, it still has a predefined maximum input / output length due to practical limitations (compute time and memory). You are asking whether a causal model could not generate infinitely long video, and -- for practical reasons -- the answer is no. Unmodified causal attention means attending to the whole generated past, so for very long sequences the attention window grows linearly while computation time and memory grow quadratically. Because of limited compute time and memory, we cannot generate indefinitely, unless one applies tricks like the Phenaki authors do with MaskGIT, attending to only a small fraction of the tokens of the past generated output.
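The scaling argument in the reply above can be sketched numerically. This is an illustrative toy only, assuming a fixed attention window of 512 tokens; the helper names are made up, not taken from Phenaki's code:

```python
def full_attention_cost(seq_len):
    """Score-matrix entries when every token attends to the whole past:
    grows quadratically with sequence length."""
    return seq_len * seq_len


def windowed_attention_cost(seq_len, window):
    """Score-matrix entries when each token attends to at most `window`
    past tokens: grows only linearly with sequence length."""
    return seq_len * min(seq_len, window)


for n in (1_000, 10_000, 100_000):
    ratio = full_attention_cost(n) / windowed_attention_cost(n, 512)
    print(f"{n:>7} tokens: full attention is {ratio:.1f}x more expensive")
```

Going from 10k to 100k tokens multiplies the full-attention cost by 100 but the windowed cost only by 10, which is why attending to only a small fraction of past tokens is what makes open-ended generation feasible.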
@Handelsbilanzdefizit 2 years ago
Maybe there will be a way to visualize memories and dreams by using electroencephalography (EEG) and neural networks. Then you could see what others think, or see what others see through their eyes.
@mrinmoybanik5598 2 years ago
Good luck collecting the training dataset 🙂
@johnkintner 2 years ago
Researchers have already used fMRI to do something similar! That was a while ago :D
@rewixx69420 2 years ago
I really want infinite video generation with diffusion models.
@AICoffeeBreak 2 years ago
Soon. Just give Google some time to mount more TPUs in their racks. 😅
@AICoffeeBreak 2 years ago
twitter.com/_akhaliq/status/1595645248243650560?t=PHepVXOP40pPdc5q3upUbQ&s=19 What about this? I didn't look into it.
@summary7428 1 year ago
Great video, but I think it was wrongly placed in your (awesome) diffusers playlist =)
@AICoffeeBreak 1 year ago
You are right, it is not a diffusion model; it's about content generation. 😅 I was more comfortable with it being in this playlist (especially as the last video in the row) rather than nowhere near its fellow competitors. But sure, I do not have the Paella video in the list, although Paella can be argued to be a diffusion model. I need to clean up.
@elev007 2 years ago
Great explanation- thank you 🙏
@barberb 2 years ago
Thank you, Letitia!
@TheGatoskilo 1 year ago
I wonder how they pad the video tensors with variable sequence lengths.
@AICoffeeBreak 1 year ago
Do you see this as problematic?
@TheGatoskilo 1 year ago
I just wonder about the implementation level: for these padding values, as well as the masked tokens, did someone decide that we fill these tensors with 0s? Does it matter what we fill those vectors with? And if the padded/masked 0s overlap with actual 0s in the data, how do we effectively instruct the model to disentangle the masked values from 0s belonging to the actual data?
@TheGatoskilo 1 year ago
@@AICoffeeBreak No, I just wonder how it works at the implementation level.
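For what it's worth, the usual answer to the question in this thread is that the fill value does not matter, because an explicit mask removes padded positions from the attention softmax before they can influence anything. A minimal sketch in plain Python, not Phenaki's actual implementation:

```python
import math


def masked_softmax(scores, mask):
    """Softmax over real positions only: padded slots are set to -inf
    before the exponent, so they receive exactly zero attention weight,
    regardless of what value the padding was filled with."""
    masked = [s if m else -math.inf for s, m in zip(scores, mask)]
    peak = max(masked)                       # numerical stability shift
    exps = [math.exp(s - peak) for s in masked]
    total = sum(exps)
    return [e / total for e in exps]


scores = [2.0, 1.0, 0.0, 0.0]   # last two entries are padding, filled with 0
mask   = [1, 1, 0, 0]           # 1 = real token, 0 = padding
weights = masked_softmax(scores, mask)
# weights[2] and weights[3] are exactly 0.0: padding never leaks into the output
```

Because padded scores are replaced with -inf before the exponent, they contribute exactly zero weight, so a padded 0 can never be confused with a real 0 in the data; the mask, not the fill value, carries that information. MaskGIT-style models additionally use a dedicated mask-token embedding for positions the model should predict, which is separate from padding.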
@TimScarfe 2 years ago
Amazeballs ❤
@AICoffeeBreak 2 years ago
Thanks, Tim! 😅 Happy to see MLST release content after what feels like a long time!
@sadface7457 2 years ago
Hello MsCB 👋
@AICoffeeBreak 2 years ago
Hello, Sad Face! Haven't seen you in the comments in a long time! 👋
@DerPylz 2 years ago
Woohooo! Sad Face is back! 🎊
@VivaLaRevoMW3PS3 2 years ago
Boring video, there are already like 100 of this kind. People want to know where they can use it, or how they can use it.