Pixel Shuffle - Changing Resolution with Style

7,050 views

Animated AI

8 months ago

Patreon: / animated_ai
Animations: animatedai.github.io/#pixel-s...

Comments: 31
@salmiac-3105 · 8 months ago
Would've loved an example image for the pixel shuffle too, to really grasp what is happening
@ziggycross · 8 months ago
Was just about to leave a comment to say this! Was waiting for some example images, would be great to keep in mind for future videos!
@ELjoakoDELxD · 8 months ago
@@ziggycross I wanted some images too. I didn't fully understand what the output is going to be with pixel shuffle. Edit: grammar will always be difficult for me
@djmips · 8 months ago
It's not working on actual pixels. The 'depth' or input to the shuffle is the feature maps generated from the low res image and it's at this last stage that the image is upsampled. This is in contrast to older methods that would upsample the image straight away and then try and process that into the super resolution output which was both less efficient and potentially introduced the artifacts mentioned in the video. For more information see the paper referenced in the video. "Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network" by Shi et al.
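The rearrangement djmips describes can be sketched in a few lines of NumPy (a minimal illustration with made-up shapes, not the video's code; PyTorch's `nn.PixelShuffle` uses the same channel ordering):

```python
import numpy as np

def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) feature map into (C, H*r, W*r).

    Each group of r*r channels is scattered into an r x r block of
    output pixels, so the upscaling happens on the feature maps at
    the very end of the network, not on the input image.
    """
    c_r2, h, w = x.shape
    c = c_r2 // (r * r)
    # (C, r, r, H, W) -> (C, H, r, W, r) -> (C, H*r, W*r)
    return x.reshape(c, r, r, h, w).transpose(0, 3, 1, 4, 2).reshape(c, h * r, w * r)

# four 3x3 feature maps become a single 6x6 map
feats = np.arange(4 * 3 * 3).reshape(4, 3, 3)
out = pixel_shuffle(feats, r=2)
print(out.shape)  # (1, 6, 6)
```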
@johnpope1473 · 2 months ago
@@djmips - now I understand. thanks
@VFXVideoeffectsCreator · 8 months ago
I have to say, that's really awesome! Especially the hint that transposed convolution is just the gradient computation of convolution w.r.t. its inputs. I regularly contribute to the backends of Deep Learning Frameworks in the Julia Programming Language, and transposed convolution (or deconvolution, or some freaky way to say it: fractionally strided convolution) is really just a function call to the function calculating the adjoint (gradient) of a normal convolution (except output_padding, but this just affects the size calculation anyway).
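That adjoint relationship is easy to see when convolution is written as a matrix multiply: transposed convolution is literally multiplication by the transpose of that matrix. A toy 1-D sketch (sizes chosen arbitrarily for illustration):

```python
import numpy as np

# Write a stride-2, size-3 1-D convolution as a matrix: each row
# places the kernel at one output position (length-7 input, no padding).
k = np.array([1.0, 2.0, 3.0])
n_in, stride = 7, 2
n_out = (n_in - len(k)) // stride + 1
M = np.zeros((n_out, n_in))
for row in range(n_out):
    M[row, row * stride:row * stride + len(k)] = k

x = np.arange(n_in, dtype=float)
y = M @ x                  # forward convolution
g = M.T @ np.ones(n_out)   # gradient w.r.t. x = transposed convolution
print(g)                   # uneven overlap counts = the gridding pattern
```

Multiplying by `M.T` is exactly what the adjoint-of-convolution function computes, and the uneven entries of `g` are the gridding pattern discussed elsewhere in this thread.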
@chrisminnoy3637 · 8 months ago
Thanks. I've already been using this for quite some time in my super-resolution upscaler. A downside of the TensorFlow implementation, as far as I know, is that you can only use square factors, but it would make sense to also do it in just one dimension, or more generally as a rectangle. Some work to be done there ...
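The built-in ops (`tf.nn.depth_to_space`, PyTorch's `PixelShuffle`) do take a single square block size, but the rectangular/1-D variant is just a different reshape. A hypothetical NumPy sketch with separate per-axis factors:

```python
import numpy as np

def pixel_shuffle_rect(x, rh, rw):
    """Rectangular pixel shuffle: (C*rh*rw, H, W) -> (C, H*rh, W*rw).

    Setting rh=1 (or rw=1) gives a purely one-dimensional shuffle.
    """
    c_r, h, w = x.shape
    c = c_r // (rh * rw)
    return x.reshape(c, rh, rw, h, w).transpose(0, 3, 1, 4, 2).reshape(c, h * rh, w * rw)

# 3 channels shuffled only along the width, factor 3
y = pixel_shuffle_rect(np.zeros((3, 4, 5)), rh=1, rw=3)
print(y.shape)  # (1, 4, 15)
```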
@AdmMusicc · 17 days ago
Loved the animation, thank you!!
@kevalan1042 · 8 months ago
Beautiful work as always
@Biuret. · 8 months ago
Great content. Thank you!
@j________k · 8 months ago
Great series! Keep it up :)
@HinaTan250 · 8 months ago
This is really cool! 😄 Thanks for the information.
@Firestorm-tq7fy · 2 months ago
One of the best channels! I wish you'd cover more topics than only CNNs, but I guess you can't be a top pro in every topic. I definitely subbed, and I wish you already had way more videos. But I can see that it takes a lot of time and effort, so I will wait. Thank you so much for this work ❤
@ScottzPlaylists · 7 months ago
👍 You make awesome illustrations ❤ Can you explain Transformer encoding and inference? ❓ That would be a big hit too. 👏
@SrDlay · 8 months ago
thanks for your effort
@oberstvontoffel · 6 months ago
this video should have way more likes...
@I77AGIC · 1 month ago
This made it make a ton of sense. But one problem: pixel shuffle does not get rid of the artifacts; it introduces its own artifacts.
@coryfan5872 · 2 months ago
Hi, isn't this virtually the same effect as a stride-2, 2x2 transposed convolution with the output channels just being 4 times smaller? It's a convolutional filter with some binary weights that maps each pixel's channels to some new channels. The aforementioned transposed convolution would be the same if you just had a linear layer before the pixel shuffle.
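Since stride equals kernel size, the placements never overlap, so a 1x1 convolution followed by a 2x2 pixel shuffle does match a stride-2, 2x2 transposed convolution whose weights are just a rearrangement of the 1x1 weights. A small NumPy check of that equivalence (arbitrary example sizes):

```python
import numpy as np

rng = np.random.default_rng(0)
cin, r, h, w = 2, 2, 3, 3
x = rng.standard_normal((cin, h, w))
w1 = rng.standard_normal((r * r, cin))        # 1x1 conv weights, C_out = 1

# Path A: 1x1 convolution, then 2x2 pixel shuffle
conv = np.einsum('oc,chw->ohw', w1, x)        # (4, 3, 3)
ps = conv.reshape(1, r, r, h, w).transpose(0, 3, 1, 4, 2).reshape(1, h * r, w * r)

# Path B: stride-2, 2x2 transposed convolution with rearranged weights
wt = w1.reshape(r, r, cin).transpose(2, 0, 1)  # (cin, 2, 2)
tc = np.zeros((1, h * r, w * r))
for i in range(r):
    for j in range(r):
        # stride == kernel size, so placements never overlap
        tc[0, i::r, j::r] = np.einsum('c,chw->hw', wt[:, i, j], x)

print(np.allclose(ps, tc))  # True
```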
@chrisminnoy3637 · 8 months ago
Would be nice to have a video about the Tensor-Train technique
@Erosis · 8 months ago
Do you have a paper or resource about the artifacts in the gradient when using strided 3x3 convs?
@animatedai · 8 months ago
If you accept that transposed convolution (kernel size=3, stride=2) produces gridding artifacts in the output image, then by definition standard convolution (kernel size=3, stride=2) produces gridding artifacts in the input image gradient. The reason is that transposed convolution is implemented as a literal call to the gradient function of standard convolution in TensorFlow and PyTorch.

I learned this at some point studying the papers and code of the StyleGAN saga (nvlabs.github.io/stylegan2/versions.html). I wish I could narrow it down more for you, if you're trying to cite this; I have a feeling I learned it from reading their code or one of their references. You'll notice that in all the versions of their code, they go out of their way to implement downsampling as a blur -> convolution rather than just a plain strided convolution. StyleGAN3 is all about aliasing.
@coryfan5872 · 2 months ago
It's probably because some pixels overlap the convolutional filter only once (the ones in the centers), some pixels overlap the filter 2 times (the ones on the sides but not the corners), and some pixels overlap the filter 4 times (the ones in the corners). I wonder if using ConvNeXt's 2x2 convolutional layers still results in this sort of gradient artifact.
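Those overlap counts are easy to tabulate: slide a 3x3 window with stride 2 and count how often each input pixel is touched (this is the transposed convolution of an all-ones gradient with an all-ones kernel). Away from the borders the counts form the 1/2/4 checkerboard described above:

```python
import numpy as np

# Count how many 3x3, stride-2 kernel placements touch each input pixel.
n = 9
counts = np.zeros((n, n))
for oy in range(0, n - 2, 2):
    for ox in range(0, n - 2, 2):
        counts[oy:oy + 3, ox:ox + 3] += 1

print(counts[3:6, 3:6])
# [[1. 2. 1.]
#  [2. 4. 2.]
#  [1. 2. 1.]]
```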
@azatahmedov4308 · 3 months ago
Can you explain how you would pixel_unshuffle if the resolution is 4000x3000 (WxH) and the downscale_factor is 16?
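For what it's worth, pixel_unshuffle is the same reshape/transpose run in reverse, and it needs H and W divisible by the factor. 4000 is divisible by 16, but 3000 is not (3000 / 16 = 187.5), so a 4000x3000 input would have to be padded or cropped first. A NumPy sketch (padding 3000 up to 3008 is one arbitrary choice):

```python
import numpy as np

def pixel_unshuffle(x, r):
    """Inverse of pixel shuffle: (C, H, W) -> (C*r*r, H//r, W//r)."""
    c, h, w = x.shape
    assert h % r == 0 and w % r == 0, "H and W must be divisible by r"
    x = x.reshape(c, h // r, r, w // r, r)
    return x.transpose(0, 2, 4, 1, 3).reshape(c * r * r, h // r, w // r)

# (C, H, W) with H padded 3000 -> 3008 so that 16 divides both axes
x = np.zeros((3, 3008, 4000), dtype=np.uint8)
u = pixel_unshuffle(x, 16)
print(u.shape)  # (768, 188, 250)
```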
@yinwong667 · 2 months ago
But why is it necessary to do pixel shuffle? Why can't we just output an rH x rW x 3 matrix directly?
@fqidz · 8 months ago
There's zero explanation about how this would work with real images
@djmips · 8 months ago
It's not working on actual pixels. The 'depth' or input to the shuffle is the feature maps generated from the low res image and it's at this last stage that the image is upsampled. This is in contrast to older methods that would upsample the image straight away and then try and process that into the super resolution output which was both less efficient and potentially introduced the artifacts mentioned in the video. For more information see the paper referenced in the video. "Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network" by Shi et al.
@ucngominh3354 · 8 months ago
hi
@user-tp6fd7ky1q · 1 month ago
This reminds me a bit of sub-pixel interpolation
@krum3155 · 8 months ago
jif
@grugiv · 8 months ago
yes
@muthukamalan.m6316 · 8 months ago
Super cool. Waiting for Transformers and BN, LN