What's great is I'll usually look up a tutorial and it'll say something like "A residual block does this," and I'm like, "Great... why?" This lecture put everything in perspective: it motivated the problems, showed the purpose of every step, and showed how each iteration solved the problems of the previous one. This is honestly a great lecture.
@nishantkshirsagar2150 · 1 month ago
This is one of the best lectures on CNN architectures.
@Nihit-n5n · 4 years ago
What a great documentary on CNN architectures. The slides are comprehensive and the lecturer (Dr. Johnson) knows his stuff inside out.
@baskaisimkalmamisti · 3 years ago
That was a truly impressive and informative lecture. I wish I could also hear his summary and comments on the latest developments up to 2021, such as NFNet.
@hasan0770816268 · 3 years ago
28:26 why use a 3x3 kernel in VGG
39:20 Google's way of dealing with kernel size
49:33 ResNets
@vaibhavdixit4377 · 4 months ago
Great resource, thoroughly researched + beautifully curated! Thanks a lot for the teachings!
@VikasKM · 3 years ago
Amazing lecture explaining the history of convnets from AlexNet to ResNets and MobileNets. It also gives us an idea of which network to use if we are custom-designing a convnet architecture for our own problem. The slides contain a ton of information. Thank you, Justin Johnson.
@Jppvv492 · 1 year ago
The formula for calculating the output size of a convolutional layer in a Convolutional Neural Network (CNN) depends on several factors:
1. Input size (W_in, H_in): the spatial dimensions (width and height) of the input image or feature map.
2. Filter size (F): the spatial dimensions (width and height) of the convolutional filter (kernel).
3. Stride (S): the step size at which the filter is applied, i.e., how far the filter is shifted across the input.
4. Padding (P): the number of pixels added to the input on all sides to preserve spatial dimensions after convolution.
The output size (W_out, H_out) of the convolutional layer is then:
W_out = ((W_in - F + 2 * P) / S) + 1
H_out = ((H_in - F + 2 * P) / S) + 1
If you want to maintain the spatial dimensions (W_in, H_in) of the input after convolution (i.e., no spatial downsampling), set the padding to:
P = (F - 1) / 2
This assumes the stride S is the same in both the horizontal and vertical directions; if you use different strides for width and height, the formula changes accordingly. It's worth noting that some frameworks and implementations use slightly different conventions for padding (e.g., 'valid' or 'same' padding), so check the documentation of the specific CNN implementation you are using.
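The formulas above can be sketched as a small helper (a minimal illustration; the function name and the example layer shapes are my own, with the second example using AlexNet-style conv1 numbers):

```python
def conv_output_size(w_in, h_in, f, stride=1, padding=0):
    """Spatial output size of a conv layer: ((W - F + 2P) / S) + 1."""
    w_out = (w_in - f + 2 * padding) // stride + 1
    h_out = (h_in - f + 2 * padding) // stride + 1
    return w_out, h_out

# "Same" padding at stride 1: P = (F - 1) / 2, so a 3x3 filter needs P = 1.
print(conv_output_size(224, 224, 3, stride=1, padding=1))   # → (224, 224)

# Larger filter with stride: 11x11, stride 4, no padding on a 227x227 input.
print(conv_output_size(227, 227, 11, stride=4, padding=0))  # → (55, 55)
```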
@sardorabdirayimov · 2 years ago
Truly amazing lecture, worth listening to many times.
@itchainx4375 · 1 year ago
1:08:22 Now 1K GPUs for several months is common for the big tech corporations.
@parkie0517 · 1 year ago
Great lecture. Thank you so much
@xiangli1133 · 8 months ago
Thank you so much!
@DED_Search · 3 years ago
29:50 How does the output have the same dimension as the input after 2x2 max pooling with stride 2?
@omerdor6644 · 3 years ago
"Halve" in the sense of making it half of the value it used to be.
@rajivb94933 жыл бұрын
Depthwise convolution... any references for this?
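For intuition (not a reference): a depthwise convolution applies one filter per input channel and produces one output channel per input channel, with no summation across channels the way a standard convolution does. A minimal pure-Python sketch (my own illustration; stride 1, no padding, and the tiny example tensors are made up):

```python
def depthwise_conv(x, filters):
    """x: [C][H][W], filters: [C][F][F]; each channel gets its own filter."""
    c = len(x)
    f = len(filters[0])
    h_out = len(x[0]) - f + 1
    w_out = len(x[0][0]) - f + 1
    out = []
    for ch in range(c):  # one filter per channel, no cross-channel sum
        plane = [[sum(x[ch][i + a][j + b] * filters[ch][a][b]
                      for a in range(f) for b in range(f))
                  for j in range(w_out)]
                 for i in range(h_out)]
        out.append(plane)
    return out

# Two 3x3 channels with 2x2 filters → two 2x2 output channels.
x = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
     [[9, 8, 7], [6, 5, 4], [3, 2, 1]]]
filt = [[[1, 0], [0, 0]],   # picks the top-left pixel of each window
        [[0, 0], [0, 1]]]   # picks the bottom-right pixel of each window
print(depthwise_conv(x, filt))  # → [[[1, 2], [4, 5]], [[5, 4], [2, 1]]]
```

In frameworks this is usually exposed as a grouped convolution with as many groups as input channels; the MobileNet paper discussed later in the lecture is the standard reference for why this is cheap.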
@arunmehta8234 · 3 years ago
14:14 Why didn't you include C_in in calculating the FLOPs for pool1?
@JaviOrman3 жыл бұрын
Each pooling operation is done on one input channel at a time, so it's essentially a 2D operation: the pooling layer downsamples each input channel independently.
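As a back-of-the-envelope sketch of the difference (my own formulas, counting one op per comparison or multiply-add; the example sizes are illustrative): in pooling the channel count appears only once, as the number of independent planes, whereas in convolution C_in also multiplies the per-output cost.

```python
def pool_flops(c, h_out, w_out, f):
    """Max pooling: each channel pooled independently, ~F*F ops per output."""
    return c * h_out * w_out * f * f

def conv_flops(c_in, c_out, h_out, w_out, f):
    """Convolution: each output element sums over C_in * F * F multiply-adds."""
    return c_out * h_out * w_out * c_in * f * f

# Illustrative pool layer: 64 channels, 27x27 output, 3x3 window.
print(pool_flops(64, 27, 27, 3))          # no C_in factor beyond the plane count
# Illustrative conv layer: 3 → 64 channels, 55x55 output, 11x11 filter.
print(conv_flops(3, 64, 55, 55, 11))      # C_in multiplies every output element
```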
@АнтонГаничев-щ5ж · 4 years ago
I believe there is a small error here: AlexNet has 96 (not 64) filters in the first layer (48 + 48 = 96). But overall the lecture is awesome.
@anishhui192 · 3 years ago
In the paper it is 96, but I also find 64 in the PyTorch AlexNet model: pytorch.org/docs/stable/_modules/torchvision/models/alexnet.html#alexnet