Convolutional Block Attention Module (CBAM) Paper Explained

4,884 views

Soroush Mehraban

1 day ago

This video explains the CBAM paper, which is an extension of the Squeeze-and-Excitation Networks paper.
Paper link: arxiv.org/abs/1807.06521
Table of Contents:
00:00 Intro
00:25 Overall Block Architecture
02:16 Channel Attention Module
03:33 Spatial Attention Module
04:46 ResNet Integration
05:13 Result Comparison
05:53 Grad-CAM Visualization
Icon made by Freepik from flaticon.com

Comments: 20
@SaraTaro 9 days ago
This made it so much clearer!! Great job :)
@nulliusinverba7732 1 year ago
I love how clearly explained this is
@soroushmehraban 1 year ago
Thanks for the feedback!
@alihadimoghadam8931 1 year ago
Good job, keep it up.
@yakuzi07 3 days ago
Is there a way to use Grad-CAM on a Siamese CNN network? I'm getting a graph disconnect error whenever I try, and I've read that it's because Grad-CAM was originally designed to accept a single input instead of multiple inputs.
@vishalreddy7185 1 year ago
thanks man
@soroushmehraban 1 year ago
You're welcome!
@science.20246 9 months ago
What is the theoretical basis that informed the design of the modules?
@hamidmahmoodpour3659 1 year ago
Nice job, I really enjoyed it. Simple and clear. I wonder why sigmoid is used?
@soroushmehraban 1 year ago
Glad you enjoyed it. The reason sigmoid is used, I guess, is that it maps the elements to the range [0, 1]. Feature-map elements whose attention weights are close to zero after the sigmoid activation get scaled down to even smaller values, effectively reducing their impact on the final output.
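A minimal PyTorch sketch of that gating idea (shapes and names here are illustrative, not taken from the paper's code):

```python
import torch

feats = torch.randn(1, 64, 32, 32)   # feature map: (batch, channels, H, W)
logits = torch.randn(1, 64, 1, 1)    # per-channel attention logits
gate = torch.sigmoid(logits)         # squashed to [0, 1]
out = feats * gate                   # channels with gate near 0 are suppressed
```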
@zeinabshirvani5062 6 months ago
Great! Is it possible to add an attention mechanism to U-Net for classification tasks?
@soroushmehraban 6 months ago
Yes, why not? But I don't think it would be as accurate as other variants like Swin-Unet.
@zeinabshirvani5062 6 months ago
Thank you!
@fatimazahrae3149 6 months ago
The video is very clear and well made, thanks. However, I still don't understand how it learns where it should give attention. Does it rely on targets? What weights are being updated?
@soroushmehraban 6 months ago
Yes, it relies on the targets. The channel attention module starts from the assumption that some extracted features might be more important than others, and the shared MLP assigns higher weights to the more important channels. How the shared MLP figures out which channels matter comes from what it learned throughout training: it learns weights that minimize the final loss function. Likewise, the spatial attention module learns weights that put more emphasis on where the target is located. There have also been works like Deformable Convolutional Networks (arxiv.org/abs/1703.06211) that learn to apply convolution only where the object is located, which are also worth reading if you're interested.
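For concreteness, a minimal PyTorch sketch of the two modules as the paper describes them (the reduction ratio of 16 and the 7x7 kernel follow the paper's defaults; the class names are my own):

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """A shared MLP scores average- and max-pooled channel descriptors;
    a sigmoid turns the summed scores into per-channel weights."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(  # shared between both pooled inputs
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))   # AvgPool(F) -> MLP
        mx = self.mlp(x.amax(dim=(2, 3)))    # MaxPool(F) -> same MLP
        return x * torch.sigmoid(avg + mx).view(b, c, 1, 1)

class SpatialAttention(nn.Module):
    """Pool across the channel axis, then a 7x7 conv plus sigmoid
    produces a per-location weight map."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)    # (b, 1, H, W)
        mx = x.amax(dim=1, keepdim=True)     # (b, 1, H, W)
        attn = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * attn
```

CBAM applies the two sequentially, channel attention first, which is the ordering the paper found to work best.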
@prateekkulkarni9617 7 months ago
Hi... why is it called a shared MLP?
@soroushmehraban 7 months ago
It's because the weights are shared. In other words, the MLP receiving MaxPool(F) is the same MLP that receives AvgPool(F).
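In code terms, "shared" just means one set of parameters serves both branches. A toy sketch with made-up sizes:

```python
import torch
import torch.nn as nn

mlp = nn.Sequential(nn.Linear(64, 4), nn.ReLU(), nn.Linear(4, 64))

avg_desc = torch.randn(1, 64)  # stand-in for AvgPool(F)
max_desc = torch.randn(1, 64)  # stand-in for MaxPool(F)

# One MLP, two inputs: both descriptors pass through the SAME weights,
# so this costs half the parameters of two separate MLPs.
out = mlp(avg_desc) + mlp(max_desc)
```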
@EngineerXYZ. 6 months ago
Hello sir, I am working on a plant disease detection project as a researcher. Can I combine this module with the MobileNetV2 model to increase accuracy and efficiency?
@soroushmehraban 6 months ago
Hello, you can test and find out! Add it after the depthwise 3x3 convolutions and check the differences.
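Placing it right after the depthwise conv means editing the inverted-residual block itself; as a simpler first experiment, one could wrap whole blocks using the ChannelAttention/SpatialAttention sketches from above (the wrapper and the indexing below are hypothetical, not a tested recipe):

```python
import torch.nn as nn

class BlockWithCBAM(nn.Module):
    """Hypothetical wrapper: run an existing block, then apply
    channel and spatial attention to its output."""
    def __init__(self, block, channels):
        super().__init__()
        self.block = block
        self.ca = ChannelAttention(channels)  # from the sketch above
        self.sa = SpatialAttention()

    def forward(self, x):
        return self.sa(self.ca(self.block(x)))

# e.g. (illustrative index and channel count, check your model):
# model.features[3] = BlockWithCBAM(model.features[3], channels=32)
```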
@charanteja1136 3 months ago
Hello sir, I want to add it to ResNet-50. Where should I add it? @soroushmehraban