Convolutional Block Attention Module (CBAM) Paper Explained

4,884 views

Soroush Mehraban

1 day ago

This video explains the CBAM paper, which is an extension of the Squeeze-and-Excitation Networks paper.
Paper link: arxiv.org/abs/1807.06521
Table of Contents:
00:00 Intro
00:25 Overall Block Architecture
02:16 Channel Attention Module
03:33 Spatial Attention Module
04:46 ResNet Integration
05:13 Result Comparison
05:53 Grad-CAM Visualization
Icon made by Freepik from flaticon.com

Comments: 20
@SaraTaro 9 days ago
This made it so much clearer!! Great job :)
@nulliusinverba7732 1 year ago
I love how clearly explained this is
@soroushmehraban 1 year ago
Thanks for the feedback!
@alihadimoghadam8931 1 year ago
Good job, keep it up.
@yakuzi07 3 days ago
Is there a way to use Grad-CAM on a Siamese CNN network? I'm getting a graph disconnect error whenever I try, and I've read that it's because Grad-CAM was originally designed to accept a single input instead of multiple inputs.
@vishalreddy7185 1 year ago
thanks man
@soroushmehraban 1 year ago
You're welcome!
@science.20246 9 months ago
What is the theoretical basis that informed the design of the modules?
@hamidmahmoodpour3659 1 year ago
Nice job, I really enjoyed it. Simple and clear. I wonder why sigmoid is used?
@soroushmehraban 1 year ago
Glad you enjoyed it. The reason sigmoid is used, I guess, is that it maps the elements to the range [0, 1]. Feature-map elements whose attention weights are close to zero after the sigmoid activation get scaled down to even smaller values, effectively reducing their impact on the final output.
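A minimal PyTorch sketch of that gating idea (shapes and names here are illustrative, not taken from the paper's code):

```python
import torch

feats = torch.randn(1, 64, 32, 32)   # feature map: (batch, channels, H, W)
logits = torch.randn(1, 64, 1, 1)    # per-channel attention logits
gate = torch.sigmoid(logits)         # squashed to [0, 1]
out = feats * gate                   # channels with gate near 0 are suppressed
```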
@zeinabshirvani5062 6 months ago
Great! Is it possible to add an attention mechanism to U-Net for classification tasks?
@soroushmehraban 6 months ago
Yes, why not? But I don't think it would be as accurate as other variants like Swin-Unet.
@zeinabshirvani5062 6 months ago
Thank you!
@fatimazahrae3149 6 months ago
The video is very clear and well made, thanks. However, I still don't understand how it learns where it should give attention. Does it rely on targets? What weights are being updated?
@soroushmehraban 6 months ago
Yes, it relies on the targets. The channel attention module starts from the assumption that some extracted features might be more important than others, and the shared MLP assigns higher weights to the more important channels. How the shared MLP figures out which channels matter comes from what it learned throughout training: it learns weights that minimize the final loss function. Likewise, the spatial attention module learns weights that put more emphasis on where the target is located. There have also been works like Deformable Convolutional Networks (arxiv.org/abs/1703.06211) that learn to apply convolution only where the object is located, which are also worth reading if you're interested.
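For concreteness, a minimal PyTorch sketch of the two modules as the paper describes them (the reduction ratio of 16 and the 7x7 kernel follow the paper's defaults; the class names are my own):

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """A shared MLP scores average- and max-pooled channel descriptors;
    a sigmoid turns the summed scores into per-channel weights."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(  # shared between both pooled inputs
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))   # AvgPool(F) -> MLP
        mx = self.mlp(x.amax(dim=(2, 3)))    # MaxPool(F) -> same MLP
        return x * torch.sigmoid(avg + mx).view(b, c, 1, 1)

class SpatialAttention(nn.Module):
    """Pool across the channel axis, then a 7x7 conv plus sigmoid
    produces a per-location weight map."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg = x.mean(dim=1, keepdim=True)    # (b, 1, H, W)
        mx = x.amax(dim=1, keepdim=True)     # (b, 1, H, W)
        attn = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * attn
```

CBAM applies the two sequentially, channel attention first, which is the ordering the paper found to work best.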
@prateekkulkarni9617 7 months ago
Hi... why is it called a shared MLP?
@soroushmehraban 7 months ago
It's because the weights are shared. In other words, the MLP receiving MaxPool(F) is the same MLP that receives AvgPool(F).
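In code terms, "shared" just means one set of parameters serves both branches. A toy sketch with made-up sizes:

```python
import torch
import torch.nn as nn

mlp = nn.Sequential(nn.Linear(64, 4), nn.ReLU(), nn.Linear(4, 64))

avg_desc = torch.randn(1, 64)  # stand-in for AvgPool(F)
max_desc = torch.randn(1, 64)  # stand-in for MaxPool(F)

# One MLP, two inputs: both descriptors pass through the SAME weights,
# so this costs half the parameters of two separate MLPs.
out = mlp(avg_desc) + mlp(max_desc)
```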
@EngineerXYZ. 6 months ago
Hello sir, I am working on a plant disease detection project as a researcher. Can I combine this module with the MobileNetV2 model to increase accuracy and efficiency?
@soroushmehraban 6 months ago
Hello, you can test and find out! Add it after the depthwise 3x3 convolutions and check the differences.
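Placing it right after the depthwise conv means editing the inverted-residual block itself; as a simpler first experiment, one could wrap whole blocks using the ChannelAttention/SpatialAttention sketches from above (the wrapper and the indexing below are hypothetical, not a tested recipe):

```python
import torch.nn as nn

class BlockWithCBAM(nn.Module):
    """Hypothetical wrapper: run an existing block, then apply
    channel and spatial attention to its output."""
    def __init__(self, block, channels):
        super().__init__()
        self.block = block
        self.ca = ChannelAttention(channels)  # from the sketch above
        self.sa = SpatialAttention()

    def forward(self, x):
        return self.sa(self.ca(self.block(x)))

# e.g. (illustrative index and channel count, check your model):
# model.features[3] = BlockWithCBAM(model.features[3], channels=32)
```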
@charanteja1136 3 months ago
Hello sir, I want to add it to ResNet-50. Where should I add it? @soroushmehraban