I'm a biologist by training; each of my degrees was in wet-laboratory biology research. I'm doing a secondment that involves images, and I have never had any lectures or formal training in programming. I just want to leave a note to say how useful your videos are and how proud you should feel of the work you put into this channel.
@legendgrable9762 a month ago
I looked for it and am now applying it to early detection of prostate cancer.
@hejarshahabi114 2 years ago
The way you explain such complex models is so nice and easy to understand. Thanks for the videos you make.
@ParniaSh 3 years ago
I love how clearly you explained the attention mechanism. Thank you!
@DigitalSreeni 3 years ago
Glad it was helpful!
@angelceballos8714 3 years ago
Looking forward to the next tutorial! Thanks again!
@DigitalSreeni 3 years ago
Coming soon! :)
@giroyu 2 years ago
I can't believe I couldn't find this channel and your precious hard work till now. Thank you so much for your work. Please keep going!
@DigitalSreeni 2 years ago
Thank you so much!
@perioguatexgaming1333 2 years ago
I think in all of YouTube you are the only one who explained the concept of attention and how to implement it. Thank you, sir.
@DigitalSreeni 2 years ago
Thanks :)
@tedonk03 2 years ago
This is a really great explanation, thanks so much. However, just to note: add is not the same as concatenate. Concatenate appends tensors along a certain dimension, while add actually sums the tensor values element-wise.
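The distinction in the comment above can be checked with a quick sketch. This uses NumPy as a stand-in for the Keras Add and Concatenate layers, with shapes matching the video's 64x64x128 example:

```python
import numpy as np

# Two feature maps with the shape used in the video's example
a = np.ones((64, 64, 128))
b = np.ones((64, 64, 128))

added = a + b                               # element-wise sum: shape is unchanged
stacked = np.concatenate([a, b], axis=-1)   # append along the channel axis

print(added.shape)    # (64, 64, 128)
print(stacked.shape)  # (64, 64, 256)
```

Adding keeps the channel count the same (values change), while concatenating doubles the channel count (values are untouched).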
@birdshotbill 2 years ago
Your videos have been incredibly useful to me. I am studying for a master's in deep learning and computer vision, and your content is excellent for my learning and interest in the subject. Keep up the amazing work; I hope your channel continues to grow and receive the recognition it deserves!
@perioguatexgaming1333 2 years ago
I am in undergrad, and you know it's so difficult to actually understand, and even more so to implement, these things. This guy is a saviour.
@nagamanigonthina9306 a year ago
This video is very clear, with a detailed explanation. It cleared all my doubts. Thanks a lot.
@DigitalSreeni a year ago
You are most welcome
@timtomov3361 3 years ago
As you said, g is coming from a lower level and therefore has a lower resolution: 64x64 vs 128x128 in the skip connection. However, isn't it the case that in the "normal" design the number of features is larger at the lower levels, i.e. you would probably have g = 64x64x(2*numFeatures) and x = 128x128xnumFeatures?
@tshele1488 2 years ago
It is rather x that comes from the earlier layers, and it has better spatial information. Also, the skip connection is x, not g.
@100000Andy 9 months ago
Your videos are amazing! You explain this so well that everyone can understand it! I am really impressed.
@visuality2541 2 years ago
This is extremely helpful; very detailed, easy, and clear. Thank you very much, sir!
@husammasalkhi7817 a year ago
Very, very good video; I greatly appreciate you going through the steps of the gate and also the code for it.
@anonyme3029 3 years ago
Thank you for explaining in a very understandable way.
@rbhambriiit a year ago
Thanks for the great lectures. Small feedback/correction on "concatenating or adding is the same thing": the Add layer sums two input tensors element-wise, while Concatenate appends two tensors.
@vivaliberte a year ago
Very good explanation, thanks.
@mansubatabassum6629 5 months ago
This is a really great explanation, thank you.
@mujeebpa 2 years ago
Clear and detailed explanation... Could you please post a video on single- and multi-head attention as well?
@hafizaaymen2291 a year ago
Very well explained 👍 Please make a video on the VGG16 model.
@leonardocesar1619 2 years ago
Amazing explanation! Thank you so much
@ShivaniAMehta 2 years ago
Thank you so much, sir, for this video. It is a perfect blend of the intuition of the concept as well as the implementation. Best wishes ahead.
@牛煜烁 2 years ago
Really brilliant tutorial; I really appreciate it. Just wondering, is there somewhere I could download these slides? Thanks.
@benjaminp.9572 2 years ago
It is very helpful. Thank you.
@DigitalSreeni 2 years ago
Glad to hear that!
@SakvaUA 2 years ago
So why apply a 1x1 conv with stride 2 to x, which discards 3/4 of the information, instead of upsampling g? That way you wouldn't need the last upsampling step.
@gerhitchman 2 years ago
Cool and clear explanation, but I'm not sure how this would work for multi-class semantic segmentation; it seems like we would need to generate many attention weight masks.
@ParniaSh 3 years ago
Just a very minor correction: to reduce the dimensions of x by half, we need a 2x2 conv with a stride of 2, not 1x1.
@DigitalSreeni 3 years ago
The output shape is defined by the stride, not by the kernel size of the convolution. Here are a few lines of code if anyone wants to experiment; in fact, these types of snippets can be fun and educational.
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D
model = Sequential()
model.add(Conv2D(1, kernel_size=(2,2), padding='same', strides=(1,1), input_shape=(256, 256, 1)))
print(model.summary())
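The point of the reply above can also be checked without installing TensorFlow: with 'same' padding, the output spatial size is ceil(input / stride) regardless of kernel size, so only the stride matters. A small sketch of that formula (conv_out_same is a hypothetical helper name, not part of Keras):

```python
import math

def conv_out_same(in_size, stride):
    # With 'same' padding, the output size is ceil(in / stride),
    # independent of the kernel size.
    return math.ceil(in_size / stride)

# The kernel size never appears in the formula:
print(conv_out_same(256, 1))  # 256 (stride 1 keeps the spatial size)
print(conv_out_same(256, 2))  # 128 (stride 2 halves it, for 1x1 or 2x2 kernels alike)
```

So a 1x1 conv with stride 2 and a 2x2 conv with stride 2 both map 256 to 128; they differ only in which input pixels contribute to each output.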
@siddharthct4799 3 years ago
Thank you for the videos.
@kasikrit 3 years ago
Very good explanation
@sahilbhardwaj2360 3 years ago
Looking forward to the next video :)
@yongcheng7397 2 years ago
Very nice and useful video, thank you
@DigitalSreeni 2 years ago
Glad it was helpful!
@niladrichakraborti5443 a year ago
Very helpful!!
@ezioauditore5705 3 years ago
Loved the video... Could you also make a video on squeeze-and-attention networks for segmentation?
@masterlitheon 3 years ago
Very good explanation!
@DigitalSreeni 3 years ago
Glad you think so!
@ela_bd 2 years ago
So useful, thanks.
@shalini286r9 3 years ago
Great explanation. Can you share the code for Attention U-Net, and what modifications can we make when we use AGs in U-Net?
@DigitalSreeni 3 years ago
That is the content I'll be covering next week. Please wait until July 14th.
@soumyadrip 3 years ago
Thank you for the video
@giacomobenvenuti3172 2 years ago
Question: in your AG schematic, shouldn't the skip tensor x have fewer features than the gating input g? Thank you for these great lectures!!
@gpietra a year ago
I agree
@fahd2372 a year ago
AFAIK they should have the same number of features, because to get g, you send the layer from which g is coming (the layer right beneath the skip connection) through a conv layer that halves the number of filters.
@hanjiang4643 2 years ago
Thanks for your clear explanation! But what if I want to cite your figures from this video? Should I just add the YouTube channel link as a reference?
@rajroy2426 a year ago
Shouldn't g have double the channel number of the skip connection? @9:14
@trustsfundbaby7344 7 months ago
I know this is a bit old, but yeah, I'm pretty sure he got it backwards. I'm looking at my implementation of ResUNet++, and the bridge block increases the number of channels rather than reducing it.
@edmald1978 3 years ago
Amazing explanation!!!!!!
@charlesd4572 a year ago
I think this is superb, but are the numbers of feature maps not the wrong way round between g and w? Surely the skip connection should have half the channel dimension of the lower level.
@vaibhavsingh1049 3 years ago
This is great content.
@bingbingsun6304 2 years ago
Insightful explanation. But is there any mathematical support for this? How is it related to probability and information theory?
@RabiulIslam-tw6qc a year ago
Wonderful.
@francisferri273 a year ago
I love your videos ❤
@MrDagonzin 3 years ago
Thanks for the video, Mr. Sreeni. I'm really interested in this segmentation topic, and your series of videos is amazing. I would like to ask you for a tool to segment images by hand to start a personal project. I remember that you mentioned one in a past video, but I can't find it. Thanks again
@DigitalSreeni 3 years ago
You can use any image processing software that lets you paint over your image and then save the overlay as a mask. These masks are your manually segmented results. For example, you can use APEER. Maybe the first few minutes of this video can help you understand the process: kzbin.info/www/bejne/nnrQl6ivgZeUhtU
@cipherxen2 a year ago
Isn't it better to use a 2x2 convolution for a stride of (2,2)?
@mingbka5134 4 months ago
I think the n_channels of input g should be double the n_channels of input x, shouldn't it?
@pankjain25 a year ago
Is this channel attention or spatial attention?
@jagdishgg a year ago
My Colab session crashes every time I train the model. I am using a free Colab account with a GPU.
@manavmadan793 3 years ago
Hi, very good explanation. But how is 0.2 < 0.1? And what do unaligned weights mean?
@DigitalSreeni 3 years ago
I am not sure what you mean by 0.2 < 0.1. If you are referring to my example of aligned large weights getting larger whereas aligned small weights get smaller, then maybe I could have used better words to explain. In summary, if both weights are large, then you get a large sum. If they are both small, then you get a smaller sum. If one is large and one is small, then you get something in between. Therefore, the final weights can reflect the attention given to objects of interest, as the weights would be aligned at those locations. In the video, I did not say 0.2 < 0.1; I said the sum would be small. What I missed was saying 'the sum would be small, in comparison to the sum of aligned weights.' I thought that was implied, but apparently not. Thanks for pointing it out.
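The alignment argument in the reply above can be sketched numerically. This is a toy illustration, not the video's exact code: additive attention sums the two weights, applies ReLU, then squashes the result to (0, 1) with a sigmoid, so two aligned large weights give a coefficient near 1 while two small weights stay near the midpoint:

```python
import math

def attention_coeff(w_x, w_g):
    # Toy additive attention: sum the two weights, apply ReLU,
    # then squash to (0, 1) with a sigmoid.
    s = max(w_x + w_g, 0.0)
    return 1.0 / (1.0 + math.exp(-s))

print(attention_coeff(2.0, 2.0))  # both large: close to 1
print(attention_coeff(0.1, 0.1))  # both small: barely above 0.5
print(attention_coeff(2.0, 0.1))  # mixed: in between
```

The real attention gate computes these weights per pixel with 1x1 convolutions, but the ordering (aligned-large > mixed > aligned-small) is the same.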
@angelachristabel7684 a year ago
You have no idea how much this video helped me in my exams. Implementations of attention masks in U-Net were scarce, and the ones I found didn't really explain step by step. You did, however, and in such a nice and simple way! Thank you soooo much, you're my savior 🥲
@DigitalSreeni a year ago
Glad it helped!
@arizmohammadi5354 2 years ago
Great, as always!
@franciscofreiri 2 years ago
@DigitalSreeni Should I change the sigmoid function to softmax in the attention mechanism if I'm segmenting multi-class images?
@zahrakhalid4731 a year ago
What are global and local attention in a convolutional neural network?
@padmavathiv2429 3 years ago
Thank you, sir, nice explanation. Is the U-Net architecture good only for definite shapes?
@TheMayankDixit 2 years ago
Very nice, sir.
@fahadsherwani2434 2 years ago
Thank you, sir.
@tilkesh 2 years ago
Thank you.
@DigitalSreeni 2 years ago
Welcome!
@Ajaysharma-yv7zp 3 years ago
Thanks, sir... Great content
@DigitalSreeni 3 years ago
Glad you liked it
@Ajaysharma-yv7zp 3 years ago
@@DigitalSreeni Yes, sir! Also, please try to make videos on optimizing features extracted with any transfer learning model using BAT, Grey Wolf Optimizer, or any other method. Thanks again
@omamuzoodeh9459 2 years ago
Please, can it be used for multi-class segmentation?
@armstrongaboah4504 3 years ago
I like your content
@drforest 10 months ago
Thanks!
@DigitalSreeni 10 months ago
Thank you very much.
@rs9130 3 years ago
Please make a video on FCN implementation.
@Champignon1000 3 years ago
I love your videos, thank you. I don't mean to be rude or anything, but can you say "ok" a little bit less?
@DigitalSreeni 3 years ago
ok :)
@vishawjeetjyoti6566 8 months ago
Can someone tell me if it's called channel-wise attention or spatial attention? I'm confused about it.
@DigitalSreeni 8 months ago
In my code, the attention mechanism uses both channel-wise and spatial attention, applied at multiple levels during the upsampling process. Channel-wise attention is incorporated through the gating signals and attention blocks: the gating_signal function creates a channel-wise gating signal by applying a 1x1 convolution to the input, and the attention block (the attention_block function) performs channel-wise attention by combining information from the downsampled and upsampled paths. Spatial attention is achieved through the upsampling layers and concatenation operations.
@kaveh3480 2 years ago
Thanks!
@McQLin a year ago
Why y = layers.multiply([upsample_psi, x]) and not upsample_psi multiplied by g, i.e. [upsample_psi, g]?
@DigitalSreeni a year ago
In an attention U-Net, the output of the attention mechanism is typically a weighting factor (also called attention coefficients or attention maps) that is used to modulate the feature maps of the U-Net. This weighting factor is computed by passing the input feature maps through a set of convolutional layers and then applying a sigmoid activation to obtain values between 0 and 1 that represent the importance of each feature map.

To use the attention weighting factor to modulate the feature maps, we need to multiply it with the original feature maps; in other words, we want to scale each feature map by its corresponding attention coefficient. This can be done using the Keras multiply layer, which multiplies two tensors element-wise. In the attention U-Net, the two tensors we want to multiply are the upsampled attention coefficients (upsample_psi) and the feature maps of the U-Net (x). Multiplying them element-wise gives a set of scaled feature maps that take the attention coefficients into account, which is expressed as y = layers.multiply([upsample_psi, x]).

We don't want to multiply the attention coefficients with the gating signal g, because the gating signal is used to compute the attention coefficients and is not itself a set of feature maps. The gating signal g modulates the feature maps at a later stage in the network, after the attention mechanism has been applied: specifically, g is concatenated with the upsampled and modulated feature maps to produce the final output of the attention U-Net.
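A minimal NumPy sketch of the multiply step described above (the shapes are illustrative, and NumPy broadcasting stands in for the Keras layers.multiply call; the random arrays are only placeholders for real activations):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((1, 64, 64, 128))   # skip-connection feature maps
psi = rng.random((1, 64, 64, 1))   # upsampled attention coefficients in [0, 1)

# Element-wise multiply; the single-channel psi broadcasts
# across all 128 feature channels, scaling each one per pixel.
y = psi * x

print(y.shape)  # (1, 64, 64, 128): same shape as x, now attention-weighted
```

The key point is that y has the same shape as x; the attention map only rescales it, which is why x (the feature maps) and not g (the gating signal) is the second operand.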
@nisrinadinda5253 3 years ago
Hi sir, thank you for your amazing explanation! I want to ask: why, after the ReLU activation, did you do the 1x1 convolution with n_filters = 1? Why didn't you apply the 1x1 convolution with n_filters = 128?
@victorstepanov8483 2 years ago
Hi! As I understand it, this step is required to transform the data acquired by summing x and g, which are both 64 x 64 x 128 at this point (128 being the number of features for each pixel), into a matrix of weights for each pixel, which must be 64 x 64 x 1 (1 being the weight of each pixel). So after applying the ReLU, you need a way to 'squash' these 128 features down to just 1 value, the weight of the current pixel, and that is exactly what a convolution with n_filters = 1 does.
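The 'squashing' described above can be illustrated without Keras: a 1x1 convolution with a single filter is just a per-pixel dot product between the 128-dimensional feature vector and one learned weight vector (the random arrays here are only placeholders for learned values):

```python
import numpy as np

rng = np.random.default_rng(1)
features = rng.random((64, 64, 128))  # per-pixel feature vectors after ReLU
w = rng.random(128)                   # one 1x1 filter: one weight per feature

# A 1x1 convolution with n_filters = 1 reduces each pixel's
# 128 features to a single scalar via a dot product.
weights_map = features @ w

print(weights_map.shape)  # (64, 64): one weight per pixel
```

With n_filters = 128 you would get another 64x64x128 tensor instead of the 64x64x1 weight map the gate needs.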
@yujanshrestha3841 3 years ago
Amazing content! Just curious, do you do any consulting? I have a few ML problems I could use an expert like yourself to help me with.
@vishnuvinod7683 2 years ago
Thank you for your video. I have a doubt: what is the difference between a U-Net with attention and a U-Net transformer? Or are they the same?
@amankushwaha8927 2 years ago
They are different architectures. Transformers use scaled dot-product attention, which doesn't have convolution layers, whereas U-Net is a convolution-based model.
@yoyomcg 3 years ago
Is there a way to add an attention layer to the U-Net models from segmentation_models?
@DigitalSreeni 3 years ago
You can import a model from segmentation_models and deconstruct and reconstruct it using attention. It involves some work, and I do not recommend it, as I am not convinced about the real effectiveness of attention.
@zakirshah7895 3 years ago
Can we use this with a CNN for image classification?
@DigitalSreeni 3 years ago
I am not sure; I haven't thought about that concept. Maybe there is some literature out there.
@sakethsathvik4183 3 years ago
Hi, I watch all your videos; really useful, thank you. I implemented 2 architectures, the recurrent residual U-Net (R2U-Net) and its attention-gated variant, but I am getting better performance from R2U-Net than from attention-gated R2U-Net. Is this possible? Why so?
@DigitalSreeni 3 years ago
In general, I find the simple U-Net to be more effective and efficient compared to many of its variants. R2U-Net and Attention U-Net may give you better results based on the type of details in your objects and background. Unfortunately, there are no papers I am aware of that discuss when R2U-Net, attention, etc. will work better and when they fail. In summary, what you found out about R2U-Net and its attention equivalent is possible.
@sakethsathvik4183 3 years ago
@@DigitalSreeni Thank you.
@pesky_mousquito 2 years ago
Note that this attention is not exactly like the transformers' self-attention.
@caiyu538 3 years ago
Nice
@McQLin a year ago
What is spatial information?
@DigitalSreeni a year ago
Spatial information, in the context of Attention U-Net, refers to the spatial relationships between elements of an image, such as pixels or objects. The Attention U-Net uses attention mechanisms to dynamically weight different regions of an image based on their spatial information. This allows the network to focus on the most relevant features for each task and improves the accuracy of image segmentation.
@XX-vu5jo 3 years ago
No coding tutorial?
@DigitalSreeni 3 years ago
This is the intro video; the coding video will be out on July 14th, in a couple of days.
@AlainFavre-n4k a year ago
Well... I'm a bit disappointed by this video! I know that segmenting mitochondria in liver cells is a challenge. It comes across from this tutorial that U-Net with attention does not seem to solve the problem.