219 - Understanding U-Net architecture and building it from scratch

60,990 views

DigitalSreeni

Comments

@deividrumiancev7356 11 months ago

Great tutorial!! Way better to learn here than via my uni lecturers and teachers!! Keep it up mate! You are the best!

@cvformedicalimages6466 1 year ago

Thanks for the detailed explanation. This is the first time I am understanding how a Unet works! Thanks 🙂

@Vikram-wx4hg 2 years ago

Yes, really enjoyed it! Sreeni, you are a fantastic teacher and your tutorials bring out the concepts with remarkable simplicity and clarity.

@DigitalSreeni 2 years ago

Thank you so much 🙂

@ToxikJumper 29 days ago

7 minutes in and already everything is clicking. Amazing explanation!

@javierordonez2445 16 days ago

Great tutorial! And a great, concise coding session. Packed a college lecture or two into a video!

@XX-vu5jo 3 years ago

I would love to see a video on 3D U-Net from scratch as well. That will really help on understanding it better.

@1global_warming1 1 year ago

Thank you very much for such a clear explanation of how to build a U-net architecture from scratch

@kozhushko 1 month ago

Thank you! That's a great talent to explain so clearly.

@drforest 11 months ago

Thanks! If you had changed all the numbers of layers to, say, 50, 100, 200, etc., would that work, just with different designated layer numbers and whatever associated change in performance? Feels like that might have made the numbers a little easier to follow. But great work.

@DigitalSreeni 11 months ago

Thank you very much :)

@AmitChaudhary-qx5mc 3 years ago

Sir, I am very grateful for your explanation of semantic segmentation. You make everything so easy and sublime.

@DigitalSreeni 3 years ago

Thanks a ton

@channagirijagadish1201 1 year ago

Excellent Tutorial. Much appreciated!

@abeldechenne6915 6 months ago

that was crystal clear, thank you for the good explanation!

@RRP3168 3 years ago

Great video, but I have a question: What if I want to segment my own images, how do I get the masks for training the UNET?

@rishabgangwar9901 3 years ago

Thank you so much sir for crystal clear explanation

@shivamchaurivar2794 3 years ago

I really love your videos. I hope you make a video on stateful LSTM. It's very tough to find a good video on it.

@nikhilmudgal8541 3 years ago

Seems interesting. I hardly find any videos explaining Stateful LSTM myself

@SeadoooRider 3 years ago

Your channel is gold. Thank you 🙏

@antonittaeileenpious8653 3 years ago

Sir, from what I have understood, in all the layers we extract some features and apply max pooling to reduce the features extracted, and in the upsampling path we increase the spatial dimensions. Where do we actually classify the labelled pixels, vary their weights, and apply a particular threshold to get our desired ROI?

@davefaulkner6302 5 months ago

Excellent video; thank you. One issue, however: shouldn't batch normalization follow ReLU rather than the reverse? If you ReLU a Batch Normalized layer, it's no longer normalized. Probably works, as these models are so flexible it would probably compensate in training. Or perhaps I'm confusing Batch Norm with Layer Norm?
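The observation in this comment can be checked numerically. The sketch below is only an illustration in plain NumPy (not the Keras layers from the video): it normalizes a made-up batch of activations and then applies ReLU, showing that the result no longer has zero mean and unit variance. It does not settle which ordering trains better.

```python
import numpy as np

# Made-up activations for illustration; the mean and spread are arbitrary.
rng = np.random.default_rng(0)
activations = rng.normal(loc=5.0, scale=3.0, size=10_000)

# Batch-norm style standardization (without the learned scale and shift).
normalized = (activations - activations.mean()) / activations.std()

# ReLU zeroes the negative half, which shifts the mean up and shrinks the spread.
after_relu = np.maximum(normalized, 0.0)

print("normalized mean/std:", normalized.mean(), normalized.std())
print("after ReLU mean/std:", after_relu.mean(), after_relu.std())
```

In a real network the next convolution and its own normalization layer see these shifted values, so the two orderings are both used in practice.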

@talha_anwar 3 years ago

The decoder part should be the same as the encoder, but in the reverse direction. But when we concatenate, how is this maintained?

@CD-et7vk 2 years ago

Doesn't the original U-Net use dissimilar padding so the image dimensions are lowered by 2 in each convolution per layer (572 -> 570 -> 568, etc.)?
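The sizes quoted in this question can be traced with a few lines of arithmetic. The sketch below assumes the layout described in the original paper: unpadded 3x3 convolutions (each trims 2 pixels), 2x2 max pooling, four down/up levels, and 2x2 up-convolutions that double the size. The function name is hypothetical.

```python
def original_unet_sizes(size=572, depth=4):
    """Trace the spatial size through a U-Net built from unpadded 3x3 convolutions."""
    trace = [size]
    # Contracting path: two unpadded 3x3 convs (-2 each), then 2x2 max pooling (/2).
    for _ in range(depth):
        size -= 2
        trace.append(size)
        size -= 2
        trace.append(size)
        size //= 2
        trace.append(size)
    # Bottleneck: two more unpadded convs.
    size -= 2
    trace.append(size)
    size -= 2
    trace.append(size)
    # Expanding path: 2x2 up-conv doubles the size, then two unpadded convs.
    for _ in range(depth):
        size *= 2
        trace.append(size)
        size -= 2
        trace.append(size)
        size -= 2
        trace.append(size)
    return trace

sizes = original_unet_sizes()
print(sizes[:3], "...", sizes[-1])
```

With padding="same", as several comments here mention for the video's code, the convolutions keep the size unchanged, so only pooling and upsampling alter it.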

@dhaferalhajim 1 year ago

What's the number of classes in this structure? I saw one in the input and output

@Алексей-ж4щ4м 9 months ago

Thanks a lot! But there's a question I can't understand: why do we use padding="same" in a decoder block when we are upsampling? I mean, our shape is not the same, it becomes larger. Can somebody help, please?

@apekshagopale7095 1 year ago

Can you please tell how to create masks for SAR images?

@jetsdiver 2 years ago

For segmentation, for example to detect things like flood, fire, smoke, or clouds, is it better to use grayscale or color images?

@geethaneya2452 3 years ago

I would like to see video on TransUNet. That will really help to understand its concept better.

@AravTristy 1 month ago

Please share the playlist of this video

@anshulbisht4130 2 years ago

Loved your code. I knew the U-Net architecture, but when you showed it with running code and images, it was awesome. I will reimplement it with some other data and see if it works. Just one confusion: what is the ground truth when we apply Adam, and how is the loss calculated for backprop to work?
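For binary segmentation, the ground truth is the labeled mask, and the loss compares it pixel by pixel with the predicted probabilities. A training call quoted later in this thread uses loss='binary_crossentropy'; the sketch below is a plain NumPy version of that formula with toy arrays invented for illustration, not the Keras implementation itself.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean per-pixel binary cross-entropy between a mask and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    per_pixel = -(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))
    return float(per_pixel.mean())

# Toy 2x2 example: the mask is the ground truth, the others are model outputs.
mask = np.array([[0.0, 1.0], [1.0, 0.0]])
good_prediction = np.array([[0.1, 0.9], [0.8, 0.2]])
bad_prediction = np.array([[0.9, 0.1], [0.2, 0.8]])

loss_good = binary_crossentropy(mask, good_prediction)
loss_bad = binary_crossentropy(mask, bad_prediction)
print(loss_good, loss_bad)
```

The optimizer (Adam here) only decides how the weights are updated from the gradient of this loss; it does not define the ground truth.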

@madeleinedawson8539 1 year ago

Loved the video!!! So helpful

@rohit_mondal__ 3 years ago

Your explanation is actually very good, sir. Thank you. Happy to have subscribed to your channel.

@arshadgeo8829 2 years ago

Hello Sreeni, I wanted a favor that I would like to see the complete implementation of Segnet for satellite imagery and should have idea for Segnet+Resnet (using or without transfer learning). Can you help me out?

@dyahtitisari7206 2 years ago

Thank you so much Sir. It's very great explanation

@anikashrivastava8228 11 months ago

Sir, can we separate a U-Net, in the sense that we train a U-Net, save the weights of the encoder, bottleneck, and decoder separately, and then use them separately? Will we get the same reconstruction of a test dataset if we do it with the U-Net (entire architecture) as when we feed it to the encoder, then the bottleneck, then the decoder? Please help.

@deepak_george 3 years ago

Good work @digitalsreeni! Which tool do you use to view the image mask? In a normal image viewer it shows all black.

@DigitalSreeni 3 years ago

Use ImageJ.

@deepak_george 3 years ago

@@DigitalSreeni Where is the option in ImageJ to configure to see the mask? Couldn't find the video in which you mentioned this.
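One possible reason a mask looks all black is that it stores small label values such as 0 and 1, which are nearly indistinguishable on a 0 to 255 display scale. That is an assumption about the file, not something confirmed in this thread. Under that assumption, a minimal NumPy sketch for stretching the values before viewing (the function name is hypothetical):

```python
import numpy as np

def stretch_mask_for_display(mask):
    """Rescale a label mask so its largest value maps to 255 (8-bit white)."""
    mask = np.asarray(mask)
    max_value = mask.max()
    if max_value == 0:
        # Truly empty mask: nothing to stretch.
        return np.zeros(mask.shape, dtype=np.uint8)
    return (mask.astype(np.float64) / max_value * 255).astype(np.uint8)

# Toy mask with labels 0 and 1, invented for illustration.
toy_mask = np.array([[0, 1], [1, 0]])
display_mask = stretch_mask_for_display(toy_mask)
print(display_mask)
```

Image viewers with a brightness/contrast adjustment achieve the same effect without modifying the file.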

@IqraNosheen-ek3nk 1 year ago

very good explanation, thanks for making video

@msaoc22 1 year ago

Thank you for nice simple explanation

@akshaybatra1777 1 year ago

Does U-Net only work with 3 channels? I have breast mammography images in DICOM format; they have 1 channel (grayscale). Can I still use U-Net?

@DigitalSreeni 1 year ago

You can use it for any number of input channels.

@akshaybatra1777 1 year ago

What about the image size? My images are 4000x3000. Is it possible to use U-Net on them?

@davidyao2856 8 months ago

can this be applied to a dicom type dataset ?

@effeff3253 2 years ago

Can you please explain these two doubts? 1) Why is the number of feature maps halved in each layer of the expansion phase? 2) Say the input to the first layer of the expansion phase is 16x16 with 1024 feature maps. How does it become 32x32 with 512 feature maps after applying a simple 2x2 up-convolution? I mean, up-convolution simply copies the data into a larger block, so the number of feature maps should have nothing to do with this copying and should stay at 1024. When doing the 2x2 up-conv, which 512 feature maps have been taken out of the 1024?

@DigitalSreeni 2 years ago

The number of feature maps has nothing to do with the convolution kernel. The number of feature maps is defined by you, as part of your model. If you define your Conv. as - Conv2D(512, (2, 2), strides=2), you are defining the number of feature maps as 512 and kernel size for the convolution operation as 2x2 and stride as 2. This means your output would have 512 feature maps and the output image dimensions would be whatever you get with a 2x2 kernel and stride 2. Most people have a misunderstanding about this concept and I am glad you asked.
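The point in this reply can be put into numbers. The sketch below assumes the up-convolution is a transposed convolution with a 2x2 kernel, stride 2, and no padding (as with Keras's Conv2DTranspose); the helper name is hypothetical. It shows that the output channel count is simply the chosen number of filters, and that each filter has weights over all input channels.

```python
def transposed_conv_summary(in_size, in_channels, filters, kernel=2, stride=2):
    """Output size, output channels, and parameter count of an unpadded transposed conv."""
    out_size = (in_size - 1) * stride + kernel
    # Every filter has a kernel x kernel weight patch for each input channel, plus one bias.
    params = kernel * kernel * in_channels * filters + filters
    return out_size, filters, params

# The case from the question: 16x16 with 1024 feature maps, 512 filters requested.
out_size, out_channels, params = transposed_conv_summary(16, 1024, 512)
print(out_size, out_channels, params)
```

So no 512 maps are "taken out" of the 1024; each of the 512 outputs is a learned combination of all 1024 inputs.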

@effeff3253 2 years ago

@@DigitalSreeni Thanks for replying, but my doubts still remain. For example, in the first layer of the contraction phase, the output is 64 images of 256x256. When it is subjected to max pooling, the size of each image tile is halved, i.e. we now have 64 images of size 128x128. Now in the 2nd layer, I have 128 filters. Are these 128 filters applied to each of the 64 images of 128x128? If so, for each of the 64 images of size 128x128 I have 128 output images, i.e. a total of 64x128 images of size 128x128, which keeps growing after each convolution operation.
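A convolution filter is not applied to each input map separately: each filter spans all input channels and sums over them, producing one output map. The sketch below illustrates this with the Keras kernel layout (height, width, input channels, filters) and a toy 1x1 convolution in NumPy; the small 8x8 size and all-ones values are chosen only to keep the example readable.

```python
import numpy as np

def conv_layer_summary(in_channels, filters, kernel=3):
    """Weight tensor shape and parameter count for a standard 2D convolution."""
    weight_shape = (kernel, kernel, in_channels, filters)
    params = kernel * kernel * in_channels * filters + filters  # weights + biases
    return weight_shape, params

# Second encoder level from the question: 64 input maps, 128 filters, 3x3 kernels.
weight_shape, params = conv_layer_summary(64, 128)

# Toy 1x1 convolution: each output pixel is a weighted sum over the 64 input channels.
feature_maps = np.ones((8, 8, 64))
weights = np.ones((64, 128))
output = feature_maps @ weights

print(weight_shape, params, output.shape)
```

The output has 128 maps, not 64x128, because the 64 per-channel contributions are added together inside each filter.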

@anorderedhole2197 2 years ago

I tried making images with very narrow masks with a line a pixel in thickness. I noticed that when I resize the images the line will get broken up. Does this become more severe when the image is down sampled in the Unet model? Do you need the mask to have a very broad pixel widths to be useful?

@edmald1978 3 years ago

Thank you very much for this video really amazing the way you explain. Thank you for your great Channel!!!!!!!!!!!!!!!!!!!!!!!!!!!!

@fatmagulkurt2080 3 years ago

Thank you for your effort to teach. I really appreciate your videos. I am learning so much about coding. But I couldn't find any code anywhere for classifying multiclass images with DenseNet201. Also, how can I do 5-fold cross-validation when running these deep learning codes? I wish you could help me. It would be so helpful for me.

@ericthomas4072 1 year ago

Very helpful! Thank you!

@random-yu5hv 3 years ago

I really appreciate your videos. Will you check segAN network in medical image segmentation? Best regards.

@DigitalSreeni 3 years ago

GANs are generative networks so I am reluctant to use them for segmentation. Besides, U-nets do a great job so I haven’t found a reason to find an alternative.

@rajithakv4449 3 years ago

Sir, I have used the U-Net model for segmentation of filamentous structures. Though it gives a good prediction, the predictions are wider than the ground truth. What could be the reason for this? Also, the IoU value is around 0.33. I have also added dropout of 0.5.

@DigitalSreeni 3 years ago

Try increasing the threshold values for your filamentous class; I assume the probabilities around the wider regions are lower. If that is not the case, then please verify your labels; maybe they are also exaggerated? If not, check whether you are working on images of similar size showing features of similar dimensions. Finally, try 3D U-Net, as the prediction can benefit from additional information from the 3rd dimension.
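The first suggestion in this reply can be illustrated with a toy example. The arrays below are invented: a one-pixel-wide filament whose predicted probabilities fall off on either side. Raising the threshold narrows the predicted mask and raises the IoU. This is a sketch in NumPy, not the evaluation code from the video.

```python
import numpy as np

def iou(pred_mask, true_mask):
    """Intersection over union of two binary masks."""
    pred_mask = np.asarray(pred_mask, dtype=bool)
    true_mask = np.asarray(true_mask, dtype=bool)
    union = np.logical_or(pred_mask, true_mask).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return float(np.logical_and(pred_mask, true_mask).sum() / union)

# Ground truth: a vertical filament one pixel wide (toy data).
truth = np.array([[0, 0, 1, 0, 0]] * 3)
# Predicted probabilities: high on the filament, moderate next to it.
probabilities = np.array([[0.1, 0.6, 0.9, 0.6, 0.1]] * 3)

iou_low_threshold = iou(probabilities > 0.5, truth)   # prediction is 3 pixels wide
iou_high_threshold = iou(probabilities > 0.7, truth)  # prediction is 1 pixel wide
print(iou_low_threshold, iou_high_threshold)
```

If the probabilities next to the filament are as high as those on it, a threshold change will not help, which is where the reply's other suggestions apply.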

@XX-vu5jo 3 years ago

Are you familiar with the attention module? Is it possible to implement such with u net? Would love to watch a video about it.

@DigitalSreeni 3 years ago

It is coming soon, please stay tuned.

@XX-vu5jo 3 years ago

@@DigitalSreeni i am always tuned in woah! Thanks

@computingyolo5545 3 years ago

One aspect is blocking me. At line #12, the paths small_dataset_for_training/images/12_training_mito_images.tif and small_dataset_for_training/masks/12_training_mito_masks.tif are used, but this lesson doesn't specify whether the large image and large mask stacks have to be left undefined as addresses. In other words, how could I point to folders containing many pictures and masks to be picked up? A simple example, please? Brilliant explanation, Doctor, long life to you!
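One common way to pick up many files from a folder is to list them with glob and pair images with masks by sorted order. The sketch below is an assumption-laden illustration, not the code from the video: the helper name, the "images"/"masks" folder names, and the file names are hypothetical, and it assumes each mask has the same file name as its image.

```python
import os
import tempfile
from glob import glob

def list_image_mask_pairs(image_dir, mask_dir, pattern="*.tif"):
    """Return (image_path, mask_path) pairs, matched by sorted file order."""
    image_paths = sorted(glob(os.path.join(image_dir, pattern)))
    mask_paths = sorted(glob(os.path.join(mask_dir, pattern)))
    if len(image_paths) != len(mask_paths):
        raise ValueError("Number of images and masks differs")
    return list(zip(image_paths, mask_paths))

# Demo with empty placeholder files in a throwaway directory (hypothetical names).
with tempfile.TemporaryDirectory() as root:
    image_dir = os.path.join(root, "images")
    mask_dir = os.path.join(root, "masks")
    os.makedirs(image_dir)
    os.makedirs(mask_dir)
    for name in ("img_001.tif", "img_000.tif"):
        open(os.path.join(image_dir, name), "w").close()
        open(os.path.join(mask_dir, name), "w").close()
    pairs = list_image_mask_pairs(image_dir, mask_dir)
    pair_names = [(os.path.basename(i), os.path.basename(m)) for i, m in pairs]

print(pair_names)
```

Each path in a pair can then be read with whichever image library the rest of the pipeline uses.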

@rishabgangwar9901 3 years ago

I wanted to know more about .tif format

@CRTagadiya 3 years ago

Could you please add this video under your image segmentation playlist?

@DigitalSreeni 3 years ago

Sure, thanks for reminding.

@sorasora3611 2 years ago

How do I write U-Net as algorithm steps?

@hadyanpratama 3 years ago

Thank you, very clear explanation

@DigitalSreeni 3 years ago

You are welcome!

@lucasdiazmiguez8680 2 years ago

Hi! Very nice video. Just a question: do you have the link to the original paper?

@DigitalSreeni 2 years ago

arxiv.org/pdf/1505.04597.pdf

@lucasdiazmiguez8680 2 years ago

@@DigitalSreeni Thank you so much, subscribed!

@princekhunt1 19 days ago

Nice tutorial dini

@caiyu538 3 years ago

excellent lectures.

@nayamascariah776 3 years ago

Your videos are really amazing. I am really thankful for your efforts. Sir, I have one doubt: if I want to add the dice coefficient as a loss function, how can I add it?

@DigitalSreeni 3 years ago

Please check my video 215 for an answer. I also covered it as part of videos 210, 211, and 214. But I wrote my own few lines for dice coefficient in video 215, so you may find it useful.
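For readers who want the formula at a glance, here is a minimal NumPy sketch of the Dice coefficient and the corresponding loss. It is not the code from video 215; the smoothing term and function names are assumptions chosen for illustration, and a Keras loss would need the same arithmetic written with tensor operations.

```python
import numpy as np

def dice_coefficient(y_true, y_pred, smooth=1.0):
    """Dice = 2 * |A ∩ B| / (|A| + |B|), with a smoothing term to avoid division by zero."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    intersection = np.sum(y_true * y_pred)
    return float((2.0 * intersection + smooth) / (y_true.sum() + y_pred.sum() + smooth))

def dice_loss(y_true, y_pred, smooth=1.0):
    """Loss to minimize: 0 for perfect overlap, approaching 1 for no overlap."""
    return 1.0 - dice_coefficient(y_true, y_pred, smooth)

# Toy masks invented for illustration.
mask = np.array([1, 1, 0, 0])
disjoint = np.array([0, 0, 1, 1])

print(dice_coefficient(mask, mask), dice_coefficient(mask, disjoint, smooth=0.0))
```

Because y_pred can hold probabilities rather than hard 0/1 values, the same expression stays differentiable when used for training.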

@amintaleghani2110 3 years ago

@DigitalSreeni , thank you for your effort making this informative video. I wonder if we can use ResNet for Time Series data prediction. If so, Could you pls make video on the subject. Thanks again

@himanimogra6824 1 year ago

Can we pass an input size of 224 * 224 to U-Net?

@himanimogra6824 1 year ago

224*224*1

@DigitalSreeni 1 year ago

Yes. You can pass any image size - U-Net is fully convolutional.
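One practical caveat, stated here as an assumption about a U-Net with four 2x2 pooling steps and padding="same": each pooling halves the size (rounding down) and each upsampling doubles it, so the skip connections only line up when the input size is divisible by 16. The helper names below are hypothetical.

```python
def sizes_through_encoder(size, num_pools=4):
    """Spatial size after each 2x2 max pooling step (integer division rounds down)."""
    sizes = [size]
    for _ in range(num_pools):
        size //= 2
        sizes.append(size)
    return sizes

def skip_shapes_match(size, num_pools=4):
    """True when doubling on the way up reproduces the encoder sizes exactly."""
    return size % (2 ** num_pools) == 0

print(sizes_through_encoder(224), skip_shapes_match(224))
print(sizes_through_encoder(100), skip_shapes_match(100))
```

For 224 the sizes are 224, 112, 56, 28, 14, so the concatenations fit; for 100 the encoder reaches 25 and then 12, and doubling 12 gives 24, which no longer matches 25.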

@himanimogra6824 1 year ago

@@DigitalSreeni Thank you for the reply, sir. I have one more doubt: when I train my model, my kernel keeps dying right at the start of the first epoch. What should I do? I have resized my images to 224*224*224.

@tarasankarbanerjee 2 years ago

Dear Sreeni, thanks a lot for this awesome video. Just one question, shouldn't the 'decoder_block' call the 'conv_block' twice?

@tahaben-abbou7029 2 years ago

No, actually the encoder block already has two conv layers, so the decoder should call it once, not twice. Thank you.

@tarasankarbanerjee 2 years ago

@@tahaben-abbou7029 Thanks Taha for your comments. But if you look at the UNet architecture, the Decoder block also has 2 conv layers; just like the Encoder block. Hence the question.

@pycad 3 years ago

Thank you for this great explanation

@DigitalSreeni 3 years ago

You're very welcome!

@antonittaeileenpious8653 3 years ago

Sir, is the last layer an FCN layer?

@DigitalSreeni 3 years ago

U-Net is a fully convolutional network, so there are no fully connected layers.

@abderrahmaneherbadji5478 3 years ago

Great explanation

@orioncloud4573 1 year ago

thx for the clear application.

@pallavi_4488 3 years ago

doing an amazing job

@DigitalSreeni 3 years ago

Thanks

@nandankakadiya1494 3 years ago

Thank you for great explanation sir. Code is not available in GitHub. It would be great if you upload this.

@DigitalSreeni 3 years ago

It will be there soon... usually 6 to 8 hr. delay as I need to upload manually.

@nandankakadiya1494 3 years ago

@@DigitalSreeni ok thanks for the great tutorial

@jyothir07 3 years ago

Sir, Recently joined as your student. Couldn't thank you enough for this teaching. Could you please explain how to create and use a custom dataloader for large datasets?

@DigitalSreeni 3 years ago

I plan on recording a video soon but not sure when it is going to happen. Until then you may find this useful: kzbin.info/www/bejne/jH-qg5-ca7-fh6M

@cutedevil173 3 years ago

Hi, it's really interesting and educational. It would be really helpful if you trained U-Net on the Automated Cardiac Diagnosis Challenge (ACDC) using a NIfTI-format dataset.

@torikulislam23 3 years ago

Well, thank you, it was really obliging ❤️

@jithinnetticadan4958 3 years ago

Will this work for 256*256 rgb images or should I increase the layers and start from 32/16?

@DigitalSreeni 3 years ago

U-net is a framework where you convert an autoencoder architecture into U-net by adding skip connections. There is no right or wrong and the network can be customized for your specific application. The example I provided will work for 256x256 RGB images, you just need to define the number of channels as 3.

@jithinnetticadan4958 3 years ago

Thanks for the reply. I tried using the same, but a single epoch takes up to 30 minutes to complete (without GPU). Is that normal?

@DigitalSreeni 3 years ago

Depends on the amount of data. It will be painfully slow without GPU. Try using Google colab where you get a free GPU.

@jithinnetticadan4958 3 years ago

Thanks a lot. Actually, my dataset contains 7200 images including the masks, so it's impossible to make use of Google Colab; the only option is to reduce the size of my dataset.

@jithinnetticadan4958 3 years ago

Also sir in your video you had mentioned about increasing the layers so I tried increasing the layers by 2 (16,32) but the number of parameters remains the same. What could be the reason?

@guitar300k 3 years ago

Is u-net the best for image segmentation?

@DigitalSreeni 3 years ago

It is the most widely used framework for image segmentation where a lot of papers have been published. So we know it works.

@olubukolaishola4840 3 years ago

👏🏾👏🏾👏🏾👏🏾👏🏾👏🏾

@Luxcium 9 months ago

21:09 I do prefer the functional programming approach… classes are useful to describe functors, monads, maybe and even some “eithers” 😏😏😏😏 this is way easier to understand for me but I don’t say FP is better than OOP or any such… 😅😅😅😅

@biplugins9312 3 years ago

My only choice is to run your software on Colab. It uses the latest TensorFlow and I had no desire to drop back to version 1.x. To correct an error, I had to change the directory structure on keras.utils and, instead of trying to import from unet_model_with_functions_of_blocks, I did a %run on the program from inside Colab. The changes are:

    !pip install patchify
    %run '/content/drive/My Drive/Colab Notebooks/unet_model_with_functions_of_blocks.py'
    #from unet_model_with_functions_of_blocks import build_unet
    from keras.utils.np_utils import normalize

I don't know why, but on Colab it seems to be running at about half the speed you are seeing in Spyder:

    Epoch 25/25
    40/40 [==============================] - 58s 1s/step - loss: 0.0383 - accuracy: 0.9853 - val_loss: 0.1793 - val_accuracy: 0.9589

It complained that "lr" and "fit_generator" were deprecated, so I changed them to:

    model.compile(optimizer=Adam(learning_rate = 1e-3), loss='binary_crossentropy', metrics=['accuracy'])
    history = model.fit(my_generator, validation_data=validation_datagen,

but it didn't help. In any case, it does work in Colab with the latest TensorFlow.

@mithgaur7419 3 years ago

I came looking for copper and I found gold. It would've saved me a lot of time if I had found this channel earlier. Thanks for the awesome content. I'm currently working on a U-Net project using Google Colab and I can't figure out how to define a distribution strategy for TPU. What is the correct way to do it in this code?