C4W1L06 Convolutions Over Volumes

Views: 239,101

Take the Deep Learning Specialization: bit.ly/38sgOXN
Check out all our courses: www.deeplearni...
Subscribe to The Batch, our weekly newsletter: www.deeplearni...
Follow us:
Twitter: / deeplearningai_
Facebook: / deeplearninghq
Linkedin: / deeplearningai

Comments: 78

@purpleturtledotcom 4 years ago

Found this gem after wasting my time on several 'fancy' deeplearning video tutorials. "If you can’t explain something in simple terms, you don’t understand it." - Feynman

@leilalovegood4131 4 years ago

can't agree more

@nithinsai2250 3 years ago

yeah they all just use fancy words like keras, tensorflow blah blah blah

@ericksonramos4622 6 years ago

THANK YOU for ending my 4 days 9 hours search on understanding CNN first layer input data structure/computation.... Moving on to the next step

@simplexination9837 1 year ago

Same here👍👍👍

@muneshchauhan 2 years ago

The way Andrew deconstructed the 3D convolution into a simple series of steps just goes to show how great teachers can accelerate learning manifold.

@JoaoPedro-pi9ee 3 years ago

Best explanation I've found about convolutions over multiple channels. Thanks.

@__dekana__ 1 year ago

He explains this so well that I want to binge the entire playlist.

@the_random_noob9860 8 months ago

Blessed are the people who are passionate about neural networks and made it into Stanford to attend lectures given by this legend

@tomWil245 11 months ago

Finally, someone who can clearly explain the material!

@harshdevmurari007 2 years ago

The most effective way of explaining the depth (number of channels) of a CNN

@majinfu 4 years ago

Thank you so much! This video helped me to understand CNN very much!

@cem_kaya 2 years ago

thanks for clarifying that the filter is channel deep

@sau002 6 years ago

Excellent. Convolution over volumes was bugging me for a long time.

@mitakshra1 3 years ago

Thank you, sir. It is great to have people like you in this life.

@redash3861 4 years ago

Dude, I was really searching for this for 2 days, but there was no clear explanation of volumes. Thanks a lot.

@-MuhamadFahmiAmmar 1 year ago

WOW, I finally understand it, thanks

@sammyj29 3 years ago

By far the best explanation I have ever seen. So simple and crisp! I have one question though, professor: can we use CNNs with data other than images? If so, what does the filter size represent then? And how do we interpret the features of the data in terms of the number of input channels?

@mohitpandey5190 5 years ago

Deep learning's one and only Jeetu bhaiya :)

@devanshgoel3433 2 years ago

Thank you, sir! You are the real hero.

@rhysm8167 10 months ago

great video. Thank you !

@ketilmalde3402 5 years ago

The formula in the summary is wrong. It should be (n x n x c) input, (f x f x c x z) filter, and ((n-f+1) x (n-f+1) x z) output dimensions, for z output filters and c input channels. So the convolution kernel is a 4D tensor.

@the_random_noob9860 8 months ago

You just accounted for the fact that there can be more than one filter, and therefore the same number of output channels. I think the professor wrote the summary for the single-filter case. Not necessarily wrong, I guess.
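The shape bookkeeping in this exchange can be checked with a small pure-Python sketch. The function names `conv_volume` and `shape` and the all-ones test data are made up for illustration and do not come from the video. The sketch assumes no padding and a stride of 1, and it stores the filter bank as a list of z filters (z x f x f x c) rather than f x f x c x z.

```python
def conv_volume(image, filters):
    """Valid convolution (no padding, stride 1) of an n x n x c image with
    z filters, each f x f x c. Returns an (n-f+1) x (n-f+1) x z nested list."""
    n = len(image)
    c = len(image[0][0])
    f = len(filters[0])
    out_size = n - f + 1
    output = []
    for i in range(out_size):
        row = []
        for j in range(out_size):
            cell = []
            for filt in filters:  # one output channel per filter
                total = 0
                for di in range(f):
                    for dj in range(f):
                        for k in range(c):  # sum across all input channels
                            total += image[i + di][j + dj][k] * filt[di][dj][k]
                cell.append(total)
            row.append(cell)
        output.append(row)
    return output


def shape(volume):
    """Shape of a rectangular 3-level nested list."""
    return (len(volume), len(volume[0]), len(volume[0][0]))


n, f, c, z = 6, 3, 3, 2  # illustrative sizes
image = [[[1] * c for _ in range(n)] for _ in range(n)]
filters = [[[[1] * c for _ in range(f)] for _ in range(f)] for _ in range(z)]
result = conv_volume(image, filters)
print(shape(result))  # (4, 4, 2)
```

With z = 1 the same code gives a 4 x 4 x 1 output, which matches the single-filter case described in the reply above.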

@ritwikamajumdar5967 7 months ago

Thank you so much sir

@DrN007 4 years ago

Great! So a conv64 basically applies 64 different filters on segments of the input.

@BSelm05 5 years ago

very clear explanation thank you

@nikhilbadveli6 2 years ago

Can we use different filter sizes in the multiple filter case? And what will be the output shape then?

@chetankumar9463 2 years ago

Did you get an answer?

@kebakent 2 years ago

It's funny how concepts like this can be so confusing when you don't know them. I had no idea the conv layers had an extra unconfigurable dimension, and going from 3D to 2D confused me.

@littletiger1228 9 months ago

beautiful

@T-She-Go 5 years ago

You are my hero. Thank you so much

@luisanaya8210 5 years ago

Thank you , very well explained :)

@bobo0612 4 years ago

brilliant!

@mohammadkhubaibnasir6198 5 years ago

The first nine numbers come from the red channel, then the 3 beneath from the green channel, then the 3 beneath that from the blue channel? I didn't understand that. Aren't we taking 3x3 from each color channel?

@sandipansarkar9211 3 years ago

nice explanation

@tetouaniabdellah6714 10 months ago

Thanks for the video. I have a question: why does convolving 6x6x3 * 3x3x3 give 4x4 (which is 2D)? We convolved 3D objects, so shouldn't the output be 3D?

@abrahamowos 2 years ago

Are the filter values trainable?

@fndTenorio 1 year ago

That is the whole point.

@pedrovelazquez138 3 years ago

Thank you!!!

@PietroMarcon 6 years ago

So, in each pixel of the 4 x 4 convolved matrix, you put the sum of the products of the kernel values with the corresponding 3 x 3 region of the input image, for every channel (RGB)? Meaning you sum the dot products (kernel by the corresponding image pixels) of every channel to get the value of one pixel in the convolved matrix?

@johanverm90 5 years ago

Thanks a lot!!

@אליהולוי-ד4ה 3 years ago

Thank you so much!

@amitnair92 4 years ago

OK, so at first I was a little confused about what adding all the filters at the end means. Say the pixel at position (0,0) has RGB values 20, 10, 30. After applying the filter, does adding all the channels mean [20,10,30] and not [60]? Correct me if I am wrong.

@franco521 5 years ago

Why is the RGB convolution output not a 4x4x***3*** image?

@Vishnupratap 5 years ago

The filter is applied to all 3 layers at once in a step, to get a single output. Simple

@sammathew243 4 years ago

Since after applying the 3-channel filter across the 3 channels of the image, we get a single output, so **1** is the 3rd dimension of the output!

@bofloa 4 years ago

@Vishnupratap You know it is not possible to apply that filter to all 3 layers at once programmatically; it must be done iteratively. But I think what Andrew did not say is that when you apply the filter to each layer you get a single value: the sum of the 3 per-layer results goes into the 4 x 4 matrix. That is why you don't get 4x4x3 but 4x4x1...

@aiinabox1260 4 years ago

Awesome. I have 4 questions that I have been scratching my head over for the last 2 weeks. In my conv layer 1, I specified 32 filters. Does that mean 32 different features will be extracted from each image sequentially? I am using 28x28x1 grayscale images. Is it possible to apply the filters in parallel? Next, in the case of multiple filters, are the filters applied to the image in parallel or sequentially? How do I make the conv layer use multiple filters? The last question: how do I override the default filter with a custom filter type?

@aiinabox1260 4 years ago

@MattAufF5 Thanks a lot. But I still have one nagging question... Let's say 32 filters (feature detectors) are applied to a single image. Won't that cause any contention?

@aiinabox1260 4 years ago

@MattAufF5 awesome, thanks a ton

@strongsyedaa7378 3 years ago

How does a 3×3 convolution give 4x4?
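A hedged sketch of the arithmetic behind the 4x4 result: the 3x3 is the filter size, and the input in the video's example is 6x6, so the filter fits in 6 - 3 + 1 = 4 positions along each side. The helper name `output_size` is hypothetical, and the sketch assumes no padding and a stride of 1.

```python
def output_size(n, f):
    """Side length of the output when an f x f filter slides over an
    n x n input with no padding and a stride of 1."""
    return n - f + 1


# Every top-left corner where a 3x3 filter fits inside a 6x6 input.
positions = [(i, j)
             for i in range(output_size(6, 3))
             for j in range(output_size(6, 3))]

print(output_size(6, 3))  # 4
print(len(positions))     # 16
```

Each of the 16 positions contributes one number to the output, which is why the result is 4x4.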

@jacksonvaldez5911 1 year ago

Why is the output 2 dimensions? If you convolve over a 2D image with a 2D filter, you get a 2D output. Wouldn't this mean that if you convolve over a 3D image (R, G, B) with a 3D filter, the output should be 3 dimensions as well? Edit: I think I get it now. It's because the size of the 3rd dimension is the same for both the filter and the RGB image, so it only has to convolve over the z axis once, producing a 3rd dimension of size 1 in the output. So technically the output is 3 dimensions; it's just that the 3rd dimension has a size of 1, which is basically just 2D. If you convolved over an RGB image with a 2x2x2 filter, then the output would be 3 dimensions.
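The reasoning in this comment can be checked with a small sketch that lets the filter slide along the depth axis as well. This is a true 3D convolution, not the operation in the video, where the filter depth always equals the number of input channels. The names `conv3d_valid`, `ones`, and `shape3` are made up for illustration, and the sketch assumes no padding and a stride of 1 on every axis.

```python
def conv3d_valid(volume, kernel):
    """Slide a kernel over height, width AND depth (no padding, stride 1)."""
    H, W, D = len(volume), len(volume[0]), len(volume[0][0])
    fh, fw, fd = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for i in range(H - fh + 1):
        plane = []
        for j in range(W - fw + 1):
            line = []
            for k in range(D - fd + 1):  # number of positions along depth
                total = 0
                for a in range(fh):
                    for b in range(fw):
                        for d in range(fd):
                            total += volume[i + a][j + b][k + d] * kernel[a][b][d]
                line.append(total)
            plane.append(line)
        out.append(plane)
    return out


def ones(h, w, d):
    """h x w x d nested list filled with 1s (placeholder data)."""
    return [[[1] * d for _ in range(w)] for _ in range(h)]


def shape3(v):
    return (len(v), len(v[0]), len(v[0][0]))


rgb = ones(6, 6, 3)
full_depth = conv3d_valid(rgb, ones(3, 3, 3))  # filter depth == image depth
shallow = conv3d_valid(rgb, ones(2, 2, 2))     # filter depth < image depth

print(shape3(full_depth))  # (4, 4, 1)
print(shape3(shallow))     # (5, 5, 2)
```

When the filter is as deep as the image there is only one position along the depth axis, so the third output dimension collapses to 1.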

@GagarineYuri 4 years ago

@3:11 : So do we add the 3 convolutions to get each value of the 4x4 feature map?

@robbellis5944 4 years ago

Yes. Instead of thinking of it as 3x 2D convolutions added together, try thinking of it as 1x 3D convolution. It's still an element-wise product and sum of the cube of filters (or kernel) and a 3D portion of the stack of images.

@muhammadmaazwaseem7452 1 year ago

Why do we add the 3 convolutions? Why not take their average value?

@muhammadmaazwaseem7452 1 year ago

@robbellis5944 Why do we add the 3 convolutions instead of taking their average value?
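One possible way to look at this question, offered as a sketch and not as the video's answer: averaging the three per-channel results is the same as summing them with every filter weight divided by 3. The symbols below are assumptions for illustration: x_{a,b,k} is the input value at row a, column b, channel k of the current window, and w_{a,b,k} is the matching filter weight.

```latex
\frac{1}{3}\sum_{k=1}^{3}\sum_{a=1}^{f}\sum_{b=1}^{f} w_{a,b,k}\,x_{a,b,k}
  \;=\;
\sum_{k=1}^{3}\sum_{a=1}^{f}\sum_{b=1}^{f} \frac{w_{a,b,k}}{3}\,x_{a,b,k}
```

Since the filter values are trainable, as an earlier reply in this thread points out, a network that averaged instead of summed could learn weights three times larger and produce the same output, so the constant factor does not change what the layer can represent.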

@jayshah4016 6 years ago

Are these 3D convolutions?

@elgs1980 4 years ago

5:51, I don't understand why the result is not 4x4x3, but 4x4. So where are the 3 layers?

@agueconfle4889 4 years ago

It seems the dot products from each layer are added up into a single number. That means you have 3x3x3 (27) multiplications whose results are summed.
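The count of 27 multiplications at one filter position can be sketched in a few lines. The patch and kernel values here are placeholders chosen so the per-channel sums are easy to follow; they are not taken from the video.

```python
# Hypothetical 3x3x3 image patch: every red value is 1, green is 2, blue is 3.
patch = [[[1, 2, 3] for _ in range(3)] for _ in range(3)]
# Hypothetical 3x3x3 filter with every weight equal to 1.
kernel = [[[1, 1, 1] for _ in range(3)] for _ in range(3)]

# One product per (row, column, channel) triple.
products = [patch[a][b][k] * kernel[a][b][k]
            for a in range(3) for b in range(3) for k in range(3)]

# The same products grouped by channel: 9 per channel.
per_channel = [sum(patch[a][b][k] * kernel[a][b][k]
                   for a in range(3) for b in range(3))
               for k in range(3)]

print(len(products))     # 27
print(per_channel)       # [9, 18, 27]
print(sum(per_channel))  # 54
```

The three per-channel sums are added into one number, which becomes a single entry of the 4 x 4 output.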

@elgs1980 1 year ago

3:21 answered my question: add all those numbers.

@kavitabhosale4861 6 years ago

Is it possible to use a 1 x 1 x 155 input and a 1 x 1 x 155 filter for pixel classification?

@ragibishrak1310 6 years ago

Kavita Bhosale I might be wrong, but I think that won’t be of much use. Since such network will just learn to match the input with the training images. It won’t be able to extract lower level features such as edges etc. It probably will show impressive performance on the training set but would not generalise well. Hoping for feedback from specialists on the topic.

@md.jahidhasan9337 6 years ago

A 1 px x 1 px input is far too tiny; it will not generalize.

@ВасЯПронин-щ2э 2 years ago

You need to explain the content.

@rs9130 4 years ago

output of rgb channels after convolution must be 4x4x3 right?

@wiz7716 6 years ago

Why are you stacking the features on each other? I don't get it! normally don't we just SUM UP the features so we have only one layer of features (e.g. horizontal + vertical edges)?

@ericksonramos4622 6 years ago

That really confused me as well. I had to step back and understand how a computer reads an image. A computer reads an image as, for example, a 6x6x3 volume. Breaking it down, you have a 6x6 matrix for red, a 6x6 matrix for green, and a 6x6 matrix for blue. The colors are referred to as 'depth' or 'channels'. With that being said, when you convolve a filter with the input image, you have to apply it to all 3 channels (colors). That's why one filter is, again as an example, 3x3x3. Watch just the introduction part of this video: kzbin.info/www/bejne/q56qe2ZmYpZolaM

@amitnair92 4 years ago

@ericksonramos4622 So what does adding all three filters at the end mean? Does the RGB value [23,45,23] convert to the single value 51?

@ericksonramos4622 4 years ago

@amitnair92 I don't quite follow what you said. Please elaborate. You don't add the filter data together. You slide, or convolve, the filters over the input image.

@thealgorithm7633 4 years ago

Is it possible for the number of filter channels to be greater than the number of input channels?

@salmahayani5683 5 years ago

Hello, is it possible to use 256*256*3 images with the LeNet architecture?

@gunslingerarthur5865 5 years ago

yes it is

@jesuispac 5 years ago

a god

@latifahouria9120 5 years ago

I am a beginner in the field of deep learning. Is there anyone who can help me with my project?

@MuhannadGhazal 3 years ago

6:02, I was expecting the output to be 4 x 4 x 3. Why was it just 4 x 4?

@adhoc3018 3 years ago

I think it is because he is using the 3 filters as a cube. Thus, after the multiplication, you sum everything. For the output to be 4 x 4 x 3, I think you would need a separate filter for each channel.
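For comparison, here is a sketch of the variant this reply hints at: filtering each channel with its own 2D kernel and keeping the three results separate does give a 4 x 4 x 3 output. That variant is usually called a depthwise convolution and is not what the video describes. The helper name `conv2d_valid`, the channel-first layout, and the constant test data are assumptions made for illustration.

```python
def conv2d_valid(channel, kernel):
    """2D valid convolution (no padding, stride 1) of one n x n channel
    with one f x f kernel."""
    n, f = len(channel), len(kernel)
    size = n - f + 1
    return [[sum(channel[i + a][j + b] * kernel[a][b]
                 for a in range(f) for b in range(f))
             for j in range(size)]
            for i in range(size)]


# Channel-first placeholder image: red is all 1s, green all 2s, blue all 3s.
channels = [[[v] * 6 for _ in range(6)] for v in (1, 2, 3)]
# Three 3x3 kernels of ones, one per channel.
kernels = [[[1] * 3 for _ in range(3)] for _ in range(3)]

# Depthwise: keep the three 4x4 maps separate (3 x 4 x 4 in this layout).
per_channel = [conv2d_valid(ch, k) for ch, k in zip(channels, kernels)]

# Standard convolution over a volume: add the three maps into one 4x4 map.
summed = [[sum(m[i][j] for m in per_channel) for j in range(4)]
          for i in range(4)]

print(len(per_channel), len(per_channel[0]), len(per_channel[0][0]))  # 3 4 4
print([m[0][0] for m in per_channel])  # [9, 18, 27]
print(summed[0][0])                    # 54
```

The only difference between the two results is whether the per-channel maps are stacked or added, which is the step several commenters in this thread were asking about.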