C4W1L08 Simple Convolutional Network Example

195,139 views

DeepLearningAI

A day ago

Comments: 77

@Ankitdigarsey 7 years ago

@1:28 He meant Valid Convolution instead of Same Convolution. :-)

@omarkadhim3193 1 year ago

Started learning with Andrew in 2015 on Coursera. What a generous and gifted person. So grateful.

@TheSpec90 10 months ago

We're blessed to live at the same time as people like him, and more blessed to be able to see his lectures for free on the internet.

@ruifengwang9690 5 years ago

I'm so lucky to learn all this with you. Thank you.

@abcdxx1059 6 years ago

I have heard he is running for president in 2020

@GeneralKenobi69420 4 years ago

Andrew (Ya)ng

@amoonmohammed8041 1 year ago

Amazing explanation! Thank you, sir, for making things easy to understand.

@manuel783 3 years ago

*CORRECTION* In "Simple Convolutional Network Example," at 1:14: p=0 means valid convolutions, not same convolutions. Setting the padding to zero results in "valid convolutions."

@muhammadalam2498 4 years ago

In the first layer we used 10 different 3x3x3 filters. I know from the previous videos that each filter has the same number of channels as the input, so one filter gives a 37x37x1 output and 10 filters give 37x37x10. In the second layer, would our filters still be 5x5x3, or 5x5x10? If 10, what would those 10 represent? In the first layer the 3 represents the color channels.

@guyindisguise 4 years ago

In the second layer it would be 5x5x10. The 10 represents the channels, same as before, except this time the channels do not contain information about colors (RGB) but instead contain information on whatever those 10 filters/kernels learned in the previous layer. So out of those 10 channels one could be horizontal edges, another could be vertical edges, another could be a diagonal gradient, etc. Each filter of layer 2 will pick up on different combinations of those 10 new channels. As a hypothetical example, if one filter of layer 2 had a 1 in the top left for the horizontal edge channel and also a 1 in the top left for the vertical edge channel (and every other weight set to 0), then it would likely detect top-left edges with something similar to a 90 degree angle. So to recap: in the first layer, filters combine information on different colors, while in later layers they combine information on other concepts that were previously learned.
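The depth rule discussed in this thread can be sketched in a few lines of plain Python. This is only an illustration, assuming the three layers quoted in the comments (10 filters of 3x3, then 20 of 5x5, then 40 of 5x5) and one bias per filter; the variable names are made up for the example.

```python
# Each filter spans every channel of its input, so its depth is set by the
# previous layer's filter count (or by the image's 3 color channels).
layers = [(3, 10), (5, 20), (5, 40)]  # (filter size f, number of filters)

in_channels = 3  # RGB input
filter_shapes = []
param_counts = []
for f, n_filters in layers:
    shape = (f, f, in_channels)
    # Weights per filter plus one bias per filter (bias is an assumption here).
    params = (f * f * in_channels + 1) * n_filters
    filter_shapes.append(shape)
    param_counts.append(params)
    print(f"{n_filters} filters of shape {shape}: {params} parameters")
    in_channels = n_filters  # the next layer sees one channel per filter
```

Under these assumptions the filter shapes come out as 3x3x3, 5x5x10 and 5x5x20, which matches the answer given above.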

@sokhibtukhtaev9693 5 years ago

So the full size of the filters used in the first conv layer is 3x3x3 (10 of them), in the second conv layer 5x5x10 (20 of them), and in the third conv layer 5x5x20 (40 of them)? That is a bit confusing, especially since you only specify the 2D filter size and the number of filters in Keras Conv2D. In that case it would be:
model.add(Conv2D(10, (3, 3), padding="same", activation="relu"))
model.add(Conv2D(20, (5, 5), padding="same", activation="relu"))
model.add(Conv2D(40, (5, 5), padding="same", activation="relu"))
And the last dimension of each filter would automatically be inferred from the previous layer's number of filters (or just 3 for the first layer, because the input image is RGB). Anything to correct or add?

@jgc9700 5 years ago

I have the same question. Could someone please answer it? Thanks

@David-mr4cn 1 year ago

I would be really curious about the exact information flow. If you have a 39x39x3 input (so an input with 3 channels) and you use 10 filters in the conv layer, I would suppose that, in order to conserve all information, one would apply all 10 filters to the 3 channels independently, leaving you with 3*10 output channels of size 37x37. Since the output dimension is 37x37x10 and not 37x37x30, my question is: what happens there to merge the channels from 30 down to 10? Is it summed or averaged? I don't see it.

@VipinKumar-mf5lv 6 years ago

Thanks for making such information-rich videos. A good platform for learning deep neural network basics.

@ankitmishra9108 4 years ago

Can you please make a video on training a CNN from scratch? How do we get those filters by training?

@elgs1980 4 years ago

Andrew, can you please explain why the 3 channels suddenly become 1 when convolved to the next layer? Did you add them together?

@reggaebin 4 years ago

I think that he considered 1 channel of 3. The picture has 3 channels of RGB colors.

@yashwanths2622 4 years ago

See the 'convolutions over volumes' video.

@yashwanths2622 4 years ago

@@reggaebin No.

@n2o_tv513 3 years ago

The previous video has the explanation.

@valeriafonsecadiaz1527 3 years ago

Yes, they are added; see the two previous videos.
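To make "they are added" concrete, here is a minimal pure-Python sketch of convolving a multi-channel image with a single filter. It assumes no padding, stride 1, and the deep-learning convention of not flipping the kernel; `conv_volume` and the toy values are hypothetical, chosen only for illustration.

```python
def conv_volume(image, kernel):
    """Valid convolution (stride 1) of an H x W x C image with one f x f x C filter."""
    h, w, c = len(image), len(image[0]), len(image[0][0])
    f = len(kernel)
    out = []
    for i in range(h - f + 1):
        row = []
        for j in range(w - f + 1):
            total = 0
            # Products are summed over the window AND over all channels,
            # which is why one filter yields a single output channel.
            for di in range(f):
                for dj in range(f):
                    for ch in range(c):
                        total += image[i + di][j + dj][ch] * kernel[di][dj][ch]
            row.append(total)
        out.append(row)
    return out

# Toy 4 x 4 x 3 image: every pixel is (1, 2, 3) across its three channels.
image = [[[1, 2, 3] for _ in range(4)] for _ in range(4)]
# One 3 x 3 x 3 filter of all ones.
kernel = [[[1, 1, 1] for _ in range(3)] for _ in range(3)]

result = conv_volume(image, kernel)
print(len(result), len(result[0]))  # 2 2
print(result[0][0])                 # 54
```

Each output value is 9 window positions times (1 + 2 + 3), so the three channels collapse into one number per position. Ten such filters would give ten output channels, not thirty.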

@ZANO439 5 years ago

Best explanation, well done.

@sau002 6 years ago

Beautifully explained. Thank you for this video. I understand the intuition behind a convolutional filter (or kernel), e.g. it could be performing edge detection. What is the intuition behind applying 2 consecutive layers of convolution filters, i.e. the output of the first convolution filter going into the second convolution filter?

@edubezerra35 6 years ago

In a face recognition setting, for example, the first filter could detect edges (as you said), while the next filter could detect more complex forms resulting from composing two or more edges (e.g., a nose, a mouth, an eye, etc.). Then another filter could detect entire faces based on the activations produced by the previous filter.

@dhruvb2689 6 years ago

+Eduardo Bezerra A question of mine is: we would like to believe that CNNs work this way, i.e. with each layer a more complicated feature is detected. However, when training an actual network, we start with randomized parameters and optimize over them. How do we know that we will always end up with a network in which the 1st layer detects edges, etc.?

@edubezerra35 6 years ago

@@dhruvb2689 Although we have sort of managed to emulate in our artificial neural nets this behaviour present in biological brains, we still do not know how it happens. See kzbin.info/www/bejne/d6rdgIiYoLqZaa8 for more details. This presentation is 10 years old, but I don't think the situation has changed much.

@yashwanths2622 4 years ago

@@dhruvb2689 We don't, which is why we have to tune the hyperparameters :).

@long3850 3 years ago

Best CNN explanation on YouTube!

@sandipansarkar9211 3 years ago

Great explanation. Need to watch again.

@theexplorer9012 1 year ago

lol

@ahmadbelhaj1756 6 years ago

I did not get what a stride of 1 or 2 means. When we say stride 1, does it mean we multiply the kernel with each pixel and its kernel-sized neighborhood, so that the number of multiplications corresponds to the number of pixels, whereas with a stride of 2 the number of multiplications is halved? Can anyone correct me?

@MrRfvideos 6 years ago

As I understand it, stride is only about how you move your kernel over the image. With stride 1 you always move your kernel one pixel to the right or down in each step of the convolution, but with stride 2 you move your kernel 2 pixels to the right or down.
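A small sketch of what "moving the kernel" means along one axis, assuming a 39-pixel axis, a 3-wide kernel and no padding (the helper name `kernel_positions` is made up for this example):

```python
def kernel_positions(n, f, stride):
    """Top-left offsets along one axis where an f-wide kernel fits in n pixels."""
    return list(range(0, n - f + 1, stride))

pos_s1 = kernel_positions(39, 3, 1)
pos_s2 = kernel_positions(39, 3, 2)
print(len(pos_s1), pos_s1[:4])  # 37 [0, 1, 2, 3]
print(len(pos_s2), pos_s2[:4])  # 19 [0, 2, 4, 6]
```

Stride 2 roughly halves the number of positions per axis, so on a 2D image the number of kernel placements drops to about a quarter (19 x 19 instead of 37 x 37), not a half.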

@swfsql 1 year ago

6:00 you could also describe that last layer as a conv3d layer with one 7x7x40 filter
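One way to check this equivalence numerically is sketched below, assuming a 7x7x40 volume, random values, and no bias term. A filter that covers the whole volume has only one valid position, so it computes the same weighted sum as a single fully connected unit on the flattened 1960-vector.

```python
import random

random.seed(0)
H, W, C = 7, 7, 40
volume = [[[random.random() for _ in range(C)] for _ in range(W)] for _ in range(H)]
weights = [[[random.random() for _ in range(C)] for _ in range(W)] for _ in range(H)]

# View 1: one 7 x 7 x 40 filter applied at its single valid position.
as_filter = sum(volume[i][j][k] * weights[i][j][k]
                for i in range(H) for j in range(W) for k in range(C))

# View 2: flatten both to 1960-vectors and take a dot product (one dense unit).
flat_x = [v for row in volume for px in row for v in px]
flat_w = [v for row in weights for px in row for v in px]
as_dense = sum(x * w for x, w in zip(flat_x, flat_w))

print(len(flat_x))  # 1960
print(abs(as_filter - as_dense) < 1e-9)
```

The two views differ only in how the same 1960 weights are indexed.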

@aguinaldomulondemulonde7800 7 months ago

You need to use this formula: output of conv2 = (n + 2p - f)/s + 1

@johnsonli6467 2 years ago

May I ask, for the fully connected layer, how do you calculate the weights (with the input)?

@YazeedAlkosai 4 years ago

Thanks for the fruitful video. I have a 3D model and need to measure the features of length, width, and depth by implementing a CNN. Is it possible to use a CNN to get better measurements?

@asdkazmi 4 years ago

Why didn't you use a filter that reduces the output to 7x7x40 in just one layer? E.g. if we choose a 15x15x40 filter in the first layer with stride = 4 and padding = 0, then the output after one layer will be 7x7x40. I also noticed that in all the lectures the filter size is usually 3x3x(No. of Channels), or at most 5x5x(No. of Channels). Is it always recommended to use a small filter size?

@doggilovh 2 years ago

The network can learn more sophisticated functions with more layers.

@rubempacelli6815 11 months ago

AMAZING!

@devanshgoel3433 2 years ago

Thank you, sir.

@mp0157 5 years ago

Horizontal and vertical are examples of two filters. Does anybody know what other types of filters are used in the example to bring the filter count to 20? Thanks

@kishanlal676 5 years ago

Horizontal and vertical filters detect horizontal (0°) and vertical (90°) edges in the input image, whereas the remaining 18 filters can be used to detect edges at different angles, say 45°, 60°, etc. Correct me if I'm wrong.

@mp0157 5 years ago

@@kishanlal676 Thank you :) That was my first thought too! I was wondering if there was more to it than angular edge detection filters.

@mudassarahmad5729 1 year ago

Can you explain the mathematical operation of the layers? How can a 3*3 filter use a 39*39*3 tensor to give a 37*37*1 output?

@zawarkhan2245 8 months ago

(image + 2*padding - kernel)/stride + 1: (39 + 2*0 - 3)/1 + 1 = 37. If only 1 filter is used, the output is 37x37x1.
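The same formula can be applied layer by layer. The sketch below assumes the network from the video: a 39x39 input, no padding, filter sizes 3, 5 and 5 with 10, 20 and 40 filters. The strides of 1, 2 and 2 are an assumption taken from the lecture, not something stated in these comments, and `conv_output_size` is a hypothetical helper.

```python
def conv_output_size(n, f, p=0, s=1):
    """Output width/height of a conv layer: floor((n + 2p - f) / s) + 1."""
    return (n + 2 * p - f) // s + 1

n = 39
sizes = [n]
# (filter size, stride, number of filters); strides are assumed, see above.
for f, s, n_filters in [(3, 1, 10), (5, 2, 20), (5, 2, 40)]:
    n = conv_output_size(n, f, p=0, s=s)
    sizes.append(n)
    print(f"{n} x {n} x {n_filters}")

flattened = n * n * 40  # length of the vector fed to the final classifier
print(flattened)        # 1960
```

With these settings the spatial size goes 39, 37, 17, 7, which agrees with the 7x7x40 volume mentioned elsewhere in the comments.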

@gravisriders8124 2 years ago

Are all filters unique in every layer, or might some filters be the same in multiple layers (e.g. in layers 1 and 2)?

@doggilovh 2 years ago

The filters are learned by the network. It's unlikely that the network would learn the exact same filter in multiple layers.

@muhammadtahirmahmood6559 4 years ago

Is it just me, or did someone else notice this too? Maybe I am wrong. On layer 3, when the video is at 4:12, it says f3 and s2. Shouldn't it be s3? Please correct me.

@bharshavardhan2007 4 years ago

Yes, you are right. s and p always correspond to the current layer.

@luisstalinjaraobregon5044 6 years ago

I would like to know how to choose the number of filters

@jeetsensarma3033 6 years ago

Depending on your requirements.

@chrisliu3500 6 years ago

I would too

@elgs1980 6 years ago

Can a higher order AI help to choose the number of filters?

@razaali236 5 years ago

What do you mean by requirements? Can you give an example or explain it? @@jeetsensarma3033

@mohammedqaraad6539 5 years ago

@@razaali236 It means what you are looking for from your CNN. Filters are used to extract features from the input image, so if you are working on face recognition, you have one filter to detect eyes, another to detect the nose, and other filters to detect the remaining regions of the face. The filters (kernels) are the network's weights, and the number of filters is a CNN hyperparameter :)

@eyenzz93 6 years ago

What if n (height) is not equal to n (width)?

@rohankavari8612 3 years ago

It will work.

@mueez.mp4 4 years ago

Are we implicitly implying that the first filter has 3 channels and the second 10?

@VishalBalaji 3 years ago

Yes. In the first layer, all the filters have dimension f x f x 3. Each one convolves with the input to give an n x n output. If we have, let's say, 10 filters, then we have 10 such n x n outputs, so the output dimension is n x n x 10. The filters of the next layer then have 10 channels to match that output.

@MrRfvideos 6 years ago

I would suggest using a dark background, something like a dark theme. The completely white background is too much for the eyes and causes a headache if you watch 10 of the videos in a row! Thanks for the great videos.

@PRATIK1900 5 years ago

I don't know if it's the same for others, but I find that watching bright stuff is harder on the eyes if the room you are in is dark. Maybe watch in a well-lit room? It might be less stressful for your eyes.

@NikhilAngadBakshi 5 years ago

@@PRATIK1900 Or reduce the brightness of your screen :p