@1:28 He meant Valid Convolution instead of Same Convolution. :-)
@omarkadhim3193 1 year ago
Started learning with Andrew in 2015 on Coursera. What a generous and gifted person. So grateful.
@TheSpec9010 months ago
We're blessed to live at the same time as people like him, and even more blessed to be able to watch his lectures for free on the internet.
@ruifengwang96905 years ago
I'm so lucky to learn all this with you. Thank you.
@abcdxx10596 years ago
I have heard he is running for president in 2020.
@GeneralKenobi694204 years ago
Andrew (Ya)ng
@amoonmohammed8041 1 year ago
Amazing explanation! Thank you, sir, for making things easy to understand.
@manuel7833 years ago
*CORRECTION* in "Simple Convolutional Network Example," 1:14: p=0 means valid convolutions, not same convolutions. Setting the padding to zero results in "valid convolutions."
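The correction can be checked with the usual output-size formula, floor((n + 2p - f)/s) + 1. Below is a minimal Python sketch, assuming square inputs and square filters; the helper names `conv_output_size` and `same_padding` are made up for illustration and do not come from the video.

```python
def conv_output_size(n, f, p=0, s=1):
    """Output height/width of a convolution: floor((n + 2p - f) / s) + 1."""
    return (n + 2 * p - f) // s + 1


def same_padding(f):
    """Padding that keeps the size unchanged for stride 1 and an odd filter size f."""
    return (f - 1) // 2


# "Valid" convolution: p = 0, so a 3x3 filter shrinks 39 down to 37.
valid_out = conv_output_size(39, 3, p=0, s=1)

# "Same" convolution: p = (f - 1) / 2, so the output stays at 39.
same_out = conv_output_size(39, 3, p=same_padding(3), s=1)

print(valid_out, same_out)  # 37 39
```

Since the first layer in the video goes from 39x39 to 37x37, it has to be a valid convolution.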
@muhammadalam24984 years ago
In the first layer we used 3x3x3 filters, 10 different ones. I know from the previous videos that the number of filter channels must match the number of input channels, so one filter results in a 37x37x1 output matrix and 10 filters result in 37x37x10. In the second layer, would our filter still be 5x5x3, or 5x5x10? If 10, what would those 10 represent? In the first layer the 3 represents the color channels.
@guyindisguise4 years ago
In the second layer it would be 5x5x10. The 10 represents the channels, same as before, except this time the channels do not contain information about colors (RGB) but instead contain information on whatever those 10 filters/kernels learned in the previous layer. So out of those 10 channels one could be horizontal edges, another could be vertical edges, another could be a diagonal gradient, etc. Each filter of layer 2 will pick up on different combinations of those 10 new channels. As a hypothetical example, if one filter of layer 2 had a 1 in the top left for the horizontal edge channel and also a 1 in the top left for the vertical edge channel (and every other weight set to 0), then it would likely detect top-left edges with something similar to a 90 degree angle. So to recap: in the first layer filters combine information on different colors, while later they combine information on other concepts that were previously learned.
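To make the shapes in this thread concrete, here is a small Python sketch that traces the network from the video. The layer settings (f=3, s=1, 10 filters; f=5, s=2, 20 filters; f=5, s=2, 40 filters; no padding) are the ones quoted in these comments; the variable names are illustrative assumptions.

```python
def conv_output_size(n, f, p=0, s=1):
    """Output height/width of a convolution: floor((n + 2p - f) / s) + 1."""
    return (n + 2 * p - f) // s + 1


# (filter size f, stride s, number of filters) for the three conv layers
layers = [(3, 1, 10), (5, 2, 20), (5, 2, 40)]

n, channels = 39, 3  # 39x39 RGB input
filter_shapes = []
activation_shapes = []
for f, s, num_filters in layers:
    # Each filter spans every channel of the volume it is applied to.
    filter_shapes.append((f, f, channels))
    n = conv_output_size(n, f, p=0, s=s)
    # The number of filters becomes the channel count of the next volume.
    channels = num_filters
    activation_shapes.append((n, n, channels))

print(filter_shapes)      # [(3, 3, 3), (5, 5, 10), (5, 5, 20)]
print(activation_shapes)  # [(37, 37, 10), (17, 17, 20), (7, 7, 40)]
```

The filter depth is never chosen by hand: it is always the channel count of the layer's input.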
@sokhibtukhtaev96935 years ago
So the full size of the filters used in the first conv layer is 3x3x3 (10 of them), the full size of the filters used in the second conv layer is 5x5x10 (20 of them), and the full size of the filters used in the third conv layer is 5x5x20 (40 of them)? That is a bit confusing, especially when you only pass the 2D filter size and the number of filters to Keras Conv2D. In that case it would be:
model.add(Conv2D(10, (3, 3), padding="same", activation="relu"))
model.add(Conv2D(20, (5, 5), padding="same", activation="relu"))
model.add(Conv2D(40, (5, 5), padding="same", activation="relu"))
And the last dimension of the filters would automatically be set from the previous number of filters (or just 3 for the first layer, because 3 is the RGB depth of the input image here). Anything to correct or add?
@jgc97005 years ago
I have the same question. Could someone please answer it? Thanks.
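As far as the shapes go, the reading in the question above appears to be right: Keras `Conv2D` stores its kernel as (height, width, input channels, filters) and infers the input-channel dimension from the incoming tensor. The sketch below mimics that bookkeeping in plain Python rather than calling Keras; the helper names `conv2d_kernel_shape` and `conv2d_param_count` are hypothetical.

```python
def conv2d_kernel_shape(kernel_size, in_channels, filters):
    """Kernel shape in the (height, width, in_channels, filters) layout."""
    kh, kw = kernel_size
    return (kh, kw, in_channels, filters)


def conv2d_param_count(kernel_size, in_channels, filters):
    """Weights plus one bias per filter."""
    kh, kw = kernel_size
    return (kh * kw * in_channels + 1) * filters


in_channels = 3  # RGB input
specs = [(10, (3, 3)), (20, (5, 5)), (40, (5, 5))]  # (filters, kernel_size)

kernel_shapes = []
param_counts = []
for filters, kernel_size in specs:
    kernel_shapes.append(conv2d_kernel_shape(kernel_size, in_channels, filters))
    param_counts.append(conv2d_param_count(kernel_size, in_channels, filters))
    in_channels = filters  # the next layer sees one channel per filter

print(kernel_shapes)  # [(3, 3, 3, 10), (5, 5, 10, 20), (5, 5, 20, 40)]
print(param_counts)   # [280, 5020, 20040]
```

One thing to note: the network in the video uses valid convolutions with strides 1, 2, 2, so `padding="same"` with the default stride would keep the 39x39 size instead of shrinking it to 7x7.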
@David-mr4cn 1 year ago
I would be really curious about the exact information flow if you have a 39x39x3 input (an input with 3 channels) and you use 10 filters in the conv layer. I would suppose that, in order to conserve all information, one would apply all 10 filters to the 3 channels independently, leaving you with 3*10 output channels of dimension 37x37. Since the output dimension is 37x37x10 and not 37x37x30, my question is: what happens there to merge the information of the channels from 30 to 10? Is it summed or averaged? I don't see it.
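In the standard convolution described in the "convolutions over volumes" video, the per-channel products are summed, not kept separate or averaged. A minimal pure-Python sketch, using a tiny 4x4x3 input instead of 39x39x3 and a hypothetical helper `conv_single_filter`:

```python
def conv_single_filter(image, kernel):
    """Valid, stride-1 convolution (cross-correlation) of an H x W x C image
    with one f x f x C filter. Returns a 2D (H-f+1) x (W-f+1) feature map."""
    h, w, c = len(image), len(image[0]), len(image[0][0])
    f = len(kernel)
    out = []
    for i in range(h - f + 1):
        row = []
        for j in range(w - f + 1):
            total = 0
            for di in range(f):
                for dj in range(f):
                    for ch in range(c):  # the sum runs over the channels too
                        total += image[i + di][j + dj][ch] * kernel[di][dj][ch]
            row.append(total)
        out.append(row)
    return out


# 4x4 image with 3 channels; every pixel is (1, 2, 3).
image = [[[1, 2, 3] for _ in range(4)] for _ in range(4)]
# One 3x3x3 filter of all ones.
kernel = [[[1, 1, 1] for _ in range(3)] for _ in range(3)]

feature_map = conv_single_filter(image, kernel)
print(feature_map)  # [[54, 54], [54, 54]]
```

Each filter therefore yields a single 2D map (here 9 pixels times 1+2+3 = 54 per position), and 10 filters stacked give 10 channels, not 30.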
@VipinKumar-mf5lv6 years ago
Thanks for making such information-rich videos. A good platform for learning deep neural network basics.
@ankitmishra91084 years ago
Can you please make a video on training a CNN from scratch? How do we get those filters by training?
@elgs19804 years ago
Andrew, can you please explain why the 3 channels suddenly become 1 when convolved to the next layer? Did you add them together?
@reggaebin4 years ago
I think he considered only 1 of the 3 channels. The picture has 3 RGB color channels.
@yashwanths26224 years ago
See the 'convolutions over volumes' video.
@yashwanths26224 years ago
@@reggaebin No.
@n2o_tv5133 years ago
The previous video has the explanation.
@valeriafonsecadiaz15273 years ago
Yes, they are added; see the two previous videos.
@ZANO4395 years ago
Best explanation, well done.
@sau0026 years ago
Beautifully explained. Thank you for this video. I understand the intuition behind a convolutional filter (or kernel), e.g. it could be performing edge detection. What is the intuition behind applying 2 consecutive layers of convolution filters, i.e. the output of the first convolution filter going into the second convolution filter?
@edubezerra356 years ago
In a face recognition setting, for example, the first filter could detect edges (as you said), while the next filter could detect more complex forms resulting from composing two or more edges (e.g., a nose, a mouth, an eye, etc.). Then another filter could detect entire faces based on the activations produced by the previous filter.
@dhruvb26896 years ago
+Eduardo Bezerra A question of mine is: we would like to believe that CNNs work this way, i.e. with each layer a more complicated feature is detected. However, when training an actual network, we start with randomized parameters and optimize over them. How do we know that we will always end up with a network in which the 1st layer detects edges, etc.?
@edubezerra356 years ago
@@dhruvb2689 Although we have sort of managed to emulate in our artificial neural nets this behaviour present in biological brains, we still do not know how it happens. See kzbin.info/www/bejne/d6rdgIiYoLqZaa8 for more details. This presentation is 10 years old, but I don't think the situation has changed much.
@yashwanths26224 years ago
@@dhruvb2689 We don't, which is why we have to tune the hyperparameters :).
@long38503 years ago
Best CNN explanation on YouTube!
@sandipansarkar92113 years ago
Great explanation. Need to watch again.
@theexplorer9012 1 year ago
lol
@ahmadbelhaj17566 years ago
I did not get what a stride of 1 or 2 means. When we say stride 1, does it mean we multiply the kernel with each pixel and its neighbors within the kernel size, so the number of multiplications corresponds to the number of pixels, whereas with a stride of 2 the number of multiplications is halved? Can anyone correct me?
@MrRfvideos6 years ago
As I understand it, stride is only about how you move the kernel across the image. With stride 1 you always move the kernel one pixel to the right or down at each step of the convolution, but with stride 2 you move the kernel 2 pixels to the right or down.
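A small Python sketch of that explanation, listing where the kernel lands along one axis. The 7-wide input and 3-wide kernel are arbitrary illustrative values, and `kernel_positions` is a made-up helper name.

```python
def kernel_positions(n, f, s):
    """Top-left offsets (along one axis) where an f-wide kernel fits on an
    n-wide input when it moves s pixels per step, with no padding."""
    return list(range(0, n - f + 1, s))


pos_s1 = kernel_positions(7, 3, 1)
pos_s2 = kernel_positions(7, 3, 2)

print(pos_s1)  # [0, 1, 2, 3, 4]
print(pos_s2)  # [0, 2, 4]

# The stride applies along both axes, so count the 2D positions as well.
print(len(pos_s1) ** 2, len(pos_s2) ** 2)  # 25 9
```

So in 2D a stride of 2 cuts the number of kernel placements (and multiplications) to roughly a quarter, not a half.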
@swfsql 1 year ago
6:00 you could also describe that last layer as a conv3d layer with one 7x7x40 filter
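The equivalence suggested here can be sketched in plain Python: flattening the 7x7x40 volume into 1960 numbers and taking a dot product gives the same value as applying one filter as large as the whole volume. The integer activations and weights below are made-up placeholders, and the bias and activation function of the final unit are left out.

```python
H, W, C = 7, 7, 40

# Deterministic integer stand-ins for activations and weights (illustrative only).
volume = [[[(i + j + c) % 5 for c in range(C)] for j in range(W)] for i in range(H)]
weights = [[[(i * j + c) % 3 for c in range(C)] for j in range(W)] for i in range(H)]

# Fully connected view: flatten both to 1960 numbers and take a dot product.
flat_volume = [v for row in volume for pixel in row for v in pixel]
flat_weights = [w for row in weights for pixel in row for w in pixel]
fc_output = sum(v * w for v, w in zip(flat_volume, flat_weights))

# Convolution view: a 7x7x40 filter fits on a 7x7x40 volume in exactly one position.
conv_output = sum(
    volume[i][j][c] * weights[i][j][c]
    for i in range(H)
    for j in range(W)
    for c in range(C)
)

print(len(flat_volume))          # 1960
print(fc_output == conv_output)  # True
```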
@aguinaldomulondemulonde78007 months ago
You need to use this formula: output of conv2 = (n + 2p - f)/s + 1
@johnsonli64672 years ago
May I ask, for the fully connected layer, how are the weights calculated (with the input)?
@YazeedAlkosai4 years ago
Thanks for the fruitful video. I have a 3D model and need to measure its length, width, and depth features by implementing a CNN. Is it possible to use a CNN to get better measurements?
@asdkazmi4 years ago
Why didn't you use a filter that reduces the output to 7x7x40 in just one layer? E.g., if we choose a 15x15x40 filter in the first layer with stride = 4 and padding = 0, then the output after one layer will be 7x7x40. Also, in all the lectures the filter size was usually 3x3x(no. of channels), or at most 5x5x(no. of channels). Is it always recommended to use a small filter size?
@doggilovh2 years ago
The network can learn more sophisticated functions with more layers.
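One way to compare the two designs is through their receptive field, i.e. how large an input patch each output value depends on. The sketch below uses the standard recurrence (the field grows by (f - 1) times the accumulated stride at each layer); the function name `receptive_field` is an illustrative assumption, and padding is taken to be zero.

```python
def receptive_field(layers):
    """layers: list of (filter size f, stride s) pairs, applied in order.
    Returns (receptive field size, effective stride) in input pixels."""
    rf, jump = 1, 1
    for f, s in layers:
        rf += (f - 1) * jump  # each layer widens the field by (f - 1) input steps
        jump *= s             # strides multiply through the stack
    return rf, jump


stack = receptive_field([(3, 1), (5, 2), (5, 2)])  # the three layers from the video
single = receptive_field([(15, 4)])                # the proposed single layer

print(stack, single)  # (15, 4) (15, 4)
```

Both options look at the same 15x15 input patch with an effective stride of 4, so the shapes agree. The difference is that the three-layer stack applies a nonlinearity between layers, which fits the reply above that more layers can learn more sophisticated functions.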
@rubempacelli681511 months ago
AMAZING!
@devanshgoel34332 years ago
Thank you, sir.
@mp01575 years ago
Horizontal and vertical are examples of two filters. Does anybody know what other types of filters are used in the example to bring the filter count to 20? Thanks.
@kishanlal6765 years ago
Horizontal and vertical filters detect horizontal (0°) and vertical (90°) edges in the input image, whereas the remaining 18 filters can be used to detect edges at different angles, say 45°, 60°, etc. Correct me if I'm wrong.
@mp01575 years ago
@@kishanlal676 Thank you :) That was my first thought too! I was wondering if there was more to it than angular edge detection filters.
@mudassarahmad5729 1 year ago
Can you explain the mathematical operation of the layers? How can a 3*3 filter take a 39*39*3 tensor and give a 37*37*1 output?
@zawarkhan22458 months ago
(image - kernel + 2*padding)/stride + 1: (39 - 3 + 2*0)/1 + 1 = 37. If only 1 filter is used, the output is 37x37x1.
@gravisriders81242 years ago
Are all filters unique across all layers, or might some filters be the same in multiple layers (e.g. in layers 1 and 2)?
@doggilovh2 years ago
The filters are learned by the network. It's unlikely that the network would learn the exact same filter in multiple layers.
@muhammadtahirmahmood65594 years ago
Is it just me, or has someone else noticed this too? Or maybe I am wrong. On layer 3, at 4:12 in the video, it's f3 and s2; shouldn't it be s3? Please correct me.
@bharshavardhan20074 years ago
Yes, you are right. s and p always correspond to the current layer.
@luisstalinjaraobregon50446 years ago
I would like to know how to choose the number of filters
@jeetsensarma30336 years ago
depending on your requirements.
@chrisliu35006 years ago
So would I.
@elgs19806 years ago
Can a higher order AI help to choose the number of filters?
@razaali2365 years ago
What do you mean by requirements? Can you give an example or explain it? @@jeetsensarma3033
@mohammedqaraad65395 years ago
@@razaali236 It means what you are looking for from your CNN. Filters are used to extract features from the input image, so if you are working on face recognition, you have one filter to detect eyes, another to detect the nose, and other filters to detect all the regions of the face. Filters or kernels are known as weights, or a kind of CNN hyper-parameter :)
@eyenzz936 years ago
what if the n(height) is not equal to the n(width)?
@rohankavari86123 years ago
it will work
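For a non-square input the same output-size formula is simply applied to the height and the width separately. A minimal sketch, assuming a square f x f filter with the same padding and stride on both axes; the 39x51 input is an arbitrary example and `conv_output_shape` is a hypothetical helper.

```python
def conv_output_shape(height, width, f, p=0, s=1):
    """Apply floor((n + 2p - f) / s) + 1 to each spatial dimension."""
    out_h = (height + 2 * p - f) // s + 1
    out_w = (width + 2 * p - f) // s + 1
    return out_h, out_w


shape = conv_output_shape(39, 51, f=3)
print(shape)  # (37, 49)
```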
@mueez.mp44 years ago
Are we implicitly implying that the first filter has 3 channels and the second 10?
@VishalBalaji3 years ago
No, all the filters mentioned have dimension of f x f x 3. This convolves with input to give n x n output. If we have, let's say, 10 filters, then we have 10 such n x n outputs. Hence, the dimension would be n x n x 10.
@MrRfvideos6 years ago
I would suggest to use a dark background, something like a dark theme. The complete white background is too much for the eyes and causes headache if you watch 10 of the videos in a row! Thanks for the great videos.
@PRATIK19005 years ago
I don't know if it's the same for others, but I find that watching bright stuff is harder on the eyes if the room you are in is dark. Maybe watch in a well-lit room? It might be less stressful for your eyes.
@NikhilAngadBakshi5 years ago
@@PRATIK1900 Or reduce the brightness of your screen :p
@moeininstructor 1 year ago
Too good 👍
@VipinKumar-mf5lv6 years ago
Sir, please share some documents or links where we can get more information on deep neural networks and machine learning.
@132_gehna_anand62 years ago
Please explain the part starting at 4:37.
@fire_nakamura4 months ago
Here to learn English
@VipinKumar-mf5lv6 years ago
We are waiting to apply the theory by coding in Python.