I've been wanting a video like this for a long time; you nailed the balance between showing and explaining. I subscribed and liked the video.
@far1din2 жыл бұрын
Thank you my friend. Glad you got value out of it! 🔥
@swedenontwowheels Жыл бұрын
This is an excellent video explaining how convolutional layers work at a glance! Fantastically well done!
@fdshdsfdsqq Жыл бұрын
Man, you're amazing! This is a perfect visualization. Thank you, and may God give you health!
@far1din Жыл бұрын
Haha, thank you for the kind words my friend :D
@AnaS-lp4mf9 ай бұрын
Literally thank you so much for your video. You don’t know how much relief you brought me in understanding it! Your voice is so clear and your explanations were just spot on and easy to understand. I’ve spent like weeks trying to figure out how the pieces go together and you have completed the puzzle of confusion in my mind. Thank you so much !!!
@teofilciobanu Жыл бұрын
Hi there! Pretty cool video :D One suggestion I would have is to show how the output looks if we change the input. For example, at 2:50 and 4:30, a 2x10 and 4x10 grid respectively, where the ith column is the layer output for the ith digit.
@ArashSoleimanifar Жыл бұрын
You are fantastic! I look forward to watching your next videos. Please produce mooore videos!
@far1din Жыл бұрын
Thank you brother. Working on it! 😃🚀
@joaogustavo62688 ай бұрын
This is art! Never delete this video bro!
@sugurufoundation41277 ай бұрын
Thank you very much for this explanation. I have been struggling to understand how exactly convolutions work, and you explained it perfectly. Thanks, and keep up the good work, bro.
@Democracy_Manifest Жыл бұрын
Excellent video. We needed this, thank you. Keep it up!
@dndr4818 Жыл бұрын
One of the best videos explaining this topic. What a great job, and thank you!
@programmingwithchatti40632 жыл бұрын
Why is the background music soo soothing xD I want to focus, not get lost in thoughts xD
@far1din2 жыл бұрын
Haha, next time my friend. I will turn down the music 🚀
@r0cketRacoon6 ай бұрын
WWOWW! I needed a video visualizing a CNN layer by layer like this, and I came across your channel. The best CNN visualization for me right now. Thank you!
@SterileNeutrino4 ай бұрын
Nice. I remember working on digit recognition using handcoded analysis of pixel runs a long time ago. It never worked properly 😂 And it was computationally intensive.
@maxedin79102 жыл бұрын
Amazing visualization! I wish for more short vids like this!
@boramin30777 ай бұрын
Amazing work! Please continue to make more videos on other ML topics. I find your videos really helpful for understanding the concepts.
@RAHUL11819952 жыл бұрын
This was really helpful... Thank you so much for the visualization... Keep up the good work... Looking forward to your future uploads.
@khayyamnaeem56012 жыл бұрын
This is my favourite channel. I especially like it when you explain everything so nicely. I wish you a lot of success with the channel and a happy life.
@far1din2 жыл бұрын
That’s great to hear. I wish you all the best! 😃
@mehdismaeili3743 Жыл бұрын
Excellent. 3 videos and 1.89k subscribers!!! It's amazing, and you are amazing.
@far1din Жыл бұрын
Thank you my friend! 💯
@AutobahnBuilder Жыл бұрын
I gotta say, your animation is great and the way you explain it is really good. It feels like 3Blue1Brown and is just as well explained. Keep up the good work!
@far1din Жыл бұрын
Thank you my friend 🔥
@KarolBura11 ай бұрын
thank you from Silicon Valley Startup
@Sowmya_codes9 ай бұрын
Hello user-zn1hm6tk1o, do let me know if you're in need of an intern!
@alfashow24392 жыл бұрын
Great video! You got me interested!
@_aitalks_2 жыл бұрын
Nice walkthrough!
@far1din2 жыл бұрын
I am glad you enjoyed the content!
@zafer-x1v28 күн бұрын
Wonderful explanation. I appreciate it.
@hackeymabel8296 Жыл бұрын
Thank you, really needed this.
@websciencenl79942 жыл бұрын
Great video, Thank you!
@lapusikrid4752 жыл бұрын
Great video! Such a good and accessible explanation! Thank you, it helped me a lot!)
@locngoxuan13292 жыл бұрын
It's really informative for me. Thanks bro!!
@far1din2 жыл бұрын
Thank you brother. I will try to post a more in depth video soon!
@sivaladeepak1538 Жыл бұрын
Great Viz!!!
@fiya..2 жыл бұрын
Nice video! Thank you!
@jaberfarahani66455 ай бұрын
the best channel❤
@suthanraja16575 ай бұрын
Thank you so much!
@ForbiddenPrime4 ай бұрын
Thank you for the source code. This will help me create some content for my syllabus. With love.
@far1din3 ай бұрын
Glad it was helpful, although it's not the most ideal code 😂
@randomtypebeats73482 жыл бұрын
keep em coming brodie!
@far1din2 жыл бұрын
For sure! :)
@IvoVacca2 жыл бұрын
Very cool.
@emadilabid7 ай бұрын
Bravo dude!
@nvmvel2 жыл бұрын
Wow. Such high-quality content with great effort. You deserve a million subscribers. Hope you'll get popular soon and make more videos 🙌
@far1din2 жыл бұрын
Haha, stay tuned! 🚀🔥
@KhanhTran-qp5pm Жыл бұрын
Love your video.
@emmanuelsheshi9616 ай бұрын
nice work sir
@mattialoszach64059 ай бұрын
Awesome video🫶
@EpromLee11 ай бұрын
thank u!!
@Dreadwinner2 жыл бұрын
🧡🖤
@richardfoldesi39802 жыл бұрын
best
@far1din2 жыл бұрын
🚀
@Leuel48Fan Жыл бұрын
4:24 It's kind of wild how, after this layer, there's no resemblance or perceivable logic to our eyes in how it's characterizing the digit. Can you maybe show 10 examples of each of the 10 different digits at this step (just the output)? I wonder if they will look similar in pattern to each other right before the fully connected final layer.
@far1din Жыл бұрын
It is very strange how all of this just «works». I will try to do that in one of my next videos! 😊
@ratfuk93402 жыл бұрын
4:10 How are the outputs of the two convolutions for each filter combined to get one feature map per filter?
@far1din2 жыл бұрын
So, you multiply the values from one filter with the corresponding input layer. Then you add the outputs together plus the bias. Then you pass them through an activation function (in this case the sigmoid). After this you move the filter one step to the right (since the stride is one) and do the same thing until everything is covered. This is hard to explain in text. I will do my best to get out an in-depth video by this weekend where everything will be explained in detail! 💯
@ratfuk93402 жыл бұрын
@@far1din I think I get it. Anyway thanks for making these videos, they help a lot.
@far1din2 жыл бұрын
@@ratfuk9340 kzbin.info/www/bejne/oHXIZnV3qLllY7s
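A rough NumPy sketch of the multiply, sum, add bias, activate, then slide procedure described in the reply above; the function names and the random example data are illustrative and not taken from the video's source code:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conv2d_single(image, kernel, bias, stride=1):
    """Slide one kernel over a 2-D image: multiply, sum, add the bias, then activate."""
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = np.sum(patch * kernel) + bias
    return sigmoid(out)

rng = np.random.default_rng(0)
image = rng.random((28, 28))          # stand-in for one 28x28 digit
kernel = rng.standard_normal((5, 5))  # one 5x5 filter
print(conv2d_single(image, kernel, bias=0.1).shape)  # (24, 24)
```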
@ahamedismail1630 Жыл бұрын
So we use filters to catch the features, and max pooling to capture the most active feature? Thus if we use max pooling with larger dimensions, we lose the other, less significant features, which reduces model complexity but results in a loss of information?
@far1din Жыл бұрын
That is the idea, and it seems to work in practice. However, people use average pooling and min pooling as well. It all comes down to what yields the best result in the end! :D A lot of this is just trial and error until the best performance is achieved.
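For readers who want to see the pooling idea in code, a minimal NumPy sketch (illustrative names, not the video's source code) showing max pooling and, for comparison, average pooling:

```python
import numpy as np

def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Downsample a 2-D feature map by taking the max (or mean) of each window."""
    h = (feature_map.shape[0] - size) // stride + 1
    w = (feature_map.shape[1] - size) // stride + 1
    out = np.zeros((h, w))
    reduce_fn = np.max if mode == "max" else np.mean
    for i in range(h):
        for j in range(w):
            window = feature_map[i * stride:i * stride + size,
                                 j * stride:j * stride + size]
            out[i, j] = reduce_fn(window)
    return out

fm = np.arange(16, dtype=float).reshape(4, 4)
print(pool2d(fm, mode="max"))  # keeps only the strongest activation per 2x2 window
print(pool2d(fm, mode="avg"))  # average pooling retains more of the weaker signal
```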
@abdullahhashmi6542 жыл бұрын
For the convolution, you have two filters through which you pass the original image, and you get two images which have been passed through the filters. But for the second convolution, you have 4 filters for each of the two images, so a total of 8 filters for the two images: 4 for one and 4 for the other. But in the visualization, unlike the first convolution, you run both filters through both images in parallel and then combine them. So instead of getting 8 images from the second convolution, we get 4. Why?
@far1din2 жыл бұрын
Great question my friend. This is one of the reasons why I felt I had to make this animation. First of all, in the second convolution, we have one “image” consisting of two channels. You can think of an RGB (colored) image. These images consist of three channels that you stack together. See Reference 1 for an image. This basically implies that if you have a 28 x 28 pixel RGB image, when you extract the values for the three channels (Red, Green and Blue) you will have three matrices. The dimensions are therefore actually 28 x 28 x 3, where 3 is the number of channels. Now, when you build a convolutional neural network, you choose how many filters you want in each layer. If you have X channels in the inputted image, each of the filters will have X channels. Example: the input is 28 x 28 x 3 and you choose five filters with dimensions 5 x 5; you will technically have five filters that are 5 x 5 x 3. Let’s say you run the convolution. Your outputted activation layer will have the dimensions 24 x 24 x 5, where 5 is the number of channels. This implies that if you choose to have a convolutional layer next with two filters with the dimensions 7 x 7, these will actually have the dimensions 7 x 7 x 5. I hope this made sense, and I know it can be a bit confusing. I highly recommend you watch Andrew Ng’s video on convolutions over volumes (Reference 2), as he explains the concept very well. Reference 1: media.geeksforgeeks.org/wp-content/uploads/Pixel.jpg Reference 2: kzbin.info/www/bejne/gYWlkIJ8pKaEmcU
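A small NumPy sketch of the shape bookkeeping described in the reply above, i.e. convolution over volumes, where each filter carries as many channels as its input; the names and data are illustrative, not the video's code:

```python
import numpy as np

def conv_over_volume(image, filters, biases):
    """image: (H, W, C). filters: (num_filters, kh, kw, C). Output: (H-kh+1, W-kw+1, num_filters)."""
    n_f, kh, kw, _ = filters.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w, n_f))
    for f in range(n_f):
        for i in range(out_h):
            for j in range(out_w):
                patch = image[i:i + kh, j:j + kw, :]                   # 5x5x3 patch
                out[i, j, f] = np.sum(patch * filters[f]) + biases[f]  # all channels summed into one value
    return out

rng = np.random.default_rng(0)
rgb = rng.random((28, 28, 3))                # 3-channel input image
filters = rng.standard_normal((5, 5, 5, 3))  # five filters, each technically 5x5x3
print(conv_over_volume(rgb, filters, np.zeros(5)).shape)  # (24, 24, 5)
```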
@nithyaprakash5812 Жыл бұрын
Great job... You got a new subscriber 🎉😊
@ektor12129 күн бұрын
But how does the network manage to recognize the original number if, after the final max pooling and the final flattened layer, the reference number can no longer be made out?
@vivek65522 Жыл бұрын
Great video! A quick question: do the kernel values (in every conv. layer) change during the training of a CNN? Does the kernel change itself so as to only extract exactly the shapes and patterns required for classification?
@kneehaw8093 Жыл бұрын
Convolutional kernels will train their weights such that they “learn” whatever feature detection is most desirable. I believe that is a “yes” to your question.
@far1din Жыл бұрын
1. Do the kernel values change? It depends. Normally they do, but you can also freeze the network and train, for example, only the last 3 layers. This is common for transfer learning: kzbin.info/www/bejne/fIKwYmZ-oKqZeM0 2. It definitely looks like that is the case if you look at a network post-training. Why it extracts certain shapes and patterns is still unknown to me.
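A hedged sketch of point 1 above, freezing a network and training only the last layer, written with PyTorch; the toy model and its layer sizes are made up for illustration and are not the video's code:

```python
import torch
import torch.nn as nn

# A toy CNN standing in for a pretrained network.
model = nn.Sequential(
    nn.Conv2d(1, 8, 5), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(8 * 12 * 12, 10),
)

# Freeze every parameter, then unfreeze only the final linear layer,
# so training updates just those weights (transfer-learning style).
for p in model.parameters():
    p.requires_grad = False
for p in model[-1].parameters():
    p.requires_grad = True

optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=0.01
)
```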
@leesohaeng Жыл бұрын
Hi, how do you make the 5*5 kernel? Just random?
@far1din Жыл бұрын
It is initialized randomly if that is what you were asking! :)
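To make "initialized randomly" concrete, a tiny NumPy sketch of one common scheme (He initialization); this is illustrative and not necessarily what the video's code does:

```python
import numpy as np

rng = np.random.default_rng(42)
fan_in = 5 * 5 * 1  # kernel height * kernel width * input channels
# He initialization: zero-mean Gaussian scaled by sqrt(2 / fan_in),
# a common choice for layers followed by ReLU.
kernel = rng.standard_normal((5, 5)) * np.sqrt(2.0 / fan_in)
print(kernel.shape, kernel.std())
```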
@triplemmm32 жыл бұрын
What do you use for making videos?
@far1din2 жыл бұрын
Manim! www.manim.community
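For anyone curious what a Manim (Community Edition) scene looks like, a minimal sketch; the scene name is arbitrary and this is not the video's actual animation code:

```python
from manim import Scene, Square, Create

class PixelGrid(Scene):
    def construct(self):
        # Draw a single square; real CNN animations compose many such mobjects.
        self.play(Create(Square()))

# Rendered from the command line, e.g.:  manim -pql this_file.py PixelGrid
```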
@Brandonator245 ай бұрын
I'm curious, why is the first convolution using ReLU and then later convolutions using sigmoid? Edit: Also, when convolving over the previous convolution + max-pooling output, we have two images; how are the convolutions from these two separate images combined? Is it just adding them together?
@far1din5 ай бұрын
Hey Brandon! 1. The ReLU and sigmoid are just serving as examples to showcase different activation functions. This video is just a «quick» visualization of the longer, in-depth version. 2. Not sure if I understood, but if you're referring to the filters, they are added. I go through the math behind this with visualizations in the in-depth video. I believe it should clarify your doubts! 😄
@Brandonator244 ай бұрын
@@far1din Will be checking that out, thanks!
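For reference, the two activation functions discussed in this thread, sketched in NumPy (illustrative only):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)        # used after the first convolution in the video

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))  # used after the later convolution

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(x), sigmoid(x))
```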
@very_old_dog Жыл бұрын
I didn't get why, in the 1st convolution, after passing a 28*28 image through a 5*5 filter, we ended up with an image of size 24*24. The stride is 1, hence the filter should march perfectly to the end of the image without bypassing certain pixels. I'm probably missing something, but what?
@far1din Жыл бұрын
Brother, for example, let's say you have an image with the size of 7 x 7 and a filter that is 5 x 5. You start at the left, where the left side of the filter touches the left side of the image. You slide it to the end, where the filter's **right** side touches the right side of the image. Since the filter is 5 x 5, you can only slide it two times to the right. Essentially, 5 «squares» of the image are already covered when you start, and there are only two more left. I highly recommend you draw this out and «scale down» the image to 7 x 7 or something for clarification. I'll also link a video that shows the process. Watch the animations in this video for clarification: kzbin.info/www/bejne/qoK8i5R6o8Seick
@very_old_dog Жыл бұрын
@@far1din Bro, I've got it !!! Thank You so much for a great clarification and a link. That helped a lot.
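The same sliding argument as a quick sketch, using the standard output-size formula (assuming no padding); the function name is illustrative:

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    # Standard formula: floor((N + 2P - F) / S) + 1
    return (input_size + 2 * padding - kernel_size) // stride + 1

print(conv_output_size(7, 5))   # 3  -> the 7x7 example: starting position plus two slides
print(conv_output_size(28, 5))  # 24 -> why a 28x28 image becomes 24x24
```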
@paedrufernando2351 Жыл бұрын
Why use a 2nd convolution with 8 filters?
@far1din Жыл бұрын
Hello my friend, the number of filters is typically chosen. A lot of deep learning is trial and error. I used 4 filters in the second convolution, and these filters had two channels; therefore, technically four filters. There is no underlying reason for this, as this video was just meant to visualize the process. I could have gone with 10 filters, but the video would have been impossible to render.
@ritwiksharma43592 жыл бұрын
Why are there 2 parts of each filter in layer 2? Is this just for the different color channels?
@far1din Жыл бұрын
Yes my friend. If there were four channels being convolved over, then each filter would have had four channels as well. The number of channels in the filter will mirror that of the input! 💯
@phanstudio Жыл бұрын
How do you create the visualizations?
@far1din Жыл бұрын
Manim! docs.manim.community/en/stable/
@figloalds Жыл бұрын
If you start from the last layer with a 1 on the 7 and 0 on the rest, then do the reverse process, what would the image look like?
@far1din Жыл бұрын
My friend, I do not understand. Could you explain in detail? :)
@figloalds Жыл бұрын
@@far1din I mean trying to reverse the process from a given output all the way back to a generated image and seeing a visualization of what the network "understands" a 7 to be, for example. I was watching some videos of people reversing AI outputs in this way for DeepDream and GPT, and they found very weird stuff, like very weird tokens in GPT that cause the model to completely break.
@Leuel48Fan Жыл бұрын
@@figloalds I believe I've seen this tried somewhere, and it looks nothing like a '7'. Basically it can't generate images; you'd just get random, gibberish, static-looking pixels turned on. I also think that if you input a smiley face or a letter, unless it was specifically trained for a "not a number" scenario, you may get an inconclusive output or a falsely confident guess at one number. Even though the math of the process checks out, nothing about this whole concept makes any intuitive sense, which is why I find it fascinating tbh lol
@far1din Жыл бұрын
@@Leuel48Fan Yes haha. I believe I have seen this done on a dumbbell. It was very noisy, but you could still see the outer line. However, I was unable to find it just now 🤷🏾♀️
@far1din Жыл бұрын
@@figloalds Ahaa, yes I know what you are talking about. This is a great idea. This would be what the computer perceives as the ideal «7» or ideal «1». I will try to make a video on that! 🚀
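The "ideal 7" idea discussed in this thread is usually called activation maximization: start from noise and run gradient ascent on the input so that one class score increases. A hedged PyTorch sketch with a toy, untrained network (a properly trained model would be needed to get a meaningful image; all names and sizes here are illustrative):

```python
import torch
import torch.nn as nn

# Toy stand-in for a trained digit classifier (untrained here, so the result is just noise).
model = nn.Sequential(
    nn.Conv2d(1, 8, 5), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(), nn.Linear(8 * 12 * 12, 10),
)
model.eval()

target_class = 7
image = torch.randn(1, 1, 28, 28, requires_grad=True)  # start from random noise
optimizer = torch.optim.Adam([image], lr=0.1)

for _ in range(200):
    optimizer.zero_grad()
    score = model(image)[0, target_class]
    # Gradient *ascent* on the class score (minimize its negative).
    (-score).backward()
    optimizer.step()

# `image` now shows what this network most strongly associates with a "7".
```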
@emotional_stuff2 жыл бұрын
Love it! Sub+
@gamercatsz5441 Жыл бұрын
Then you verify it's a 7, and the AI algorithm gets better. Does the machine learn what a 7 is, or does it learn how to get positive feedback from us? If we use advanced AI to ask advanced questions, do we actually know for certain it is not lying to us to get good feedback? Advanced AI like ChatGPT-4, which is rapidly advancing and already is much smarter than 99% of humans, is potentially a lying, manipulating, horrible thing. It's better to make 100% sure it is truthful and to fully understand its intentions before making it smarter and more powerful, training it with more data, more computers, etc…
@far1din Жыл бұрын
Hello, my friend. Here is my take on this topic. First of all, these kinds of AI models are mathematical models that learn to get the "correct" response. This is called supervised learning, where the model is trained on labelled data. It is hard to define whether or not a model is lying, as you have to define lying. Is it lying if the answers are correct? The concerns are valid, as models can converge towards outputting responses that are not correct but perceived as such. This is something researchers and users should be wary of. On the other hand, human intelligence and deep neural networks seem to take different paths. Even though deep neural networks are inspired by how the brain works and are capable of many of the "intelligent tasks" we throw at them, the learning process is not the same. Modern deep learning models use backpropagation, while humans don't (or at least, that is the perception for now). Although we come to the same answer, the consciousness factor might differ. Therefore, it is hard to tell if a model is lying, because lying is something I would like to think of as a conscious act. You may find it beneficial to watch an interview with Geoffrey Hinton, as he discusses some of the concerns you've raised: kzbin.info/www/bejne/p6HSg4JpbJ2kiLs
@justinnejbauer7 ай бұрын
@@far1din I would argue that humans do use backpropagation. When you make a mistake in life, how many times have you looked back and asked yourself what you would have done differently to minimize the negative effects?