A picture is a 3D tensor (width, height, color channels).

Number of weights = input_size × number_of_neurons = (100 × 100 × 3) × 1000 = 3 × 10^7 (a 100 × 100 RGB image fed into 1000 fully connected neurons)

Too many parameters → Overfitting

Conclusion: Image processing does not “require” fully connected networks

**9:20 We can identify something using its critical features**

One can tell that this is a bird by noticing the beak, the feet, and the wings.

→ The neural network doesn’t need the whole image to detect a feature

**11:00 Receptive Field**

For a single neuron in a convolutional layer, the local receptive field is the small region of the input image (or previous layer) that this neuron processes. For example, if the convolutional kernel size is 3×3, the local receptive field of a neuron in that layer is a 3×3 region of the input.

**15:44 Classic Receptive Field Arrangement Parameters:**

1. kernel_size: The height × width of the field is called the kernel size. (In PyTorch, passing `kernel_size=3` to `Conv2d` gives you a 3 × 3 kernel.)
2. stride: How many positions you move from the current field to reach the next field. Stride should be small: you want the fields to overlap, otherwise you risk missing patterns that fall across the boundary between two fields.
3. padding: A field may extend past the edge of the image, so the out-of-bounds area is filled with 0s.

**21:30 Same key features appearing in different receptive fields**

We can let neurons from different receptive fields share parameters. Two neurons with the same receptive field would not share params (they would always produce identical outputs).

Each shared set of params is called a “**filter**”

**27:00 Receptive Field + Parameter Sharing = Convolutional Layer**

Models that use convolutional layers are called “CNN”s

**29:00 CNN Explanation 2nd Version**

**34:00 Feature Map**

A filter is convolved with each receptive field and produces a map of scores (the feature map).

**38:30 Summary**

**40:00 Pooling**

Decreasing the resolution of an image does not change the object it shows.

Objective: Decrease the amount of computation required.

Max Pooling: Pick the largest member from each group of scores.
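
As a rough check on the numbers and ideas so far (the fully connected weight count, kernel size / stride / padding, and max pooling), here is a minimal pure-Python sketch. The filter count of 64, the helper names `conv_output_size` and `max_pool_2x2`, and the example scores are assumptions made for illustration; they do not come from the lecture.

```python
# Rough arithmetic for the notes above (pure Python, no framework needed).

# Fully connected layer: 100 x 100 RGB image fed into 1000 neurons.
height, width, channels = 100, 100, 3
num_neurons = 1000
fc_weights = height * width * channels * num_neurons
print(fc_weights)  # 30000000

# Convolutional layer: each filter covers kernel_size x kernel_size x channels.
# num_filters = 64 is an assumed value for illustration, not from the lecture.
kernel_size = 3
num_filters = 64
conv_weights = kernel_size * kernel_size * channels * num_filters
print(conv_weights)  # 1728


def conv_output_size(input_size, kernel_size, stride, padding):
    """Number of receptive-field positions along one axis."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


# Kernel 3, stride 1, zero padding 1: a 100-pixel axis keeps 100 positions.
print(conv_output_size(100, kernel_size=3, stride=1, padding=1))  # 100
# Stride 2 halves the number of positions.
print(conv_output_size(100, kernel_size=3, stride=2, padding=1))  # 50


def max_pool_2x2(feature_map):
    """2 x 2 max pooling with stride 2 on a list-of-lists feature map."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


# Made-up 4 x 4 map of filter scores.
scores = [
    [1, -1, 0, 2],
    [3, 0, -2, 1],
    [0, 5, 1, 1],
    [-1, 2, 4, 0],
]
pooled = max_pool_2x2(scores)
print(pooled)  # [[3, 2], [5, 4]]
```

Under these assumptions the convolutional layer needs 1,728 weights where the fully connected layer needs 30 million, and pooling shrinks the 4 × 4 score map to 2 × 2 while keeping the strongest response in each group.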
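
Disadvantage of pooling: Not suitable for intricate images, where fine detail matters.

Computing resources are now sufficient, so pooling is optional.

**54:00 CNN can’t deal with scaling and rotation**

If you enlarge or rotate an image, the CNN may fail to recognize it.

→ We need data augmentation (create new data by scaling and rotating existing images)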
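
Data augmentation can be sketched in a few lines. The following is a minimal pure-Python illustration, not the lecture's method: the helper names `rotate_90`, `scale_up_2x`, and `augment` are hypothetical, and the image is a tiny made-up grid of pixel values. Real pipelines usually apply random rotation angles and scale factors and then crop or resize back to the network's input size.

```python
def rotate_90(image):
    """Rotate a list-of-lists image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def scale_up_2x(image):
    """Enlarge an image 2x by repeating each pixel (nearest neighbour)."""
    enlarged = []
    for row in image:
        wide_row = [pixel for pixel in row for _ in range(2)]
        enlarged.append(wide_row)
        enlarged.append(list(wide_row))  # copy so the rows stay independent
    return enlarged


def augment(image):
    """Return the original image plus a rotated and an enlarged variant."""
    return [image, rotate_90(image), scale_up_2x(image)]


# Made-up 2 x 2 single-channel image.
image = [[1, 2],
         [3, 4]]

variants = augment(image)
print(variants[1])  # [[3, 1], [4, 2]]
print(variants[2])  # [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
```

Each variant keeps the label of the original image, so the training set contains examples of the same object at other orientations and sizes.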