The simplest possible backpropagation example, worked through with the sigmoid activation function.
Also includes some brief comments on how gradients are calculated in actual implementations.
Edit: there is a slight error in the da/dw expression, as pointed out by Laurie Linnett. The video has da/dw = a(1-a), but it should be i*a(1-a). The activation is a = sigmoid(iw), so by the chain rule the sigmoid derivative a(1-a) must be multiplied by the derivative of the argument iw with respect to w, which is i.
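The corrected expression can be checked numerically. The sketch below is not taken from the video: the function names and the values of i and w are assumptions chosen only for illustration. It compares the analytic derivative i*a(1-a) against a central finite-difference estimate, and also shows that the uncorrected a(1-a) does not match when i is not 1.

```python
import math


def sigmoid(z):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-z))


def activation(i, w):
    """Single-neuron activation a = sigmoid(i * w), with input i and weight w."""
    return sigmoid(i * w)


def da_dw(i, w):
    """Corrected analytic derivative: da/dw = i * a * (1 - a)."""
    a = activation(i, w)
    return i * a * (1.0 - a)


def numeric_da_dw(i, w, eps=1e-6):
    """Central finite-difference estimate of da/dw."""
    return (activation(i, w + eps) - activation(i, w - eps)) / (2.0 * eps)


# Illustrative values only (assumed, not from the video).
i, w = 1.5, 0.8
a = activation(i, w)

print("corrected  i*a*(1-a):", da_dw(i, w))
print("uncorrected  a*(1-a):", a * (1.0 - a))
print("finite difference   :", numeric_da_dw(i, w))
```

With i = 1 the two expressions coincide, which is why the omission is easy to miss in an example where the input happens to be 1.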