L14.3.2.1 ResNet Overview

  3,680 views

Sebastian Raschka

A day ago

Comments
@imanmokwena1593 · a year ago
Thanks!
@davefaulkner6302 · 11 months ago
You cannot put the second ReLU inside the skip-block, because then you cannot generate a residual (delta, difference) of the input, as ReLU only outputs non-negative values. What I don't understand is why a tanh function isn't used instead of the first ReLU; that would seem a more efficient way to get to a meaningful residual in the first stage.
@prateekpatel6082 · 9 months ago
Would love to see some more depth in the teaching. "Two activations in a row" seems too vague; why is that an issue?
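
To make the ReLU-placement point above concrete, here is a minimal PyTorch sketch (my own illustration, not code from the video): the residual branch of a basic block ends with a BatchNorm rather than a ReLU, so it can still produce negative deltas, and the second ReLU is applied only after the skip addition.

import torch
import torch.nn as nn

# Residual branch of a "basic" block: conv -> bn -> relu -> conv -> bn.
# Note there is NO ReLU at the end of the branch, so its output (the delta)
# can be negative as well as positive.
branch = nn.Sequential(
    nn.Conv2d(16, 16, kernel_size=3, padding=1),
    nn.BatchNorm2d(16),
    nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, padding=1),
    nn.BatchNorm2d(16),
)

x = torch.randn(1, 16, 8, 8)
out = torch.relu(x + branch(x))   # the second ReLU comes only after the addition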
@imanmokwena1593 · a year ago
My intuition may be completely incorrect here, but do we also not risk exploding gradients if our activations don't output zero? Would that not lead to some of the signals being redundant?
@davefaulkner6302 · 11 months ago
Exploding gradients are avoided in ResNet architectures because the skip connection provides a unit gradient path end-to-end that competes with the gradient multipliers of the convolution paths. This creates a forcing function stabilizing the gradient descent of the entire model.
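
A tiny autograd toy (my own illustration, with a made-up scalar weight standing in for the learned branch) showing why the identity skip keeps the gradient near one even when the branch contributes almost nothing:

import torch

x = torch.tensor(1.0, requires_grad=True)
w = torch.tensor(0.01)      # stands in for a tiny learned weight in the conv branch

y = w * x + x               # residual form: branch output plus the identity skip
y.backward()
print(x.grad)               # tensor(1.0100) -- the skip path contributes a constant 1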
@hoaxuan7074 · 3 years ago
ReLU blocks information from passing through about 50% of the time, destroying input information before it can be used and destroying information being generated for the output before it can arrive there. Hence the need to allow alternative paths for information to percolate, as you see with ResNet. If you used a two-sided parametric ReLU, the net should be able to organize information pathways itself. Leaky ReLU is somewhat of an alternative, I suppose.
@SebastianRaschka · 3 years ago
Yes. But even with alternatives like leaky ReLU you can have small gradients, and if you have many of them, they can compound through the chain rule during backpropagation.
@hoaxuan7074 · 3 years ago
@SebastianRaschka Yeah, that's true. I never have that problem because I always train nets with a simple evolution algorithm; I forget that other people use backpropagation exclusively. My observation is that evolution always smoothly reduces error over time without being temporarily trapped in local minima or even slowed by saddle points, which to me suggests there is a simple learning mode involving adjustments to the statistical responses of the neurons. Maybe that can be proved some day 🍸
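
For reference, these are the standard PyTorch versions of the alternatives mentioned in this thread: leaky ReLU with a fixed negative slope, and PReLU with a learned one. Neither is exactly the "two-sided" parametric variant described above; this is just a quick comparison sketch.

import torch
import torch.nn as nn

x = torch.linspace(-2, 2, 5)

relu  = nn.ReLU()            # zeroes out all negative inputs
leaky = nn.LeakyReLU(0.01)   # fixed small slope for negative inputs
prelu = nn.PReLU()           # negative slope is a learnable parameter (default init 0.25)

print(relu(x))   # tensor([0., 0., 0., 1., 2.])
print(leaky(x))  # tensor([-0.0200, -0.0100,  0.0000,  1.0000,  2.0000])
print(prelu(x))  # tensor([-0.5000, -0.2500,  0.0000,  1.0000,  2.0000], grad_fn=...)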
@tilkesh · 5 months ago
Thank you
@davefaulkner6302 · 11 months ago
For all the scribbling going on, there is a better way of saying it: ResNet blocks learn a residual, or delta, set of activation values with respect to the input. Within each ResNet block, rather than learning a new input-to-output transformation, you simply learn a function that is the delta (residual, difference) from the input. This seems like less to learn, because you can always fall back on the identity function (i.e., no learning), and it has the side effect of stabilizing gradients during optimization of the model.
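
A small sketch of that framing (my own toy code, not the ResNet-34 implementation from the follow-up video): the block computes H(x) = x + F(x), and if the branch F is initialized to output zeros, the block starts out as the exact identity mapping.

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    # Learns a delta F(x) and adds it back onto the input: H(x) = x + F(x).
    def __init__(self, channels):
        super().__init__()
        self.f = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )
        # Zero the last layer so the block starts out as the identity map.
        nn.init.zeros_(self.f[-1].weight)
        nn.init.zeros_(self.f[-1].bias)

    def forward(self, x):
        return x + self.f(x)

block = ResidualBlock(8)
x = torch.randn(1, 8, 4, 4)
print(torch.allclose(block(x), x))   # True: with F(x) = 0, "no learning" is the identity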