Why Non-linear Activation Functions (C1W3L07)

91,853 views

DeepLearningAI

7 years ago

Take the Deep Learning Specialization: bit.ly/2IcuTOr
Check out all our courses: www.deeplearning.ai
Subscribe to The Batch, our weekly newsletter: www.deeplearning.ai/thebatch
Follow us:
Twitter: / deeplearningai_
Facebook: / deeplearninghq
Linkedin: / deeplearningai

Comments: 48
@tonyflow6244 · 1 year ago
As a layman clicking buttons to try to get a better understanding of large language models, I thought I was making some progress. Then I watched this video, and now I think I should go back to primary school 😢
@lets-see-where-it-is · 29 days ago
Have you been watching the previous videos in the playlist? If so, I'm very surprised this video was the one you found challenging. It feels like a breather after the last few videos.
@anirudhsriram2125 · 3 years ago
These lectures deserve to be recognized as the bible of machine learning
@moeinhasani8718 · 5 years ago
It was like I was blind about linear activation functions, but now I'm gifted with vision :)
@gpietra · 2 years ago
*computer vision
@darshild5853 · 5 years ago
Prof Ng is the man!
@mario1ua · 9 months ago
I needed this, thank you!
@sandipansarkar9211 · 3 years ago
nice explanation
@ysun5125 · 3 years ago
superb
@TheThunderSpirit · 3 years ago
I've got a question: does the use of non-linear activations increase model capacity?
@imdb6942 · 3 years ago
Exactly my thoughts. I came here to gain insight into how non-linear functions like ReLU drastically change the usefulness of hidden layers. Instead, it just reiterates the common knowledge that linear activations add no value when used in hidden layers.
@almoni127 · 3 years ago
Definitely yes. Without non-linearity you can only express linear functions. With non-linearities such as ReLU and sigmoid you can approximate any continuous function on a compact set (search "Universal approximation theorem" for more information).
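
To make the universal-approximation point concrete, here is a minimal NumPy sketch (hand-built weights, not course code) in which a single hidden layer of shifted ReLU units reproduces the piecewise-linear interpolant of sin(x) on [0, 2π]:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

# Hand-built one-hidden-layer ReLU network that reproduces the piecewise-linear
# interpolant of sin(x) between 20 knot points on [0, 2*pi].
knots = np.linspace(0.0, 2 * np.pi, 20)
targets = np.sin(knots)

segment_slopes = np.diff(targets) / np.diff(knots)
# Each hidden unit relu(x - knot_j) adds a change of slope at its knot.
weights = np.concatenate(([segment_slopes[0]], np.diff(segment_slopes)))
bias = targets[0]

def net(x):
    # output(x) = bias + sum_j weights[j] * relu(x - knots[j])
    return bias + relu(x[:, None] - knots[:-1]) @ weights

x = np.linspace(0.0, 2 * np.pi, 1000)
print("max abs error:", np.max(np.abs(net(x) - np.sin(x))))  # on the order of 0.01
```

Each hidden unit changes the output's slope at its knot, so with enough knots the piecewise-linear output can be pushed arbitrarily close to any continuous target on the interval.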
@TheThunderSpirit · 3 years ago
@@almoni127 Sure, but look at that statement and tell me whether it is always true, even for a single-layer network. How can we claim that model capacity has increased after using a non-linear activation instead of a linear one, since it is not quantifiable? By how much has the capacity increased?
@almoni127 · 3 years ago
@@TheThunderSpirit A good analogue is Boolean circuits. With no hidden layers you don't have much expressiveness. Already with one hidden layer you have universality, although the required hidden layer size might be too large. With arbitrarily deep networks you can approximate all polynomial computations.
@imdb6942 · 3 years ago
@@almoni127 Thank you
@SantoshGupta-jn1wn · 6 years ago
Great video!!! I am still confused about why ReLU works when its properties are quite linear. I know it's a piecewise-linear function and therefore does not meet the mathematical definition of a linear function, but by using ReLU the output is still just a linear combination. Perhaps some neurons don't 'contribute', but the output is still the result of a linear combination of numbers.
@dengpan8086 · 6 years ago
Santosh Gupta Hi, in my understanding the aim of a neural network is to approximate a function that usually cannot be represented by a closed-form expression. A linear activation function causes the final result to be linear, which is useless, as the professor explained. ReLU avoids this problem: the linear combination of several outputs activated by ReLU is equivalent to a piecewise function, and although each piece is linear, the whole can be viewed as an approximation of a more complex function. It is just like what we do in computer graphics: we do not draw a curve directly; instead, we draw many short straight lines to simulate one. Hope this helps.
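
The collapse-to-linear point can be checked numerically. Below is a small sketch with made-up weight matrices (W1, W2, b1, b2 are arbitrary, not from the lecture): with an identity activation the two layers reduce to one equivalent linear layer, while inserting ReLU breaks that equivalence:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)
x = rng.normal(size=3)

# Identity ("linear") activation: the two layers collapse into one linear layer.
two_layer_linear = W2 @ (W1 @ x + b1) + b2
W_eq, b_eq = W2 @ W1, W2 @ b1 + b2
print(np.allclose(two_layer_linear, W_eq @ x + b_eq))   # True

# ReLU between the layers: the collapsed layer no longer reproduces the output,
# because the set of active hidden units changes with x.
two_layer_relu = W2 @ relu(W1 @ x + b1) + b2
print(np.allclose(two_layer_relu, W_eq @ x + b_eq))     # False for typical x
```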
@SantoshGupta-jn1wn · 6 years ago
Thanks!!!
@dpacmanh · 6 years ago
For a function to be linear, its slope must be constant throughout. Since ReLU has a kink at zero, it is a non-linear function.
@jchhjchh · 6 years ago
Hi, I am confused. Does ReLU kill the neuron only during the forward pass, or also during the backward pass?
@dpacmanh · 6 years ago
The ReLU function itself is applied during forward propagation. During backward propagation you use its derivative (1 for positive inputs, 0 otherwise) when computing the gradients that update the weights and biases, so units that were zeroed in the forward pass also pass no gradient backward.
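
A minimal sketch of that flow, assuming a hand-rolled forward/backward pair (the function names are illustrative, not from any framework): the mask recorded in the forward pass is reused in the backward pass, so units zeroed by ReLU also pass no gradient:

```python
import numpy as np

def relu_forward(z):
    mask = z > 0                  # remember which units are active
    return z * mask, mask

def relu_backward(upstream_grad, mask):
    # dReLU/dz is 1 where z > 0 and 0 where z <= 0,
    # so zeroed units also block the gradient.
    return upstream_grad * mask

z = np.array([-2.0, -0.5, 0.3, 1.7])
a, mask = relu_forward(z)
print(a)                                      # [0.  0.  0.3 1.7]
print(relu_backward(np.ones_like(z), mask))   # [0. 0. 1. 1.]
```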
@chitralalawat8106 · 5 years ago
From the videos above, I got the impression that ReLU is a linear function and the rest are non-linear. But how can we consider the sigmoid function binary? A binary function always gives an output of 0 or 1, but the sigmoid varies between negative infinity and positive infinity and touches the y-axis at 0.5?
@Mats-Hansen · 5 years ago
ReLU is NOT a linear function, just like the sigmoid function or tanh, for example. Also, the sigmoid function does not vary between -inf and +inf; its output is in the range (0, 1).
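
For reference, the sigmoid is σ(z) = 1 / (1 + e^(−z)); a throwaway check (not course code) confirms that while the input can range over all reals, the output stays strictly inside (0, 1):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for z in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(z, sigmoid(z))   # every output lies strictly between 0 and 1; sigmoid(0) = 0.5
```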
@saurabhshubham4448 · 5 years ago
@@Mats-Hansen How can you say ReLU is a non-linear function? It's a combination of two linear functions.
@Mats-Hansen · 5 years ago
@@saurabhshubham4448 Yes it is. But a combination of two linear functions doesn't need to be linear.
@saurabhshubham4448 · 5 years ago
@@Mats-Hansen But whenever you apply it to an input, the output is still linear, so ReLU doesn't help in adding non-linearity to the model.
@Mats-Hansen · 5 years ago
@@saurabhshubham4448 The output is linear on each piece, yes, but as far as I understand you lose the linear dependencies between the weights. Let's take two weights w1 = -0.2 and w2 = 0.5, and a linear function, say f(x) = 3x + 0.2. Then f(w1) = -0.4 and f(w2) = 1.7. This function preserves the difference between the two weights up to a constant factor: w2 - w1 = 0.7, and f(w2) - f(w1) = 3 * (w2 - w1) = 2.1. You will always have this with linear functions, but with a function like ReLU you will not (only interesting, of course, if at least one of the weights is negative). Maybe this math is just hand-waving, but I think there is a point here somewhere: in a sense you cannot write the old weights as a 'linear combination' of the new weights.
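
The arithmetic above checks out, and the contrast with ReLU can be reproduced with the same numbers (a tiny illustrative snippet, nothing more): the affine map scales the gap w2 − w1 by its slope, whereas ReLU preserves no fixed scaling:

```python
w1, w2 = -0.2, 0.5
f = lambda x: 3 * x + 0.2            # the affine map from the comment above
relu = lambda x: max(0.0, x)

print(f(w2) - f(w1), 3 * (w2 - w1))  # both ≈ 2.1: the gap is scaled by the slope 3
print(relu(w2) - relu(w1), w2 - w1)  # 0.5 vs 0.7: no fixed scaling factor survives
```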
@ttxiton9622 · 5 years ago
It's so ridiculous that the speaker of this video is Chinese, yet we only got Korean subtitles!!
@yangli8575 · 5 years ago
You know you can actually choose English subtitles in the settings...
@user-bz7ki7dl1r · 4 years ago
And hopefully you could add Chinese subtitles to the video.
@lets-see-where-it-is · 29 days ago
h
@Charles-rn3ke · 6 years ago
Grandpa telling bedtime story.........
@saanvisharma2081 · 5 years ago
Then get out of here. SOB!!!
@chitralalawat8106 · 5 years ago
Yeps.. ✌️
@netional5154 · 5 years ago
Some of us very much appreciate his way of teaching.
@techanddroid1164 · 2 years ago
This person has a lot of knowledge; he picks one thing and then starts explaining another and another. This disease is called 'explainailibalible', sorry.
@aqwkpfdhtla9018 · 6 years ago
Your English is difficult to understand. I keep going back to figure out what you mean by some words.
@abdelrahmanabdallah9530 · 6 years ago
lol, he's a doctor at Stanford
@Xirzhole · 5 years ago
Maybe learning the basics such as ReLU or sigmoid will help? I don't think those are everyday English words.
@yangli8575 · 5 years ago
Do you know who this guy is? You should cherish the opportunity that such a great talent is teaching you online. Also, this is the first time I've seen someone pick on his English. Maybe it's time for you to improve your listening skills...
@saanvisharma2081 · 5 years ago
You're such a sick person. Neeku burra dobbindhi Ra houla sale gaa (Telugu)
@umairgillani699 · 5 years ago
his English sucks!