Activation Functions (C1W3L06)

92,510 views

DeepLearningAI

Take the Deep Learning Specialization: bit.ly/32IxMzO
Check out all our courses: www.deeplearning.ai
Subscribe to The Batch, our weekly newsletter: www.deeplearning.ai/thebatch
Follow us:
Twitter: / deeplearningai_
Facebook: / deeplearninghq
LinkedIn: / deeplearningai

Comments: 27
@thepresistence5935 2 years ago
I always use Leaky ReLU; it gives good results when building models.
@Jaspinik 2 years ago
Can we make the parameter we apply in the ReLU function a learnable parameter, like the weights and biases?
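This idea exists as the parametric ReLU (PReLU), where the negative-side slope is trained along with the weights and biases. A minimal NumPy sketch of the idea; the learning rate and the dummy upstream gradient are illustrative, not from the course:

```python
import numpy as np

def prelu(z, a):
    """Parametric ReLU: identity for z > 0, learnable slope a for z <= 0."""
    return np.where(z > 0, z, a * z)

def prelu_grad_a(z):
    """Gradient of PReLU with respect to the slope parameter a."""
    return np.where(z > 0, 0.0, z)

# Toy update: treat a like any other parameter and apply gradient descent.
z = np.array([-2.0, -0.5, 1.0, 3.0])
a = 0.01                                  # initialised like Leaky ReLU's slope
upstream_grad = np.ones_like(z)           # pretend dL/d(activation) = 1
grad_a = np.sum(upstream_grad * prelu_grad_a(z))
a -= 0.1 * grad_a                         # same update rule as the weights and biases
print(prelu(z, a), a)
```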
@saanvisharma2081 5 years ago
At 9:01, shouldn't the horizontal axis of the tanh graph be labeled z, not x?
@abekang3623 6 years ago
The tanh definition at 9:01 is not correct. The denominator should be exp(z) + exp(-z), so the whole function should be tanh(z) = (exp(z) - exp(-z)) / (exp(z) + exp(-z)).
@nands4410 6 years ago
thanks Abe Kang
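For reference, a quick numerical check of the corrected formula against NumPy's built-in tanh (a small sketch, not part of the lecture material):

```python
import numpy as np

def tanh_explicit(z):
    # tanh(z) = (exp(z) - exp(-z)) / (exp(z) + exp(-z))
    return (np.exp(z) - np.exp(-z)) / (np.exp(z) + np.exp(-z))

z = np.linspace(-3, 3, 7)
print(np.allclose(tanh_explicit(z), np.tanh(z)))  # True
```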
@sandipansarkar9211 3 years ago
very very clear explanation
@SiddharthDesai1 3 months ago
At 4:12, on the downsides of sigmoid and tanh: how does gradient descent slow down when the slope of sigmoid or tanh becomes very small? I understand that gradient descent takes smaller steps on the graph of J(w) vs. w as it approaches the minimum, but what does the slope of these activation functions have to do with it?
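The link is backpropagation: the gradient of the cost with respect to a layer's weights contains the activation's slope g'(z) as a factor, so when sigmoid or tanh saturate and that slope is nearly zero, the whole gradient, and hence each gradient-descent step, becomes tiny. A minimal NumPy illustration of how small the sigmoid's slope gets (an illustrative sketch, not from the lecture):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)   # this slope g'(z) is a factor in every backprop gradient

# Once |z| is large the slope is nearly zero, so dJ/dW (which contains g'(z))
# is nearly zero too, and each gradient-descent step barely moves the weights.
for z in [0.0, 2.0, 5.0, 10.0]:
    print(z, sigmoid_prime(z))   # 0.25, ~0.10, ~0.0066, ~0.000045
```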
@pivasmilos 4 years ago
Can't we reassign the 0 labels to -1, and then use tanh() for the output layer?
@Jaspinik 2 years ago
Most of it is convention: 0 represents "off" and 1 represents "on". A more mathematical explanation: 0 has a diminishing effect; if a number is multiplied by an output that tends to zero, the result becomes negligible in further calculations*. And -1 has a reversing effect: a number multiplied by -1 flips its sign, and can thus have a major effect later. *That only matters if there actually are calculations ahead (e.g., other models depending on the results of the current one).
@sreemantokesh3999 5 years ago
Do we need an activation function in the case of regression problems, where the output is a continuous value?
@siddhartha8886 5 years ago
I think not.
@ameyatathavadkar2743 5 years ago
No. For continuous-value prediction you can use linear regression, which doesn't actually require a neural net, so there's no need for an activation function.
@legacies9041 4 years ago
In the case of regression, you will still use activation functions in the hidden layers; only the output layer has no activation function.
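For what it's worth, a minimal sketch of that setup, assuming a single-hidden-layer regression network with made-up sizes: ReLU in the hidden layer, identity (no activation) at the output.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 1)) * 0.01, np.zeros((4, 1))   # hidden layer: 4 units
W2, b2 = rng.normal(size=(1, 4)) * 0.01, np.zeros((1, 1))   # output layer: 1 unit

def predict(x):
    z1 = W1 @ x + b1
    a1 = np.maximum(0, z1)   # ReLU in the hidden layer
    z2 = W2 @ a1 + b2
    return z2                # identity ("no activation") output for a continuous target

print(predict(np.array([[2.0]])))
```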
@rp88imxoimxo27 3 years ago
But aren't ReLU and even Leaky ReLU still linear functions (just more complex ones, of course), which would mean they can't fit some complex classification problems?
@olifsun948 3 years ago
datascience.stackexchange.com/questions/26475/why-is-relu-used-as-an-activation-function
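For what it's worth: ReLU is piecewise linear but not linear (it fails additivity), and combinations of ReLUs can build genuinely non-linear shapes. A small NumPy illustration (an assumed example, not taken from the linked answer):

```python
import numpy as np

relu = lambda z: np.maximum(0, z)

# ReLU is not linear: f(a + b) != f(a) + f(b) in general.
print(relu(-1.0) + relu(1.0), relu(-1.0 + 1.0))   # 1.0 vs 0.0

# Combinations of ReLUs produce genuinely non-linear shapes, e.g. a "hat" bump:
x = np.linspace(-2, 2, 9)
hat = relu(x + 1) - 2 * relu(x) + relu(x - 1)     # 0 everywhere except a peak at x = 0
print(hat)
```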
@prismaticspace4566 4 years ago
So I finally understand why certain activation functions are chosen: it's because we usually use them...
@manuel783 3 years ago
Clarification on the activation functions: in the visual comparison of the 4 activation functions from 7:58, all 4 should have "z" as the horizontal axis. The top-right chart in the slide shows "x", which should be "z".
@user-uy6gv8hi5b 2 years ago
Why do we need an activation function in the hidden layers?
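One way to see it, which is the point of the next video (C1W3L07): without a non-linear activation, stacking layers collapses into a single linear transformation. A minimal NumPy illustration with arbitrary shapes:

```python
import numpy as np

rng = np.random.default_rng(1)
W1, W2 = rng.normal(size=(3, 2)), rng.normal(size=(2, 3))
x = rng.normal(size=(2, 1))

# Two layers with no (or a linear) activation collapse into one linear layer:
print(np.allclose(W2 @ (W1 @ x), (W2 @ W1) @ x))  # True -> no extra expressive power
```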
@hectorgarces5383 4 years ago
A sigmoid function is any function with an "S"-shaped curve; it is not a synonym for logistic regression. In fact, tanh is a type of sigmoid function, so the explanation at 2:58 is a bit imprecise. Apart from that, nice video.
@acidtears 4 years ago
Logistic regression also has an S curve... The terms are used as synonyms in machine learning.
@xerocool2109 5 years ago
What about softmax? Is it better than ReLU?
@chaitanyakrishnadukkipaty9825 5 years ago
I'm relatively new to the field, but as I understand it, softmax is preferably used in the final layer when we have multi-class classification, and in the hidden layers we stick to ReLU or Leaky ReLU in most cases.
@abhimanyutiwari100 5 years ago
ReLU is used in the hidden layers only, and softmax applies to the output (it basically calculates the probabilities of the classes).
@angelaju6327 3 years ago
I think you mean softplus? Softplus is an activation function that is similar to ReLU.
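A minimal sketch of the pattern described above, with assumed shapes (not from the lecture): softmax turns the output layer's raw scores into class probabilities.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=0, keepdims=True)       # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=0, keepdims=True)    # each column sums to 1 -> class probabilities

logits = np.array([[2.0], [1.0], [0.1]])       # raw scores from the output layer (3 classes)
print(softmax(logits))                         # roughly [[0.66], [0.24], [0.10]]
```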
@EranM 5 years ago
5:15 the 0 button got stuck :>
@istvanbenedek3578 3 years ago
Hey Andrew, I really like your videos; I'm writing this feedback only to contribute to their improvement. At 1:45 you say "tanh is a shifted version of the sigmoid function", which is a little underdefined, since the range of tanh is larger than the range of the sigmoid; a simple shift alone does not turn the sigmoid into tanh. At 8:23 the horizontal axis of tanh(z) is labeled x, and "tanh" has a typo.
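On the first point, the precise relationship is tanh(z) = 2σ(2z) − 1, i.e. a scaled as well as shifted sigmoid, not a pure shift. A quick numerical check:

```python
import numpy as np

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

z = np.linspace(-4, 4, 9)
print(np.allclose(np.tanh(z), 2 * sigmoid(2 * z) - 1))  # True
```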
@videos4mydad a year ago
I'm beginning to see that math notation is really just poorly named programming variables. It's too bad that one-letter variable naming in math has such deep roots. In programming we would call it sigmoid(), but in math, even when we run out of letters of the alphabet, we move on to different alphabets...