L11.6 Xavier Glorot and Kaiming He Initialization

Views: 12,438


Comments: 7

@mahdimoshtaghi9903 1 year ago

The terms fan-in and fan-out come from digital electronics. Fan-in is the maximum number of logic gates that can be connected to the input of a particular gate. Fan-out is the same for the output.

@sightreader2507 3 years ago

I think you are not describing Xavier initialization. Xavier initialization is equation (16) in the paper. Equation (1) is what you are showing, with only fan_in, and this is what they argue was a common but bad heuristic.

@SebastianRaschka 3 years ago

Thanks for the note, you are right. Wasn't careful here. Will make a note to fix that.
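For reference, here is a minimal NumPy sketch of the two formulas being contrasted in this exchange, assuming the paper in question is Glorot and Bengio (2010). Equation (1) there is the older heuristic that uses only the fan-in, W ~ U[-1/sqrt(n), 1/sqrt(n)], while equation (16) is the normalized ("Xavier") initialization, W ~ U[-sqrt(6)/sqrt(n_j + n_{j+1}), sqrt(6)/sqrt(n_j + n_{j+1})]. The function names and layer sizes below are made up for illustration and do not come from the lecture.

```python
import numpy as np


def fan_in_heuristic_uniform(fan_in, fan_out, rng):
    """Older heuristic (eq. 1 in the paper): the bound depends on fan_in only."""
    limit = 1.0 / np.sqrt(fan_in)
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))


def xavier_uniform(fan_in, fan_out, rng):
    """Normalized initialization (eq. 16): the bound uses fan_in + fan_out."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))


rng = np.random.default_rng(0)
fan_in, fan_out = 512, 256  # hypothetical layer sizes

w_old = fan_in_heuristic_uniform(fan_in, fan_out, rng)
w_xavier = xavier_uniform(fan_in, fan_out, rng)

# A uniform variable on [-a, a] has variance a^2 / 3, so the two schemes
# target 1 / (3 * fan_in) and 2 / (fan_in + fan_out) respectively.
print("heuristic variance:", w_old.var(), "target:", 1.0 / (3 * fan_in))
print("xavier variance:   ", w_xavier.var(), "target:", 2.0 / (fan_in + fan_out))
```

With these sizes the fan-in-only heuristic gives weights whose variance is roughly a quarter of the Xavier target, which is the kind of shrinkage the paper argues against.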

@gramlin17 3 years ago

Thanks for the mention. After reading the paper, I was wondering why no one talks about equation (16). I must say I see so many different interpretations that I am totally confused. Also, where does He take into account the nonlinearity of ReLU? We see sqrt in both formulas ... the multiplication by 2 is due to the fact that ReLU cuts off the half below 0, right?
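On the factor of 2: for a zero-mean, symmetric pre-activation y, ReLU zeroes the negative half, so E[ReLU(y)^2] = Var(y) / 2, and He initialization doubles the weight variance (std = sqrt(2 / fan_in)) to compensate. The snippet below is a rough numerical check of that claim, not code from the lecture; the layer sizes, sample count, and the choice of standard-normal pre-activations are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
fan_in, fan_out, n_samples = 512, 256, 2000  # hypothetical sizes

# Pre-activations of the previous layer: zero mean, unit variance.
y_prev = rng.standard_normal((n_samples, fan_in))
a_prev = np.maximum(y_prev, 0.0)  # ReLU; its second moment E[a^2] is about 0.5

# He initialization: Var(w) = 2 / fan_in.
w_he = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))
# Same scheme without the factor of 2: Var(w) = 1 / fan_in.
w_plain = rng.normal(0.0, np.sqrt(1.0 / fan_in), size=(fan_in, fan_out))

# Var(y_next) is roughly fan_in * Var(w) * E[a^2]:
#   He:    fan_in * (2 / fan_in) * 0.5 = 1.0  (variance preserved)
#   plain: fan_in * (1 / fan_in) * 0.5 = 0.5  (variance halves each layer)
var_he = (a_prev @ w_he).var()
var_plain = (a_prev @ w_plain).var()

print("second moment of ReLU output:", np.mean(a_prev ** 2))
print("next-layer variance, He:     ", var_he)
print("next-layer variance, no x2:  ", var_plain)
```

Without the factor of 2 the variance would halve at every ReLU layer, so a stack of such layers would shrink the signal exponentially with depth.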

@ewankenobi22 7 months ago

I noticed that too. Shame, as I would like to understand better where the root 6 comes from in the actual Xavier initialisation equation.
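A short sketch of where the root 6 comes from, assuming the usual reading of Glorot and Bengio (2010): the paper's compromise between the forward and backward conditions is a target weight variance of 2 / (n_j + n_{j+1}), and the weights are drawn from a uniform distribution on [-a, a]. Here n_j and n_{j+1} are the fan-in and fan-out of the layer, and a is a symbol introduced only for this derivation.

```latex
\begin{align*}
W \sim U[-a, a]
  &\;\Rightarrow\; \operatorname{Var}(W) = \frac{(2a)^2}{12} = \frac{a^2}{3}, \\
\frac{a^2}{3} = \frac{2}{n_j + n_{j+1}}
  &\;\Rightarrow\; a^2 = \frac{6}{n_j + n_{j+1}}
  \;\Rightarrow\; a = \frac{\sqrt{6}}{\sqrt{n_j + n_{j+1}}}.
\end{align*}
```

So the 6 is just the 2 from the target variance multiplied by the 3 from the variance of a uniform distribution.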

@hamzamohiuddin973 1 year ago

At 5:46, should the summation iterator variable be 'k' instead of 'j'?

@hamzamohiuddin973 1 year ago

At 6:12, on the second line of equations, in the part marked by the blue circle, can someone please clarify how the variance of the product of two independent variables can be expanded to the product of the variances of those variables? I can't seem to find any such property. Can someone point to some helpful material? Thank you.
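A sketch of the property in question, under the assumption (standard in these initialization derivations, though not visible from the comment alone) that the weight and the input are independent and both have zero mean. X and Y are generic placeholders for the two variables.

```latex
\begin{align*}
\operatorname{Var}(XY)
  &= E[X^2 Y^2] - \big(E[XY]\big)^2 \\
  &= E[X^2]\,E[Y^2] - E[X]^2\,E[Y]^2
     && \text{(independence)} \\
  &= \operatorname{Var}(X)\operatorname{Var}(Y)
     + \operatorname{Var}(X)\,E[Y]^2
     + \operatorname{Var}(Y)\,E[X]^2
     && \text{(using } E[X^2] = \operatorname{Var}(X) + E[X]^2\text{)} \\
  &= \operatorname{Var}(X)\operatorname{Var}(Y)
     && \text{(if } E[X] = E[Y] = 0\text{)}.
\end{align*}
```

The product rule therefore does not hold for independent variables in general; it holds once the two extra terms vanish, which is what the zero-mean assumption provides.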