The terms fan-in and fan-out come from digital electronics. Fan-in is the maximum number of inputs a logic gate can accept. Fan-out is the counterpart on the output side: the maximum number of gate inputs that a single gate's output can drive.
@sightreader2507 3 years ago
I think you are not describing Xavier initialization. Xavier initialization is equation (16) in the paper. Equation (1), which is what you are showing with only fan_in, is what they argue was a common but bad heuristic.
@SebastianRaschka 3 years ago
Thanks for the note, you are right. Wasn't careful here. Will make a note to fix that.
@gramlin17 3 years ago
Thanks for the mention. I was wondering after reading the paper why no one talks about equation (16). I must say I see so many different interpretations that I am totally confused. Also, where does He take into account the nonlinearity of ReLU? We see sqrt in both formulas... the multiplication by 2 is due to the fact that ReLU cuts off the half below 0, right?
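On the factor of 2: the usual argument is that for a zero-mean, symmetric pre-activation z, ReLU zeroes out the negative half, so E[ReLU(z)^2] = Var(z) / 2, and doubling the weight variance compensates for that. A minimal sketch that checks the halving numerically, assuming standard normal pre-activations (the sample size and variable names are illustrative, not from the video or the paper):

```python
import random
import statistics

random.seed(0)

# Assumed for illustration: pre-activations drawn from N(0, 1).
z = [random.gauss(0.0, 1.0) for _ in range(200_000)]

# Second moment before and after applying ReLU.
second_moment_pre = statistics.fmean(v * v for v in z)
second_moment_post = statistics.fmean(max(0.0, v) ** 2 for v in z)

# For a symmetric zero-mean distribution this ratio should be close to 1/2.
ratio = second_moment_post / second_moment_pre
print(f"E[relu(z)^2] / E[z^2] = {ratio:.3f}")
```

The square root appears in both formulas only because they are stated as a standard deviation (or a uniform bound) rather than a variance.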
@ewankenobi22 7 months ago
I noticed that too. Shame as I would like to understand better where the root 6 comes from in the actual Xavier initialisation equation
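One way to see where the root 6 comes from: a uniform distribution U[-a, a] has variance a^2 / 3, so requiring Var(W) = 2 / (fan_in + fan_out) gives a = sqrt(6 / (fan_in + fan_out)). A minimal sketch that checks this numerically; the helper name `xavier_uniform_limit` and the layer sizes are hypothetical, chosen only for illustration:

```python
import math
import random
import statistics

def xavier_uniform_limit(fan_in, fan_out):
    """Bound a of U[-a, a] chosen so that Var(W) = 2 / (fan_in + fan_out)."""
    return math.sqrt(6.0 / (fan_in + fan_out))

random.seed(0)
fan_in, fan_out = 300, 100  # arbitrary illustrative layer sizes

a = xavier_uniform_limit(fan_in, fan_out)
weights = [random.uniform(-a, a) for _ in range(200_000)]

target_var = 2.0 / (fan_in + fan_out)
empirical_var = statistics.pvariance(weights)
print(f"a = {a:.4f}, target var = {target_var:.5f}, empirical var = {empirical_var:.5f}")
```

The 6 is therefore just 2 (from the target variance) times 3 (from the variance of a uniform distribution).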
@hamzamohiuddin973 1 year ago
at 5:46 should the summation iterator variable be 'k' instead of 'j'?
@hamzamohiuddin973 1 year ago
At 6:12, on the second line of equations (the part marked by the blue circle), can someone please clarify how the variance of the product of two independent variables can be expanded into the product of their variances? I can't seem to find any such property. Can someone point to some helpful material? Thank you.
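For independent X and Y, the general identity is Var(XY) = Var(X)Var(Y) + Var(X)E[Y]^2 + Var(Y)E[X]^2. It collapses to Var(X)Var(Y) only when both means are zero, which is presumably the assumption made for the weights and inputs at that point in the video. A minimal sketch that checks both cases by simulation; the helper `var_of_product` and the chosen means and standard deviations are illustrative assumptions:

```python
import random
import statistics

random.seed(0)
N = 200_000

def var_of_product(mean_x, mean_y, std_x=0.5, std_y=2.0):
    """Empirical Var(X * Y) for independent Gaussian X and Y."""
    xs = [random.gauss(mean_x, std_x) for _ in range(N)]
    ys = [random.gauss(mean_y, std_y) for _ in range(N)]
    return statistics.pvariance([x * y for x, y in zip(xs, ys)])

# Zero means: Var(X) * Var(Y) = 0.25 * 4.0 = 1.0
zero_mean = var_of_product(0.0, 0.0)

# Non-zero means: 0.25*4.0 + 0.25*3.0**2 + 4.0*1.0**2 = 7.25
nonzero_mean = var_of_product(1.0, 3.0)

print(f"zero means: {zero_mean:.3f}, non-zero means: {nonzero_mean:.3f}")
```

With non-zero means the plain product of variances would underestimate Var(XY), which is why the zero-mean assumption matters in the derivation.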