Cool video, thanks!
00:00:00 Intro: linear regression
00:23:55 NTKs start here
01:01:33 Link between NNs and ODEs (ordinary differential equations)
@debadeepta 4 years ago
Really nice lecture! I was looking to quickly learn NTKs before diving deep into the original papers and this really helped.
@zl7460 3 years ago
+1. The most well-explained DL lecture I've seen in a long time.
@StratosFair 2 years ago
Incredibly clear lecture; it allowed me to fill the gaps in my understanding of the NTK. Thank you, professor!
@dv019 4 years ago
Great video, thank you! To the student asking about Kernels: the word is overloaded. It is used in linear algebra to mean the set of all vectors mapped to 0 by a linear transformation. Sometimes Green's functions in PDEs are called integral kernels. In general a kernel is "the central or most important part of something". I don't like how overloaded the word is either, but c'est la vie.
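To make the overloading concrete, here is a rough side-by-side of the three usages (a sketch added for reference, not taken from the lecture):

```latex
% Linear algebra: the kernel (null space) of a linear map A
\ker(A) = \{\, v : A v = 0 \,\}

% Integral kernel (e.g., a Green's function) of an integral operator
(Tf)(x) = \int K(x, y)\, f(y)\, dy

% Kernel in the machine-learning / NTK sense: an inner product of feature maps
k(x, x') = \langle \phi(x), \phi(x') \rangle
```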
@weisenjiang9179 4 years ago
Great intro to NTK, it benefited me a lot.
@mstislavmaslennikov326 2 years ago
The lecturer is imho doing a great job explaining difficult material!
@AyushSharma-ie7tj 1 year ago
Really nice lecture with a very even pace. Thank you for sharing.
@MetaOptimizer 3 years ago
41:07 In the empirical observations, should we think of a large width m as corresponding to an extremely large network such as GPT-3? In other words, can "the width of parameters" be interpreted as "the number of trainable parameters"? Thanks for your valuable lecture :)
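For what it's worth, width and total parameter count are related but not interchangeable: in a fully connected network the parameter count grows roughly quadratically in the layer widths, and the NTK results are stated in the limit of large width rather than large parameter count (a rough sketch, not from the lecture):

```latex
\#\text{params} \;\approx\; \sum_{\ell} m_{\ell-1}\, m_{\ell},
\qquad \text{NTK regime: } m_{\ell} \to \infty \text{ for the hidden layers.}
```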
@sikun7894 4 years ago
Thank you so much for sharing these lectures! Really useful
@DarkNinja-24 2 years ago
Beautiful explanation!
@itachi7243456 4 years ago
These are fantastic, thanks!
@nhl8586 2 years ago
Super useful for understanding NTK in 15 mins!
@joonho0 4 years ago
Thanks a lot for sharing this lecture!
@meghbhalerao5208 2 years ago
If I understand correctly, the NTK analysis is derived assuming the quadratic MSE loss, right? Can it be generalized to other loss functions?
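For context, a rough sketch of why the squared loss is the convenient case (my notation, not necessarily the lecture's): under gradient flow, the network outputs on the training set evolve as

```latex
\frac{d}{dt} f(w_t, X) \;=\; -\, K(X, X)\, \nabla_{f}\, \mathcal{L}\big(f(w_t, X),\, y\big),
```

so the NTK appears for any differentiable loss, but only for the squared loss \mathcal{L} = \tfrac{1}{2}\lVert f - y \rVert^2 does the right-hand side become linear in f, which is what gives the closed-form exponential convergence usually quoted.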
@yuwu7547 2 years ago
Very useful and easy-to-follow lecture. Thanks a lot!
@AlexanderGoncharenko-e7o 3 years ago
Awesome lesson! Straight and clear!
@chongyizheng7758 3 years ago
Question about the first-order Taylor approximation of the neural network: why is the first term f(w_0, x) not included in the kernel function, given that it is nonlinear w.r.t. x?
@ramanasubramanyam1110 3 years ago
The first derivative is included (and called the NTK) because it resembles the operation of a kernel on an input, i.e., a transformation mapping the input to a higher-dimensional feature space.
@chongyizheng7758 3 years ago
@@ramanasubramanyam1110 Thanks for your reply, but that isn't quite what I was asking. Let me clarify: my question is about the constant (first) term f(w_0, x) at 41:16, not the derivative (second) term in the equation. f(w_0, x) also seems to depend nonlinearly on x, so why is it excluded from the definition of the NTK?
@hw1451 2 years ago
I think since it's a constant with respect to the trainable parameters w, we can always subtract it from y.
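To spell out the "subtract it from y" point, here is the linearization written explicitly (a sketch in my own notation; the lecture's symbols may differ):

```latex
f(w, x) \;\approx\; f(w_0, x) + \nabla_w f(w_0, x)^{\top} (w - w_0)
```

Writing \phi(x) = \nabla_w f(w_0, x) and \tilde{y} = y - f(w_0, x), fitting f to y is equivalent to fitting the purely linear model \phi(x)^{\top}(w - w_0) to the residual targets \tilde{y}. Only \phi enters the kernel, K(x, x') = \phi(x)^{\top}\phi(x'), which is why the constant term f(w_0, x), although it depends nonlinearly on x, never appears in the NTK itself.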
@yuzhema2506 2 years ago
Thanks for the nice lecture! One question: the bias term in the Taylor approximation depends on x, which means the bias varies from input to input. This differs from the traditional kernel view, where the bias is the same for every transformed input phi(x). In other words, for the NTK the inputs in the transformed space do not strictly follow the same linear model. How should we interpret this deviation? Thanks
@sayeedchowdhury11 3 years ago
Thanks for the nice lecture. I have a query: since we're evaluating the gradient at w_0, does that mean the kernel is computed from the gradients of an untrained NN that has just been initialized? That is, is f(w, x) a trained NN or just an initialized one?
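For what it's worth, yes: the (empirical) NTK is built from the Jacobian at initialization, i.e., from an untrained network; the infinite-width argument is that this kernel then stays essentially constant during training. Below is a minimal sketch of the empirical version with a made-up toy architecture and names (not the lecturer's code):

```python
import jax
import jax.numpy as jnp

def init_params(key, d_in=3, width=64):
    # Random (untrained) parameters w_0 for a toy two-layer network.
    k1, k2 = jax.random.split(key)
    w1 = jax.random.normal(k1, (width, d_in)) / jnp.sqrt(d_in)
    w2 = jax.random.normal(k2, (width,)) / jnp.sqrt(width)
    return (w1, w2)

def f(params, x):
    # Scalar-output network f(w, x).
    w1, w2 = params
    return w2 @ jnp.tanh(w1 @ x)

def empirical_ntk(params, x1, x2):
    # K(x1, x2) = <grad_w f(w, x1), grad_w f(w, x2)>, with gradients
    # taken at the supplied parameters (here: the initialization w_0).
    g1 = jax.grad(f)(params, x1)
    g2 = jax.grad(f)(params, x2)
    return sum(jnp.vdot(a, b)
               for a, b in zip(jax.tree_util.tree_leaves(g1),
                               jax.tree_util.tree_leaves(g2)))

params0 = init_params(jax.random.PRNGKey(0))  # w_0: no training has happened
x, xp = jnp.ones(3), jnp.arange(3.0)
print(empirical_ntk(params0, x, xp))
```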
@sinaasadiyan 2 years ago
Great explanation, just subscribed!
@tanchienhao 2 years ago
Thanks for the awesome lectures!!
@da_lime 2 years ago
Awesome, thanks!
@chenamora1653 3 years ago
So amazing
@vi5hnupradeep 2 years ago
Thank you so much!
@ihany9061 3 years ago
lifesaver!
@freerockneverdrop1236 5 months ago
The formula for the neural network in this video should be a two-level summation rather than a one-level one.
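For reference, the two-layer parameterization typically used in NTK lectures does contain a nested summation once the inner product is expanded (a sketch; I can't verify the exact formula shown in the video):

```latex
f(w, x) \;=\; \frac{1}{\sqrt{m}} \sum_{i=1}^{m} a_i\, \sigma\!\big(w_i^{\top} x\big)
       \;=\; \frac{1}{\sqrt{m}} \sum_{i=1}^{m} a_i\, \sigma\!\Big(\sum_{j=1}^{d} w_{ij}\, x_j\Big)
```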