What a masterpiece, mind-blowing! I have been waiting for a video like this for a long time.
@andyandurkar7814 2 days ago
Amazing topic and much-needed clarity. You are a real teacher, and you have a passion for teaching!
@coffeezealot8535 8 days ago
Wow, what a clear and concise way to present this topic!
@KumR 12 days ago
I have always been looking for an answer to this. Thank you, Luis. The Lego analogy made it so easy to understand.
@SerranoAcademy 12 days ago
Thank you so much @KumR, I'm glad you liked it!
@TotallyNotARobot__ 3 days ago
Excellent. Thank you.
@dr.mikeybee 12 days ago
You always do a really fine job of explaining difficult material in an easy-to-understand way. The universal approximation theorem is absolutely key to understanding why neural networks are not stochastic parrots; therefore, it is the key to understanding how neural networks learn. Might I suggest that you follow this up with an episode on holistic training?
@SerranoAcademy 12 days ago
Thank you so much! Holistic training, that sounds awesome! I don't know much about it, do you know any good resources?
@neeloor2004able 12 days ago
Absolutely new information to me, and thanks for explaining it in such a simple and detailed way.
@SerranoAcademy 12 days ago
@@neeloor2004able thank you! I’m glad you liked it :)
@MrProgrammer-yr1ed 10 days ago
Hey Luis, please keep it up and make more videos on why neural networks work. 👍
@asv5769 10 days ago
Very interesting points at 11:40 about non-polynomial functions. Yes, at university we always learned that we can use Taylor series to approximate non-polynomial functions with a sum of polynomial terms. How beautiful is that?
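For anyone curious, here is a minimal Python sketch (my own illustration, not from the video) of that Taylor-series idea: partial sums of the series for sin(x) get closer to the true value as more polynomial terms are added.

import math

def sin_taylor(x, n_terms):
    # Partial sum of the Taylor series of sin(x) around 0:
    # sin(x) ≈ x - x^3/3! + x^5/5! - ...
    total = 0.0
    for k in range(n_terms):
        total += (-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
    return total

x = 1.2
for n in (1, 3, 5):
    print(n, "terms:", sin_taylor(x, n), "vs math.sin:", math.sin(x))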
@asv5769 7 days ago
I hope you will soon make a video about DeepSeek. There are already plenty, with clickbait titles, but it would be nice to have at least one from a professional. I have enjoyed your course about probability in the Machine Learning specialization on Coursera. Keep up the good work. All the best.
@MrProgrammer-yr1ed 9 days ago
Please make a video on how ReLU makes patterns in neural networks.
@AravindUkrd 1 day ago
Please do a video that discusses how the DeepSeek model is different from other LLMs.
@diemilio 12 days ago
Great video. Thank you!
@SerranoAcademy 12 days ago
Thank you so much, I'm glad you liked it!
@AJoe-ze6go 11 days ago
This sounds functionally identical to a Fourier series: by adding a sufficient number of the right kinds of simple wave functions, you can approximate any continuous curve.
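To make the analogy concrete, here is a small Python sketch (my own illustration, not from the video) that sums a few sine waves to approximate a square wave; the more odd harmonics you add, the closer the partial sum gets.

import math

def square_wave_fourier(x, n_terms):
    # Partial Fourier series of a square wave: odd sine harmonics only,
    # each weighted by 4 / (pi * n).
    total = 0.0
    for k in range(n_terms):
        n = 2 * k + 1
        total += (4 / math.pi) * math.sin(n * x) / n
    return total

for n_terms in (1, 5, 50):
    # The true square wave equals 1.0 at x = pi/2.
    print(n_terms, "harmonics:", round(square_wave_fourier(math.pi / 2, n_terms), 4))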
@chessfighter-r5g 11 days ago
How does it map inputs and outputs it has never seen before? My answer is that it splits the input into chunks, produces output for the first chunk, then the continuation is the second chunk plus the first output, then the third chunk plus the second output; and it splits into chunks by cosine similarity, cutting off when a chunk gets too big, and that's how the chunks happen. What do you think about this?
@ocamlmail 6 days ago
Thank you so much, fantastic! But what is wrong with polynomials if we can approximate any differentiable (continuous) function with a Taylor series, which is a polynomial?
@Pedritox0953 12 days ago
Great video! Peace out
@SerranoAcademy 12 days ago
Thank you!
@sanjayshekhar7 12 days ago
Wow! Just wow!!
@SerranoAcademy 12 days ago
:) Thank you!
@neelkamal3357 12 days ago
crazy video
@serhatakay8351 7 days ago
Is this going to link to Kolmogorov-Arnold networks in the following videos?
@SerranoAcademy 7 days ago
@@serhatakay8351 Good question! Not really. It's in the same spirit as the Kolmogorov-Arnold theorem of universal approximation with only two layers, but other than that there's no relation.
@tomoki-v6o 12 days ago
What happens in 2D? 7:02
@chessfighter-r5g 12 days ago
Hi, do you explain what the difference is between o1 and normal transformers like 4o, and why it waits about 30 seconds? What does it do in that amount of time?
@SerranoAcademy 12 days ago
That's a great question! Two things that o1 does are RAG and chain of prompting. RAG means that before answering, it searches for the answer, either on Google or in other databases. Chain of prompting means that it first generates an answer, then reads it and elaborates on it by expanding it, and may do this a few times. These methods make it more consistent and reduce hallucinations.
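As a rough picture of that control flow, here is a toy Python sketch; retrieve_documents and generate_answer are hypothetical stubs for illustration, not any real API.

def retrieve_documents(question):
    # Hypothetical stand-in for the RAG step: search the web or a database first.
    return ["(a retrieved passage relevant to the question)"]

def generate_answer(question, context, draft=None):
    # Hypothetical stand-in for a language-model call that drafts or expands an answer.
    base = draft if draft is not None else "an initial draft"
    return f"answer to '{question}' using {len(context)} passage(s), expanding on {base}"

def answer_with_rag_and_chain_of_prompting(question, rounds=3):
    context = retrieve_documents(question)      # look things up before answering
    draft = None
    for _ in range(rounds):                     # re-read and expand the answer a few times
        draft = generate_answer(question, context, draft)
    return draft

print(answer_with_rag_and_chain_of_prompting("Why do neural networks work?"))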
@chessfighter-r5g 12 days ago
@@SerranoAcademy Thank you so much! How does it map inputs and outputs it has never seen before? My answer is that it splits the input into chunks, produces output for the first chunk, then the continuation is the second chunk plus the first output, then the third chunk plus the second output; and it splits into chunks by cosine similarity, cutting off when a chunk gets too big, and that's how the chunks happen. What do you think about this?
@neelkamal3357 12 days ago
I love how it's 2024 and we still don't know what "values in neural networks" actually represent.
@SerranoAcademy 12 days ago
Great point! It can mean a lot of things; it could be the outputs, the weights, etc. I'm probably using that term for several things... :)
@dmip9884 12 days ago
Waiting for a description of the Kolmogorov-Arnold representation theorem as a stronger theoretical basis for KANs.
@SerranoAcademy 12 days ago
Thanks! Here's a video on the Kolmogorov-Arnold theorem! kzbin.info/www/bejne/pISVmaGjZa-FeM0 (in there, there's a link to one on Kolmogorov-Arnold networks too)
@waelreda5412 2 days ago
Has anyone ever told you that your voice sounds very similar to Steve Buscemi's?
@SerranoAcademy 2 days ago
@@waelreda5412 Lol! You're the second one. I also got Joe Pesci in My Cousin Vinny 🤣