Universal Approximation Theorem - The Fundamental Building Block of Deep Learning

2,694 views

Serrano.Academy

1 day ago

The Universal Approximation Theorem is the most fundamental theorem in deep learning. It says that any continuous function can be approximated, as closely as we want, by a neural network with only one hidden layer (though this layer may be huge).
In this video, we see a very simple explanation of why the Universal Approximation Theorem works, using an analogy with Lego blocks.
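Here is a minimal sketch of that idea in Python (NumPy, a ReLU activation, the target sin(x), and the number of breakpoints are all illustrative assumptions, not taken from the video): a single hidden layer with one ReLU unit per breakpoint stacks piecewise-linear "Lego blocks" into an approximation whose error shrinks as the layer grows.

import numpy as np

def relu(z):
    return np.maximum(0.0, z)

# Target: any continuous function; here sin(x) on [0, 2*pi] as an illustration.
f = np.sin
knots = np.linspace(0.0, 2.0 * np.pi, 20)    # breakpoints: one hidden unit per knot
values = f(knots)
slopes = np.diff(values) / np.diff(knots)    # slope of each linear piece

def one_hidden_layer(x):
    # Each ReLU unit adds a single "kink", so the weighted sum is a
    # piecewise-linear curve passing through all the knots.
    out = values[0] + slopes[0] * relu(x - knots[0])
    for i in range(1, len(slopes)):
        out += (slopes[i] - slopes[i - 1]) * relu(x - knots[i])
    return out

xs = np.linspace(0.0, 2.0 * np.pi, 1000)
print("max error with 20 knots:", np.max(np.abs(one_hidden_layer(xs) - f(xs))))
# Widening the hidden layer (more knots) drives the error as low as we want.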
Grokking Machine Learning Book:
www.manning.co...
40% discount promo code: serranoyt

Comments: 27
@MrProgrammer-yr1ed · 2 days ago
What a masterpiece, mind-blowing! I have been waiting for a video like this for a long time.
@asv5769 · 1 day ago
Very interesting points at 11:40 about non-polynomial functions. Yes, at university we always learned that we can use Taylor series to approximate non-polynomial functions with a sum of polynomial terms. How beautiful is that?
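To see that point concretely, here is a small sketch (the choice of sin and the number of terms are arbitrary): a partial Taylor sum, which is just a polynomial, already matches a non-polynomial function very closely.

import math

def taylor_sin(x, terms=8):
    # Partial sum of the Taylor series of sin around 0:
    # sin(x) ≈ sum_k (-1)^k * x^(2k+1) / (2k+1)!
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(terms))

x = 1.2
print(taylor_sin(x), math.sin(x))   # the two values agree to many decimal places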
@MrProgrammer-yr1ed · 1 day ago
Hey Luis, please keep it up and make more videos on why neural networks work. 👍
@KumR · 3 days ago
I have always been looking for an answer to this. Thank you, Luis. The Lego analogy made it so easy to understand.
@SerranoAcademy · 3 days ago
Thank you so much @KumR, I'm glad you liked it!
@neeloor2004able · 3 days ago
Absolutely new information, and thanks for explaining it in such a simple and detailed way.
@SerranoAcademy · 3 days ago
@neeloor2004able Thank you! I’m glad you liked it :)
@dr.mikeybee · 3 days ago
You always do a really fine job of explaining difficult material in an easy-to-understand way. The Universal Approximation Theorem is absolutely key to understanding why neural networks are not stochastic parrots; therefore, it is also key to understanding how neural networks learn. Might I suggest that you follow this up with an episode on holistic training?
@SerranoAcademy · 3 days ago
Thank you so much! Holistic training, that sounds awesome! I don't know much about it, do you know any good resources?
@MrProgrammer-yr1ed · 1 day ago
Please make a video on how ReLU creates patterns in neural networks.
@AJoe-ze6go · 3 days ago
This sounds functionally identical to a Fourier series: by adding a sufficient number of the right kinds of simple wave functions, you can approximate any continuous curve.
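In the spirit of that analogy, a small sketch (the triangle wave and the term count are arbitrary illustrative choices): a partial Fourier sum builds a continuous curve out of a handful of sine waves, much as the network builds it out of simple units.

import numpy as np

def triangle_wave_fourier(x, n_terms=10):
    # Partial Fourier sum of a triangle wave (a continuous curve):
    # (8 / pi^2) * sum_n (-1)^n * sin((2n+1) x) / (2n+1)^2
    total = np.zeros_like(x)
    for n in range(n_terms):
        k = 2 * n + 1
        total += (-1) ** n * np.sin(k * x) / k ** 2
    return 8.0 / np.pi ** 2 * total

x = np.linspace(-np.pi, np.pi, 1000)
partial = triangle_wave_fourier(x, n_terms=10)
# Adding more sine terms tightens the fit, just as adding more hidden units
# tightens a one-hidden-layer network's approximation.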
@diemilio · 3 days ago
Great video. Thank you!
@SerranoAcademy · 3 days ago
Thank you so much, I'm glad you liked it!
@sanjayshekhar7 · 3 days ago
Wow! Just wow!!
@SerranoAcademy · 3 days ago
:) Thank you!
@chessfighter-r5g · 2 days ago
How does it map inputs to outputs it has never seen before? My guess is that it splits the input into chunks, produces an output for the first chunk, then continues with the second chunk plus the first output, then the third chunk plus the second output, and so on; it splits into chunks by cosine similarity, cutting a chunk off when it gets too big. What do you think about this?
@Pedritox0953 · 3 days ago
Great video! Peace out
@SerranoAcademy · 3 days ago
Thank you!
@chessfighter-r5g · 3 days ago
Hi, could you explain the difference between o1 and normal transformers like 4o, and why it waits about 30 seconds? What does it do during that time?
@SerranoAcademy · 3 days ago
That's a great question! Two things that o1 does are RAG and chain of prompting. RAG means that before answering, it searches for relevant information, either on the web or in other databases. Chain of prompting means that it first generates an answer, then reads it and elaborates on it by expanding it, and it may do this a few times. These methods make it more consistent and reduce hallucinations.
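A rough sketch of the "generate, then re-read and expand" loop described in that reply; call_model is a hypothetical placeholder for whatever LLM API is used, not a real function from any library.

# Sketch only: call_model is a hypothetical stand-in for an actual LLM call.
def call_model(prompt: str) -> str:
    raise NotImplementedError("plug in a real LLM client here")

def chain_of_prompting(question: str, rounds: int = 3) -> str:
    answer = call_model(question)                     # first draft
    for _ in range(rounds):
        # Feed the draft back in and ask the model to re-read and expand it.
        answer = call_model(
            f"Question: {question}\n"
            f"Draft answer: {answer}\n"
            "Re-read the draft, fix any mistakes, and expand it."
        )
    return answer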
@chessfighter-r5g · 3 days ago
@SerranoAcademy Thank you so much! How does it map inputs to outputs it has never seen before? My guess is that it splits the input into chunks, produces an output for the first chunk, then continues with the second chunk plus the first output, then the third chunk plus the second output, and so on; it splits into chunks by cosine similarity, cutting a chunk off when it gets too big. What do you think about this?
@neelkamal3357 · 3 days ago
crazy video
@neelkamal3357 · 3 days ago
I love how it's 2024 and we still don't know what "values in neural networks" actually represent.
@SerranoAcademy · 3 days ago
Great point! It can mean a lot of things: the outputs, the weights, etc. I probably use that term for several things... :)
@tomoki-v6o · 3 days ago
What happens in 2D? 7:02
@dmitrypereverzev9884 · 3 days ago
Waiting for a description of the Kolmogorov-Arnold representation theorem as a stronger theoretical basis for KANs.
@SerranoAcademy · 3 days ago
Thanks! Here's a video on the Kolmogorov-Arnold theorem! kzbin.info/www/bejne/pISVmaGjZa-FeM0 (in there, there's a link to one on Kolmogorov-Arnold networks too)