26. Structure of Neural Nets for Deep Learning

52,256 views

MIT OpenCourseWare

Comments: 32
@davidbellamy1388 4 years ago
God bless Gilbert Strang and may his good health carry on for many more years. He is gifting the world with his work.
@iwonakozlowska6134 3 years ago
I like it when Mr. G.S. asks, "Everybody with it?"
@dcrespin A year ago
For BPP, deep learning, and the general structure of neural networks, the following comments may be useful. To begin with, note that instead of partial derivatives one can work with derivatives as the linear transformations they really are. It is also possible to look at networks in a more structured manner. The basic ideas of BPP can then be applied in much more general cases. Several steps are involved.
1. More general processing units. Any continuously differentiable function of inputs and weights will do; these inputs and weights can belong, beyond Euclidean spaces, to any Hilbert space. Derivatives are linear transformations, and the derivative of a neural processing unit is the direct sum of its partial derivatives with respect to the inputs and with respect to the weights; this is a linear transformation expressed as the sum of its restrictions to a pair of complementary subspaces.
2. More general layers (any number of units). Single-unit layers can create a bottleneck that renders the whole network useless. Putting several units together in one layer is equivalent to taking their product (as functions, in the sense of set theory). The layers are functions of the inputs and of the weights of the totality of the units. The derivative of a layer is then the product of the derivatives of the units; this is a product of linear transformations.
3. Networks with any number of layers. A network is the composition (as functions, in the set-theoretical sense) of its layers. By the chain rule, the derivative of the network is the composition of the derivatives of the layers; this is a composition of linear transformations.
4. Quadratic error of a function. ...
Since this comment is becoming too long I will stop here. The point is that a very general viewpoint clarifies many aspects of BPP. If you are interested in the full story and have some familiarity with Hilbert spaces, please google for papers dealing with backpropagation in Hilbert spaces. A related article with matrix formulas for backpropagation on semilinear networks is also available. For a glimpse into a completely new deep learning algorithm which is orders of magnitude more efficient, controllable, and faster than BPP, search this platform for a video about deep learning without backpropagation; its description has links to demo software. The new algorithm is based on the following very general and powerful result (google it): polyhedrons and perceptrons are functionally equivalent. For the elementary conceptual basis of NNs see the article Neural Network Formalism.
Daniel Crespin
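[Editor's note: to make the "derivatives as linear transformations" viewpoint concrete, here is a minimal NumPy sketch, not taken from the comment above. Each layer's derivative is a Jacobian matrix, and by the chain rule the derivative of the composed network is the matrix product of the layer Jacobians. The `layer` helper and the random shapes are hypothetical.]

```python
import numpy as np

def layer(W, b, x):
    """One semilinear layer: ReLU(W x + b).
    Returns the output and its derivative w.r.t. the input as a Jacobian matrix."""
    z = W @ x + b
    mask = (z > 0).astype(float)     # derivative of ReLU, one entry per output
    jac = mask[:, None] * W          # d(output)/d(input): a linear transformation
    return np.maximum(z, 0.0), jac

rng = np.random.default_rng(0)
x = rng.normal(size=3)
weights = [(rng.normal(size=(4, 3)), rng.normal(size=4)),
           (rng.normal(size=(2, 4)), rng.normal(size=2))]

jacobians, out = [], x
for W, b in weights:
    out, J = layer(W, b, out)
    jacobians.append(J)

# Chain rule: the derivative of the composed network is the composition
# (matrix product) of the layer derivatives.
network_jacobian = jacobians[1] @ jacobians[0]
print(out, network_jacobian.shape)   # (2,) output, (2, 3) Jacobian
```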
@moazzamjadoon4436 2 years ago
The formula at the end is given as an exercise in A First Course in Probability by S. M. Ross. It is really based on combinatorics: the formula is built from combinations, i.e. nCr, the number of ways to choose r things out of n.
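[Editor's note: a short sketch of the counting formula discussed here, assuming it is the one from the lecture, r(N, m) = C(N,0) + C(N,1) + ... + C(N,m), the number of flat pieces that N folds (hyperplanes) in general position cut R^m into. The function name `flat_pieces` is made up for illustration.]

```python
from math import comb

def flat_pieces(N, m):
    """Number of flat pieces made by N hyperplanes in general position in R^m."""
    return sum(comb(N, i) for i in range(m + 1))

print(flat_pieces(3, 2))   # 3 folds of a plane (m = 2) -> 7 pieces
print(flat_pieces(4, 2))   # 4 folds -> 11 pieces
```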
@saurabhgupta5662 3 years ago
Does anyone know which exact paper by Dr. Jon Kleinberg Prof. Strang refers to towards the end?
@wagsman9999 A year ago
We live in the best of times and the worst of times. Social media content ranges from fantastic learning material like this to conspiracy-laden content leading to an attack on our nation's great Capitol.
@NothingMaster 5 years ago
“I won, but it didn’t pay off!” 😂
@seraphimwang 3 years ago
I would pay to have a lecture with this gentleman one day. It's hilarious how much people laugh and enjoy it. What a brilliant and soulful person he is 🎩
@GauravSharma-ui4yd 5 years ago
Doesn't the number of flat pieces depend on how we fold the plane? If the second fold is parallel to the first, we get 3 flat pieces rather than 4, and the same argument holds for 3, 4, ... folds. So how can we rely on a formula that gives incorrect answers when we change the folds while keeping their number the same, since it doesn't consider the spatial pattern of the folds? Am I right, or did I just miss something important?
@abcdxx1059 5 years ago
I have the same question.
@bogdansavkovic3441 4 years ago
Every fold (or line) has to cross previously made ones (all of them), so there are no parallel folds.
@seraphimwang 3 years ago
Precisely; that should be part of the assumptions, otherwise this recursion formula (binomial coefficients) doesn't hold. The formula is the no-order, no-repetition case; look up combinatorics and k-permutations.
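[Editor's note: a rough numerical check of this point, my own sketch rather than anything from the thread. It counts the flat pieces of the plane as the number of distinct side-of-each-fold patterns a dense grid of sample points produces, and compares a general-position arrangement with one containing parallel folds. The helper `count_regions` and the particular lines are made up for illustration.]

```python
import numpy as np

def count_regions(lines, lo=-10.0, hi=10.0, n=400):
    """lines: list of (a, b, c) describing the fold a*x + b*y + c = 0."""
    xs = np.linspace(lo, hi, n)
    X, Y = np.meshgrid(xs, xs)
    # Each sample point gets a True/False pattern: which side of each fold it lies on.
    patterns = np.stack([(a * X + b * Y + c) > 0 for (a, b, c) in lines], axis=-1)
    return len({tuple(p) for p in patterns.reshape(-1, len(lines))})

crossing = [(1, 0, 0), (0, 1, 0), (1, 1, -3)]   # three folds in general position
parallel = [(1, 0, 0), (1, 0, -3), (0, 1, 0)]   # two of the three folds are parallel
print(count_regions(crossing))  # 7 = C(3,0) + C(3,1) + C(3,2)
print(count_regions(parallel))  # 6: the parallel pair gives one piece fewer
```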
@High_Priest_Jonko 4 years ago
Why not just add a shit load of neurons and features to boost your chances of successful classification?
@ulfadatau5798 3 years ago
Because this might produce a model that overfits, i.e. does not generalize well to new data. And it's very computationally expensive.
@sanjaykrish8719 4 years ago
Prof. Strang is an example that age is no bar to learning.
@tchappyha4034 5 years ago
I think that, in general, mathematicians are good at teaching. I guess Professor Gilbert Strang is not a specialist in machine learning, but he can teach neural nets better than specialists in machine learning.
@PrzemyslawSliwinski 2 years ago
Likely because (three years later, and counting...) machine learning is still more an art and engineering than science.
@JulieIsMe824 3 years ago
Thanks Prof. Strang!! Very clear and easy to understand!
@DarianMiller 3 years ago
35:25 Piecewise linear conferences... multi-dimensional dry humor!
@AKSHAYAKUMARSEC 3 years ago
Respected sir, may God bless you. You will live for more than 100 years; you are a legend. Love from India ❤️
@vangjushkomini369 2 years ago
What a fantastic professor :) :)
@robertwest6244 3 years ago
I love this man. He is so concise.
@mathhack8647 2 years ago
May God bless his life and grant him health and well-being, masha'Allah. Pure Pleasure.
@imrematajz1624 2 years ago
Minute 46: There might be a simpler solution. Instead of the binomial formula, I would suggest using number of edges + number of nodes + 1 = number of pieces (spaces): N + E + 1 = S.
@neoblackcyptron 3 years ago
God bless you, sir. I see that in this video Professor Strang has injured his left hand, yet he still taught the class. I don't know if any of the other viewers noticed that.
@moazzamjadoon4436 2 years ago
For example, at 7 or 8 minutes, see the video kzbin.info/www/bejne/l4iZZGCkorCEmaM
@DistortedV12 4 years ago
46:46 "he had m=2, but we should yeah grow up and get m to be 5 dimensional" lol this guy is so funny :)
@formationformationb8708 2 years ago
excellent
@georgesadler7830 3 years ago
Professor Strang, thank you for another well-planned lecture on the structure of neural nets for deep learning and machine learning. Learning is a lifelong process.
@Darkev77 2 years ago
Oh no, 4:30 beware LGBTQ+!
@suchalooser1175 3 years ago
Such curiosity at this age!!!