26. Structure of Neural Nets for Deep Learning

50,620 views

MIT OpenCourseWare

MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: ocw.mit.edu/18-065S18
YouTube Playlist: • MIT 18.065 Matrix Meth...
This lecture is about the central structure of deep neural networks, which are a major force in machine learning. The aim is to find the function that's constructed to learn the training data and then apply it to the test data.
License: Creative Commons BY-NC-SA
More information at ocw.mit.edu/terms
More courses at ocw.mit.edu

Comments: 32
@davidbellamy1388 4 years ago
God bless Gilbert Strang and may his good health carry on for many more years. He is gifting the world with his work.
@sanjaykrish8719 4 years ago
Prof. Strang is an example that age is no bar to learning.
@AKSHAYAKUMARSEC 3 years ago
Respected sir, may God bless you. You will live for more than 100 years; you are a legend. Love from India ❤️
@robertwest6244 3 years ago
I love this man. He is so concise.
@mathhack8647 2 years ago
May God bless his life and give him health and wellness, mashallah. Pure Pleasure.
@JulieIsMe824 3 years ago
Thanks Prof. Strang!! Very clear and easy to understand!
@tchappyha4034 5 years ago
I think mathematicians are, in general, good at teaching. I guess Professor Gilbert Strang is not a specialist in machine learning, but he can teach neural nets better than specialists in machine learning.
@PrzemyslawSliwinski A year ago
Likely because (three years later, and counting...) machine learning is still more an art and engineering than science.
@iwonakozlowska6134 3 years ago
I like it when Mr. G.S. asks, "Everybody with it?"
@neoblackcyptron 3 years ago
God bless you sir. I see in this video professor Strang has injured his left hand. Yet he taught the class. I don’t know if any of the other viewers noticed that.
@vangjushkomini369 2 years ago
What a fantastic professor :) :)
@georgesadler7830 2 years ago
Professor Strang, thank you for another well-planned lecture on the Structure of Neural Nets for Deep and Machine Learning. Learning is a lifelong process.
@suchalooser1175 3 years ago
What a curiosity at this age!!!
@formationformationb8708 2 years ago
Excellent.
@DarianMiller 2 years ago
35:25 Piecewise linear conferences... multiple dimensional dry humor!
@imrematajz1624 A year ago
Minute 46: There might be a simpler solution. Instead of the binomial formula, I would suggest using Number of Edges + Number of Nodes + 1 = number of flat pieces (spaces): N + E + 1 = S.
@saurabhgupta5662 3 years ago
Does anyone know exactly which paper by Dr. Jon Kleinberg Prof. Strang refers to towards the end?
@moazzamjadoon4436 2 years ago
The formula at the end is given as an exercise in A First Course in Probability by S. M. Ross. It is really based on combinatorics: the formula is built from combinations, i.e. nCr, the number of ways to choose r things out of n.
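For reference, the count being discussed here (the maximum number of flat pieces created by N folds in m dimensions, stated around minute 46 of the lecture) is the binomial sum below; the recursion note and the small worked example are added only as a reading aid.

```latex
% Maximum number of flat pieces r(N,m) produced by N folds (hyperplanes in
% general position) in R^m, as presented in the lecture:
r(N, m) \;=\; \binom{N}{0} + \binom{N}{1} + \cdots + \binom{N}{m}
       \;=\; \sum_{k=0}^{m} \binom{N}{k}.
% It satisfies the recursion r(N, m) = r(N-1, m) + r(N-1, m-1):
% the old folds cut the new fold into r(N-1, m-1) pieces, and each of those
% pieces splits one existing region in two.
% Example in the plane (m = 2): r(3, 2) = 1 + 3 + 3 = 7 pieces from 3 folds.
```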
@NothingMaster 5 years ago
“I won, but it didn’t pay off!” 😂
@seraphimwang 2 years ago
I would pay to have a lecture with this gentleman, one day. Hilarious how people laugh and enjoy. What a brilliant and soulful person he is 🎩
@dcrespin A year ago
For BPP, deep learning, and the general structure of neural networks, the following comments may be useful. To begin with, note that instead of partial derivatives one can work with derivatives as the linear transformations they really are. It is also possible to look at the networks in a more structured manner. The basic ideas of BPP can then be applied in much more general cases. Several steps are involved.
1. More general processing units. Any continuously differentiable function of inputs and weights will do; these inputs and weights can belong, beyond Euclidean spaces, to any Hilbert space. Derivatives are linear transformations, and the derivative of a neural processing unit is the direct sum of its partial derivatives with respect to the inputs and with respect to the weights; this is a linear transformation expressed as the sum of its restrictions to a pair of complementary subspaces.
2. More general layers (any number of units). Single-unit layers can create a bottleneck that renders the whole network useless. Putting several units together in one layer is equivalent to taking their product (as functions, in the sense of set theory). The layers are functions of the inputs and of the weights of the totality of the units. The derivative of a layer is then the product of the derivatives of the units; this is a product of linear transformations.
3. Networks with any number of layers. A network is the composition (as functions, in the set-theoretical sense) of its layers. By the chain rule, the derivative of the network is the composition of the derivatives of the layers; this is a composition of linear transformations.
4. Quadratic error of a function. ...
Since this comment is becoming too long I will stop here. The point is that a very general viewpoint clarifies many aspects of BPP. If you are interested in the full story and have some familiarity with Hilbert spaces, please google for papers dealing with backpropagation in Hilbert spaces. A related article with matrix formulas for backpropagation on semilinear networks is also available. For a glimpse into a completely new deep learning algorithm that is orders of magnitude more efficient, controllable, and faster than BPP, search on this platform for a video about deep learning without backpropagation; in its description there are links to demo software. The new algorithm is based on the following very general and powerful result (google it): polyhedrons and perceptrons are functionally equivalent. For the elementary conceptual basis of NNs see the article Neural Network Formalism.
Daniel Crespin
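As a reading aid for the viewpoint sketched in the comment above (not the commenter's own algorithm, and not Prof. Strang's notation), here is a minimal numerical sketch of derivatives acting as linear maps composed by the chain rule, on an assumed toy two-layer ReLU network; the shapes, random seed, and finite-difference check are all illustrative choices.

```python
import numpy as np

# Each layer is a function of (inputs, weights); its derivative is a linear map
# (a Jacobian), and the derivative of the whole network is the composition of
# those linear maps. Toy network: x -> ReLU(W1 @ x) -> W2 @ h, squared error.

rng = np.random.default_rng(0)
x  = rng.normal(size=3)          # input
y  = rng.normal(size=2)          # target
W1 = rng.normal(size=(4, 3))     # first-layer weights
W2 = rng.normal(size=(2, 4))     # second-layer weights

# Forward pass
z   = W1 @ x                     # pre-activation
h   = np.maximum(z, 0.0)         # ReLU
out = W2 @ h
loss = 0.5 * np.sum((out - y) ** 2)

# Backward pass: apply the transposed Jacobians in reverse order (chain rule).
dL_dout = out - y                            # derivative of the squared error
dL_dW2  = np.outer(dL_dout, h)               # gradient of the loss w.r.t. W2
dL_dh   = W2.T @ dL_dout                     # Jacobian of out w.r.t. h is W2
dL_dz   = dL_dh * (z > 0)                    # ReLU Jacobian is diagonal with 0/1 entries
dL_dW1  = np.outer(dL_dz, x)                 # gradient of the loss w.r.t. W1

# Sanity check of one entry against a finite difference (illustrative only).
eps = 1e-6
W1p = W1.copy(); W1p[0, 0] += eps
loss_p = 0.5 * np.sum((W2 @ np.maximum(W1p @ x, 0.0) - y) ** 2)
print(dL_dW1[0, 0], (loss_p - loss) / eps)   # the two numbers should agree closely
```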
@DistortedV12 3 years ago
46:46 "he had m=2, but we should yeah grow up and get m to be 5 dimensional" lol this guy is so funny :)
@GauravSharma-ui4yd 4 years ago
Doesn't the number of flat pieces depend on how we fold the plane? If the second fold is parallel to the first fold, we get 3 flat pieces rather than 4, and the same argument holds for 3, 4, ... folds. So how can we rely on that formula, which gives incorrect answers when we change the folds while keeping their number the same, since it doesn't consider the spatial pattern of the folds? Am I right, or did I just miss something important?
@abcdxx1059 4 years ago
I have the same question.
@bogdansavkovic3441 4 years ago
Every fold (or line) has to cross previously made ones (all of them), so there are no parallel folds.
@seraphimwang 2 years ago
Precisely, that should be part of the assumption; otherwise this recursion formula (binomial coefficients) doesn't hold. The formula assumes unordered selection without repetition; look up combinatorics, k-permutations.
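To make the assumption in this thread concrete, here is a small sketch (our own, not from the lecture) of the recursion from minute 46; it counts the maximum number of pieces, which is attained only when the folds are in general position, exactly as the replies above say.

```python
from math import comb
from functools import lru_cache

# r(N, m) is the MAXIMUM number of flat pieces N folds can create in m
# dimensions, attained only when the folds are in general position
# (no two parallel, no m+1 of them through a common point).

@lru_cache(maxsize=None)
def r(N: int, m: int) -> int:
    if N == 0:
        return 1                      # no folds: one piece
    if m == 0:
        return 1                      # a point cannot be subdivided
    # The old folds cut the new fold into r(N-1, m-1) pieces,
    # and each of those pieces splits one existing region in two.
    return r(N - 1, m) + r(N - 1, m - 1)

def r_closed(N: int, m: int) -> int:
    return sum(comb(N, k) for k in range(m + 1))

# The recursion and the binomial sum agree, e.g. r(2, 2) = 4, whereas two
# PARALLEL folds give only 3 pieces, fewer than the maximum the formula counts.
assert all(r(N, m) == r_closed(N, m) for N in range(8) for m in range(5))
print(r(2, 2), r(3, 2), r(4, 2))      # 4 7 11
```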
@wagsman9999 11 months ago
We live in the best of times and the worst of times. Social media content ranges from fantastic learning content like this… to conspiracy-laden content leading to an attack on our nation's great Capitol.
@moazzamjadoon4436 2 years ago
For example, at 7 or 8 minutes, see the video kzbin.info/www/bejne/l4iZZGCkorCEmaM
@High_Priest_Jonko 4 years ago
Why not just add a shit load of neurons and features to boost your chances of successful classification?
@ulfadatau5798 3 years ago
Because this might produce a model that overfits, i.e. does not generalize well to new data, and it's very computationally expensive.
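A hedged toy illustration of this reply (the data, degrees, and noise level are made up purely for demonstration): with many more parameters the training error can drop to nearly zero while the error on fresh test data grows, which is the overfitting being described.

```python
import numpy as np

# Fit a low-degree and a high-degree polynomial to the same noisy linear data
# and compare the error on the training points with the error on new points.

rng = np.random.default_rng(1)
x_train = np.linspace(-1, 1, 12)
x_test  = np.linspace(-1, 1, 100)
true_f  = lambda x: 2.0 * x            # the underlying relationship (made up)
y_train = true_f(x_train) + 0.3 * rng.normal(size=x_train.size)
y_test  = true_f(x_test)  + 0.3 * rng.normal(size=x_test.size)

for degree in (1, 11):                  # few parameters vs. many parameters
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err  = np.mean((np.polyval(coeffs, x_test)  - y_test) ** 2)
    print(f"degree {degree:2d}: train MSE = {train_err:.3f}, test MSE = {test_err:.3f}")
```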
@Darkev77 2 years ago
Oh no, 4:30 beware LGBTQ+!