26. Structure of Neural Nets for Deep Learning

50,620 views

MIT OpenCourseWare

MIT 18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning, Spring 2018
Instructor: Gilbert Strang
View the complete course: ocw.mit.edu/18-065S18
YouTube Playlist: • MIT 18.065 Matrix Meth...
This lecture is about the central structure of deep neural networks, which are a major force in machine learning. The aim is to find the function that's constructed to learn the training data and then apply it to the test data.
License: Creative Commons BY-NC-SA
More information at ocw.mit.edu/terms
More courses at ocw.mit.edu

Comments: 32
@davidbellamy1388 4 years ago
God bless Gilbert Strang and may his good health carry on for many more years. He is gifting the world with his work.
@sanjaykrish8719 4 years ago
Prof. Strang is an example that age is no bar to learning.
@AKSHAYAKUMARSEC 3 years ago
Respected sir, may God bless you. You will live for more than 100 years; you are a legend. Love from India ❤️
@robertwest6244 3 years ago
I love this man. He is so concise.
@mathhack8647 2 years ago
May God bless his life and give him health and wellness, mashallah. Pure Pleasure.
@JulieIsMe824 3 years ago
Thanks Prof. Strang!! Very clear and easy to understand!
@tchappyha4034 5 years ago
I think mathematicians are, in general, good at teaching. I guess Professor Gilbert Strang is not a specialist in machine learning, but he can teach neural nets better than specialists in machine learning.
@PrzemyslawSliwinski A year ago
Likely because (three years later, and counting...) machine learning is still more an art and engineering than science.
@iwonakozlowska6134 3 years ago
I like it when Mr. G.S. asks, "Everybody with it?"
@neoblackcyptron 3 years ago
God bless you sir. I see in this video professor Strang has injured his left hand. Yet he taught the class. I don’t know if any of the other viewers noticed that.
@vangjushkomini369 2 years ago
What a fantastic professor :) :)
@georgesadler7830 2 years ago
Professor Strang, thank you for another well-planned lecture on the Structure of Neural Nets for Deep and Machine Learning. Learning is a lifelong process.
@suchalooser1175 3 years ago
What a curiosity at this age!!!
@formationformationb8708 2 years ago
Excellent.
@DarianMiller 2 years ago
35:25 Piecewise linear conferences... multiple dimensional dry humor!
@imrematajz1624 A year ago
Minute 46: There might be a simpler solution. Instead of the binomial formula, I would suggest using Number of Edges + Number of Nodes + 1 = number of flat pieces (spaces): N + E + 1 = S.
@saurabhgupta5662 3 years ago
Does anyone know exactly which paper by Dr. Jon Kleinberg Prof. Strang refers to towards the end?
@moazzamjadoon4436 2 years ago
The formula at the end is given as an exercise in A First Course in Probability by S. M. Ross. It is really based on combinatorics: the formula is built from combinations, i.e. nCr, the number of ways to choose r things out of n.
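For reference, the count being discussed here (the maximum number of flat pieces created by N folds in m dimensions, stated around minute 46 of the lecture) is the binomial sum below; the recursion note and the small worked example are added only as a reading aid.

```latex
% Maximum number of flat pieces r(N,m) produced by N folds (hyperplanes in
% general position) in R^m, as presented in the lecture:
r(N, m) \;=\; \binom{N}{0} + \binom{N}{1} + \cdots + \binom{N}{m}
       \;=\; \sum_{k=0}^{m} \binom{N}{k}.
% It satisfies the recursion r(N, m) = r(N-1, m) + r(N-1, m-1):
% the old folds cut the new fold into r(N-1, m-1) pieces, and each of those
% pieces splits one existing region in two.
% Example in the plane (m = 2): r(3, 2) = 1 + 3 + 3 = 7 pieces from 3 folds.
```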
@NothingMaster 5 years ago
“I won, but it didn’t pay off!” 😂
@seraphimwang 2 years ago
I would pay to have a lecture with this gentleman, one day. Hilarious how people laugh and enjoy. What a brilliant and soulful person he is 🎩
@dcrespin A year ago
For BPP, deep learning, and the general structure of neural networks, the following comments may be useful. To begin with, note that instead of partial derivatives one can work with derivatives as the linear transformations they really are. It is also possible to look at the networks in a more structured manner. The basic ideas of BPP can then be applied in much more general cases. Several steps are involved.
1. More general processing units. Any continuously differentiable function of inputs and weights will do; these inputs and weights can belong, beyond Euclidean spaces, to any Hilbert space. Derivatives are linear transformations, and the derivative of a neural processing unit is the direct sum of its partial derivatives with respect to the inputs and with respect to the weights; this is a linear transformation expressed as the sum of its restrictions to a pair of complementary subspaces.
2. More general layers (any number of units). Single-unit layers can create a bottleneck that renders the whole network useless. Putting several units together in one layer is equivalent to taking their product (as functions, in the sense of set theory). The layers are functions of the inputs and of the weights of the totality of the units. The derivative of a layer is then the product of the derivatives of the units; this is a product of linear transformations.
3. Networks with any number of layers. A network is the composition (as functions, in the set-theoretical sense) of its layers. By the chain rule, the derivative of the network is the composition of the derivatives of the layers; this is a composition of linear transformations.
4. Quadratic error of a function. ...
Since this comment is becoming too long I will stop here. The point is that a very general viewpoint clarifies many aspects of BPP. If you are interested in the full story and have some familiarity with Hilbert spaces, please google for papers dealing with backpropagation in Hilbert spaces. A related article with matrix formulas for backpropagation on semilinear networks is also available. For a glimpse into a completely new deep learning algorithm that is orders of magnitude more efficient, controllable, and faster than BPP, search on this platform for a video about deep learning without backpropagation; in its description there are links to demo software. The new algorithm is based on the following very general and powerful result (google it): polyhedrons and perceptrons are functionally equivalent. For the elementary conceptual basis of NNs see the article Neural Network Formalism.
Daniel Crespin
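As a reading aid for the viewpoint sketched in the comment above (not the commenter's own algorithm, and not Prof. Strang's notation), here is a minimal numerical sketch of derivatives acting as linear maps composed by the chain rule, on an assumed toy two-layer ReLU network; the shapes, random seed, and finite-difference check are all illustrative choices.

```python
import numpy as np

# Each layer is a function of (inputs, weights); its derivative is a linear map
# (a Jacobian), and the derivative of the whole network is the composition of
# those linear maps. Toy network: x -> ReLU(W1 @ x) -> W2 @ h, squared error.

rng = np.random.default_rng(0)
x  = rng.normal(size=3)          # input
y  = rng.normal(size=2)          # target
W1 = rng.normal(size=(4, 3))     # first-layer weights
W2 = rng.normal(size=(2, 4))     # second-layer weights

# Forward pass
z   = W1 @ x                     # pre-activation
h   = np.maximum(z, 0.0)         # ReLU
out = W2 @ h
loss = 0.5 * np.sum((out - y) ** 2)

# Backward pass: apply the transposed Jacobians in reverse order (chain rule).
dL_dout = out - y                            # derivative of the squared error
dL_dW2  = np.outer(dL_dout, h)               # gradient of the loss w.r.t. W2
dL_dh   = W2.T @ dL_dout                     # Jacobian of out w.r.t. h is W2
dL_dz   = dL_dh * (z > 0)                    # ReLU Jacobian is diagonal with 0/1 entries
dL_dW1  = np.outer(dL_dz, x)                 # gradient of the loss w.r.t. W1

# Sanity check of one entry against a finite difference (illustrative only).
eps = 1e-6
W1p = W1.copy(); W1p[0, 0] += eps
loss_p = 0.5 * np.sum((W2 @ np.maximum(W1p @ x, 0.0) - y) ** 2)
print(dL_dW1[0, 0], (loss_p - loss) / eps)   # the two numbers should agree closely
```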
@DistortedV12 3 years ago
46:46 "he had m=2, but we should yeah grow up and get m to be 5 dimensional" lol this guy is so funny :)
@GauravSharma-ui4yd 4 years ago
Doesn't the number of flat pieces depend on how we fold the plane? If the second fold is parallel to the first fold, we get 3 flat pieces rather than 4, and the same argument holds for 3, 4, ... folds. So how can we rely on that formula, which gives incorrect answers when we change the folds while keeping their number the same, since it doesn't consider the spatial pattern of the folds? Am I right, or did I just miss something important?
@abcdxx1059 4 years ago
I have the same question.
@bogdansavkovic3441 4 years ago
Every fold (or line) has to cross previously made ones (all of them), so there are no parallel folds.
@seraphimwang 2 years ago
Precisely, that should be part of the assumption; otherwise this recursion formula (binomial coefficients) doesn't hold. The formula assumes unordered selection without repetition; look up combinatorics, k-permutations.
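To make the assumption in this thread concrete, here is a small sketch (our own, not from the lecture) of the recursion from minute 46; it counts the maximum number of pieces, which is attained only when the folds are in general position, exactly as the replies above say.

```python
from math import comb
from functools import lru_cache

# r(N, m) is the MAXIMUM number of flat pieces N folds can create in m
# dimensions, attained only when the folds are in general position
# (no two parallel, no m+1 of them through a common point).

@lru_cache(maxsize=None)
def r(N: int, m: int) -> int:
    if N == 0:
        return 1                      # no folds: one piece
    if m == 0:
        return 1                      # a point cannot be subdivided
    # The old folds cut the new fold into r(N-1, m-1) pieces,
    # and each of those pieces splits one existing region in two.
    return r(N - 1, m) + r(N - 1, m - 1)

def r_closed(N: int, m: int) -> int:
    return sum(comb(N, k) for k in range(m + 1))

# The recursion and the binomial sum agree, e.g. r(2, 2) = 4, whereas two
# PARALLEL folds give only 3 pieces, fewer than the maximum the formula counts.
assert all(r(N, m) == r_closed(N, m) for N in range(8) for m in range(5))
print(r(2, 2), r(3, 2), r(4, 2))      # 4 7 11
```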
@wagsman9999 11 months ago
We live in the best of times and the worst of times. Social media content ranges from fantastic learning content like this… to conspiracy-laden content leading to an attack on our nation's great Capitol.
@moazzamjadoon4436 2 years ago
For example, at 7 or 8 minutes, see the video kzbin.info/www/bejne/l4iZZGCkorCEmaM
@High_Priest_Jonko 4 years ago
Why not just add a shit load of neurons and features to boost your chances of successful classification?
@ulfadatau5798 3 years ago
Because this might produce a model that overfits, i.e. does not generalize well to new data, and it's very computationally expensive.
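A hedged toy illustration of this reply (the data, degrees, and noise level are made up purely for demonstration): with many more parameters the training error can drop to nearly zero while the error on fresh test data grows, which is the overfitting being described.

```python
import numpy as np

# Fit a low-degree and a high-degree polynomial to the same noisy linear data
# and compare the error on the training points with the error on new points.

rng = np.random.default_rng(1)
x_train = np.linspace(-1, 1, 12)
x_test  = np.linspace(-1, 1, 100)
true_f  = lambda x: 2.0 * x            # the underlying relationship (made up)
y_train = true_f(x_train) + 0.3 * rng.normal(size=x_train.size)
y_test  = true_f(x_test)  + 0.3 * rng.normal(size=x_test.size)

for degree in (1, 11):                  # few parameters vs. many parameters
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err  = np.mean((np.polyval(coeffs, x_test)  - y_test) ** 2)
    print(f"degree {degree:2d}: train MSE = {train_err:.3f}, test MSE = {test_err:.3f}")
```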
@Darkev77 2 years ago
Oh no, 4:30 beware LGBTQ+!