DeepMind x UCL | Deep Learning Lectures | 2/12 | Neural Networks Foundations
My takeaways:
1. Plan for this lecture 0:40
2. What is not covered in this lecture 1:59
3. Overview
3.1 Neural network applications 3:59
3.2 What started the deep learning revolution 5:10
4. Neural networks 9:17
5. Single-layer neural networks 17:14
5.1 Activation function: sigmoid 17:53
5.2 Loss function: cross-entropy for binary classification 19:50
5.3 Final activation function for multi-class classification: softmax 23:15
5.4 Uses 26:01
5.5 Limitations 27:28
6. Two-layer neural networks 28:15
7. TensorFlow Playground 32:34
8. Universal Approximation Theorem 33:55
9. Deep neural networks 40:29
9.1 Activation function: ReLU 41:05
9.2 Intuition behind network depth 44:20
9.3 Computational graphs 49:00
10. Learning/training 52:27
10.1 Optimizer: Gradient descent 53:24
10.2 Optimizers that are built on gradient descent: Adam, RMSProp 54:39
10.3 Computational graphs for training 55:45
10.4 Backpropagation, chain rule 57:15
10.5 Linear layers as computational graph 1:00:48
10.6 ReLU layers as computational graph 1:02:30
10.7 Softmax as computational graph 1:03:09
10.8 Cross-entropy as computational graph 1:04:04
10.9 "Cross-entropy Jungles" 1:06:00
10.10 Computational graph example: 3-layer MLP with ReLU 1:06:58
11. Pieces of the puzzle: max, conditional execution 1:08:20
12. Practical issues 1:10:44
12.1 Overfitting and regularization 1:10:51
12.2 Lp regularization 1:12:44
12.3 Dropout 1:13:14
12.4 As models grow, their learning dynamics change: double descent 1:13:32
12.5 Diagnosing and debugging 1:16:15
13. Bonus: Multiplicative interactions 1:19:48
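@Amy_Yu2023 4 years ago
Lei Xun, thanks for sharing
@rauljrlara9994 3 years ago
Fool
@kamilziemian995 2 years ago
Thank you.
@intuitivej9327 3 years ago
Fantastic~!! I held my breath and rewound it several times. It is a really good lecture. I am a mother of two with a liberal arts background and a 10-year career break, and I have fallen for the fun of getting to know artificial intelligence. The more I learn about mathematics, the more stories I find it holds. Math is not about numbers but logic and... storytelling as well. Thank you so much from South Korea.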
@bingeltube 4 years ago
Highly recommended! However, the tiny inserts providing further reading recommendations are hard to read. I suggest including these recommended readings in the text section underneath the video link.
@synaesthesis 4 years ago
The slides are in the video description; you can copy the titles and authors of the readings from there. :)
@bingeltube 4 years ago
@@synaesthesis thank you! Pardon my oversight! :-)
@eduardoriossanchez3393 4 years ago
53:17 I think there is a mistake in the Jacobian's formula, in the bottom-left corner.
@pranavpandey2965 4 years ago
yes, it should have been df_k/dx_1
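For reference, a sketch of the Jacobian layout these two comments refer to, assuming a function f from R^n to R^k with inputs x_1, ..., x_n and outputs f_1, ..., f_k (the names n and k are assumptions; the slide itself is not reproduced here). With rows indexed by outputs and columns by inputs, the bottom-left entry is the one named in the reply:

```latex
J = \frac{\partial f}{\partial x} =
\begin{bmatrix}
\frac{\partial f_1}{\partial x_1} & \cdots & \frac{\partial f_1}{\partial x_n} \\
\vdots & \ddots & \vdots \\
\frac{\partial f_k}{\partial x_1} & \cdots & \frac{\partial f_k}{\partial x_n}
\end{bmatrix}
```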
@Alex-ms1yd 4 years ago
Oh, these are just projections! It's such a great intuition, it finally clicked now. Thanks!
@annercamping 4 years ago
my auto play was on and i just woke up to this
@glmchn 2 years ago
🤣
@senorperez 5 months ago
lol
@lukn4100 3 years ago
Great lecture and big thanks to DeepMind for sharing this great content. - Wojciu, keep it up. Great!
@raghavram6419 4 years ago
Wow! Great lecture covering the fundamentals. I liked the focus on computational graphs and on reasoning WHY certain components work or don't work.
@SudhirPratapYadav 3 years ago
same, usually the engineering side of things is missing in deep learning lectures
@danielpark6010 4 years ago
20:50 May I have a brief explanation about what "logarithm of probability of correct/entirely correct classification" in these two slides means? What is the significance of it and why is it helpful to negate it?
@luisleal4169 3 years ago
There are a lot of reasons and statistical considerations, but as an intuitive argument (not a proof), negating it lets you interpret small values as good and big values as bad, which is exactly how a loss function in ML is interpreted.
@Theoneandonly_Justahandle 2 years ago
You might also look at Shannon's information entropy and associated measures: en.wikipedia.org/wiki/Quantities_of_information . In short, -log(p) is a measure of the amount of information an event e with prob. p carries. Intuitively, it tells you how often you'd have to divide your space of possibilities in half in order to locate/find the event e with absolute certainty among all other events. (see also kzbin.info/www/bejne/rGebq4yvlqqge6M&ab_channel=3Blue1Brown for a great explanation)
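A minimal sketch of the point made in this thread, assuming a binary classifier that outputs the probability of label 1. The function names are hypothetical and not taken from the lecture. It shows that the negated log-probability of the correct label is small for confident correct predictions and large for confident wrong ones, and that summing it over examples equals the negated log of the probability of getting every example right.

```python
import math


def negative_log_likelihood(p_label_one: float, label: int) -> float:
    """Negated log of the probability the model assigns to the correct label."""
    p_correct = p_label_one if label == 1 else 1.0 - p_label_one
    return -math.log(p_correct)


def dataset_loss(predictions, labels) -> float:
    """Sum of per-example losses.

    Assuming independent examples, the probability of classifying all of them
    correctly is a product, and the log of a product is a sum of logs.
    """
    return sum(negative_log_likelihood(p, y) for p, y in zip(predictions, labels))


def surprisal_bits(p: float) -> float:
    """Information content -log2(p): the number of halvings needed to single out the event."""
    return -math.log2(p)


confident_right = negative_log_likelihood(0.99, 1)  # small loss
unsure = negative_log_likelihood(0.5, 1)            # medium loss
confident_wrong = negative_log_likelihood(0.01, 1)  # large loss
```

An event with probability 1/8 carries 3 bits under `surprisal_bits`, which matches the "divide the space in half" picture from the last reply.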
@christopherparsonson7119 4 years ago
Are the slides for this series available?
@vmikeyboi323 4 years ago
this
@jingtao1181 4 years ago
@@vmikeyboi323 where?
@mateusdeassissilva8009 4 years ago
@@vmikeyboi323 where?
4 years ago
@@jingtao1181 When people just comment "this", it usually means something like "I agree". Don't ask me why the word "this", I don't know either.
@jingtao1181 4 years ago
@ thanks for the reminder
@neurophilosophers994 4 years ago
If a NN can't compute distances without multiplicative dot products, how did AlphaFold calculate the evolution of protein folding states entirely based on distances?
@JY-pf7bc 4 years ago
Intuitive, fun, woven with valuable experience and new research results. Excellent!
@taimurzahid7877 4 years ago
Can we please get the slides for this series?
@Aeradill 4 years ago
They are in the video description, Taimur.
@nguyenngocly1484 4 years ago
You can turn artificial neural networks inside-out by using fixed dot products (weighted sums) and adjustable (parametric) activation functions. The fixed dot products can be computed very quickly using fast transforms like the FFT. Also, the number of overall parameters required is vastly reduced. The dot products of the transform act as statistical summary measures, ensuring good behavior. See Fast Transform (fixed filter bank) neural networks.
@jingtao1181 4 years ago
Thank you for the lecture! In 3e-5, what does e stand for?
@AndreyButenko 4 years ago
That is scientific notation: 3e-5 is the same as 3 * 10 ^ -5 which is 0.00003 😊
@jingtao1181 4 years ago
@@AndreyButenko Thank you so much. So this means that setting the learning rate at 0.00003 is really helpful?
@bingeltube 4 years ago
e stands for Euler's constant and it refers in this case to the exponential function; see e.g. en.wikipedia.org/wiki/E_(mathematical_constant)
@jingtao1181 4 years ago
@@bingeltube Hi, I was wondering before whether e is a constant, but it does not make sense to me either. If e is a constant approximately 2.7, then 3e - 5 = 3.1. However, I think an effective learning rate is somewhere between 0.001 and 0.1.
@jingtao1181 4 years ago
@@bingeltube I think Andrey's answer makes more sense.
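A quick check of the two readings discussed in this thread, as a sketch in plain Python (the variable names are made up for illustration). In a numeric literal, `e` is the scientific-notation exponent marker, so `3e-5` means 3 times 10 to the power -5. The other reading, 3 times Euler's number minus 5, is a different and much larger value.

```python
import math

# Scientific notation: the "e" in a float literal means "times ten to the power of".
learning_rate = 3e-5
written_out = 3 * 10 ** -5

# The alternative reading from the thread: 3 times Euler's number, minus 5.
euler_reading = 3 * math.e - 5

print(f"{learning_rate:.5f}")   # fixed-point form of the literal
print(f"{euler_reading:.2f}")   # a value a bit above 3, not a small step size
```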
@sumanthnandamuri2168 4 years ago
It would be even better if you could release the course assignments to the public for practice.
@xmtiaz 4 years ago
33:29 "Play is the highest form of research" - Albert Einstein
@havelozo 2 years ago
What software was used to make these slides?
@davidolushola3419 4 years ago
Wow it's awesome. Please can I get the slides for this lecture? Thanks 😊
@SudhirPratapYadav 3 years ago
It's in the video description.
@iinarrab19 4 years ago
Intuitive and explains in detail.
@pervezbhan1708 2 years ago
kzbin.info/www/bejne/qJC0YmWLfsuAoqc
@jonathan-._.- 4 years ago
:o there is a writing mistake in "neural networks as computational graphs": sotfmax instead of softmax
@ramakanthrama8578 4 years ago
haha
@marcospereira6034 4 years ago
The explanations are a little too handwavy in this lecture. Wojciech seems to assume some intuitions are obvious when they aren't for someone without a lot of experience in the field. For example, when showing how sigmoids can emulate an arbitrary function, he said we can just "average" the sigmoid and the reversed sigmoid to form a bump, without mentioning that this averaging comes from the softmax (or am I wrong and it is something else?).
@marcospereira6034 4 years ago
Still, I appreciate the effort and think the lecture is great overall. Just need to complement it with other sources. Speaking of which, I would recommend Chris Olah's post on understanding neural networks through topology: colah.github.io/posts/2014-03-NN-Manifolds-Topology/ Thank you guys at deepmind for this course!
@marcospereira6034 4 years ago
Also at 1:22:06, it's hard to interpret the graphs - it's not immediately obvious what blue and green represent. Why would one assume the labels are obvious? Graphs should always be labelled 🙂
@AeroGDrive 4 years ago
I think the averaging with the reverse sigmoid just comes from a second neuron. In fact, he says there are 6 neurons, and each pair of neurons is the sigmoid + reverse sigmoid. We then have 3 resulting bells (as in the graph), which are combined by a weighted average in the next layer.
@eyeofhorus1301 4 years ago
@@marcospereira6034 And if you know nothing like me there's soo soooo much more that he's hand wavy about and expects you to grasp incredibly quickly... lol
@heyna88 2 years ago
Idk. I felt the entire explanation was quite superficial. I'm not sure whether this depends on the audience though
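A small sketch of the bump construction discussed in this thread, following the reading given in the reply above: each bump uses two hidden sigmoid neurons (one rising, one reversed), and the next layer takes a weighted sum of the bumps. No softmax is involved in this version. The steepness value, the interval endpoints, and the `-1.0` offset (which would live in the output bias) are assumptions chosen for illustration, not values from the lecture.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def bump(x: float, left: float, right: float, steepness: float = 50.0) -> float:
    """Roughly 1 for left < x < right and roughly 0 elsewhere, built from two sigmoids."""
    rising = sigmoid(steepness * (x - left))      # switches on at `left`
    falling = sigmoid(-steepness * (x - right))   # reversed sigmoid, switches off at `right`
    # Both are near 1 inside the interval; exactly one is near 0 outside it.
    return rising + falling - 1.0


def approximate(x: float, bumps) -> float:
    """Weighted sum of bumps; `bumps` is a list of (left, right, height) triples."""
    return sum(height * bump(x, left, right) for left, right, height in bumps)


# Three bumps means six hidden sigmoid neurons, matching the count in the reply above.
three_bumps = [(0.0, 1.0, 0.5), (1.0, 2.0, 1.5), (2.0, 3.0, 1.0)]
value_in_middle_bump = approximate(1.5, three_bumps)
```

Making the bumps narrower and more numerous gives a finer piecewise-constant fit, which is the intuition behind the approximation argument at 33:55.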
@TheNotoriousPhD 4 years ago
Thanks for this interesting series. Is there any place I can get the related math knowledge?
@123456wei 4 years ago
same, would love to have some links to related math content
@SudhirPratapYadav 3 years ago
Go for engineering mathematics rather than pure mathematics. You will find the same topics in pure mathematics too, but learn them from an engineering perspective, so select courses/books named "engineering mathematics". Topics: linear algebra, multivariate calculus, optimisation, graph theory, discrete mathematics and, most important in this case, numerical methods for _________ (fill in whatever you like).
@SudhirPratapYadav 3 years ago
I forgot to mention: finite point arithmetic.
@JakobGille2 4 years ago
Me: being happy about a deep learning lecture. Also me: seeing complicated formulas and closing the tab.
@speedfastman 4 years ago
Don't let it discourage you!
@jingtao1181 4 years ago
just focus on the concept. There are major APIs that can do the math for you.
@yifanyang806 4 years ago
It's not that complicated, don't give up.
@luksdoc 4 years ago
A very nice lecture.
@mahsaabtahi6633 4 years ago
I have to try hard to hear and understand the lecturer. I wish he spoke more clearly.
@salomeolusoji7551 4 years ago
Use captions.
@aromax504 4 years ago
Can someone explain the term "numerically stable" in a simpler way?
@aromax504 4 years ago
@Prasad Seemakurthi Thank you Prasad. It was really helpful.
@marcospereira6034 4 years ago
numerical stability refers to the accumulation of errors. an algorithm that is numerically unstable is sensitive to errors and allows them to accumulate, causing the final result to diverge from the correct value.
@SudhirPratapYadav 3 years ago
I need to clarify one more thing: here the 'error' really comes from converting things from continuous to discrete. Basically, you can't implement continuous computations on computers (actually you can, with analytical/symbolic maths, but leaving that aside). So what we do is discretize (convert to numbers, in this case), and this introduces some 'errors' which accumulate, so the computations don't converge (they usually blow up to infinity or collapse to 0). So even if the function converges in theory (in the continuous sense), it will not do so in the actual computation on a computer, and is thus numerically unstable.
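A minimal sketch of what this thread describes, using softmax as the example since it is the function covered in the lecture. Both functions below compute the same mathematical quantity, but the naive one overflows in floating point for large inputs, while the shifted one does not. The function names are hypothetical, and subtracting the maximum is a standard trick, assumed here rather than quoted from the slides.

```python
import math


def softmax_naive(logits):
    """Direct translation of the formula; exp() overflows for large logits."""
    exps = [math.exp(z) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_stable(logits):
    """Same result on paper: subtracting the max leaves the ratios unchanged."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]  # every exponent is <= 0
    total = sum(exps)
    return [e / total for e in exps]


def naive_overflows(logits) -> bool:
    """Report whether the naive version fails on these inputs."""
    try:
        softmax_naive(logits)
    except OverflowError:
        return True
    return False


# Rounding error in ordinary floating-point addition: not exactly zero.
rounding_error = (0.1 + 0.2) - 0.3

big_logits = [1000.0, 1001.0, 1002.0]
stable_result = softmax_stable(big_logits)
```

On small inputs the two versions agree; on `big_logits` only the stable one returns probabilities.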
@lizgichora6472 3 years ago
Thank you.
@wy2528 4 years ago
It is super clear
4 years ago
1:04:30 Hold on… Are you telling me that, given a neural network and a set of weights, I could generate a picture of the most doggish dog ever? :D Edit: The answer is "yes"! kzbin.info/www/bejne/qZm5fJuForljfqc
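A toy sketch of the idea in this comment, under heavy assumptions: instead of a deep network, it uses a hypothetical one-layer "dog" classifier over four pixels with hand-picked weights, and performs gradient ascent on the input while the weights stay fixed. For a real network the input gradient would come from backpropagation through the computational graph. Every name and number here is made up for illustration.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical, hand-picked weights of a one-layer "dog" classifier over 4 pixels.
WEIGHTS = [2.0, -1.0, 0.5, -3.0]
BIAS = 0.0


def dog_probability(pixels) -> float:
    logit = sum(w * x for w, x in zip(WEIGHTS, pixels)) + BIAS
    return sigmoid(logit)


def most_doggish(pixels, steps: int = 200, step_size: float = 0.1):
    """Gradient ascent on the input: change the pixels, keep the weights fixed."""
    pixels = list(pixels)
    for _ in range(steps):
        p = dog_probability(pixels)
        # d/dx log sigmoid(w.x + b) = (1 - p) * w
        pixels = [
            min(1.0, max(0.0, x + step_size * (1.0 - p) * w))  # keep pixels in [0, 1]
            for x, w in zip(pixels, WEIGHTS)
        ]
    return pixels


start = [0.5, 0.5, 0.5, 0.5]  # a uniform grey "image"
result = most_doggish(start)
```

After the loop, `dog_probability(result)` is higher than `dog_probability(start)`: the input has been pushed toward what this toy classifier considers most dog-like.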
@s3zine342 3 years ago
35:57
@ThePentanol 2 years ago
I really like this explanation, yet it is not the best. There are a lot of blind spots in this work. This course is not for beginners.
@s3zine342 3 years ago
9:18
@staceymcsharry2725 4 years ago
The wet calendar normally tempt because tomato alternately release inside a obnoxious open. chubby, exultant panty
@padenzimmermann1892 4 years ago
*Neural Net self destructs*