The Problem with Gradient Descent

2,314 views

0Mean1Sigma

A day ago

Comments: 15
@42svb58 · 10 months ago
Incredible video. I've watched many videos on getting into data science and machine learning, and this is the best one in terms of graphics, content, pace, and complexity.
@korigamik · 9 months ago
Man, I really like this video! Can you share with us the code for the animations you used in the video?
@0mean1sigma · 9 months ago
Glad you liked the video. Unfortunately, the complete code for the animations got deleted accidentally, since I didn't have any kind of workflow back then (the animation of the network learning is available on GitHub, with the link in the video description). I've since improved on that and have uploaded the animation code for my latest video. I'm really sorry about this again.
@mister-8658 · a year ago
You have my vote for SoME.
@0mean1sigma · a year ago
I appreciate it, thanks a lot. I tried to keep it simple, and I'm a little worried that maybe I made it too simple (and, in turn, less interesting).
@mister-8658 · a year ago
@0mean1sigma I think you found a decent balance; here's hoping you place.
@sakchais · a year ago
These videos are excellent!
@0mean1sigma · a year ago
Thanks!!!
@swfsql · a year ago
Really good video and explanation! I'm not sure if this is what you meant, but I'd say gradient descent is pretty good, provided we have infinite steps to work with. If we do, we can make the updates approach zero and avoid any "jumps". But since we don't have infinite steps to work with, then yeah, some techniques help with the updates!
@0mean1sigma · a year ago
Not entirely true. I've tried training a basic neural net on the MNIST dataset using vanilla GD, and it fails to converge. For visualisation I kept the dimensions to 2, but in higher dimensions this becomes a serious problem. That's why you see optimisers like Adam used almost all the time in practice.
@swfsql · a year ago
@0mean1sigma I understand, but you didn't train with infinite time on your hands (pushing alpha towards ~0, like 1e-36). I agree that in practice, with limited training time, those problems with GD become real.
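To make this exchange concrete, here is a minimal NumPy sketch (not the author's code; the toy quadratic, learning rates, and step counts are illustrative assumptions). On an ill-conditioned surface, the largest step size that keeps vanilla GD stable leaves the shallow direction crawling, while Adam's per-coordinate scaling covers both directions in the same number of steps.

import numpy as np

# Toy loss with very different curvature along each axis:
# f(x, y) = 0.5 * (100 * x^2 + y^2). The narrow "valley" along y
# is the kind of landscape where vanilla GD crawls.
def grad(p):
    x, y = p
    return np.array([100.0 * x, y])

def vanilla_gd(p0, lr=0.009, steps=200):
    # lr must stay below 2/100 or the steep x-direction diverges,
    # which is what forces the tiny step size here.
    p = p0.copy()
    for _ in range(steps):
        p -= lr * grad(p)
    return p

def adam(p0, lr=0.1, steps=200, b1=0.9, b2=0.999, eps=1e-8):
    p = p0.copy()
    m = np.zeros_like(p)  # running mean of gradients
    v = np.zeros_like(p)  # running mean of squared gradients
    for t in range(1, steps + 1):
        g = grad(p)
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g ** 2
        m_hat = m / (1 - b1 ** t)  # bias corrections
        v_hat = v / (1 - b2 ** t)
        p -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return p

start = np.array([1.0, 1.0])
print("vanilla GD:", vanilla_gd(start))  # y is still far from 0 after 200 steps
print("Adam:      ", adam(start))        # both coordinates end up much closer to 0

Pushing lr towards zero (the "infinite steps" regime mentioned above) does eventually converge, but the number of steps needed grows with the condition number, which is the practical point being made.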
@KarmaKomet · a year ago
Great video!
@HanwenJin · a year ago
Good
@notu483 · 10 months ago
Thanks ❤
@ZephyrysBaum · a year ago
subbed!