"What's the point of going fast if you don't know the direction?" seriously amazing advice!
@cedricvillani8502 Жыл бұрын
That's what makes racing so much fun
@Verrisin Жыл бұрын
running away
@rafagd Жыл бұрын
He's stochastically gradient-descending himself.
@lolcat6910 ай бұрын
hearing this just after I've fallen off my skate lmfao, actually good advice
@flobuilds Жыл бұрын
4 hours of fun let's go
@eboubaker3722 Жыл бұрын
Go where, go where??
@Humble_Electronic_Musician11 ай бұрын
My Favorite quote: 2:46:46 "It is not difficult per se: it is complicated"
@pauleagle97 Жыл бұрын
After around 7 days of digesting the calculus, designing and implementing an algorithm (Python), I got it all working today. It outperformed the finite difference computation, yielding the same cost reduction 30 times faster! I learned so much and so fast, I could hardly imagine any university giving me that. Your contribution to the ML community is terrific. Thank you lad, and I can't wait to see the rest of your streams dedicated to this fascinating topic 💪
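For readers curious about that comparison, here is a minimal, self-contained sketch of the finite-difference approach on a one-weight toy model (names such as cost, eps, and rate are illustrative and not taken from the stream's code). Finite differencing needs extra cost evaluations per parameter per update, which is why the analytic backprop gradient pulls ahead so dramatically.

```c
// Minimal sketch: fit y = 2*x with a single weight using a
// finite-difference estimate of the gradient of the MSE cost.
#include <stdio.h>

static float xs[] = {1, 2, 3, 4};
static float ys[] = {2, 4, 6, 8};
#define N 4

// Mean squared error of the one-weight model y ~= w*x.
static float cost(float w)
{
    float c = 0.0f;
    for (int i = 0; i < N; ++i) {
        float d = w*xs[i] - ys[i];
        c += d*d;
    }
    return c / N;
}

int main(void)
{
    float w    = 0.5f;   // arbitrary initial weight
    float eps  = 1e-3f;  // finite-difference step
    float rate = 1e-2f;  // learning rate
    for (int i = 0; i < 1000; ++i) {
        // dC/dw approximated as (C(w+eps) - C(w)) / eps.
        float dw = (cost(w + eps) - cost(w)) / eps;
        w -= rate*dw;
    }
    printf("w = %f, cost = %f\n", w, cost(w)); // w converges towards 2
    return 0;
}
```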
@chillydill47038 ай бұрын
Big props to tsoding for drawing out how things actually work. Excellent teaching!
@yc699bk Жыл бұрын
Andrej Karpathy has a great explanation on how backprop works
@flickering.embers Жыл бұрын
Thanks for your incredible videos, as always! Just one thing: the approach you took for deriving the backpropagation formula (defining intermediate cost functions) is very interesting, but I think there were some mistakes along the way. At 2:10:00 e_i is left out of the differentiation on the grounds that it is a constant. But according to the definition of e_i, its value depends on the activation values of layer 2, which in turn depend on the weights and biases of the current layer 1, so it cannot be a constant. Since a_i^(1) - e_i = ∂C^(2)/∂a_i^(1), the correct way to differentiate a_i^(1) - e_i with respect to w^(1) would be to compute the second partial derivative ∂²C^(2)/∂w^(1)∂a_i^(1) (which immediately shows that the result of this computation cannot coincide with the one based on the chain rule, since the latter only consists of joining first partial derivatives together). If we carry on with the calculation, ∂C^(2)/∂a_i^(1) has a bunch of a_i^(2)'s multiplied together, so for each term we would have to plug in ∂a_i^(2)/∂w^(1) = a_i^(2) * (1 - a_i^(2)) * w^(2) * a_i^(1) * (1 - a_i^(1)) * x_i. I have not followed this line of thought through, but I suspect the result would be fairly complicated. Anyway, if my calculations are correct, the formula actually used in the code differs from the conventional one by a factor of two. So for a neural network with n layers, I would predict that the weights and biases of neurons in the first layer will be updated roughly 2^n times more rapidly than they would have been if the conventional formula had been used. I didn't have time to test it out, but it would be interesting to actually see how this variant formula performs compared to the conventional one.
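For reference, a sketch of the conventional chain-rule expression the comment compares against, in generic notation with a_i^{(l)} = σ(z_i^{(l)}) and z_i^{(l)} = Σ_k w_{ik}^{(l)} a_k^{(l-1)} + b_i^{(l)}; this is not lifted from the stream's grad.tex.

```latex
% Conventional backprop: derivative of the cost with respect to one weight
% of layer l, built purely out of first partial derivatives.
\frac{\partial C}{\partial w_{ik}^{(l)}}
  = \frac{\partial C}{\partial a_i^{(l)}}\,
    \sigma'\!\left(z_i^{(l)}\right)\,
    a_k^{(l-1)}
```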
@Bolvarsdad Жыл бұрын
As a first year in applied AI, I can't wait until the day when I can read what the fuck you just said and be like "Oh yeah that's entirely sensible". I'm taking Calc 2 and Calc 3 next year as elective courses to better understand third year material such as gradients, hats off to you sir.
@TsodingDaily Жыл бұрын
Feel free to submit a patch: github.com/tsoding/nn.h/blob/63ae4a1ba3396215be67196b789593e4c56f9139/nn.h#L306-L360 And also correct the tex file: github.com/tsoding/ml-notes/blob/fd64cea05648e1480b6f9556a887237def0c6300/papers/grad.tex
@Сергей-ц2т2х Жыл бұрын
Definitely the best programming channel on YouTube. You have helped me understand a great many things and shown me a new way of programming. Thank you! P.S. I get the impression that you speak Russian.
@foxcirc Жыл бұрын
Your way of explaining things is great and I really learned a lot about derivatives (and LaTeX) in this video. Love your content and community ❤
@DGHere12 Жыл бұрын
I was never able to grasp this concept before; from this video I have learnt a lot in depth. Your videos are awesome.
@sharokhkeshawarz2122 Жыл бұрын
Bro, I just started learning machine learning a few days ago, this is gonna be helpful!
@RickeyBowers Жыл бұрын
Thank you for posting to the tubes - I missed this stream.
@anthonypace535410 ай бұрын
Great content! But there is one more step needed at 41:10: you could remove many superfluous multiplications if you move the constant 2 outside the sum, so 1/n becomes 2/n. I watched later videos where you address that people told you to remove the 2, but that doesn't seem necessary, as you will be modifying the outside constant with the learning rate anyways, and topped with a different activation function I think you are doing fine. Again, great content! Your personality is awesome!
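In symbols, a sketch with a generic weight w and MSE over n samples:

```latex
% The constant 2 produced by differentiating the square factors out of the sum.
C = \frac{1}{n}\sum_{i=1}^{n}\left(a_i - y_i\right)^2
\;\Rightarrow\;
\frac{\partial C}{\partial w}
  = \frac{1}{n}\sum_{i=1}^{n} 2\left(a_i - y_i\right)\frac{\partial a_i}{\partial w}
  = \frac{2}{n}\sum_{i=1}^{n} \left(a_i - y_i\right)\frac{\partial a_i}{\partial w}
```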
@jdobdob894711 ай бұрын
I love learning from raw code, especially in C. Awesome.
@CodePhiles Жыл бұрын
The journey through this was amazing, even when I got lost. Thank you for your effort and insights; looking forward to finishing watching and understanding the whole series.
@mohammedalhaddar2182 Жыл бұрын
It was the first time in my life I've seen a GCC optimization give that big of a performance jump! I can't believe I'm learning all of this for free.
@ecosta Жыл бұрын
My greatest regret from college was not investing enough time to learn LaTeX. The language is ugly, the tools are verbose, but the end result is so amazing. No word processor could describe math (or science in general) in such an elegant way while providing you with full control. And I feel really sorry for whoever has to use the "Web 2.0" tools to write their college stuff nowadays.
@ffuuzzuu5943 Жыл бұрын
If you still wanna learn a language with the power of LaTeX, but without the verbosity and ugliness, have a look at Typst. It's newer, and obviously far less widely used, but it addresses many of LaTeX's problems, and its syntax is quite nice.
@623-x7b Жыл бұрын
You can copy and paste equations from Symbolab into math mode if that makes it easier.
@raphaelmorgan2307 Жыл бұрын
my spouse is a math major rn and they're using LaTeX! And for both of our stats classes we're using R and markdown lol
@eukaryote0 Жыл бұрын
Pretending to be a non-scientist: successfully failed.
@lkda01 Жыл бұрын
Wow, I realized that the usage of ++i instead of i++ simply makes both the engineer and the mathematician happy. Love it!
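A tiny sketch of why the choice is purely stylistic in this kind of loop (standard C, nothing from the stream's code): the for statement discards the value of the increment expression, so both forms iterate identically.

```c
#include <stdio.h>

int main(void)
{
    // The for loop throws away the value of the increment expression,
    // so ++i and i++ behave the same here; ++i simply never asks for
    // the old value in the first place.
    for (size_t i = 0; i < 3; ++i) printf("pre  %zu\n", i);
    for (size_t i = 0; i < 3; i++) printf("post %zu\n", i);
    return 0;
}
```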
@bakaneko39 Жыл бұрын
Very good video! Would love to see you re-implement this with BLAS!
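For anyone curious what that would roughly look like, a hedged sketch of a single-precision matrix product through the CBLAS interface (this is not nn.h's API; the link flag, e.g. -lopenblas, depends on which BLAS implementation is installed):

```c
// Row-major sgemm: C = alpha*A*B + beta*C, with A (M x K) and B (K x N).
#include <stdio.h>
#include <cblas.h>

int main(void)
{
    enum { M = 2, K = 3, N = 2 };
    float A[M*K] = {1, 2, 3,
                    4, 5, 6};
    float B[K*N] = {1, 0,
                    0, 1,
                    1, 1};
    float C[M*N] = {0};

    // C = 1.0*A*B + 0.0*C
    cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                M, N, K, 1.0f, A, K, B, N, 0.0f, C, N);

    for (int i = 0; i < M; ++i) {
        for (int j = 0; j < N; ++j) printf("%6.1f ", C[i*N + j]);
        printf("\n");
    }
    return 0;
}
```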
@blacklistnr1 Жыл бұрын
When you said "weights and biases" I was expecting an ad and was surprised you have a sponsor... Can back-prop de-ad my brain? :))
@FirstnameLastname-cw1my Жыл бұрын
Sir, your videos are really good. I am not into machine learning right now and still I am getting everything. Thanks a lot for the quality content... Have a great day :]
@joseppratmonell4969 Жыл бұрын
Question 1: in the backprop method, why do you not divide the activations (.as) by n too, as you do for the weights and biases? Question 2: should we divide by (n-j) instead of just n?
@ratchet1freak Жыл бұрын
1. Because 3b1b said so, and if you work them out you will find that each activation is actually an input to every neuron of the next layer, so it makes sense not to diminish their contribution.
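In symbols, a sketch of that accumulation (generic notation, not the names used in nn.h): the gradient with respect to an activation is a plain sum over the next layer's neurons, with no averaging.

```latex
% Each neuron j of layer l+1 receives a_k^{(l)}, so its contributions add up.
\frac{\partial C}{\partial a_k^{(l)}}
  = \sum_{j} \frac{\partial C}{\partial a_j^{(l+1)}}\,
    \sigma'\!\left(z_j^{(l+1)}\right)\,
    w_{jk}^{(l+1)}
```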
@dohyun0047 Жыл бұрын
Thanks for the awesome video and quality. Everything made more sense when I watched the 3b1b video you mentioned and had an ah-ha moment. Thank you so much~!!!!! Really appreciate it.
@AhmedRefaat532 Жыл бұрын
I've been following you since the Haskell days
@anon_y_mousse Жыл бұрын
The thing that stuck out most in the entire stream was something you said early on about everyone using cheat sheets. My thought on why teachers don't like them is that they're trying to teach you not only how to do it but to understand it, and if you don't need to refer to the cheat sheet to do it then presumably you understand how it works. In other words, the best cheating method is to memorize it. It just so happens to have the added benefit of making you faster too. As for the math, I've got an idea I'd like to try about reorganizing the graphs as it processes that *might* enable faster growth. "Might" being the keyword as it's probably a stupid idea. I really should get a GitHub account so that I could post things and share.
@galbalandroid Жыл бұрын
The problem is, most teachers don't teach in an intuitive way but rather just throw definitions at you, so it's your responsibility to find a way to understand it intuitively, which is why most just stick to cheat sheets.
@anon_y_mousse Жыл бұрын
@@galbalandroid Indeed, and there's nothing wrong with that, in my opinion, but it makes sense for teachers to want you to understand the material, whether *they* have any idea how to impart that knowledge or not. Also, if you understand the material you should have all of what's on the cheat sheet memorized and thus be faster at solving the equations.
@TsodingDaily Жыл бұрын
> if you understand the material you should have all of what's on the cheat sheet memorized No.
@stewartzayat7526 Жыл бұрын
But differentiating functions usually boils down to a lot of mechanical work for which you do not need any understanding of what differentiating is, but do need a lot of formulas and rules. You either have those memorized, which comes naturally if you differentiate enough functions, or you don't have them memorized, in which case you look at a cheat sheet and arrive at the same result. There is no benefit to having them memorized other than that you are faster. You don't need to memorize those formulas to understand what a derivative is.
@anon_y_mousse Жыл бұрын
@@stewartzayat7526 If you truly understand it then you can derive those formulae yourself, and in such cases you don't need the cheat sheet. Faster is always better, memorizing is the ultimate cheat.
@Anonymous-fr2op9 ай бұрын
3:30:18 idk why tf j+bits in the second assignment crashes my program, I'm using C++ btw. If anybody has any idea why, please explain. Cuz imo the index is perfectly in bounds as well, yet it just doesn't wanna work.
@joseppratmonell4969 Жыл бұрын
Since a_i^(0) = x_i and x_i is a constant, the partial derivative through a_i^(0) ends up being all constants, which is 0. That's why we do not compute it. And thanks to a_i^(n) being compared against y_i, which we know, we can start backpropagation from there. I think I understood something... at least 1% hahaha. The inner layers have to take into account the partial derivatives of the activations, since those, like the weights and biases, are things we do not know, and that's how the layers actually chain with each other, at least in my mind hahaha. And thanks for this awesome series!
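In symbols, a sketch of those two boundary conditions for a per-sample squared-error cost (generic notation with L as the last layer, not the variable names from the code):

```latex
% The inputs are constants, and the cost is defined directly on the last layer.
\frac{\partial a_i^{(0)}}{\partial w} = \frac{\partial x_i}{\partial w} = 0,
\qquad
C = \sum_i \left(a_i^{(L)} - y_i\right)^2
\;\Rightarrow\;
\frac{\partial C}{\partial a_i^{(L)}} = 2\left(a_i^{(L)} - y_i\right)
```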
@postmodernist1848 Жыл бұрын
Omg a long one! Just like the old days
@filipposcaramuzza2953 Жыл бұрын
Thank god, it's the first time I've heard someone else find the derivative notation as low-SNR as I do. In high school we used the good old f', and suddenly at uni those nerds started using d/dx f(x) and ruined our game.
@grawlixes Жыл бұрын
you are legit one of the best teachers I've ever watched. thanks!
@AlguienMas555 Жыл бұрын
I'm keeping this moment for myself: 32:50
@renefernandez360 Жыл бұрын
Before this I did know that matrix multiplication was an integral part of AI, but I never got to understand it the way you are explaining it. But this gets me thinking: isn't AI a heuristic algorithm with extra steps? You have a randomized initial solution that you want to optimize or drive to a local optimum, you need a function to know how good your solution is (objective or cost function), and you randomly make small changes towards a better solution until you reach a certain number of iterations, at which point you have either reached a local optimum or are still heading towards one.
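A minimal sketch of that "heuristic with extra steps" view on a toy one-parameter cost (purely illustrative, nothing here comes from nn.h): random local search keeps only improving moves, while gradient descent keeps the same skeleton but replaces the random perturbation with a step along the negative gradient.

```c
// Random local search: randomized start, a cost function, and small
// changes kept only when they improve the cost.
#include <stdio.h>
#include <stdlib.h>

static float cost(float w) { return (w - 3.0f)*(w - 3.0f); } // minimum at w = 3

int main(void)
{
    srand(42);
    float w = (float)rand()/(float)RAND_MAX;           // randomized initial solution
    for (int i = 0; i < 10000; ++i) {                   // fixed iteration budget
        float step = ((float)rand()/(float)RAND_MAX - 0.5f)*0.1f; // small random change
        if (cost(w + step) < cost(w)) w += step;        // keep only improving moves
    }
    printf("w = %f, cost = %f\n", w, cost(w));          // w ends up near 3
    return 0;
}
```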
@bayesianmonk6 ай бұрын
The law of total derivative is the one you are referring to.
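For reference, the law of total derivative in its usual form: if C depends on w both directly and through intermediate quantities a_1, ..., a_m that themselves depend on w, then

```latex
\frac{dC}{dw}
  = \frac{\partial C}{\partial w}
  + \sum_{j=1}^{m} \frac{\partial C}{\partial a_j}\,\frac{d a_j}{d w}
```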
@bhavik_1611 Жыл бұрын
CPU at a constant 90*+ and a 6.7 GiB porn folder 💀 my guy is up to something
@lucifer-5ybtn Жыл бұрын
Can anybody help me out with one thing? I haven't had the need for "high level math" till now in any of my projects. But I have been looking into AI/ML lately, along with deep learning, and thought that I should pick up math now. My math skills ARE HORRIBLE. It's because my basics aren't clear, as I couldn't care any less about the "education system". I have been thinking about starting math (and physics too) from scratch and doing a self-study kind of thing, in the sense that I wouldn't have to mug up a ton of unnecessary rules and then spit them out in some exam. My sole focus shall be on concepts which I can apply to "real world" stuff. Is there anyone here who can suggest a path for learning math and physics from scratch?
@atharvparlikar8765 Жыл бұрын
I'm no maths expert, but I understand it fairly well as a CS student who's been decent at it since school. I'm also working on a deep learning / autograd library written in Python, so I might be able to help you with neural nets and their ins and outs. As for maths I'm not very confident, but I'll help as much as I can. First off, be very specific about how much maths you know right now, and we'll see how we can go further.
@adama7752 Жыл бұрын
Calculus 101. Just integrals and derivatives.
@anon-fz2bo Жыл бұрын
Do some linear algebra & calculus, & discrete math for theory stuff (graph theory and such). I suck ass too, but I had to learn these things for my CS degree.
@mentalsquirt1423 Жыл бұрын
3blue1brown and statquests
@lucifer-5ybtn Жыл бұрын
@@atharvparlikar8765 No trig, no calculus; basic algebra, basic geometry. But I am good at building logic: I can easily understand what a (programming) problem is and how to come up with a solution. But I'd say assume that I'm a complete beginner who does not know anything more than the 4 basic operations and fractions.
@thedebapriyakar Жыл бұрын
pure legend
@hamzadlm6625 Жыл бұрын
You are a universal blessing upon our lives.
@xhivo97 Жыл бұрын
Since Odin supports matrices at a language level, would that make it a good fit for ML?
@xhivo97 Жыл бұрын
Never mind, I think it only supports matrices up to 16.
@freestyle85 Жыл бұрын
It would be nice to add separate activation functions like ReLU for the inner layers, because a sigmoid activation slows down learning. In practice, more than one hidden layer with sigmoid isn't really used, but with ReLU it can be.
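A minimal sketch of what such a drop-in could look like, as plain C helper functions not tied to nn.h's actual activation handling (compile with -lm for expf):

```c
#include <math.h>

// Sigmoid and its derivative expressed through the activation a = sigmoidf(x);
// the derivative a*(1-a) saturates towards 0 near both ends of the curve.
float sigmoidf(float x)         { return 1.0f / (1.0f + expf(-x)); }
float dsigmoidf_from_a(float a) { return a * (1.0f - a); }

// ReLU and its derivative: the gradient is 1 for positive inputs,
// so stacking hidden layers does not keep shrinking it the way sigmoids do.
float reluf(float x)            { return x > 0.0f ? x : 0.0f; }
float drelu_from_a(float a)     { return a > 0.0f ? 1.0f : 0.0f; }
```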
@JohnWasinger Жыл бұрын
28:50 You speak German too? What are you using to edit your LaTeX document? A Vim plugin perhaps?
@GegoXaren Жыл бұрын
He's using Emacs. He just runs the latexpdf command, and the PDF reader auto updates.
@MrOnePieceRuffy Жыл бұрын
double time stream, poggers
@michaelhaag3367 Жыл бұрын
man, I love you, hope everything is fine where you are❤
@MatthewPherigo Жыл бұрын
Is your screen capture lagging behind your cam and audio? Sometimes you seem to react to something on screen before it appears.
@RahulPawar-dl9wn Жыл бұрын
4:04:09 what is this "-O3" parameter?
@nibrobb Жыл бұрын
Optimization level. -O3 is the most aggressive optimization level; it unrolls for loops and uses lots of tricks to go faster. Read the man pages for detailed info.
@kuyajj68 Жыл бұрын
How to become smart like tsoding?
@GegoXaren Жыл бұрын
Time. Patience. Practice.
@SomeSubhuman Жыл бұрын
Also a couple extra IQ points
@johanngambolputty5351 Жыл бұрын
Oh no, please don't tell me you're calling the glorified chain rule the most important ML algorithm, it hurts my soul. Most important for neural nets perhaps, but obviously not super applicable to other ML techniques, especially if they're non-differentiable.
@TsodingDaily Жыл бұрын
Please, forgive me. That's what you have to do on YouTube to survive.
@johanngambolputty5351 Жыл бұрын
@@TsodingDaily Forgiven :) Also, I don't think the stochastic part of SGD is the "shuffling"; I think it's because you're sampling a subset of your dataset. It's an estimator of the full cost (and the gradient that you calculate is an estimator of the true gradient), which is like an average of costs across individual datapoints anyway. If you like, the stochastic, or random, part is the error from the true cost. Haven't overly looked into it, but you essentially just want an update rule that is robust to this error, I think.
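In symbols, a sketch with C_i denoting the cost on sample i and B a sampled mini-batch:

```latex
% The full gradient is an average over all n samples; a mini-batch estimates it.
\nabla C(w) = \frac{1}{n}\sum_{i=1}^{n} \nabla C_i(w)
\;\approx\;
\frac{1}{|B|}\sum_{i \in B} \nabla C_i(w),
\qquad B \subset \{1,\dots,n\}
```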
@silencedspec1711 Жыл бұрын
Just finished the last video and we have more!
@garygonsalves6956 Жыл бұрын
Thank you very much for the educational videos! Is the source code for each video available somewhere?
@JJacquesLoheac Жыл бұрын
Very interesting. Great job for a non-AI guy.
@__noob__coder__ Жыл бұрын
More machine learning stuff from now on. Please 😅
@gasun1274 Жыл бұрын
this couldn't have come at a better time lol
@AMith-lv2cv Жыл бұрын
Zozi got hacked?! wth, bitcoin support? Is this new on the channel or did I miss smth..
@UnrealCatDev Жыл бұрын
It's kinda off to name carry of, shouldn't it be carry on?
@antropod5 ай бұрын
It is black boxes all the way down
@anashussain3540 Жыл бұрын
Which IDE did you use?
@DuskyDaily Жыл бұрын
Emacs
@DuskyDaily Жыл бұрын
@Numerius because vim is the best editor of all time
@anashussain3540 Жыл бұрын
@@DuskyDaily Thank you, I like your typing, very attractive :)
@spectrespect Жыл бұрын
@Numerius Ed? Oh man, ed is hell.
@XELER53 Жыл бұрын
Khello Khello
@MuhammadJamil-ho6wl Жыл бұрын
He is not writing code, he just copy-pastes from his mind.
@edgaremmanuel319711 ай бұрын
I tried the framework on XOR with 3 inputs.
@edgaremmanuel319711 ай бұрын
It did not work; is there something I am missing?
@edgaremmanuel319711 ай бұрын
I trained the model using 70% of the data.
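A possible explanation, sketched with the full truth table in plain C (this is not nn.h's data format, just an illustration): 3-input XOR is the parity of the bits and has only 8 distinct samples, so training on 70% of them leaves the held-out rows undetermined by the training data, and the network has no reason to get them right.

```c
// Full truth table for 3-input XOR (parity of the three bits): only 8 rows.
// Holding out 30% of these means evaluating on rows the network never saw
// and cannot infer, regardless of the framework used.
float xor3[8][4] = {
    // x1, x2, x3, x1 ^ x2 ^ x3
    {0, 0, 0, 0},
    {0, 0, 1, 1},
    {0, 1, 0, 1},
    {0, 1, 1, 0},
    {1, 0, 0, 1},
    {1, 0, 1, 0},
    {1, 1, 0, 0},
    {1, 1, 1, 1},
};
```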
@lkda01 Жыл бұрын
lol im lost
@alonner-gaon6012 Жыл бұрын
I opened an issue on the GitHub repo which I think will interest you @Tsoding