Love your paper streams. Keep 'em coming. I watch them to start my day.
@michaeltraynor5893 26 days ago
This guy's energy stresses me out but like, in a way I find comforting? Very strange. Also love the shade thrown at Extropic (even though I have a soft spot for Gill as a Canadian). Thanks for introducing me to Normal, I didn't know about them.
@tairad65 1 month ago
My new fav channel
@wolpumba4099 1 month ago
Summary starts at 1:29:40
@wolpumba4099 1 month ago
Summary of "Thermodynamic Gradient Descent"

* *Challenge:* Second-order optimization methods like Natural Gradient Descent (NGD) offer faster convergence for AI training but are computationally expensive on digital computers.
* *Innovation:* Thermodynamic Natural Gradient Descent (TNGD) uses a hybrid system, combining a GPU with a specialized analog computer called a Stochastic Processing Unit (SPU).
* *SPU Magic:* The SPU leverages the physics of heat dissipation, implementing an Ornstein-Uhlenbeck process to efficiently approximate NGD at a computational cost similar to first-order methods (SGD, Adam).
* *Benefits:*
  * Faster convergence than SGD/Adam, particularly for large models [not shown yet?] and complex tasks.
  * Smooth interpolation between first and second-order optimization by controlling the SPU's evolution time.
  * Inherent momentum-like effect due to system delays further improves performance.
* *Proof-of-Concept:* TNGD demonstrates its superiority over Adam on MNIST classification and shows promising results on language model fine-tuning (distilled BERT), outperforming both pure NGD and Adam.
* *Looking Ahead:* TNGD represents an early step in thermodynamic computing for AI. Scaling up the technology, refining the implementation, and exploring its wider applicability are key next steps.

i used gemini 1.5 pro to summarize the transcript and paper
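For readers who want to see the Ornstein-Uhlenbeck idea from the summary in numbers, here is a minimal sketch, not the paper's implementation. It simulates dx = -(A x - b) dt + noise on a CPU with Euler-Maruyama steps. Everything named here is an assumption made for illustration: the function `ou_linear_solve`, the 2x2 matrix `A` standing in for a damped Fisher matrix, the vector `b` standing in for a gradient, and the choice to start the state at `b`. With zero evolution time the output is just the gradient (first-order direction); with a long evolution time it relaxes toward A^{-1} b (a natural-gradient-style direction).

```python
import math
import random


def ou_linear_solve(A, b, t_total, dt=0.01, noise_scale=0.0, seed=0):
    """Euler-Maruyama simulation of dx = -(A x - b) dt + noise_scale * dW.

    Assumption for illustration: the state starts at x = b, so t_total = 0
    returns the plain gradient and large t_total approaches A^{-1} b.
    """
    rng = random.Random(seed)
    n = len(b)
    x = list(b)  # t = 0: first-order direction
    steps = int(round(t_total / dt))
    for _ in range(steps):
        # Drift term A x - b pulls the state toward the solution of A x = b.
        drift = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(n)]
        x = [
            x[i] - drift[i] * dt
            + noise_scale * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            for i in range(n)
        ]
    return x


# Hypothetical toy problem (values chosen only for the example).
A = [[2.0, 0.5], [0.5, 1.0]]  # stand-in for a damped Fisher matrix
b = [1.0, 1.0]                # stand-in for a loss gradient

x_short = ou_linear_solve(A, b, t_total=0.0)   # no evolution: the gradient itself
x_long = ou_linear_solve(A, b, t_total=20.0)   # long evolution: close to A^{-1} b
print(x_short, x_long)
```

Setting `noise_scale` above zero adds the thermal noise term; the state then fluctuates around the same fixed point, so an average over time or over several runs would be needed to read off the solution.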
@badrraitabcas 1 month ago
@@wolpumba4099 the disclaimer is dope
@BlueBirdgg 28 days ago
Ty for your videos.
@PulsatingShadow 1 month ago
Thanks
@rickybloss8537 1 month ago
I believe the third order is jolt
@johnny5941 23 days ago
I searched Wikipedia: 4th is snap (jounce), 5th is crackle, 6th is pop. I am assuming OP knows 3rd is jerk.
@j.rumbleseed 29 days ago
Yep yep, and the big winner will be the one that replaces the FPGAs that are a workaround, and introduces the code in the thermodynamic actuation of the system. Soon it seems.
@TreeLuvBurdpu 29 days ago
It's important to understand that the government funding of the chip fabs always means that they will be responding to political incentives, political "likes", rather than actual market incentives from people with actual skin in the game, and it will be slower to respond to changes.
@MDNQ-ud1ty 27 days ago
Why the heck is everyone pronouncing ss - ß as sh? I have seen this in almost every CS video now ;/