The spelled-out intro to language modeling: building makemore

  774,758 views

Andrej Karpathy

Comments: 817
@AndrejKarpathy 2 years ago
Update: I added some suggested exercises to the description of the video. imo learning requires in-person tinkering and work, watching a video is not enough. If you complete the exercises please feel free to link your work here. (+Feel free to suggest other good exercises!)
@AndrejKarpathy 2 years ago
@@ibadrather oh. Please go to Discord then, linked in the description. Sorry :\
@ibadrather 2 years ago
I don't know if this is a good place for Q&A, but there is something I need to ask that I can't wrap my head around. I was training the trigram language model and the loss was lower than for the bigram language model, but the model was worse. I tried to generate a few names and realised I had made a huge error in data preparation. The question I have is: how good an indicator is the loss? Is loss the only thing that matters for model performance? I understand there are other metrics of model performance. I have actually faced something like this in my work. I am stabilizing video using an IMU sensor, and I am training a NN for camera pose estimation. Across different architectures, the lower-loss models do not necessarily perform better. When our team looks at the stabilized video, the model with the higher loss often generates a visually better stabilized video. I don't quite understand this. That's why I am asking how indicative the loss is of model performance. I don't expect you to answer this here, but perhaps you could talk about this in your future lectures or somewhere else.
@OwenKraweki 2 years ago
My loss for the trigram count model, using all data for training, was around 2.0931, and I was able to get close with the NN approach. I'm not sure if the resulting names were better, but I wasn't able to generate exactly the same names with the count and NN approaches anymore (even using the same generator). Also, I'm not sure how best to share/link my solution (I have the notebook on my local drive).
@sanesanyo 2 years ago
I built the trigram model by concatenating the one-hot encoded vectors for the first two letters and feeding them through a neuron; the rest is the same. I think that is a fine way to train a trigram model. Any views on that? I did attain a lower loss compared to the bigram model, although the results are not significantly better.
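A minimal sketch of the concatenated-one-hot trigram idea described above (the batch contents, seed, and shapes here are made up for illustration; assumes PyTorch):

```python
import torch
import torch.nn.functional as F

# Vocabulary: 26 letters plus the '.' boundary token.
V = 27
g = torch.Generator().manual_seed(2147483647)
# One linear layer: 2*V inputs (two concatenated one-hots) -> V logits.
W = torch.randn((2 * V, V), generator=g, requires_grad=True)

# A made-up batch of (context char 1, context char 2) -> next char examples.
xs = torch.tensor([[0, 5], [5, 13], [13, 13]])
ys = torch.tensor([13, 13, 1])

xenc = F.one_hot(xs, num_classes=V).float()  # shape (3, 2, 27)
xenc = xenc.view(xenc.shape[0], -1)          # shape (3, 54): the two one-hots concatenated
logits = xenc @ W                            # shape (3, 27)
loss = F.cross_entropy(logits, ys)           # average negative log-likelihood
loss.backward()                              # populates W.grad
```

The concatenation is what lets a single linear layer condition on both context characters at once.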
@stanislavnikolov7423 1 year ago
@@ibadrather Not an expert myself, but here’s how I would explain it: Coming up with a loss function is like coming up with a target you optimise for. Apparently your perception of how good a result is (your human brain loss function) differs from what you optimise your network toward. In that case you should come up with a better equation to match your gut feeling. Practical example. Let’s say you want to train a network that produces ice cream. Your loss function is the amount of sugar in the ice cream. The best network you train crushes the loss, but produces simple 100% sugar syrup. It does not have the texture and consistency of real ice cream. A different network may make great ice cream texturewise, but put less sugar in it, thus having worse loss. So, adjust your loss function to score for texture as well.
@lukeanglin263 10 months ago
Never in my life have I found an educator like this. This is free gold.
@vincentd2418 2 years ago
What a privilege to be learning from someone as accomplished as Andrej, all for free. The internet at its best🙏
@RalphDratman 1 year ago
Just what this is -- a privilege indeed! We don't even have to pay tuition, or travel to Stanford.
@barjeshbarjesh8215 1 year ago
I am not lucky; I am blessed!
@Anonymouspanda8 1 year ago
absolutely!!!
@kamikaze9271 1 year ago
So true!
@iNTERnazionaleNotizia589 1 year ago
Hopefully YouTube will be free FOREVER AND EVER, not like Medium or Towards Data Science...
@RalphDratman 1 year ago
I cannot imagine a better -- or kinder -- teacher. He is feeding his audience knowledge and understanding, in small delicious bites, without worrying about their level of prior knowledge. And he is smiling irrepressibly all the time! Such a good person.
@minjunesong6667 2 years ago
I haven't commented on a YouTube video since 2017. But I have to, on the slim chance that you actually read this comment, Andrej! Please keep doing what you are doing! You are an absolute gem of an educator, and the millions of minds you are enlightening with each video will do great things that will compound and make the world a better place.
@AndrejKarpathy 2 years ago
reminded of kzbin.info/www/bejne/eGmmZqagn82mqdE :D
@pissedmeenoughtocomment 2 years ago
@@AndrejKarpathy You sure made the YT comments section a better place lol. Excellent videos, please keep them coming, or shall I say, make more! Thank you!!
@NewGirlinCalgary 1 year ago
@@AndrejKarpathy 🤣🤣🤣
@khalilsabri7978 1 year ago
Thanks for writing this comment for all of us! Please keep up these videos; as Minjune said, you're a gem of an educator!
@PatrickHoodDaniel 1 year ago
@@AndrejKarpathy So lo no mo. Classic!
@clray123 1 year ago
The reason why this is such excellent teaching is that it's constructed bottom-up. It builds more abstract concepts out of more concrete ones; generalizations follow concrete examples. At no point is there a situation in which the learner has to "just assume" something that will "become clear later on" (in most instances when a teacher says that, it doesn't; it just causes people to desperately try to guess the required knowledge on their own to fill in the gaps, distracting from everything that follows and producing mental discomfort). The bottom-up approach produces pleasure from a series of little "a-ha" and "I agree" moments and a general trust in the teacher. I contrast this with the much worse fastai courses, in which there are lots of gaps and hand-waving because of their top-down approach.
@deyjishnu 1 year ago
This is exactly my experience as well. Well said.
@MihikChaudhari 6 months ago
So, in your opinion, would taking the fastai course after this series be a better way to approach it? Or are they overlapping/similar in terms of topics?
@clray123 6 months ago
@@MihikChaudhari I don't think you need a fastai course, you can just follow Andrej's series, maybe also watch some lectures on neural nets (he also has some) for the theoretical underpinning.
@dx5nmnv 2 years ago
You're literally the best, man. These lessons are brilliant, hope you keep doing them. Thank u so much
@talhakaraca 2 years ago
Seeing him back in education is great. Hope to see some computer vision lessons 👍👌
@HarishNarayanan 2 years ago
@@talhakaraca If you search on this very site you will find truly clear lessons on computer vision from Andrej (from like 2016 or so)!
@talhakaraca 2 years ago
@@HarishNarayanan thanks a lot. i found it 🙏
@HarishNarayanan 2 years ago
@@talhakaraca You are very welcome. It was that lecture series that got me first informed and interested in deep learning.
@jakobpcoder 2 years ago
Absolutely insane levels of detail you are going into. This series is invaluable for beginners in the field, as well as for people like me who are building their own models all the time but want to go back to basics from time to time to avoid getting stuck in wrong assumptions learned from fast success with Keras :D I really hope you will continue this series for quite a while! Thanks a lot, AI Grand Master Andrej!
@amanjha9759 2 years ago
The scale of impact these lectures will have is going to be enormous. Please keep doing them and thanks a lot Andrej.
@samot1808 2 years ago
I am pretty confident that the impact of his CS231n course is bigger than even his work at Tesla. I know too many people working in machine learning who were introduced to the field by CS231n. It changed my life. Makes you wonder if he should just spend all his efforts on teaching. The impact is truly exponential.
@XorAlex 2 years ago
Too many people are working on AI performance and not enough people are working on AI alignment. If this trend continues, the impact might be enormously negative.
@samot1808 2 years ago
@@XorAlex please explain
@izybit 2 years ago
@@samot1808 AI alignment and the AI control problem are aspects of how to build AI systems such that they will aid rather than harm their creators. It basically means that we are like kids playing with plutonium: it won't take much for someone to turn it into a bomb (on purpose or by mistake) and make everyone's life a living hell. All that leads to a need for more regulation and oversight of the really advanced AI models, because otherwise we may end up with AI generators that can take a photo of you and create a video showing you killing babies, or worse, an AI that self-replicates and takes over entire systems, leading to collapsed economies and countries (or maybe even something like the Terminator).
@ShadowD2C 5 months ago
@@samot1808 Hi, is there a way I can find his CS231n course online? I'm all the way in Africa.
@lukao7969 3 months ago
I've been learning ML for a year now. I've read dozens of books, watched courses, and practiced myself. It always feels like I have 10x more theoretical knowledge than I use in practice. When I have to build something like this, I stare at the screen and can't decide what/where/how to use it. Andrej's courses are the ones that glue everything together. They show that it is not some abstract math you must memorize, but math built logically on top of itself. Watch how well he explains maximum likelihood and why it is used as a loss function. Brilliant. Thanks a lot, Andrej. I truly feel you enjoy and love to teach!
@ahmeterdonmez9195 3 months ago
I definitely agree with you. As a software developer, I noticed this too. Great professors from the best universities in the world spend hours and hours explaining theory and mathematics, drawing and writing, but they don't write even a single line of code. If I had the chance, I would shout at them: "Sir, am I supposed to build, say, speech recognition from the strange graphs you have drawn over these 30 hours?" Quickly examine even the worst software development course and you'll see that the instructor writes code on screen for at least 70% of the course. No one should tell me that they studied ML and DL at some great university. We see how they educate 😁😁😁
@ax5344 2 years ago
OMG, I feel soooo grateful for the internet! I would never have met a teacher this clear and suited to my needs in real life. I have watched the famous Stanford courses before; they have set a standard in ML courses. It is always the Stanford courses and then the rest. Likewise, this course is setting a new standard for hands-on courses. I'm only half an hour into the video, and I'm already amazed by the sensitivity, clarity and organization of the course. Many, many thanks for your generosity in stepping out and sharing your knowledge with numerous strangers in the world. Much indebted! Thank you!
@krzysztofwos1856 2 years ago
Andrej, your videos are the clearest explanations of these topics I have ever seen. My hat is off to you. I wish you had taught my ML and NLP classes in college. There's a huge difference between your ground-up, code-first approach and the usual dry, academic presentation of these topics. It also demonstrates the power of YouTube as an educational tool. Thank you for your efforts!
@myfolder4561 1 year ago
Thank you Andrej. I can't stress enough how much I have benefited from and felt inspired by this series. I'm a 40-year-old father with a young kid. Work and being a parent have consumed most of my time - I have always wanted to learn ML/neural networks from the ground up, but a lot of the material out there is just thick, dense and full of jargon. Coming from a math and actuarial background, I had expected to be able to pick up this knowledge without too much stumbling, but not until your videos did I finally feel so strongly interested in and motivated by this subject. It's really fun learning from you and coding along with you - I leave your lectures each time more energized than when I started. You're such a great educator, as many have said.
@李文强-q6p 6 months ago
You are the best teacher I've seen in my life. Thank you so much for making these videos and materials available for free. They demonstrate so clearly how all this stuff actually works.
@saintsfan8119 2 years ago
Lex guided me here. I loved your micrograd tutorial. It brought back my A-level calculus and reminded me of my Python skills from years back - all whilst teaching me the basics of neural networks. This tutorial now puts things into practice with a real-world example. Please do more of these, as you're sure to get more people into the world of AI and ML. Python is such a powerful language for manipulating data, and you explain it really well by building things up from a basic foundation into something that ends up being fairly complex.
@sandipandas4420 14 days ago
Pure fundamentals... How the frequentist and probabilistic views are connected and presented in such an elegant manner for building a mini-LLM, with actual code in practice, is simply awesome. Thank you so much @AndrejKarpathy!
@MattRosinski 2 years ago
I love how you make the connections between the counting and gradient-based approaches. Seeing that the predictions from the gradient descent method were identical to the predictions from the statistical probabilities of the counts was, for me, a big aha moment. Thank you so much for these videos Andrej. Seeing how you build things from the ground up to transformers will be fascinating!
@dansplain2393 1 year ago
I'd literally never heard the "logits are counts, softmax turns them into probs" way of thinking before. Worth the ticket price alone!
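The chain this comment refers to can be checked in a few lines (a toy example with made-up logits; assumes PyTorch):

```python
import torch

logits = torch.tensor([1.0, 2.0, 3.0])  # raw network outputs, interpreted as log-counts
counts = logits.exp()                   # exponentiate: everything becomes a positive "count"
probs = counts / counts.sum()           # normalize counts into probabilities

# Exp-then-normalize is exactly softmax:
same = torch.allclose(probs, torch.softmax(logits, dim=0))
```

The two views agree for any logits, since softmax is defined as exactly this exp-and-normalize operation.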
@BT-te9vx 1 year ago
10 mins into the video, I'm amazed, smiling and feeling as if I've cracked the code to life itself (with the dictionary of bigram counts). Of course, I can't build anything with the level of knowledge I currently have, but I can certainly appreciate how it works much better now. I always knew that things are predicted based on their occurrence in data, but somehow seeing those counts (e.g., for the first 10 words, `('a', ''): 7`) makes it so glaringly obvious, in a way no amount of imagination could have done for me. You are a scientist, researcher, high-paid exec, innovator, but more than anything, you are the best teacher, one who can elucidate complex things in simple terms so that they all make sense and seem obvious. And that requires not just mastery but a passion for the subject.
@vil9386 1 year ago
What a sense of "knowledge satisfaction" I have after watching your video and working out the details as taught. THANK YOU Andrej.
@mrdbourke 2 years ago
Another weekend watch! Epic to see these videos coming out! Thank you for all your efforts Andrej!
@Consural 2 years ago
A teacher that explains complex concepts both clearly and accurately. I must be dreaming. Thank you Mr. Karpathy.
@cassie8324 2 years ago
No one on YouTube is producing such granular explanations of neural network operations. You have an incredible talent for teaching! Please keep producing this series; it is so refreshing to get such clear, first-principles content on something I'm passionate about from someone with a towering knowledge of the subject.
@palaniappanviswanathan6827 1 month ago
You are an embodiment of humility, selflessness and kindness. For a person of your caliber, having led the Tesla Autopilot program as a director and being one of the founders of OpenAI, to come and teach the world for free is unheard of. You are a true inspiration for the whole world. I am truly blessed to learn from you. May GOD give you and your family a long, healthy, happy, prosperous and peaceful life.
@tusharkalyani4343 1 year ago
This video is the goldmine. It's so intuitive and easy to understand. Even my grad classes could not squeeze this information over a semester-long course. Hats off and it's a privilege to be learning from the accomplished AI master and the best. Thank you for the efforts Andrej :).
@akshaygulabrao4423 1 year ago
I love that he understood what a normal person wouldn’t understand and explained those parts.
@hlinc2 2 years ago
The clarity from this video of all the fundamental concepts and how they connect blew my mind. Thank you!
@bluecup25 1 year ago
I love how you explain every step and function so your tutorial is accessible for non-python programmers as well. Thank you.
@mahdi_sc 1 year ago
The video series featured on your channel undoubtedly stands as the most consequential and intuitive material I have encountered. The depth of knowledge gained from these instructional materials is significant, and the way you've presented complex topics with such clarity is commendable. I find myself consistently recommending them to both friends and colleagues, as I truly believe the value they offer to any learner is unparalleled. The gratitude I feel for the work you've put into these videos is immense, as the concepts I've absorbed from your content have undoubtedly expanded my understanding and competence. This invaluable contribution to the field has benefited me tremendously, and I am certain it has equally enriched others' learning experiences.
@realquincyhill 2 years ago
Your intro to neural networks video is amazing, especially how you focused on the raw mathematical fundamentals rather than just implementation. I can tell this is going to be another banger.
@taylorius 2 years ago
I think minimal, simple-as-possible code implementations, talked through, are just about the best possible way to learn new concepts. All power to you Andrej, and long live these videos.
@steveseeger 2 years ago
Andrej thanks for doing this. You can have a larger impact bringing ML to the masses and directly inspiring a new generation of engineers and developers than you could have managing work at Tesla.
@darielsurf 2 years ago
Hi Andrej, I heard two days ago (on a Lex Fridman podcast) that you were thinking of pursuing something related to education. I was surprised and very excited, hoping that it was true and would be accessible. Today, I ran into your YouTube channel and I couldn't be happier. Thanks a lot for doing this and for sharing your valuable knowledge! The lectures are incredibly detailed and interesting. It's also very nice to see how much you enjoy talking about these topics; that's very inspiring. Again, thank you!
@a9raag 1 year ago
This is one of the best educational series I have stumbled upon on YT in years! Thank you so much Andrej
@alternative1967 9 months ago
You're lifting the lid on the black box, and it feels like I'm sitting on a perceptron watching the algos make their changes, forward and back. It has provided such a deep understanding of the topics in the video. I have recommended it to my cohort of AI students, of which I am one, as supplementary learning. But to be honest, this is the way it should be taught. Excellent job, Andrej.
@rachadlakis1 4 months ago
This is a great overview of your work! It's impressive how you break down complex topics into manageable parts. The resources you've shared will surely help many people learn and practice. Looking forward to seeing more videos!
@gabrielmoreiraassuncaogass8044 1 year ago
Andrej, I'm from Brazil and love ML and coding. I have tried several different classes with various teachers, but yours was by far the best. Simplicity with quality. Congratulations! I loved the class! Looking forward to taking the next ones. The best!
@Stravideo 2 years ago
What a pleasure to watch! love the fact there is no shortcut, even for what may seem easy. Everything is well explained and easy to follow. It is very nice to show us the little things to watch for.
@niclaswustenbecker8902 2 years ago
I love your clear and practical way of explaining stuff, the code is so helpful in understanding the concepts. Thanks a lot Andrej!
@talis1063 1 year ago
You're such a good teacher. Nice and steady pace of gradual buildup to get to the end result. Very aware of points where student might get lost. Also respectful of viewers time, always on topic. Even if I paid for this, I wouldn't expect this quality, can't believe I get to watch it for free. Thank you.
@felipe_marra 3 months ago
It is truly amazing how clearly you can explain things and how enjoyable your lesson is
@prashantjoshi5763 1 year ago
36:20 The way he showed us how an untrained model and a trained model differ in their output, oh man... I really don't comment at all, but this has to be said: you are the best in this field. Really looking forward to learning more from you, sir.
@pneptun 5 months ago
pure gold content!!! this is how it should be taught! too many tutorials just jump into the "practical" examples without explaining the underlying fundamentals. great lesson, thank you for posting this! 🙏
@RaviAnnaswamy 2 years ago
Mastery is the ability to stay with fundamentals. Andrej derives the neural architecture FROM the counts-based model! So the log-counts, counts and probs are wrapped around the idea of how to get to probs similar to the counts model. Thus he explains why you exponentiate the log-counts, why you normalize, and then introduces the name for it: softmax! What a way to teach. The brilliant master stroke is when he shows that the samples from the neural model exactly match the samples from the counts model. Wow, I would not have guessed it, and many teachers might not have checked it. The connection between 'smoothing' and 'regularization' was also a nice touch. Teaching new concepts in terms of the known, so that there is always a way to think about new ideas rather than taking them as given. For instance, the expected optimal loss of the neural model is what one would see in the counts model. Thanks Andrej! By the way, one way to interpret the loss is perplexity. What the number 2.47 says is that every character on average has about 2 or 3 characters that are likely to follow it.
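The smoothing/regularization connection mentioned above can be seen on a tiny made-up count matrix (toy numbers, assumes PyTorch):

```python
import torch

# Toy 2x2 bigram count matrix (made-up numbers).
N = torch.tensor([[3., 0.],
                  [1., 2.]])

P_raw = N / N.sum(1, keepdim=True)                 # raw counts: P_raw[0, 1] is exactly 0
P_smooth = (N + 1) / (N + 1).sum(1, keepdim=True)  # add-one smoothing: no zero probabilities

# Smoothing pulls every row toward uniform, much like an L2 penalty on W
# pulls the logits toward 0 (and hence the softmax toward uniform) in the neural model.
```

Either way, the effect is the same: the model is discouraged from assigning extreme (especially zero) probabilities it cannot justify from limited data.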
@roudytouly 3 months ago
Thank you for the clarity, simplicity and quality. This is the best NN course I have ever seen
@SHUBHAMJHA-o3g 3 days ago
The explanation of the bug that can happen is just phenomenal! Damn, this is good stuff.
@anvayjain4100 10 months ago
The way he explained zip method even a beginner can understand. From very basic python to an entire language model. I can't thank this man enough for teaching us.
@AntLabJPN 1 year ago
Another absolutely fantastic tutorial. The detail is incredible and the explanations are so clear. For anyone watching this after me: I feel the micrograd tutorial is absolutely essential to watch first if you want to really understand things from the ground up. Here, for example, when Andrej runs the loss.backward() function, you'll know exactly what's happening, because you do it manually in the first lesson. I feel that the transition from micrograd (where everything is built from first principles) to makemore (relying on the power of PyTorch) leaves you with a surprisingly deep understanding of the fundamentals of language modeling. Really superb.
@adityay525125 2 years ago
I just want to say, thank you Dr Karpathy, the way you explain concepts is just brilliant, you are making me fall in love with neural nets all over again
@FelipeKuhne-us4cl 7 months ago
I’ve just finished the whole playlist and, for some reason, I started from the last one (GPT Tokenizer), went through the ‘makemore’ ones, and finally watched this one. Each one is better than the other. I couldn’t appreciate more what you’re doing for the community of ‘homeless scientists’ (those who want to become better at their crafts but are not associated with an academic institution) out there, Andrej. The way you teach says a lot about how you learn and how you think others should learn. I hope to find more videos like yours and more people like you. Cheers!! 👏👏👏
@nipunsandamal9882 1 year ago
Recently I've been facing tough times, lost focus on my goal, and regretted my decisions, and I'm now working on self-improvement. I watched this video by Andrej (thank you so much for sharing your valuable ideas and time), which inspired me. Leaving this comment as a reminder; I'll return when I reach my goal 😊
@biswaprakashmishra398 2 years ago
The density of information in these tutorials is hugeeeee.
@jeankunz5986 2 years ago
Andrej, the elegance and simplicity of your code is beautiful and an example of the right way to write Python.
@actualBIAS 1 month ago
This is just a wonderful series. Thank you so much! For everyone who wants to visualize this "network", here is the code. Run !pip install torchviz pillow in your notebook for this to work:

    from torchviz import make_dot
    from PIL import Image

    def visualize(tensor):
        params = {name: p for name, p in tensor.named_parameters()} if hasattr(tensor, 'named_parameters') else {}
        make_dot(tensor, params=params).render("computation_graph", format="png")
        img = Image.open("computation_graph.png")
        img.show()
@nikosfertakis1565 2 years ago
This is amazing, thank you so much for all the hard work you've been doing Andrej! For anyone as confused as I first was as to why both P[0].sum() and P[:, 1].sum() would add to 1, note that this happens only because of the way N was constructed in the first place. Each row of N sums to the same amount that the corresponding column of N does, so N[0].sum() == N[:, 0].sum(), N[1].sum() == N[:, 1].sum(), etc. This is because each row has all the bigrams starting with a letter and each corresponding column has all the bigrams ending with that same letter. Another way to think about it is that each letter occurrence will be counted twice: once as the starting letter (adding 1 to the sum of its row) and once as the ending letter (adding 1 to the sum of its column). So the sums end up being equal, which means a bug like dividing each column by the sum of each row would result in a matrix with columns summing to 1!
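The row-sum/column-sum property described above can be checked directly on a couple of toy words (plain Python, no PyTorch needed; the word list here is just for illustration):

```python
# Each character occurrence (including the '.' boundary token) appears exactly
# once as the left element of a bigram and once as the right element, so every
# row of the count matrix sums to the same total as the corresponding column.
words = ["emma", "olivia", "ava"]
chars = ['.'] + sorted(set("".join(words)))
stoi = {s: i for i, s in enumerate(chars)}
V = len(chars)

# Build the bigram count matrix N.
N = [[0] * V for _ in range(V)]
for w in words:
    cs = ['.'] + list(w) + ['.']
    for c1, c2 in zip(cs, cs[1:]):
        N[stoi[c1]][stoi[c2]] += 1

row_sums = [sum(row) for row in N]
col_sums = [sum(N[i][j] for i in range(V)) for j in range(V)]
```

Here `row_sums` and `col_sums` come out identical, and the grand total is the number of bigrams, i.e. `len(w) + 1` per word.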
@richardmthompson 1 year ago
There's so much packed in there. I spent the whole day on this and got to the 20 minute mark, haha. Great teacher, thank you for this logical and practical approach.
@jedi10101 1 year ago
New sub here. I started with "Let's build GPT: from scratch, in code, spelled out". I learned a lot, enjoyed coding along, and appreciated the thoughtful explanations. I'm hooked and will be watching the makemore series. Thank you very much, sir, for sharing your knowledge.
@edz8659 2 years ago
This is the first of your lessons I have watched and you really are one of the best teachers I've ever seen. Thank you for your efforts.
@groundingtiming 11 months ago
Andrej, you are simply amazing for doing this makemore series. I do not usually comment on videos - I have not commented in a very long time - but I just want to say thanks for your work. The AI world is probably crazy now, and it is videos like these that help even trained engineers get a proper understanding of how the models are made and the thinking behind them, not just implement and run, or spend hours debugging because of a bug like broadcasting...
@jtl_1 9 months ago
Besides getting the best explanation of LLMs from this great teacher, you get a free hands-on Python course, which is also better explained than lots of others. Thanks a lot Andrej!
@jimmy21584 1 year ago
I’m an old-school software developer, revising machine learning for the first time since my undergrad studies. Back in the day we called them Markov Chains instead of Bigram Models. Thanks for the fantastic refresher!
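As the comment above notes, a bigram model really is a first-order Markov chain: the next character depends only on the current one. A sampler over such a chain fits in a few lines (the transition probabilities below are made up for illustration):

```python
import random

# Made-up transition table; '.' serves as both the start and end state,
# just like the boundary token in the video.
probs = {
    '.': {'a': 0.6, 'b': 0.4},
    'a': {'b': 0.5, '.': 0.5},
    'b': {'a': 0.3, '.': 0.7},
}

def sample_name(rng):
    # Walk the chain from '.' until we return to '.', emitting characters.
    out, ch = [], '.'
    while True:
        options = probs[ch]
        ch = rng.choices(list(options), weights=list(options.values()))[0]
        if ch == '.':
            return ''.join(out)
        out.append(ch)

rng = random.Random(42)
names = [sample_name(rng) for _ in range(5)]
```

Replacing the hand-written table with row-normalized bigram counts gives exactly the sampler built in the video.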
@lorisdeluca610 2 years ago
Wow. Just WOW. Andrej, you are simply too good! Thank you for sharing such valuable content on YouTube; hands down the best around.
@ronaldlegere 1 year ago
There are so many fantastic nuggets in these videos even for those already with substantial pytorch experience!
@FireFly969 8 months ago
I love how you take neural networks and explain them to us not through PyTorch's already-built-in functions, but by showing how things work, and then giving us the PyTorch equivalent.
@noah8405 2 years ago
Taught me how to do the Rubik’s cube, now teaching me neural networks, truly an amazing teacher!
@sebastianbitsch 2 years ago
What an incredible resource - thank you Andrej. I especially enjoyed the intuitive explanation of regularization, what a smooth way of relating it to the simple count-matrix
@filipcvetic6606 1 year ago
Andrej’s way of explaining is exactly how I want things to be explained to me. It’s actually insane these high quality videos are free.
@jorgitozor 7 months ago
Really incredible how clearly you can explain a complex subject using only raw material. Thanks a lot for the valuable knowledge.
@DigiTechAnimation-xk1tp 10 months ago
The music in this video is perfect. It really sets the mood and creates a powerful emotional connection.
@mohitkumar-nt3sp 8 months ago
If anyone is confused at 1:17:00, think of it like this: the first row of x is multiplied element-wise with the corresponding entries of w's first column and summed, returning a scalar value that becomes (x@w)[0,0]. The operation continues for all the remaining 26 columns of w, filling in the values of (x@w)[0,j], where j is the column index into w. The process then repeats with the 2nd row of x, filling the values of (x@w)[1,j], and continues until we have exhausted x's rows (5 here), so the final row is (x@w)[4,j]. In general, x@w has shape (i, j), where i is the number of rows of x and j is the number of columns of w.
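The shape bookkeeping in the comment above can be confirmed with a tiny example (random toy tensors, assumes PyTorch):

```python
import torch

x = torch.randn(5, 27)   # 5 examples, 27 features each
W = torch.randn(27, 27)  # 27 inputs -> 27 outputs
out = x @ W              # shape (5, 27): rows of x against columns of W

# (x @ W)[i, j] is the dot product of row i of x with column j of W:
manual = (x[0] * W[:, 0]).sum()
```

So each output element is one row-times-column dot product, which is why the inner dimensions (27 and 27) must match.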
@stephanewulc807 1 year ago
Brilliant, simple, complete, accurate; I have only compliments. Thank you very much for one of the best classes I have had in my life!
@log_of_1 1 year ago
Thank you for taking the time out of your busy days to teach others. This is a very kind act.
@noobcaekk 2 years ago
Andrej is the MAN! Such a level of detail and explanation that I've yet to find anywhere. Thanks for these incredible videos! I am SO close to being done with this. Took a long time to get to the end (I used to do programming as a hobby in C++, Visual Basic, mIRC (and other IRC platforms) but that was years ago) so getting back in the mix has been a process. I am getting a "ValueError: only one element tensors can be converted to Python scalars" on the ix variable at the very end. I've commented out the old methods and typed the code as shown at 1:56:08 and can't seem to figure out the issue. The first rendition of ix works just fine, and commenting that out made no difference to this last section. Aye aye aye, so close!!
@adamderose9468 2 years ago
num_samples has to be 1 for the multinomial fn
@noobcaekk 2 years ago
@@adamderose9468 thanks for the tip! I found a single typo that was breaking things and managed to fix it a little while back. at this point I don't remember what it was, but it wasn't exactly where the error was showing, and again was just a measly typo LOL. gotta love it
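For anyone hitting the same ValueError: `.item()` only works on a one-element tensor, so if you call `.item()` directly on the result of `torch.multinomial`, it needs `num_samples=1`. A minimal reproduction (probabilities are made up):

```python
import torch

g = torch.Generator().manual_seed(2147483647)
p = torch.tensor([0.6, 0.3, 0.1])

# OK: a single sample is a one-element tensor, so .item() succeeds.
ix = torch.multinomial(p, num_samples=1, replacement=True, generator=g).item()

# This line would raise "only one element tensors can be converted to Python scalars",
# because the result has two elements:
# torch.multinomial(p, num_samples=2, replacement=True, generator=g).item()
```

The same error can also appear from an unrelated typo that leaves a multi-element tensor where a scalar was expected, as the follow-up comment found.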
@parthvadera1
@parthvadera1 5 ай бұрын
I love you Andrej! Learning from this series has been therapeutic for me. I’m glad you have decided to be full time educator. All the best with Eureka AI! It’s gonna make a positive impact on a lot of our lives. You da best!!❤
@wonderousw33
@wonderousw33 4 ай бұрын
It's most inspiring that there are always such genuine people in the world.
@mlock1000
@mlock1000 11 ай бұрын
Doing this again to really solidify the fundamentals, and GitHub Copilot is hilarious. It's seen this code so many times that if it's enabled you can't actually type everything out for yourself! It all comes rolling out (pretty much perfect, tweaked to my style and ready to run) after the first character or two. Amazing times. Got to say, whoa. This is so good. So sick of fumbling about with tensors, and this is a masterclass for sure. Thank you thank you thank you.
@yuanhu6031
@yuanhu6031 11 ай бұрын
I absolutely love this entire series, high quality content and very educational. Thanks a lot for doing this for the good of the general public.
@naveennarsaglla9195
@naveennarsaglla9195 4 ай бұрын
Thank you so much. Your videos are artistic, using math as your palette to create these beautiful outcomes. They are mesmerizing. Really appreciate you making these videos; you are truly making the world a better place.
@muhannadobeidat
@muhannadobeidat Жыл бұрын
Amazing delivery as always. The fact that he spent time explaining broadcasting rules and some of the quirks of keepdim shows how knowledgeable he is, and that he knows most people struggle with little things like that before they can get to what they actually need to do.
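The keepdim pitfall mentioned here is worth seeing concretely; a small sketch (a 3x3 matrix instead of the video's 27x27, just for illustration):

```python
import torch

N = torch.arange(1, 10, dtype=torch.float32).view(3, 3)

# keepdim=True keeps the summed dimension as size 1, so broadcasting
# divides each row by its own row sum (the intended normalization)
P_good = N / N.sum(1, keepdim=True)   # (3,3) / (3,1)
assert torch.allclose(P_good.sum(1), torch.ones(3))

# keepdim=False gives shape (3,), which broadcasts as (1,3) and
# silently divides column-wise instead -- the bug warned about in the video
P_bad = N / N.sum(1)                  # (3,3) / (3,)
assert not torch.allclose(P_bad.sum(1), torch.ones(3))
```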
@adarshsonare9049
@adarshsonare9049 Жыл бұрын
I went through building micrograd 3-4 times; it took me a week to understand a good portion of it, and now I've started with this. I am really looking forward to going through this series. Thanks for doing this Andrej, you are amazing.
@JuanuHaedo
@JuanuHaedo Жыл бұрын
Please DON'T STOP doing this! The world is so lucky to have you sharing this knowledge!
@RodRigoGarciaCasado
@RodRigoGarciaCasado Жыл бұрын
Andrej teaches a very complex topic in a pedagogical and simple way, and also gives away many invaluable tips about programming, Python, Torch, and how to approach solving a problem. I have spent several years learning ML, almost literally from scratch (I am not an engineer, statistician, or programmer), and these lessons put many things in order in my head ("aha moments", as someone commented in the thread). I understood concepts and processes much better that before I could barely intuit. It's really like opening up AI and seeing what it looks like inside. I recommend watching the videos in order; before this one I watched the micrograd video and it was incredible to understand everything. Truly, a thousand thanks for this contribution, Andrej.
@imliuyifan
@imliuyifan Жыл бұрын
Note if you are following this in torch 2.0, the multinomial function might behave differently in getting the idx (3 instead of 13). Just downgrade to torch==1.13.1 if this bothers you.
@eriklinde
@eriklinde Жыл бұрын
Thank you! Was scratching my head a bit... Edit: Actually I still can't get it to reproduce 13 (getting 3)....
@RounakJain91
@RounakJain91 Жыл бұрын
Wow thank you. I've been getting a 10 on torch 2.1.0+cu118
@paullarkin2970
@paullarkin2970 Жыл бұрын
Same @RounakJain91, which is funny because when I left num_samples = 20 the first sample was 13, but 1 sample gives me 10...
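A small sketch of what's going on in this thread: within one torch version a seeded Generator is deterministic, but the sampling algorithm (and therefore the exact draws) can change between versions, and drawing 20 samples at once consumes the random stream differently than drawing 1 sample, so the first values need not agree. The distribution and seed below are illustrative only:

```python
import torch

# Within a single torch version, a seeded Generator is reproducible:
g1 = torch.Generator().manual_seed(2147483647)
g2 = torch.Generator().manual_seed(2147483647)
p = torch.ones(27) / 27.0
a = torch.multinomial(p, num_samples=20, replacement=True, generator=g1)
b = torch.multinomial(p, num_samples=20, replacement=True, generator=g2)
assert torch.equal(a, b)

# But a single draw is a different use of the stream than a batch of 20,
# so a[0] and c[0] may or may not match depending on algorithm/version:
g3 = torch.Generator().manual_seed(2147483647)
c = torch.multinomial(p, num_samples=1, replacement=True, generator=g3)
```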
@NoahSDprof
@NoahSDprof 6 ай бұрын
Just finished this video, I loved it. I have decent intuition with code (mainly C++ about 7 years ago) but limited Python development experience, so some of the data structures were beyond me at first, but I'm slowly assimilating! I really can't overstate how incredibly well made this video is for those who are eager to learn. Thank you, Andrej!
@yangchenyun
@yangchenyun Жыл бұрын
This lecture is well paced and introduces concepts one by one, with later complex ones built on top of previous ones.
@asafprivman
@asafprivman 2 жыл бұрын
Wow Andrej this is really good stuff! I have 7 years SWE exp at a FANG company and the way you explain your process is so nice to learn from! thank you for the great content!
@asafprivman
@asafprivman 2 жыл бұрын
I'm blown away! I've never thought I'd say this but I actually like ML for the first time, and I'm a simple SWE with little knowledge of ML and this is your first video I've seen. Really interesting and easy to follow! thank you for being such a good teacher!
@allurbase
@allurbase 10 ай бұрын
Thanks for taking the time to explain broadcasting, the rules, and such a gentle introduction to Torch
@benjaminlai5638
@benjaminlai5638 Жыл бұрын
Andrej, thank you for creating these videos. They are the perfect balance of theory and practical implementation.
@NarendraBME
@NarendraBME Жыл бұрын
Let me say it, THE best educational series. Sir, I don't have enough words to thank you.
@stefHin
@stefHin 4 ай бұрын
This video (and the whole tutorial series) is extremely good. Breaking down a complex topic into its components and explaining them in a structured way is definitely not easy and requires a lot of work. Thank you for doing this, I'm learning a lot. One thing I noticed (just to be clear, this is not really criticism, just something interesting I noticed): at 1:56:05 the results are not exactly equal - the third and fifth examples generated are different. The last few characters are equal again, since the models just use the previous character to determine the next, so when both models reach the same character again they are likely to continue the same way.
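The convergence the comment describes follows from the bigram property: the next character depends only on the current one, not on the history. A toy sketch (5 symbols and random probabilities, all values illustrative):

```python
import torch

# Toy bigram sampler: the next index depends only on the current index
# (a row of P), which is why two near-identical models that land on the
# same character tend to continue the same way afterwards.
g = torch.Generator().manual_seed(42)
P = torch.rand(5, 5, generator=g)
P = P / P.sum(1, keepdim=True)  # each row is a conditional distribution

ix = 0
out = []
for _ in range(10):
    ix = torch.multinomial(P[ix], num_samples=1, generator=g).item()
    out.append(ix)
```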
@felipeoyarce8288
@felipeoyarce8288 Жыл бұрын
I don't know why but these videos make me feel safe in a strange way. Andrej you're very kind and patient :)
@kindoblue
@kindoblue 2 жыл бұрын
Thank God you are pushing videos! Grateful 🤟
@yashdixit3609
@yashdixit3609 16 күн бұрын
Absolutely love these videos, Andrej! I wish I had stumbled upon these sooner, but glad I'm here now.
@Pythoncode-daily
@Pythoncode-daily Жыл бұрын
Thank you for the unique opportunity to learn how to write code from the most advanced developer, Andrej! An almost priceless and irreplaceable opportunity! Extremely useful and efficient!
@MihaiNicaMath
@MihaiNicaMath 2 жыл бұрын
The level of pedagogy is so so good here; I love that you start small and build up, and I particularly love that you pointed out common pitfalls as you went. I am actually teaching a course where I was going to have to explain broadcasting this term, but I think I am just going to link my students to this video instead. Really excellent stuff! One small suggestion is to consider using Desmos instead of WolframAlpha if you just want to show a simple function
@mahmoudabuzamel7038
@mahmoudabuzamel7038 Жыл бұрын
You leave me speechless with the way you explain things and simplify concepts!!!!
@LakhanPalRawat
@LakhanPalRawat Жыл бұрын
Like why didn't I stumble across this video before? HOWWW??? This is a gem! It is almost magical that @AndrejKarpathy knew how to touch that sweet spot where people with no knowledge of AI and people who know about AI would both take away tons of information from this video. Sheer magic.
@yuluqin6463
@yuluqin6463 Жыл бұрын
Andrej, I know millions of people have already said that, but you are amazing. Thank you.
@richardbutobar3920
@richardbutobar3920 27 күн бұрын
You solidified both my learning and my will to learn.
@TonyStark-cp3tj
@TonyStark-cp3tj Жыл бұрын
Got to know from your talk with Lex that you spend about 10 hrs on making the content. Thanks a ton for offering this to us!
@Mutual_Information
@Mutual_Information 2 жыл бұрын
This is how I'm spending my time off from Thanksgiving break. Watching this whole series 🍿