RL Course by David Silver - Lecture 1: Introduction to Reinforcement Learning

1,550,175 views

Google DeepMind


Comments: 330
@jordanburgess 8 years ago
Just finished lecture 10 and I've come back to write a review for anyone starting. *Excellent course*. Well paced, enough examples to provide a good intuition, and taught by someone who's leading the field in applying RL to games. Thank you David and Karolina for sharing these online.
@Gabahulk 8 years ago
I've finished both of them, and I'd say that this one has better and much more solid content, although the one from Udacity is much lighter and easier to follow, so it really depends on what you want :)
@adarshmcool 8 years ago
This course is more thorough; if you're looking to make a career in machine learning, you should put in the work and do this course.
@TheAdithya1991 8 years ago
Thanks for the review!
@devonk298 8 years ago
One of the best, if not the best, courses I've watched!
@saltcheese 8 years ago
thanks for the review
@zingg7203 8 years ago
0:01 Outline / Admin
1:10 About Reinforcement Learning
6:13 The Reinforcement Learning Problem
22:00 Inside an RL Agent
57:00 Problems within Reinforcement Learning
@차정민-b1z 8 years ago
Good job. Very thankful :)
@enochsit 7 years ago
thanks
@trdngy8230 7 years ago
You made the world much easier! Thanks!
@michaelc2406 6 years ago
Problems within Reinforcement Learning 1:15:53
@mairajamil001 3 years ago
Thank you for this.
@zhongchuxiong 2 years ago
1:10 Admin
6:13 About Reinforcement Learning
6:22 Sits at the intersection of many fields of science: solving the decision-making problem in those fields.
9:10 Branches of machine learning.
9:37 Characteristics of RL: no correct answer, delayed feedback, sequence matters, agent influences the environment.
12:30 Examples of RL
21:57 The Reinforcement Learning Problem
22:57 Reward
27:53 Sequential decision making. Action
29:36 Agent & environment. Observation
33:52 History & state: stream of actions, observations & rewards.
37:13 Environment state
40:35 Agent state
42:00 Information state (Markov state): contains all useful information from the history.
51:13 Fully observable environment
52:26 Partially observable environment
57:04 Inside an RL Agent
58:42 Policy
59:51 Value function: prediction of the expected future reward.
1:06:29 Model: transition model, reward model.
1:08:02 Maze example to explain these 3 key components.
1:10:53 Taxonomy of RL agents based on these 3 key components: policy-based, value-based, actor-critic (which combines policy & value function), model-free, model-based.
1:15:52 Problems within Reinforcement Learning
1:16:14 Learning vs. planning: partially known environment vs. fully known environment.
1:20:38 Exploration vs. exploitation.
1:24:25 Prediction vs. control.
1:26:42 Course overview
@passerby4278 4 years ago
What a wonderful time to be alive!! Thank god we have the opportunity to study a full module from one of the best unis in the world, taught by one of the leaders of the field.
@tylersnard 5 years ago
I love that David is one of the foremost minds in Reinforcement Learning, but he can explain it in ways that even a novice can understand.
@DEVRAJ-np2og 4 months ago
Hello, can you please suggest a roadmap for RL?
@nguyenduy-sb4ue 5 years ago
How lucky we are to have access to this kind of knowledge at the push of a button! Thank you to everyone at DeepMind for making this course public.
@BhuwanBhatta 4 years ago
I was going to say the same. Technology has really made our life easier and better in a lot of ways. But a lot of times we take it for granted.
@NganVu 4 years ago
1:10 Admin
6:13 About Reinforcement Learning
21:57 The Reinforcement Learning Problem
57:04 Inside an RL Agent
1:15:52 Problems within Reinforcement Learning
@mathavraj9662 4 years ago
bless u :)
@tga3532 7 years ago
The complete set of 10 lectures is brilliant. David's an excellent teacher. Highly recommended!
@socat9311 6 years ago
I am a simple man. I see a great course, I press like
@eyeofhorus1301 6 years ago
Just finished lecture 1 and can already tell this is going to be one of the absolute best courses 👌
@ShalabhBhatnagar-vn4he 4 years ago
Mr. Silver covers in 90 minutes what most books do not in 99 pages. Cheers and thanks!
@vipulsharma3846 5 years ago
I am taking a Deep Learning course rn but seriously the comments here are motivating me to get into this one right away.
@ethanlyon8824 7 years ago
Wow, this is incredible. I'm currently going through Udacity and this lecture series blows their material from GT out of the water. Excellent examples, great explanation of theory, just wow. This actually helped me understand RL. THANK YOU!!!!!
@JousefM 4 years ago
How do you find the RL course from Udacity? Thinking about doing it after the DL Nanodegree.
@pratikd5882 4 years ago
@@JousefM I agree, those explanations by GT professors were confusing and less clear; the entire DS nanodegree, which had ML, DL and RL, was painful to watch and understand.
@guupser 6 years ago
Thank you so much for repeating the questions each time.
@elichen 1 month ago
I'm really appreciating the intuitive style of this course, in contrast to the Stanford course.
@Abhi-wl5yt 2 years ago
I just finished the course, and the people in this comment section are not exaggerating. This is one of the best courses on Reinforcement learning. Thank you very much DeepMind, for making this free and available to everyone!
@DrTune 2 years ago
Excellent moment around 24:10 when David makes it crystal clear that there needs to be a metric to train by (better/worse) and that it's possible - and necessary - to try to come up with a scalar metric that roughly approximates success or failure in a field. When you train something to optimize for a metric, it's important to be clear up-front what that metric is.
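For reference, the scalar signal David is describing gets rolled up into a single objective, the return — the total discounted reward from time step t onwards (standard notation, as used later in the course and in Sutton & Barto):

$$G_t = R_{t+1} + \gamma R_{t+2} + \gamma^2 R_{t+3} + \cdots = \sum_{k=0}^{\infty} \gamma^k R_{t+k+1}, \qquad 0 \le \gamma \le 1.$$

The reward hypothesis mentioned in the lecture is precisely the claim that any goal can be expressed as maximisation of the expected value of such a cumulative scalar reward.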
@JustinArmstrong-u5w 1 year ago
David is awesome at explaining a complex topic! Great lecture. The examples really helped in understanding the concepts.
@yuwuxiong1165 4 years ago
Take swimming as an example: learning is the part where you jump straight into the water and learn to swim in order to survive; planning is the part where, before jumping into the water, you read books/instructions on how to swim (obviously sometimes planning helps, sometimes not, and sometimes it even counter-helps).
@tristanlouthrobins 10 months ago
This is one of the clearest and most illuminating introductions I've watched on RL and its practical applications. Really looking forward to the following instalments.
@yehu7944 7 years ago
Could you please turn on the auto-generated subtitles?
@주동욱-l9j 6 years ago
Plz..
@Zebra745 6 years ago
As a learner of reinforcement learning, you should become an agent and improve yourself by collecting rewards in this environment.
@nirajabcd 4 years ago
Just completed Coursera's Reinforcement Learning Specialization and this is a nice addition to reinforce the concept I am learning.
@ImtithalSaeed 6 years ago
I can say that I've found a treasure... really.
@Newascap 4 years ago
I actually prefer this 2015 class over the most recent 2019 one. Nothing wrong with the other lecturer, but David somehow makes the course flow more smoothly.
@kiuhnmmnhuik2627 7 years ago
@1:07:00. Instead of defining P_{ss'}^a and R_s^a, it's better to define p(s',r|s,a), which gives the joint probability of the new state and reward. The latter is the approach followed by the 2nd edition of Sutton&Barto's book.
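For readers comparing the two notations, the lecture's quantities are just the marginal and the expectation of that joint dynamics function (in Sutton & Barto's 2nd-edition indexing):

$$\mathcal{P}^a_{ss'} = \Pr[S_{t+1}=s' \mid S_t=s, A_t=a] = \sum_{r} p(s', r \mid s, a), \qquad \mathcal{R}^a_s = \mathbb{E}[R_{t+1} \mid S_t=s, A_t=a] = \sum_{s'} \sum_{r} r \, p(s', r \mid s, a).$$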
@dalcimar 5 years ago
Can you enable the automatic captioning to this content?
@hassan-ali- 8 years ago
lecture starts at 6:30
@dbdg8405 3 months ago
This is a superb course on so many levels. Thank you
@aam1819 11 months ago
Thank you for sharing your knowledge online. Enjoying your videos, and loving every minute of it.
@AndreiMuntean0 8 years ago
The lecturer is great!
@lauriehartley9808 4 years ago
I have never heard a punishment described as a negative reward at any point during my 71 orbits of the Sun. You can indeed learn something new every day.
@deviljin6217 1 year ago
the legend of all RL courses
@rohitsaka 4 years ago
For me, David Silver is a god ❤️ What a man! What an explanation. One of the greatest minds who changed the dynamics of RL in the past few years. Thanks, DeepMind, for uploading this valuable course for free 🤍
@mgonetwo 1 year ago
Rare opportunity to listen to Christian Bale after he is finished with dealing with criminals as Batman. On a serious note, overall great series of lectures! Thanks, prof. David Silver!
@wireghost897 1 year ago
It's really nice that he gives examples.
@vorushin 10 months ago
Thanks a lot for the great lectures! I enjoyed watching every one of them (even #7). This is a great complement to reading Sutton & Barto and the seminal papers in RL. I remember looking at the Atari paper in late 2013 and having a hard time understanding why everyone was going completely crazy about it. A few years later the trend was absolutely clear. Reinforcement learning is the key to pushing the performance of AI systems past the threshold where humans can serve as wise supervisors, to the limit where different kinds of intelligence help each other improve via self-play.
@mdoIsm771 1 year ago
I took this playlist as a reference for my thesis in "RL for green radio".
@asavu 2 years ago
David is awesome at explaining a complex topic!
@erichuang2009 4 years ago
Five days to train per game back then; now it's five minutes to complete training, according to recent papers. Evolving fast!
@Edin12n 5 years ago
That was brilliant. Really helping me to get my head around the subject. Thanks David
@43SunSon 9 months ago
I'm back again, watching the whole video again.
@linglingfan8138 3 years ago
This is really the best RL course I have seen!
@johntanchongmin 4 years ago
Really love this video series. Watching it for the fifth time:)
@TheAIEpiphany 3 years ago
His name should be David Gold or Platinum I dunno. Best intro to RL on YT, thank you!
@tianmingdu8022 7 years ago
The UCL lecturer is awesome. Thx for the excellent course.
@saranggawane4719 2 years ago
42:00 - 47:55 Information State / Markov State
57:13 RL Agent
@43SunSon 4 years ago
I have to admit, david silver is slightly smarter than me.
@zhichaochen7732 7 years ago
RL could be the killer app in ML. Nice lectures to bring people up to speed!
@iblaliftw 2 years ago
Thank you very much, I recently got a good grade in RL thanks to your great teaching skills!!
@Esaens 4 years ago
Superb David - you are one of the giants I am standing on to see a little further - thank you
@dhrumilbarot1431 7 years ago
Thank you for sharing. It kind of inspires me to always remember that I have to pass it on too.
@yuxinzhang9403 3 years ago
Any observation and reward could be wrapped up into an abstract data structure in an object for sorting.
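As a minimal sketch of that idea — the record type and field names below are hypothetical, along the lines of the transition tuples used in replay buffers rather than anything defined in the lecture:

    from dataclasses import dataclass
    from typing import Any

    @dataclass(frozen=True)
    class Transition:
        """One step of the agent's experience bundled into a single object."""
        observation: Any
        action: Any
        reward: float
        next_observation: Any

    # e.g. collecting experience into a simple buffer
    buffer = [Transition(observation=0, action=1, reward=0.5, next_observation=1)]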
@Delta19G 1 year ago
This is my first taste of deep mind
@AntrianiStylianou 2 years ago
Can anyone confirm whether this is still relevant in 2022? I would like to study RL. It seems there is a more recent series on this channel, but with a different professor.
@HazemAzim 3 years ago
just amazing and different than any intro to RL
@AhmedThabit99 5 years ago
If you could activate the subtitles on YouTube, that would be great. Thanks!
@sng5192 8 years ago
Thanks for a great lecture. I grasped the point of reinforcement learning!
@AlessandroOrlandi83 4 years ago
Amazing teacher, I wish I could participate in this course! I did a course on Coursera but it explained very complex things too quickly.
@pratikd5882 4 years ago
Are you referring to the RL specialization by Alberta university? If so, then how good was it on the programming/practical aspects?
@AlessandroOrlandi83 4 years ago
@@pratikd5882 Yes, I did that. The exercises were good, but I'm not an AI guy, just a simple programmer. I managed to do the exercises, but the explanations were very concise: in 15 minutes they cover what you get in an hour in these lectures, so it's very condensed. It's good that they have exercises, though. I don't think that after doing it I'm actually able to do much.
@satishrapol3650 2 years ago
Do you have any suggestions about which one to start with, the lecture series here or the RL specialization by Alberta University (on Coursera)? I need to apply RL to my own project work. By the way, I did the Machine Learning course by Andrew Ng and could follow the pace; it was good enough for me, and the programming exercises helped me more than I could have imagined. But I'm not sure the same would be true of the Coursera RL course. Can you guide me on this?
@alpsahin4340 5 years ago
Great lecture, great starting point. Helped me to understand the basics of Reinforcement Learning. Thanks for great content.
@Dynamyalo 4 months ago
Right now I am sitting in my pajamas in the comfort of my home, eating a peanut butter and jelly sandwich and I have the ability to watch an entire course about an advanced topic online for free. What a time to be alive
@sachinramsuran7372 5 years ago
Great lecture. The examples really helped in understanding the concepts.
@umountable 6 years ago
46:20 this also means that it doesn't matter how you got into this state, it will always mean the same.
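For reference, the Markov property stated on that slide can be written as

$$\Pr[S_{t+1} \mid S_t] = \Pr[S_{t+1} \mid S_1, \ldots, S_t],$$

i.e. the current state is a sufficient statistic of the history: once it is known, the particular path that led to it adds no further information.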
@rossheaton7383 6 years ago
Silver is a boss.
@ajibolashodipo8911 3 years ago
Silver is Gold!
@smilylife7515 3 years ago
Please add subtitles to make it more helpful for those of us from non-English-speaking countries.
@bennog8902 7 years ago
awesome course and awesome teacher
@bocao3491 5 years ago
Awesome! This is succinct and clarifies some concepts that I was confused about.
@dashingrahulable 7 years ago
On the slide "History and State" @ 34:34, does the order of Actions, Observations and Rewards matter? If yes, then why isn't the order Observations, Rewards and Actions? The reasoning being that the agent sees the observation first, assesses the reward, and then takes a particular action. Please clarify if the chain of thought went awry anywhere. Thanks.
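A minimal, purely illustrative sketch of the loop behind that slide (the toy environment and the random action choice below are my own, not from the lecture). In the lecture's indexing the agent executes A_t and then receives O_t and R_t, so the history interleaves triples A_t, O_t, R_t; the agent still only ever sees past observations and rewards before choosing its next action, so the two orderings in the question describe the same loop with shifted indices:

    import random

    class ToyEnv:
        """Toy two-state environment, purely for illustration."""
        def __init__(self):
            self.state = 0
        def step(self, action):
            reward = 1.0 if action == self.state else 0.0  # reward for "guessing" the hidden state
            self.state = random.randint(0, 1)
            return self.state, reward                      # observation O_t, reward R_t

    env, history = ToyEnv(), []
    for t in range(1, 6):
        action = random.randint(0, 1)      # A_t: chosen from everything seen so far (here: at random)
        obs, reward = env.step(action)     # the environment then emits O_t and R_t
        history += [action, obs, reward]   # H_t = A_1, O_1, R_1, ..., A_t, O_t, R_t
    print(history)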
@prashanthduvvuri7845 4 years ago
The future is independent of the past given the present - David Silver
@utsabshrestha277 4 years ago
Only if it has a Markov state.
@prashanthduvvuri7845 4 years ago
The above comment was meant to be in the context of your life. Your brain is a cumulation of all your prior experiences, and the choices/decisions you make are actions taken by your brain (which is a Markov state). So what I took from that statement was: "you need to forget your past and move on".
@lazini 5 years ago
Thanks very much. But I need English subtitles. Could you change the settings of these videos? :)
@life42theuniverse 2 years ago
The environment state S_t^e is Markov ... though it's unknowable.
@subrahmanyaswamyperuru2469 3 years ago
Watching at 0.75X speed helps to follow better.
@taherhabib3180 3 years ago
His 2021 "Reward is Enough" paper makes us agree with the Reward Hypothesis @ 24:18. :D
@mechanicalmonk2020 4 years ago
Lecture 1 has half a million views, 10 has 36k. I'm surprised it's even 36k
@CTPowerVideos 4 years ago
Is that David's Brompton in the corner I wonder?
@mehershrishtinigam5449 1 year ago
Important point at 1:00:22-1:00:30: gamma's value is less than 1.
@pp-1954 10 months ago
I wonder whether the Markov property is the same as Newtonian mechanics in physics, which is deterministic like Markov states. Newtonian mechanics says that if you know the momentum and acceleration of every particle in existence, you can calculate/know the future 100%.
@pp-1954 10 months ago
Revision: if the Markov state refers to the "instant" moment in time (delta t -> 0), I guess you cannot figure out velocity or momentum, so it is different from the Newtonian case.
@jy2883 5 years ago
Is it possible to add subtitles or autogenerated captions to these lecture videos?
@filippomiatto1289 7 years ago
Amazing video, a very well-designed and well-delivered lecture! I'm going to enjoy this course, good job! 👍
@michaelleventhal2705 4 years ago
Regrettably, the audio volume is a tad low ... I can't follow the content on my phone in a somewhat noisy environment. I think this is why many people have asked for captioning.
@abrahamyalley9973 1 year ago
Now starting... I want to know if there are any coding aspects in this course.
@lcswillems 7 years ago
A really good introduction course!! Thank you very much!!
@MimJim6784 3 years ago
Please enable the auto subtitle generator!
@_jiwi2674 3 years ago
At 30:45, isn't the agent getting the reward after taking the action? It's not taking the action based on the reward it receives.
@aidan9876 3 years ago
I found these psychologically useful. Are subtitles available? "The future is independent of the past, given the present."
@donamincorleone 8 years ago
Great video. Thanks. I really needed something like this :)
@razzlefrog 8 years ago
Only slide that threw me off a bit was the RL taxonomy one. There was some confusion with the redundant labeling, otherwise it was a great lecture!
@vballworldcom 5 years ago
Captions would really help here!
@cheungtyrone3615 4 years ago
In the slide about the information state, wouldn't it be better to say $$\Pr[H_{t+1}|S_t] = \Pr[H_{t+1}|S_1,\cdots,S_t]$$? Just saying that $S_{t+1}$ is independent of $S_{0:t-1}$ given $S_t$ doesn't tie $S$ to the reinforcement learning setup.
@nisargbhatt8828 7 years ago
I have a question; it would be great if someone could reply. The Markov property says that the next step depends only on the current step and not on the entire history. So the question is: suppose a helicopter is in a certain angular position with a certain velocity and angular velocity at some moment. Wouldn't it be better to have the history, because the helicopter may have been in the same angular position and velocity before and would then know better how to react, compared to knowing only the previous step? Regards, Nisarg
@ihebbibani7122 2 years ago
Actually, start from the beginning. It's not quite true that we use only the previous step, because the previous step also depends on its own previous step, which means the current step depends on two steps back, and so on. We say the current step depends only on the previous one because at each time step the agent accumulates information, so the information needed is already "here"; there is no need to look further back than the previous step, whose information hasn't yet been accumulated. Hope it's a bit clearer.
@xinyuzhou7398 13 days ago
I am new to RL and this course. Can anyone share the slides for this course? It seems the offered link has expired. Additionally, I was wondering whether this course is still appropriate to learn from in 2024, since it is 9 years old.
@MGO2012 7 years ago
Excellent explanation. Thank you.
@shivramshetty5502 7 years ago
Hi David, thank you very much for such a great video course. I would appreciate it if you could help clear up the following:
1. Is there a need to train by playing (hundreds of times), say, the game used as an example, before being ready? If you have a generalized program designed for reinforcement learning, is it ready to play a new game whose rules are different, or does it need to be trained again?
2. Is there a mathematical proof that intermediate rewards can be calculated from past experience — that there is no need to know the intermediate rewards, and that they can be derived, say, using DeepMind's Q function? Please correct me — thank you.
3. In the helicopter example, the current state depends on past states like velocity and acceleration — so is it not Markov?
4. Can reinforcement learning realistically be used to drive a car or run a robot, using a generalized program whose inputs are set up appropriately?
Thank you, Sam
@edmonddantes4705 1 year ago
1. Hundreds of times is absolutely nothing in ML or RL. In order to learn to play games with a rich state space like Go or chess, AlphaGo or AlphaZero can play billions of games. Also, by "generalised program" you mean architecture or algorithm, like algorithms based on Q-learning, concrete policy gradient methods, etc. If the rules are different, you can use the same algorithm if you reckon it is suitable, but you would have to train from scratch (unless there is a lot of similarity between the games and you can reuse some weights, as one can do with word embeddings in diverse NLP problems).
2. You are not being rigorous in the statement of your question, so it is very hard to understand exactly what you mean. It is absolutely necessary to measure intermediate rewards.
3. How is the helicopter state not Markov? The helicopter movement is simulated by running a controlled ODE, and a numerical solver for an ODE is Markov by definition. Of course it is Markov.
4. Self-driving cars are an example of that.
@abhijeetghodgaonkar 6 years ago
Excellent Indeed!
@deepschoolai 7 years ago
Err have you disabled captions on this video?
@eyeofhorus1301 6 years ago
O.o
@bergamobobson9649 6 years ago
Great lecture, but for non-native speakers my advice is to listen at a lower speed, though that gets a bit boring. He talks fast, and without subtitles it's difficult to follow, at least for me.
@ProfessionalTycoons 6 years ago
amazing introduction and very cool
@TriThom50 2 years ago
When he talks about Reinforcement Learning vs. Planning, he says that RL is model-free. But this is just describing the problem setting of RL, right, and not saying that there are no model-based algorithms?
@redberries8039 6 years ago
...another vote for auto-captions ...it's useful just to watch as an attention keeper
@florentinrieger5306 1 year ago
This is so good!