Reinforcement Learning: Crash Course AI #9

138,503 views

CrashCourse

A day ago

Reinforcement learning is particularly useful when we want to train AIs to have skills we don’t fully understand ourselves. Unlike some of the techniques we’ve discussed so far, reinforcement learning generally only looks at how an AI performs a task AFTER it has completed it. Figuring out when and how to reward the AI once it completes that task, called credit assignment, is one of the hardest parts of reinforcement learning. So today, we’re going to explore these ideas, introduce a ton of new terms like value, policy, agent, environment, actions, and states, and show you how we can use strategies like exploration and exploitation to train John Green Bot to find things more efficiently next time.
Crash Course AI is produced in association with PBS Digital Studios:
/ pbsdigitalstudios
Crash Course is on Patreon! You can support us directly by signing up at / crashcourse
Thanks to the following patrons for their generous monthly contributions that help keep Crash Course free for everyone forever:
Eric Prestemon, Sam Buck, Mark Brouwer, Indika Siriwardena, Avi Yashchin, Timothy J Kwist, Brian Thomas Gossett, Haixiang N/A Liu, Jonathan Zbikowski, Siobhan Sabino, Zach Van Stanley, Jennifer Killen, Nathan Catchings, Brandon Westmoreland, dorsey, Kenneth F Penttinen, Trevin Beattie, Erika & Alexa Saur, Justin Zingsheim, Jessica Wode, Tom Trval, Jason Saslow, Nathan Taylor, Khaled El Shalakany, SR Foxley, Sam Ferguson, Yasenia Cruz, Eric Koslow, Caleb Weeks, Tim Curwick, David Noe, Shawn Arnold, William McGraw, Andrei Krishkevich, Rachel Bright, Jirat, Ian Dundore
--
Want to find Crash Course elsewhere on the internet?
Facebook - / youtubecrashcourse
Twitter - / thecrashcourse
Tumblr - / thecrashcourse
Support Crash Course on Patreon: / crashcourse
CC Kids: / crashcoursekids
#CrashCourse #ArtificialIntelligence #MachineLearning
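
To make the terms in the description a little more concrete, here is a minimal sketch of tabular Q-learning with epsilon-greedy action selection. It is not code from the episode: Q-learning is simply one common reinforcement learning technique, and the five-cell corridor, the reward of 1 in the last cell, the hyperparameters, and the names `train` and `greedy_policy` are all assumptions made for illustration. The agent sits in a state (a cell), picks an action (left or right), and the table of values is what the policy is read from.

```python
import random

N_STATES = 5            # hypothetical corridor: cells 0..4
GOAL = N_STATES - 1     # the reward sits in the last cell
ACTIONS = (-1, +1)      # action 0 = step left, action 1 = step right


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn one value per (state, action) pair by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(200):  # step cap so every episode terminates
            if rng.random() < epsilon:
                # Exploration: try a random action.
                action = rng.randrange(len(ACTIONS))
            else:
                # Exploitation: take the best-known action, ties broken at random.
                best = max(q[state])
                action = rng.choice([i for i, v in enumerate(q[state]) if v == best])
            # Environment: move, but stay inside the corridor.
            nxt = min(max(state + ACTIONS[action], 0), GOAL)
            reward = 1.0 if nxt == GOAL else 0.0
            # Credit assignment: pull this value toward reward + discounted future value.
            target = reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if state == GOAL:
                break
    return q


def greedy_policy(q):
    """Read the policy off the table: the higher-valued action in each non-goal cell."""
    return ["right" if row[1] > row[0] else "left" for row in q[:GOAL]]
```

Calling `greedy_policy(train())` returns the learned direction for each non-goal cell. Raising `epsilon` makes the agent explore more; lowering it makes the agent exploit what it already knows.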

Comments: 74
@hjaiswal768 4 years ago
"A trade off between exploration and exploitation" - That's life
@peka2478 4 years ago
Well, no, (Conquistador) Life can be both. Usually is.
@shashanksams 4 years ago
Yay, this was actually better than most of the explanatory videos I have seen. Thanks for always providing us with informative content, Crash Course. Looking forward to more of these videos.
@TheCheck999 4 years ago
8:38 John Green bot from the past
@cesardachimp8172 4 years ago
Oh man, I didn't notice that
@Pravaification 4 years ago
*waves robo hand* Mr. Green Bot! Mr. Green Bot! What if I took TEN actions?
@thebrianpaige 4 years ago
Loving this series. Thank you so much! There's so much info in every episode; just fantastic.
@pedro_a_martins 4 years ago
I'm really liking this series of videos. Keep up the good work :)
@StevieRZ 4 years ago
Now, after watching multiple episodes in a row, I really want donuts :P Also really enjoying this series :)
@LearnandGrowKidsTV 4 years ago
Such a great video! We most definitely do whatever it takes to get more cookies 🍪 😉
@_Baleful a year ago
This video is fire. Insta-subbed!
@janyjj1969 4 years ago
I WOULD LOVEEEE IT IF CRASH COURSE HAD AN ACCOUNTING COURSE!!❤️️.
@Logan_Explores 4 years ago
Jany JJ try Khan Academy. I’m pretty sure they have an accounting section.
@janyjj1969 4 years ago
@@Logan_Explores Alright, thank you so much😻.
@randomalpaca 4 years ago
US or Canada or other?
@TheBlizzardfan7 4 years ago
Thanks for your awesome course. I got interested in machine learning and I am planning to study that for my M.A.
@geoffreywinn4031 4 years ago
Cool video!
@gitoshrisen7687 4 years ago
This reminds me of Pavlov from my psychology class.
@TreeNap 4 years ago
In the John Green Bot example, is the objective to find the shortest path or to get the most points? What would getting more points even do? I feel like in that case exploration is best, so that you can find the shortest path, exploiting only when racing another bot.
@rajcivils 4 years ago
Are you going to use OpenAI for RL and Keras when we come to deep reinforcement learning? When will this playlist be finished?
@chaneldelarosa9786 4 years ago
Thank you!
@jimmybutler4076 4 years ago
Yo as a black guy with dreadlocks who likes coding, it’s really cool to listen to another black guy with dreadlocks who likes coding
@jacquelyncuturic8072 a year ago
jabril got a new hat did u notice?
@cesardachimp8172 4 years ago
Can we do Crash Course Law? Like if you agree
@masternobody1896 4 years ago
Crash course doesn't make sense it makes jibril look bad
@nealkelly9757 4 years ago
How about crash course Current Events. It's so hard to keep up with all the developments in US politics and such
@Muhammad_Syakir 4 years ago
PBS can try inviting Meghan Markle and her team to do all the law educational stuff, she is very good at it. :)
@thethirdjegs 4 years ago
@@nealkelly9757 Crash Course Current Events would have an indefinite number of episodes.
@cesardachimp8172 4 years ago
@@masternobody1896 ??? what do you mean?
@krellend20 4 years ago
Jabril really has it out for bagels.
@cornellwaters9089 4 years ago
🏃 Thank You!
@MarkyMark1221 4 years ago
Can we get crash course geography
@shendosmani4509 4 years ago
5:12 open all no risk high reward
@billniko9310 4 years ago
Can we get a computation theory lesson, CC?
@pinkponyofprey1965 4 years ago
0:17 That cookie looked completely ... edible hahaha! What brand is that? :D Screw Reinforcement Learning .... I'm now officially hungry! [back from the kitchen] Reinforcement Learning is actually extremely interesting! :D
@billniko9310 4 years ago
Love you love you love you love ❤️
@anttibjorklund1869 4 years ago
Why would JohnGreenBot in that battery example only go in straight lines? Would it not be better to go in a diagonal path?
@EclecticFruit 4 years ago
Because that is how he was taught to see the room and navigate it. If you want diagonals, maybe we'd have to arrange the room in hexagons instead.
@nahhale9088 4 years ago
Can we get crash course music theory?
@lincolnpepper816 4 years ago
That's the main thing I've been wanting for years, and many people have asked for it. I remember that on one other comment asking for it, Crash Course replied that they've been thinking about it as well. Not sure what the chances are now, but that gives me hope.
@sarujanrupan4831 4 years ago
It's Jabril!!!
@void2509 4 years ago
5:11 I'll just take all three items
@ianrbuck 4 years ago
Not sure the kitchen metaphor works for me. Why is the bag more likely to contain donuts than the box? It sure looked like the kind of box that donuts come in to me.
@castro_hassler 4 years ago
Nice
@johnopalko5223 4 years ago
This fellow and his donut obsession. I don't know... 😊
@dvklaveren 4 years ago
Black & White and Black & White: Creature Isle used reinforcement learning. With enough training, the creature you commanded could learn incredibly complex routines, such as planting a sapling, watering the tree with the water miracle, then picking it up, throwing it into the resource center, and repeating. I really hope that we'll see more games exploring that kind of relationship with a computer character. Imagine a game where you're personally teaching a group of monsters how to hunt and then guiding their instincts by reinforcing or punishing a particular set of circumstances, until they conquer their world.
@kurtphilly 4 years ago
I don't agree with the bagel/donut choice example. Why choose the option of two bagels or donuts vs. the greater risk of more donuts (6) or a guaranteed single donut?
@amitzahirr 4 years ago
Is this related to Dijkstra's or A*?
@dkecskes2199 4 years ago
9:40 is my hometown. Ok AI, good reinforcement for me.
@slikh 4 years ago
Yeah, I recognized Boston St too :)
@AB-vw8wz 4 years ago
Markov decision process and Q learning, fcking tedious
@Graghma 4 years ago
Is there a better reason than reducing the total amount of stored data for why we only store a single value per square? Why not store 4 values per square, one for each direction you could go from the current spot? That way you could find and exploit the shortcut near the black hole that the current algorithm is too scared to find.
@TheSentinelAI 4 years ago
🤘🤘🤘
@senthu07 4 years ago
12th comment is mine...
@alexixeno4223 4 years ago
I feel there are rich people and people in places of power who need to watch this video... >.>
@JustaReadingguy 4 years ago
Outline what they should take away from the video and why.
@COOPSTOP 4 years ago
Agent? like..... Agent Smith???!!!
@argonair4838 4 years ago
OpenAI and AlphaGo
@hamzadbz1 4 years ago
Who drives a car looking sideways?
@user-zu6ts5fb6g 4 years ago
When jabrils is talking his mouth is moving. That is illegal.
@jimmyshrimbe9361 4 years ago
That was a robot playing Don't Wake Daddy.
@rintume8631 4 years ago
Like if AI beats slavery
@fabio2234 4 years ago
First second!
@masternobody1896 4 years ago
Knowledge, but man, I don't understand. Can you make it easier?
@thethirdjegs 4 years ago
This episode is hard because it's oversimplified
@liliabel3160 4 years ago
You're good, but you are going too fast
@clvrswine 4 years ago
Jabril? Jabril? Laughing too much to type. Hey, I'm Jabril. Unreal, these people.
@sefp 4 years ago
Rare to see a black guy talking about this subject but glad I did.