Markov Decision Processes - Computerphile

156,456 views

Computerphile

A year ago

Deterministic route finding isn't enough for the real world - Nick Hawes of the Oxford Robotics Institute takes us through some problems featuring probabilities.
Nick used an example from Mickael Randour: bit.ly/C_MickaelRandour
This video was previously called "Robot Decision Making"
This video was filmed and edited by Sean Riley.
Computer Science at the University of Nottingham: bit.ly/nottscomputer
Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com

Comments: 118
@Deathhead68
@Deathhead68 Жыл бұрын
This guy was my lecturer about 10 years ago. He was very down to earth and explained the concepts in a really friendly way. Glad to see he's still doing it.
@centcode
@centcode Жыл бұрын
We might have crossed paths at uni of bham
@Deathhead68
@Deathhead68 Жыл бұрын
@@centcode was there 2012-2015
@Mounta1ngoat
@Mounta1ngoat Жыл бұрын
Glad to see Nick here, he definitely provided some of the clearest and most interesting explanations throughout my degree. As well as setting us loose with a lot of Lego robots and watching chaos ensue.
@erazn9077
@erazn9077 Жыл бұрын
@@Mounta1ngoat lol that sounds great
@symonkanulah3809
@symonkanulah3809 Жыл бұрын
@@Deathhead68 it sounds great 👍
@CalvinHikes
@CalvinHikes Жыл бұрын
This channel makes me appreciate the human brain more. We do all that automatically with barely a moment's thought.
@Ceelvain
@Ceelvain Жыл бұрын
It also fails spectacularly from time to time. For instance, the so-called "sunk cost fallacy" might make you stay at the train station for much too long: you've already invested so much time into waiting for the train that you don't want that time to go to waste. The fallacy is that the time spent waiting is not an investment. It's a pure loss.
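In MDP terms the point is that only the expected time from the current state matters; the minutes already spent waiting drop out of the comparison. A toy check with invented numbers (none of these come from the video):

```python
# Toy sunk-cost check with invented numbers: the minutes already spent
# waiting never appear in the comparison -- only expected future time does.
p_train_soon = 0.6                 # assumed chance the train arrives within 10 minutes
ride = 30                          # assumed train ride time once it arrives
expected_keep_waiting = p_train_soon * (10 + ride) + (1 - p_train_soon) * (40 + ride)
expected_cycle_now = 45            # assumed time to give up and cycle from the station

print(expected_keep_waiting)  # 0.6*40 + 0.4*70 = 52.0 minutes
print(expected_cycle_now)     # 45 minutes -> cycling wins, no matter how long you've waited
```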
@raginald7mars408
@raginald7mars408 Жыл бұрын
which causes ALL the problems we create, and we get ever more creative
@GizmoMaltese
@GizmoMaltese Жыл бұрын
The key is we don't always make the best choice. For example, if you're choosing a path to work as in this example, you may not make the best choice but it doesn't matter.
@Ceelvain
@Ceelvain Жыл бұрын
@@real_mikkim and with all this computation, it still manages to fall for the most basic fallacies. I'm very much unimpressed.
@mateuszdziezok8631
@mateuszdziezok8631 Жыл бұрын
OMG as a Robotics student, I'm amazed how well explained that is. Love it
@tlxyxl8524
@tlxyxl8524 Жыл бұрын
Just took an RL course. The Bellman equation and Markovian assumptions are so familiar. Btw, for those who are interested, the algorithms to solve discrete MDPs (or model-based RL problems in general) are Value Iteration and Policy Iteration, which are both based on the Bellman equation.
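For anyone curious what that looks like in code, here is a minimal value-iteration sketch on a tiny commute-style MDP. The states, actions, probabilities and costs below are made up purely for illustration; they are not the numbers from the video.

```python
# Minimal value iteration on a tiny, invented commute MDP.
# transitions[state][action] -> list of (probability, next_state, cost_in_minutes)
transitions = {
    "home": {
        "drive": [(0.8, "work", 30), (0.2, "work", 70)],             # light vs heavy traffic
        "cycle": [(1.0, "work", 45)],
        "walk_to_station": [(1.0, "station", 10)],
    },
    "station": {
        "wait_for_train": [(0.9, "work", 35), (0.1, "station", 5)],  # train might not arrive
        "walk_home": [(1.0, "home", 10)],
    },
    "work": {},  # goal state, no outgoing actions
}

def value_iteration(transitions, tol=1e-9):
    """Return the minimum expected cost-to-goal V and a greedy policy."""
    V = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            if not actions:                 # terminal state keeps V = 0
                continue
            best = min(
                sum(p * (cost + V[s2]) for p, s2, cost in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    policy = {
        s: min(actions, key=lambda a: sum(p * (cost + V[s2]) for p, s2, cost in actions[a]))
        for s, actions in transitions.items() if actions
    }
    return V, policy

V, policy = value_iteration(transitions)
print(V)       # expected minutes to reach work from each state
print(policy)  # best action in each state
```

Policy Iteration alternates between evaluating a fixed policy and greedily improving it; both converge to the same optimal policy on a finite MDP like this one.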
@gasdive
@gasdive Жыл бұрын
I made these decisions for my real commute. The train was fastest, but occasionally much longer. The car was fast, but the cost of parking equalled 2 hours of work, so was effectively slowest. The latest I could leave and be sure of being on time was walking.
@blacklabelmansociety
@blacklabelmansociety Жыл бұрын
Please, bring more from this guy
@SachinVerma-lx5bx
@SachinVerma-lx5bx Жыл бұрын
Where the formal definitions for concepts like MDPs can get overwhelming, it really helps to have these easy-to-understand explanations.
@pierreabbat6157
@pierreabbat6157 Жыл бұрын
There is a 3% chance that, somewhere along the route, there's a half-duplex roadblock because they're fixing the overhead wires or something. There's a 0.1% chance that a power line or tree fell across the road, forcing you to take an extremely long detour, but half of the time this happens, you could get past it on a bike.
@engineeringmadeasy
@engineeringmadeasy Жыл бұрын
Nice one, I met Professor Nick at Pembroke College Oxford. It was an honour.
@tobiaswegener1234
@tobiaswegener1234 Жыл бұрын
This was a fantastic simple explanation, very enlightening.
@phil9447
@phil9447 Жыл бұрын
MDPs are the topic of my bachelor's thesis, and the example really helped me understand everything a lot better. I think I'll be using it throughout the thesis to make sense of the theory I have to write about. It's a lot easier to understand than some abstract states a, b, c and actions 1, 2, 3 :D
@Ceelvain
@Ceelvain Жыл бұрын
I rarely put a like on a video, but this one deserves it. I definitely want to hear more about the algorithms to solve MDP problems.
@Ceelvain
@Ceelvain Жыл бұрын
I've heard a lot about MDPs and policy functions in the context of reinforcement learning, but this is the best explanation I've ever heard.
@yvesamevoin8720
@yvesamevoin8720 Жыл бұрын
You can hear the passion in every word he pronounces. Very good explanation.
@asfandiyar5829
@asfandiyar5829 Жыл бұрын
I literally had my final year project use a Kalman filter to solve this problem. That's awesome! Edit: spelling
@Maciej-Komosinski
@Maciej-Komosinski Жыл бұрын
Kalman
@BobWaist
@BobWaist 9 ай бұрын
great video! Really well explained and interesting
@tristanlouthrobins
@tristanlouthrobins Ай бұрын
This is such a fascinating breakdown of Markov decision making. I love the mathematics that underpins Markov, but the creativity and imagination applied to the example and its host of solutions are delicious brain food.
@elwood.downey
@elwood.downey Жыл бұрын
the best explanation of this I've ever heard. many thanks.
@lucrainville4372
@lucrainville4372 Жыл бұрын
Fascinating look into decision-making.
@cerealport2726
@cerealport2726 Жыл бұрын
I'd like an autonomous taxi system that would decide it's all too hard to take me to the office and would just take me back home, or, indeed, just refuse to take me to the office. "Sorry, I'm working from home today because the car refused to drive itself."
@IceMetalPunk
@IceMetalPunk Жыл бұрын
"My robot ate my transportation, boss, there was nothing I could do *except* put my comfy PJs back on."
@cerealport2726
@cerealport2726 Жыл бұрын
@@IceMetalPunk Sounds legit, take the rest of the week off.
@Techmagus76
@Techmagus76 Жыл бұрын
Once the AI works well enough, it puts the bike in the car, and if it notices that traffic is heavy, it takes the bike out and travels the rest of the way by bike. The next option: use the bike to get to the train station, and if the train isn't coming soon, switch back to the bike.
@spyboyb321
@spyboyb321 Жыл бұрын
The timing of this video! I am currently trying to work on a project that uses this in my AI class
@Imevul
@Imevul Жыл бұрын
I've unconsciously done something similar with my commute to work. I can take the subway or the bus. The subway almost always takes the same amount of time, but there's a longer walk, and occasionally there are signaling issues that may force me to take the bus anyway. During winter, the bus may get stuck on the snowy hills, and then I'm forced to take a taxi. The bus also has a connection that I will sometimes barely miss, so I may need to wait either ~1 minute or ~15 minutes for the next one. But one upside is, if the connecting bus takes too long, or never comes, I'm pretty close to work already, so I could walk the rest of the way in a pinch. The biggest problem is, I have no idea how to assign the right probabilities to each of those events. There's just not enough data (that I have access to, at least). Usually, I just take the bus to work (less walking, and I don't have to deal with signaling issues) and the subway home (to avoid the connecting bus). If nothing goes wrong, they are pretty similar in time.
@khaledsrrr
@khaledsrrr 4 ай бұрын
Phenomenal. All the respect.
@TGUGCL
@TGUGCL Жыл бұрын
Very interesting video. What about adding multiple criteria to the model? For instance: time and money in the commuting model. Is there software that can help you create and solve these types of multiple-criteria stochastic decision-making problems? Something like Enterprise Dynamics, a discrete-event simulation software platform.
@firsttyrell6484
@firsttyrell6484 Жыл бұрын
image stabilization would be nice
@SystemSh0cker
@SystemSh0cker Жыл бұрын
Another perfect video. Thanks for that! But I'm still asking myself... will this continuous printer paper ever run out??? :D
@Veptis
@Veptis Жыл бұрын
So is there a way to compute the solutions? I assume some matrices show up: one for probabilities and one for the sum of times. Then you can multiply them and get different time distributions for every strategy?
@LukaszWiklendt
@LukaszWiklendt Жыл бұрын
16:17 if you're allowed to remember how many cycles you waited for the train, does this mean you lose the Markov property? Or does the Markov property relate to the environment rather than your decision?
@mgostIH
@mgostIH Жыл бұрын
Looking it up on Wikipedia, it seems like they define the policy to take only the current state rather than the current state plus reward. Granted, you can always augment the state space to include each possible wait for the train at a specific amount of time on the clock and make it Markovian, but the example they gave does violate the Markov property if the nodes described are the states.
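To make that concrete, here is a rough sketch (all names invented) of folding the wait count into the state so the policy stays memoryless:

```python
# Sketch of restoring the Markov property by augmenting the state with
# "how many times have I already waited". The policy still only looks at
# the current state, but that state now carries the wait count.
from typing import NamedTuple

class State(NamedTuple):
    node: str    # e.g. "train_station", "home", "work"
    waits: int   # number of wait cycles already spent at the station

MAX_WAITS = 3    # arbitrary cutoff chosen for this example

def policy(state: State) -> str:
    """A memoryless policy over the augmented state space."""
    if state.node == "train_station":
        # Behaves differently after repeated failed waits, yet depends
        # only on the current (augmented) state -- so it stays Markovian.
        return "wait_for_train" if state.waits < MAX_WAITS else "cycle_instead"
    return "continue"

print(policy(State("train_station", 0)))  # wait_for_train
print(policy(State("train_station", 3)))  # cycle_instead
```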
@vsandu
@vsandu Жыл бұрын
Excellent!!! Cheers.
@Leon-pu3vm
@Leon-pu3vm Жыл бұрын
Extremely nice
@samt2226
@samt2226 Жыл бұрын
What sort of paper is being used for the diagrams?
@GBlunted
@GBlunted Жыл бұрын
You shouldn't be afraid to ask the teacher, "Okay, explain that one more time...", so they get a chance at better, cleaner, more polished bits to put in the video.
@chipsafan1
@chipsafan1 Жыл бұрын
Am I correct to assume that a first-order Markov system is similar to frequentist statistical models as a methodology?
@WalkerRacing
@WalkerRacing Жыл бұрын
Brady, will you please find someone to interview about chess engines/chess programming/neural nets? That would be super interesting.
@ideallyyours
@ideallyyours Жыл бұрын
This interviewer isn't Brady. Says in the description: "This video was filmed and edited by Sean Riley."
@alphgeek
@alphgeek Жыл бұрын
Are the policies analogous to a reward function in a neural network?
@rd42537
@rd42537 Жыл бұрын
That paper takes me back!
@opusdei1151
@opusdei1151 Жыл бұрын
How does the algorithm work with imperfect-information games like poker? Can you apply it to poker?
@DanielkaElliott
@DanielkaElliott Жыл бұрын
It's like: if you are already late, just take the bus, but if you have time according to Google Maps, take the fastest route. Otherwise, take the simplest route you have time for (with the fewest changes and the least walking).
@patrickbateman455
@patrickbateman455 Жыл бұрын
Very nice.
@bigprovola
@bigprovola Жыл бұрын
Let's see Paul Allen's Markov chain.
@jasontrunk3353
@jasontrunk3353 Жыл бұрын
this is great
@jonr6680
@jonr6680 Жыл бұрын
Fascinating and useful overview. I've watched a few machine learning lectures, and it intrigues me that the logic, theory, mechanics etc. are (at this 101 level) identical to the decision theory that any human should - could - would use to live their lives efficiently... But never do! Because we were never taught how. So I bet even the scientists who program their AI for some corporate exploitative system (probably) ironically waste their lives making dumb decisions every day... And the example given of commuting to work is the classic First World Problem... Like gamblers we all think we know how to game the system, but by playing it we have ALREADY LOST. Did I just invent computational philosophy?? Per the reboot movie Tron.
@avinier325
@avinier325 7 ай бұрын
Can anyone please tell me where he got his watch from?
@SozImaScrub
@SozImaScrub Жыл бұрын
@MarkovBaj any thoughts?
@timng9104
@timng9104 Жыл бұрын
wow probabilistic computing is kinda interesting. can u do a video on physical unclonable functions? I need an explainer like this XD
@deep.space.12
@deep.space.12 Жыл бұрын
So... next video gonna be POMDP?
@IanKjos
@IanKjos Жыл бұрын
There's no point in an edge going home from the railway station because having been at the railway station does not change the stochastic costs of the other options. Once you've decided the rail has the lowest stochastic cost, you're done. Now if we add a concept of traffic changing with time, then we have a higher-order model and the edge becomes pointful again.
@brettbreet
@brettbreet Жыл бұрын
What's the watch model he's wearing?
@chiboubamine5970
@chiboubamine5970 2 ай бұрын
I have a problem called the Facility Layout Problem which I am trying to solve using reinforcement learning. The initial state is a layout that has a cost, and the goal is to change the facility layout in order to minimize the cost. My question: should this problem be treated episodically or continuously, and what should I do in the case where there is no absorbing state? I would be extremely happy if someone could help.
@ChristophTungersleben
@ChristophTungersleben Ай бұрын
Whether it's episodic or continuous depends on the beginning state of the 'system'; each action is an episode, but it is possible to hit the optimum by chance. Without a break condition, a loop might follow.
@ENI232
@ENI232 3 ай бұрын
More!
@bongsurfer
@bongsurfer Жыл бұрын
Thanks
@ohsweetmystery
@ohsweetmystery Жыл бұрын
The bike can also take longer than 60 minutes. Flat tires, catastrophic mechanical failure, getting hit by another vehicle, etc.
@scottcox503
@scottcox503 Жыл бұрын
True but it's much more within your control
@terencewinters2154
@terencewinters2154 3 ай бұрын
Do robots queue up?
@danielg9275
@danielg9275 Жыл бұрын
Coo coo cachoo the probability depends on you!
@TheThunderSpirit
@TheThunderSpirit Жыл бұрын
I'm doing reinforcement learning now too.
@KibbleWhite
@KibbleWhite Жыл бұрын
This is great, except you got the percentages for traffic probability wrong. Light traffic is 10%, medium traffic is 20% and heavy traffic is 70% of the time.
@geniusdavid
@geniusdavid Жыл бұрын
Things to have as a computer scientist: a marker and paper. 😮
@marklonergan3898
@marklonergan3898 Жыл бұрын
I have to go to the bank and trust me I will be there in about the time of the year is starting to stir fry sauce instead of garlic on the way home now anyway I think I have a few things to do in the morning. There's predictive text models at work. Start with "I " and keep hammering the predicted word and see what comes out. 😁.
@Bill0102
@Bill0102 4 ай бұрын
Remarkable work! This content is fantastic. I found something similar, and it was beyond words. "Game Theory and the Pursuit of Algorithmic Fairness" by Jack Frostwell
@2k10clarky
@2k10clarky 7 ай бұрын
You might also have a soft deadline for arriving at work, so it's fine as long as, for example, you're late only 1% of the time.
@RayCase
@RayCase Жыл бұрын
2022. Still using tractor feed printer paper as scrap.
@jasonmcfarlane7243
@jasonmcfarlane7243 Жыл бұрын
To all the people in the comments -- no, he doesn't look 'weird' or 'wrong'; he has a lazy eye or a similar condition. These conditions are common and normal. Shame on you.
@odiseezall
@odiseezall 11 ай бұрын
This is exactly what AI assistants should allow us to do - apply mathematical analysis to real world problems, in real-time.
@Jkauppa
@Jkauppa Жыл бұрын
make the difference/similarity of strict algorithm and fuzzy probabilistic selection algorithm clear
@Jkauppa
@Jkauppa Жыл бұрын
in the end the Bayesian decision is the same as the strict algorithm, but the implementation is wildly different and the cleanness/interpretation of the algorithm can be clear or fuzzy (same problem, different paths, in-between partial results, end result as logged)
@Jkauppa
@Jkauppa Жыл бұрын
fuzzy probabilistic AI vs Dijkstra for shortest path
@Jkauppa
@Jkauppa Жыл бұрын
all algorithms give same kinds of answers for same problem but in different logical/math ways
@Jkauppa
@Jkauppa Жыл бұрын
describe dijkstra/A* in infinite memory probabilistic state algorithm
@Jkauppa
@Jkauppa Жыл бұрын
an algorithm might decide on fly while training if it remembers previous states or not
@Eagle3302PL
@Eagle3302PL Жыл бұрын
This video presents a problem, names a solution, doesn't present the named solution, then just ends. The whole video can be summed up as "in computer science sequential decisions with probable outcomes are made by using some approach, the approach requires some conditions to be determined for a desired outcome". IT NEVER SHOWS A SOLUTION, IT JUST SAYS THERE IS ONE. WHAT'S THE POINT?
@OwenPrescott
@OwenPrescott Жыл бұрын
It really bothers me that he's waving the pen around without the lid on
@Lion_McLionhead
@Lion_McLionhead Жыл бұрын
These shortest path algorithms convinced lions that whoever designs these algorithms is a lot smarter than a lion, spent an entire career designing just 1 algorithm, & it's pointless to try to remember them all.
@deanmarktaylor
@deanmarktaylor Жыл бұрын
I watched the film "The Mist" (2007) last night, it seems like "David" could have used a little "help" with this kind of decision making in the end.
@gollolocura
@gollolocura Жыл бұрын
Always take the bike
@alexandrumacedon291
@alexandrumacedon291 Жыл бұрын
There are no decisions, there are choices, and all are random if the parameters are obscure. Just like us: we are biological machines; we know the rules, but we choose as we please.
@iwir3d
@iwir3d Жыл бұрын
Let's go Skynet! ..... Let's go Skynet! Long live the robot overlords.
@hurktang
@hurktang Жыл бұрын
No one in this video understands how trains work. The infographic makes the train jitter along its route, and no one has ever heard of train schedules. We should also factor in cost, the risk of accidents, the health benefits, the ability to read your email on the train...
@Computerphile
@Computerphile Жыл бұрын
The graphic illustrates that the route goes via somewhere else... (Unrealistic route for the timings but inspiration taken from my route from Nottingham to Oxford to meet Nick) HTH -Sean
@hurktang
@hurktang Жыл бұрын
@@Computerphile Ah, sorry! That makes sense. You turned a 150-minute train ride into a 30-minute train ride, and I found the ride quite bumpy. Thanks for wasting your time replying to me.
@Computerphile
@Computerphile Жыл бұрын
You're welcome :0)
@BritishBeachcomber
@BritishBeachcomber Жыл бұрын
*Self-driving car.* Bike swerves in front. Action? 1. Brake hard, but can you stop in time? 2. Swerve left, but what about that little kid? 3. Swerve right and hit oncoming traffic, maybe killing many more people? Humans are very bad when faced with uncertainties like that. Machines would be no better.
@6DAMMK9
@6DAMMK9 Жыл бұрын
“How to guide AI to draw 5 fingers instead of forcing it”, or use chopsticks to eat noodles, or bake a cake.
@ShadowGameAlchemy
@ShadowGameAlchemy Жыл бұрын
I really love all your videos, but I can't stand the sound of the marker pen against the paper. That kind of hiss irritates me to my core. I might be the only one in the world, but my brain is programmed that way. Can you please remove that sound or use a different ballpoint or other pen? I have to hold my earphones away when you start writing. Please consider this.
@liftingisfun2350
@liftingisfun2350 Жыл бұрын
What happened to him
@michaelmueller9635
@michaelmueller9635 Жыл бұрын
My Sunday ...a chameleon is teaching me about robot decisions ...I'm trippin bro xD
@buraktekgul2079
@buraktekgul2079 11 ай бұрын
The sound of the paper is so bad. Please use a whiteboard for the next videos.
@veeek8
@veeek8 Жыл бұрын
So there is a scientific theory behind why I prefer cycling 😂
@D1ndo
@D1ndo Жыл бұрын
I was waiting for 17 minutes for him to actually solve the problem using the algorithm, yet he never got to the point, only babbled about the same thing over and over again. Big dislike.
@johnsenchak1428
@johnsenchak1428 Жыл бұрын
REPORTED NOT COMPUTER RELATED