Monte Carlo Tree Search

132,866 views

John Levine

Days ago

This is a video I made for my class "CS310: Foundations of Artificial Intelligence" at the University of Strathclyde. The video has a brief description of the Monte Carlo Tree Search algorithm and includes a worked example.
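For readers who want to follow the worked example in code, below is a minimal Python sketch of one MCTS decision as described in the video: selection with UCB1, expansion, rollout, and backpropagation. The Node fields, the game-state interface (legal_actions, step, is_terminal) and the constant C are illustrative assumptions, not material from the lecture; the rollout function is passed in (a random-playout version is sketched in one of the replies below).

import math

C = 1.41  # exploration constant in UCB1; sqrt(2) is a common default

class Node:
    def __init__(self, state, parent=None, action=None):
        self.state, self.parent, self.action = state, parent, action
        self.children = []       # child nodes added during expansion
        self.visits = 0          # n_i in the video
        self.total_value = 0.0   # t in the video: sum of rollout rewards

def ucb1(node):
    # Average value plus an exploration bonus; unvisited nodes are tried first.
    if node.visits == 0:
        return float("inf")
    return (node.total_value / node.visits
            + C * math.sqrt(math.log(node.parent.visits) / node.visits))

def mcts(root_state, iterations, rollout):
    root = Node(root_state)
    for _ in range(iterations):
        # 1. Selection: walk down while the current node already has children.
        node = root
        while node.children:
            node = max(node.children, key=ucb1)
        # 2. Expansion: per the flowchart, roll out if n_i is 0; otherwise add
        #    the children of this leaf and move to one of them.
        if node.visits > 0 and not node.state.is_terminal():
            node.children = [Node(node.state.step(a), parent=node, action=a)
                             for a in node.state.legal_actions()]
            node = node.children[0]
        # 3. Simulation: rollout from here to a terminal state.
        value = rollout(node.state)
        # 4. Backpropagation: update counts and totals all the way to the root.
        while node is not None:
            node.visits += 1
            node.total_value += value
            node = node.parent
    # Recommend the child with the best average value (a2 over a1 in the example).
    best = max(root.children,
               key=lambda c: c.total_value / c.visits if c.visits else float("-inf"))
    return best.action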

Comments: 213
@luiz00estilo 4 years ago
This is a ridiculously good explanation of MCTS. Keep teaching, you're really good at it.
@tukuri91 6 years ago
Best explanation of MCTS.
@omriram6960 4 years ago
I watched this vid like 10 times to make my Monte Carlo agent. Thanks, man.
@johnlevine2909 4 years ago
You're very welcome!
@vnpikachu4627 3 years ago
The visualization makes this algorithm so easy to understand! Thanks for your great work, professor!
@TusharPal93 2 years ago
Honestly, the best explanation of MCTS so far. This helped greatly, thank you ☺
@puturavindrawiguna3025 A year ago
Oh my god, this video is so great and the explanation is so clear, all with a flowchart, pseudocode and a visualization. Thank you for providing this wonderful learning material.
@jiangfenglin4359 3 years ago
Your videos are so helpful. Thank you for posting these online!
@angelogross8390 4 years ago
Great explanation of the algorithm, showing all the elements needed to get a basic understanding of MCTS. Thank you.
@jarayu6494 5 years ago
What you did is admirable, sir; many thanks for the explanation. Saved my day.
@taeyonkim 6 years ago
Great video. It was well explained and easy to follow. I'd love to see other videos related to deep learning if you have them.
@tommatensalatxd565 2 years ago
Thank you for your explanation. Your video was the only source where I could understand this algorithm. It really helps with writing my seminar work.
@johnlevine2909 2 years ago
You're very welcome! Glad it was of use.
@wangshiyao 3 years ago
Thank you, Prof! The worked example makes the absolute difference!
@vaibhavsingh2898 9 months ago
I came here after watching many vids on MCTS, and none of them were able to make me understand the concept, but this video is the best.
@monkey_see_monkey_do 3 years ago
Sir, thank you so much for the explanations. Even someone as dumb as me now feels ready to apply MCTS to my tic-tac-toe game! If teachers like you had taught me when I was a kid, my life would definitely be different. I just can't thank you enough for this lecture!
@MegaHoegaarden 6 years ago
The best explanation. Thank you!
@cheyneravenscroft2945 3 years ago
Incredibly clear; after just one watch I now understand MCTS.
@vaibhavsingh8715 2 years ago
By far the simplest explanation of MCTS. Thank you so much.
@indieelectroworks 4 years ago
Very clear explanation of the algorithm, and the video is also well made! Thank you!
@MrJustinRobertson 2 years ago
By far the best explanation I've seen. Thank you.
@cmulation5460 3 years ago
Thank you so much! I'm working on my FYP and this is the best explanation of MCTS out there!
@kate-145 10 months ago
Thank you so much for the motivation this gives me not to give up; it's a lot easier to believe in myself after understanding these concepts!
@andrewminhnguyen9446 2 years ago
What a great explanation! I've heard other explanations, but this is the condensed one I'll refer to in the future!
@florianhamel17 4 years ago
Thank you so much, because this is the first source I've found that gives a real worked example. So once again, thank you so much. I think I'll now be able to implement it in C for my Connect 4 :)
@tljstewart 2 years ago
This is pure gold! Thank you infinitely.
@kitgary 3 years ago
This is the best step-by-step explanation of MCTS! You saved my day! Thanks!
@mortenbrodersen8664 3 years ago
Great video. Explaining things clearly is hard. You make it look easy, John.
@RickeyBowers 4 years ago
After reading about this method, the rollout was confusing. The thoroughness of your example completely eliminates that confusion. Thank you.
@akhilezai 3 years ago
One of the best explanations of MCTS.
@nikokolas1138 2 years ago
Your explanations are much better than the Wikipedia page. Congrats! I thought that MCTS was only applicable to games in which you have an opponent; glad to see that is not the case, and I will apply it to my problem. Thanks!
@aorusaki 4 years ago
Very good presentation. Thank you! I really needed clarification on whether to increment the visit count BEFORE or AFTER backpropagation. Thanks for this :)
@TheDaAndy 7 years ago
Thank you for the video; you explained the individual steps very well. That helps me a lot; most of all, the expansion step was the one I didn't understand until now.
@johnlevine2909 7 years ago
You're very welcome! It took me a while to work out how the expansion step works, since the standard explanation offered is a bit unclear on this point.
@darmilatron 6 months ago
Thank you for such a clear explanation of MCTS.
@sahhaf1234 3 years ago
A monument to clarity. Thank you very much.
@micahwang4091 2 years ago
Very clear lecture. Great explanation! Thank you!
@johnr3936 2 years ago
Truly incredible explanation.
@durgaharish1993 7 years ago
Thanks a lot for the lecture, it was really helpful :)
@johnlevine2909 7 years ago
You're very welcome!
@goddoes8 6 years ago
Thank you for the great lecture.
@yuanhuang2489 6 years ago
Great lecture and vivid example! Thanks a lot! ^_^
@ThePositiev3x 4 years ago
At the beginning, your root node's visit count is 0, but you expand it instead of doing a rollout. I know that is the right way, but then you should alter your algorithm: on the board on the left you said "rollout if the ni value is 0". You can fix this by saying the root node has a visit count of 1 from the start.
@anthonyzavala7173 9 months ago
Don't you expand and then rollout, just like he did, meaning that the visits would be 0?
@The_Gingiraffe 3 years ago
Thank you! Explained it better than my professor ever did.
@anthonyapm A month ago
Beautifully explained.
@joaquinvanloon7341 6 years ago
Great video! Helped me a lot, thanks.
@talhatariqyuluqatdis 5 years ago
Extremely helpful! Thank you.
@ianyang8228 A year ago
The best explanation I can find.
@sc0820 3 years ago
The content is so exciting that even the camera shares the feeling.
@poshsagar 6 years ago
Really great lecture. Continue the good work :)
@charlesphillips1963 3 years ago
Life saver, great run-through.
@robertmcvickers5975 3 years ago
Wow, this is liquid gold. Thanks John. Nice to hear from a fellow Brit and not MIT OCW! Big up America too, though.
@AlessandroOrlandi83 3 years ago
Thanks, the best explanation of MCTS. I wish to understand how it would work for a game with multiple players.
@someshpatel7660 6 years ago
Bro, you rocked; a lot of respect for you. Thanks.
@TheAIEpiphany 3 years ago
5:25 - It's not the smartest idea to take the children in numerical order, since it will somewhat bias the algorithm. I think a better policy would be a uniformly random one, or to put priors over the actions using some heuristic, or better yet an ML model (maybe a policy DNN, as in AlphaGo), if you know some specifics of the problem you're trying to solve with MCTS. It's also worth mentioning that the expansion threshold doesn't necessarily have to be 1; it's a hyperparameter of MCTS, so you may have to visit a node 40 times before expanding it. Finally, to pick the best action at the end, sometimes you don't want to choose it by looking at the average reward of the child states but simply by taking the visit count - but I guess this is problem-specific. Loved the walk-through, thank you John! Great pedagogical methods you've got.
@IqweoR 2 years ago
Thanks for the insights!
@GanymedeCroft A year ago
Thank you for the explanation. It was clear!
@AGCrush 3 years ago
Great explanation!! Thank you a lot!! Keep the good work going!
@alexfalconer-athanasssakos5066 7 years ago
Right on time for the end of term, great video.
@johnlevine2909 7 years ago
Thanks! Glad you found it useful.
@jayanmistry1768 5 years ago
Amazing explanation!!
@jhhh0619 5 years ago
Jeez. You are the best!
@sultanahmed5646 4 years ago
Thanks a lot for sharing this video, Sir. It was really very helpful.
@lesalemome 6 years ago
Great explanation of MCTS.
@tolis947 3 years ago
First of all, thank you for the excellent lecture and explanation! The algorithm is very clear and understandable from the way it's presented. My question is: at the end you mention that what this gives us is that we should take action a2 over a1. Why can't we take it a step further and also infer that, for example, a5 is better than a6 (assuming we have visited them both more times than shown in the example)? Why do we only use a whole search tree for just one decision, and then scrap it and rebuild it anew, instead of making more sequential decisions using the same tree?
@brendongong7295 A month ago
Amazing walkthrough, thank you.
@irvingzhang5657 6 years ago
Really helpful, thank you!
@ercanatam6344 2 years ago
Dear Prof. John, thank you for the amazing video on MCTS. Where can we find the pseudocode for MCTS? Which materials do you suggest for this?
@Gornak40 3 years ago
Very nice explanation! Thank you! I have a question: what should I do if I am at a leaf node that should be expanded, but the state of this node is terminal?
@robertjurjevic6580 6 years ago
Thanks for sharing the lecture 😊
@meetkumarnaik1272 5 years ago
Clean explanation!
@xXUkanlosXx 3 years ago
Thank you very much! My game of Othello is fully working now! :)
@NormanForster 6 years ago
Very nice explanation, thank you.
@arbutusunedo6866 4 years ago
Best possible explanation...
@ahmadsalehiyan5610 10 months ago
Great explanation, thanks.
@smallwang 6 years ago
Thanks a lot, it's really clear!
@railgunpat3170 4 years ago
Great explanation.
@CSRocks 7 years ago
Very good and easy-to-follow explanation!
@johnlevine2909 7 years ago
Thank you! Glad you liked it.
@sevfx A year ago
This will go in my bachelor thesis about AlphaGo Zero :) Thanks so much for the explanation! Do you have some sources I could cite on MCTS, maybe?
@samsonyu5679 A year ago
Thank you, this is very helpful 👍
@ivanmateev 2 years ago
Awesome explanation. Thank you.
@Sakibkhan-lc8jy 2 years ago
Thank you very much. You are great! ❤️
@malinyamato2291 2 years ago
Thank you for helping me in my AI studies... you're great.
@Galakyllz A year ago
I really appreciate the algorithm trace.
@zucoorsmart5977 A year ago
So clear! Thank you!
@sajadshafaf534 9 months ago
Thank you very much, it helped me a lot!
@sqliu9489 A year ago
Super clear explanation!!! 🌟🌟🌟🌟🌟
@stivstivsti 2 years ago
Brilliant.
@seanwu2343 A year ago
Perfect video.
@tobiasarndt5640 2 years ago
Fantastic!
@zorankonkoli2050 2 years ago
Great explanation! Thanks! A quick question about how the algorithm is constructed, concerning the leaf nodes: the algorithm rolls out on the first visit, expands the tree and rolls out on the second visit, and on any subsequent visit just traverses using the UCB1 formula. Is my understanding correct?
@gorkemvids4839 2 years ago
Amazing explanation. Thank you. But I'm not sure how to get the terminal value of a leaf state. Is it the average score you get after performing random actions from that point?
@hazemtarek1302 6 years ago
On what basis do you give a value to the unvisited nodes, or do you just generate them randomly?
@JO-vj9kn 4 years ago
Thanks for a great explanation! Question: is there any reason you compute the sum of values when backpropagating instead of the incremental mean? I've seen both versions but I'm not sure why and when one should be picked over the other.
@martinkunev9911 2 years ago
The incremental mean is prone to numerical inaccuracies because floating-point operations are imprecise.
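To make the difference concrete, here is an illustrative sketch of the two bookkeeping styles discussed above (not from the video): the first stores a running sum as in the lecture, the second stores the mean directly and assumes each node has a mean_value field initialised to 0.0.

def backprop_sum(node, value):
    while node is not None:
        node.visits += 1
        node.total_value += value   # mean is recovered later as total_value / visits
        node = node.parent

def backprop_incremental_mean(node, value):
    while node is not None:
        node.visits += 1
        node.mean_value += (value - node.mean_value) / node.visits   # running mean
        node = node.parent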
@dashadower9637 7 years ago
I've been searching for ages for a proper explanation of MCTS, and this is the best of all. Thanks for taking the time to make this! Just one question, however: how did you get the value of the leaf nodes on rollout? I see that you randomized until a terminal state. Do you use an evaluation function there?
@johnlevine2909 7 years ago
The idea is that the rollout should end in a state that really is terminal - the end of the game. For example, if we're playing tic-tac-toe, the terminal state is a win for one player or a draw because there are no moves left. In my example run, I'm imagining a game with a more sophisticated scoring system than tic-tac-toe, so the score that I'm using is the reward of the terminal state to the player that's making the decision. Hope this helps!
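A random rollout of the kind described in this reply might look like the following sketch (illustrative Python; the game-state methods and the player argument are assumptions). It plays the role of the rollout parameter in the earlier mcts() sketch, e.g. passed as lambda s: random_rollout(s, me).

import random

def random_rollout(state, player):
    # Play uniformly random legal moves until the game really ends, then return
    # the terminal reward from the deciding player's point of view.
    while not state.is_terminal():
        state = state.step(random.choice(state.legal_actions()))
    return state.reward(player)   # win/draw/loss score, or a richer game score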
@dashadower9637 7 years ago
John Levine Thanks for the reply. However, in the expansion phase we're adding possible moves we can make, but before that, how can we simulate where the enemy will place their move?
@dashadower9637 7 years ago
John Levine If I elaborate: in the expansion phase, our current state is the state where the AI placed the stone. In a turn-based game, the nodes added to S2 in the example in the video should be the opponent's moves. How do you simulate that?
@johnlevine2909 7 years ago
In a 2-player game the opponent will choose moves that minimise the first player's score, while also choosing moves that keep the exploration term high. So the easiest way to do this is to put a negative sign in front of the first term, and then return the move that gives the maximum value for this new formula.
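That adjustment can be sketched as below (illustrative; assumes the Node class from the earlier sketch plus an opponent_to_move flag on each node marking positions where the opponent chooses). An equivalent and common alternative is to store each node's value from the perspective of the player to move there and flip the sign of the rollout result as it is backpropagated.

import math

def uct(node, c=1.41):
    if node.visits == 0:
        return float("inf")
    exploit = node.total_value / node.visits
    if node.opponent_to_move:       # opponent is assumed to minimise our score
        exploit = -exploit
    explore = c * math.sqrt(math.log(node.parent.visits) / node.visits)
    return exploit + explore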
@MastersofTrickshots 6 years ago
Thanks for the great explanation! I have one question though: if we now decide after four iterations to choose action a_2, as you showed in the video, then the next four iterations would start in s_2 to determine the next action. For this, do we use the results of the calculations from previous iterations, or do we "throw everything away" and start with all values initialized to 0?
@johnlevine2909 6 years ago
Let's say we do select a_2 for execution. So we perform a_2 and go into state s_2, as you say. s_2 now becomes the new root node, and we need to decide what action we should take next. As you also say, we already have some results stored in the tree which are useful to us, so it makes sense to start the next iteration using the sub-tree with s_2 as the root node.
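A sketch of that subtree reuse, under the same illustrative Node assumptions as in the earlier sketches:

def advance_root(root, chosen_action):
    # After executing chosen_action, keep the matching child (and its statistics)
    # as the new root instead of rebuilding the tree from scratch.
    for child in root.children:
        if child.action == chosen_action:
            child.parent = None
            return child
    return None   # the action was never expanded, so start a fresh tree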
@matthiasegli6846 4 years ago
Really great video!
@justinrobertson6241 2 years ago
According to your flowchart, shouldn't you do a rollout immediately from S0, since it is a leaf node and it has never been visited?
@Must23 2 years ago
The voice recording system is so good; please tell me how you recorded this segment?
@mohamedalihammal5177 6 years ago
A perfectly explained video and method; my only issue for now is how the random simulation module works. Can anyone give me a hand with this?
@marionmeyers9836 6 years ago
Thank you so much for this video, it really helped me a lot. I just have one question: when expanding, why do you only roll out from one of the children? Wouldn't it give a more accurate score for the parent if we simulated all its children?
@johnlevine2909 6 years ago
Good question. In general, MCTS is trying to make the best decision it can from the limited number of rollouts that it can make in the time available. Therefore, every time it does a rollout, it updates the whole tree and then uses this updated tree to decide where to sample next - it could be that sampling only a single child will tell us that the node above is a poor choice, and so we will start sampling elsewhere in the tree.
@user-ce4zv9vd1p 2 years ago
Great video!
@lialkalo4093 3 years ago
Very nice explanation, but I still don't get what the rollout phase does. Is it a continuation of choosing random states until the algorithm hits some terminal criterion (with a subsequent evaluation)?
@gianlucasperti4159 7 months ago
Sorry, I have a question: when we do a rollout from a leaf node, do we start backpropagating the value from that leaf to the root, or from the node where the rollout started to the root?
@zakariataouil2780 10 months ago
Straightforward explanation.
@guillaumerousseau9549 4 years ago
Super clear, thank you.