Monte Carlo Tree Search

136,845 views

John Levine

This is a video I made for my class "CS310: Foundations of Artificial Intelligence" at the University of Strathclyde. The video has a brief description of the Monte Carlo Tree Search algorithm and includes a worked example.

Comments: 216
@luiz00estilo 4 years ago
This is a ridiculously good explanation of MCTS. Keep teaching, you're really good at it.
@tukuri91 6 years ago
Best explanation of MCTS.
@omriram6960 4 years ago
I watched this vid like 10 times to make my Monte Carlo agent. Thanks, man!
@johnlevine2909 4 years ago
You're very welcome!
@vnpikachu4627 3 years ago
The visualization makes this algorithm so easy to understand! Thanks for your great work, professor!
@TusharPal93 2 years ago
Honestly, the best explanation of MCTS so far. This helped greatly, thank you ☺
@puturavindrawiguna3025 2 years ago
Oh my god, this video is so great: the explanation is so clear, all with flowcharts, pseudocode and visualization. Thank you for providing this wonderful learning material.
@RickeyBowers 4 years ago
After reading about this method, the rollout was confusing. The thoroughness of your example completely eliminates that confusion. Thank you.
@tommatensalatxd565 2 years ago
Thank you for your explanation. Your video was the only source where I could understand this algorithm. It really helps with writing my seminar paper.
@johnlevine2909 2 years ago
You're very welcome! Glad it was of use.
@vaibhavsingh2898 10 months ago
I came here after watching many vids on MCTS, and none of them could make me understand the concept, but this video is the best.
@monkey_see_monkey_do 3 years ago
Sir, thank you so much for the explanations. Even someone as dumb as me now feels like I can apply MCTS to my tic-tac-toe game! If teachers like you had taught me when I was a kid, my life would definitely be different. Just can't thank you enough for this lecture!
@kitgary 3 years ago
This is the best step-by-step explanation of MCTS! You saved my day! Thanks!
@angelogross8390 4 years ago
Great explanation of the algorithm, showing all the elements needed to get a basic understanding of MCTS. Thank you.
@jarayu6494 5 years ago
What you did is admirable, sir. Many thanks for the explanation. Saved my day.
@jiangfenglin4359 3 years ago
Your videos are so helpful. Thank you for posting these online!
@MegaHoegaarden 6 years ago
The best explanation. Thank you!
@tljstewart 2 years ago
This is pure gold! Thank you infinitely.
@vaibhavsingh8715 2 years ago
By far, the simplest explanation of MCTS. Thank you so much.
@wangshiyao 3 years ago
Thank you, Prof! The worked example makes all the difference!
@florianhamel17 4 years ago
Thank you so much, because this is the first source I've found that gives a real worked example. So once again, thank you so much. I think I'll now be able to implement it in C for my Connect 4 :)
@ThePositiev3x 4 years ago
At the beginning, your root node's visit count is 0, but you expand it instead of doing a rollout. I know that's the right way, but then you should alter your algorithm: on the left board you said "rollout if the n_i value is 0". You can do this by saying the root node has a visit count of 1 from the start.
@anthonyzavala7173 10 months ago
Don't you expand and then roll out just like he did, meaning that the visit count would be 0?
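For anyone implementing this, here is a minimal sketch (Python; Node, expand_fn and rollout_fn are hypothetical names, not from the video) of the leaf rule the two comments above are discussing: roll out if the leaf has never been visited, otherwise expand it and roll out from one of its new children. Treating the root as already expanded, or as starting with one visit, reconciles the two readings.

    class Node:
        """Hypothetical MCTS node: total value t, visit count n, children, parent."""
        def __init__(self, state, parent=None):
            self.state = state        # application-specific game state
            self.parent = parent
            self.children = []
            self.t = 0.0              # total value backed up through this node
            self.n = 0                # number of visits

    def handle_leaf(leaf, expand_fn, rollout_fn):
        """Leaf rule from the left board: roll out on the first visit,
        expand and then roll out from a new child on later visits."""
        if leaf.n == 0:
            return leaf, rollout_fn(leaf.state)
        leaf.children = [Node(s, parent=leaf) for s in expand_fn(leaf.state)]
        node = leaf.children[0] if leaf.children else leaf
        return node, rollout_fn(node.state)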
@taeyonkim 6 years ago
Great video. It was well explained and easy to follow. I'd love to see other videos related to deep learning if you have them.
@MrJustinRobertson 2 years ago
By far the best explanation I've seen. Thank you.
@cmulation5460 3 years ago
Thank you so much! I'm working on my FYP and this is the best explanation of MCTS out there!
@kate-145 11 months ago
Thank you so much for the motivation this gives me to not give up; it's a lot easier to believe in myself after understanding these concepts!
@cheyneravenscroft2945 4 years ago
Incredibly clear; after just one watch I now understand MCTS.
@indieelectroworks 4 years ago
Very clear explanation of the algorithm, and the video is also well made! Thank you!
@akhilezai 3 years ago
One of the best explanations of MCTS.
@sahhaf1234 3 years ago
A monument to clarity. Thank you very much.
@anthonyapm 2 months ago
Beautifully explained.
@andrewminhnguyen9446 2 years ago
What a great explanation! I've heard other explanations, but this is the condensed one I'll refer to in the future!
@joaquinvanloon7341 6 years ago
Great video! Helped me a lot, thanks.
@micahwang4091 2 years ago
Very clear lecture. Great explanation! Thank you!
@mortenbrodersen8664 3 years ago
Great video. Explaining things clearly is hard. You make it look easy, John.
@TheDaAndy 7 years ago
Thank you for the video, you explained the individual steps very well. That helps me a lot; the expansion step in particular was the one I didn't understand until now.
@johnlevine2909 7 years ago
You're very welcome! It took me a while to work out how the expansion step works, since the standard explanation offered is a bit unclear on this point.
@TheAIEpiphany 3 years ago
5:25 - It's not the smartest idea to take the actions in numerical order; it will somewhat bias the algorithm. I think a better policy would be a uniformly random one, or putting priors over the actions using some heuristic, or better yet an ML model (maybe a policy DNN, as in AlphaGo), in case you know some specifics of the problem you're trying to solve with MCTS. It's also maybe worth mentioning that the expansion threshold doesn't necessarily have to be 1; it's a hyperparameter of MCTS, so you might have to visit a node 40 times before expanding it. Finally, to pick the best action at the end, sometimes you don't want to choose it by looking at the average reward of the child states but simply by the visit count - but I guess this is problem-specific. Loved the walk-through, thank you John! Great pedagogical methods you've got.
@IqweoR 2 years ago
Thanks for the insights!
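To illustrate the final-move-selection point from @TheAIEpiphany's comment above, here is a small hedged sketch (Python; it assumes child nodes carry a total value t and a visit count n, as in the hypothetical Node sketch earlier) of the two common rules: highest average value, as in the video, versus highest visit count (the "robust child" rule).

    def best_child_by_value(children):
        """Pick the child with the highest average value t/n, as in the video."""
        return max(children, key=lambda c: c.t / c.n if c.n > 0 else float("-inf"))

    def best_child_by_visits(children):
        """'Robust child' rule: pick the most-visited child, as suggested above."""
        return max(children, key=lambda c: c.n)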
@durgaharish1993 7 years ago
Thanks a lot for the lecture, it was really helpful :)
@johnlevine2909 7 years ago
You're very welcome!
@yuanhuang2489 6 years ago
Great lecture and vivid example! Thanks a lot! ^_^
@aorusaki 4 years ago
Very good presentation. Thank you! I really needed clarification on whether to increment the visit count BEFORE or AFTER backpropagation. Thanks for this :)
@johnr3936 2 years ago
Truly incredible explanation.
@goddoes8 6 years ago
Thank you for the great lecture.
@talhatariqyuluqatdis 5 years ago
Extremely helpful! Thank you.
@darmilatron 7 months ago
Thank you for such a clear explanation of MCTS.
@nikokolas1138 2 years ago
Your explanations are much better than the Wikipedia page. Congrats! I thought that MCTS was only applicable to games in which you have an opponent; glad to see that is not the case. I will apply it to my problem. Thanks!
@charlesphillips1963 3 years ago
Life saver, great run-through.
@poshsagar 6 years ago
Really great lecture. Keep up the good work :)
@AGCrush 3 years ago
Great explanation!! Thank you a lot!! Keep the good work going!
@jayanmistry1768 5 years ago
Amazing explanation!!
@ianyang8228 1 year ago
The best explanation I can find.
@someshpatel7660 6 years ago
Bro, you rocked, a lot of respect for you. Thanks!
@The_Gingiraffe 3 years ago
Thank you! Explained it better than my professor ever did.
@brendongong7295 2 months ago
Amazing walkthrough, thank you.
@jhhh0619 5 years ago
Jeez. You are the best!
@robertmcvickers5975 3 years ago
Wow, this is liquid gold. Thanks, John. Nice to hear from a fellow Brit and not MIT OCW! Big up America too, though.
@GanymedeCroft 1 year ago
Thank you for the explanation. It was clear!
@sc0820 3 years ago
The content is so exciting that even the camera shares the feeling.
@sqliu9489 1 year ago
Super clear explanation!!! 🌟🌟🌟🌟🌟
@ahmadsalehiyan5610 11 months ago
Great explanation, thanks.
@ivanmateev 2 years ago
Awesome explanation. Thank you.
@alexfalconer-athanasssakos5066 7 years ago
Right on time for the end of term, great video.
@johnlevine2909 7 years ago
Thanks! Glad you found it useful.
@railgunpat3170 4 years ago
Great explanation.
@zucoorsmart5977 2 years ago
So clear! Thank you!
@arbutusunedo6866 4 years ago
Best possible explanation...
6 years ago
Very nice explanation, thank you.
@irvingzhang5657 6 years ago
Really helpful, thank you!
@zoc2 1 month ago
John Levine and Josh Starmer are carrying me so hard my feet aren't even touching the ground.
@meetkumarnaik1272 5 years ago
Clean explanation!
@user-ce4zv9vd1p 3 years ago
Great video!
@CSRocks 7 years ago
Very good and easy-to-follow explanation!
@johnlevine2909 7 years ago
Thank you! Glad you liked it.
@sultanahmed5646 4 years ago
Thanks a lot for sharing this video, Sir. It was really very helpful.
@lesalemome 6 years ago
Great explanation of MCTS.
@Sakibkhan-lc8jy 2 years ago
Thank you very much. You are great! ❤️
@matthiasegli6846 4 years ago
Really great video!
@AlessandroOrlandi83 3 years ago
Thanks, the best explanation of MCTS. I would like to understand how it would work in a game with multiple players.
@samsonyu5679 1 year ago
Thank you, this is very helpful 👍
@xXUkanlosXx 3 years ago
Thank you very much! My game of Othello is fully working now! :)
@tobiasarndt5640 3 years ago
Fantastic!
@sajadshafaf534 11 months ago
Thank you very much, it helped me a lot!
@smallwang 6 years ago
Thanks a lot, it's really clear!
@tolis947 3 years ago
First of all, thank you for the excellent lecture and explanation! The algorithm is very clear and understandable from the way it's presented. My question is: at the end you mention that what this gives us is that we should take action a2 over a1. Why can't we take it a step further and also infer that, for example, a5 is better than a6 (assuming we have visited them both more times than what is shown in the example)? Why do we only use a whole search tree for just one decision, and then scrap it and rebuild it anew, instead of making more sequential decisions using the same tree?
@seanwu2343 1 year ago
Perfect video.
@ercanatam6344 2 years ago
Dear Prof. John, thank you for the amazing video on MCTS. Where can we find the pseudocode for MCTS? Which materials do you suggest for this?
@Gornak40 3 years ago
Very nice explanation! Thank you! I have a question: what should I do if I am at a leaf node and should expand it, but the state of this node is terminal?
@robertjurjevic6580 6 years ago
Thanks for sharing the lecture 😊
@malinyamato2291 2 years ago
Thank you for helping me in my AI studies... you're great.
@stivstivsti 3 years ago
Brilliant.
@guillaumerousseau9549 4 years ago
Super clear, thank you.
@LosSebosFR 22 days ago
Thank you so much! The explanations are very clear, you make it sound like it's easy! What's with the human calculator, by the way?
@marionmeyers9836 7 years ago
Thank you so much for this video, it really helped me a lot. I just have one question: when expanding, why do you only roll out from one of the children? Wouldn't it give a more accurate score for the parent if we simulated all of its children?
@johnlevine2909 7 years ago
Good question. In general, MCTS is trying to make the best decision it can from the limited number of rollouts that it can make in the time available. Therefore, every time it does a rollout, it updates the whole tree and then uses this updated tree to decide where to sample next - it could be that sampling only a single child will tell us that the node above is a poor choice, and so we will start sampling elsewhere in the tree.
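The "decide where to sample next" step is the UCB1 selection rule used in the video. A minimal sketch, assuming nodes with a total value t and visit count n as in the earlier hypothetical Node class, and the usual exploration constant C = sqrt(2):

    import math

    def ucb1(child, parent_visits, c=math.sqrt(2)):
        """Average value plus an exploration bonus; unvisited children win outright."""
        if child.n == 0:
            return float("inf")
        return child.t / child.n + c * math.sqrt(math.log(parent_visits) / child.n)

    def select(node):
        """Walk down the tree, always following the child with the highest UCB1 score."""
        while node.children:
            node = max(node.children, key=lambda ch: ucb1(ch, node.n))
        return node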
@pedrosmmc 5 years ago
I didn't understand why, after the backpropagation from S5, S0 became 44 if it was 30 and S2 = 24. I thought we should add S0 = 30 plus S2 = 24, so S0 would be 54! By the way, best explanation EVER! Thanks a lot for sharing.
@morelfotsing2221 5 years ago
Pedro Coelho I believe we just backpropagate the value we obtained after the rollout. So if we got 14, we just add that 14 to each of the parents, all the way up to the root.
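That reading matches the numbers quoted above: a rollout value of 14 is added to the total of every node on the path back to the root (so S0 goes from 30 to 44), and each visit count goes up by one. A minimal backpropagation sketch using the same hypothetical node fields as the earlier snippets:

    def backpropagate(node, reward):
        """Add the rollout reward to every node on the path back to the root
        and increment each visit count by one."""
        while node is not None:
            node.t += reward   # e.g. S0: 30 + 14 = 44
            node.n += 1
            node = node.parent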
@QuantKinderChocolate 2 years ago
Nice lecture!
@user-sv1qi4rq5q 6 years ago
Great video.
@Galakyllz 1 year ago
I really appreciate the algorithm trace.
@MastersofTrickshots 6 years ago
Thanks for the great explanation! I have one question though: if we now decide after four iterations to choose action a_2, as you showed in the video, then the next four iterations would start in s_2 to determine the next action. For this, do we use the results of the calculations from previous iterations, or do we "throw everything away" and start with all values initialized to 0?
@johnlevine2909 6 years ago
Let's say we do select a_2 for execution. So we perform a_2 and go into state s_2, as you say. s_2 now becomes the new root node, and we need to decide what action we should take next. As you also say, we already have some results stored in the tree which are useful to us, so it makes sense to start the next iteration using the sub-tree with s_2 as the root node.
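A hedged sketch of that tree reuse, using the hypothetical Node fields from earlier: after executing the chosen action, promote the corresponding child to be the new root and keep its accumulated statistics instead of starting from zero.

    def promote_child_to_root(child):
        """Keep the statistics gathered under the executed action (e.g. the node
        for s_2) by detaching that child and using it as the new root."""
        child.parent = None   # the rest of the old tree can be discarded
        return child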
@zorankonkoli2050 3 years ago
Great explanation! Thanks! A quick question about how the algorithm is constructed, regarding the leaf nodes: the algorithm rolls out on the first visit, expands the tree and rolls out on the second visit, and on any subsequent visit just traverses using the UCB1 formula. Is my understanding correct?
@dashadower 7 years ago
I've been searching for ages for a proper explanation of MCTS, and this is the best of all. Thanks for taking the time to make this! Just one question, however: how did you get the value of the leaf nodes on rollout? I see that you randomize until a terminal state. Do you use an evaluation function there?
@johnlevine2909 7 years ago
The idea is that the rollout should end in a state that really is terminal - the end of the game. For example, if we're playing tic-tac-toe, the terminal state is a win for one player or a draw because there are no moves left. In my example run, I'm imagining a game with a more sophisticated scoring system than tic-tac-toe, so the score that I'm using is the reward of the terminal state to the player that's making the decision. Hope this helps!
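A minimal sketch of such a rollout, assuming the caller supplies game-specific legal_moves, apply_move, is_terminal and reward functions (all hypothetical names):

    import random

    def rollout(state, legal_moves, apply_move, is_terminal, reward):
        """Play uniformly random moves until the game really ends, then return
        the terminal reward from the deciding player's point of view."""
        while not is_terminal(state):
            state = apply_move(state, random.choice(legal_moves(state)))
        return reward(state)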
@dashadower 7 years ago
John Levine Thanks for the reply. However, in the expansion phase we're adding possible moves we can make, but before that, how can we simulate where the enemy will place his move?
@dashadower 7 years ago
John Levine If I elaborate: in the expansion phase, our current state is the state where the AI placed the stone. In a turn-based game, the nodes added to S2 in the example in the video should be the opponent's moves. How do you simulate that?
@johnlevine2909 7 years ago
In a 2-player game the opponent will choose moves that minimise the first player's score, while also choosing moves that keep the exploration term high. So the easiest way to do this is to put a negative sign in front of the first term, and then return the move that gives the maximum value for this new formula.
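A hedged sketch of that adjustment to the earlier ucb1 score: flip the sign of the value term at nodes where the opponent is to move, leave the exploration term alone, and still take the maximum (the opponent_to_move flag is an assumption about how an implementation might track turns).

    import math

    def uct_score(child, parent_visits, opponent_to_move, c=math.sqrt(2)):
        """UCB1 with the value term negated on opponent nodes, so the opponent is
        assumed to minimise the first player's average reward."""
        if child.n == 0:
            return float("inf")
        mean = child.t / child.n
        exploit = -mean if opponent_to_move else mean
        return exploit + c * math.sqrt(math.log(parent_visits) / child.n)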
@raciociniologico1378 2 years ago
Thank you sooooo much!!!!
@hazemtarek1302 6 years ago
On what basis do you give a value to the unvisited nodes, or do you just generate them randomly?
@mochamao5373 4 years ago
Really helpful, I like it.
@ogito999 2 years ago
Really useful.
@joshuac9142 28 days ago
Perfect!
Alpha Zero and Monte Carlo Tree Search
23:35
Josh Varty
39K views
Uniform Cost Search
10:23
John Levine
386K views
6. Monte Carlo Simulation
50:05
MIT OpenCourseWare
2M views
12. Greedy Algorithms: Minimum Spanning Tree
1:22:10
MIT OpenCourseWare
220K views
Monte Carlo Simulation
10:06
MarbleScience
1.4M views
AI 101: Monte Carlo Tree Search
7:54
AI and Games
131K views
Monte Carlo Tree Search (MCTS) Tutorial
12:39
Fullstack Academy
88K views
Minimax with Alpha Beta Pruning
13:44
John Levine
323K views
Deepmind AlphaZero - Mastering Games Without Human Knowledge
42:29
The Artificial Intelligence Channel
192K views
A* Search
12:32
John Levine
393K views