Best Multi-Armed Bandit Strategy? (feat: UCB Method)

Views: 44,224

ritvikmath

1 day ago

Comments: 96
@shahnazmalik6553 4 years ago
This is called teaching with the highest standards
@lilmoesk899 4 years ago
You're a good teacher, man! Too bad only very few academics can explain things with the clarity and simplicity you do.
@ritvikmath 4 years ago
I appreciate that!
@suhailshaikh7139 4 years ago
One of the best video explanations I have seen on data science so far. Please keep up the good work, Ritvik. Thanks a lot!!
@ritvikmath 4 years ago
Most welcome!
@nghianguyendanh5539 3 years ago
I was stuck on the bandit algorithm for a day before I found your video. Excellent work!
@ritvikmath 3 years ago
Thanks!
@sophielenz9991 1 year ago
Choosing a place for dinner will never be the same again... your videos are fantastic, man! I was so frustrated earlier today because I simply couldn't get a grip on the UCB algorithm. Now I am more than happy, not only because I finally understood it (at least the intuition behind it), but also because I have a name for one of the dominating stories of my life (the exploration-exploitation dilemma). You, sir, are one of the most amazing teachers I have ever experienced!
@debasishhazra3222 3 years ago
You simply rock 👍 Your teaching style and way of explaining complex things in such a simple fashion make learning much easier and faster. Wonderful.
@marcusakiti7608 2 years ago
Love this video, man. Just the simple message the viewer gets that you're here to help them and break down higher, abstract concepts into simpler terms they can grasp is incredibly reassuring. Even if I failed to understand any given part as a student, I'd go back over and over with the confidence that you're willing and able to help me get there eventually. Even if this channel isn't around forever, never stop sharing your knowledge.
@aidanmorrison5925 3 years ago
This is very, very well explained. Concise, yet conversational. Excellent stuff.
@xflory26x 3 years ago
I've watched loads of your videos and they've given me so much clarity on so many different data science concepts. You're a really great teacher, hope you keep posting videos and hope your channel keeps growing!
@jeshocarmel 3 years ago
This is too good, Ritvik. Congrats, you made learning UCB easier
@DaquanMHall 2 years ago
I really appreciate your videos. I'm taking a course on machine learning and A/B testing, and after every lesson I come watch your videos to actually understand what I just learned.
@hasaanijaz145 3 years ago
Your ability to communicate difficult concepts using storytelling is unparalleled.
@neonzhang4216 3 years ago
So great, cleared my doubt completely. Please keep doing this!!
@alexandervalach4065 3 years ago
The best video explanation I have seen so far. Could not stop paying attention. Thank you!
@ritvikmath 3 years ago
Glad it was helpful!
@orb90210 2 years ago
The best math-computer-science instructor online. Much appreciated
@omarapacanadhih 1 year ago
BEST EXPLANATION EVER. Thank you so much, Ritvik!
@karapuzzx 1 year ago
Thank you, brother! You are very good at explaining and giving the right information. Respect!
@rajbhowmick8575 8 months ago
Your videos have cleared up my concepts over the years. Please make a playlist on Reinforcement Learning.
@vijaydanielm3728 4 years ago
The thing I love the most about your videos is the perfect balance between intuition, theory and matching them to results. Keep going! If you have a Patreon or equivalent account, I'd be honored to support you in this terrific journey of yours.
@ritvikmath 4 years ago
I appreciate that!
@pruthvisrungavarapu5238 3 years ago
Agree!
@nonamenoname1942 3 years ago
Thank you, you're a talented teacher. You explained it very well and clearly.
@viola2dora 3 years ago
This is such a great explanation! Thank you!
@Rudolf-ul1zh 2 years ago
Thank you so much for also providing the link to Hoeffding's inequality! Most other sources for this just skip the theory, which I dislike since I would like to understand this algorithm.
@serbolothbrok 3 years ago
Better than my professor. Thank god I found your video, thank you very much!!
@amirnouripour5501 2 years ago
Love the way you explain by examples!
@oswaldocastro9600 3 years ago
Warning, everybody... Very addictive videos... I just can't stop watching one after another. Fantastic job!!!
@sans8119 3 years ago
The explanation makes the concepts very clear.
@ritvikmath 3 years ago
thanks!
@arc6fire 2 years ago
Thanks a lot for these multi-armed bandit videos... spent ages trying to figure this stuff out, and your explanations have helped a lot. Thank you :-)
@shipan5940 2 years ago
While watching this vid, I unconsciously started nodding!!!
@brendansullivan4872 2 years ago
Made it easy. Thanks for teaching this and being clear as day.
@tyler-iy4jk 1 year ago
Came here to get better at picking restaurants but stayed for the data science teaching!
@ritvikmath 1 year ago
Woo!
@madhuryadav4316 3 years ago
This is such a good explanation. Brilliant.
@ritvikmath 3 years ago
Glad you think so!
@ashyboy1324 2 years ago
Ritvik you are a pedagogical GOD
@mohammedluqman8757 2 years ago
In real-world problems, the state space will be very big and we will not get enough time to explore all possible states. In such cases, UCB1 should perform better than exploitation.
@shivangitomar5557 1 year ago
Amazing! Thanks a lot!!
@RajarshiGhoshal 4 years ago
I like your videos, dude. Thank you for creating them!
@ritvikmath 4 years ago
Glad you like them!
@9okku 2 years ago
This is great. You should definitely continue with reinforcement learning applications!!!
@kisholoymukherjee 2 years ago
You are a great teacher indeed.
@bhargav975 3 years ago
This was an excellent video. Thanks.
@ritvikmath 3 years ago
Glad it was helpful!
@Mars.2024 8 months ago
Always the best 👌 I hope you design an RL course one day. It will definitely be one of the best 🌝
@AdityaMehta1307 1 month ago
The video was very useful, thank you!
@jjsharat5895 3 years ago
Thank you very much. You made it very easy to understand
@ritvikmath 3 years ago
You are welcome!
@thevoyager7675 1 year ago
That's an amazing explanation!
@XShollaj 3 years ago
Wonderful explanation
@ritvikmath 3 years ago
Glad you think so!
@beattoedtli1040 3 years ago
That's probably Hoeffding's inequality. Maybe the name sounds strange, but it nevertheless deserves to be spelled correctly!
@beaastbabo9820 4 years ago
Hey, thanks a lot for the explanations! Maybe you can make a third video about random and directed exploration. There are a lot more models like UCB :)
@kostyamamuli1999 2 years ago
Great explanation, bro!
@zameelvisharathodi7859 3 years ago
Awesome explanation. Thanks a lot
@imagenitin 3 years ago
The last option (n=100) is akin to real life. There are so many things to do and choose from in a short time. Exploiting is a better strategy to reduce regret. Make the most of what you've got!
@gopsda 3 years ago
Nicely explained, thanks.
@ziggyzagh 2 years ago
Thank you so much, you explained that very well
@abhijitji 3 years ago
Your videos are only getting better! Thank you very much. Is the restaurant's happiness score equivalent to the rewards delivered?
@itsdazeus 3 years ago
Keep up the good work!!
@bregottmannen2706 3 years ago
Are there any models that factor in staleness? I would imagine going to the same restaurant 297 days in a row would be pretty boring, so the optimal strategy should include the other restaurants every once in a while.
@ankittiwari1910 3 years ago
Very helpful!! Just want to know: if we don't have any prior info about the happiness distribution of each restaurant, how do we use this UCB algorithm? In a total cold-start problem, what parameters would help decide the happiness distribution of the restaurants in the city?
@javierredondo 9 months ago
Wouldn't the averages have to be within a specific range (e.g. [0,1])? Considering the explanation in the video, if the means are on the order of thousands, the bound would have practically no effect on the decision. Please correct me if this is not correct. Thanks!
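A minimal sketch of the scaling concern raised in the comment above. The video's exact formula is not quoted in this thread, so this assumes the commonly stated UCB1 index, sample mean plus sqrt(2 ln t / n_i), which is derived for rewards in [0, 1]. The `reward_range` parameter and all numbers are hypothetical, chosen only to show how a bonus of order 1 gets swamped by means in the thousands, and how multiplying the bonus by an assumed reward range restores its influence.

```python
import math


def ucb1_index(mean, pulls, t, reward_range=1.0):
    """Sample mean plus the exploration bonus, scaled by an assumed reward range."""
    bonus = reward_range * math.sqrt(2.0 * math.log(t) / pulls)
    return mean + bonus


# Hypothetical numbers: two restaurants whose rewards are in the thousands.
t = 100
well_known = ucb1_index(mean=1005.0, pulls=50, t=t)
rarely_tried = ucb1_index(mean=1000.0, pulls=2, t=t)
# With an unscaled bonus, the small gap in means decides the choice.
print("unscaled picks well-known arm:", well_known > rarely_tried)

# Same data, but the bonus is multiplied by an assumed reward range of 1000.
well_known_scaled = ucb1_index(1005.0, 50, t, reward_range=1000.0)
rarely_tried_scaled = ucb1_index(1000.0, 2, t, reward_range=1000.0)
# Now the rarely tried arm gets a meaningful exploration boost.
print("scaled picks rarely tried arm:", rarely_tried_scaled > well_known_scaled)
```

An equivalent choice is to rescale the rewards themselves into [0, 1] before averaging; either way the bound needs a known (or assumed) range to be meaningful.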
@mr-shady 2 years ago
Hi, first of all, very well put together video! One question: in the exploitation approach in the example, we visited each restaurant once (n times in total) and then continued with the best observed one for the remaining 300 - n days, right? Also, I find it quite surprising that exploitation only outperforms UCB1 for larger n; intuitively it seems that the exploitation-only approach is less stable and more up to chance (it may perform worse than even exploration only). I guess the second term based on Hoeffding's inequality really punishes UCB1 in the example 🤔
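The comparison described in the comment above can be sketched as a small simulation. This is an illustration under stated assumptions, not the video's code: the exploitation strategy is taken to be "visit each restaurant once, then stay with the best observed one", UCB1 is assumed to use the sqrt(2 ln t / n_i) bonus, and the Gaussian happiness model, the means, and the standard deviation are all hypothetical.

```python
import math
import random


def exploit_only(means, std, days, rng):
    """Visit each restaurant once, then stick with the best observed one."""
    n = len(means)
    first_visits = [rng.gauss(m, std) for m in means]
    best = max(range(n), key=lambda i: first_visits[i])
    total = sum(first_visits)
    for _ in range(days - n):
        total += rng.gauss(means[best], std)
    return total


def ucb1(means, std, days, rng):
    """Try every restaurant once, then pick the highest mean-plus-bonus index."""
    n = len(means)
    counts = [0] * n
    sums = [0.0] * n
    total = 0.0
    for t in range(1, days + 1):
        if t <= n:
            arm = t - 1  # initial pass over all arms
        else:
            arm = max(
                range(n),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = rng.gauss(means[arm], std)
        counts[arm] += 1
        sums[arm] += reward
        total += reward
    return total, counts


# Hypothetical setup: 10 restaurants, 300 days, averaged over 50 runs.
rng = random.Random(0)
true_means = [rng.uniform(0.0, 1.0) for _ in range(10)]
runs = 50
avg_exploit = sum(exploit_only(true_means, 0.5, 300, rng) for _ in range(runs)) / runs
avg_ucb1 = sum(ucb1(true_means, 0.5, 300, rng)[0] for _ in range(runs)) / runs
print("average total happiness, exploit only:", round(avg_exploit, 1))
print("average total happiness, UCB1:", round(avg_ucb1, 1))
```

Varying the number of restaurants, the noise level, and the reward scale is the quickest way to see when one strategy overtakes the other; the result depends heavily on those assumed values.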
@rajatchandiwala981 4 years ago
This is super cool! Thanks :)
@eitan166 3 years ago
Mindblowing!
@thinkanime1 1 year ago
Another very good video.
@ritvikmath 1 year ago
Glad you enjoyed it
@shahulrahman2516 7 months ago
Clear explanation
@hasaanijaz145 3 years ago
Kindly upload a video about Thompson Sampling as well! Exam in 4 days
@shivamdubey4783 3 years ago
Great tutorial, brother. Can you make a lecture on the UCB1 derivation?
@haneulkim4902 3 years ago
Hi, MAB seems to be inefficient when there are lots of arms. One way to calculate the q-value for multiple arms using a single model is by using contextual bandits. Could you explain how a contextual bandit does this? I cannot understand how one model outputs q-values for multiple arms.
@kisholoymukherjee 2 years ago
One additional question: can this be solved as an optimization problem?
@Archer-kx5ct 7 months ago
You're the goat
@shubh4319 2 years ago
Can you make a video on contextual bandits?
@AshokKumar-lk1gv 4 years ago
Very nice lecture
@lucaspecina 4 years ago
Nice!! Thank you
@ritvikmath 4 years ago
No problem!
@ebtehalturki6329 3 years ago
PERFECT !!!
@jclopez5447 4 years ago
First, I love your channel!
@cambridgebreaths3581 4 years ago
Hi. I have just watched a couple of your videos and couldn't resist the temptation to subscribe and binge on all the materials. Very impressed by the intuitiveness of your approach. May I ask if you have or recommend any materials to intuitively understand epsilon machines and the CSSR algorithm? Utterly grateful for your reply.
@rahulshah6119 9 months ago
To use Hoeffding's inequality, the rewards need to be bounded. Why do we see that here?
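For reference, the boundedness requirement mentioned above enters through the standard statement of Hoeffding's inequality, and the usual UCB1 bonus follows from it. The derivation below is the textbook version, not a transcription of the video; the choice of failure probability t^{-4} is the conventional one and is an assumption about what the video uses.

```latex
% Hoeffding's inequality for i.i.d. rewards X_1, \dots, X_n \in [a, b]
% with true mean \mu and sample mean \bar{X}_n:
\[
  P\left(\mu - \bar{X}_n \ge \varepsilon\right)
  \le \exp\left(-\frac{2 n \varepsilon^2}{(b - a)^2}\right)
\]
% For rewards in [0, 1] (so b - a = 1), setting the right-hand side to t^{-4}:
\[
  \exp\left(-2 n \varepsilon^2\right) = t^{-4}
  \quad\Longrightarrow\quad
  \varepsilon = \sqrt{\frac{2 \ln t}{n}}
\]
```

If the rewards are unbounded (for example Gaussian), the inequality in this form does not apply directly, which is why the bound is usually stated for rewards scaled into a fixed interval.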
@hp4000 4 years ago
Thank you!
@MsKisshello 3 years ago
Perfect!
@herzallahsaich4068 3 years ago
Thanks
@shiyuwang 2 years ago
After seeing this video I decided not to continue exploration
@АртемЩербаков-ш5э 2 years ago
nice :)
@김하준-j4h 2 years ago
cannot believe
@shravankalyankar9194 2 years ago
Everything was explained in a hurry; by the time I reached the end of the video I had already forgotten what you said at the start, and watching it again and again is not helping either. Please put the other formulas on the whiteboard as well and work through a bit of the calculation manually, so that the ideas and concepts have time to sink into our brains. Running to the end won't help learners.