OMG. It's exactly what I needed for my project!! Thank you for this vid!! At first I had giant problem with understanding why the code is correct but after the proofs and banging my head, now I got it. Thank you so much 🙂
@davidlu62653 жыл бұрын
Bit late here but your explanation at 11:05 and on helped me hella. Hope you make more videos on confusing topics like this in the future
@animatedzombie643 жыл бұрын
look, how cool is this guy.
@varunshenoy5322 жыл бұрын
Nice video, I really like your thought process of this not being a fair question.
@seal01183 жыл бұрын
more coding content? i really like your thought process and personally think there is good things to learn from you
@jz6134 жыл бұрын
Great video dude, you laid out all the background really well and explained everything super clearly, thanks!
@tryandcatchjeeves4 жыл бұрын
Thank you so much! Good luck on your studying and/or interviews!
@sandeepsuresh11624 жыл бұрын
Thanks, Jeeves! Needed to find a good explanation for this :)
@ninjacoder21094 жыл бұрын
Thanks for explaining the intuition part for solving this problem. That was really helpful!
@tryandcatchjeeves4 жыл бұрын
Thank you! I'm glad it helped, I was struggling with it too haha
@yifatamir47432 жыл бұрын
this is great!!
@pranav74712 жыл бұрын
Isn't this useful for minibatch creation? I mean mini batch SGD, Adam etc. and other optimization algorithms expect each mini batch to be a "unbiased random" sample for the convergence to occur, useful especially when training is happening on a live stream of data. The question makes sense for ML Engineering or Research roles, probably not for developers though Gr8 content and explanation!
@clairegu89692 жыл бұрын
Given a pool of N lottery tickets, where the lottery tickets dispatched within X stores (each store might get different numbers of lottery tickets). Does the probability of winning the lottery by purchasing ONE ticket change if we buy lottery ticket from different stores?
@arresteddevelopment21263 жыл бұрын
Hi Jeeves. Great explanation. I have a doubt. If the stream is too large to fit in the memory, how is the heapq.nlargest() thing going to work?
@JustCountPixels4 жыл бұрын
Great Video!
@tryandcatchjeeves3 жыл бұрын
Thank you!
@benlucas81093 жыл бұрын
I have a question about the code. What if i=2 and k=5, the random_index must be smaller than k. So, does it mean that you will have the same element in the reservoir twice? one by appending, one by replacing?
@benlucas81093 жыл бұрын
I think in the first if state, where i < k, "continue" should be added to avoid executing code for i > k
@tryandcatchjeeves3 жыл бұрын
@@benlucas8109 Bullocks! You're right! yeah the array could also be initialized with the first k elements and then continue the algorithm. Maybe the proofs would have to be adjusted too to show that this indeed correct.
@EpiKuhL4 жыл бұрын
Very organized video! Can you please help me understand why your heap solution would be O(n + nlogk)? Wouldn't it simply be O(nlogk), since for each of the n elements in the stream, we are maintaining a heap with k elements (which takes log(k) each for each element of the stream)? Assigning each element of the stream a random key (random.random()) should be an O(1) operation, AFAIK.
@tryandcatchjeeves4 жыл бұрын
Thank you! Yes you're correct (and so am I?) I apologize if it caused you any confusion. I left it as O(n + nlogk) to simply show (and also separate in my mind) that you are indeed iterating over the entire stream of size N and each time you do so you are do a log K operation (insertion into a fixed-size heap) a grand total of N times. That said, the asymptotic running time of this algorithm is supposed to be linear O(n), because depending on the context, it is safe to assume k is a small number independent of n (like the integer 5 haha). You can safely write it off the operation. In other contexts, if K was sufficiently large something like k=N/3, because you actually wanted a decent sample size or something, then O(N log K) seems more appropriate like you said.
@EpiKuhL4 жыл бұрын
@@tryandcatchjeeves Perfect, that makes sense! Yes I also wonder in what scenarios this algorithm is actually implemented and what typical values of K would be. But in terms of big O, I am on board with everything you said. Thank you for clarifying!
@anandapoudel57674 жыл бұрын
@@EpiKuhL Did you come here from leetcode's december challenge?
@EpiKuhL4 жыл бұрын
@@anandapoudel5767 yes I did haha! I expect you did too? I had never heard of reservoir sampling prior to today’s problem
@bloggermitavii4 жыл бұрын
Hey Jeeves, I am not sure if you would take any requests but I would like to see a video on algorithm L for reservoir sampling.
@tryandcatchjeeves4 жыл бұрын
@Amitav Hello there! Well you'd be the first to request something, and I'd be happy to do it. If you have any questions in particular that would help make something tailored to what would be most helpful to you and people like you with similar questions, then please let me know!
@bloggermitavii4 жыл бұрын
@@tryandcatchjeeves Algorithm R is subtle to develop to an intuition for. Algorithm L 's random number generation part was a bit unclear.
@DontTakeCrack3 жыл бұрын
i just bombed my interview with this question and here i am. and cool, it's just math/statistics. i don't feel so bad anymore. why was i asked this :(
@tryandcatchjeeves3 жыл бұрын
Sorry to hear that and you're not alone! haha same thing happened to me. Imho, it's a terrible interview question, doesn't showcase any relevant skills, and can leave the candidate dejected. I hope you have better luck and crush your upcoming job interviews!
@jayeshborgaonkar91664 жыл бұрын
great video
@tryandcatchjeeves4 жыл бұрын
Thanks so much Jayesh! Glad you found the video useful. Good luck on your interviews!
@seal01183 жыл бұрын
11:50 why do we replace the elements that we have picked? doesn't it mean that if i picked element of already existing k elements that it has "won" and should stay?
@tryandcatchjeeves3 жыл бұрын
Hey great question, I hope you've already figured it out since this is a late reply, but it's a great question and something I was stuck on too. Here's a simple answer that I can elaborate on further: "We replace the already picked elements, because we want every element to have an equal probability of being chosen. If we were to keep the existing k elements that have 'won' so to speak, we give an unfair advantage to the elements that came first". It's easy to answer your question with a proof by contradiction. Let's say you just flipped a coin and with probability p=0.5, if the coin lands on heads then it was put into the reservoir of k elements, and said if an element was picked, it stays in the reservoir. Well then, every element until you fill up the reservoir has a probability 1/2 of being put into the reservoir (seems equal right?). Let's say k=5, N=100, and you flipped your coin 5 times and it ended up HHHHH; well then, the rest of the n - k elements (95 elements) in this scenario would have 0% probability of being chosen. Hence, the probability of each element being chosen was not equal to each other 0 != p. The question implies you're only allowed one-pass over a seemingly infinite stream and that you must do something to guarantee an equal probability to each element being chosen, so the point of replacing is making sure that as you increase the size of the elements you've seen to N, that you have given each element an equal probability of being chosen (1/N). You realize once you've chosen to place something in the reservoir, the probability of that staying in the reservoir has to then decrease as you're iterating! Because as you iterate, the probability of any element being chosen needs to be 1, 1/2, 1/3, 1/4, 1/5 .... 1/N when you stop. Hope this helped!
@seal01183 жыл бұрын
@@tryandcatchjeeves i did actually figure out why that's the case as the only way to have equal probability is to replace but now i have another question which is: how's the probability of each element getting picked is 1/n, shouldn't it be k/n?