Reservoir Sampling | Algorithm R | Interviewing Questions

  Рет қаралды 6,416

Try and Catch Jeeves

Try and Catch Jeeves

Күн бұрын

Пікірлер: 36
@krzychu1463
@krzychu1463 2 жыл бұрын
OMG. It's exactly what I needed for my project!! Thank you for this vid!! At first I had giant problem with understanding why the code is correct but after the proofs and banging my head, now I got it. Thank you so much 🙂
@davidlu6265
@davidlu6265 3 жыл бұрын
Bit late here but your explanation at 11:05 and on helped me hella. Hope you make more videos on confusing topics like this in the future
@animatedzombie64
@animatedzombie64 3 жыл бұрын
look, how cool is this guy.
@varunshenoy532
@varunshenoy532 2 жыл бұрын
Nice video, I really like your thought process of this not being a fair question.
@seal0118
@seal0118 3 жыл бұрын
more coding content? i really like your thought process and personally think there is good things to learn from you
@jz613
@jz613 4 жыл бұрын
Great video dude, you laid out all the background really well and explained everything super clearly, thanks!
@tryandcatchjeeves
@tryandcatchjeeves 4 жыл бұрын
Thank you so much! Good luck on your studying and/or interviews!
@sandeepsuresh1162
@sandeepsuresh1162 4 жыл бұрын
Thanks, Jeeves! Needed to find a good explanation for this :)
@ninjacoder2109
@ninjacoder2109 4 жыл бұрын
Thanks for explaining the intuition part for solving this problem. That was really helpful!
@tryandcatchjeeves
@tryandcatchjeeves 4 жыл бұрын
Thank you! I'm glad it helped, I was struggling with it too haha
@yifatamir4743
@yifatamir4743 2 жыл бұрын
this is great!!
@pranav7471
@pranav7471 2 жыл бұрын
Isn't this useful for minibatch creation? I mean mini batch SGD, Adam etc. and other optimization algorithms expect each mini batch to be a "unbiased random" sample for the convergence to occur, useful especially when training is happening on a live stream of data. The question makes sense for ML Engineering or Research roles, probably not for developers though Gr8 content and explanation!
@clairegu8969
@clairegu8969 2 жыл бұрын
Given a pool of N lottery tickets, where the lottery tickets dispatched within X stores (each store might get different numbers of lottery tickets). Does the probability of winning the lottery by purchasing ONE ticket change if we buy lottery ticket from different stores?
@arresteddevelopment2126
@arresteddevelopment2126 3 жыл бұрын
Hi Jeeves. Great explanation. I have a doubt. If the stream is too large to fit in the memory, how is the heapq.nlargest() thing going to work?
@JustCountPixels
@JustCountPixels 4 жыл бұрын
Great Video!
@tryandcatchjeeves
@tryandcatchjeeves 3 жыл бұрын
Thank you!
@benlucas8109
@benlucas8109 3 жыл бұрын
I have a question about the code. What if i=2 and k=5, the random_index must be smaller than k. So, does it mean that you will have the same element in the reservoir twice? one by appending, one by replacing?
@benlucas8109
@benlucas8109 3 жыл бұрын
I think in the first if state, where i < k, "continue" should be added to avoid executing code for i > k
@tryandcatchjeeves
@tryandcatchjeeves 3 жыл бұрын
​@@benlucas8109 Bullocks! You're right! yeah the array could also be initialized with the first k elements and then continue the algorithm. Maybe the proofs would have to be adjusted too to show that this indeed correct.
@EpiKuhL
@EpiKuhL 4 жыл бұрын
Very organized video! Can you please help me understand why your heap solution would be O(n + nlogk)? Wouldn't it simply be O(nlogk), since for each of the n elements in the stream, we are maintaining a heap with k elements (which takes log(k) each for each element of the stream)? Assigning each element of the stream a random key (random.random()) should be an O(1) operation, AFAIK.
@tryandcatchjeeves
@tryandcatchjeeves 4 жыл бұрын
Thank you! Yes you're correct (and so am I?) I apologize if it caused you any confusion. I left it as O(n + nlogk) to simply show (and also separate in my mind) that you are indeed iterating over the entire stream of size N and each time you do so you are do a log K operation (insertion into a fixed-size heap) a grand total of N times. That said, the asymptotic running time of this algorithm is supposed to be linear O(n), because depending on the context, it is safe to assume k is a small number independent of n (like the integer 5 haha). You can safely write it off the operation. In other contexts, if K was sufficiently large something like k=N/3, because you actually wanted a decent sample size or something, then O(N log K) seems more appropriate like you said.
@EpiKuhL
@EpiKuhL 4 жыл бұрын
​@@tryandcatchjeeves Perfect, that makes sense! Yes I also wonder in what scenarios this algorithm is actually implemented and what typical values of K would be. But in terms of big O, I am on board with everything you said. Thank you for clarifying!
@anandapoudel5767
@anandapoudel5767 4 жыл бұрын
@@EpiKuhL Did you come here from leetcode's december challenge?
@EpiKuhL
@EpiKuhL 4 жыл бұрын
@@anandapoudel5767 yes I did haha! I expect you did too? I had never heard of reservoir sampling prior to today’s problem
@bloggermitavii
@bloggermitavii 4 жыл бұрын
Hey Jeeves, I am not sure if you would take any requests but I would like to see a video on algorithm L for reservoir sampling.
@tryandcatchjeeves
@tryandcatchjeeves 4 жыл бұрын
@Amitav Hello there! Well you'd be the first to request something, and I'd be happy to do it. If you have any questions in particular that would help make something tailored to what would be most helpful to you and people like you with similar questions, then please let me know!
@bloggermitavii
@bloggermitavii 4 жыл бұрын
@@tryandcatchjeeves Algorithm R is subtle to develop to an intuition for. Algorithm L 's random number generation part was a bit unclear.
@DontTakeCrack
@DontTakeCrack 3 жыл бұрын
i just bombed my interview with this question and here i am. and cool, it's just math/statistics. i don't feel so bad anymore. why was i asked this :(
@tryandcatchjeeves
@tryandcatchjeeves 3 жыл бұрын
Sorry to hear that and you're not alone! haha same thing happened to me. Imho, it's a terrible interview question, doesn't showcase any relevant skills, and can leave the candidate dejected. I hope you have better luck and crush your upcoming job interviews!
@jayeshborgaonkar9166
@jayeshborgaonkar9166 4 жыл бұрын
great video
@tryandcatchjeeves
@tryandcatchjeeves 4 жыл бұрын
Thanks so much Jayesh! Glad you found the video useful. Good luck on your interviews!
@seal0118
@seal0118 3 жыл бұрын
11:50 why do we replace the elements that we have picked? doesn't it mean that if i picked element of already existing k elements that it has "won" and should stay?
@tryandcatchjeeves
@tryandcatchjeeves 3 жыл бұрын
Hey great question, I hope you've already figured it out since this is a late reply, but it's a great question and something I was stuck on too. Here's a simple answer that I can elaborate on further: "We replace the already picked elements, because we want every element to have an equal probability of being chosen. If we were to keep the existing k elements that have 'won' so to speak, we give an unfair advantage to the elements that came first". It's easy to answer your question with a proof by contradiction. Let's say you just flipped a coin and with probability p=0.5, if the coin lands on heads then it was put into the reservoir of k elements, and said if an element was picked, it stays in the reservoir. Well then, every element until you fill up the reservoir has a probability 1/2 of being put into the reservoir (seems equal right?). Let's say k=5, N=100, and you flipped your coin 5 times and it ended up HHHHH; well then, the rest of the n - k elements (95 elements) in this scenario would have 0% probability of being chosen. Hence, the probability of each element being chosen was not equal to each other 0 != p. The question implies you're only allowed one-pass over a seemingly infinite stream and that you must do something to guarantee an equal probability to each element being chosen, so the point of replacing is making sure that as you increase the size of the elements you've seen to N, that you have given each element an equal probability of being chosen (1/N). You realize once you've chosen to place something in the reservoir, the probability of that staying in the reservoir has to then decrease as you're iterating! Because as you iterate, the probability of any element being chosen needs to be 1, 1/2, 1/3, 1/4, 1/5 .... 1/N when you stop. Hope this helped!
@seal0118
@seal0118 3 жыл бұрын
​@@tryandcatchjeeves i did actually figure out why that's the case as the only way to have equal probability is to replace but now i have another question which is: how's the probability of each element getting picked is 1/n, shouldn't it be k/n?
@nocgod
@nocgod 3 жыл бұрын
#AskJeeves :)
@tryandcatchjeeves
@tryandcatchjeeves 3 жыл бұрын
Gotten that all my life - never gets old haha
RANDOM PICK INDEX | LEETCODE # 398 | PYTHON RESERVOIR SAMPLING
14:56
Turn Off the Vacum And Sit Back and Laugh 🤣
00:34
SKITSFUL
Рет қаралды 7 МЛН
Hoodie gets wicked makeover! 😲
00:47
Justin Flom
Рет қаралды 138 МЛН
Deep Learning Interview Prep Course
3:59:50
freeCodeCamp.org
Рет қаралды 489 М.
Reservoir Sampling Algorithm
12:41
Coding Interview Prep
Рет қаралды 10 М.
[QISCA Joint Seminar] Supremacy with Complexity - HeeRyang Choi
1:07:03
SQRT (SNU Quantum Research Team)
Рет қаралды 54
Hands-On Power BI Tutorial 📊Beginner to Pro [Full Course] ⚡
3:05:45
Pragmatic Works
Рет қаралды 2,2 МЛН
Codeforces Round 667 (Div. 3) Stream + All Solutions (A-F) (+ extra)
3:55:56