I find this more intuitive and helpful than the million-view video that was recommended to me. May your channel be blessed by the algorithm.
@core_dump 4 months ago
Thanks!
@bigpopakap a year ago
Fantastic and simple explanation of this algorithm! At first, hyperloglog sounds like magic. But now I can fully understand the core principles that make it work. It's ingenious!
@core_dump a year ago
Thanks Kapil! ❤️ Means a lot.
@dytundesu a month ago
very cool explanation bro
@SauravSahu01 2 years ago
3:46 How do we determine that we need to consider only the first 2 bits, not the first 3? Or does it not matter much for the final mean output?
@core_dump 2 years ago
So there is a tradeoff you have to make: the first n bits you take determine the number of buckets you will have, giving you multiple data points to make the average better, but that leaves fewer bits in the rest of the binary number and makes the count in each bucket less accurate. So you have to run tests and experiments on your own data to decide the balance.
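A minimal sketch of that split, assuming a 32-bit hash and 2 bucket bits (both values are illustrative, not necessarily the video's exact parameters):

```python
import hashlib

HASH_BITS = 32  # illustrative hash width
B = 2           # first B bits pick the bucket -> 2**B = 4 buckets

def bucket_and_rank(item: str):
    """Split a 32-bit hash into a bucket index and a leading-zero count."""
    # Derive a 32-bit integer hash from the item (hash choice is illustrative).
    h = int.from_bytes(hashlib.sha1(item.encode()).digest()[:4], "big")
    bucket = h >> (HASH_BITS - B)             # first B bits pick the bucket
    rest = h & ((1 << (HASH_BITS - B)) - 1)   # remaining 30 bits
    leading_zeros = (HASH_BITS - B) - rest.bit_length()
    return bucket, leading_zeros

print(bucket_and_rank("alice"))
```

Raising B gives more buckets to average over, but leaves fewer bits for the leading-zero count, which is exactly the tradeoff described above.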
@Ali42374 6 months ago
At 03:56, shouldn't the mean be (2+0+1+2)/4 = 1.25? Why did you skip bucket 2, which has a count of 1? That would narrow the answer down to 4 unique entries.
@rahulsbytes 6 months ago
The count of leading zeros in the remaining bits for that bucket is 0.
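As a worked version of the arithmetic in this thread (the 2**mean scaling is an assumption; real HyperLogLog uses a harmonic mean plus a bias-correction constant, which this sketch omits):

```python
# Per-bucket leading-zero counts quoted in the comment above.
counts = [2, 0, 1, 2]

# Arithmetic mean over the 4 buckets, as computed in the comment.
mean = sum(counts) / len(counts)
print(mean)  # 1.25

# A crude LogLog-style estimate scales 2**mean by the bucket count;
# the video's exact estimate formula may differ.
print(len(counts) * 2 ** mean)  # ~9.51
```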
@ph6295 6 months ago
I'd like to know that too.
@jeevan999able a year ago
Superb explanation, thank you
@BernhardBB8 a year ago
I'm not fully getting it: why can I not just count the elements, instead of assigning a random number to each element and analysing that number?
@PravinDahal a year ago
Not memory efficient. To see if a new entry has been seen before, you'd have to store and check against all the entries that have come so far.
@anishkelkar6434 a year ago
I don't think it's only about storing either: when you get a new value in the stream, you have to search the set to see if it already exists, which would be log n for extremely large sets since a hash set wouldn't be feasible. So we are optimizing away that membership check by keeping this approximate data structure instead.
@GameSteals 2 months ago
We cannot simply *count* the elements because of the constraint that we only need to count *unique* elements. How do we ensure that we are not counting the same element twice? To do that, we would need to maintain some map or set, and that is where the problem lies, since storing a billion elements in a set is costly and not feasible. That is where this algorithm comes in, giving an estimate of the number of unique values. If the problem were simply counting the number of occurrences of elements, we would not have needed this; maintaining a simple count would be enough.
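A sketch of that contrast, using Python's built-in set for the exact count and a fixed register array standing in for the HyperLogLog sketch (sizes are illustrative):

```python
import sys

# Exact unique count: the set grows with the number of distinct elements.
exact = set()
for i in range(1_000_000):
    exact.add(f"user-{i}")
print(len(exact), "unique;", sys.getsizeof(exact), "bytes for the set alone")

# HyperLogLog-style sketch: memory is fixed at m small registers,
# no matter how many elements stream past.
m = 1024                   # illustrative bucket count
registers = bytearray(m)   # one byte per bucket is enough for the max count
print(m, "registers;", sys.getsizeof(registers), "bytes, independent of input size")
```

Note that sys.getsizeof reports only the set's own table, not the stored string objects, so the real memory gap is even larger than what this prints.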
@tonyennis1787 a year ago
0:26 At least three of those names are chess grandmasters. 1:40 there's another. Where is Anand, we wonder...