Hyperloglog Explained | Counting things at scale.

  Views: 12,160

Core Dump

1 day ago

Comments: 22
@TriNguyen-xi8ji 6 months ago
I find this more intuitive and helpful than the video that has more than a million views recommended to me. May your channel be blessed by the algorithm.
@core_dump 6 months ago
Thanks!
@bigpopakap 2 years ago
Fantastic and simple explanation of this algorithm! At first, hyperloglog sounds like magic. But now I can fully understand the core principles that make it work. It's ingenious!
@core_dump 2 years ago
Thanks Kapil! ❤️ Means a lot.
@Ali42374 8 months ago
At 03:56, the mean should be (2+0+1+2)/4 = 1.25. Why did you leave out bucket 2, which has a count of 1? Including it would narrow the answer down to 4 unique entries.
@rahulsbytes 8 months ago
The number of leading zeros in the remaining bits is 0.
@ph6295 7 months ago
I'd like to know too.
@SauravSahu01 2 years ago
3:46 How do we determine that we should consider only the first 2 bits, not the first 3 bits? Or does it not matter much for the final mean?
@core_dump 2 years ago
There is a tradeoff you have to make. The first n bits that you take determine the number of buckets you will have (2^n), and more buckets give you more data points, which makes the average better. But taking more bits makes the rest of the binary number smaller, which makes the count in each bucket less accurate. So you have to run tests and experiments on your own data to find the balance.
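To experiment with the tradeoff described in this reply, here is a minimal Python sketch. It is an assumption-laden illustration, not the code from the video: the function name `hll_estimate`, the use of SHA-1 as the hash, and the 64-bit hash width are all hypothetical choices. It also uses the standard HyperLogLog estimator (a harmonic mean with bias and small-range corrections), which may differ from the simplified mean shown in the video. The parameter `p` is the number of leading bits used to pick a bucket.

```python
import hashlib
import math


def hll_estimate(items, p):
    """Estimate the number of distinct items using 2**p buckets (a sketch)."""
    m = 1 << p                      # number of buckets
    registers = [0] * m
    for item in items:
        # 64-bit hash taken from SHA-1 (an assumption; any well-mixed hash works)
        digest = hashlib.sha1(str(item).encode()).digest()
        h = int.from_bytes(digest[:8], "big")
        bucket = h >> (64 - p)                 # first p bits pick the bucket
        rest = h & ((1 << (64 - p)) - 1)       # remaining 64 - p bits
        # rank = leading zeros in the remaining bits, plus one
        rank = (64 - p) - rest.bit_length() + 1
        registers[bucket] = max(registers[bucket], rank)

    # Bias-correction constants from the HyperLogLog paper
    if m == 16:
        alpha = 0.673
    elif m == 32:
        alpha = 0.697
    elif m == 64:
        alpha = 0.709
    else:
        alpha = 0.7213 / (1 + 1.079 / m)   # intended for m >= 128

    raw = alpha * m * m / sum(2.0 ** -r for r in registers)

    # Small-range correction: fall back to linear counting
    # while some buckets are still empty.
    zeros = registers.count(0)
    if raw <= 2.5 * m and zeros:
        return m * math.log(m / zeros)
    return raw
```

Calling it on the same data with different values of `p` (for example 4, 8, and 12) shows how the bucket count affects the estimate. The typical relative error of HyperLogLog is about 1.04 / sqrt(2^p), so more buckets generally help as long as enough hash bits remain for the leading-zero count.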
@dytundesu 3 months ago
Very cool explanation, bro
@BernhardBB8 1 year ago
I'm not fully getting it: why can't I just count the elements, instead of assigning a random number to each element and analysing that number?
@PravinDahal 1 year ago
Not memory efficient. To see if the new entry has been seen before, you'll have to store and check against all that have come so far.
@anishkelkar6434 1 year ago
I don't think it's about storing either. When a new value comes in, you have to search the set to see if it already exists, which would be O(log n) for extremely large sets, as a hash set wouldn't be possible. So we are optimizing the process of checking a value against the set by keeping this approximate data instead.
@GameSteals 4 months ago
We cannot simply *count* the elements because of the constraint where we only need to count *unique* elements. How do we ensure that we are not counting the same element twice while counting? To ensure that, we would need to maintain some map or a set, and that is where the problem lies since storing a billion elements in a set is costly, and is not feasible. That is where this algorithm comes in to give an estimate on the number of unique values. If the problem was simply counting the number of occurrences of elements, we would not have needed this, and simply maintaining a count would be enough.
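The exact approach discussed in this thread can be sketched in a few lines of Python. The function name `count_unique_exact` is hypothetical, and `sys.getsizeof` reports only the size of the set's own hash table in CPython, not the stored elements, so the real memory cost is higher than the number shown.

```python
import sys


def count_unique_exact(stream):
    """Count distinct items exactly by remembering every item seen so far."""
    seen = set()
    for item in stream:
        seen.add(item)          # duplicates are ignored by the set
    # Return the exact count and the size in bytes of the set's table alone
    return len(seen), sys.getsizeof(seen)
```

The count is exact, but the set grows with the number of distinct items, which is the cost the comments above point to. An approximate sketch such as HyperLogLog keeps a fixed number of small registers no matter how many distinct items pass through.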
@jeevan999able 1 year ago
Superb explanation, thank you
@magui2277 1 year ago
Very clear, thank you
@tonyennis1787 1 year ago
0:26 At least three of those names are chess grandmasters. 1:40 there's another. Where is Anand, we wonder...
@core_dump 1 year ago
😰
@PrathamShah-y6m 9 months ago
We need Gukesh now
@core_dump 8 months ago
@PrathamShah-y6m haha yes! Gukesh for WC!
@catharsis7629 1 month ago
@core_dump Aged very nicely lol
@core_dump 1 month ago
@catharsis7629 I was so so happy seeing him win! Guki FTW!