Hyperloglog Explained | Counting things at scale.

11,545 views

Core Dump

Comments: 20
@TriNguyen-xi8ji 4 months ago
I find this more intuitive and helpful than the video that has more than a million views recommended to me. May your channel be blessed by the algorithm.
@core_dump 4 months ago
Thanks!
@bigpopakap a year ago
Fantastic and simple explanation of this algorithm! At first, hyperloglog sounds like magic. But now I can fully understand the core principles that make it work. It's ingenious!
@core_dump a year ago
Thanks Kapil! ❤️ Means a lot.
@dytundesu a month ago
Very cool explanation, bro
@SauravSahu01 2 years ago
3:46 How do we decide to use only the first 2 bits rather than the first 3 bits? Or does it not matter much for the final mean?
@core_dump 2 years ago
There is a tradeoff to make. The first n bits you take determine the number of buckets (2^n), so more bits give you more data points and a better average. But that leaves fewer remaining bits, which makes the count in each bucket less accurate. You have to run tests and experiments on your own data to find the balance.
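To make the bucket tradeoff easier to experiment with, here is a minimal Python sketch. It is not the exact procedure from the video: it assumes a 64-bit hash taken from SHA-256 and uses the standard HyperLogLog estimator (harmonic mean with a bias constant and a small-range correction) rather than a plain average of the bucket counts. The name `hll_estimate` and the parameter `p` (number of leading bits used as the bucket index) are illustrative.

```python
import hashlib
import math


def hll_estimate(items, p):
    """Estimate the number of distinct items using 2**p buckets.

    Assumption: hash = first 64 bits of SHA-256 of str(item).
    """
    if not 4 <= p <= 16:
        raise ValueError("p must be between 4 and 16 in this sketch")
    m = 1 << p                      # number of buckets
    registers = [0] * m
    rest_bits = 64 - p              # bits left after the bucket index

    for item in items:
        digest = hashlib.sha256(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        bucket = h >> rest_bits                 # first p bits pick the bucket
        rest = h & ((1 << rest_bits) - 1)       # remaining bits
        # rank = leading zeros in the remaining bits, plus one
        rank = rest_bits - rest.bit_length() + 1
        if rank > registers[bucket]:
            registers[bucket] = rank

    # Bias-correction constant from the standard HyperLogLog estimator.
    alpha = {16: 0.673, 32: 0.697, 64: 0.709}.get(m, 0.7213 / (1 + 1.079 / m))
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)

    # Small-range correction: fall back to linear counting on empty buckets.
    zeros = registers.count(0)
    if raw <= 2.5 * m and zeros > 0:
        return m * math.log(m / zeros)
    return raw


if __name__ == "__main__":
    data = range(100_000)
    for p in (4, 8, 12):
        print(p, round(hll_estimate(data, p)))
```

Running the loop at the bottom with different values of `p` on your own data is one way to see how the estimate tightens as the number of buckets grows.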
@Ali42374 6 months ago
At 03:56, the mean should be (2+0+1+2)/4 = 1.25. Why did you leave out bucket 2, which has a count of 1? That would narrow the answer down to 4 unique entries.
@rahulsbytes 6 months ago
The number of leading zeros in the remaining bits is 0.
@ph6295 6 months ago
I'd like to know too.
@jeevan999able a year ago
Superb explanation, thank you
@BernhardBB8 a year ago
I'm not fully getting it: why can't I just count the elements, instead of assigning a random number to each element and analysing that number?
@PravinDahal a year ago
Not memory efficient. To see if the new entry has been seen before, you'll have to store and check against all that have come so far.
@anishkelkar6434 a year ago
I don't think it's about storage either. When you get a new value, you have to search the set to see whether it already exists, which would be O(log n) for extremely large sets, since a hash set wouldn't be possible. So we are optimizing the membership check by using this approximate data structure.
@GameSteals 2 months ago
We cannot simply *count* the elements because of the constraint where we only need to count *unique* elements. How do we ensure that we are not counting the same element twice while counting? To ensure that, we would need to maintain some map or a set, and that is where the problem lies since storing a billion elements in a set is costly, and is not feasible. That is where this algorithm comes in to give an estimate on the number of unique values. If the problem was simply counting the number of occurrences of elements, we would not have needed this, and simply maintaining a count would be enough.
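The exact approach described above can be sketched in a few lines of Python. This is only an illustration of why the set is the costly part: the function name `count_distinct_exact` is hypothetical, and `sys.getsizeof` reports only the set's own hash table, not the stored elements, so real memory use is higher still.

```python
import sys


def count_distinct_exact(stream):
    """Count unique items exactly by remembering every item seen so far."""
    seen = set()
    for item in stream:
        seen.add(item)          # duplicates are ignored by the set
    # Return the exact count and the size in bytes of the set's table.
    return len(seen), sys.getsizeof(seen)


if __name__ == "__main__":
    for n in (1_000, 100_000):
        count, size = count_distinct_exact(range(n))
        print(n, count, size)
```

The count is always exact, but the set grows with the number of unique items, whereas a HyperLogLog structure keeps a fixed number of small bucket counters no matter how many items pass through.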
@tonyennis1787 a year ago
0:26 At least three of those names are chess grandmasters. 1:40 there's another. Where is Anand, we wonder...
@core_dump a year ago
😰
@PrathamShah-y6m 7 months ago
We need Gukesh now
@core_dump 6 months ago
@PrathamShah-y6m haha yes! Gukesh for WC!
@magui2277 a year ago
Very clear, thank you