I'm 15 minutes in and find this to be one of the best technical talks I've ever seen.
@BoostCon 8 months ago
Thank you so much for your appreciative comment!
@OmarChida 3 years ago
One of the best talks I have ever seen. I quite like his scientific approach.
@conradomartinezp 2 years ago
As long as you have a way to resolve ties in Robin Hood that depends only on the identity of the keys, e.g., the smaller key stays at its slot and the larger continues probing, the layout of the hash table after N inserts will be exactly the same, independent of the order of insertion. If ties are resolved by time of insertion, then a cluster of synonyms (equal hash value) will be ordered by time of insertion. Robin Hood does not reduce the average time for a successful search, but it slightly reduces the variance (even in almost-full tables), and it also reduces the expected longest probe sequence from O(log n) to O(log log n). It's nice to see that it also does well in practice 😉
@dennisrkb 5 years ago
You explained Google's map much better than they did themselves.
@skepticmoderate5790 5 years ago
They explained it pretty well...
@TheOnlyFeanor 6 years ago
This was an excellent talk, thank you.
@lenkite 4 years ago
Wonderful and enlightening talk, especially the zooms at various N levels. So std::map is great at < 400 elements! I only wish the size of the element was also shown.
@leleo53000 4 years ago
It's using `std::map`; the size of the element is probably 4
@excitedbox5705 4 years ago
@@leleo53000 I think he said 8 later on but that might not apply to that.
@amitm1157 5 years ago
Nice explanation about cache hits and misses while using hash tables.
@KilgoreTroutAsf 5 years ago
300 nanoseconds amortized sounds atrocious for a cache miss on modern hardware. Could it actually be 300 CPU cycles instead? That would fit the typical 50-80 ns latency for DDR4 at a 3-4 GHz CPU frequency. Either that, or each miss results in about 4 accesses to RAM, with the cost of CPU instructions (~0.25 ns per cycle) being too small to matter.
@Voy2378 6 years ago
Benchmarking only on int keys is deceptive. If you have std::string or any other type with an expensive operator==, a ridiculously high load factor will kill you.
@KilgoreTroutAsf 5 years ago
A good implementation for large or variable-length keys should never store keys and hashes together, because that completely negates cache locality and memory-alignment trickery. The better solution is two-level indirection: use the hash table to store (hash, pointer) or (hash, offset) pairs, where the offsets point into a flat array or other memory pool for the keys and are just linearly incremented after each addition. Also, the whole point of hash tables is to use a good universal hash function that digests the object into a fixed-length string of enough bits to guarantee a very low collision probability. The final test for true equality should only rarely happen more than once.
@excitedbox5705 4 years ago
Insert your strings into an array and use the indices as your int keys. Then when you retrieve the key you can use stringArray[key], which only adds minor overhead.
@andrewgrant788 2 years ago
Agreed. The comparison between std::map and std::unordered_map performance for small values of N is very misleading when benchmarked with int keys. Because std::map is a tree, std::map::find() does a lot of < comparisons of keys, which is very cheap for ints and much more expensive for strings or more complex keys. unordered_map first hashes the key to determine the bucket and then compares only the keys in the same bucket. Keys can be pre-hashed to further optimize performance.
@ДаниилМатвеев-ю5т 7 months ago
Thank you! Best talk I have seen on this topic😊
@BoostCon 7 months ago
Thank you! So pleased to hear that you found the presentation informative.
@origamibulldoser1618 11 months ago
This is mad fascinating.
@77oxf 4 years ago
Why is std::map faster for fewer elements?
@TheJGAdams 5 months ago
They're O(log n)
@ratchet1freak 6 years ago
I'm pretty sure you could significantly improve the Robin Hood adaptation of Google's map implementation if you treat each 16-element bucket as a target of the hash, instead of the individual elements within the buckets. Then the Robin Hood variant can use just 2 bits for the distance metadata (values 0-2 for bucket distance, 3 for empty) and keep 6 bits for the hash pre-filter.
@TheLavaBlock 6 years ago
Great talk! Thanks
@wiadroman 5 years ago
Great material, thanks a lot.
@Sychonut 2 years ago
The content was great and much appreciated, but damn, at several points throughout it felt like a nerd fest, with people trying to show off to one another how smart they are. It's just not a good look.
@roman9509 6 years ago
cool
@ABaumstumpf 5 years ago
If only the real world were that simple - a hashmap from int to int, gosh, how I would like that. Instead we normally have things like maps of maps of vectors and such.
@blarghblargh 4 years ago
His benchmarks are open source. Go prove that his map works worse for those cases, and show us you actually know what you're talking about.
@muonline2067 2 years ago
@@blarghblargh My question is: where can I download his unordered_map for testing?
@joestevenson5568 6 months ago
Great talk, terrible audience
@zyxyuv1650 6 years ago
Obvious German
@seditt5146 5 years ago
Really? Because I thought he sounded as though he had a slight Irish accent, not a German one.