Hash Match Join Internals in SQL Server

Views: 27,676

Bert Wagner

Days ago

Comments: 23

@singhsVP 2 months ago

Bert, your explanation of the various processes is great.

@KristinaMyLife 5 years ago

Bert, this is the easiest and shortest explanation of Hash Match I've ever seen. Thank you very much!

@stan_likes_single_coils 5 years ago

Man, thank you for this series, videos are fantastic! Thank you for saying it as it is, not being vague or keeping it abstract. Your videos have high learning value! Keep up the good work!

@zanonilabuschagne7628 6 years ago

Wow, commitment, doing a video on the 2nd!! Hope you had a great festive period. Have a great 2019. Thanks for this series. Internals are always VERY interesting and exciting. You've set the bar pretty high; can't wait to see what comes next!!

@tinghsu9972 5 years ago

This video is so good. I was confused in my college database class. But you explain everything clearly! Thank you so much for making this video!

@James868198 5 years ago

I'm taking a database course. Your video really helps me understand this stuff.

@chazsmith20 2 years ago

OK, so it seems the advantage of a hash join over a nested loop is that once you've identified a match you don't have to keep iterating, since all tuples with that join condition are in that one bucket. Is that correct?
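A minimal sketch of the contrast this comment describes, in plain Python rather than anything SQL Server actually executes. The function names and the choice to count "comparisons" and "probes" are illustrative assumptions, and the inputs are bare join-key values instead of full rows. The nested loop compares every outer value with every inner value, while the hash version groups the build input by key once and then does a single bucket lookup per probe value.

```python
def nested_loop_join(outer, inner):
    """Compare every outer value with every inner value (hypothetical helper)."""
    comparisons = 0
    matches = []
    for a in outer:
        for b in inner:
            comparisons += 1
            if a == b:
                matches.append((a, b))
    return matches, comparisons


def hash_join(build, probe):
    """Group the build input by key, then look up each probe value once."""
    buckets = {}
    for a in build:
        # Build phase: all values with the same key land in the same bucket.
        buckets.setdefault(a, []).append(a)

    probes = 0
    matches = []
    for b in probe:
        # Probe phase: one lookup finds every build value with this key.
        probes += 1
        for a in buckets.get(b, []):
            matches.append((a, b))
    return matches, probes


if __name__ == "__main__":
    outer = [1, 2, 3, 4]
    inner = [3, 4, 5, 6]
    print(nested_loop_join(outer, inner))
    print(hash_join(outer, inner))
```

Both functions return the same matches; only the amount of work differs. For two inputs of four values each, the nested loop performs sixteen comparisons while the hash version performs four bucket lookups after the build pass.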

@chswin 3 years ago

Also, a hash match join can run in batch mode, which can make it quite performant...

@Pblaze12 3 years ago

Can somebody explain to me why, in SQL Server, a hash join doesn't preserve order? It clearly could, but I don't know why it always needs to re-sort after a hash join.

@odinsrensen7460 3 years ago

Thank you, this made it much easier to understand.

@mantistoboggan537 4 years ago

What's the time complexity of this? Wikipedia actually doesn't have one on its entry and I couldn't find an authoritative answer elsewhere either. It seems to me that we have to hash everything in Table1, which is Theta(n). Then, for each record in Table2, we need to do a lookup of the hash structure. So, in the worst case, we need to look at every key in the hash structure. So, assuming both tables have the same number of elements, would it be O(N^2)?

@kirkvukonich52 4 years ago

Good question. I believe it is O(m + n) on average, by the following logic: building the build phase's in-memory hash map is O(m), since all m records must be traversed; iterating over the probe phase's records and computing each hash is O(n), since all n records must be traversed; and each probe-phase search against the in-memory hash map is O(1) on average, because finding a bucket is analogous to indexing into an array (O(1)) rather than walking a linked list (O(n)). The lookups therefore add only constant work per probe record, which leaves O(m + n).
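The reasoning in this reply can be checked with a small sketch. It is plain Python under stated assumptions, not SQL Server's implementation: `hash_join_with_cost` is a hypothetical helper, inputs are bare key values, and "steps" simply counts one unit per build insert, per probe lookup, and per output row. With mostly unique keys the count grows like m + n plus the number of matches. If every row shares one key, a single bucket holds the whole build input and the output alone has m × n rows, which is the quadratic worst case the question worries about.

```python
from collections import defaultdict


def hash_join_with_cost(build, probe):
    """Hash join over key values that also counts simple work units."""
    steps = 0

    # Build phase: one insert per build value, O(m) in total.
    buckets = defaultdict(list)
    for a in build:
        buckets[a].append(a)
        steps += 1

    # Probe phase: one average-case O(1) lookup per probe value, O(n) in total,
    # plus one unit for every output row the lookup produces.
    matches = []
    for b in probe:
        steps += 1
        for a in buckets.get(b, ()):
            matches.append((a, b))
            steps += 1

    return matches, steps


if __name__ == "__main__":
    # Unique keys: work is proportional to m + n + matches.
    _, unique_steps = hash_join_with_cost(list(range(1000)), list(range(1000)))
    print("unique keys:", unique_steps)

    # One shared key: every probe value matches every build value.
    _, skewed_steps = hash_join_with_cost([7] * 100, [7] * 100)
    print("single key:", skewed_steps)
```

The skewed case is costly because of the size of the result, not the hashing itself: any join algorithm has to emit all m × n rows when every key matches.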