06 - Hash Tables (CMU Databases Systems / Fall 2019)

  Рет қаралды 33,428

CMU Database Group

CMU Database Group

Күн бұрын

Пікірлер: 30
@marcoq7160
@marcoq7160 4 жыл бұрын
0:15 DJ Drop Tables 0:56 Administrivia 1:45 Course status 3:04 Data structures 4:47 Design decisions (data organization, concurrency) 6:55 Hash tables 8:38 Constant factors matter a lot 9:28 Static hash table 10:54 Static hash table: assumptions 12:20 Static hash table: perfect hash function 13:24 Hash table design decisions: hash function, hashing scheme 16:12 Today's agenda: hash functions, static/dynamic hashing schemes 16:43 Hash functions 18:35 Hash functions: CRC-64, {Murmur,City,XX,Farm}Hash 20:30 Hash function benchmark (XXHash3 was (still is?) the best) 22:42 A question on hash collision attack 24:45 Static hashing schemes (linear probe, robin hood, cuckoo) 26:23 Linear probe hashing (open addressing) 29:28 Linear probe hashing - deletes (tombstones) 35:22 Non-unique keys (separate linked list, redundant keys) 37:30 Robin Hood hashing 43:01 Cuckoo hashing 47:22 Observation: resizing, dynamic hash tables 48:49 Chained hashing 50:35 Extendible hashing 56:59 What is the relationship between a hash table and a buffer pool? 1:00:10 Linear hashing 1:07:28 Linear hashing - deletes 1:10:40 Conclusion
@luisponce3580
@luisponce3580 4 жыл бұрын
beast
@marcoq7160
@marcoq7160 4 жыл бұрын
@@luisponce3580 haha, man, you're doing great! Thanks for these comments, they make me smile :) This is the penultimate timestamp comment I made. But you motivate me to get back to it again!
@luisponce3580
@luisponce3580 4 жыл бұрын
@@marcoq7160 lol, appreciated!
@ashishacharya4001
@ashishacharya4001 5 жыл бұрын
No "Hit it" at the end? I'm disappointed.
@garfieldnate
@garfieldnate 4 жыл бұрын
Java checks if the collection's keys implement Comparable, and if so it uses TreeMaps in the buckets instead of simple LinkedLists. In my undergrad data structures class, the first thing we learned about hashes is "this linear probing thing sucks, use linked lists in the buckets instead" (I never heard the term "chained hashing" before this lecture). So it was really eye-opening to hear the lecturer say, "in practice everyone uses linear probing"! I suspect the JVM default is a balanced generic implementation for many disparate use-cases, which has quite different requirements from what we would use in an RDBMS. In general I feel that this lecture filled lots of holes in my knowledge. Thanks for another great lecture!
@NaderHGhanbari
@NaderHGhanbari 3 жыл бұрын
I believe his statement about linear probing is in the context of databases, not for general purpose programming.
@willw2596
@willw2596 3 жыл бұрын
Nice lecture. For extendible hashing, when the directory needs doubling due to a bucket split, you don't need to re-map all the the directory entries if using the suffix hash bits as the directory index. Just bulk copy the first half of the directory content to the second half of the new directory would work. Basically the two halves of the directory entries both pointing to the same buckets after the doubling.
@cacheman
@cacheman 5 жыл бұрын
12:33 I will admit to some confusion here; this section makes it sound like perfect hashing is only of theoretical interest, but of course there's plenty of software to generate even a minimal mapping (like gperf, or CMPH) that is used in production software. 19:03 Using CRC for hash tables, yes probably a bad idea. That said, modern CPUs (both AMD64/SSE4.2 and some ARM variants) include CRC32 instructions which should in some sense be "fast", for their intended use-case. 22:42 Is he saying he's _never_ heard about this before, or that he's heard of it before? Okay kids, having a seed is not enough; see Aumasson & Bernstein on breaking murmur. More generally, look up 'algorithmic complexity attacks'. It's cool to ignore it, but only when you know what you're doing.
@AndersonSilva-dg4mg
@AndersonSilva-dg4mg 5 жыл бұрын
thank you Andy
@420_gunna
@420_gunna 4 жыл бұрын
feeling blessed to hear my professor say the crc hash "sucks ass" in 2020 year of our lord
@xuejianpan3020
@xuejianpan3020 3 жыл бұрын
Great lecture! But I have some questions: 1)is chained hashing the most popular one? 2)Extendible hashing and linear hashing might be used when there are a lot of collisions, but do we need that if we have a good hash function? In practice are they used widely? 3)are static hash tables actually used in some real systems? If so why are they better than chained hashing? Thanks!
@alfin3644
@alfin3644 5 жыл бұрын
SHA-256 is not reversible
@andypavlo
@andypavlo 5 жыл бұрын
Ah you are correct! I got this very wrong. I will correct this in the next lecture. Thanks! My point about that we don't care about leaking information is the more important part though.
@anarionzuo1425
@anarionzuo1425 4 жыл бұрын
@@andypavlo It seems some chinese guy has made this reversible...
@abdelrhmanahmed1378
@abdelrhmanahmed1378 3 жыл бұрын
In Robin hode what if we want to move the richer one down by one but the place is already taken ?
@abdelrhmanahmed1378
@abdelrhmanahmed1378 3 жыл бұрын
i have example where we will have the number of jumps = 3 , if we have A[0],B[0],C[1],D[2] , and now we want to insert M and its location is where A at . so M[0] == A[0] , so next B[0]
@paulchiang4752
@paulchiang4752 4 жыл бұрын
Thanks, andy. So the tombstone will be considered when we calculate the fill factor(load factor) since it is a kind of entry?
@theUnorthodoxxx
@theUnorthodoxxx 4 жыл бұрын
I do not think so. Because we have used it for our easiness only.It is just a special empty entry.
@paulchiang4752
@paulchiang4752 4 жыл бұрын
I understand what you mean, but in this video, from 30:15 to 31:00, Professor said that "the tombstone is wasting space and contributing to the fill factor". I am not quite sure and thus want to confirm.
@AshishNegi1618
@AshishNegi1618 4 жыл бұрын
@@paulchiang4752 Think what happens if you add N keys and delete N keys where hash table size is N. Should you consider hash table completely filled ?
@zhenghanghu2430
@zhenghanghu2430 2 жыл бұрын
Yes, tombstones are considered for the fill factor.
@monitoringtsdb3884
@monitoringtsdb3884 3 жыл бұрын
such fire beats
@zeevtarantov
@zeevtarantov 5 жыл бұрын
Why is the a lesson on basics of hash functions in a "how to build a RDBMS" course?
@AshishNegi1618
@AshishNegi1618 4 жыл бұрын
First 5 mins of this lecture talks about this.
@zeevtarantov
@zeevtarantov 5 жыл бұрын
SHA-256 is not an asymmetric cryptographic primitive, and it is not reversible, and the author should bleep out the audio of that part if they don't want to take down the whole video.
@jimwichen7978
@jimwichen7978 2 жыл бұрын
beep
07 - Tree Indexes I (CMU Databases Systems / Fall 2019)
1:18:02
CMU Database Group
Рет қаралды 34 М.
16 - Concurrency Control Theory (CMU Databases Systems / Fall 2019)
1:23:08
CMU Database Group
Рет қаралды 23 М.
Каха и дочка
00:28
К-Media
Рет қаралды 3,4 МЛН
Support each other🤝
00:31
ISSEI / いっせい
Рет қаралды 81 МЛН
Tuna 🍣 ​⁠@patrickzeinali ​⁠@ChefRush
00:48
albert_cancook
Рет қаралды 148 МЛН
06 - Hash Tables (CMU Intro to Database Systems / Fall 2021)
1:18:41
CMU Database Group
Рет қаралды 6 М.
S2024 #17 - Google BigQuery / Dremel (CMU Advanced Database Systems)
1:08:59
CMU Database Group
Рет қаралды 4,3 М.
Systems Design in an Hour
1:11:00
Jordan has no life
Рет қаралды 33 М.
Delta Live Tables A to Z: Best Practices for Modern Data Pipelines
1:27:52
05 - Buffer Pools + Memory Management (CMU Databases Systems / Fall 2019)
1:19:00
03 - Database Storage 1 (CMU Intro to Database Systems / Fall 2022)
1:23:28
CMU Database Group
Рет қаралды 39 М.
12 - Query Execution I (CMU Databases Systems / Fall 2019)
1:05:38
CMU Database Group
Рет қаралды 19 М.
08 - Tree Indexes II (CMU Databases Systems / Fall 2019)
1:17:44
CMU Database Group
Рет қаралды 19 М.
Каха и дочка
00:28
К-Media
Рет қаралды 3,4 МЛН