05 - Columnar Databases & Compression (CMU Intro to Database Systems / Fall 2022)

  Views: 20,495

CMU Database Group


Comments: 22
@sontungnguyen4527 1 year ago
Sound quality is much better in this one compared to previous lectures. Thank you!
@Tryagainbtf2 2 years ago
At 31:20, it is mentioned that nobody uses cubes anymore. The company (ActiveViam) I work at produces an in-memory column store paired with an aggregation engine/data cube, supporting real-time transactions. It is a very niche product, but it does still exist and fill a specific need. Loving the lecture! Cheers
@Azxsdwrrbh 2 years ago
A graduated CMU student coming back to listen to the newly added lecture.
@foxl9195 2 years ago
Do you do databases after graduation? :)
@Azxsdwrrbh 2 years ago
@foxl9195 I took 15645 at school.
@foxl9195 2 years ago
I mean, do you develop databases these days?
@OOpSjm 11 months ago
@foxl9195 This is an intro to databases course.
@foxl9195 2 years ago
1:10:50 On incremental encoding: Why does Andy say you don't need to replay the whole thing? If we want to decode the word "robbing", we have the suffix "ing" and the prefix length, 4, but to get that prefix we first need to decode "robbed", and so on, until we have decoded every value back to the word that shares no prefix with the previous word. In this example that word is "rob".
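The chain described in this question can be made concrete with a minimal sketch. It assumes the textbook form of incremental encoding, where each entry stores only the length of the prefix shared with the previous word plus the remaining suffix; the function names `incremental_encode` and `incremental_decode_at` are hypothetical, and the lecture's actual layout may differ.

```python
def incremental_encode(words):
    """Store (shared-prefix length, suffix) for each word, relative to the previous word."""
    encoded = []
    prev = ""
    for word in words:
        n = 0
        # Count how many leading characters this word shares with the previous one.
        while n < min(len(prev), len(word)) and prev[n] == word[n]:
            n += 1
        encoded.append((n, word[n:]))
        prev = word
    return encoded


def incremental_decode_at(encoded, i):
    """Decode entry i by walking back to the nearest entry with prefix length 0."""
    start = i
    while encoded[start][0] != 0:
        start -= 1
    word = ""
    # Replay forward from the restart point, rebuilding each word from the previous one.
    for prefix_len, suffix in encoded[start:i + 1]:
        word = word[:prefix_len] + suffix
    return word


encoded = incremental_encode(["rob", "robbed", "robbing"])
print(encoded)
print(incremental_decode_at(encoded, 2))
```

Under this assumption, decoding "robbing" does replay "rob" and "robbed" first, but the replay stops at the nearest entry whose prefix length is 0 and never has to go back further than that.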
@monfera 2 years ago
One technique with differential (delta) encoding is to periodically restate the full value (not necessarily in the same physical column, as it may need more bits) so there's a latch-on point. It's a bit like how video compression does a bunch of delta compression between frames but also periodically restates the full frame (that's lossy encoding, so it also keeps errors from accumulating too much).
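The restart-point idea in this comment could look roughly like the sketch below. It is an illustration only: the `interval` parameter, the `("full", v)` / `("delta", d)` tagging, and the function names are assumptions made for the example, not something shown in the lecture.

```python
def delta_encode(values, interval=4):
    """Delta-encode integers, restating the full value every `interval` entries."""
    encoded = []
    for i, v in enumerate(values):
        if i % interval == 0:
            encoded.append(("full", v))          # latch-on point
        else:
            encoded.append(("delta", v - values[i - 1]))
    return encoded


def delta_decode_at(encoded, i, interval=4):
    """Decode entry i starting from the nearest preceding full value."""
    start = i - i % interval
    value = encoded[start][1]
    # Apply at most interval - 1 deltas instead of replaying the whole column.
    for _, delta in encoded[start + 1:i + 1]:
        value += delta
    return value


values = [100, 101, 103, 103, 110, 112]
encoded = delta_encode(values)
print(encoded)
print(delta_decode_at(encoded, 5))
```

A smaller `interval` makes random access cheaper but stores more full values, so the choice trades compression ratio against lookup cost.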
@monfera 2 years ago
I just see here it's a different technique, the prefix can be referenced from the Common Prefix table, so there's no need to start from the beginning. It's not super clear to me how it saves space. Maybe it uses a tree structure with structural sharing for representing the strings in the Common Prefix table, or it relies on better compressibility of the two derived vectors compared to the original vector (esp. if there's exact repetition of the prefix, often the case if there are lots of words in alphabetical order). Just guessing though
@建平許 2 years ago
Wow! The new chapter.
@olegpatraschku3736 1 year ago
At 55:53 and later on, I think the id column didn't skip id 5 on purpose?
@haitaoyang-gf5fk 8 months ago
About the bit-packing algorithm: how does the data system know the max value of a column? Does it scan the whole column?
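One possible answer, sketched under assumptions rather than taken from the lecture: the width only has to be known for the chunk being packed, so a single pass over that chunk at encoding time is enough. The sketch assumes non-negative integers and packs them into one Python integer; `bit_pack` and `bit_unpack` are hypothetical names.

```python
def bit_pack(values):
    """Pack non-negative ints using the smallest bit width that fits the largest value."""
    # One scan of the chunk finds the maximum, which determines the width.
    width = max(max(values).bit_length(), 1)
    packed = 0
    for i, v in enumerate(values):
        packed |= v << (i * width)
    return width, packed


def bit_unpack(width, packed, count):
    """Recover `count` values of `width` bits each from the packed integer."""
    mask = (1 << width) - 1
    return [(packed >> (i * width)) & mask for i in range(count)]


width, packed = bit_pack([3, 7, 1, 5])
print(width)
print(bit_unpack(width, packed, 4))
```

A system could also take the maximum from statistics it already maintains instead of scanning, but that is likewise an assumption here.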
@allencheri9286 2 years ago
OLTP, row store. OLAP, column store. Which store is used by HTAP? A hybrid store?
@stackunderflow5951 2 years ago
Typically they store the OLTP data in the row store component, and it is asynchronously replicated into the column store component. Something identifies the category of each query (OLTP or OLAP) and routes it to the appropriate component.
@manmohanmundhraa3087 1 year ago
Can small tables with, say, fewer than 5 columns replace a columnar database?
@rainwave5 1 year ago
51:50 Someone wasn't paying attention in earlier lectures 😄
@juan-tj1xf 1 year ago
hIT IT!
@sasuke_2145 2 years ago
Can somebody explain this part: kzbin.info/www/bejne/p2W6ZqVpfLdjhbc ? I didn't quite get the part about fixed-length values in a column. How does the stitching of columns work when fetching an entire row?
@mephistotel87 1 year ago
Same here. You can't just jump to this offset, because columns are in pages, which are unordered and scattered around the database file. I still don't understand this part.
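The offset arithmetic behind this question can be sketched as follows. This is a simplified illustration that treats each column as one contiguous byte array and ignores pages entirely, so it does not settle the page-layout objection raised above; the column contents and the helper names `read_fixed` and `stitch_row` are made up for the example.

```python
import struct

# Two hypothetical fixed-length columns for the same three tuples.
ids = struct.pack("<3i", 10, 11, 12)       # 4 bytes per value
vals = struct.pack("<3d", 1.5, 2.5, 3.5)   # 8 bytes per value


def read_fixed(column, fmt, i):
    """Read the i-th value: its byte offset is simply i * width."""
    width = struct.calcsize(fmt)
    return struct.unpack_from(fmt, column, i * width)[0]


def stitch_row(i):
    """Rebuild tuple i by reading position i from every column."""
    return (read_fixed(ids, "<i", i), read_fixed(vals, "<d", i))


print(stitch_row(1))
```

Because every value in a column has the same width, position i in one column lines up with position i in every other column, which is what makes the stitching possible without storing tuple ids next to each value.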
@abdalaelgendy1762 8 months ago
done
@knowsomething-b8d 1 year ago
Are they killing a rat wiz a baseball bat? It seems like they are.