Column Oriented Storage (with Parquet!) | Systems Design Interview: 0 to 1 with Ex-Google SWE

  7,662 views

Jordan has no life

A day ago

Comments: 30
@abhimanyushekhawat2626 5 hours ago
The great part about your videos is how you build the concepts from the ground up with first principles & fundamental intuitions. I am watching your videos after reading DDIA and I am able to appreciate them so much more. Thanks Jordan for your efforts.
@mohanakumaar 21 days ago
The way you synthesize the information given in DDIA in 15 Minutes for a topic is amazing:)
@tarun4705 8 months ago
This whole playlist is a valuable treasure.
@dwivedys 2 months ago
Not sure if anyone (or maybe all of you) noticed the phrase he very smoothly utters at 6:58 -- “at least that’s what she said…” … amazing man!!
@pavansrinivas4388 1 month ago
Haha, didn't notice it.. such humor in the context of explaining technical stuff... awesome:)
@user-rn4pb6yu1c 1 year ago
Love it Jordan! Thanks for the video, looks like you are confident regardless of the rejections, as you just want to perform data analysis. One thing though: I think the LSM tree itself should not be described as an in-memory binary tree. It consists of both an in-memory component, the memtable (a binary tree), and an on-disk component, the SSTable.
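To make the two-component structure in this comment concrete, here is a minimal sketch. It rests on several assumptions: the class name `TinyLSM` and the `memtable_limit` parameter are hypothetical, a plain dict stands in for the balanced tree a real memtable would use, and sorted in-memory lists stand in for on-disk SSTable files.

```python
import bisect


class TinyLSM:
    """Toy LSM tree: a mutable memtable plus immutable sorted runs."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}      # stand-in for an in-memory balanced tree
        self.sstables = []      # stand-in for on-disk files, oldest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Write the memtable out as one sorted, immutable run.
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}

    def get(self, key):
        # Newest data wins: check the memtable, then runs from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            keys = [k for k, _ in table]
            i = bisect.bisect_left(keys, key)
            if i < len(table) and table[i][0] == key:
                return table[i][1]
        return None


db = TinyLSM(memtable_limit=2)
db.put("b", 1)
db.put("a", 2)   # memtable is full, so it is flushed as a sorted run
db.put("a", 3)   # newer value stays in the memtable for now
```

Here `db.get("a")` returns the memtable's newer value, while `db.get("b")` falls through to the flushed run.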
@PavanBommana19-25 6 months ago
Love this series! Thank you, Jordan. Can you please post PDFs of the iPad notes that you write during the videos? It would help a lot in quickly refreshing the topics.
@jordanhasnolife5163 6 months ago
Hey Pavan! I will once my current series is complete so that I can do it all in bulk
@TheSdl79 5 months ago
The more that we can fit in there - the better... that's what she said... I'm dying))))
@HarshitKumar-zx4dj 3 months ago
To summarise: with column storage, the data of a column is stored together on disk. Selecting a few columns for analysis is faster because each column's values sit at the same location on disk and can be read together. Column-oriented storage also allows column compression.
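The summary above can be sketched in a few lines. This is only an illustration under assumptions: the rows, the `city` field, and the helper name `to_columns` are made up, and Python lists stand in for contiguous regions on disk.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"name": "Jordan", "age": 25, "city": "NY"},
    {"name": "Trump", "age": 102, "city": "NY"},
]


def to_columns(rows):
    """Pivot rows into one list per column, preserving row order."""
    columns = {}
    for row in rows:
        for key, value in row.items():
            columns.setdefault(key, []).append(value)
    return columns


# Column-oriented layout: all values of one column sit next to each other.
columns = to_columns(rows)

# An analytical query over one column touches only that column's list.
average_age = sum(columns["age"]) / len(columns["age"])
```

With the row layout the same query would have to walk every record and skip over `name` and `city` along the way.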
@timavilla 5 months ago
Hey Jordan, thank you for the series, complicated concepts stick really well thanks to silly examples! As I understood it, both compressions are used only if there is a small number of possible values. But why would I ever use dictionary compression, which doesn't reduce the number of values stored and just makes each value smaller, if regular compression greatly reduces the number of values stored? Won't the result of regular compression always take less space?
@jordanhasnolife5163 5 months ago
You can't always do regular compression! It requires similar values to be next to one another on disk
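A small sketch of why this is so, assuming "regular compression" in this exchange means run-length encoding. The column values and function names are hypothetical.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def dictionary_encode(values):
    """Replace each value with a small integer code."""
    dictionary = {}
    codes = []
    for value in values:
        # A new value gets the next unused code; a known value reuses its code.
        code = dictionary.setdefault(value, len(dictionary))
        codes.append(code)
    return dictionary, codes


sorted_column = ["NY", "NY", "NY", "SF", "SF", "SF"]
shuffled_column = ["NY", "SF", "NY", "SF", "NY", "SF"]

sorted_runs = run_length_encode(sorted_column)       # 2 runs for 6 values
shuffled_runs = run_length_encode(shuffled_column)   # 6 runs, nothing saved
dictionary, codes = dictionary_encode(shuffled_column)
```

On the shuffled column, run-length encoding produces one run per value, whereas dictionary encoding still shrinks every entry to a small code regardless of ordering.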
@cricket4671 1 year ago
Btw, I like your content & thank you for making these videos. Some unsolicited skin advice: I heard you mention bad skin/acne in one of your earlier videos. I had something similar & would suggest cutting out all sugar, beer, wine, and simple carbs from your diet, increasing fiber intake, adding a probiotic & taking all of your multivitamins. Healing leaky gut takes a while. All the best! 😊
@jordanhasnolife5163 1 year ago
Thanks man! I'd love to cut alcohol, alas I love it - the main issue for me personally I think is eating tons of dairy for lifting, doesn't help me
@Spreadlove5683 9 months ago
Additionally if you just need one column, I assume you don't have to load a bunch of extra data from disk into memory just to then filter it all out. So in addition to locality benefits, you also don't have to read as much stuff from disk. Although if latency outweighs bandwidth because you aren't reading much data, it might not matter.
@jordanhasnolife5163 9 months ago
Absolutely correct
@dibll 1 year ago
Could you pls do a video on different types of indexes - clustered, multi-dimensional, etc.?
@dibll 1 year ago
Is the compressed data stored alongside the actual table on the hard disk? How does the client know that compressed data such as 011000110 (example) denotes the real value (let's say) 1?
@jordanhasnolife5163 1 year ago
There would be a little bit of additional metadata saying that the column is in compressed form - it would defeat the purpose of compression if we stored the original column beside it :)
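One way to picture that metadata is a small header stored with the column. The sketch below is a made-up layout for illustration only, not Parquet's actual file format: the field names `encoding`, `dictionary`, and `data` are assumptions.

```python
def decode_column(chunk):
    """Use the chunk's metadata to turn stored codes back into real values."""
    if chunk["encoding"] == "dictionary":
        # Each stored integer is an index into the dictionary.
        return [chunk["dictionary"][code] for code in chunk["data"]]
    if chunk["encoding"] == "plain":
        return list(chunk["data"])
    raise ValueError(f"unknown encoding: {chunk['encoding']}")


# Hypothetical column chunk: only the codes and the dictionary are stored,
# never a second uncompressed copy of the column.
chunk = {
    "encoding": "dictionary",
    "dictionary": ["NY", "SF"],
    "data": [0, 1, 1, 0],
}

decoded = decode_column(chunk)
```

The reader never has to guess what a stored bit pattern means, because the header says which decoding rule applies.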
@radosawmul8963 6 months ago
What if there are more than 9 rows during conversion from a bitmap to run-length encoding? :D How would one know if 13 denotes "13" or "1" and "3"?
@jordanhasnolife5163 6 months ago
Keep in mind that these are really numbers in binary, so assuming we used an int to represent our run size we just read the next 32 bits.
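A quick sketch of that idea using Python's `struct` module, assuming each run length is stored as a 4-byte little-endian unsigned integer. The byte order and the function names are choices made for this example.

```python
import struct


def pack_runs(run_lengths):
    """Serialize each run length as a fixed 4-byte unsigned int."""
    return b"".join(struct.pack("<I", n) for n in run_lengths)


def unpack_runs(data):
    """Read the buffer back exactly 4 bytes (32 bits) at a time."""
    return [n for (n,) in struct.iter_unpack("<I", data)]


# The run of 13 and the runs of 1 and 3 are never confused, because every
# number occupies its own 32-bit slot rather than a string of decimal digits.
encoded = pack_runs([13, 1, 3])
decoded = unpack_runs(encoded)
```

Real formats often use variable-width integers to save space, but the principle is the same: the boundaries between numbers are defined by the encoding, not by how the digits look when printed.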
@murike 5 months ago
If columns are stored separately, how do I get data from a column that is not in the WHERE clause? For example: select name where age < 23 // how does it match the name column with the age column?
@jordanhasnolife5163 5 months ago
Sort them in the same order
@amlanbiswas4526 5 months ago
Along with the data, the row_id can be present, like: Name: Jordan:1, Trump:2 Age: 25:1, 102:2
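Both answers in this thread come down to matching values by row position. Here is a minimal sketch of the "same order" approach, where the list index plays the role of the row id. The third row ("Alice", 19) and the function name are hypothetical additions so the query has a match.

```python
# Two separately stored columns, kept in the same row order.
names = ["Jordan", "Trump", "Alice"]
ages = [25, 102, 19]


def select_names_where_age_below(names, ages, limit):
    """Emulate: SELECT name WHERE age < limit."""
    # Step 1: scan only the age column and remember the matching positions.
    matching_positions = [i for i, age in enumerate(ages) if age < limit]
    # Step 2: fetch the same positions from the name column.
    return [names[i] for i in matching_positions]


result = select_names_where_age_below(names, ages, 23)
```

Storing an explicit row id next to each value, as suggested above, allows the same lookup even when columns are not kept in identical order, at the cost of extra space per value.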
@art4eigen93 1 year ago
nomad filmmaker we know about... but nomad coder? wow!
@jordanhasnolife5163 1 year ago
Haha not exactly, but that would be fun
@recursion. 1 year ago
Audio is completely out of sync tho 😪
@jordanhasnolife5163 1 year ago
Damn, will look into this and have it fixed by next time
@satadhi 5 months ago
Did you choose 322 on purpose? Are you a Dota 2 fan?
@jordanhasnolife5163 5 months ago
Haha I did not, didn't know there was a reference there