Scaling Write-Intensive Key-Value Stores

  Рет қаралды 22,943

Microsoft Research

Microsoft Research

Күн бұрын

Пікірлер: 19
@williamclausen3122
@williamclausen3122 11 ай бұрын
This was a great talk, thank you for publishing it!
@munikarmanish
@munikarmanish 4 жыл бұрын
I'm studying LSM-trees and this is indeed a very helpful video!
@jaigohil4963
@jaigohil4963 4 жыл бұрын
Is someone db fanatic. Really recommend. Great content!
@AnkitGaur9
@AnkitGaur9 3 жыл бұрын
Good presentation on LSM tree and it's application.
@dn5426
@dn5426 7 ай бұрын
surprised there wasn't any mention of WiscKey
@DirtyPio
@DirtyPio 3 жыл бұрын
I'm trying to internalize why probing any run would cost the same and not sure if I'm missingsomething. The paper states that: "Our insight is that the worst-case point lookup cost over the whole LSM-tree is proportional to the sum of the false positive rates of all filters. Assigning equal false positive rates to all filters, however, does not minimize this sum. The reason is that maintaining the same false positive rate across all runs means that larger runs have proportionally larger Bloom filters, whereas the I/O cost of probing any run is the same regardless of its size (due to each run having fence pointers in main memory that allow direct access to the relevant disk page)." Let's say fence pointers cover P pages, then the statement is true as long as each level consists of only P pages, but when higher levels start to have N * P pages you need to find your right page, so you need to search in your array of fence pointers on each level qualified by the bloom filter. So shouldn't the I/O cost be log(N) where a level has N fence pointers? It's going to be very cheap, because the array of fence pointers are in memory and cost of in-memory binary search vs. disk I/O has orders of magnitude difference.
@ozgurakkurt9770
@ozgurakkurt9770 Жыл бұрын
He means every single page has a pointer and you will read only one page if you are doing a point lookup. For example in RocksDB you set a single block size which applies to all levels. So this means every single block of that size will have a pointer. So for a point lookup you will read NumFalsePositives+1 pages from disk or something like that.
@kinglygaurav
@kinglygaurav 4 жыл бұрын
What is a run?
@nivdayan9717
@nivdayan9717 4 жыл бұрын
A run is a sorted array of data
@yunuskoning7584
@yunuskoning7584 4 жыл бұрын
@@nivdayan9717 So there can be multiple runs per level right? A run is just lsm specific nomenclature for any member of the set of sorted arrays associated with a single level ready to be merged with runs of the underlying level. Also the buffer itself might be called a level, e.g. level 0 or something.
@nivdayan9717
@nivdayan9717 4 жыл бұрын
@@yunuskoning7584 All correct!
@utkarshgupta2909
@utkarshgupta2909 3 жыл бұрын
SSTable
@parkatip
@parkatip 5 жыл бұрын
Please put the speaker's name in video titles
@josephoun4218
@josephoun4218 4 жыл бұрын
His name is Niv Dayan
@Cbon-xh3ry
@Cbon-xh3ry Жыл бұрын
It can be found on the first slide
@parkatip
@parkatip Жыл бұрын
@@Cbon-xh3ry i know that. nothing to do with what i asked
@Cbon-xh3ry
@Cbon-xh3ry Жыл бұрын
@@parkatip why would you want his name in the title ? It could be added to the description but the title doesn’t make sense 🤷‍♂️
@parkatip
@parkatip Жыл бұрын
@@Cbon-xh3ry because it's a crucial bit of information that i don't want to fish out of a video or its description. putting it in the title makes perfect sense to me so i can see it before clicking on these videos and spending a minute to get to that slide. it also makes the name searchable and visible in search results.
Models as Code: Differentiable Programming with Zygote
1:01:49
Microsoft Research
Рет қаралды 10 М.
Speedb/RocksDB - The Rise of LSM-Trees, Why Now?
38:28
Speedb
Рет қаралды 1,6 М.
Family Love #funny #sigma
00:16
CRAZY GREAPA
Рет қаралды 43 МЛН
ТЫ В ДЕТСТВЕ КОГДА ВЫПАЛ ЗУБ😂#shorts
00:59
BATEK_OFFICIAL
Рет қаралды 3 МЛН
Car Bubble vs Lamborghini
00:33
Stokes Twins
Рет қаралды 38 МЛН
SILK: Preventing Latency Spikes in Log-Structured Merge Key-Value Stores
50:14
The Secret Sauce Behind NoSQL: LSM Tree
7:35
ByteByteGo
Рет қаралды 208 М.
InfluxDB Storage Engine Internals | Metamarkets
43:42
Data Council
Рет қаралды 15 М.
Dissecting, Designing, and Optimizing LSM-based Data Stores (Tutorial at SIGMOD 2022)
1:20:25
Data-Intensive Systems and Computing Lab - BU
Рет қаралды 6 М.
Distributed Systems in One Lesson by Tim Berglund
49:00
Devoxx Poland
Рет қаралды 415 М.
Базы данных LSM tree
17:01
Sergey Nemchinskiy
Рет қаралды 13 М.
The Strange Physics Principle That Shapes Reality
32:44
Veritasium
Рет қаралды 4,9 МЛН
Algorithms behind Modern Storage Systems
49:03
InfoQ
Рет қаралды 25 М.
Family Love #funny #sigma
00:16
CRAZY GREAPA
Рет қаралды 43 МЛН