"Transactions: myths, surprises and opportunities" by Martin Kleppmann

75,581 views

Strange Loop Conference

Back in the 1970s, the earliest databases had transactions. Then NoSQL abolished them. And now, perhaps, they are making a comeback... but reinvented.
The purpose of transactions is to make application code simpler, by reducing the amount of failure handling you need to do yourself. However, they have also gained a reputation for being slow and unscalable. With the traditional implementation of serializability (2-phase locking), that reputation was somewhat deserved.
In the last few years, there has been a resurgence of interest in transaction algorithms that perform well and scale well. This talk answers some of the biggest questions about the bright new landscape of transactions:
What does ACID actually mean? What race conditions can you get with weak isolation (such as "read committed" and "repeatable read"), and how does this affect your application?
What are the strongest guarantees we can achieve, while maintaining high availability and high performance at scale?
How do the new generation of algorithms for distributed, highly-available transactions work?
Linearizability, session guarantees, "consistency" and the much-misunderstood CAP theorem -- what's really going on here?
When you move beyond a single database, e.g. doing stream processing, what are your options for maintaining transactional guarantees?
Martin Kleppmann
@martinkl
Martin Kleppmann is a software engineer and entrepreneur, and author of the O'Reilly book Designing Data-Intensive Applications (dataintensive.net), which analyses the data infrastructure and architecture used by internet companies. He previously co-founded a startup, Rapportive, which was acquired by LinkedIn in 2012. He is a committer on Apache Samza, and his technical blog is at martin.kleppman....

Comments
@ruixue6955 3 years ago
3:40 Durability
4:28 Consistency
4:36 != C in CAP theorem
5:08
5:27 transactions executed by the application move the database from one consistent state to another
6:06
6:21 Atomicity
6:55 fault handling
9:54 Isolation
11:26 Question: *repeatable read* vs. *read committed*
12:00 explained by example
13:02 read committed - 13:18
14:57 *read skew* (can occur under *read committed*)
15:06 assumption: 2 accounts, x and y
15:33 consider: a read-only transaction ( *backup process* or *analytic query* ) is running concurrently
16:30 problem for the *backup* : you've seen different parts of the database at different points in time
16:39 can happen under *read committed*
16:48 *repeatable read* and *snapshot isolation* prevent *read skew*
17:25 more common: *snapshot isolation*
18:36 example of *write skew* - 18:45 INVARIANT: at least 1 doctor is on-call
19:42 assumption about the data in the database
20:15 results in a violation of the INVARIANT: there have to be doctors on-call
20:33 in *Oracle* this can not be prevented unless
@aminigam 2 years ago
Brilliant enlightening session, a gem. Listening to Martin is a pleasure
@arunsatyarth9097 4 years ago
Listening to Martin Kleppmann is like an orchestra. You enjoy it to the very limit!
@Peeja 9 years ago
Fantastic talk! It's worth noting, spacetime itself obeys the same upper bound on consistency without coordination: causality.
@proudindian-73 2 years ago
That was very deep..
@rohandvivedi 28 days ago
Thank you for this insightful talk.
@valentinwaeselynck8124 9 years ago
"Every sufficiently large deployment of microservices contains an ad-hoc, informally-specified, bug-ridden, slow implementation of half transactions" :D
@implemented2 4 years ago
What are half transactions?
@SimplyaGameAholic 3 years ago
@implemented2 I think he meant that the transaction suffers partial failures with no reversibility. Basically you have no atomicity, and you leave things in a mess after any failure :)
@michaeleaster1815 9 years ago
Thanks! I enjoyed this talk very much, esp. the sanity check of "who can describe read committed vs repeatable reads, from memory?".
@ChumX100 3 years ago
Absolutely brilliant! Very entertaining as well.
@gurjarc1 2 years ago
I have a question about two-phase locking at 23:00 in the video. Say you have two transactions, t1 and t2, that each read first, then check some condition, then update.
Scenario 1: t1 gets an exclusive lock on the row (for its write) before t2 can get a shared lock to read. So t2's read has to wait for t1 to commit, and we get consistency.
Scenario 2: both t1 and t2 have executed the read part but not yet the modify part, so both hold shared locks from reading all the doctors where oncall=true. Now neither t1 nor t2 can commit, because t1 can't write while t2 holds its shared lock, and vice versa. So this scenario is a deadlock.
Can anyone confirm that in scenario 1 2PL successfully serialized the transactions, and that in scenario 2 the timing was bad, resulting in a deadlock that required the database to step in and abort one of the two transactions as the victim? Thanks in advance for helping me understand these concepts.
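Editor's sketch: the two scenarios in this question can be played out with a toy lock table. This is only a minimal illustration under stated assumptions, not how any particular database implements two-phase locking: the LockManager class, its method names, and the single lock resource "doctors" (standing in for the rows where oncall=true) are all hypothetical. Instead of blocking, a failed lock request records a waits-for edge, and a cycle in that graph is treated as a deadlock.

```python
from collections import defaultdict


class LockManager:
    """Toy lock table with shared (S) and exclusive (X) locks. Hypothetical."""

    def __init__(self):
        self.shared = defaultdict(set)     # resource -> txns holding S
        self.exclusive = {}                # resource -> txn holding X
        self.waits_for = defaultdict(set)  # txn -> txns it is waiting on

    def acquire_shared(self, txn, res):
        holder = self.exclusive.get(res)
        if holder is not None and holder != txn:
            self.waits_for[txn].add(holder)  # reader waits for the writer
            return False
        self.shared[res].add(txn)
        return True

    def acquire_exclusive(self, txn, res):
        # Any other reader or writer on the resource blocks an X request.
        blockers = set(self.shared[res]) - {txn}
        holder = self.exclusive.get(res)
        if holder is not None and holder != txn:
            blockers.add(holder)
        if blockers:
            self.waits_for[txn] |= blockers
            return False
        self.exclusive[res] = txn
        return True

    def release_all(self, txn):
        """Drop every lock and wait edge of txn (commit or abort)."""
        for holders in self.shared.values():
            holders.discard(txn)
        for res in [r for r, t in self.exclusive.items() if t == txn]:
            del self.exclusive[res]
        self.waits_for.pop(txn, None)
        for waiting in self.waits_for.values():
            waiting.discard(txn)

    def deadlocked(self, txn):
        """True if txn lies on a cycle in the waits-for graph."""
        seen, stack = set(), list(self.waits_for[txn])
        while stack:
            t = stack.pop()
            if t == txn:
                return True
            if t not in seen:
                seen.add(t)
                stack.extend(self.waits_for[t])
        return False


def scenario1():
    lm = LockManager()
    lm.acquire_exclusive("t1", "doctors")     # t1 takes its write lock first
    got = lm.acquire_shared("t2", "doctors")  # t2's read has to wait
    return got, lm.deadlocked("t2")


def scenario2():
    lm = LockManager()
    lm.acquire_shared("t1", "doctors")        # both transactions read first
    lm.acquire_shared("t2", "doctors")
    a = lm.acquire_exclusive("t1", "doctors")  # blocked by t2's S lock
    b = lm.acquire_exclusive("t2", "doctors")  # blocked by t1's S lock
    dead = lm.deadlocked("t1")
    lm.release_all("t2")                       # abort t2 as the victim
    retry = lm.acquire_exclusive("t1", "doctors")
    return a, b, dead, retry


print(scenario1())  # (False, False): t2 waits, but there is no cycle
print(scenario2())  # (False, False, True, True): deadlock, then t1 proceeds
```

In this model scenario 1 is plain waiting (t2 proceeds once t1 commits), while scenario 2 is a lock-upgrade deadlock that is only resolved by aborting one transaction, which matches the reading in the question.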
@a3090102735 5 years ago
This is a great talk! I'm also reading the transactions section in your books; however, listening to your explanations, it makes more sense to me now!
@rohandvivedi 28 days ago
19:44: Can we say that this anomaly arises because the database has no clue what the application is trying to do? If we store the count of on-call doctors in the database, then under the MVCC snapshot isolation level the writes to this counter will be serialized, and the other doctor's transaction will be aborted.
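Editor's sketch: the idea in this comment (turning the invariant into a row that both transactions must write) can be tried out in a toy model. The sketch assumes snapshot isolation with first-committer-wins conflict detection on overlapping write sets; the SnapshotStore and Txn classes, the doctor names, and the on_call_count key are hypothetical and not taken from the talk.

```python
class SnapshotStore:
    """Toy MVCC store: first-committer-wins on overlapping write sets."""

    def __init__(self, data):
        self.data = dict(data)
        self.version = {key: 0 for key in data}  # commits seen per key

    def begin(self):
        return Txn(self)


class Txn:
    def __init__(self, store):
        self.store = store
        self.snapshot = dict(store.data)   # frozen view taken at start
        self.seen = dict(store.version)    # key versions at start
        self.writes = {}

    def read(self, key):
        return self.writes.get(key, self.snapshot[key])

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Abort if someone else committed a write to a key we also wrote.
        for key in self.writes:
            if self.store.version[key] != self.seen[key]:
                return False
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.version[key] += 1
        return True


def go_off_call(txn, doctor, use_counter):
    # Application-level check of the invariant, done on the snapshot.
    on_call = [d for d in ("alice", "bob") if txn.read(d)]
    if len(on_call) >= 2:
        txn.write(doctor, False)
        if use_counter:
            # Materialize the invariant as a row both transactions write.
            txn.write("on_call_count", txn.read("on_call_count") - 1)


def run(use_counter):
    store = SnapshotStore({"alice": True, "bob": True, "on_call_count": 2})
    t1, t2 = store.begin(), store.begin()
    go_off_call(t1, "alice", use_counter)
    go_off_call(t2, "bob", use_counter)
    results = (t1.commit(), t2.commit())
    still_on_call = sum(store.data[d] for d in ("alice", "bob"))
    return results, still_on_call


print(run(use_counter=False))  # ((True, True), 0): write skew, nobody on call
print(run(use_counter=True))   # ((True, False), 1): second commit is aborted
```

Without the counter the two write sets are disjoint, so both commits succeed and the invariant breaks. With the counter the write sets overlap, so in this model the second transaction is aborted and has to retry against fresh data.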
@ykochubeev 5 years ago
Thank you so much! Very interesting problem highlighting!
@ruixue6955 3 years ago
21:53 how to implement serializability
22:08 two-phase locking
@samlaf92 1 year ago
At 14:33 does read committed implementation with row-locking lead to deadlock here?
@BillBurcham 8 years ago
At 34:35 kzbin.info/www/bejne/a4vNmYGKgp2Li5om35s "Every sufficiently large deployment of microservices contains an ad-hoc, informally-specified, bug-ridden, slow implementation of half of transactions"
@barcelona.netcore4191 4 years ago
Brilliant!
@darshanime 3 years ago
At 14:34, how does read committed prevent the inconsistency? Isn't it transaction serializability that prevents it?
@PaulSladekb 8 years ago
great talk!
@sabirove 6 years ago
brilliant! ty!
@BethKjos 2 years ago
Never did get to that explanation of "repeatable read"...
@bigphatkdawg 2 years ago
I think it was implied: Not susceptible to read skew
@gijduvon6379 3 years ago
This is just a freaking awesome video!
@DanHaiduc 5 years ago
Consensus is indeed expensive; blockchains are proof of that. Cryptocurrencies' transaction rates are limited either artificially or by the processing power of the fastest (single) node. For anything faster, you'd have to do sharding, which sacrifices consensus.
@valtih1978 8 years ago
Which `read skew` is he talking about? Read committed means that a lock is taken for the duration of the select statement. This read lock should prevent any commit during the 'backup process'.
@stIncMale 7 years ago
Probably the best way to get an answer to your question is by reading the remarkable article called "A Critique of ANSI SQL Isolation Levels" (just google it).
@implemented2 4 years ago
You can have a long running read transaction, for instance, making a dump. Locking the whole database for writes for the duration of dumping is not possible.
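Editor's sketch: the read skew discussed in this thread can be reproduced in a few lines. The key assumption is that under read committed each read sees the latest committed state, and that any read lock lasts at most for one statement, so another transaction can commit between the backup's two reads. The balances, the transfer amount, and the function name are hypothetical.

```python
def backup_total(use_snapshot):
    # Hypothetical starting balances; the invariant is x + y == 1000.
    db = {"x": 500, "y": 500}

    # Snapshot isolation: the backup reads from a frozen copy taken at start.
    # Read committed: every read sees the latest committed state.
    view = dict(db) if use_snapshot else db

    x_seen = view["x"]  # backup reads account x

    # A concurrent transfer of 100 from x to y commits between the two reads.
    db["x"] -= 100
    db["y"] += 100

    y_seen = view["y"]  # backup reads account y
    return x_seen + y_seen


print(backup_total(use_snapshot=False))  # 1100: read skew, 100 counted twice
print(backup_total(use_snapshot=True))   # 1000: consistent snapshot
```

The backup never sees uncommitted data in either case, so read committed is not violated; the total is still wrong because the two reads observe the database at different points in time.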