Distributed Systems 7.1: Two-phase commit

Views: 68,538

Martin Kleppmann

Comments: 57

@Alvaro-hm9vu 2 years ago

Got a job because of you... you changed my life... thank you

4 years ago

As soon as I find enough time I'm going to go through all the series. Thank you for making the effort.

@ahmetb 3 years ago

I was reading your book and got tired at the beginning of chapter 8, then I found your YouTube channel while trying to watch some videos before I dig into the chapter! Thanks for all your work in making this field more understandable.

@krizh289 8 months ago

Thanks for putting these lectures on YouTube. Education should be accessible to all.

@IrvinHerreraGarza 2 years ago

Mr. Kleppmann, I love your book and the way you explain things in your videos. Thank you so much for creating this material.

@thewolfer2281 2 years ago

Legend!! I'm passing this course cuz of this playlist, the whole of distributed systems in 1 day thanks to you

@iyadelwy1500 2 years ago

Alright, come on, Abdo

@thewolfer2281 2 years ago

@@iyadelwy1500 😂😂😂 I swear I left it there so that you would see it

@sachin_getsgoin 3 years ago

Delighted to watch the series. Thanks for creating this. I am already grateful to you because of "DDIA"

@nakonachev1407 2 years ago

Great lecture, straight to the point. Thanks for the effort put into it and the adequate way of explaining it.

@timurlanrahimberdiev6096 2 years ago

Great lectures, Great book, Great author 👍

@mikedelta658 2 years ago

Crystal clear explanation. Hats off to you, Martin!

@andreip9378 6 months ago

Wow, I didn't know Martin has a YT channel. Instant subscribe.

@paulchicos1872 3 years ago

You, my guy, are a gem of humanity

@zhou7yuan 3 years ago

"Consistency" [0:11]
    ACID
    Read-after-write consistency (lecture 5)
    Replication
    Consistency model
Distributed transactions [2:26]
Atomic commit versus consensus [4:47]
    (consensus | atomic commit)
    >1 propose | all vote
    any 1 proposed value decided | must all commit/abort
    crash tolerated | abort if 1 node crashes
Two-phase commit (2PC) [6:33] (key moment) [9:45]
The coordinator in two-phase commit [10:25]
Fault-tolerant two-phase commit (1/2) [12:58]
Fault-tolerant two-phase commit (2/2) [16:43]

@lifeirao7605 4 years ago

super illustrative. Thank you!

@leon_thinks 3 years ago

Grateful for the amazing lecture! I finally have some idea of how Raft works.

@2tce 2 years ago

@Martin Kleppmann, thanks for the interesting presentation all the way from Cambridge. I'd like to suggest that we could update the linearizable CAS to: IF old = new THEN success := true. There is no point comparing the old and new if they are the same. :)

@martinkunev9911 3 years ago

What does the failed replica do when it comes up?

@manishsakariya4595 3 years ago

Very nice and detailed video. I would love to see your three-phase commit explanation.

@veerajbhokre1847 1 year ago

Amazing lectures. Thank you so much. You are a god.

@abcdef-fo1tf 1 year ago

Am I right in understanding that we can use Raft to send total order broadcasts and elect new coordinators for node communication, and two-phase commit for committing data?

@kobew1351 10 months ago

Hope you can make a video to explain three phase commit and how it improves fault tolerance.

@ashishkalra9438 2 months ago

Please correct me if I missed something. Before the client sends the commit message to the coordinator to start two-phase commit, it performs the normal transaction on the replicas. My confusion is this: if a problem happens during that write, i.e. one replica performs the write but another fails, how will two-phase commit help? I thought the entire point of two-phase commit was to perform the write in the prepare phase and the commit in the commit phase. Why do we allow normal updates to both replicas before 2PC starts?

@BodhiAli-t8c 3 months ago

Thank you sir, a grateful new subscriber.

@za406 1 year ago

Question: Why is the "prepare" message necessary if replicas "ack" on the original transaction message?

@rastaeule7482 3 years ago

Very clear explanation!

@OffAndGo 2 years ago

Hello, the video is very helpful, but I hope my question can be clarified. Does the coordinator node care whether the other nodes have committed successfully or not? If it does and a node fails to commit, does the coordinator make a second decision and send an abort to all the nodes?

@tanmaymehrotra86 2 years ago

What if the nodes reply to the coordinator that yes, they can perform this transaction, and send the OK message (in response to prepare), but crash after sending it? I assume these nodes will replicate the data (via consensus), so even in the face of failure another leader will get elected. I do understand how total order broadcast works via Raft, but I am unable to understand how the data is locked.

@BHARATKUMAR-le6eq 2 years ago

Hi Martin, you said the failure detector can run on any node. So my questions are: what happens if the node on which the failure detector is running goes down or crashes? And how would we then detect how many other nodes have also crashed?

@danish6192 2 years ago

Why is the client opening a transaction simultaneously on 2 nodes in 2PC? Shouldn't the transaction be opened on the master node only?

@yihanwu3823 3 years ago

Does fault-tolerant 2PC mean the coordinator is redundant and can be removed?

@ivan.p 4 years ago

Good explanation! Thank you!

@jainamm5307 9 months ago

What happens if one of the nodes has sent OK for prepare, but crashes while waiting for all the other OKs? The transaction will go forward on all the other nodes.

@jainamm5307 9 months ago

One potential solution to this problem is to have a recovery mechanism for the node when it comes back up.

@jainamm5307 9 months ago

One potential solution is to have a recovery mechanism for the node when it comes back up.

@complicated2359 1 year ago

If the database went down after it had agreed to commit, what would you do?

@zuggrr 2 years ago

This is fantastic! Thank you so much :)

@albumlist1 3 years ago

Hi Martin, thanks for this amazing series. I have a question here. If for any replica there are conflicting votes (one sent by the replica itself and the other sent by another node on behalf of the replica, suspecting that the replica is down) at around the same time, shouldn't it take the later decision instead of the first? If some other node said "No" (on this replica's behalf) and then the actual replica recovers and says "Yes", taking the later decision looks more logical. The same is true in the opposite case.

@m-ld3832 2 years ago

At first glance, that approach is appealing, since it appears to be the safest, avoiding any confusion by taking the most conservative default position. However, that isn't actually necessary, by virtue of the way Total Order Broadcast works. This is down to the relative timing of the slow / recovered replica's vote of "Yes", and the consensus decision by all the nodes. If the "Yes" vote is received from the slow node _after_ all the other "No" votes from others on its behalf, those "No" votes are overridden by the "Yes", since that was the first vote seen by it from others. What's not entirely clear from the video is precisely when a consensus is considered to have been reached, and if/how this is consequently communicated among them. Presumably, if all the other nodes have already settled on the decision against proceeding before the "Yes" vote is received from the slow one, then that decision is not invalidated. The previous video in this series may expand upon this.

@zaixrx 6 months ago

Big thanks

@QDem19 3 years ago

Thank you for going over this. I have a question regarding slide 2 of the fault-tolerant 2PC. Which node decides the fate of the transaction? Is it the current term leader of the total order broadcast, or can it be any node participating in the transaction? It seems like it should be the former, i.e. the current term leader, but I just wanted to be sure.

@giorgiobuttiglieri5876 1 year ago

Each node can independently understand if the distributed transaction failed: each node receives the same sequence of messages and the algorithm used to determine if the transaction failed is deterministic. So all the nodes will reach the same conclusion without the need of a coordinator.

@jainamm5307 9 months ago

@@giorgiobuttiglieri5876 When you say each node receives the same sequence of messages, how is the "sequence" guaranteed to be the same on every node?

@giorgiobuttiglieri5876 9 months ago

@@jainamm5307 For the proposed fault-tolerant version of 2PC, we use total order broadcast as the communication primitive. So by definition all nodes receive the same messages in the same order. If you are interested in how to achieve this, there are other videos on this channel explaining it very well.
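
The point made in this thread (same delivery order plus a deterministic rule means every node reaches the same decision) can be sketched roughly as follows. This is only an illustration, not the lecture's pseudocode: the function name `decide`, the `(replica_id, ok)` message shape, and the rule that only the first delivered vote per replica counts are assumptions made for the example.

```python
from typing import Iterable, Optional


def decide(participants: set[str],
           delivered: Iterable[tuple[str, bool]]) -> Optional[str]:
    """Return "commit", "abort", or None (undecided) for one transaction.

    `delivered` is the sequence of votes in total-order-broadcast delivery
    order; each item is (replica_id, ok). Assumption for this sketch: only
    the first vote delivered for a replica counts, later ones are ignored.
    """
    first_votes: dict[str, bool] = {}
    for replica_id, ok in delivered:
        if replica_id not in participants or replica_id in first_votes:
            continue  # unknown replica, or a later vote for one already counted
        first_votes[replica_id] = ok
        if not ok:
            return "abort"  # a single "no" vote aborts the transaction
        if len(first_votes) == len(participants):
            return "commit"  # every participant's first vote was "yes"
    return None  # not enough votes delivered yet


# Hypothetical log: a "no" cast on behalf of suspected-crashed replica B is
# delivered before B's own late "yes".
log = [("A", True), ("B", False), ("B", True), ("C", True)]

# Three simulated nodes run the same function over the same ordered log.
results = [decide({"A", "B", "C"}, log) for _ in range(3)]
```

Because `decide` is a pure function of the delivered sequence, no node needs to be told the outcome by a coordinator; each one computes it locally.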

@jeniamtl6950 1 year ago

Atomic commitment is completely different from atomic in ACID. For example, if students and classes are handled on different nodes, then after all components have voted yes and the coordinator sends the commit messages, there will be a moment when the student has enrolled in a class but the class does not yet exist, or vice versa. This is completely different from "atomic" in ACID.

@yuchen6630 1 year ago

thank you

@austecon6818 1 year ago

I still don't get how, with geographically distributed nodes (with different ping/latency to each other), total order broadcast can prevent a (very rare and unlikely) race condition where 5/10 nodes get the failure detector's abort message fractions of a second before the sluggish node sends its vote to go ahead and commit, and the other 5/10 nodes see the opposite ordering. If it happens at exactly the same time, network latency could split the network (5 nodes with low ping to the failure detector, and 5 nodes with low ping to the sluggish node but high ping to the failure detector). So in that case do you just go with majority rules and always have an odd total number of nodes to decide which is the true(r) version of history? But now we are into 3 phases, not 2. So is this like a shitty version of the Raft protocol or something, where it assumes 0 network latency?

@Ynno2 9 months ago

Total order broadcast requires consensus and if only 5/10 nodes have agreed then there's no quorum and no consensus. Neither event will be actionable until n/2+1 nodes have received it. If there is a 50%/50% split, neither side of the split will make any decisions (nothing will be committed and everything will grind to a halt) until the partition is resolved.
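
The quorum arithmetic in this reply can be checked with a few lines. This is a minimal sketch; the helper names `quorum` and `can_make_progress` are made up for illustration and do not come from the video.

```python
def quorum(n: int) -> int:
    """Smallest number of nodes that forms a strict majority of n nodes."""
    return n // 2 + 1


def can_make_progress(partition_size: int, n: int) -> bool:
    """True if one side of a network partition still holds a majority."""
    return partition_size >= quorum(n)


n = 10
majority_needed = quorum(n)              # 6 for a 10-node cluster
side_a_ok = can_make_progress(5, n)      # one half of a 5/5 split
side_b_ok = can_make_progress(5, n)      # the other half
```

With 10 nodes the quorum is 6, so in a 5/5 split neither side can decide anything, which is the stall described above.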

@murali1790able 3 years ago

I thought consensus was used in databases, but it looks like consensus can't solve the atomic commit problem. Can anyone explain the real application of consensus?

@vhscampos1 2 years ago

Consensus achieves total order broadcast, i.e. all nodes deliver messages/operations in the same order.

@arthursimeon2620 2 years ago

So is the coordinator used for decision making on commits, and the total order broadcast system just a backup in case the coordinator crashes?

@BHARATKUMAR-le6eq 2 years ago

I have one more doubt. Do we wait to get an "OK" message from all the replicas, or do we commit to a specific replica as soon as we receive its "OK" message? Waiting for all the replicas makes sense, but if we commit right after receiving one "OK", it may lead to inconsistency. For example, if one replica sends "OK" and we commit the change to that replica, but the other replica crashes and does not send "OK", then the two replicas will be inconsistent.
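
In basic two-phase commit as it is usually described, the coordinator collects a reply to prepare from every participant before deciding, and commits only if all of them said OK. A minimal sketch of that decision rule follows, assuming a hypothetical `coordinator_decision` helper in which `None` stands for a replica that never answered (crashed or timed out).

```python
from typing import Optional


def coordinator_decision(votes: dict[str, Optional[bool]]) -> str:
    """Decide the outcome of one transaction from the prepare replies.

    `votes` maps each participant to True (OK), False (refused), or
    None (no reply received). The coordinator commits only when every
    participant has answered OK; anything else leads to abort.
    """
    if votes and all(v is True for v in votes.values()):
        return "commit"
    return "abort"


all_ok = coordinator_decision({"replica1": True, "replica2": True})
one_silent = coordinator_decision({"replica1": True, "replica2": None})
```

Under this rule no replica is told to commit while another replica's vote is still missing, so the inconsistency described in the comment does not arise at decision time.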