CRDTs: The Hard Parts

Рет қаралды 70,556

Күн бұрын

Пікірлер: 40

@jawaffles 4 жыл бұрын

As a concrete example of duplication/picking-both-options being a valid resolution, see the merge-conflict behaviour of most any source code version control tool, which inserts conflict markers along with two or more incompatible states. Note this leaves conflict at the user-meaning level, needing user input to resolve, but it resolves the conflict at the level of the version control data structure. This difference between conflict of high level meaning vs. conflict at the data structure level can cause a lot of misunderstandings in casual discussion I think.

@BosonCollider Жыл бұрын

There is a VCS that makes this explicit called Pijul, which uses a CRDT to make patches associative & commutative, and goes for exactly this approach where it is very conservative about how to handle conflicting merges (explicitly identifying them and marking them as conflicts) and forces the user to resolve them with a third patch.

@russellbrown8760 4 жыл бұрын

Thanks Martin, a really interesting talk, so good to see constant progresss in this field. I have to point out that CRDTs have been in production systems since 2014 at least, so maybe your final summary about them one day being practical and not just research prototypes is a little unfair.

@kleppmann 4 жыл бұрын

Good point Russell! I had in mind collaboration software (where the replicas are apps running on end-user devices), where OT is currently still quite popular; of course distributed databases like Riak have been using CRDTs in production for a long time.

@kevinjahns2746 4 жыл бұрын

@@kleppmann I pointed out to you before that CRDTs are being used in collaborative software that runs in production. Already in 2016 Pluxbox - a radio management software - integrated the Yjs CRDT for collaborative richtext editing. More recently Nimbus Notes (>1 million users) integrated Yjs as their collaborative engine. Another example is the Teletype CRDT that powers collaborative editing in Atom (since 2017). Dismissing this, and describing the current state of CRDTs as just research prototypes, really doesn't help to make CRDTs more prominent in collaborative software.

@Ondra011 4 жыл бұрын

@@kevinjahns2746 I don't think Martin is trying to dismiss the practicality of CRDTs, it seemed more like he was a bit dismayed at the fact that the largest of collaborative editing applications (Google Docs, Office 365) are not using CRDTs. One would think -- or hope -- that Google and Microsoft hire enough talent to solve these hard problems properly and innovate beyond last-century algorithms.

@kevinjahns2746 4 жыл бұрын

@@Ondra011 Many people, not just Martin, describe CRDT implementations for shared editing just as prototypes. This is unfortunate and simply not true. Just because Google and the majority of companies are not using them doesn't make them impractical. There is little chance that they are going to rebuild a system that works really well.

@CalebRox 3 жыл бұрын

@@kevinjahns2746 What makes CRDT superior to OT for Google Docs-like apps that can already share a central server?

@kikikukudu61 2 жыл бұрын

Great video! Easy to follow, thanks for the hard work! :)

@YG-bx3wk 3 жыл бұрын

Such a great video with a detailed explanation! thanks a lot!

@NikitaNikita-k8q Жыл бұрын

Great Video! I am adding collaboration to my app and this video helped me a lot.

@ruslanfadeev3113 3 жыл бұрын

Thought: maybe it's better to design the operations themselves with underlying CRDT structures in mind. For example, if moving a node cannot be done well in all cases, we could give users an on-screen personal clipboard that holds the node that is being moved. So when the user presses Ctrl+X to cut the node, it goes into the clipboard and gets a new ID. When the user presses Ctrl+V, the node is moved from the clipboard to the main graph. This way we're duplicating nodes on concurrent operations, but because of the new element this behavior makes sense to the user because duplication happens during copying into the personal clipboard. And if user A moves the node while user B was in the process of doing so, they can still choose to insert the node somewhere else too. I guess this makes conflict resolution into a part of the editing process.

@hendrikahuis6919 4 жыл бұрын

Very interesting and concise talk. Thanks!

@AndersonSilva-dg4mg 4 жыл бұрын

Thank you very much!

@artem_r 4 жыл бұрын

Thanks for the amazing talk, it was very interesting. I have one question-at the beginning (2:58) there is a table that shows that CRDTs are being used (TomTom GPS, Teletype for Atom, etc), but at the end of the talk you said that CRDTs might become a future of collaborative apps. What does the latter mean? Thanks!

@vetiarvind 15 күн бұрын

well none of that are collaborative apps.

@asm0di0 4 жыл бұрын

Thank you for very interesting talk! I have a question on concurrent move operation in tree from 46:41. Do we really sync all our undo and redo operations to Replica 2? And if yes - then how should it roll back its mv(b, a) operation?

@MikkoRantalainen 4 жыл бұрын

Such CRDT system can only implement actions that it can undo. Basically if you originally move something and you have to undo it, you just move it back to previous location. And because *all* changes to this filesystem go using the same CRDT system, it has already undone every change done later so it obviously can undo the move, too. And the "tombstones" mentioned later are just the fact that when you *delete* something, you have to still store it somewhere to be able to undelete perfectly *when* needed.

@asm0di0 4 жыл бұрын

@@MikkoRantalainen Thank you for the answer!

@MaxPrehl 2 жыл бұрын

39:05 The result on Windows from Powershell is: mv : Access to the path 'C:\Users\Max\3D Objects\a' is denied.

@bernardosulzbach 4 жыл бұрын

Great slides.

@EvgeniyDolzhenko 11 ай бұрын

What I don't quite understand is how is it possible to say "*last* writer wins", how do we get nodes to agree on which one is the last?

@codeelsewhere 8 ай бұрын

timestamp

@rabindrapatra7151 2 ай бұрын

@@codeelsewhere but they say we cant trust on timestamp due to clock-skew

@gamenovice4928 Жыл бұрын

@23:40 - actually wouldnt the interleaving for typing backwards problem be a big issue for languages that are typed right to left, like arabic or hebrew? That being said you could invert RGA for multi-lingual cases so that the structures are in the right order. But if the string itself is just left to right character sequences, and it's up to the text rendering engine to pick how it's laid out, then what I raise as a question here is likely a non-issue

@nseco 4 жыл бұрын

Thanks for this. I am confused about the suggestion to use Lamport clocks at 49:20. Don’t Lamport clocks only provide a partial ordering? The impression I got from the example you give prior to explaining the data structure seem to imply the need for a total order.

@spudwaffle 4 жыл бұрын

Nuno Seco see 1:01:50

@monarodan 3 жыл бұрын

It's a bit unfair to say that OT requires a central server without mentioning that this is only done to avoid needing to satisfy the TP2 property of transformation functions.

@davidmiller252 4 жыл бұрын

This would make a really good Computerphile episode. You should contact Brady

@vishalsharma-pb4pj 3 жыл бұрын

Why not bundle operational transformation in the application itself? So it wouldn't have to go through server each time.?

@crfe5gce 4 жыл бұрын

24:35 would it solve the problem to order branches by session id then time?

@dabblewriter 4 жыл бұрын

Late to the party, but you could solve the Bacon Milk problem by representing text with an array having each line be a separate entry. This does require additional work for splitting/joining lines, but it seems it would map closer to what the user would expect to happen in a merge.

@stefan-bayer Жыл бұрын

Does anyone have insights if Apple implemented CRDTs already in Apple Notes?

@amoghyermalkar5502 8 ай бұрын

That's a great question but no way of finding this out :( let me know if you do though!

@bojidarmarinovbg 4 жыл бұрын

1:06:42 isn't it possible to encode/decode individual UTF8 characters without storing length alongside them? Referring to the Wikipedia article, en.wikipedia.org/wiki/UTF-8#Encoding, it should be possible to find the number of bytes in the variable-width character by looking only at the significant ones in the first byte.

@Ondra011 4 жыл бұрын

sure, but you're also interested in the actual range that the edit added/removed.

@kleppmann 4 жыл бұрын

That is sufficient for splitting the byte sequence into code points, but a unicode "character" (technically, a grapheme cluster) may consist of multiple code points. For example, a letter plus an accent, or an emoji plus a skintone modifier. Storing the length separately allows us to encode such situations where a sequence of multiple code points get treated as a single character.