Build a simple real-time collaborative text-editor with CRDT (Conflict-free replicated data types)

We’re all using tools like Google Docs or Notion and those tools give us the possibility to edit the same document from different devices simultaneously. But have you wondered how these tools make everyone’s edits in sync without conflict? The answer to this question is a powerful concept called Conflict-free replicated data types (CRDT).

CRDT is a data structure designed specifically for distributed systems to handle concurrent updates without requiring constant synchronization or central coordination. The system employing CRDT will achieve eventual consistency, meaning that it will guarantee the states of all the nodes in the system will be converged (the same) after a certain time if there is no more update.

Synchronization and Eventual Consistency

With synchronization, here it means that for every update in the shared document that we make, we first need to acquire the shared lock which stops all other (update) activities in other nodes, make some updates, and then we release the lock. Other nodes could make some updates after the lock is released, and there is no concurrent update in this case. However, this is not ideal for the distributed collaborative text-editor use case, since if just one person can edit the document at a time then the word collaborative would become invalid. With CRDT, the shared document could be updated from multiple nodes at the same time without locking, in the end, they all resolve to the same state, eventually consistent, if there is no further update.

Formally, Let’s define the distributed system with a set of processes (nodes) \(P = \{p_1, p_2, p_3,…,p_n\}\) that operates on replicated data objects (the shared object between different processes). For a given data object \(x\), let’s \(x_i(t)\) denotes the value of the data object \(x\) at the time \(t\) stored at the process \(p_i\). We can express the eventual consistency property of a system as:

\(\normalsize{\begin{equation} \forall i,j \in \{1,2,…,n\}, \lim_{t \to \infty} x_i(t) = \lim_{t \to \infty} x_j(t) = \bigsqcup_{u \in U} u \end{equation}}\)

Where:

\(i, j\) represent different nodes in the system.
\(x_i(t)\) is the state of node \(i\) at time \(t\).
\(x_j(t)\) is the state of node \(j\) at time \(t\).
\(U\) is the set of all updates.
\(\bigsqcup\) is a special “join” operation that combines all the updates.

This means that, after some certain time has passed in the system, all the updates from different nodes in the system would resolve to the same final state, regardless of the order in which they’re applied. The system is eventually consistent if these properties hold:

Key properties are:

Eventual delivery: For any update \(u\) that initiated at the process \(p_i\) at time \(t\), there exists time \(t’ > t\) such that all non-faulty process \(p_j \in P\) have received the update \(u\) by the time \(t’\).
Convergence: If there is no update for the object \(x\) after time \(t_0\), then there exists time \(t_f\), such that: \(\forall {i, j}\in \{1, 2,…,n\} : x_i(t) = x_j(t)\quad \forall t \ge t_f\).
Deterministic resolution: Given a set of concurrent updates \(U = \{u_1, u_2,…,u_m\}\) to a data object, all processes must resolve conflicts in a deterministic manner using some conflict resolution function: \(F: U \rightarrow x’\), such that: \(\forall i, j \in \{1, 2,…n\}: F_i(U) = F_j(U)\)

Let’s describe these properties in simpler words, the first property, eventual delivery, means that for any update in the system, all functional processes will receive the update at some point in time, this can be after some short or long period, but eventually, this update will be delivered to all processes.

For the second property, convergence means if no further updates are happening across all processes, then the state of all the processes will be ‘converged’ (the same) after some time.

The third property, deterministic resolution, means that if multiple updates from different processes occur at a particular time, some resolution function must exist that takes in all the updates across all nodes, and if this function is applied to any of the processes, it will yield the same value.

Eventual consistency could be achieved in different approaches, for CRDTs, the deterministic resolution function has some certain mathematical properties making them eventually consistent without any consensus protocol or central coordination. Formally, it can be defined as a tuple \((S, \sqsubseteq,\sqcup)\) where:

\(S\) is the set of possible states.
\(\sqsubseteq\) is a partial order relation on \(S\)
\(\sqcup: S \times S \rightarrow S\) is the join operation (the deterministic resolution function)

Then, these fundamental mathematical properties must hold in the deterministic resolution function to make CRDTs eventually consistent:

Commutativity: For any \(x, y \in S: x \sqcup y = y \sqcup x\)
Associativity: For any \(x, y, z \in S: (x \sqcup y) \sqcup z = x \sqcup (y \sqcup z)\)
Idempotence: For any \(x \in S: x \sqcup x = x\)

The type of CRDTs we’re talking about today is the position-based CRDTs (Sequence CRDTs) that are used in text editing. How does the resolution function in position-based CRDTs achieve the mathematical properties above? First, we still need to formalize it a bit:

Let \(P\) be the set of positions with a total order relation \(<_P\).
Let \(V\) be the set of values (characters).
Let \(S\) be the set of states, where each of the states is a set of pair \((p, v)\) where \(p \in P\) and \(v \in V\).

For position-based CRDTs to function correctly, the position set \(P\) must satisfy these properties:

Uniqueness: Position identifiers are unique across all processes.
Total ordering: \(\forall p_1, p_2 \in P, p_1 \ne p_2 \Rightarrow (p_1 <_P p_2) \vee (p_2 <_P p_1)\)
Position Density: \(\forall p_1, p_2 \in P, p_1 <_P p_2 \Rightarrow \exists p_3 \in P: p_1 <_P p_3 <_P p_2\)

The first property, total ordering, if we consider 2 different positions in the set of all positions, then one position must come before another, 2 different positions can never be identical (it’s the property that the resolution function must satisfy), \(<_P\) indicates the element from the left comes before the element on the right in order.

Density means that if we take any 2 different options in the set of positions, there always exists some position that’s positioned between those two.

Uniqueness means that by taking all the position identifiers from all the processes in the system, there will be no single duplicate position identifier entry.

We’ll connect the dots between the aforementioned mathematical properties with position-based CRDT properties in a moment, but first, let’s define the data structure that we will use for character positioning in text editing CRDTs:

Let’s define our CRDT state \(S\) as a set of character elements:

\(S \subseteq \{(id, pos, val, tomb) \mid id \in \mathbb{N}, pos \in P, val \in \text{Char}, tomb \in \{\text{true}, \text{false}\}\}\), where:

\(id\) is the unique identifier.
\(pos\) is the position from a total ordered set \(P\).
\(val\) is the character value.
\(tomb\) is the boolean value that indicates whether a character is deleted.

Position Uniqueness

In a proper position-based CRDT implementation, the position identifier is always unique, there must be no 2 positions with the same identifier. To ensure that, the \(pos\) from the total ordered set that we defined early is a tuple of \((index, site, clock)\), where:

The \(index\) is an array for ordering within the document.
The \(site\) is a string to differentiate between different sites (processes).
The \(clock\) value to order operations from the same site.

Technically, in the implementation, we can achieve position uniqueness through a combination of:

Unique site identifiers
Monotonically increasing logical clocks
UUID generation for character IDs

interface Position {
  index: number[], // location in document (array of fractional indexing)
  site: string, // unique site identifier
  clock: number // logical clock that is monotonically increase for each new update operation
}


interface Character {
   id: string,
   value: string,
   position: Position,
}

private incrementClock(): number {
   return ++this.clock;  
}

insert(value: string, prevId: string | null = null, nextId: string | null = null): Character {
   const position: Position = this.generatePositionBetween(
       prevId ? this.characters.get(prevId)?.position || null : null,
       nextId ? this.characters.get(nextId)?.position || null : null
   ); // for every new position inserted, the generatePositionBetween function always increase the logical clock of the document for the current site monotonically.
   
   const char: Character = {
       id: uuidv4(),
       value,
       position
   };
   
   this.characters.set(char.id, char);
   return char;
}

With that, if there are 2 concurrent inserted concurrently, then they still yield 2 different positions because:

Either they come from different sites (processes, users)
Or if they come from the same site, then their \(clock\) value will be different.

Total ordering

Now, let’s take a look at how we could implement the total ordering property of the position-based CRDT:

 private comparePositions(pos1: Position, pos2: Position): number {
      // compare index array element by element 
      const len = Math.min(pos1.index.length, pos2.index.length);
      for (let i = 0; i < len; i++) {
          if (pos1.index[i] !== pos2.index[i]) {
              return pos1.index[i] - pos2.index[i];
          }
      }
      
      // if common prefix is identical, compare array lengths
      if (pos1.index.length !== pos2.index.length) {
          return pos1.index.length - pos2.index.length;
      }
      // if array lengths are identical, then compare site IDs
      if (pos1.site !== pos2.site) {
          return pos1.site.localeCompare(pos2.site);
      }
      // if site IDs are identical, then compare the clock values
      return pos1.clock - pos2.clock;
}

Here is a concrete example:

Site A inserts ‘x’ at the position [10] with the clock 1: {index: 10, site: 'A', clock: 1}
Site B inserts ‘y’ at the position [10] with the clock 1: {index: 10, site: 'B', clock: 1}

With our implementation for \(comparePositions(p1, p2)\) must return some values to indicate either the Site A operation comes before Site B, and vice versa. In our Typescript implementation, the comparison will:

First, check the index arrays, here they are equal ([10] = [10]).
Then check the array lengths, here they are equal (1 = 1)
Then comparing the site IDs: “A” comes before “B” lexicographically, so the comparison function should return a negative value, indicating Site A’s update comes before Site B.
This means that the character ‘x’ from site A must appear before the ‘y’ from site B.

This will ensure the total ordering property, for any two distinct positions in the document, there must be one position that is “less than” the other.

Position density

In our CRDT implementation, the position density is achieved by a recursive algorithm that generates position between existing ones:

private generatePositionBetween(
    prev: Position | null,
    next: Position | null,
    newPos: number[] = []
): Position {
    // Case 1: Empty document or inserting at beginning or end
    if (!prev && !next) {
        return { index: [this.BASE], site: this.siteId, clock: this.incrementClock() };
    }
    if (!prev) {
        return {
            index: this.generatePositionBefore(next!.index),
            site: this.siteId,
            clock: this.incrementClock()
        };
    }
    if (!next) {
        return {
            index: this.generatePositionAfter(prev.index),
            site: this.siteId,
            clock: this.incrementClock()
        };
    }

    // Case 2: Inserting between two positions
    const prevIndex = prev.index;
    const nextIndex = next.index;
    let head = newPos;

    // Find common prefix
    let i = 0;
    while (i < prevIndex.length && i < nextIndex.length) {
        if (prevIndex[i] !== nextIndex[i]) {
            break;
        }
        head.push(prevIndex[i]);
        i++;
    }

    const prevDigit = i < prevIndex.length ? prevIndex[i] : 0;
    const nextDigit = i < nextIndex.length ? nextIndex[i] : this.BASE;
    const difference = nextDigit - prevDigit;

    if (difference > 1) {
        // If there's room, place the new position halfway between
        head.push(prevDigit + Math.floor(difference / 2));
        return {
            index: head,
            site: this.siteId,
            clock: this.incrementClock()
        };
    } else {
        // Otherwise, extend positions and try again
        head.push(prevDigit);
        return this.generatePositionBetween(
            { ...prev, index: prevIndex.slice(i + 1) },
            { ...next, index: nextIndex.slice(i + 1) },
            head
        );
    }
}

This algorithm will guarantee that we can always insert a new position between 2 different positions. For one simple case, when there is some room between those 2 positions then basically we add a new index to the middle, for example, if the index positions are now [2] and [6], then we will push the [4] to the head and return. However, if the difference is only one unit, then we cannot have another natural number to fill a spot in between, at this point, we need to extend the position index array.

First, we find the common prefix between the two positions, for example [2, 2, 3] and [2, 2, 4] then the common prefix is [2, 2], then we push the common prefix to the head, and recursively extend the position index array, the prevIndex.slice(i + 1) or nextIndex.slice(i + 1) will eventually be empty ([]) and at the end the algorithm will terminate. We can look at this example in more detail to see how it works:

First call to generatePositionBetween([2, 2, 3], [2, 2, 4], []):

Common prefix: [2, 2]
First difference: prevDigit = 3, nextDigit = 4
difference = 1
Add prevDigit to head: head = [2, 2, 3]
Recursive call with:
- { ...prev, index: []} (remaining part of prev)
- { ...next, index: []} (remaining part of next)
- head = [2, 2, 3]

Second call to generatePositionBetween({...prev, index: []}, {...next, index: []}, [2, 2, 3]):

prevIndex and nextIndex are empty, but we are NOT hitting the case !prev && !next
Since next is not null but has an empty index, we hit the case !next is false
Since prev is not null but has an empty index, we hit the case !prev is false
We continue with the algorithm but with empty arrays

In this case, since we defined prevDigit = i < prevIndex.length ? prevIndex[i] : 0, we get prevDigit = 0 (empty array case)

And since nextDigit = i < nextIndex.length ? nextIndex[i] : this.BASE, we get nextDigit = 32 (this.BASE)

The difference is 32 – 0 = 32, which is > 1, so we take the first branch:

head.push(prevDigit + Math.floor(difference / 2)); // head.push(0 + 16) = [2, 2, 3, 16]

So the final position would be [2, 2, 3, 16].

Connect CRDT resolution function to necessary mathematical properties

We’ve demonstrated 3 properties that positioned-based CRDT implementation satisfies: uniqueness, total ordering, and position density. But how do they connect to the mathematical properties that the deterministic resolution function of CRDT must satisfy: commutativity, associativity, and idempotence? The important detail here will be in the merge operation, which uses the \(resolve\) function, which takes 2 different document states and then combines all updates from them in a deterministic order, which is formally defined as follows:

\(\normalsize{S_1 \sqcup S_2 = \{ c \mid c \in S_1 \setminus S_2 \} \cup \{ c \mid c \in S_2 \setminus S_1 \} \cup \{ \text{resolve}(c_1, c_2) \mid c_1 \in S_1, c_2 \in S_2, \text{ and } id(c_1) = id(c_2) \}}\)

This means that for 2 different states \(S_1\) and \(S_2\), the merging operation will combine into the final merged states which contain the characters coming from the \(S_1\), or the characters coming from the \(S_2\) and if 2 characters from these 2 documents have identical identifiers, then using the \(resolve(c_1, c_2)\) function to deal with the conflict, and only either \(c_1\) or \(c_2\) will be in the final state but not both! Since we always have unique position identifiers (our \(pos\) data structure and the \(comparePositions\) help us in this regard).

Here is the implementation for our resolve function:

merge(other: DocumentCRDT): void {
    for (const [id, char] of other.characters) {
       if (!this.characters.has(id)) {
          this.characters.set(id, char);
       } else {
          const existing = this.characters.get(id)!;
          if (this.comparePositions(char.position, existing.position) > 0) {
            this.characters.set(id, char);
          }
       }
    }
}

At this point, we have all the important details of the position-based implementation that satisfy the mathematical properties for an eventual consistency system. Now with the merge function, we can see:

Commutativity between 2 documents \(doc_1\) and \(doc_2\): \(merge(doc_1, doc_2) = merge(doc_2, doc_1)\). — since the total ordering guarantees that either \(pos(c_1) >_P pos(c_2)\) or \(pos(c_2) >_P pos(c_1)\) when positions differ, the \(merge\) will make the same decision regardless of the argument order.
Associativity between 3 different documents: \(merge(doc_1, merge(doc_2, doc_3)) = merge(merge(doc_1, doc_2), doc_3)\). — the relative ordering of characters is preserved regardless of operating grouping.
Idempotence: \(merge(doc_1, doc_1) = doc_1\) — for the characters with the same IDs, the \(merge\) function comparing the character with itself, resulting in no new character added.

The complete position-based CRDT implementation can be found here.

5 3 votes

Article Rating

Facebook Tweet LinkedIn

Build a simple real-time collaborative text-editor with CRDT (Conflict-free replicated data types)14 min read

Synchronization and Eventual Consistency

Position Uniqueness

Total ordering

Position density

Connect CRDT resolution function to necessary mathematical properties