Initializing a Yjs document with a common value

I'm working on a project that uses collaborative editing, and of course stumbled upon the awesome Yjs library. One thing that tripped me (and some other people) up in the beginning is how to initialize a Yjs document with the same data on all clients. I want to summarize this here so there is an easy reference for people in the future.

TLDR: Instead of inserting the initial text on every client, create the update once, keep a base64 representation of it, and apply that same update on all clients.

The wrong approach

When I tried applying an update with some text to the doc, it would always result in a duplicated string when another client connected. My code (client side) looked something like this:

const templateDocument = new Y.Doc();
const templateText = templateDocument.getText("text");
templateText.insert(0, "Initial document content");

As soon as a second client connects, this would result in:

Initial document content
Initial document content

The problem with this code is this line:

templateText.insert(0, "Initial document content");

This call produces a different update on each client, because each client runs it on its own Y.Doc at its own time. You can think of it as each update carrying a unique ID (in Yjs, every inserted item is identified by the client ID of the document that created it, plus a counter). If two updates have different IDs, they are treated as different updates, even though they contain the same string, and both are applied.

Doing something like this on the server also doesn't work:

app.get("/getDocument", (req, res) => {
  res.json({
    yjsUpdate: getUpdateBase64("Initial document content"),
  });
});

You will still get duplicated text, even though the update is now generated on the server and sent to all clients. That's because a new update is generated for every request, so each client receives a different update that just happens to contain the same string.

The right approach

The solution to the problem is to create the update only once, and then send a representation of that update to all clients.

One way to do this is to create the update when initializing a database entry, and to store the update in the database. For example, using Node.js:

// When creating the document
const templateDocument = new Y.Doc();
const templateText = templateDocument.getText("text");
templateText.insert(0, "Initial document content");
const buffer = Buffer.from(Y.encodeStateAsUpdate(templateDocument));
// Store this string in the database.
const base64 = buffer.toString("base64");
await storeDocumentInDatabase({ base64 });

// Then, when sending it to the client
app.get("/getDocument", async (req, res) => {
  const { base64 } = await getDocumentFromDatabase();
  res.json({
    base64,
  });
});

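// On the client:
// `base64` is the string returned by /getDocument.
// This package is just one example; there are various ways to decode base64.
import * as Y from "yjs";
import toUint8Array from "base64-to-uint8array";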

const templateDocument = new Y.Doc();
const templateText = templateDocument.getText("text");
const decodedUpdate = toUint8Array(base64);
Y.applyUpdate(templateDocument, decodedUpdate);
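If you would rather not pull in a package for the decoding step, one alternative is a small function built on `atob`. This is a sketch, assuming an environment where `atob` is available as a global (browsers, and Node.js 16 or newer); the function name `base64ToUint8Array` is my own.

```javascript
// Hypothetical helper: decodes a base64 string into a Uint8Array
// that can be passed to Y.applyUpdate.
function base64ToUint8Array(base64) {
  const binary = atob(base64); // one character per byte
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// "AQID" is the base64 encoding of the bytes 1, 2, 3.
const decoded = base64ToUint8Array("AQID");
console.log(Array.from(decoded)); // [ 1, 2, 3 ]
```

On the client you would then call `Y.applyUpdate(templateDocument, base64ToUint8Array(base64))` instead of using the imported function.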

If you're not using a database, you could generate the update once at startup:

// Runs exactly once per server process (assuming you are not using serverless)
const updateAsBase64 = getUpdateBase64("Initial document content");

app.get("/getDocument", (req, res) => {
  // Now we can reuse the same update every time
  res.json({
    yjsUpdate: updateAsBase64,
  });
});

Hope this helps!