CollaborationApr 9, 20267 min read

Real-Time Collaboration in Database Design: How CRDTs Make It Possible

Real-time collaboration sounds simple in theory. In practice, it's one of the hardest problems in distributed systems. The technology that solves it elegantly for database design is called CRDTs — Conflict-free Replicated Data Types. Here's how they work and why they matter.

Real-time collaboration sounds simple in theory: two people edit the same document at the same time, and both see each other's changes instantly. In practice, it's one of the hardest problems in distributed systems. What happens when two developers add a column to the same table simultaneously? What if one person deletes a table while another is adding relationships to it? What if someone goes offline, makes changes, and comes back?

These aren't hypothetical edge cases. They're everyday scenarios when a team works on a database schema together. The technology that solves them elegantly is called CRDTs — Conflict-free Replicated Data Types. Here's how they work and why they matter for database design collaboration.

The Problem with Naive Collaboration

The simplest approach to collaboration is "last write wins." Two people make changes, and whoever saves last overwrites the other's work. This is what happens when you share a file on Dropbox or Google Drive without real-time editing — the last person to save "wins," and the other person's changes are lost or relegated to a conflict file.

A slightly better approach is operational transformation (OT), used by Google Docs. OT tracks operations (insert character at position 5, delete character at position 10) and transforms them when conflicts occur. If Alice inserts a character at position 5 and Bob inserts one at position 10, OT adjusts Bob's position to account for Alice's insertion. This works well for text documents.

But database schemas aren't text documents. They're structured objects with complex relationships. A schema operation isn't "insert character at position 5" — it's "add column email of type VARCHAR(255) with a unique constraint to table users." The relationships between operations are semantic, not positional. OT's text-oriented approach doesn't translate well to structured data.

Enter CRDTs

CRDTs take a fundamentally different approach. Instead of transforming operations after conflicts, they design data structures that are mathematically guaranteed to converge — meaning that no matter what order operations arrive in, all peers end up with the same state. No conflict resolution needed, because conflicts literally cannot occur.

This guarantee comes from the mathematical properties of the data structures themselves. A CRDT is designed so that operations are commutative (the order doesn't matter), associative (grouping doesn't matter), and idempotent (applying the same operation twice has no additional effect).

How this works in practice

Imagine Alice and Bob are both editing a database schema. Alice adds a phone column to the users table. Bob adds an avatar_url column to the same table at the same time. Neither knows about the other's change yet.

With CRDTs, both operations are represented as independent additions to a shared data structure. When Alice's change reaches Bob and Bob's change reaches Alice, both end up with a users table that has both phone and avatar_url — regardless of which change arrived first. The CRDT guarantees convergence.

What about deletions? If Alice deletes the profiles table while Bob adds a column to it, the CRDT resolves this deterministically. The typical resolution is that deletions take precedence (the table is deleted, and Bob's column addition is silently discarded) or additions take precedence (the table survives with the new column, and Alice's deletion is reversed). The specific policy is a design choice, but the key point is that all peers reach the same state automatically.

Yjs: The CRDT Framework

ER Flow uses Yjs, one of the most mature and battle-tested CRDT implementations available. Yjs provides shared data types (maps, arrays, text) that sync automatically across peers, offline support (changes are queued and synced when connectivity returns), awareness protocol (cursor positions and presence information), and an efficient binary encoding that minimizes network bandwidth.

Yjs has been adopted by major collaborative tools and has proven its reliability at scale. For database schema collaboration, the relevant Yjs data types are Y.Map for table definitions (column names, types, constraints), Y.Array for ordered lists (column order within a table), and nested Y.Maps for complex structures (a schema is a map of tables, each table is a map of columns, each column is a map of properties).

What This Means for Database Design Teams

True simultaneous editing

Multiple team members can work on different parts of the schema at the same time without coordination. One developer adds the authentication tables while another designs the billing schema. Changes appear on everyone's canvas in real-time — typically within 50-100 milliseconds on a decent connection.

Offline resilience

If a developer loses connectivity (common on laptops, in coffee shops, or on trains), they can continue working on the schema. When connectivity returns, Yjs automatically syncs their offline changes with the server and other peers. The CRDT guarantees that the merge is consistent, regardless of how many changes were made offline or by other people while they were disconnected.

No coordination overhead

Without CRDTs, teams need to coordinate who's editing what: "I'm working on the users table, don't touch it." This implicit locking slows down collaboration and creates bottlenecks. With CRDTs, there's no need for coordination — everyone can edit anything, and the system handles convergence automatically.

Cursor presence and awareness

Beyond data synchronization, Yjs's awareness protocol powers real-time cursor tracking. You can see where other team members are on the canvas, what they're selecting, and what they're editing. This shared awareness reduces confusion and makes remote collaboration feel as natural as working side by side.

The Technical Details (For the Curious)

Conflict scenarios and resolution

Here's how ER Flow handles specific conflict scenarios with CRDTs.

Two people edit the same column simultaneously. If Alice changes the column type to TEXT while Bob changes it to VARCHAR(500), the CRDT resolves this using a deterministic rule (typically the operation with the higher logical timestamp wins). Both peers converge to the same value. In practice, this scenario is rare because awareness features show what others are editing, naturally preventing simultaneous edits to the same property.

One person deletes a table while another modifies it. The deletion wins — the table is removed, and the modification is discarded. This follows the principle of least surprise: if someone explicitly deleted the table, preserving it because of a concurrent edit would be confusing.

Two people create tables with the same name simultaneously. Each table gets a unique internal ID, so name collisions at the CRDT level don't occur. The UI can flag the duplicate name for the users to resolve.

Performance characteristics

Yjs is highly optimized for performance. The binary encoding is compact — a typical schema operation (adding a column) produces a sync message of roughly 50-200 bytes. Initial state synchronization for a 100-table schema takes under 100 milliseconds. Memory usage is proportional to document size, not history length — Yjs garbage-collects merged operations.

For database schema collaboration, where the data structures are relatively small (compared to, say, a shared text document with millions of characters), Yjs's performance is more than adequate. Even schemas with hundreds of tables and thousands of columns sync near-instantaneously.

Beyond Real-Time Editing

CRDTs enable features that go beyond basic simultaneous editing.

Branching and merging schemas. Because CRDTs guarantee convergence, you can create independent "branches" of a schema (similar to Git branches), make changes independently, and merge them back — with the CRDT handling the convergence automatically. This opens up workflows like feature-branch schema design, where each developer experiments with schema changes independently and merges when ready.

Time travel. CRDT operations can be stored as a log, enabling playback of the schema's evolution over time. Want to see how the schema looked three weeks ago? Replay operations up to that timestamp.

Conflict-free imports. When importing a schema from SQL, the import operations are CRDT operations. This means importing into a schema that others are actively editing works correctly — the imported tables merge with live changes without conflicts.

Why This Matters

For teams accustomed to file-based schema design (passing around .sql files or draw.io diagrams), real-time CRDT-based collaboration is a step change. The database schema becomes a living document that the entire team can work on simultaneously, with the confidence that concurrent changes will always converge to a consistent state.

Combined with visual design, AI assistance, and migration generation, CRDT collaboration makes ER Flow the first database design tool that truly supports how modern teams work: distributed, asynchronous, and fast.

Fundamentals6 min read