Advanced Internals

How Jazz works under the hood: raw tables, row histories, sync, the query pipeline, and the browser architecture.

This page describes Jazz's internal architecture. You do not need any of this to use Jazz, but it is helpful if you are debugging, reasoning about performance, or understanding why the system behaves the way it does.

Data model

Raw tables plus engine-managed fields

Jazz stays table-first all the way down.

Your schema defines normal application columns such as title, done, and projectId. Under the hood, the engine also tracks a small set of reserved _jazz_* columns that explain how each row behaves over time, such as:

  • a stable row id
  • the branch view the row belongs to
  • the current row-version id
  • ancestry pointers to earlier row versions
  • visibility state
  • confirmed durability tier
  • delete markers
  • engine/user metadata

The important physical fact is that Jazz stores one flat row_format row containing both the user columns and the reserved engine columns. Some Rust types still expose the user-column slice separately for convenience, but that is just a decoded view rather than a different storage model.
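As a rough sketch, the flat row can be pictured as one record whose engine columns share a reserved prefix. The specific `_jazz_*` field names below are illustrative assumptions, not the real column set:

```typescript
// Illustrative flat-row shape: user columns and reserved engine columns
// live side by side in one record. Field names are assumptions.
interface FlatRow {
  _jazz_id: string;              // stable row id
  _jazz_branch: string;          // branch view the row belongs to
  _jazz_version: string;         // current row-version id
  _jazz_parent: string | null;   // ancestry pointer to the prior version
  _jazz_deleted: boolean;        // delete marker
  [userColumn: string]: unknown; // title, done, projectId, ...
}

// The "user-column slice" some Rust types expose is just a projection
// over this one flat row, not a second storage model.
function userColumns(row: FlatRow): Record<string, unknown> {
  return Object.fromEntries(
    Object.entries(row).filter(([key]) => !key.startsWith("_jazz_"))
  );
}
```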

Visible entries and row histories

Each logical row has two important storage shapes behind it:

  • a visible entry for current reads
  • a row history containing every stored row version

Ordinary queries read the visible entry first. History is what makes replay, reconnect, branching, and future historical queries possible.

The simplest picture is:

todos
  visible: (branch, row_id) -> current winner for that branch view
  history: (row_id, version_id) -> row versions over time

This is why Jazz can feel like "just tables" at the app layer while still keeping rich local-first history underneath.

Both storage shapes are flat rows:

  • history rows use reserved _jazz_* columns plus the user columns
  • visible rows use a slightly larger _jazz_* prefix plus the same user columns

Automatic column indexing

Every column in every table gets a single-column index automatically. There is no manual index management.

The _id index for each table doubles as the authoritative row manifest, so discovering all rows in a table is just an _id index scan.

This is a deliberate trade-off: local-first databases are usually small enough that automatic indexing is worth the storage overhead, and it guarantees that ordinary where(...) filters do not fall back to surprise table scans.
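A single-column equality index can be pictured as a map from column value to matching row ids. This toy version is not the real on-disk layout, but it shows why an equality where(...) filter is a point lookup rather than a scan:

```typescript
// Toy single-column index: column value -> set of matching row ids.
type RowId = string;

class ColumnIndex {
  private entries = new Map<unknown, Set<RowId>>();

  insert(value: unknown, rowId: RowId): void {
    let ids = this.entries.get(value);
    if (!ids) this.entries.set(value, (ids = new Set()));
    ids.add(rowId);
  }

  // An equality filter becomes a point lookup, never a table scan.
  lookup(value: unknown): RowId[] {
    return Array.from(this.entries.get(value) ?? []);
  }
}
```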

Deletion

The user-facing delete API performs a soft delete. The row is preserved in history, but it disappears from ordinary live queries.

Internally, the row's visible entry is removed from the live _id index but remains addressable through deleted-row paths such as _id_deleted.

A hard delete mode also exists at the storage layer, but it is not currently exposed as the normal app-facing API.
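A toy model of the soft-delete bookkeeping, assuming simple Map-backed indices (the real structures differ): the row leaves the live _id index, moves to the _id_deleted path, and history is untouched.

```typescript
// Toy soft delete: remove from the live index, keep the row addressable
// under the deleted-row path. History rows are never touched here.
type Row = Record<string, unknown>;

function softDelete(
  liveIndex: Map<string, Row>,    // stands in for the live _id index
  deletedIndex: Map<string, Row>, // stands in for the _id_deleted path
  rowId: string
): boolean {
  const row = liveIndex.get(rowId);
  if (!row) return false;
  liveIndex.delete(rowId);                                  // gone from live queries
  deletedIndex.set(rowId, { ...row, _jazz_deleted: true }); // still addressable
  return true;
}
```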

Row history and truncation

Row history is append-only by default. Every write creates a new row version and keeps older versions available for replay and reconciliation.

There is a low-level truncation path that can drop older ancestry while preserving the current visible state, but it is not a normal application-facing feature yet.

Monotonic direct-write ordering

Each runtime instance maintains a small monotonic clock for direct writes. New row versions created by that runtime get strictly increasing local timestamps, which makes deterministic last-writer-wins ordering straightforward within a single device or process.
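A minimal sketch of such a clock: it follows the wall clock when it can, but never repeats or decreases, even if the wall clock stalls or steps backwards.

```typescript
// Minimal monotonic clock for direct writes: timestamps are strictly
// increasing within one runtime instance, regardless of wall-clock behavior.
class MonotonicClock {
  private last = 0;

  now(wallClockMs: number = Date.now()): number {
    this.last = Math.max(this.last + 1, wallClockMs);
    return this.last;
  }
}
```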

Cold start

On startup, Jazz loads indices first rather than eagerly decoding every row. Row content is then loaded on demand as queries reference it.

The result is that cold-start cost is much closer to "index size" than "total stored data size."

Browser architecture

Dual-runtime model

In the browser, Jazz runs two runtime instances:

  • Main thread — an in-memory runtime that serves UI-facing reads and writes immediately
  • Dedicated worker — a persistent runtime backed by OPFS (Origin Private File System) that owns durable storage and upstream server sync

The main thread treats the worker as its upstream peer. Writes apply to the in-memory runtime immediately, then sync to the worker via postMessage. The worker persists them to OPFS and forwards them to the next sync tier. Incoming server updates flow the reverse path: server -> worker -> main thread -> your UI callback.

Main thread (in-memory runtime)
  ↕ postMessage
Dedicated worker (persistent OPFS runtime)
  ↕ HTTP/SSE
Edge/global server runtime

With driver: { type: "memory" }, the worker and OPFS are skipped entirely, and the main-thread runtime syncs directly with the server.
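The relay shape can be sketched with a toy runtime chain. Names here are illustrative, and the real forwarding hop is asynchronous postMessage rather than a direct call:

```typescript
// Toy model of the write path: each runtime applies a write locally,
// then forwards it to its upstream peer.
type Write = { rowId: string; columns: Record<string, unknown> };

class ToyRuntime {
  readonly applied: Write[] = [];
  constructor(private upstream?: ToyRuntime) {}

  apply(write: Write): void {
    this.applied.push(write);    // local state first: UI reads see it immediately
    this.upstream?.apply(write); // then the write flows up the sync chain
  }
}

const server = new ToyRuntime();       // edge/global server runtime
const worker = new ToyRuntime(server); // persistent OPFS runtime
const main = new ToyRuntime(worker);   // in-memory UI-facing runtime
main.apply({ rowId: "r1", columns: { title: "Buy milk" } });
```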

OPFS crash safety

The OPFS storage engine uses checkpoint-based persistence with two superblock slots (A/B). Each checkpoint writes dirty pages, flushes, then swaps the active superblock. On reopen after a crash or torn write, the highest valid superblock generation wins, recovering to the last complete checkpoint.
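The recovery rule can be sketched as a pure function over the two slots. Checksum validation is elided; "valid" is assumed to come from a per-slot integrity check:

```typescript
// A/B superblock recovery sketch: on reopen, pick the valid slot with the
// highest generation; a torn write leaves one slot invalid and loses.
interface Superblock {
  generation: number;
  valid: boolean;
}

function pickActive(a: Superblock, b: Superblock): Superblock | null {
  const candidates = [a, b].filter((s) => s.valid);
  if (candidates.length === 0) return null; // no complete checkpoint survives
  return candidates.reduce((x, y) => (y.generation > x.generation ? y : x));
}
```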

Tab coordination

Jazz uses the Web Locks API (navigator.locks) to elect a single tab as the storage leader; other tabs route through it via BroadcastChannel. When a leader tab closes, the browser releases the lock and the first follower to acquire it becomes the new leader.
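The election logic can be sketched independently of the browser: whichever tab's request for a well-known lock is granted first leads, and the lock is held until that tab goes away. Here `requestLock` stands in for `navigator.locks.request`, and the lock name is illustrative:

```typescript
// Leader election sketch over an abstract lock API shaped like
// navigator.locks.request(name, callback): the callback runs while the
// lock is held, and the browser queues other tabs' requests behind it.
type LockRequest = (name: string, cb: () => Promise<void>) => Promise<void>;

function electLeader(requestLock: LockRequest, onLeader: () => void): Promise<void> {
  return requestLock("jazz-storage-leader", async () => {
    onLeader(); // this tab now owns durable storage and upstream sync
    await new Promise<void>(() => {}); // hold the lock until the tab closes
  });
}
```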

React Native

React Native uses a separate native runtime adapter with no web worker or OPFS path. It still uses the same table-first runtime model, but local persistence is provided by the native embedded backend rather than by browser APIs.

Query engine

Execution pipeline

Queries compile into a graph of processing nodes:

IndexScan → [Union] → Materialize → [PolicyFilter]
  → [ArraySubquery] → [Filter] → [Sort] → [LimitOffset]
  → [Project] → Output

Nodes in brackets are only present when the query requires them. The graph processes deltas incrementally, which means that when data changes, only dirty nodes re-evaluate. That is what makes live subscriptions efficient: a single row change does not require re-running the whole query.

Materialization

Materialize is where candidate row ids turn back into rows.

It typically:

  1. looks up the visible entry for the relevant branch
  2. falls back to row history only when the query needs an older settled winner
  3. decodes or reprojects the flat row, dropping the reserved engine columns before returning app-facing values
  4. emits row-level deltas to the downstream graph

This is why the visible region matters so much: most current reads never need to reconstruct a row from full history.

One-shot queries

db.all() and db.one() are implemented as "create a temporary subscription, wait for the first durability-qualified snapshot, then auto-unsubscribe." They share the same reactive machinery as live subscriptions, which is why they participate in durability-tier gating and lens transforms.
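That pattern can be sketched over an assumed subscribe() shape; the real reactive machinery also handles errors and durability-tier gating:

```typescript
// One-shot read sketch: subscribe, resolve on the first snapshot, then
// tear the subscription down. The Subscribe type is an assumption
// standing in for the real subscription API.
type Unsubscribe = () => void;
type Subscribe<T> = (onSnapshot: (rows: T[]) => void) => Unsubscribe;

function oneShot<T>(subscribe: Subscribe<T>): Promise<T[]> {
  return new Promise((resolve) => {
    let settled = false;
    const unsubscribe = subscribe((rows) => {
      if (settled) return; // ignore anything after the first snapshot
      settled = true;
      resolve(rows);
      queueMicrotask(() => unsubscribe()); // defer: may fire before assignment
    });
  });
}
```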

Include performance

Each outer row in an include() / array subquery gets its own compiled sub-graph. With 1,000 outer rows, that is 1,000 sub-graphs. Any change to the inner table re-settles all instances. This is correct and simple, but worth remembering when including across very large result sets.

Sync protocol

Transport

Jazz uses a single WebSocket sync transport plus a small HTTP surface for health and admin reads.

  • Sync: GET /apps/<appId>/ws upgrades to a WebSocket carrying the typed sync protocol.
  • Admin: GET /apps/<appId>/schemas, GET /apps/<appId>/schema/:hash, and POST /apps/<appId>/admin/... handle schema and permissions publication/read flows.
  • Health: GET /health.

Client identity

Each client generates and persists a stable ClientId. On reconnect with the same id, the server can treat it as the same logical peer rather than as a brand-new client with no prior state.

Reconnection

The TypeScript client uses exponential backoff with jitter. On reconnect, active query subscriptions are replayed as anti-entropy: the server re-evaluates them and resends any rows the client still needs.
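The delay schedule can be sketched as "full jitter" backoff: each retry waits a uniformly random delay below an exponentially growing, capped ceiling. The constants here are assumptions, not the client's actual tuning:

```typescript
// Exponential backoff with full jitter: retry n waits a random delay in
// [0, min(cap, base * 2^n)). Randomness is injectable for testing.
function backoffDelayMs(
  attempt: number, // 0 for the first retry
  baseMs = 250,
  capMs = 30_000,
  random: () => number = Math.random
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return random() * ceiling;
}
```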

Trust model and client roles

Sync is asymmetric:

  • Upward (client -> server): row versions, row-state changes, and catalogue updates are pushed toward trusted servers
  • Downward (server -> client): only rows matching the client's active query subscriptions are sent

Each client connection has a role that determines how writes are routed:

  • User: writes are queued for permission policy evaluation before apply
  • Admin: writes are applied directly, with no permission check
  • Peer: writes are applied directly; used for trusted runtime-to-runtime sync

Frontend clients usually authenticate as User. Backend services with a backend secret authenticate as Admin or Peer.

Schema evolution

Lenses

Migrations in Jazz produce lenses — bidirectional transformations between schema versions. When jazz-tools migrations diff two schemas, they generate a lens with declarative operations such as adding, removing, or renaming columns and tables.

At query time, Jazz can use lens paths to read older stored data through the current schema. At write time, updates to older rows are written back into the current schema branch via copy-on-write.

Catalogue sync

Schemas and lenses travel through a separate catalogue lane, not through the normal user-row history path. Clients publish catalogue entries, servers discover them lazily, and query execution uses that catalogue state to resolve schema context on demand.

Durability signals

Jazz separates two durability questions:

  • QuerySettled gates first read delivery and answers "Has the query result settled at tier T?"
  • Write tier confirmation gates .wait({ tier }) promise completion and answers "Has this write been confirmed at tier T?"

Both use the same tier lattice (local < edge < global), but they answer different questions. A query's first callback is held until QuerySettled reaches the requested tier. A .wait({ tier }) promise resolves when the requested tier confirms the write.

The read durability tier only gates the first delivery of a subscription. After the initial snapshot arrives at the requested tier, later updates are delivered as they reach the local node. That means tier: "global" gives you a globally settled first snapshot, not globally gated delivery forever after.
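The lattice comparison itself is simple enough to sketch directly; the function name is an assumption, but the ordering local < edge < global is as described above:

```typescript
// Tier lattice sketch: a confirmation at a higher tier satisfies any
// request for a lower (or equal) tier.
const tierRank = { local: 0, edge: 1, global: 2 } as const;
type Tier = keyof typeof tierRank;

function satisfies(confirmed: Tier, requested: Tier): boolean {
  return tierRank[confirmed] >= tierRank[requested];
}
```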
