Architecture

High-level design

patch-db is split into two layers that communicate over a transport boundary:

┌─────────────────────────────────────────────┐
│              TypeScript Client              │
│  PatchDB<T> ─ RxJS observables ─ watch$()   │
│         ▲                                   │
│         │  Update<T>[] (Dump | Revision)    │
│         │  over WebSocket / SSE / etc.      │
└─────────┼───────────────────────────────────┘
          │
┌─────────┼───────────────────────────────────┐
│         ▼                                   │
│              Rust Backend                   │
│  PatchDb ─ Store ─ Broadcast ─ Subscriber   │
│                                             │
│  ┌──────────┐  ┌───────────┐  ┌──────────┐  │
│  │ json-ptr │  │json-patch │  │ ciborium │  │
│  │ RFC 6901 │  │ RFC 6902  │  │ storage  │  │
│  └──────────┘  └───────────┘  └──────────┘  │
└─────────────────────────────────────────────┘

The Rust side owns the persistent state and produces patches. The TypeScript side consumes those patches and maintains a local mirror for reactive UI bindings. They are separate implementations of the same concepts (not WASM/FFI) — compatibility is maintained through shared RFC 6901/6902 semantics.

Project structure

patch-db/
├── core/                          # Core Rust crate — PatchDb, Store, typed wrappers
├── macro/                         # Procedural macro crate (derives HasModel)
├── macro-internals/               # Macro implementation details
├── util/                          # CLI tool (dump/load database files)
├── json-ptr/                      # RFC 6901 JSON Pointer implementation
├── json-patch/                    # RFC 6902 JSON Patch implementation
└── client/                        # TypeScript client library (RxJS-based)
    └── lib/
        ├── patch-db.ts            # PatchDB<T> class
        ├── json-patch-lib.ts      # Client-side patch application
        └── types.ts               # Revision, Dump, Update, PatchOp

Rust crates

core (crate name: patch-db)

The main database engine. Key types:

| Type | Role |
|------|------|
| PatchDb | Thread-safe async handle (clone to share). All reads/writes go through this. |
| TypedPatchDb<T> | Generic wrapper that enforces a schema type T via HasModel. |
| Store | Internal state container. File-backed with CBOR. Holds the current Value, revision counter, and Broadcast. |
| Dump | Snapshot: { id: u64, value: Value } |
| Revision | Incremental change: { id: u64, patch: DiffPatch } |
| DiffPatch | Newtype over json_patch::Patch with scoping, rebasing, and key-tracking methods. |
| DbWatch | Combines a Dump + Subscriber into a Stream of values. |
| TypedDbWatch<T> | Type-safe wrapper around DbWatch. |
| Subscriber | tokio::sync::mpsc::UnboundedReceiver<Revision>. |
| Broadcast | Fan-out dispatcher. Holds ScopedSenders that filter patches by JSON Pointer prefix. Automatically removes disconnected senders. |
| MutateResult<T, E> | Pairs a Result<T, E> with an optional Revision, allowing callers to check both the outcome and whether a patch was produced. |

Write path

caller
  │
  ▼
PatchDb::put / apply / apply_function / mutate
  │
  ▼
Store::apply(DiffPatch)
  ├─ Apply patch in-memory (with undo on failure)
  ├─ Serialize patch as CBOR, append to file
  ├─ Compress (rewrite snapshot) every 4096 revisions
  └─ Broadcast::send(Revision)
       └─ For each ScopedSender: scope patch to pointer, send if non-empty
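The final fan-out step can be illustrated with a small sketch. It is written in TypeScript for brevity and is not the crate's Rust code: scopePatch, broadcast, and the subscriber shape are hypothetical names. The sketch assumes a subscriber only receives operations at or below its pointer; operations that touch an ancestor of the pointer are simply dropped here, which the real Broadcast may handle differently.

```typescript
type Op =
  | { op: "add" | "replace"; path: string; value: unknown }
  | { op: "remove"; path: string };

type Revision = { id: number; patch: Op[] };

// Hypothetical subscriber shape: a JSON Pointer scope plus a delivery callback.
type ScopedSubscriber = { scope: string; deliver: (revision: Revision) => void };

// True when `path` equals `prefix` or lies beneath it (whole segments only).
function isAtOrBelow(path: string, prefix: string): boolean {
  return path === prefix || path.startsWith(prefix + "/");
}

// Keep only the operations under `scope`, with paths made relative to it.
function scopePatch(patch: Op[], scope: string): Op[] {
  return patch
    .filter((op) => isAtOrBelow(op.path, scope))
    .map((op) => ({ ...op, path: op.path.slice(scope.length) }));
}

// Send one revision to every subscriber whose scoped patch is non-empty.
function broadcast(revision: Revision, subscribers: ScopedSubscriber[]): void {
  for (const sub of subscribers) {
    const scoped = scopePatch(revision.patch, sub.scope);
    if (scoped.length > 0) {
      sub.deliver({ id: revision.id, patch: scoped });
    }
  }
}
```

A subscriber scoped to /a sees a replace at /a/b as a replace at /b, while a subscriber scoped to an unrelated pointer receives nothing for that revision.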

Read path

caller
  │
  ▼
PatchDb::dump / get / exists / keys
  │
  ▼
Store (RwLock read guard)
  └─ Navigate Value via JsonPointer

Subscription path

PatchDb::subscribe(ptr)    → Subscriber (mpsc receiver)
PatchDb::watch(ptr)        → DbWatch (Dump + Subscriber, implements Stream)
PatchDb::dump_and_sub(ptr) → (Dump, Subscriber)
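The idea behind watch (a snapshot followed by incremental revisions, exposed as a stream of values) can be sketched as follows. This is a synchronous TypeScript illustration, not the async Rust Stream: the function name watchValues and the injected apply callback are hypothetical, and the sketch assumes every revision in the input follows the dump.

```typescript
interface Dump<T> {
  id: number;
  value: T;
}

interface Revision<P> {
  id: number;
  patch: P;
}

// Yield the snapshot value first, then the value after each revision is applied.
function* watchValues<T, P>(
  dump: Dump<T>,
  revisions: Iterable<Revision<P>>,
  apply: (value: T, patch: P) => T,
): Generator<T> {
  let value = dump.value;
  yield value;
  for (const revision of revisions) {
    value = apply(value, revision.patch);
    yield value;
  }
}
```

The patch type is left generic so the sketch stays independent of any particular JSON Patch implementation.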

macro / macro-internals

Procedural macro that derives HasModel for structs and enums:

#[derive(HasModel)]
#[model = "Model<Self>"]
struct Config {
    hostname: String,
    port: u16,
}

Generates:

  • impl HasModel for Config { type Model = Model<Self>; }
  • Typed accessor methods: as_hostname(), as_hostname_mut(), into_hostname()
  • from_parts() constructor
  • destructure_mut() for simultaneous mutable access to multiple fields
  • Respects serde(rename_all), serde(rename), serde(flatten)
  • Enum support with serde(tag) / serde(content) encoding

json-ptr

RFC 6901 JSON Pointer implementation. Provides:

  • JsonPointer<S, V> — generic over string storage and segment list representation
  • Zero-copy BorrowedSegList for efficient path slicing
  • Navigation: get, get_mut, set, insert, remove, take
  • Path algebra: starts_with, strip_prefix, common_prefix, join_end, append
  • ROOT constant for the empty pointer
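For readers unfamiliar with RFC 6901, pointer parsing, navigation, and a prefix test can be sketched as below. It is TypeScript rather than the crate's Rust, and parsePointer, getAt, and pointerStartsWith are hypothetical names that only mirror the spirit of get and starts_with.

```typescript
// Split a JSON Pointer into unescaped segments ("~1" is "/", "~0" is "~").
function parsePointer(pointer: string): string[] {
  if (pointer === "") return []; // the empty pointer addresses the whole document
  if (!pointer.startsWith("/")) throw new Error("invalid JSON Pointer: " + pointer);
  return pointer
    .slice(1)
    .split("/")
    .map((seg) => seg.replace(/~1/g, "/").replace(/~0/g, "~"));
}

// Resolve a pointer against a document; returns undefined if the path is absent.
function getAt(doc: unknown, pointer: string): unknown {
  let current: unknown = doc;
  for (const seg of parsePointer(pointer)) {
    if (Array.isArray(current)) {
      // Array indices must be base-10 integers without leading zeros.
      if (!/^(0|[1-9]\d*)$/.test(seg)) return undefined;
      current = current[Number(seg)];
    } else if (typeof current === "object" && current !== null) {
      if (!Object.prototype.hasOwnProperty.call(current, seg)) return undefined;
      current = (current as Record<string, unknown>)[seg];
    } else {
      return undefined;
    }
  }
  return current;
}

// Segment-aware prefix test: "/a" is a prefix of "/a/b" but not of "/ab".
function pointerStartsWith(pointer: string, prefix: string): boolean {
  return pointer === prefix || pointer.startsWith(prefix + "/");
}
```

Note that the unescape order matters: ~1 must be decoded before ~0, so that the sequence ~01 decodes to ~1 and not to a slash.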

json-patch

RFC 6902 JSON Patch implementation. Provides:

  • Patch(Vec<PatchOperation>) — the patch type
  • PatchOperation enum: Add, Remove, Replace, Test, Move, Copy
  • patch() — apply a patch to a Value, returns an Undo for rollback
  • diff() — compute the minimal patch between two Values
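To give a feel for what diff() produces, here is a simplified TypeScript sketch (not the crate's algorithm). It recurses into objects and emits add, remove, and replace operations; arrays and scalars that differ are replaced wholesale, so unlike the real diff() the result is not guaranteed to be minimal. All names are hypothetical.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

type Op =
  | { op: "add"; path: string; value: Json }
  | { op: "remove"; path: string }
  | { op: "replace"; path: string; value: Json };

// Escape a key for use as a JSON Pointer segment (RFC 6901).
const escapeSegment = (key: string): string =>
  key.replace(/~/g, "~0").replace(/\//g, "~1");

const isObject = (v: Json): v is { [key: string]: Json } =>
  typeof v === "object" && v !== null && !Array.isArray(v);

const has = (obj: object, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key);

// Compute a patch that turns `from` into `to`.
function diff(from: Json, to: Json, path: string = ""): Op[] {
  if (isObject(from) && isObject(to)) {
    const ops: Op[] = [];
    for (const key of Object.keys(from)) {
      const child = `${path}/${escapeSegment(key)}`;
      if (!has(to, key)) {
        ops.push({ op: "remove", path: child });
      } else {
        ops.push(...diff(from[key], to[key], child));
      }
    }
    for (const key of Object.keys(to)) {
      if (!has(from, key)) {
        ops.push({ op: "add", path: `${path}/${escapeSegment(key)}`, value: to[key] });
      }
    }
    return ops;
  }
  // Arrays, scalars, or mismatched kinds: compare serialized forms, replace on mismatch.
  return JSON.stringify(from) === JSON.stringify(to)
    ? []
    : [{ op: "replace", path, value: to }];
}
```

Only add, remove, and replace appear in the output, which matches the operation set the client-side library is described as handling.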

util

CLI tool with two subcommands:

  • dump <path> — deserialize a patch-db file and print the final state as JSON
  • from-dump <path> — read JSON from stdin and write it as a fresh patch-db file

TypeScript client

PatchDB<T>

RxJS-based observable database client. Consumes Update<T>[] from a transport source.

Data flow:

source$ (Observable<Update<T>[]>)
  │
  ▼
PatchDB.processUpdates()
  ├─ Revision? → applyOperation() for each op, then update matching watchedNodes
  └─ Dump?     → replace cache, update all watchedNodes
  │
  ▼
cache$ (BehaviorSubject<Dump<T>>)
  │
  ▼
watch$(...path) → BehaviorSubject per unique path → Observable to consumer

Key design decisions:

  • watch$() has overloads for 0–6 path segments, providing type-safe deep property access
  • Watched nodes are keyed by their JSON Pointer path string
  • A revision triggers updates only for watchers whose path overlaps with any operation in the patch (prefix match in either direction)
  • A dump triggers updates for all watchers
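The overlap rule for revisions can be sketched as a pair of small predicates. The names pathsOverlap and affectedWatchers are hypothetical and the client may structure this differently; the sketch only shows what "prefix match in either direction" means for JSON Pointer strings.

```typescript
// True when `path` equals `prefix` or lies beneath it (whole segments only).
function isAtOrBelow(path: string, prefix: string): boolean {
  return path === prefix || path.startsWith(prefix + "/");
}

// A watcher is affected if the operation touches the watched node itself,
// one of its descendants, or one of its ancestors.
function pathsOverlap(watchedPath: string, opPath: string): boolean {
  return isAtOrBelow(watchedPath, opPath) || isAtOrBelow(opPath, watchedPath);
}

// Select the watched paths that need an update for a given patch.
function affectedWatchers(
  watchedPaths: Iterable<string>,
  patch: { path: string }[],
): string[] {
  return [...watchedPaths].filter((watched) =>
    patch.some((op) => pathsOverlap(watched, op.path)),
  );
}
```

Both directions matter: replacing /a changes what a watcher on /a/b sees, and changing /a/b/c changes the value emitted to a watcher on /a.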

json-patch-lib

Client-side RFC 6902 implementation (add/remove/replace only — no test/move/copy since those aren't produced by the server's diff()).

Operations are applied immutably — objects are spread-copied, arrays are sliced — to play nicely with change detection in UI frameworks.
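A minimal sketch of immutable application, assuming well-formed operations and performing no validation, might look like the following. It is an illustration of the copy-on-write idea rather than the library's actual applyOperation; the helper names are hypothetical.

```typescript
type Op =
  | { op: "add" | "replace"; path: string; value: unknown }
  | { op: "remove"; path: string };

const unescapeSegment = (seg: string): string =>
  seg.replace(/~1/g, "/").replace(/~0/g, "~");

// Return a new document; every container on the path is copied,
// while untouched siblings are shared by reference.
function applyOperation(doc: unknown, op: Op): unknown {
  const segs = op.path === "" ? [] : op.path.slice(1).split("/").map(unescapeSegment);
  return applyAt(doc, segs, op);
}

function applyAt(node: unknown, segs: string[], op: Op): unknown {
  if (segs.length === 0) return op.op === "remove" ? undefined : op.value;
  const [head = "", ...rest] = segs;
  if (Array.isArray(node)) {
    const i = head === "-" ? node.length : Number(head);
    const before = node.slice(0, i);
    if (rest.length === 0 && op.op === "add") {
      return [...before, op.value, ...node.slice(i)]; // insert, shifting the tail
    }
    if (rest.length === 0 && op.op === "remove") {
      return [...before, ...node.slice(i + 1)]; // drop element i
    }
    return [...before, applyAt(node[i], rest, op), ...node.slice(i + 1)];
  }
  const obj = (typeof node === "object" && node !== null ? node : {}) as Record<string, unknown>;
  if (rest.length === 0 && op.op === "remove") {
    const copy = { ...obj };
    delete copy[head];
    return copy;
  }
  return { ...obj, [head]: applyAt(obj[head], rest, op) };
}
```

Because only the containers along the changed path get new identities, a reference-equality check is enough for a UI framework to tell which subtrees changed.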

Types

| Type | Definition |
|------|------------|
| Revision | { id: number, patch: Operation<unknown>[] } |
| Dump<T> | { id: number, value: T } |
| Update<T> | Revision \| Dump<T> |
| PatchOp | Enum: 'add' \| 'remove' \| 'replace' |
| Operation<T> | AddOperation<T> \| RemoveOperation \| ReplaceOperation<T> |
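Written out as TypeScript, the types above could look roughly like this. The field layout of the individual operation types follows RFC 6902 and is an assumption, PatchOp is modelled as a string literal union rather than a TypeScript enum, and isRevision and summarize are hypothetical helpers showing one way to tell the two Update variants apart.

```typescript
type PatchOp = "add" | "remove" | "replace";

interface AddOperation<T> { op: "add"; path: string; value: T }
interface RemoveOperation { op: "remove"; path: string }
interface ReplaceOperation<T> { op: "replace"; path: string; value: T }
type Operation<T> = AddOperation<T> | RemoveOperation | ReplaceOperation<T>;

interface Revision { id: number; patch: Operation<unknown>[] }
interface Dump<T> { id: number; value: T }
type Update<T> = Revision | Dump<T>;

// Only revisions carry a `patch` field, so its presence discriminates the union.
function isRevision<T>(update: Update<T>): update is Revision {
  return "patch" in update;
}

// Describe a batch of updates, e.g. for logging.
function summarize<T>(updates: Update<T>[]): string[] {
  return updates.map((u) =>
    isRevision(u)
      ? `revision ${u.id}: ${u.patch.map((o): PatchOp => o.op).join(",")}`
      : `dump ${u.id}`,
  );
}
```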

Storage format

The on-disk format is a sequence of CBOR values:

[ revision: u64 ] [ value: Value ] [ patch₁ ] [ patch₂ ] ... [ patchₙ ]
  • On open, the file is read sequentially: revision counter, then root value, then patches are replayed
  • On write, new patches are appended as CBOR
  • Every 4096 revisions, the file is compacted: a fresh snapshot is written atomically via a .bak temp file
  • A .failed file logs patches that couldn't be applied (data recovery aid)
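The open, append, and compact cycle can be modelled in memory as follows. This TypeScript sketch treats the file as an array of already-decoded CBOR values and makes several assumptions that the text does not confirm: each replayed patch advances the revision counter by one, and compaction runs whenever the counter reaches a multiple of 4096. The names openStore and writePatch are hypothetical, and the atomic .bak rewrite and the .failed log are not modelled.

```typescript
const COMPACT_EVERY = 4096;

interface StoreState<V> {
  revision: number;
  value: V;
}

// Read the header (revision, snapshot value), then replay every appended patch.
function openStore<V, P>(
  file: readonly unknown[],
  apply: (value: V, patch: P) => V,
): StoreState<V> {
  let revision = file[0] as number;
  let value = file[1] as V;
  for (const patch of file.slice(2) as P[]) {
    value = apply(value, patch);
    revision += 1;
  }
  return { revision, value };
}

// Apply a patch, append it to the log, and compact when the threshold is hit.
function writePatch<V, P>(
  file: unknown[],
  state: StoreState<V>,
  patch: P,
  apply: (value: V, patch: P) => V,
): StoreState<V> {
  const next = { revision: state.revision + 1, value: apply(state.value, patch) };
  file.push(patch);
  if (next.revision % COMPACT_EVERY === 0) {
    // Compaction: replace the whole log with a fresh snapshot and no patches.
    file.splice(0, file.length, next.revision, next.value);
  }
  return next;
}
```

Replaying after compaction is cheap because the log then holds only the two header values until new patches are appended.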

Concurrency model

  • PatchDb wraps Arc<RwLock<Store>> — multiple concurrent readers, exclusive writer
  • Broadcast uses mpsc::unbounded_channel per subscriber — writes never block on slow consumers
  • OPEN_STORES static mutex prevents the same file from being opened twice in the same process
  • FdLock provides OS-level file locking for cross-process safety