fix: #34 forest write amplification - seal persisted HAMT nodes (0.6.8) by ehsan6sha · Pull Request #35 · functionland/fula-api

ehsan6sha · 2026-06-12T17:10:13Z

Fixes #34 — per-upload cost grew with bucket size because every forest flush re-uploaded the entire ever-touched portion of the hot shard's HAMT instead of just the changed path. Sequential bulk uploads cost O(N²) total index traffic (measured pre-fix: 1,630 node PUTs / 12.2 MB to store 150 × 4 KB files; FxFiles web measured per-upload latency growing 1.0 s → 3.8 s as the bucket filled).

Root cause

Node::set_value (copy-on-write insert) flips every node on the mutated root→leaf path to ChildPtr::InMemory.
Node::store(&self) persisted every InMemory child but — taking &self — could never mark it clean afterwards; the only constructor of Stored/StoredV2 was the load path (Pointer::from_wire).
So in a long-lived session the InMemory set grew monotonically and every flush re-serialized, re-encrypted (fresh nonce → not even dedupe-able), and re-PUT the whole accumulated tree. All files in one directory route to one shard (dir-local routing), and put_object_flat flushes per call — the FxFiles bulk pattern hit the worst case exactly.

The fix

New ChildPtr::Sealed { storage_key, cid, node } — persisted and still memory-resident. Reads resolve from the resident Arc with zero I/O (exactly like InMemory, so no read-path regression — this tree has no read-through cache); store paths emit the recorded Link/LinkV2 reference with zero I/O.
New Node::store_sealing(&mut Arc<Self>) — like store, but write-backs each successfully persisted InMemory child as Sealed. On a child PUT failure the child is restored to InMemory and the error propagates: the in-memory tree stays intact, and a retry re-persists exactly the unsealed remainder, converging on the identical content address (unit-tested with injected failures, exact PUT arithmetic asserted).
flush_dirty seals (shard write-guard instead of read). A flush now PUTs the shard root plus the path(s) mutated since the previous flush — bounded by tree depth, never by bucket size.
Mutations (set_value / remove_value / canonicalize) resolve the sealed Arc, copy-on-write, and re-attach as InMemory — unsealing exactly the touched path.

No data corruption — by construction and by test

Zero wire-format change. Sealed is never serialized as a new variant: it re-emits the same PointerWire::Link/LinkV2 its original PUT produced. Content addressing is plaintext-derived, so the sealing writer and the legacy writer produce byte-identical persisted state — pinned by issue_34_sealing_and_legacy_store_produce_identical_roots (same inserts → identical root content address via both paths). Existing buckets read/write exactly as before; pre-fix SDKs interoperate unchanged.
150-file integrity: after 150 sealed flushes, a cold session over the final manifest reads every file (metadata included) and lists the directory completely.
Deletion over a sealed tree: cold reader sees the deletion and all 39 survivors.
Failure mid-seal: tree fully readable, retry converges on the never-failed root, and re-PUTs everything except exactly the children sealed before the failure.

Multi-client safety (two clients, one writes then the other)

New integration test two_clients_alternating_writes_compose_without_loss against a stateful mock master honoring If-None-Match: * / If-Match (412) — the production conflict contract:

Client A writes 3 files (put_object_flat, flush per put → A fully sealed).
Client B (same user key, fresh instance) writes 3 files, advancing the manifest/page ETags past A's cached view.
Client A writes a 4th file from its stale sealed forest → the flush hits 412, evicts the sealed state wholesale, reloads B's winning state, replays its WAL, retries to success (the pre-existing NEW-7.2 reconcile, with sealing in play). The test asserts the 412/backoff path actually fired (flush_backoff_count), so it can't pass vacuously.
A reads B's writes post-reconcile; a third fresh client reads all 7 files byte-exactly and lists exactly 7.

Sealing adds no staleness window: conflict arbitration was always manifest-ETag-gated, and every reconcile path discards the in-memory forest (sealed state included) wholesale. Sealed pointers reference content-addressed, immutable node objects, so another client's writes can never invalidate them in place. An in-flight reader holding a pre-flush Arc is likewise unaffected (Arc::make_mut copy-on-write; pinned by issue_34_in_flight_reader_arc_unaffected_by_sealing).

Server-side GC/pinning cannot depend on re-PUT recency (a bucket idle for months already has nodes PUT exactly once), so skipping redundant re-PUTs is safe on that axis.

Measured result (same 150-file workload, counting backend)

metric	pre-fix (pinned at `2ed7fe0`)	post-fix
node PUTs at flush #150	17, growing with N	3, flat
bytes at flush #150	161.7 KB (≈1.08 KB × bucket size)	23 KB, bounded by path
long-lived vs fresh session marginal flush	17 vs 3	3 vs 3
totals for 150 × 4 KB files	1,630 PUTs / 12.2 MB	384 PUTs / 3.5 MB

Acceptance from the issue — "a 150-small-file sequential upload should show flat (or near-flat) marginal cost" — is met and enforced by inverted regression guards (issue_34_per_flush_put_cost_stays_flat_as_bucket_fills, issue_34_long_session_marginal_flush_matches_fresh_session).

All platforms

The change lives entirely in the shared fula-crypto core (both cfg branches), so Android, iOS, Windows, web/wasm and fula-js all pick it up:

cargo test -p fula-crypto — 449 passed (incl. 9 issue-34 guards)
cargo test -p fula-client --features test-fault-injection — 208 lib + all integration suites passed (incl. the new two-client interop test)
cargo check -p fula-crypto --target wasm32-unknown-unknown --no-default-features --features wasm — clean
cargo check -p fula-js --target wasm32-unknown-unknown — clean
cargo check -p fula-flutter / -p fula-client (native) — clean
(Android/iOS cross-compile needs the NDK/Xcode toolchains from release CI; the fix compiles the identical non-wasm cfg branch validated on Windows.)

Version

0.6.7 → 0.6.8 (workspace, fula-js + fula-flutter crates, fula_client pubspec/podspec, changelog). The fula-js / fula-wasm npm packages follow their own 0.2.x line and are untouched.

🤖 Generated with Claude Code

…l puts Three current-behaviour pins quantifying the O(N^2) bulk-upload cost: - node layer: store() re-PUTs an entire unchanged InMemory tree because persisted children are never downgraded to Stored (the only Stored constructor is the load path, Pointer::from_wire) - forest layer: per-flush PUT count and bytes grow with bucket size for the put_object_flat pattern (upsert + flush per file, one directory): flush #10 = 1 PUT / 10.8 KB -> flush #150 = 17 PUTs / 161.7 KB; 150 x 4 KB files = 1630 node PUTs / 12.2 MB of index re-uploads - contrast: marginal flush after 150 uploads is 17 PUTs in the long-lived session vs 3 PUTs for a fresh forest loaded from the same manifest - the amplification is session-accumulated InMemory state, not inherent tree cost Invert these pins into regression guards when #34 is fixed. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…ding the whole tree Root cause: Node::store(&self) persisted every InMemory child but could never mark it clean (the only Stored constructor was the load path), so a long-lived session re-uploaded the entire ever-touched shard tree on EVERY flush - O(N^2) total index traffic for N sequential puts into one directory (measured pre-fix: 1,630 node PUTs / 12.2 MB for 150 x 4 KB files; flush #150 alone re-PUT 17 nodes / 161.7 KB). The fix: - New ChildPtr::Sealed { storage_key, cid, node }: persisted AND still memory-resident. Reads resolve from memory exactly like InMemory (zero fetches); store paths emit the recorded Link/LinkV2 reference with zero I/O. Never serialized as a new wire variant - persisted bytes are identical to the pre-fix writer's, so existing buckets and older SDKs interoperate unchanged (no data-format change, no corruption risk). - New Node::store_sealing(&mut Arc<Self>): like store(), but write-backs each successfully-persisted InMemory child as Sealed. On a child failure the child is restored to InMemory and the error propagates - the in-memory tree stays intact and a retry re-persists exactly the unsealed remainder (converging on the same content address). - flush_dirty now persists via store_sealing (shard write-guard instead of read) - a flush PUTs the shard root plus the path(s) mutated since the previous flush, bounded by tree depth, never by bucket size. - Mutations (set_value / remove_value / canonicalize) resolve Sealed children from memory and re-attach the changed subtree as InMemory, unsealing exactly the touched path. Measured post-fix (same 150-file workload): 384 PUTs / 3.5 MB total; steady-state flushes are 2-3 node PUTs (~16-31 KB) at ANY bucket size; a long-lived session's marginal flush now equals a fresh session's (3 vs 3 node PUTs; pre-fix 17 vs 3). Cross-platform: the change is entirely in the shared fula-crypto core (cfg-split native/wasm paths both compile; fula-js wasm32 + fula-flutter + fula-client all check/test green), so Android, iOS, Windows, web/wasm and fula-js all pick it up. Tests: the three #34 pins are inverted into regression guards (flat per-flush cost; long-session == fresh-session marginal cost), plus new guards for sealing correctness: legacy-vs-sealing byte-identical roots, full 150-file integrity through a cold reload, sealed-tree deletion, second-store root-only re-PUT, and failure-mid-seal intact-and-retryable. Fixes #34 Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Workspace + fula-js + fula-flutter crate versions, fula_client pubspec/ podspec, changelog entry. The fula-js / fula-wasm npm packages follow their own 0.2.x version line and are untouched. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…led flush Addresses the two-clients-alternating-writes concern directly: - New fula-client integration test (two_clients_alternating_writes_ compose_without_loss) against the stateful conditional-PUT mock master: client A writes 3 files (forest fully sealed), client B writes 3 files (advancing the manifest/page ETags), then A writes a 4th file FROM ITS STALE SEALED FOREST. A must hit 412, evict the sealed state wholesale, reload B's winning state, replay its WAL, and retry to success. Asserts the 412/backoff path actually fired (flush_backoff_count metric - non-vacuous by construction), that A reads B's writes post-reconcile, and that a third fresh client reads all 7 files byte-exactly with a 7-entry forest listing. - issue_34_in_flight_reader_arc_unaffected_by_sealing: a reader holding the shard-root Arc from BEFORE a flush keeps a fully-readable pre-seal view (Arc::make_mut copy-on-write) - the transient placeholder inside store_sealing is never observable. - Cross-session composition added to the long-vs-fresh guard: a third cold session reads writes made by both prior sessions. - Failure-retry assertion tightened to exact arithmetic: retry re-PUTs everything except exactly the 2 children sealed before the injected failure. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

ehsan6sha and others added 4 commits June 12, 2026 12:18

ehsan6sha merged commit acb3fea into main Jun 12, 2026
8 checks passed

ehsan6sha deleted the fix/34-forest-write-amplification branch June 12, 2026 17:21

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: #34 forest write amplification - seal persisted HAMT nodes (0.6.8)#35

fix: #34 forest write amplification - seal persisted HAMT nodes (0.6.8)#35
ehsan6sha merged 4 commits into
mainfrom
fix/34-forest-write-amplification

ehsan6sha commented Jun 12, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

ehsan6sha commented Jun 12, 2026

Root cause

The fix

No data corruption — by construction and by test

Multi-client safety (two clients, one writes then the other)

Measured result (same 150-file workload, counting backend)

All platforms

Version

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant