fix: #34 forest write amplification - seal persisted HAMT nodes (0.6.8)#35
Merged
Conversation
…l puts Three current-behaviour pins quantifying the O(N^2) bulk-upload cost: - node layer: store() re-PUTs an entire unchanged InMemory tree because persisted children are never downgraded to Stored (the only Stored constructor is the load path, Pointer::from_wire) - forest layer: per-flush PUT count and bytes grow with bucket size for the put_object_flat pattern (upsert + flush per file, one directory): flush #10 = 1 PUT / 10.8 KB -> flush #150 = 17 PUTs / 161.7 KB; 150 x 4 KB files = 1630 node PUTs / 12.2 MB of index re-uploads - contrast: marginal flush after 150 uploads is 17 PUTs in the long-lived session vs 3 PUTs for a fresh forest loaded from the same manifest - the amplification is session-accumulated InMemory state, not inherent tree cost Invert these pins into regression guards when #34 is fixed. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ding the whole tree
Root cause: Node::store(&self) persisted every InMemory child but could
never mark it clean (the only Stored constructor was the load path), so
a long-lived session re-uploaded the entire ever-touched shard tree on
EVERY flush - O(N^2) total index traffic for N sequential puts into one
directory (measured pre-fix: 1,630 node PUTs / 12.2 MB for 150 x 4 KB
files; flush #150 alone re-PUT 17 nodes / 161.7 KB).
The fix:
- New ChildPtr::Sealed { storage_key, cid, node }: persisted AND still
memory-resident. Reads resolve from memory exactly like InMemory
(zero fetches); store paths emit the recorded Link/LinkV2 reference
with zero I/O. Never serialized as a new wire variant - persisted
bytes are identical to the pre-fix writer's, so existing buckets and
older SDKs interoperate unchanged (no data-format change, no
corruption risk).
- New Node::store_sealing(&mut Arc<Self>): like store(), but write-backs
each successfully-persisted InMemory child as Sealed. On a child
failure the child is restored to InMemory and the error propagates -
the in-memory tree stays intact and a retry re-persists exactly the
unsealed remainder (converging on the same content address).
- flush_dirty now persists via store_sealing (shard write-guard instead
of read) - a flush PUTs the shard root plus the path(s) mutated since
the previous flush, bounded by tree depth, never by bucket size.
- Mutations (set_value / remove_value / canonicalize) resolve Sealed
children from memory and re-attach the changed subtree as InMemory,
unsealing exactly the touched path.
Measured post-fix (same 150-file workload): 384 PUTs / 3.5 MB total;
steady-state flushes are 2-3 node PUTs (~16-31 KB) at ANY bucket size;
a long-lived session's marginal flush now equals a fresh session's
(3 vs 3 node PUTs; pre-fix 17 vs 3).
Cross-platform: the change is entirely in the shared fula-crypto core
(cfg-split native/wasm paths both compile; fula-js wasm32 + fula-flutter
+ fula-client all check/test green), so Android, iOS, Windows, web/wasm
and fula-js all pick it up.
Tests: the three #34 pins are inverted into regression guards
(flat per-flush cost; long-session == fresh-session marginal cost), plus
new guards for sealing correctness: legacy-vs-sealing byte-identical
roots, full 150-file integrity through a cold reload, sealed-tree
deletion, second-store root-only re-PUT, and failure-mid-seal
intact-and-retryable.
Fixes #34
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Workspace + fula-js + fula-flutter crate versions, fula_client pubspec/ podspec, changelog entry. The fula-js / fula-wasm npm packages follow their own 0.2.x version line and are untouched. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…led flush Addresses the two-clients-alternating-writes concern directly: - New fula-client integration test (two_clients_alternating_writes_ compose_without_loss) against the stateful conditional-PUT mock master: client A writes 3 files (forest fully sealed), client B writes 3 files (advancing the manifest/page ETags), then A writes a 4th file FROM ITS STALE SEALED FOREST. A must hit 412, evict the sealed state wholesale, reload B's winning state, replay its WAL, and retry to success. Asserts the 412/backoff path actually fired (flush_backoff_count metric - non-vacuous by construction), that A reads B's writes post-reconcile, and that a third fresh client reads all 7 files byte-exactly with a 7-entry forest listing. - issue_34_in_flight_reader_arc_unaffected_by_sealing: a reader holding the shard-root Arc from BEFORE a flush keeps a fully-readable pre-seal view (Arc::make_mut copy-on-write) - the transient placeholder inside store_sealing is never observable. - Cross-session composition added to the long-vs-fresh guard: a third cold session reads writes made by both prior sessions. - Failure-retry assertion tightened to exact arithmetic: retry re-PUTs everything except exactly the 2 children sealed before the injected failure. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
This was referenced Jun 12, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #34 — per-upload cost grew with bucket size because every forest flush re-uploaded the entire ever-touched portion of the hot shard's HAMT instead of just the changed path. Sequential bulk uploads cost O(N²) total index traffic (measured pre-fix: 1,630 node PUTs / 12.2 MB to store 150 × 4 KB files; FxFiles web measured per-upload latency growing 1.0 s → 3.8 s as the bucket filled).
Root cause
Node::set_value(copy-on-write insert) flips every node on the mutated root→leaf path toChildPtr::InMemory.Node::store(&self)persisted everyInMemorychild but — taking&self— could never mark it clean afterwards; the only constructor ofStored/StoredV2was the load path (Pointer::from_wire).InMemoryset grew monotonically and every flush re-serialized, re-encrypted (fresh nonce → not even dedupe-able), and re-PUT the whole accumulated tree. All files in one directory route to one shard (dir-local routing), andput_object_flatflushes per call — the FxFiles bulk pattern hit the worst case exactly.The fix
ChildPtr::Sealed { storage_key, cid, node }— persisted and still memory-resident. Reads resolve from the residentArcwith zero I/O (exactly likeInMemory, so no read-path regression — this tree has no read-through cache); store paths emit the recordedLink/LinkV2reference with zero I/O.Node::store_sealing(&mut Arc<Self>)— likestore, but write-backs each successfully persistedInMemorychild asSealed. On a child PUT failure the child is restored toInMemoryand the error propagates: the in-memory tree stays intact, and a retry re-persists exactly the unsealed remainder, converging on the identical content address (unit-tested with injected failures, exact PUT arithmetic asserted).flush_dirtyseals (shard write-guard instead of read). A flush now PUTs the shard root plus the path(s) mutated since the previous flush — bounded by tree depth, never by bucket size.set_value/remove_value/canonicalize) resolve the sealedArc, copy-on-write, and re-attach asInMemory— unsealing exactly the touched path.No data corruption — by construction and by test
Sealedis never serialized as a new variant: it re-emits the samePointerWire::Link/LinkV2its original PUT produced. Content addressing is plaintext-derived, so the sealing writer and the legacy writer produce byte-identical persisted state — pinned byissue_34_sealing_and_legacy_store_produce_identical_roots(same inserts → identical root content address via both paths). Existing buckets read/write exactly as before; pre-fix SDKs interoperate unchanged.Multi-client safety (two clients, one writes then the other)
New integration test
two_clients_alternating_writes_compose_without_lossagainst a stateful mock master honoringIf-None-Match: */If-Match(412) — the production conflict contract:put_object_flat, flush per put → A fully sealed).flush_backoff_count), so it can't pass vacuously.Sealing adds no staleness window: conflict arbitration was always manifest-ETag-gated, and every reconcile path discards the in-memory forest (sealed state included) wholesale. Sealed pointers reference content-addressed, immutable node objects, so another client's writes can never invalidate them in place. An in-flight reader holding a pre-flush
Arcis likewise unaffected (Arc::make_mutcopy-on-write; pinned byissue_34_in_flight_reader_arc_unaffected_by_sealing).Server-side GC/pinning cannot depend on re-PUT recency (a bucket idle for months already has nodes PUT exactly once), so skipping redundant re-PUTs is safe on that axis.
Measured result (same 150-file workload, counting backend)
Acceptance from the issue — "a 150-small-file sequential upload should show flat (or near-flat) marginal cost" — is met and enforced by inverted regression guards (
issue_34_per_flush_put_cost_stays_flat_as_bucket_fills,issue_34_long_session_marginal_flush_matches_fresh_session).All platforms
The change lives entirely in the shared
fula-cryptocore (bothcfgbranches), so Android, iOS, Windows, web/wasm and fula-js all pick it up:cargo test -p fula-crypto— 449 passed (incl. 9 issue-34 guards)cargo test -p fula-client --features test-fault-injection— 208 lib + all integration suites passed (incl. the new two-client interop test)cargo check -p fula-crypto --target wasm32-unknown-unknown --no-default-features --features wasm— cleancargo check -p fula-js --target wasm32-unknown-unknown— cleancargo check -p fula-flutter/-p fula-client(native) — cleanVersion
0.6.7 → 0.6.8(workspace, fula-js + fula-flutter crates, fula_client pubspec/podspec, changelog). The fula-js / fula-wasm npm packages follow their own 0.2.x line and are untouched.🤖 Generated with Claude Code