Bump zwasm v1.6.1 → v1.7.2 #1

Merged: chaploud merged 1 commit into main from develop/bump-zwasm-v1.7.2 on Apr 21, 2026

@chaploud
Contributor

Summary

  • Bumps the zwasm dependency from v1.6.1 to v1.7.2.
  • v1.7.2 fixes an ARM64 JIT remainder bug (`rd == rs2` aliasing) that surfaces in `02_tinygo_test` — TinyGo's `gcd` lowers to IR `r3 = r0 % r3`, and after JIT compilation (HOT_THRESHOLD=3) the loop produced wrong remainders and spun for minutes.
  • Also picks up v1.7.1 (invoke() settings persistence + build options + spec bump) and v1.7.0 (SIMD JIT, memory64 fix, FD-based WASI, JIT correctness sweep).

Test plan

  • `bash test/run_all.sh`: zig test, ReleaseSafe build, `cljw test` (83 namespaces, 0 failures), e2e wasm, deps.edn e2e — all PASS
  • Binary size: 4.76 MB (≤ 5.0 MB)
  • Startup: 4.5 ms ± 0.6 ms (≤ 6 ms)
  • RSS: 7.65 MB (≤ 10 MB)
  • `bash bench/wasm_bench.sh --quick`: gcd completes in 66.2 ms (was hanging on v1.7.0/v1.7.1)

Regression context

Investigated on the zwasm side: the prior fix for rem aliasing (zwasm v1.7.1) covered only `rd == rs1`. The register allocator can also assign the destination to the same physical register as the divisor (`rd == rs2`); in that case UDIV overwrote the divisor with the quotient before MSUB could read it. zwasm v1.7.2 now preserves whichever operand the destination aliases and adds a regression test.
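
To make the aliasing failure concrete, here is a minimal Python model of the two-instruction remainder sequence (`UDIV` then `MSUB`). It is only a sketch of the idea, not zwasm's emitter: the register file is a dict, and `SCRATCH`, `rem_naive`, and `rem_alias_safe` are hypothetical names introduced for illustration.

```python
MASK64 = (1 << 64) - 1
SCRATCH = "tmp"  # hypothetical scratch register, not a zwasm name


def udiv(regs, rd, rn, rm):
    """UDIV rd, rn, rm: unsigned divide (AArch64 yields 0 for a zero divisor)."""
    regs[rd] = regs[rn] // regs[rm] if regs[rm] else 0


def msub(regs, rd, rn, rm, ra):
    """MSUB rd, rn, rm, ra: rd = ra - rn * rm, wrapped to 64 bits."""
    regs[rd] = (regs[ra] - regs[rn] * regs[rm]) & MASK64


def rem_naive(regs, rd, rs1, rs2):
    """rd = rs1 % rs2 with no alias handling: the quotient lands in rd first."""
    udiv(regs, rd, rs1, rs2)
    msub(regs, rd, rd, rs2, rs1)


def rem_alias_safe(regs, rd, rs1, rs2):
    """Same sequence, but copy any operand that rd aliases into SCRATCH first."""
    if rd in (rs1, rs2):
        regs[SCRATCH] = regs[rd]
        rs1 = SCRATCH if rs1 == rd else rs1
        rs2 = SCRATCH if rs2 == rd else rs2
    udiv(regs, rd, rs1, rs2)
    msub(regs, rd, rd, rs2, rs1)


# The shape from the PR: r3 = r0 % r3, so rd aliases the divisor.
buggy = {"r0": 48, "r3": 18}
rem_naive(buggy, "r3", "r0", "r3")

fixed = {"r0": 48, "r3": 18}
rem_alias_safe(fixed, "r3", "r0", "r3")

print(buggy["r3"], fixed["r3"])  # 44 12
```

In the naive sequence the quotient 2 replaces the divisor 18, so `MSUB` computes 48 - 2 * 2 = 44 instead of 48 - 2 * 18 = 12. A `gcd` loop fed such remainders is not guaranteed to shrink its operands, which fits the reported spin.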

Picks up:
- ARM64 JIT remainder fix (rd == rs2 aliasing). Surfaced here via
  02_tinygo_test: TinyGo's gcd lowered to IR `r3 = r0 % r3` produced
  wrong remainders after HOT_THRESHOLD JIT compilation, causing the
  loop to spin for minutes.
- Preserve caller-set vm.* settings across invoke() (v1.7.1).
- -Dpic / -Dcompiler-rt build options (v1.7.1).
- Spec testsuite bumped to f9c743a (v1.7.1).
- v1.7.0 contents (SIMD JIT, memory64 fix, FD-based WASI config,
  JIT correctness sweep) previously inaccessible due to the rem bug.

Commit Gate (Mac):
- run_all.sh: zig test, ReleaseSafe build, cljw test (83 ns),
  e2e wasm, deps.edn e2e — all PASS
- Binary 4.76 MB (≤ 5 MB), startup 4.5 ms (≤ 6 ms), RSS 7.65 MB (≤ 10 MB)
- wasm_bench.sh --quick: gcd 66.2 ms (was hanging on v1.7.0/v1.7.1)

chaploud merged commit 4b7437f into main on Apr 21, 2026
6 checks passed
chaploud deleted the develop/bump-zwasm-v1.7.2 branch on April 24, 2026 12:55
chaploud added a commit that referenced this pull request Apr 26, 2026
Self-review against the strategic-review notes and reference projects
flagged 7 files as high rot risk (empty stubs / aspirational tables /
premature deep-dives). Removed and listed in ROADMAP §15.2 with templates
so they can be created on demand later without losing the format.

Removed:
- .dev/handover.md           (memo files rot; recreate when actual session
                              state needs to outlive a chat thread)
- .dev/known_issues.md       (empty stub; create when first issue arises)
- .dev/compat_tiers.yaml     (aspirational; create at Phase 10 when first
                              src/lang/clj/ namespace lands)
- .dev/concurrency_design.md (premature; Phase-15 reality may diverge —
                              kept the summary in ROADMAP §7)
- .dev/wasm_strategy.md      (same; summary stays in ROADMAP §8)
- .claude/rules/compat_tiers.md  (auto-loads on src/lang/** which does not
                              exist yet; recreate at Phase 10)
- docs/README.md             (single placeholder line; redundant with the
                              main README and docs/ja/README.md)

Reworked the learning-doc gate (`scripts/check_learning_doc.sh`) so the
doc is the COMMIT THAT IMMEDIATELY FOLLOWS each source commit, not in the
same commit. This lets the doc's `commit:` front-matter cite the actual
SHA of the source commit it documents — no "TBD then patch" cycle.

- Rule 1: a doc commit must not contain source-bearing files (mixing
  defeats the SHA pairing).
- Rule 2: a commit following an unpaired source commit must be the doc.
- "Source-bearing" tightened to exclude `.dev/decisions/README.md` and
  `.dev/decisions/0000-template.md` (meta-metadata, not real ADRs).
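
The two rules can be sketched as a check over a commit history. This is an illustrative Python model, not the contents of `scripts/check_learning_doc.sh`: the `SOURCE_PREFIXES` and `DOC_PREFIX` values are assumptions made for the example, and only the two excluded paths come from the message above.

```python
# Paths named in the commit message as excluded from "source-bearing".
EXCLUDED = {".dev/decisions/README.md", ".dev/decisions/0000-template.md"}
# Assumptions for illustration: what counts as source and where docs live.
SOURCE_PREFIXES = ("src/", ".dev/decisions/")
DOC_PREFIX = "docs/ja/"


def is_source_bearing(path):
    return path not in EXCLUDED and path.startswith(SOURCE_PREFIXES)


def is_doc(path):
    return path.startswith(DOC_PREFIX)


def check_history(commits):
    """commits: list of (sha, [paths]), oldest first. Returns (sha, rule) violations."""
    violations = []
    unpaired = None  # SHA of a source commit still waiting for its doc commit
    for sha, paths in commits:
        has_source = any(is_source_bearing(p) for p in paths)
        has_doc = any(is_doc(p) for p in paths)
        if has_doc and has_source:
            violations.append((sha, "rule 1"))  # doc mixed with source
        if unpaired is not None and not has_doc:
            violations.append((sha, "rule 2"))  # the doc did not follow immediately
        if has_doc:
            unpaired = None
        elif has_source:
            unpaired = sha
    return violations


history = [
    ("a1", ["src/x.zig"]),
    ("a2", ["docs/ja/0001.md"]),                # pairs with a1
    ("b1", ["src/y.zig", "docs/ja/0002.md"]),   # mixed commit
    ("c1", ["src/z.zig"]),
    ("c2", [".dev/decisions/README.md"]),       # not the doc for c1
]
print(check_history(history))  # [('b1', 'rule 1'), ('c2', 'rule 2')]
```

Because the doc commit always comes after its source commit, the doc's `commit:` front-matter can name a SHA that already exists.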

Patched docs/ja/0001 and 0002 front-matter `commit:` to actual SHAs
(116b874, ac2e2b9). Updated ROADMAP §11.6 #1, §12.2, §15.1, §15.2,
§17 history; updated SKILL.md, .dev/README.md, .dev/decisions/README.md,
CLAUDE.md to reflect the new flow.
chaploud added a commit that referenced this pull request May 23, 2026
…e field

Final cycle of §9.6 / 4.5. All seven Phase-1/2 special forms now
compile to bytecode.

tree_walk.Function gains an optional `bytecode: ?*const
BytecodeChunk = null` field — `null` keeps the existing TreeWalk
Node-body semantics; non-null routes the VM dispatcher (task 4.6)
to the compiled chunk while the `body` Node stays available for
error frames. Smallest-diff design — single heap type for both
backends, no new HeapTag, no zone reshuffle.

compiler.zig fn_node arm: a fresh sub-Compiler compiles the body,
the BytecodeChunk is arena-pinned (lifetime matches `body` /
`params` already referenced by Function), tree_walk's new
`allocFunctionWithBytecode` allocates the Function up-front, and
op_make_fn <idx> just reads the pre-built Function constant.
Closure-less only (slot_base == 0); slot_base > 0 returns
error.NotImplemented because op_make_fn's single-constant-index
operand cannot encode the capture snapshot — task 4.7 widens
that.
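
As a rough illustration of the "one heap type, optional bytecode" design, the Python sketch below models a function object whose `bytecode` field defaults to `None`. The node shapes, opcode names, and helper functions are all hypothetical and far simpler than the Zig code; only the null / non-null routing and the `slot_base > 0` guard mirror the message above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Function:
    params: int                      # arity only, for this sketch
    body: tuple                      # tree-walk node, always kept
    bytecode: Optional[list] = None  # None means "use the tree walker"


def eval_node(node, args):
    kind = node[0]
    if kind == "const":
        return node[1]
    if kind == "param":
        return args[node[1]]
    if kind == "add":
        return eval_node(node[1], args) + eval_node(node[2], args)
    raise ValueError(kind)


def compile_node(node, out):
    kind = node[0]
    if kind == "const":
        out.append(("push_const", node[1]))
    elif kind == "param":
        out.append(("load_param", node[1]))
    elif kind == "add":
        compile_node(node[1], out)
        compile_node(node[2], out)
        out.append(("add",))
    else:
        raise ValueError(kind)
    return out


def make_fn(params, body, slot_base):
    # Closure-less only: captures cannot be encoded yet in this sketch either.
    if slot_base > 0:
        raise NotImplementedError("captures not supported")
    return Function(params, body, compile_node(body, []))


def run_chunk(chunk, args):
    stack = []
    for op in chunk:
        if op[0] == "push_const":
            stack.append(op[1])
        elif op[0] == "load_param":
            stack.append(args[op[1]])
        elif op[0] == "add":
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs + rhs)
    return stack.pop()


def call(fn, args):
    if fn.bytecode is not None:
        return run_chunk(fn.bytecode, args)
    return eval_node(fn.body, args)


body = ("add", ("param", 0), ("const", 1))
interpreted = Function(1, body)
compiled = make_fn(1, body, slot_base=0)
print(call(interpreted, [41]), call(compiled, [41]))  # 42 42
```

Both backends share the same object, so a caller never needs to know which path a given function takes.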

Two new Layer-1 unit tests: closure-less fn* allocation
verifying the nested chunk shape, and the NotImplemented guard
for slot_base > 0.

ROADMAP §9.6 row 4.5 flips to [x]. Mac (9/9) + OrbStack Ubuntu
x86_64 (8/8) green.

Note on prior handover framing: the previous commit (538f89e)
stopped the autonomous loop citing condition 2 (ADR-level
decision). On re-reading, design #1 was explicitly "no ADR amend
needed" and therefore self-decidable; the stop was inflated and
the loop continued.
chaploud added a commit that referenced this pull request May 23, 2026
…, ADR self-accept

Three converging issues with the last autonomous loop session (the
loop stopped at §9.6 / 4.5 fn_node citing "ADR-level decision"; the
choice was self-decidable, design #1 was explicitly "no ADR amend
needed"; separately, 7 commits accumulated unpushed because
"push free after green gate" left room for a "should I push?"
decision the closed stop list does not authorise):

- Closed stop list shrinks to 2 conditions (user explicit / physical
  block). The ADR-level condition is removed; ADR-level designs are
  handled inline (CLAUDE.md § "ADR-level designs are handled inline,
  not as a stop"). The AI drafts the ADR with
  Status: Proposed → Accepted in the same cycle and lands it
  alongside the source change. Rationale survives in ADR history; no
  external accept gate.
- Step 6 becomes atomic commit + push. Local commits never
  accumulate on cw-from-scratch. Push to main remains forbidden;
  push to cw-from-scratch is automatic and not deferred.
- Phrases "this needs human judgement" / "cannot be self-decided"
  / "user touchpoint required" / "ADR-level decision" (as a stop
  reason) / "user 確認待ち" ("awaiting user confirmation") become forbidden framings in
  handover.md (handover_framing.md table + grep). They map to the
  same anti-pattern the b09b54e commit identified
  (prohibition-list failure mode) and reframe self-decidable
  choices as stops.

Affected files:

- CLAUDE.md
  - Working agreement bullets: branch + push wording become atomic.
  - Step 6: split into "commit + push" with explicit
    `git push origin cw-from-scratch` after each commit. Depth 2-4
    becomes "draft + accept ADR inline".
  - "Stop only when": list shrinks from 3 to 2 conditions.
    Adds explicit "ADR-level designs are handled inline" section
    with the forbidden-phrase list.
- .dev/ROADMAP.md
  - §12.4 rewritten: 7-step table (was 8-step with /compact gate),
    stop discipline matches CLAUDE.md, push wording matches.
  - §13 forbidden actions: "Pushing without user approval" replaced
    with "Pushing to main" + "Leaving local commits unpushed".
- .dev/principle.md
  - Four-depths table footer adds: "All four depths proceed within
    the loop. Deeper depths land conclusion in a separate commit
    before the source commit, then continue. AI drafts + accepts
    the ADR itself."
- .dev/decisions/README.md
  - Lifecycle Add bullet: AI drafts ADR with Proposed → Accepted in
    the same cycle. "Reject after debate" → "Reject after
    consideration" (debate implied human-AI exchange).
- .claude/rules/handover_framing.md
  - "Legitimate stop framing" rewritten to the 2-condition list;
    ADR-level example removed. Forbidden-phrase table extends with
    cannot-be-self-decided / human-judgement / user-touchpoint /
    ADR-level-as-stop / awaiting-user-confirmation rows. Grep
    enforcement command updated.
- .claude/skills/continue/SKILL.md
  - "closed 3-condition" → "closed 2-condition" in two places.

Gate green on both Mac (9/9) and OrbStack Ubuntu x86_64 (8/8).
chaploud added a commit that referenced this pull request May 24, 2026
§9.7 task 5.0 closer. Encodes the survey at
private/notes/phase5-skeleton-audit.md (676 lines, gitignored) as
a tracked decision so §9.7 rows 5.1-5.16 execute against a fixed
activation classification map without re-deriving from the
survey.

Two load-bearing decisions:

  §1 Classification of the 8 Phase-4 skeletons:
       4.13 io_interface           matches FF  (Phase 14, ADR-0015 a2)
       4.17 type_descriptor        restructure (5.11)
       4.18 protocol               restructure (Phase 7, D-040)
       4.19 ObjectHeader           matches FF  (5.3)
       4.20 host/_host_api         matches FF  (Phase 6 host wave)
       4.22 binding_stack          reverted    (6a48e90 — terminal)
       4.23 numeric/big_int        restructure (5.2 + 5.9)
       4.24 lazy_seq               restructure (5.7)
       4.25 dispatch/method_table  matches FF  (Phase 7, D-040)

  §2 Critical-path activations for 5.16 exit smoke:
       5.2 → 5.3 → 5.4-5.6 → 5.7 → 5.8 → 5.9-5.10 → 5.11 → 5.12
       → 5.15 (build_options flip, mechanical after 5.12)
       → 5.16 (exit smoke).
       Parallel-safe: 5.13 (analyzer split), 5.14 (host placeholder
       doc).

Devil's-advocate subagent forked with fresh context per CLAUDE.md
§ "ADR-level designs are handled inline" / principle.md
"Devil's-advocate subagent is mandatory at depth ≥ 2". Subagent
output reflected verbatim into Alternatives considered (Alt 1
smallest-diff / Alt 2 finished-form-clean split / Alt 3 wildcard
pattern-ADR). Subagent recommendation: Alt 1.

Main loop disposition: Alt 1 applied with §3 reduced to a
pointer (not deleted entirely — the link to the survey's "5.1
input bullets" stays so future readers find them without
re-discovering); §4 removed entirely per the subagent's
accurate F-003-overlap observation; subagent's omitted-
constraints #1 (per-row OrbStack gate) and #2 (5.15 in critical
path) reflected into §2. Omitted-constraint #3 moot now that §3
is a pointer.

ROADMAP §9.7 row 5.0 flipped to [x] in-place with the survey
SHA / row count + the Alt 1 disposition recorded so future
audits can reconstruct the deferral choice from the row text.

Smell-audited: 1: Devil's-advocate alternative Alt 1 applied
(structural ADR shrink — depth-2 amendment of the draft before
Accepted). The original draft carried the 8 constraint bullets
verbatim; the subagent surfaced the duplication-with-5.1
concern accurately and proposed three alternatives within the
F-NNN envelope. None violated F-001..F-008. F-003 (decision-
deferral) is the active constraint here: ADR-0026 should not
pre-commit decisions the survey only surfaces — 5.1's ADR-0027
/ ADR-0028 cluster owns the bullets at the moment they bind.
chaploud added a commit that referenced this pull request May 24, 2026
… first/rest/meta

Smell-audited: 1: second per-type migration. Cons → extern struct
(declaration-ordered, HeapHeader at offset 0). consHeap body
switches rt.gpa.create + trackHeap → rt.gc.alloc(Cons); freeCons
removed. The arena-based `cons(alloc, ...)` path stays (per-eval
arena lifetime); only consHeap (long-lived) migrates.

traceGc fn registered into tag_ops.tag_trace_table[.list]: walks
first / rest / meta and calls mark_sweep.mark on each heap-tagged
child so the mark phase reaches Cons-rooted lists. No finaliser
needed (Cons has no owned non-GC resources — first/rest/meta are
Values, count is u32).
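
A per-type trace function registered in a table can be modelled as below. This is a Python sketch under simplifying assumptions: `Cons`, `TRACE_TABLE`, `is_heap`, and `mark_from` are stand-ins invented for the example, not the Zig `tag_ops` or `mark_sweep` APIs, and "heap-tagged" is approximated with an `isinstance` check.

```python
class Cons:
    def __init__(self, first, rest=None, meta=None):
        self.first = first
        self.rest = rest
        self.meta = meta
        self.marked = False


def is_heap(value):
    # Stand-in for "heap-tagged Value": only Cons cells live on this toy heap.
    return isinstance(value, Cons)


def trace_cons(cell, mark):
    """Visit first / rest / meta and mark each heap child."""
    for child in (cell.first, cell.rest, cell.meta):
        if is_heap(child):
            mark(child)


# One trace function per heap type, looked up during the mark phase.
TRACE_TABLE = {Cons: trace_cons}


def mark_from(roots):
    worklist = []

    def mark(obj):
        if not obj.marked:
            obj.marked = True
            worklist.append(obj)

    for root in roots:
        if is_heap(root):
            mark(root)
    while worklist:
        obj = worklist.pop()
        TRACE_TABLE[type(obj)](obj, mark)


tail = Cons(3)
mid = Cons(2, tail)
meta = Cons("doc")
head = Cons(1, mid, meta)
garbage = Cons(99)

mark_from([head])
print([c.marked for c in (head, mid, tail, meta, garbage)])
# [True, True, True, True, False]
```

Without the `rest` and `meta` visits, only the root cell would be marked and a sweep would reclaim the remainder of a live list.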

Mixed-lifetime caveat: a long-lived (GC) list may reference a
short-lived (arena) sub-tail; the arena.deinit at eval-end leaves
the GC-managed parent's `rest` pointing at freed memory. Boundary
copy lands at a later sub-step (analogous to cw v0's D100 #1 fix)
once analyzer-arena ownership boundaries are explicit.

Gate: Mac 13/13 + OrbStack Ubuntu x86_64 12/12 green (first-try).
chaploud added a commit that referenced this pull request May 28, 2026
… to P1

D-125 (per-task note batch), D-128 (orphan_prevention rule), and
D-129 (handover hook trim-Edit exemption) all closed via this
session's P0 batch. Debt rows flip to Discharged; handover loses
the now-stale `## Stopped` section per handover_framing.md resume
discipline, and the Resume contract advances to P1 item #1
(D-121 Java static method dispatch infra).

Smell-audited: depth 0: bookkeeping flip; status accurately
records the SHA of each discharging commit.
chaploud added a commit that referenced this pull request May 31, 2026
…); file D-180; wire resume

Smell-audited: 2: ROADMAP §17 amendment — adds §9.2.S (perf campaign, ROI-ordered,
ADR-0063 governance) as the ACTIVE resume target ahead of §9.2.R's Phase-15/JIT
sequence (F-003: no renumber, pulled-forward overlay). ADR-0063 Revision history
records the formalisation. No code smell.

User-directed (2026-05-31): extend the one-off range pull-forward into a sustained
ROI-ordered speed-tuning campaign, incorporated into the formal plan, wired so a
clear session tackles it head-on first. Changes:
- ROADMAP §9.2.S: the campaign (units table O-001✓/O-002✓ → D-180 → D-163 → D-140,
  measured numbers, PERF-marker + optimizations.md governance, F-002/F-011 discipline).
- D-180 filed: bulk persistent! / vector.fromSlice (toPersistent's N-persistent-conj
  rebuild = the into/vec 121s bottleneck; pairs with the reverted O-003 transient
  into/vec; core-Vector change → exhaustive boundary tests).
- handover resume contract → first-on-resume = §9.2.S / D-180.
- optimizations.md: identified candidates #1 (D-180) + #2 (D-140 startup) with measured
  numbers; map/filter fusion (D-163) annotated 42s/1e5.
chaploud added a commit that referenced this pull request Jun 4, 2026
…-io call sites (Phase B #1, ADR-0090 §1)

Smell-audited: 1: first Phase B implementation increment per ADR-0090 §1.

New runtime/concurrency/io_default.zig: a process-wide std.Io singleton (lazy single-threaded default; set() upgrades to rt.io in production) + lockMutex/unlockMutex/condWait/condSignal/condBroadcast/sleep wrappers over std.Io.Mutex/Condition. Solves the no-io-arg call sites (GC allocator vtable + module-level mutexes) that the global heap lock (increment #2) needs. cljw-clean re-derivation from cw v0's io_default.zig (no_copy_from_v1: reference, not copy; kept only the sync-primitive surface).

Wired into the src/main.zig test aggregator (test-discovery trap). 3 unit tests green (get lazy-init, lockMutex/unlockMutex round-trip, cond signal/broadcast no-waiter); zig build green.

Additive (new file + pure aggregator insertion) -> rides the gate-cadence batch; the full gate lands at increment #2 (the GC global heap mutex = shared-code/risky change). No concurrency consumer yet -> single-threaded behaviour unchanged.
chaploud added a commit that referenced this pull request Jun 4, 2026
…locks #3; re-analysis gated (D-244)

Smell-audited: 2: Bad-Smell interrupt surfaced while designing increment #3.

root_set.zig roots ns_vars/current_frame/macro_root_slot/permanent_roots but NOT the VM operand stack (vm.zig local Value array) nor tree_walk native-stack intermediates; safe today only because collect() runs at quiescent explicit points (no auto-collect). For Phase B real threads (#4), a mid-eval worker's operand/native-stack Values are un-rooted -> concurrent collect UAF; plus a pushFrame/popFrame read-during-write race during another thread's root walk. So ADR-0090 §2 Alt-2's 'no safepoint needed' is insufficient for mid-eval workers.

Recorded as ADR-0090 Revision history + D-244 (the #3 gating design step): re-analyse with a DA-fork (safepoint Alt-1 vs publish-VM-operand-stack-root + forbid-tree_walk-during-collect) BEFORE the handshake code. The §1-2/§5-7 spine + increments #1/#2 are unaffected (the alloc lock is needed by either mechanism).
chaploud added a commit that referenced this pull request Jun 4, 2026
…e-i, ADR-0090 Alt B)

Smell-audited: 1: implements the decided ADR-0090 #4b real-threading within the
envelope. The FutureCell-pointer + finaliser is a language-forced impl detail
(std.Io.Condition is not extern, so it can't live in the extern Future — it lives
off-heap, infra-allocated, finaliser-freed). io_default.set(init.io) in main is a
gap-fix (the io_default doc's stated production wiring was unimplemented). The
force-VM Q2 + the concurrent-collect hardening (stopWorld currentTarget re-read,
the callFn->store result window, Q1 self-guard setters) are documented as
#4b-future-ii in D-244 — safe to defer because auto-collect is OFF and no user
collect trigger exists, so no collect fires during a worker in -i.

cljw's FIRST real concurrency. future.zig: eager-inline shell -> std.Thread spawn
running the thunk via callFn (VM path on the VM-default build, F-012), gc.pin for
the worker's lifetime, ThreadGcContext registration; deref BLOCKS on an
Io.Mutex/Condition cell. The GC handshake (#1-#4b-poll) makes the worker safe
under a concurrent collect. e2e clj-verified: @(future (+ 1 2))=3, shared-atom
@(future (swap! a inc))=1, worker-alloc @(future (vec (range 5)))=[0 1 2 3 4],
realized? after deref, error->future_thunk_failed (D-115). Existing eager-era
assertions updated to the async-correct shape.
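
The spawn-then-block shape described above can be sketched with Python's `threading` module. This only models the control flow (worker thread runs the thunk, `deref` waits on a mutex/condition cell); it says nothing about GC pinning or the handshake, and the `FutureCell` class here is a hypothetical stand-in that merely borrows the name.

```python
import threading


class FutureCell:
    def __init__(self, thunk):
        self._cond = threading.Condition()  # mutex + condition in one object
        self._done = False
        self._value = None
        self._error = None
        worker = threading.Thread(target=self._run, args=(thunk,), daemon=True)
        worker.start()

    def _run(self, thunk):
        try:
            value, error = thunk(), None
        except Exception as exc:  # store the failure for deref to report
            value, error = None, exc
        with self._cond:
            self._value, self._error, self._done = value, error, True
            self._cond.notify_all()

    def realized(self):
        with self._cond:
            return self._done

    def deref(self):
        with self._cond:
            while not self._done:  # loop guards against spurious wakeups
                self._cond.wait()
        if self._error is not None:
            raise RuntimeError("future_thunk_failed") from self._error
        return self._value


f = FutureCell(lambda: 1 + 2)
print(f.deref(), f.realized())  # 3 True
```

The result is published and the waiters are notified under the same lock, so a `deref` that starts before the worker finishes cannot miss the wakeup.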
chaploud added a commit that referenced this pull request Jun 4, 2026
…R-0092 Option A)

Smell-audited: 1: lands the locking surface over a new object_monitor.zig per
ADR-0092 (Option A = Option C's fast path). A header lock_state-bit spinlock
(CAS the whole gc_and_lock u32 to preserve gc_mark) + a threadlocal [32] held-set
for reentrancy + a safepoint-polling spin (the #1 GC-safety rule: a non-polling
spinner hangs a stop-the-world collect forever). The `locking` macro expands to
`(__locking obj (fn* [] body))`; the primitive holds the monitor across the body
thunk with a defer release. `(locking <immediate>)` errors (AD-014). clj-verified:
basic 42, reentrant 99 (no deadlock), body-env 15, mutual-exclusion 400 (a
non-atomic RMW under the lock loses no updates). Also corrects a stale stm.zig
docstring (STM is implemented, not pending). Contended waiters spin not park —
the blocking inflation is D-245.
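
The header-bit spinlock can be sketched as follows. Everything here is an assumption made for illustration: the bit positions of `LOCK_BIT` and `GC_MARK_BIT`, the `AtomicU32` class (Python has no hardware CAS, so it is emulated with an internal lock), the unbounded per-thread dict in place of the fixed 32-entry held-set, and `poll_safepoint`, which merely yields where the real code would check for a pending collect.

```python
import threading
import time

LOCK_BIT = 0x1      # assumed position of the lock_state bit
GC_MARK_BIT = 0x2   # assumed position of the gc_mark bit
U32 = 0xFFFFFFFF


class AtomicU32:
    """Emulated atomic word; compare_exchange swaps the whole u32 at once."""

    def __init__(self, value=0):
        self._value = value
        self._guard = threading.Lock()

    def load(self):
        with self._guard:
            return self._value

    def compare_exchange(self, expected, new):
        with self._guard:
            if self._value == expected:
                self._value = new & U32
                return True
            return False


_held = threading.local()  # per-thread reentrancy counts


def _counts():
    if not hasattr(_held, "counts"):
        _held.counts = {}
    return _held.counts


def poll_safepoint():
    time.sleep(0)  # stand-in: yield so a stop-the-world request could proceed


def monitor_enter(word):
    counts = _counts()
    key = id(word)
    if counts.get(key, 0) > 0:  # already held by this thread: reentrant
        counts[key] += 1
        return
    while True:
        old = word.load()
        # Set only the lock bit; every other bit (gc_mark) is carried over.
        if not old & LOCK_BIT and word.compare_exchange(old, old | LOCK_BIT):
            counts[key] = 1
            return
        poll_safepoint()


def monitor_exit(word):
    counts = _counts()
    key = id(word)
    counts[key] -= 1
    if counts[key] == 0:
        del counts[key]
        while True:
            old = word.load()
            if word.compare_exchange(old, old & ~LOCK_BIT):
                return


header = AtomicU32(GC_MARK_BIT)
monitor_enter(header)
monitor_enter(header)            # reentrant, no deadlock
locked = header.load()
monitor_exit(header)
monitor_exit(header)
print(locked, header.load())     # 3 2
```

Swapping the whole word through one compare-exchange is what keeps a concurrently set mark bit from being lost: if the word changed between the load and the swap, the exchange fails and the loop retries with the fresh value.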
chaploud added a commit that referenced this pull request Jun 6, 2026
Convergence Campaign Stage 0.1 / 0.2 / 0.5 inventory.

- 0.2 NEW .dev/v0_v1_feature_parity.md — every v0 bundled namespace (32)
  + app feature mapped to v1: 12 present / 3 partial / 24 MISSING, each
  MISSING carrying an owning debt row. Probed against a fresh cljw.
- 0.2 debt: new umbrella D-273 anchors the 24 MISSING (detail in the
  parity SSOT; single anchor per D-242 precedent — no per-ns row mint),
  so campaign Goal #1 (no un-rowed MISSING) holds.
- 0.5 NEW docs/works/ (F-010) — README + a 15-row pure-Clojure-degree
  ranked ladder. 4 libs actually require'd green on cljw (medley,
  math.combinatorics, tools.cli, cuerdas-partial); -cp works (ADR-0084),
  deps.edn (Stage 1.2) is the next unlock.
- 0.1 core_coverage_gaps.md — recipe re-run (168 raw missing, unchanged
  shape; residue dominated by known-deferred REPL-dynvar/array/proxy
  classes). Real coverage driver now = D-273 backfill + the lib ladder.