v0.8.61: community harvest + freeze fix + WhaleFlow foundation layer (WIP, for review) by Hmbown · Pull Request #3225 · Hmbown/CodeWhale

Hmbown · 2026-06-14T17:15:53Z

v0.8.61 — community harvest + launch-blocker fix + WhaleFlow foundation layer

Draft / for review. This is the assembled v0.8.61 work on codex/v0.8.61 (28 commits over main). It is intentionally not merge-ready yet — the version bump is still a local working-tree change, and the new foundation modules are deliberately unwired (#![allow(dead_code)]) pending their follow-up wiring passes.

👏 New community contributors (6) — harvested with authorship preserved + credited on each PR

PR	Author	What
#3201	@mvanhorn	revive non-DeepSeek cost tracking → closes #3066
#3195	@cyq1017	telegram bridge keeps polling while turns stream (relates #2966)
#3220	@RobertEmprechtinger	cap mobile event history (freeze fix)
#3199	@gaord	`PUT /v1/sessions` engine-snapshot session save
#3197	@nightt5879	rename DEEPSEEK_BLUE → WHALE_ACCENT_PRIMARY (deprecated aliases kept) → closes #3069
#3221	@hongchen1993	exec honors `DEEPSEEK_BASE_URL` / `DEEPSEEK_MODEL`

(#3013 @cyq1017 verified already-implemented and credited.)

Launch blocker — sub-agent fanout freeze (#3216 / #2211)

Mechanism fix: the turn loop now observes cancellation between tool batches, so a runaway agent_open fanout is promptly interruptible instead of wedging the TUI.
Trigger fix: the base prompt no longer tells the model to spawn sub-agents "liberally" or advertises a 10/20 cap (the real effective launch limit is ~4); it now teaches deliberate, batch-and-poll fanout. (This was the "overlapping material" that drove the freeze.)
Full nonblocking/durable fanout is the designed follow-up (see docs/V0_8_61_EXECUTION.md).

Quick-fix issues closed

#3012 (auto-load global ~/.codewhale/instructions.md), #3068 (legacy .deepseek/ path audit doc), #3208 (release-artifact docs), plus wave-1 #3214 (branch-hygiene tool), #3188 (git identity in TUI status), #3076 (neutral provider ordering).

WhaleFlow foundation layer (the spine)

The orchestration pattern is WhaleFlow: roughly ultracode, realized natively in CodeWhale with heterogeneous-model workers. Ten additive, tested foundation modules, not yet wired into live flows except where noted:
worker_profile (per-role permissions + model route + non-escalating parent→child derivation, #3217/#3211/#3213/#414/#426/#1186), goal_loop (persistent-objective decision core, wired into the continuation hook, #3215), record_thread_goal_usage (durable per-goal accounting), model_registry (#3071/#3073), provider_readiness (#3083), context_budget (#3086), provider_adapter (#3084), resource_telemetry (#2666), theme_override (#3074), request_tuning (#3024).

Docs

docs/V0_8_61_RELEASE_TRIAGE.md, docs/V0_8_61_ISSUE_COVERAGE.md (all 84 milestone issues → disposition + plan), docs/V0_8_61_EXECUTION.md (12 workstream clusters + the WhaleFlow spine).

Verification

cargo test -p codewhale-tui --bins → 4812 pass / 0 fail; codewhale-config/-protocol/-cli/-whaleflow/-state green; cargo fmt --all --check + git diff --check clean; scripts/release/check-versions.sh OK. (CI will run the full release build on this PR.)

Not in this PR / follow-ups

Version bump to 0.8.61 (kept as your local working-tree change).
Wiring the foundations into live flows (the "make it real" passes).
A 12-issue design batch is in progress to guide the next code waves: #3216 (v0.8.61: Make multi-agent fanout durable and nonblocking before shipping swarm UX); #3096 (v0.8.61: Split sub-agents into a headless worker runtime with lightweight TUI projections); #3154 (v0.8.65 EPIC: Fleet execution substrate for profiled workers); #1812 (TUI-freeze-Windows-crossterm-poll); #1737 (v0.8.40: failed shell command can leave TUI feeling stuck with active task_shell_wait/background job state); #1786 (Work Queue Sync Lag & Shell PID Hang Causing Premature LIVE-State Exit During Long-Running Tasks); #3212 (v0.8.61: Default independent shell and verifier work to background jobs); #1541 (Track non-user cancellation reasons through engine call sites); #3146 (v0.8.61: Render tool work as expandable activity metadata rows); #2054 (Composer can accidentally enter queued-steer edit state without clear mode feedback); #3075 (v0.8.65: Cross-provider /model search using explicit route rows); #3072 (Hydrate model catalog metadata from provider APIs with offline cache and provenance).

🤖 Generated with Claude Code

…goal mode, PR triage)
Synthesizes four investigations into the release plan:
- Launch blocker #3216/#2211: TUI freeze root-caused at code level. The global tool_exec_lock is held across the agent_open flash-router model call in a serial, cancel-less, inline batch (mechanism), driven by prompt guidance that tells the model to spawn 'liberally' and advertises a 10/20 cap vs the real ~4 (trigger).
- Sub-agent vs Fleet overlap (~70% unified; agent_open not yet on the durable path).
- Goal Mode: three disconnected goal models; within-turn loop only; dead metering.
- PR/issue stewardship recommendations with credit (not executed).
Chunked-PR plan + exact push/PR/merge commands included. No GitHub state changed.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The base prompt told the model to spawn sub-agents 'liberally' and advertised a cap of '10 concurrent / ceiling 20', while the effective concurrent-launch limit is ~4 (interactive_max_launch); the rest queue. The parallel-first heuristic even said to fire 'all agent_open calls in one turn'. That guidance is what drove the six-sub-agent burst that wedged the TUI (#3216 / #2211): the model was coached into exactly the high-fanout pattern the runtime can't yet absorb.
Reframe across all prompt representations (compiled constitution.md, legacy base.md, constitution.yaml source) and docs/SUBAGENTS.md:
- 'use them liberally' -> 'use them deliberately; each is a real spawn, the win is a clean context, not free parallelism'
- correct the cap: ~4 execute at once, rest queue; open a small batch (~4), poll with agent_eval, open the next batch; max_concurrent (10/20) caps TRACKED agents, not parallel execution
- parallel-first bullet: 'all agent_open in one turn' -> 'a small batch (~4), then poll'
Pairs with the freeze mechanism fix (cancel arm + drop global tool lock across the router model call). Prompt/docs only; no code-path change. tui bin compiles; all 90 prompt tests pass. The renderer (render_constitution.py) was not run: constitution.md uses {placeholder} expansion that the YAML inlines, so the files were edited consistently by hand to avoid a non-idempotent regeneration clobber.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…3216, #2211) When the model emitted several non-parallel tool calls in one turn — the classic case being six agent_open calls under /model auto — the turn loop ran them as sequential Serial batches with NO cancellation check between them. Each agent_open resolves a model route (the 4s flash router) while the global tool_exec_lock is held, so the batch could occupy the engine task for ~6x4s with the UI unable to interrupt it: a hard freeze where Esc/Ctrl+C did nothing. Cancellation is in fact delivered out-of-band: EngineHandle::cancel_with_reason locks the shared CancellationToken and cancels it directly (handle.rs), so the token is already cancelled while the batch runs — the loop simply never looked. The streaming and sub-agent-wait paths already race the token (turn_loop.rs 408/502/1107); the tool-batch loop did not. Fix: check self.cancel_token at the top of the "for batch in batches" loop. Once cancelled, stop launching further batches and record an interrupted result for every remaining plan (Ok(ToolResult success:false), not Err — so it does not inflate the step's error counters), keeping each tool_use paired with a tool_result so the transcript stays well-formed for resume. The post-loop check then ends the turn as Interrupted. This branch is a no-op on the normal (non-cancelled) path. Scope: this makes a runaway fan-out promptly cancellable (24s -> interrupt). It does NOT by itself make the parent nonblocking during fan-out — detaching agent_open onto the durable fleet-backed worker run (per docs/AGENT_RUNTIME.md cutover rule) remains the larger #3216 follow-up, designed in docs/V0_8_61_RELEASE_TRIAGE.md. Pairs with the prompt/cap-honesty chunk that stops the model spawning the burst in the first place. Verification: cargo fmt --all --check clean; new unit test (cancel_batch_tests::interrupted_tool_result_is_a_non_error_unexecuted_marker) passes; cargo test -p codewhale-tui --bins => 4695 pass. 
The single full-suite failure (a tools::test_runner meta-test that spawns a nested cargo) is a CARGO_TARGET_DIR test-isolation artifact of the local shared-target build — it passes in isolation and on the unmodified base; unrelated to this change (different module). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
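
The cancellation fix described in this commit can be illustrated with a minimal, standard-library-only sketch. It is not the code in turn_loop.rs: `run_batches`, the `ToolResult` shape, and the use of an `AtomicBool` in place of the shared CancellationToken are all assumptions made for illustration. What it shows is the pattern itself: check the flag at the top of each batch, stop launching work once it is set, and still emit one non-error result per remaining plan so every tool_use stays paired with a tool_result.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Result paired with one planned tool call (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct ToolResult {
    pub tool_use_id: String,
    pub success: bool,
    pub output: String,
}

/// Runs serial batches, observing `cancelled` at the top of each batch.
/// `execute` is a stand-in for real tool execution.
pub fn run_batches<F>(
    batches: &[Vec<String>],
    cancelled: &AtomicBool,
    mut execute: F,
) -> Vec<ToolResult>
where
    F: FnMut(&str) -> String,
{
    let mut results = Vec::new();
    let mut interrupted = false;
    for batch in batches {
        // Once cancellation is seen, no further batch is launched.
        interrupted = interrupted || cancelled.load(Ordering::SeqCst);
        for id in batch {
            let result = if interrupted {
                // Non-error marker: the plan was never executed, but the
                // transcript still gets a result for this tool_use id.
                ToolResult {
                    tool_use_id: id.clone(),
                    success: false,
                    output: "interrupted before execution".to_string(),
                }
            } else {
                ToolResult {
                    tool_use_id: id.clone(),
                    success: true,
                    output: execute(id.as_str()),
                }
            };
            results.push(result);
        }
    }
    results
}
```

In this sketch a batch that has already started runs to completion; only the batches after the cancellation point are skipped, which matches the "promptly cancellable, not yet nonblocking" scope stated above.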

@mvanhorn

…icing table Fixes #3066 Harvested-from: PR #3201 by @mvanhorn Closes: #3066

@cyq1017

Harvested-from: PR #3195 by @cyq1017 Refs: #2966

@RobertEmprechtinger

Harvested-from: PR #3220 by @RobertEmprechtinger Refs: #3216

@gaord

…sion save
Add a new PUT /v1/sessions endpoint that saves a thread's current engine state as a session, complementing the existing POST /v1/sessions which reconstructs messages from stored turn items. The new endpoint asks the engine for its live session snapshot via a oneshot channel, so token counts and message ordering are authoritative rather than reconstructed. This matches TUI's build_session_snapshot behavior.
Changes:
- ops.rs: add SessionSnapshot struct and Op::GetSessionSnapshot variant
- engine.rs: handle GetSessionSnapshot in the engine loop
- engine/handle.rs: add get_session_snapshot() method
- runtime_threads.rs: expose get_engine() as public wrapper
- runtime_api.rs: add PUT /v1/sessions route and save_current_session handler
Also fixes the Greptile review issue where load_session errors were silently swallowed: only io::ErrorKind::NotFound falls back to creating a new session; other I/O errors (e.g. PermissionDenied) are now propagated.
Ref: #2808
Harvested-from: PR #3199 by @gaord

@nightt5879

Harvested-from: PR #3197 by @nightt5879 Closes: #3069

@hongchen1993

Three fixes so that 'codewhale --provider wanjie-ark --base-url <url> --model auto exec ...' works without a wrapper script:
1. resolve_exec_model: fall back to CODEWHALE_MODEL/DEEPSEEK_MODEL env vars when the explicit arg is absent.
2. Exec command handler: read DEEPSEEK_BASE_URL and set it on the config before creating the client.
3. deepseek_base_url: try env_base_url_override() as a fallback before the provider default.
Harvested-from: PR #3221 by @hongchen1993
Refs: #3205

…#3012) global_context_relative_paths() only honored AGENTS.md and the deprecated WHALE.md globally; ~/.codewhale/instructions.md (and .agents/ + .deepseek/ variants) was project-level only. Add the three instructions paths ranked between AGENTS.md (higher) and WHALE.md (lower), matching the documented project-level precedence. The existing load loop, merge, and deprecation-warning logic need no other change. Adds two tests (autoload+outranks-WHALE, AGENTS-outranks-instructions). Closes: #3012 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

… issues) Coverage matrix: ultracode triage of all 84 open milestone issues — disposition (already-done 16 / quick-fix 3 / design 52 / defer 13) + a concrete plan per issue. Execution plan: clusters the 52 design issues into 12 agent-owned workstreams with sequencing + dependencies, and makes the architectural spine explicit — WhaleFlow is the ultracode orchestration pattern realized inside CodeWhale, where workers are heterogeneous model types (flash scouts, pro synthesis, per-role model routes). Goal mode #3215 / durable fanout #3216 / fleet #3154 / profiles #3217 / swarm gate #3218 are facets of that one epic. Critical scope: multi-pass, merge only verified branches. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…3068) Documents the consolidated state-dir resolver (config::resolve_state_dir / ensure_state_dir, read-.codewhale-fallback-.deepseek / write-.codewhale) and a keep/deprecate/remove decision per legacy reference. Decision: keep-as-fallback for all; routing the remaining hardcoded sites through the resolver is a flagged follow-up. Closes: #3068 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

#3208 was confusion, not a bug: codewhale-<platform> (bare) is what the npm wrapper and in-app updater download; codewhale-<platform>.tar.gz bundles the same binaries + install.sh for manual installs. Clarify both the generated GitHub release body and INSTALL.md §6 so the Releases page is self-explanatory. Docs only; no pipeline logic. Closes: #3208 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Adds scripts/release/branch-hygiene.sh: a read-only, dry-run-by-default tool that makes the post-merge release state obvious and recommends safe branch cleanup. It reports the current checkout, local+remote release tips, and origin/main; flags branches whose tip is already contained in main or the release branch as safe deletes; and lists branches with unique commits as keep/review, naming the branch, unique-commit count, author(s), and reason. Enforces the contributor-preservation policy: a branch is only ever a maintainer-only safe/review item if every unique commit maps to Hunter (via a built-in list + the canonical side of .mailmap). Any unique commit from another contributor forces a KEEP and is never auto-deleted unless already merged. Deletion is gated behind --prune/--prune-remote + a confirmation (or --yes), and a diverged local/remote release tip exits non-zero. Adds a hermetic test (branch-hygiene.test.sh) that builds a synthetic repo and asserts safe-delete detection, contributor preservation, mailmap folding, the parked-checkout warning, prune behavior, and divergence detection. Documents the exact commands as section 5b of the release checklist. Refs: #3214 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The footer git chip previously showed only the branch and rendered nothing outside a git repo, so the status surface could read as an empty "Repo:" label (issue #3188). Surface a concise, factual workspace identity sourced strictly from workspace/git detection — never from model narration or config text: - Git repo: "Repo: <name> @ <branch>" (carries the cached "detached:<hash>" form for detached HEAD). - Non-git cwd: "Repo: <name> (no git)" instead of hiding the chip. Add width-aware formatting (`format_repo_identity`) that keeps the repo identity over the branch under width pressure, then truncates the name rather than collapsing to a bare prefix. The chip renders the full identity and lets the footer widget clip to terminal width (matching the prior branch-only chip). The truncation policy is unit-tested with explicit widths. Tests cover git repo, detached HEAD, non-git cwd, narrow-width truncation priority, and a real-git integration check. The existing two footer-branch tests are updated to the new identity contract. Refs: #3188 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Provider browsing surfaces have historically led with whichever provider sits first in ProviderKind::ALL (DeepSeek). Add a config-crate helper that returns the built-in providers sorted alphabetically (case-insensitively) by display name, giving model/provider UI a neutral order without DeepSeek being hard-coded first. The helper is purely additive: ProviderKind::ALL and all_providers() keep their stable insertion order for internal parsing and default selection, so provider resolution and defaults are unchanged. Doc comments now spell out which order is for stable internal matching vs UI display. DeepSeek remains present and searchable. Foundational public API for the UI wiring (model picker / provider picker / completions), which intersects unmerged provider-aware search (#3075) and is left as a follow-up so this slice stays safe and self-contained. Tests assert display order is alphabetical, differs from ProviderKind::ALL order, is complete and de-duplicated, and that DeepSeek is present but not first in display order. Refs: #3076

…e substrate Adds crates/tui/src/worker_profile.rs: the per-role capability contract every detached worker (agent_open sub-agent or Fleet worker) should run under — PermissionSet (write/network), ShellPolicy (None/ReadOnly/Full, replacing the legacy shell boolean), ToolScope (Inherit/Explicit, mirroring AgentWorkerToolProfile), ModelRoute (Inherit/Auto/Fixed — the heterogeneous-model piece), provider override, spawn-depth budget, and background flag. derive_child(parent, requested) intersects capabilities so a child can NEVER escalate beyond its parent (permissions AND-ed, shell min, explicit tools bounded by the parent set, depth decremented + clamped to MAX_SPAWN_DEPTH_CEILING). Reuses the existing SubAgentType role taxonomy (one taxonomy, not a parallel one). 7 tests. Foundation only (#![allow(dead_code)]): wiring agent_open/fleet to build + enforce these profiles, and mapping the legacy shell boolean / AgentWorkerToolProfile onto it, is the follow-up. This is the substrate for the WhaleFlow≈ultracode worker model. Refs: #3217, #3211, #3213, #414, #426, #1186 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
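
The non-escalation rule for derive_child can be sketched as a capability intersection. This is a simplified illustration rather than the contents of worker_profile.rs: the `Profile` struct collapses PermissionSet and ToolScope into plain fields, tool scopes are always explicit sets, and the value of `MAX_SPAWN_DEPTH_CEILING` is an assumed placeholder.

```rust
use std::collections::BTreeSet;

/// Ordered so that `min` picks the more restrictive policy.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum ShellPolicy {
    None,
    ReadOnly,
    Full,
}

/// Simplified worker profile (hypothetical shape).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Profile {
    pub write: bool,
    pub network: bool,
    pub shell: ShellPolicy,
    pub tools: BTreeSet<String>,
    pub spawn_depth: u8,
}

/// Assumed ceiling; the real constant's value is not stated in the PR.
pub const MAX_SPAWN_DEPTH_CEILING: u8 = 3;

/// Intersects the requested profile with the parent's, so the child can
/// never hold a capability the parent lacks.
pub fn derive_child(parent: &Profile, requested: &Profile) -> Profile {
    Profile {
        // Permissions are AND-ed.
        write: parent.write && requested.write,
        network: parent.network && requested.network,
        // Shell access takes the minimum of the two policies.
        shell: parent.shell.min(requested.shell),
        // Explicit tools are bounded by the parent's set.
        tools: requested.tools.intersection(&parent.tools).cloned().collect(),
        // Depth is decremented from the parent and clamped to the ceiling.
        spawn_depth: requested
            .spawn_depth
            .min(parent.spawn_depth.saturating_sub(1))
            .min(MAX_SPAWN_DEPTH_CEILING),
    }
}
```

Deriving `Ord` on `ShellPolicy` in declaration order is what lets a single `min` call express "the stricter of the two", which keeps the rule hard to get wrong when a new variant is added between existing ones.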

Add crates/tui/src/model_registry.rs: one place to look up model facts (id, provider grouping, context_window, max_output, supports_reasoning). Every numeric fact is SEEDED from the existing crate::models lookups (context_window_for_model / max_output_tokens_for_model / model_supports_reasoning) so the registry can never silently disagree with models.rs. Canonical model ids mirror the DEFAULT_* provider defaults in crates/config plus the explicit models.rs rows. This is additive foundation only: existing hard-coded call sites are left untouched and will be migrated to consume the registry in a later pass. A drift-guard test re-asserts the registry context window equals the live models.rs value for a per-provider sample, so a future hard-coded literal that drifts is caught in CI. DeepSeek defaults are seeded and classified first-class. Wires `mod model_registry;` into crates/tui/src/main.rs. Refs: #3071 #3073 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Wire the currently-dead per-goal token/time accounting on the durable ThreadGoal. The protocol ThreadGoal and the state thread_goals table already carry tokens_used/time_used_seconds, but they were set to 0 at creation and never incremented. Add StateStore::record_thread_goal_usage, an additive helper that atomically increments tokens_used and time_used_seconds (via SQL col = col + ?) and advances updated_at monotonically (MAX(updated_at, now)), returning the updated ThreadGoalRecord or None when the thread has no goal. It never creates a goal row. This is the durable-accounting foundation the persistent goal loop (#3215) and the sidebar will later read; no runtime goal loop or existing behavior is changed. Covered by two unit tests: multi-accrual accumulation with identity-field and monotonic-updated_at assertions, and the goalless no-op (returns None, creates nothing). Refs: #3215 #1976 #2029 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Add an additive, pure data foundation for the future /provider readiness dashboard. New module `provider_readiness` assembles a `ProviderReadinessRow` per provider from the existing `config::provider_capability` + `has_api_key_for` + model-resolution helpers: provider, active flag, has_key, credential-derived `ProviderReadiness`, resolved model + `ModelProvenance`, base URL hint, context_window/max_output, and the thinking/cache/streaming capability flags. Foundation-only: no rendering and no network I/O (live health states are reserved variants for a later cached-health layer), and `provider_picker.rs` is intentionally untouched. Wired via `mod provider_readiness;` in main.rs. 10 unit tests cover local/hosted/active/inactive readiness, V4 metadata, catalog-default vs saved provenance, explicit-unknown metadata, and base-url surfacing. DeepSeek support and CodeWhale branding preserved. Refs: #3083 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Add crates/tui/src/context_budget.rs: a pure, I/O-free ContextBudget service that derives available input budget, the output token cap, a compaction-trigger threshold (~75% of window), and a Low/Medium/High/ Critical PressureLevel from a model context window, current input tokens, and a configured output cap. Mirrors the engine's hard-won budget semantics (window-dependent output reservation, window - reserved_output - headroom, saturating arithmetic that never underflows on small self-hosted windows) as standalone functions, with thorough unit tests across window sizes (8K..1M) and every pressure/compaction boundary. PressureLevel labels stay aligned with the existing context-report vocabulary. Foundation-only: wiring the engine capacity checkpoints and the TUI pressure indicator to consume this is a later pass. Additive — no existing behavior or tests changed; DeepSeek support and CodeWhale branding preserved. Refs: #3086 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
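
A rough sketch of the budget arithmetic described in this commit follows. Only the overall shape comes from the commit message (window minus reserved output minus headroom, saturating subtraction, a compaction trigger near 75% of the window, four pressure levels). The headroom constant, the quarter-window output reservation, the pressure cut-offs, and the name `compute_budget` are assumptions chosen for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PressureLevel {
    Low,
    Medium,
    High,
    Critical,
}

#[derive(Debug, Clone, Copy)]
pub struct ContextBudget {
    pub available_input: u64,
    pub output_cap: u64,
    pub compaction_threshold: u64,
    pub pressure: PressureLevel,
}

/// Assumed safety margin kept free in every window.
const HEADROOM_TOKENS: u64 = 1_024;

pub fn compute_budget(window: u64, input_tokens: u64, configured_output_cap: u64) -> ContextBudget {
    // Window-dependent reservation (assumed rule): never reserve more
    // than a quarter of the window for output.
    let output_cap = configured_output_cap.min(window / 4);
    // Saturating subtraction cannot underflow on small windows.
    let available_input = window
        .saturating_sub(output_cap)
        .saturating_sub(HEADROOM_TOKENS);
    // Compaction triggers at 75% of the window.
    let compaction_threshold = window.saturating_mul(3) / 4;
    // Percentage of the window already used; an empty window is treated as full.
    let used_pct = if window == 0 {
        100
    } else {
        input_tokens.saturating_mul(100) / window
    };
    // Assumed cut-offs for the four levels.
    let pressure = match used_pct {
        0..=49 => PressureLevel::Low,
        50..=74 => PressureLevel::Medium,
        75..=89 => PressureLevel::High,
        _ => PressureLevel::Critical,
    };
    ContextBudget {
        available_input,
        output_cap,
        compaction_threshold,
        pressure,
    }
}
```

Keeping the function pure, with plain integers in and a small struct out, is what makes boundary testing across 8K to 1M windows cheap.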

…layer (#3215) Adds crates/tui/src/goal_loop.rs: the pure decision core that turns a one-shot /goal into a persistent work loop. decide_continuation(status, progress, budget) returns Continue or Stop(reason) — terminal model status (Completed/Blocked) wins, then budget/circuit-breaker (token, time, and an always-on continuation cap that prevents a runaway loop), else continue. Reads the durable per-goal accounting wired by crates/state record_thread_goal_usage. 7 tests. Foundation only (#![allow(dead_code)]): the engine continuation hook (turn_loop.rs goal_continuation_message_if_needed, today capped at 3/turn and reset each turn) consumes this in the follow-up that makes goal mode persistent across turns + durable. This is the orchestrator in the WhaleFlow≈ultracode mapping; it composes with the worker_profile (worker substrate) and goal-metering (durable accounting) foundations. Refs: #3215, #891, #1976, #2058, #2029 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
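
The precedence order in decide_continuation (terminal status first, then budgets and the continuation cap, otherwise continue) can be sketched as below. The names `ContinuationDecision`, `StopReason::TokenBudget`, `StopReason::TimeBudget`, `token_budget`, and `time_budget_seconds` appear in the reviewed diff; `GoalStatus`, the other `StopReason` variants, and the `continuations` / `max_continuations` fields are hypothetical names introduced here.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GoalStatus {
    Active,
    Completed,
    Blocked,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StopReason {
    Completed,
    Blocked,
    TokenBudget,
    TimeBudget,
    ContinuationCap,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ContinuationDecision {
    Continue,
    Stop(StopReason),
}

pub struct GoalProgress {
    pub tokens_used: u64,
    pub time_used_seconds: u64,
    pub continuations: u32,
}

pub struct GoalBudget {
    pub token_budget: Option<u64>,
    pub time_budget_seconds: Option<u64>,
    /// Always-on circuit breaker, even when no other budget is set.
    pub max_continuations: u32,
}

pub fn decide_continuation(
    status: GoalStatus,
    progress: &GoalProgress,
    budget: &GoalBudget,
) -> ContinuationDecision {
    // 1. A terminal status reported by the model always wins.
    match status {
        GoalStatus::Completed => return ContinuationDecision::Stop(StopReason::Completed),
        GoalStatus::Blocked => return ContinuationDecision::Stop(StopReason::Blocked),
        GoalStatus::Active => {}
    }
    // 2. Optional budgets; `None` means unbounded.
    if budget.token_budget.is_some_and(|t| progress.tokens_used >= t) {
        return ContinuationDecision::Stop(StopReason::TokenBudget);
    }
    if budget.time_budget_seconds.is_some_and(|s| progress.time_used_seconds >= s) {
        return ContinuationDecision::Stop(StopReason::TimeBudget);
    }
    // 3. The continuation cap prevents a runaway loop.
    if progress.continuations >= budget.max_continuations {
        return ContinuationDecision::Stop(StopReason::ContinuationCap);
    }
    ContinuationDecision::Continue
}
```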

Introduce an additive crates/tui/src/provider_adapter.rs defining the per-provider contract requested by #3084: a ProviderAdapter trait exposing a capability descriptor (sourced from config::provider_capability, not hard-coded), an AuthModel marker (EnvVar/OAuth/BuiltInKey), and a RequestDialect marker (OpenAiCompatible/DeepSeekNative/Anthropic). Includes a check/assert_adapter_conformance pair validating the context_window > 0, max_output > 0, max_output <= context_window invariants, plus worked DeepSeek and OpenAI-compatible adapters and thorough unit tests. Foundation only (#![allow(dead_code)]); consumers wired later. DeepSeek remains a first-class example. Refs: #3084 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
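
The three conformance invariants named in this commit (context_window > 0, max_output > 0, max_output <= context_window) are simple enough to sketch directly. The trait below is a reduced stand-in, not the real ProviderAdapter: the `Capability` struct, the `name` method, the `FixedAdapter` test type, and the `String` error are all assumptions.

```rust
#[derive(Debug, Clone, Copy)]
pub struct Capability {
    pub context_window: u32,
    pub max_output: u32,
}

/// Reduced stand-in for the per-provider contract.
pub trait ProviderAdapter {
    fn name(&self) -> &str;
    fn capability(&self) -> Capability;
}

/// Returns Ok when the adapter's descriptor satisfies all three invariants.
pub fn check_adapter_conformance(adapter: &dyn ProviderAdapter) -> Result<(), String> {
    let cap = adapter.capability();
    if cap.context_window == 0 {
        return Err(format!("{}: context_window must be > 0", adapter.name()));
    }
    if cap.max_output == 0 {
        return Err(format!("{}: max_output must be > 0", adapter.name()));
    }
    if cap.max_output > cap.context_window {
        return Err(format!("{}: max_output exceeds context_window", adapter.name()));
    }
    Ok(())
}

/// Adapter with a fixed descriptor, for illustration only.
pub struct FixedAdapter {
    pub name: &'static str,
    pub cap: Capability,
}

impl ProviderAdapter for FixedAdapter {
    fn name(&self) -> &str {
        self.name
    }
    fn capability(&self) -> Capability {
        self.cap
    }
}
```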

…re (#3215) goal_continuation_message_if_needed now calls goal_loop::decide_continuation instead of an inline counter check. Behavior-preserving today (the per-turn continuation cap), but this is the seam where durable cross-turn budget — token/time from the per-goal accounting (crates/state record_thread_goal_usage) — gets enforced as goal mode becomes a persistent work loop. Makes the goal_loop foundation a live consumer rather than dead code. 32 goal tests pass. Refs: #3215 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Introduce crates/tui/src/resource_telemetry.rs, a pure, I/O-free foundation for surfacing token/time/resource usage during long tasks.
- ResourceTelemetry { tokens_used, time_used_seconds, token_budget, time_budget_seconds } with builder helpers.
- human_summary() renders a compact line, e.g. "12.3k tok · 4m12s · 41% budget": tokens abbreviated with k/M, time as Hh Mm Ss, budget segment omitted when unbounded.
- token/time/budget fraction + percent helpers (None when unbounded or budget is zero, never infinity).
- Coarse PressureLevel (Low/Medium/High) from the max bounded budget fraction; unbounded tasks are always Low.
No rendering, no I/O, no behavior change. Carries #![allow(dead_code)] (consumers wired later, matching features.rs). Thorough tests cover bounded/unbounded, zero/large values, unit-boundary rounding, and the budget threshold edges.
Refs: #2666
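
One way to produce a summary line of the shape quoted in this commit is sketched below. The struct fields follow the commit message; the helper names, the truncating (rather than rounding) abbreviation, the unpadded time format, and the choice to report the larger of the two bounded percentages are assumptions, so the real module may format edge cases differently.

```rust
pub struct ResourceTelemetry {
    pub tokens_used: u64,
    pub time_used_seconds: u64,
    pub token_budget: Option<u64>,
    pub time_budget_seconds: Option<u64>,
}

/// Abbreviates with k/M using integer math, truncating to one decimal so a
/// value just under a unit boundary never prints as "1000.0k".
fn format_tokens(n: u64) -> String {
    if n >= 1_000_000 {
        format!("{}.{}M", n / 1_000_000, (n % 1_000_000) / 100_000)
    } else if n >= 1_000 {
        format!("{}.{}k", n / 1_000, (n % 1_000) / 100)
    } else {
        n.to_string()
    }
}

/// Renders seconds as h/m/s, dropping leading zero units.
fn format_duration(secs: u64) -> String {
    let (h, m, s) = (secs / 3_600, (secs % 3_600) / 60, secs % 60);
    if h > 0 {
        format!("{h}h{m}m{s}s")
    } else if m > 0 {
        format!("{m}m{s}s")
    } else {
        format!("{s}s")
    }
}

/// None when the budget is unbounded or zero, so there is no division by zero.
fn percent(used: u64, budget: Option<u64>) -> Option<u64> {
    match budget {
        Some(b) if b > 0 => Some(used.saturating_mul(100) / b),
        _ => None,
    }
}

impl ResourceTelemetry {
    /// Largest bounded percentage; `Option`'s ordering puts None below Some.
    pub fn budget_percent(&self) -> Option<u64> {
        let tokens = percent(self.tokens_used, self.token_budget);
        let time = percent(self.time_used_seconds, self.time_budget_seconds);
        tokens.max(time)
    }

    pub fn human_summary(&self) -> String {
        let mut parts = vec![
            format!("{} tok", format_tokens(self.tokens_used)),
            format_duration(self.time_used_seconds),
        ];
        // The budget segment is omitted entirely when nothing is bounded.
        if let Some(p) = self.budget_percent() {
            parts.push(format!("{p}% budget"));
        }
        parts.join(" · ")
    }
}
```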

…on colors
Pure, additive foundation for #3074: dark themes where the default gold accent over a light selection background renders selected rows unreadable. Adds crates/tui/src/theme_override.rs:
- ThemeColorOverride { accent, selection_bg, selection_fg } (all Option<Rgb>, default = inherit), plus selection_contrast() and is_empty() helpers.
- parse_hex_color("#RRGGBB"/"RRGGBB") -> (u8,u8,u8) with a typed HexColorParseError (impls Display + std::error::Error).
- relative_luminance / contrast_ratio (WCAG 2.x) and meets_min_contrast so a future settings layer can validate legibility.
No rendering, no settings I/O; consumers wired later (matches features.rs, #![allow(dead_code)]). Colors are plain (u8,u8,u8) triples to avoid coupling this foundation to ratatui; the existing palette::parse_hex_rgb_color returns Option<ratatui::Color> and would lose the typed error, so a local parser is used. Only the alphabetical `mod theme_override;` line is added to main.rs.
Tests (cargo test -p codewhale-tui --bins theme_override): valid/invalid hex, black/white ~21:1 contrast, identical colors = 1:1, and a low-contrast gold-on-light-selection pair failing AA. 10 passed.
Refs: #3074
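
The hex parsing and WCAG 2.x contrast check named in this commit can be sketched with the standard library alone. The function names mirror the commit message; the error variants, the exact signatures, and the 4.5:1 AA threshold used in the usage note are assumptions about the real module.

```rust
use std::fmt;

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum HexColorParseError {
    BadLength(usize),
    BadDigit,
}

impl fmt::Display for HexColorParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::BadLength(n) => write!(f, "expected 6 hex digits, got {n} bytes"),
            Self::BadDigit => write!(f, "color contains a non-hex digit"),
        }
    }
}

impl std::error::Error for HexColorParseError {}

/// Accepts "#RRGGBB" or "RRGGBB".
pub fn parse_hex_color(input: &str) -> Result<(u8, u8, u8), HexColorParseError> {
    let hex = input.strip_prefix('#').unwrap_or(input);
    if hex.len() != 6 {
        return Err(HexColorParseError::BadLength(hex.len()));
    }
    // Checking digits first also guarantees the slices below are ASCII.
    if !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return Err(HexColorParseError::BadDigit);
    }
    let channel = |i: usize| {
        u8::from_str_radix(&hex[i..i + 2], 16).map_err(|_| HexColorParseError::BadDigit)
    };
    Ok((channel(0)?, channel(2)?, channel(4)?))
}

/// WCAG 2.x relative luminance of an sRGB color.
pub fn relative_luminance((r, g, b): (u8, u8, u8)) -> f64 {
    let linear = |c: u8| {
        let c = f64::from(c) / 255.0;
        if c <= 0.03928 {
            c / 12.92
        } else {
            ((c + 0.055) / 1.055).powf(2.4)
        }
    };
    0.2126 * linear(r) + 0.7152 * linear(g) + 0.0722 * linear(b)
}

/// Contrast ratio in the range 1.0..=21.0, independent of argument order.
pub fn contrast_ratio(a: (u8, u8, u8), b: (u8, u8, u8)) -> f64 {
    let (la, lb) = (relative_luminance(a), relative_luminance(b));
    (la.max(lb) + 0.05) / (la.min(lb) + 0.05)
}

pub fn meets_min_contrast(fg: (u8, u8, u8), bg: (u8, u8, u8), min_ratio: f64) -> bool {
    contrast_ratio(fg, bg) >= min_ratio
}
```

A settings layer could then reject an override by calling `meets_min_contrast(fg, bg, 4.5)`, the WCAG AA threshold for normal text; a gold foreground such as `#FFD700` on a white selection background falls well short of it.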

Introduce crates/tui/src/request_tuning.rs: a pure, declarative foundation that encodes which providers honor which request-tuning params (reasoning effort + max output tokens), so the silent no-ops in #3024 can be surfaced and fixed deliberately later.
- RequestTuning { reasoning_effort: Option<ReasoningEffort>, max_output_tokens: Option<u32> } reuses the canonical crate::tui::app::ReasoningEffort enum (already imported by auto_reasoning / model_routing) instead of a local copy.
- provider_tuning_support(name) -> TuningSupport { honors_reasoning_effort, honors_max_output_tokens } with documented rows grounded in current client.rs / client/chat.rs behavior: DeepSeek honors both; OpenAI / Moonshot / Ollama / Atlascloud have gaps; unknown names fall back to a conservative default.
Additive only: one new module behind #![allow(dead_code)] (consumers wired later, matching features.rs) plus its alphabetical mod line. No request building, no behavior change. Tests assert the support map for each named provider, the DeepSeek-CN aliases, and the default.
Refs: #3024

gemini-code-assist

Code Review

This pull request introduces several core features and optimizations to CodeWhale, including a persistent goal loop orchestrator, a unified context-budget math module, and a nonblocking sub-agent fanout mechanism to prevent TUI freezes. It also adds a branch hygiene script for post-merge cleanup and expands pricing support for non-DeepSeek models. The code review highlights a few important issues: a portability bug on macOS in the branch hygiene script, performance overhead from string allocations during provider sorting, the use of unstable Rust let_chains in multiple files, and debug print noise in the main entry point.

gemini-code-assist · 2026-06-14T17:17:50Z

+    release_branch="$(git for-each-ref --format='%(refname:short)' 'refs/heads/codex/v*' \
+      | sort -V | tail -n1 || true)"


The sort -V (version sort) option is a GNU extension and is not supported by the default sort utility on macOS (BSD sort), which will cause this script to fail on macOS. Since Git has built-in version sorting support via --sort=version:refname, we can use that directly to ensure portability across all platforms.

Suggested change

-    release_branch="$(git for-each-ref --format='%(refname:short)' 'refs/heads/codex/v*' \
-      | sort -V | tail -n1 || true)"
+    release_branch="$(git for-each-ref --sort=version:refname --format='%(refname:short)' 'refs/heads/codex/v*' \
+      | tail -n1 || true)"

gemini-code-assist · 2026-06-14T17:17:50Z

+pub fn providers_sorted_for_display() -> Vec<&'static dyn Provider> {
+    let mut providers = all_providers().to_vec();
+    providers.sort_by(|a, b| {
+        a.display_name()
+            .to_ascii_lowercase()
+            .cmp(&b.display_name().to_ascii_lowercase())
+    });
+    providers
+}


Calling to_ascii_lowercase() on &str allocates a new String on every comparison. Since sort_by performs $O(N \log N)$ comparisons, this results in hundreds of unnecessary allocations. You can perform a zero-allocation case-insensitive comparison by comparing the byte iterators directly.

Suggested change

-pub fn providers_sorted_for_display() -> Vec<&'static dyn Provider> {
-    let mut providers = all_providers().to_vec();
-    providers.sort_by(|a, b| {
-        a.display_name()
-            .to_ascii_lowercase()
-            .cmp(&b.display_name().to_ascii_lowercase())
-    });
-    providers
-}
+pub fn providers_sorted_for_display() -> Vec<&'static dyn Provider> {
+    let mut providers = all_providers().to_vec();
+    providers.sort_by(|a, b| {
+        a.display_name()
+            .bytes()
+            .map(|b| b.to_ascii_lowercase())
+            .cmp(b.display_name().bytes().map(|b| b.to_ascii_lowercase()))
+    });
+    providers
+}

gemini-code-assist · 2026-06-14T17:17:50Z

+    if let Some(tokens) = budget.token_budget
+        && progress.tokens_used >= tokens
+    {
+        return ContinuationDecision::Stop(StopReason::TokenBudget);
+    }
+    if let Some(secs) = budget.time_budget_seconds
+        && progress.time_used_seconds >= secs
+    {
+        return ContinuationDecision::Stop(StopReason::TimeBudget);
+    }


The let_chains feature was stabilized in Rust 1.88 and only for the 2024 edition; older toolchains and editions still need a nightly compiler. To stay compatible with earlier stable Rust, use Option::is_some_and instead, which is stable since Rust 1.70.0.

    if budget.token_budget.is_some_and(|tokens| progress.tokens_used >= tokens) {
        return ContinuationDecision::Stop(StopReason::TokenBudget);
    }
    if budget.time_budget_seconds.is_some_and(|secs| progress.time_used_seconds >= secs) {
        return ContinuationDecision::Stop(StopReason::TimeBudget);
    }

gemini-code-assist · 2026-06-14T17:17:50Z

+    if let Some(limit) = query.replay_limit
+        && backlog.len() > limit
+    {
+        backlog = backlog.split_off(backlog.len() - limit);
+    }


The let_chains feature is only stable from Rust 1.88 on the 2024 edition. To stay compatible with earlier stable Rust, use a nested if let block instead.

Suggested change

-    if let Some(limit) = query.replay_limit
-        && backlog.len() > limit
-    {
-        backlog = backlog.split_off(backlog.len() - limit);
-    }
+    if let Some(limit) = query.replay_limit {
+        if backlog.len() > limit {
+            backlog = backlog.split_off(backlog.len() - limit);
+        }
+    }

gemini-code-assist · 2026-06-14T17:17:50Z

+                if let Ok(env_url) = std::env::var("DEEPSEEK_BASE_URL") {
+                    let trimmed = env_url.trim();
+                    eprintln!("DEBUG DEEPSEEK_BASE_URL='{trimmed}'");
+                    if !trimmed.is_empty() {
+                        config.base_url = Some(trimmed.to_string());
+                    }
+                } else {
+                    eprintln!("DEBUG DEEPSEEK_BASE_URL not set");
+                }


These eprintln! calls print debug noise to stderr in production. Additionally, the check should also honor CODEWHALE_BASE_URL as the primary environment variable, falling back to DEEPSEEK_BASE_URL for backward compatibility.

let env_url = std::env::var("CODEWHALE_BASE_URL")
    .or_else(|_| std::env::var("DEEPSEEK_BASE_URL"));
if let Ok(env_url) = env_url {
    let trimmed = env_url.trim();
    if !trimmed.is_empty() {
        config.base_url = Some(trimmed.to_string());
    }
}
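One way to make the primary/fallback lookup testable is to pass the variable lookup in as a closure. The sketch below is an illustration under assumptions, not the PR's code: `resolve_base_url` and its `lookup` parameter are hypothetical names. It also differs from the suggestion above in one respect: a primary variable that is set but blank falls through to the legacy one, whereas `or_else` only falls back when the primary is unset.

```rust
/// Resolve the base URL from the first variable holding a non-empty value.
/// `lookup` stands in for `|name| std::env::var(name).ok()` so the logic can
/// be exercised without touching process-global state. Hypothetical helper.
fn resolve_base_url<F>(lookup: F) -> Option<String>
where
    F: Fn(&str) -> Option<String>,
{
    ["CODEWHALE_BASE_URL", "DEEPSEEK_BASE_URL"]
        .iter()
        .copied()
        // Skip variables that are not set at all.
        .filter_map(|name| lookup(name))
        // Trim surrounding whitespace before deciding whether a value is usable.
        .map(|value| value.trim().to_string())
        // The first non-empty value wins; blank values fall through.
        .find(|value| !value.is_empty())
}
```

In production code the call would look like `resolve_base_url(|name| std::env::var(name).ok())`, with the result assigned to `config.base_url` only when it is `Some`.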

…acode 5-agent workflow examined kimi-code's swarm, the broader swarm pattern, and CodeWhale's WhaleFlow today (code + docs). Verdict: vision honors both targets (more ambitious than kimi — heterogeneous-model workers vs a single trained orchestrator) and the Train 3->4 plan is the right sequence, but implementation is largely foundation-only dead_code / prompt-only. 9 recommendations, each mapped to a v0.8.61 issue/train; the net-new gap is a swarm coordination substrate that un-orphans the crates/whaleflow IR via the Fleet ledger. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Resolve sub-agent assignment routes through WorkerRuntimeProfile::model, preserving explicit model overrides while giving scout/tool roles a provider-aware cheap lane and no-thinking request tuning. Synthesis roles continue to inherit the session route, and providers without a cheap tier stay on the parent model. Refs: #2027 Refs: #1768
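The precedence described in this commit message can be sketched as a small pure function. Everything named below (`WorkerRole`, `Route`, `resolve_route`, and the `thinking` flag) is a hypothetical stand-in; the source only confirms that routing goes through `WorkerRuntimeProfile::model`. Treating an explicit override as keeping thinking enabled is an assumption as well.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum WorkerRole {
    Scout,
    Tool,
    Synthesis,
}

#[derive(Debug, PartialEq)]
struct Route {
    model: String,
    thinking: bool,
}

/// Hypothetical sketch of the routing order described in the commit message.
fn resolve_route(
    role: WorkerRole,
    explicit_model: Option<&str>,
    parent_model: &str,
    cheap_model: Option<&str>,
) -> Route {
    // 1. An explicit model override always wins (thinking left on: assumption).
    if let Some(model) = explicit_model {
        return Route { model: model.to_string(), thinking: true };
    }
    match (role, cheap_model) {
        // 2. Scout/tool roles take the provider's cheap lane, without thinking.
        (WorkerRole::Scout | WorkerRole::Tool, Some(cheap)) => {
            Route { model: cheap.to_string(), thinking: false }
        }
        // 3. Synthesis roles, and providers with no cheap tier, stay on the
        //    parent (session) model.
        _ => Route { model: parent_model.to_string(), thinking: true },
    }
}
```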

Six lints (collapsible-if x2 via let-chains, redundant-closure, derivable Default, manual is_multiple_of, needless deref) + dead-code allow on the legacy with_agent_tools wrapper (prod path uses with_agent_tools_policy). Fixed by dogfooding `codewhale exec --auto` on the freshly-built 0.8.61 binary; clippy --workspace --all-features -D warnings is clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

- state: parity test expects schema v4 (Train 4 added thread_goals.continuation_count migration)
- engine: Plan mode = ShellPolicy::None (no shell), consistent with the runtime prompt's shell_access=none. This reverts an incidental Train-2 Plan->ReadOnly mapping that leaked shell tools into read-only planning; sandbox stays ReadOnly
- shell: update exec_shell schema/move-to-background assertions to Train 2's reworded guidance (intent preserved: >5s -> background, references exec_shell_wait)
- subagent: unify the no-record status-projection path onto worker_status_from_subagent_result so interrupted+continuable projects waiting_for_user (needs-parent-action). The no-record and worker-record paths previously disagreed (the real cross-wiring)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Bump all crates 0.8.60 -> 0.8.61, npm package, and web facts; fill the 0.8.61 changelog with the integrated runtime-control-plane work (TUI freeze fix, provider/model route isolation, fleet-worker convergence, durable goal mode, distribution hygiene) plus community contributions. Not tagged or published — release artifacts await maintainer approval. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Replace the verbose constitution head (preamble + 7 articles + {model_id} ceremony) with the maintainer's v4: a preamble + six articles (Ground Truth, Verification, Momentum, Legacy, Help, Priority), model-agnostic. constitution.md head is byte-identical to v4; the operational STATUTES/REGULATIONS/EVIDENCE tiers below are preserved (runtime-required). yaml + render_constitution.py + ~15 prompt tests updated to match. Full bin suite green (4872). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Richer README front door (open-models-first framing, '## The Idea' constitution essay, accurate 24-provider list) + website improvements (SEO opengraph/robots/sitemap routes, nav/footer, facts-driven pages, coupled locale-layout), 3-way reconciled to 0.8.61: provider count corrected 21->24 (verified vs crates/config), version/palette/dead-link fixes. Conservatively deferred 3 heavily-conflicting web pages (page/faq/install) to preserve main's richer versions. Verified: tsc --noEmit exit 0, eslint clean, cargo build clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

greptile-apps · 2026-06-14T21:28:48Z

Too many files changed for review. (164 files found, 100 file limit)

Hmbown and others added 28 commits June 14, 2026 08:12

fix: revive cost tracking for non-DeepSeek models with an expanded pricing table

1faf782

Fixes #3066 Harvested-from: PR #3201 by @mvanhorn Closes: #3066

fix(telegram): keep polling while turns stream

95499dd

style(telegram): align turn lifecycle status reply

796211c

Harvested-from: PR #3195 by @cyq1017 Refs: #2966

Limit mobile event history to prevent freezes

1d1467e

Harvested-from: PR #3220 by @RobertEmprechtinger Refs: #3216

Rename DeepSeek blue consumers to whale accent

ecf7006

Harvested-from: PR #3197 by @nightt5879 Closes: #3069

greptile-apps Bot reviewed Jun 14, 2026

gemini-code-assist Bot reviewed Jun 14, 2026

Hmbown and others added 28 commits June 14, 2026 12:09

fix(mcp): bound prompt-blocked stdio reads

7f98880

Apply the effective MCP read timeout inside the JSON-RPC receive loop and disconnect streams that emit non-JSON prompt text, so YOLO/non-interactive runs fail clearly instead of hanging on MCP startup prompts. Refs: #2475
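As a rough illustration of the idea in this commit (bound every read, and treat non-JSON output as a hard error rather than waiting on it), here is a minimal sketch using only the standard library. It assumes a reader thread forwards stdio lines over an `mpsc` channel, and it uses a crude brace check in place of a real JSON parser; `ReadError` and `recv_json_line` are hypothetical names, not CodeWhale's API.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

#[derive(Debug, PartialEq)]
enum ReadError {
    /// No line arrived within the read timeout.
    Timeout,
    /// The server printed something that is not a JSON-RPC frame,
    /// e.g. an interactive startup prompt.
    NonJson(String),
    /// The stream ended (the sending side was dropped).
    Closed,
}

/// Wait at most `timeout` for the next line and reject non-JSON text.
/// Hypothetical sketch: a real client would parse the line as JSON.
fn recv_json_line(lines: &Receiver<String>, timeout: Duration) -> Result<String, ReadError> {
    match lines.recv_timeout(timeout) {
        Ok(line) => {
            let trimmed = line.trim();
            // Crude stand-in for JSON parsing: a frame must look like an object.
            if trimmed.starts_with('{') && trimmed.ends_with('}') {
                Ok(trimmed.to_string())
            } else {
                Err(ReadError::NonJson(trimmed.to_string()))
            }
        }
        Err(RecvTimeoutError::Timeout) => Err(ReadError::Timeout),
        Err(RecvTimeoutError::Disconnected) => Err(ReadError::Closed),
    }
}
```

A caller running non-interactively would disconnect the server on `NonJson` or `Timeout` and report the error instead of retrying the read.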

feat(tui): hydrate model facts from offline catalog

9f1094d

Refs: #3072

fix(tui): derive model picker hints from registry

975496e

Refs: #3073

feat(tui): surface output token throughput

1be50de

Refs: #3190 Refs: #2666 Refs: #3194

docs: record train 5 results

4a9546b

Refs: #3203 Refs: #3224 Refs: #2054 Refs: #2982 Refs: #963 Refs: #3028 Refs: #3078 Refs: #3190 Refs: #2666 Refs: #3194

feat(subagents): wire runtime worker profiles

43cfb6b

Refs: #3217

test(config): isolate api key save env overrides

5d58c09

Use scoped env guards for secret-backend overrides and pin the save-key tests to a per-test config path so parallel route-filtered runs cannot redirect writes through process-global env state. Refs: #3211
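A scoped env guard is typically an RAII value that records the previous value of a variable and restores it on drop. The sketch below shows that pattern under assumptions: `EnvGuard` is a hypothetical name, not the type used in the PR. The guard alone does not make parallel tests safe, because the environment is still process-global; the commit pairs it with a per-test config path for that reason.

```rust
use std::env;
use std::ffi::OsString;

/// Sets an environment variable for the guard's lifetime and restores the
/// previous value (or removes the variable) when dropped. Hypothetical sketch.
struct EnvGuard {
    key: String,
    previous: Option<OsString>,
}

impl EnvGuard {
    #[allow(unused_unsafe)] // `set_var` is only `unsafe` from the 2024 edition on.
    fn set(key: &str, value: &str) -> Self {
        let previous = env::var_os(key);
        // SAFETY: assumes no other thread touches the environment concurrently.
        unsafe { env::set_var(key, value) };
        EnvGuard { key: key.to_string(), previous }
    }
}

impl Drop for EnvGuard {
    #[allow(unused_unsafe)]
    fn drop(&mut self) {
        // SAFETY: same single-threaded-access assumption as in `set`.
        unsafe {
            match &self.previous {
                Some(value) => env::set_var(&self.key, value),
                None => env::remove_var(&self.key),
            }
        }
    }
}
```

Guards nest naturally: an inner guard restores the outer guard's value, and the outer guard restores whatever was there before the test began.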

test(tui): cover six-worker progress storm responsiveness

76af2a0

Refs: #3216

docs(train): record train 2 result

e12c1ae

Summarize Train 2 implementation, tests, residual risks, and final verification for the isolated v0.8.61 worktree. Refs: #3211

docs(whaleflow): link filed issues #3229 #3230

5951a78

fix(tui): expose cross-provider catalog rows in model picker

65f60d2

Refs: #3075

fix(tui): expire terminal sub-agent cards

4d70d7c

Refs: #3025

docs(train): record train 3 result

f0780f4

Refs: #3216

docs: record train 1 route isolation result

179384f

Refs: #3205 Refs: #3204 Refs: #3213 Refs: #3072 Refs: #3073 Refs: #3075 Refs: #3025 Refs: #2027 Refs: #1768

merge(train-3): worker/fleet convergence + freeze fix (#3216 #3096 #3226 #3217)

9431b65

merge(train-1): v0.8.61 runtime work

bacc0db

# Conflicts: # crates/tui/src/tools/subagent/mod.rs

merge(train-2): v0.8.61 runtime work

e3e0bae

# Conflicts: # crates/tui/src/core/engine.rs # crates/tui/src/prompts.rs

merge(train-5): v0.8.61 runtime work

0092cc0

# Conflicts: # crates/tui/src/tui/app.rs # crates/tui/src/tui/subagent_routing.rs # crates/tui/src/tui/ui/tests.rs

merge(train-4): goal mode (#3215 #891 #1976 #2058 #3218 #2029)

884681d

# Conflicts: # crates/tui/src/tools/subagent/mod.rs

chore: drop worker train result-docs (integration scratch artifacts)

bcdc44c

docs(release): v0.8.61 issue-closure manifest (31 ready-to-close, 33 keep-open)

e4b79a9

Hmbown merged commit a70d5c9 into main Jun 14, 2026

Hmbown merged 83 commits into main from codex/v0.8.61

7 participants

		release_branch="$(git for-each-ref --format='%(refname:short)' 'refs/heads/codex/v*' \
		| sort -V | tail -n1 || true)"
