fix: prevent llm preflight failures for reasoning models #356

Closed
l50 wants to merge 31 commits into dreadnode:main from l50:fix/llm-preflight-budget-for-reasoning-models

Contributor

@l50 l50 commented May 27, 2026

Key Changes:

  • Increased the preflight LLM response budget to accommodate internal reasoning tokens
  • Prevented valid reasoning models from failing startup with max token limit errors
  • Updated preflight documentation to reflect that the request is no longer a 1-token ping

Changed:

  • LLM provider preflight behavior - Raised req.max_tokens from 1 to 64 in preflight_llm_provider so reasoning models have enough completion budget while keeping the validation call lightweight
  • Preflight comments - Clarified that response content is discarded without describing the request as a 1-token call, and documented why the larger token budget is needed for reasoning models
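
A minimal sketch of the budget change, not the actual `preflight_llm_provider` code. `LlmRequest`, `build_preflight_request`, and `fits_budget` are hypothetical names used only for illustration; the only values taken from this PR are the `max_tokens` field and the 1 to 64 change. It assumes, as is the case for reasoning models that bill hidden reasoning against the completion budget, that reasoning tokens and visible tokens share the same limit.

```rust
/// Hypothetical stand-in for the provider request type.
/// Only the `max_tokens` field is taken from the PR description.
#[derive(Debug, Clone, PartialEq)]
pub struct LlmRequest {
    pub model: String,
    pub prompt: String,
    pub max_tokens: u32,
}

/// Completion budget for the preflight call (raised from 1 to 64 in this PR).
pub const PREFLIGHT_MAX_TOKENS: u32 = 64;

/// Builds the lightweight validation request; the response content is discarded.
pub fn build_preflight_request(model: &str) -> LlmRequest {
    LlmRequest {
        model: model.to_string(),
        prompt: "ping".to_string(), // placeholder prompt, an assumption
        max_tokens: PREFLIGHT_MAX_TOKENS,
    }
}

/// Toy model of the failure: hidden reasoning tokens and visible output
/// tokens are assumed to draw from the same completion budget.
pub fn fits_budget(reasoning_tokens: u32, visible_tokens: u32, max_tokens: u32) -> bool {
    reasoning_tokens.saturating_add(visible_tokens) <= max_tokens
}
```

With a budget of 1, any model that spends tokens on reasoning before answering exceeds the limit, while 64 leaves headroom and keeps the call cheap.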

l50 and others added 30 commits May 27, 2026 11:54
Renovate skips forks by default. l50/ares is the production target for
this workflow run, so opt in via RENOVATE_FORK_PROCESSING=enabled.
**Key Changes:**

- Added optional remote cracking mode that delegates hashcat jobs to an HTTP service when configured
- Implemented authenticated job submission, polling, timeout handling, and potfile retrieval for remote jobs
- Preserved local hashcat execution as the default path when remote service configuration is absent
- Scoped remote execution to simple wordlist attacks so service-owned GPU and wordlist resources remain isolated

**Added:**

- Remote hashcat client module - Adds HTTP integration for submitting jobs, polling job status, retrieving cracked results, handling bearer authentication, and normalizing local wordlist paths to remote-safe basenames
- Remote service configuration support - Enables remote mode through HASHCAT_SERVICE_URL and requires HASHCAT_TOKEN for authenticated requests
- Remote result handling - Returns crackd logs, potfile contents, remote errors, exit codes, and timeout failures through the existing ToolOutput structure

**Changed:**

- Hashcat cracking flow - Updates crack_with_hashcat to check for remote service configuration first and delegate to the remote backend when available, while keeping the existing local hashcat behavior unchanged otherwise
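
The backend selection described above could look roughly like the sketch below. This is an illustration under assumptions, not the code in `ares-tools/src/cracker/remote.rs`: `CrackBackend`, `select_backend`, and `remote_wordlist_name` are hypothetical names, and treating a blank `HASHCAT_SERVICE_URL` as "not configured" is an assumption. Only the two environment variable names and the basename normalization come from the commit message.

```rust
use std::path::Path;

/// Hypothetical enum describing where a hashcat job runs.
#[derive(Debug, PartialEq)]
pub enum CrackBackend {
    Local,
    Remote { url: String, token: String },
}

/// Picks the backend from the values of HASHCAT_SERVICE_URL and HASHCAT_TOKEN.
/// Local execution stays the default when no service URL is configured.
pub fn select_backend(url: Option<&str>, token: Option<&str>) -> Result<CrackBackend, String> {
    // Assumption: a blank URL is treated the same as an unset one.
    let url = url.map(str::trim).filter(|u| !u.is_empty());
    match url {
        None => Ok(CrackBackend::Local),
        Some(url) => {
            let token = token
                .map(str::trim)
                .filter(|t| !t.is_empty())
                .ok_or_else(|| {
                    "HASHCAT_TOKEN is required when HASHCAT_SERVICE_URL is set".to_string()
                })?;
            Ok(CrackBackend::Remote {
                url: url.to_string(),
                token: token.to_string(),
            })
        }
    }
}

/// Reduces a local wordlist path to its basename so the remote service
/// resolves it against its own wordlist directory.
pub fn remote_wordlist_name(local_path: &str) -> Option<String> {
    Path::new(local_path)
        .file_name()
        .map(|name| name.to_string_lossy().into_owned())
}
```

A caller would pass `std::env::var("HASHCAT_SERVICE_URL").ok().as_deref()` and the matching token lookup, then branch on the returned variant.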
**Added:**

- Renovate package rule to automerge patch and minor Cargo, Ansible Galaxy, Galaxy collection, and pre-commit updates via PR - .github/renovate.json5
| datasource | package          | from   | to     |
| ---------- | ---------------- | ------ | ------ |
| crate      | local-ip-address | 0.6.12 | 0.6.13 |

Co-authored-by: ares-renovate[bot] <286782180+ares-renovate[bot]@users.noreply.github.com>
| datasource | package    | from    | to      |
| ---------- | ---------- | ------- | ------- |
| crate      | serde_json | 1.0.149 | 1.0.150 |

Co-authored-by: ares-renovate[bot] <286782180+ares-renovate[bot]@users.noreply.github.com>
| datasource | package      | from   | to     |
| ---------- | ------------ | ------ | ------ |
| pypi       | ansible-core | 2.20.5 | 2.21.0 |

Co-authored-by: ares-renovate[bot] <286782180+ares-renovate[bot]@users.noreply.github.com>
| datasource        | package       | from  | to    |
| ----------------- | ------------- | ----- | ----- |
| galaxy-collection | ansible.posix | 2.1.0 | 2.2.0 |

Co-authored-by: ares-renovate[bot] <286782180+ares-renovate[bot]@users.noreply.github.com>
| datasource | package | from  | to    |
| ---------- | ------- | ----- | ----- |
| crate      | sqlx    | 0.8.6 | 0.9.0 |

Co-authored-by: ares-renovate[bot] <286782180+ares-renovate[bot]@users.noreply.github.com>
| datasource        | package           | from   | to     |
| ----------------- | ----------------- | ------ | ------ |
| galaxy-collection | community.general | 12.6.1 | 13.0.0 |

Co-authored-by: ares-renovate[bot] <286782180+ares-renovate[bot]@users.noreply.github.com>
**Removed:**

- Removed the temporary Renovate allowedVersions cap that blocked opentelemetry Rust crates from updating to 0.32 and later versions
* fix: assert safety for dynamic sqlx history queries

**Changed:**

- Wrapped dynamically assembled history queries with `AssertSqlSafe` so sqlx accepts SQL built from static fragments with bound user values - `ares-cli/src/history`
- Documented and applied the same safety assertion to credential hash search queries that construct placeholder lists dynamically - `ares-core/src/persistent_store/queries/credentials.rs`
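
To illustrate why asserting safety is reasonable here, the sketch below builds a placeholder list from static fragments only; user values never enter the SQL text and would be bound separately. `hash_search_sql`, the table and column names, and the Postgres-style `$n` placeholders are assumptions for illustration. In the real code the resulting string is what gets wrapped in `AssertSqlSafe` before being handed to sqlx.

```rust
/// Builds a query whose only dynamic part is the number of placeholders.
/// Returns None for an empty input so callers never emit `IN ()`.
/// Table and column names are hypothetical.
pub fn hash_search_sql(hash_count: usize) -> Option<String> {
    if hash_count == 0 {
        return None;
    }
    // "$1, $2, ..., $n": generated from indices, never from user input.
    let placeholders: Vec<String> = (1..=hash_count).map(|i| format!("${i}")).collect();
    Some(format!(
        "SELECT id, hash FROM credentials WHERE hash IN ({})",
        placeholders.join(", ")
    ))
}
```

Because the string depends only on a count, the assertion documents an invariant rather than bypassing a check.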

* build: update windows-sys lockfile dependency
| datasource | package                            | from   | to     |
| ---------- | ---------------------------------- | ------ | ------ |
| crate      | opentelemetry                      | 0.31.0 | 0.32.0 |
| crate      | opentelemetry-otlp                 | 0.31.1 | 0.32.0 |
| crate      | opentelemetry-semantic-conventions | 0.31.0 | 0.32.0 |
| crate      | opentelemetry_sdk                  | 0.31.0 | 0.32.0 |
| crate      | tracing-opentelemetry              | 0.32.1 | 0.33.0 |

Co-authored-by: ares-renovate[bot] <286782180+ares-renovate[bot]@users.noreply.github.com>
| datasource  | package                 | from   | to     |
| ----------- | ----------------------- | ------ | ------ |
| github-tags | actions/upload-artifact | v7.0.0 | v7.0.1 |
…latformAutomerge enables GH auto-merge on PR creation
Co-authored-by: ares-renovate[bot] <286782180+ares-renovate[bot]@users.noreply.github.com>
Co-authored-by: ares-renovate[bot] <286782180+ares-renovate[bot]@users.noreply.github.com>
Co-authored-by: ares-renovate[bot] <286782180+ares-renovate[bot]@users.noreply.github.com>
…-26) (#27)

**Key Changes:**

- Added fail-fast LLM validation so orchestrator startup aborts on auth, org, or restricted-model configuration errors before tasks are queued
- Hardened tool dispatch timeouts by raising the NATS client request deadline and applying per-tool timeout floors for slow recon and AD operations
- Made telemetry initialization idempotent to prevent double-init crashes and preserve correct service names for long-running subcommands
- Made the result demux JetStream consumer restart-safe by using a deterministic durable consumer and cleaning up stale instances

**Added:**

- LLM provider preflight ping - Verifies the selected model and credentials with a minimal request, supports ARES_LLM_PREFLIGHT_SKIP for offline or fixture-based runs, and treats retryable upstream errors as warnings
- OpenAI org-restriction detection - Classifies common 403 restricted-model responses as auth errors and appends actionable hints for OPENAI_ORG_ID and ARES_LLM_MODEL
- Per-tool timeout floors - Adds timeout minimums for slow tools such as nmap_scan, smb_sweep, smb_signing_check, enumerate_shares, domain_admin_checker, password_spray, and username_as_password
- Regression coverage - Adds tests for telemetry double initialization, OpenAI org-restricted message handling, auth hint augmentation, and per-tool timeout behavior
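
A rough sketch of how a per-tool timeout floor can be applied. The tool names come from the list above; the floor durations, `timeout_floor`, and `effective_timeout` are placeholders invented for illustration and do not reflect the values used in the dispatcher.

```rust
use std::time::Duration;

/// Returns the minimum timeout for known slow tools.
/// The durations are illustrative placeholders, not the real configuration.
pub fn timeout_floor(tool: &str) -> Option<Duration> {
    match tool {
        "nmap_scan" | "smb_sweep" => Some(Duration::from_secs(900)),
        "smb_signing_check" | "enumerate_shares" | "domain_admin_checker" => {
            Some(Duration::from_secs(600))
        }
        "password_spray" | "username_as_password" => Some(Duration::from_secs(1200)),
        _ => None,
    }
}

/// Raises the requested timeout to the tool's floor; never lowers it.
pub fn effective_timeout(tool: &str, requested: Duration) -> Duration {
    match timeout_floor(tool) {
        Some(floor) => requested.max(floor),
        None => requested,
    }
}
```

A floor only lengthens short deadlines, so callers that already request a generous timeout are unaffected.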

**Changed:**

- Tool dispatch waiting behavior - Redis-backed dispatch now uses the computed per-tool timeout instead of applying one shared timeout to every request
- NATS request handling - Increases the async-nats client request_timeout to 30 minutes so the broker client does not fail before dispatcher-level tool deadlines expire
- Result demux consumer lifecycle - Uses a fixed durable consumer name, deletes stale prior consumers on startup, and sets an inactive threshold to reduce manual recovery after crashes or pod evictions
- CLI telemetry routing - Detects orchestrator and worker subcommands anywhere in argv so global flags before the subcommand no longer cause telemetry to initialize with the wrong service name
- Telemetry initialization - Replaces panicking subscriber initialization with try_init, returning a no-op guard when telemetry has already been installed while still shutting down redundant OTLP providers safely
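
The CLI telemetry routing change can be pictured with the sketch below, which scans all arguments instead of only the first one after the binary name. `service_name_from_args` and the returned service names are hypothetical; only the `orchestrator` and `worker` subcommand names come from the description above.

```rust
/// Picks a telemetry service name by scanning argv for a known subcommand,
/// so global flags placed before the subcommand do not change the result.
/// The returned names are illustrative assumptions.
pub fn service_name_from_args<I, S>(args: I) -> &'static str
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    // Skip argv[0], which is the binary name.
    for arg in args.into_iter().skip(1) {
        match arg.as_ref() {
            "orchestrator" => return "ares-orchestrator",
            "worker" => return "ares-worker",
            _ => {}
        }
    }
    "ares-cli"
}
```

One limitation of this simple scan: a flag value that happens to equal `worker` or `orchestrator` would also match, so a real implementation may want to lean on the argument parser instead.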
Co-authored-by: ares-renovate[bot] <286782180+ares-renovate[bot]@users.noreply.github.com>
| datasource        | package          | from  | to    |
| ----------------- | ---------------- | ----- | ----- |
| galaxy-collection | community.docker | 5.2.0 | 5.2.1 |

Co-authored-by: ares-renovate[bot] <286782180+ares-renovate[bot]@users.noreply.github.com>
| datasource        | package           | from   | to     |
| ----------------- | ----------------- | ------ | ------ |
| galaxy-collection | community.general | 13.0.0 | 13.0.1 |

Co-authored-by: ares-renovate[bot] <286782180+ares-renovate[bot]@users.noreply.github.com>
| datasource | package | from   | to     |
| ---------- | ------- | ------ | ------ |
| crate      | reqwest | 0.13.3 | 0.13.4 |

Co-authored-by: ares-renovate[bot] <286782180+ares-renovate[bot]@users.noreply.github.com>
| datasource        | package         | from  | to    |
| ----------------- | --------------- | ----- | ----- |
| galaxy-collection | ansible.windows | 3.5.0 | 3.6.0 |

Co-authored-by: ares-renovate[bot] <286782180+ares-renovate[bot]@users.noreply.github.com>
| datasource        | package           | from  | to    |
| ----------------- | ----------------- | ----- | ----- |
| galaxy-collection | community.windows | 3.1.0 | 3.2.0 |

Co-authored-by: ares-renovate[bot] <286782180+ares-renovate[bot]@users.noreply.github.com>
| datasource | package    | from   | to     |
| ---------- | ---------- | ------ | ------ |
| crate      | async-nats | 0.48.0 | 0.49.0 |

Co-authored-by: ares-renovate[bot] <286782180+ares-renovate[bot]@users.noreply.github.com>
This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[codecov/codecov-action](https://redirect.github.com/codecov/codecov-action)
([changelog](https://redirect.github.com/codecov/codecov-action/compare/57e3a136b779b570ffcdbf80b3bdc90e7fab3de2..e79a6962e0d4c0c17b229090214935d2e33f8354))
| action | digest | `57e3a13` → `e79a696` |

---

### Configuration

📅 **Schedule**: (UTC)

- Branch creation
  - At any time (no schedule defined)
- Automerge
  - At any time (no schedule defined)

🚦 **Automerge**: Enabled.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://redirect.github.com/renovatebot/renovate).

Co-authored-by: dreadnode-renovate-bot[bot] <184170622+dreadnode-renovate-bot[bot]@users.noreply.github.com>

* chore(deps): update github/codeql-action action to v4.36.0 (dreadnode#334)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github/codeql-action](https://redirect.github.com/github/codeql-action)
| action | minor | `v4.35.4` → `v4.36.0` |

---

### Release Notes

<details>
<summary>github/codeql-action (github/codeql-action)</summary>

###
[`v4.36.0`](https://redirect.github.com/github/codeql-action/releases/tag/v4.36.0)

[Compare
Source](https://redirect.github.com/github/codeql-action/compare/v4.35.5...v4.36.0)

- *Breaking change*: Bump the minimum required CodeQL bundle version to
2.19.4.
[#3894](https://redirect.github.com/github/codeql-action/pull/3894)
- Add support for SHA-256 Git object IDs.
[#3893](https://redirect.github.com/github/codeql-action/pull/3893)
- Update default CodeQL bundle version to
[2.25.5](https://redirect.github.com/github/codeql-action/releases/tag/codeql-bundle-v2.25.5).
[#3926](https://redirect.github.com/github/codeql-action/pull/3926)

###
[`v4.35.5`](https://redirect.github.com/github/codeql-action/releases/tag/v4.35.5)

[Compare
Source](https://redirect.github.com/github/codeql-action/compare/v4.35.4...v4.35.5)

- We have improved how the JavaScript bundles for the CodeQL Action are
generated to avoid duplication across bundles and reduce the size of the
repository by around 70%. This should have no effect on the runtime
behaviour of the CodeQL Action.
[#3899](https://redirect.github.com/github/codeql-action/pull/3899)
- For performance and accuracy reasons, [improved incremental
analysis](https://redirect.github.com/github/roadmap/issues/1158) will
now only be enabled on a pull request when diff-informed analysis is
also enabled for that run. If diff-informed analysis is unavailable (for
example, because the PR diff ranges could not be computed), the action
will fall back to a full analysis.
[#3791](https://redirect.github.com/github/codeql-action/pull/3791)
- If multiple inputs are provided for the GitHub-internal
`analysis-kinds` input, only `code-scanning` will be enabled. The
`analysis-kinds` input is experimental, for GitHub-internal use only,
and may change without notice at any time.
[#3892](https://redirect.github.com/github/codeql-action/pull/3892)
- Added an experimental change which, when running a Code Scanning
analysis for a PR with [improved incremental
analysis](https://redirect.github.com/github/roadmap/issues/1158)
enabled, prefers CodeQL CLI versions that have a cached overlay-base
database for the configured languages. This speeds up analysis for a
repository when there is not yet a cached overlay-base database for the
latest CLI version. We expect to roll this change out to everyone in
May.
[#3880](https://redirect.github.com/github/codeql-action/pull/3880)

</details>

---

### Configuration

📅 **Schedule**: (UTC)

- Branch creation
  - At any time (no schedule defined)
- Automerge
  - At any time (no schedule defined)

🚦 **Automerge**: Enabled.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://redirect.github.com/renovatebot/renovate).

Co-authored-by: dreadnode-renovate-bot[bot] <184170622+dreadnode-renovate-bot[bot]@users.noreply.github.com>

* fix: honor blue-specific llm model configuration

**Changed:**

- Blue worker model selection now prefers `ARES_BLUE_LLM_MODEL`, falls back to `ARES_LLM_MODEL`, ignores empty values, and errors clearly when no LLM model is configured instead of using a hardcoded default.
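
The fallback order described above can be sketched as follows. `select_blue_model` is a hypothetical helper, and trimming whitespace before the emptiness check is an assumption; the variable names and the precedence come from the commit message.

```rust
/// Resolves the blue worker model: ARES_BLUE_LLM_MODEL first, then
/// ARES_LLM_MODEL, skipping empty values, with no hardcoded default.
pub fn select_blue_model(blue: Option<&str>, general: Option<&str>) -> Result<String, String> {
    blue.into_iter()
        .chain(general)
        // Assumption: whitespace-only values count as empty.
        .map(str::trim)
        .find(|value| !value.is_empty())
        .map(str::to_string)
        .ok_or_else(|| {
            "no LLM model configured: set ARES_BLUE_LLM_MODEL or ARES_LLM_MODEL".to_string()
        })
}
```

A caller would pass `std::env::var("ARES_BLUE_LLM_MODEL").ok().as_deref()` and the equivalent lookup for `ARES_LLM_MODEL`.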

---------

Co-authored-by: dreadnode-renovate-bot[bot] <184170622+dreadnode-renovate-bot[bot]@users.noreply.github.com>
**Key Changes:**

- Added the `procps` package to ARES blue agent image provisioning so process utilities are available at runtime
- Updated lateral analyst, threat hunter, and triage agent builds for both amd64 and arm64 variants
- Aligned the base ARES blue agent dependency set with the specialized agent templates

**Added:**

- Process inspection utilities - Installed `procps` across ARES blue agent templates to provide standard commands such as `ps` for tooling and runtime diagnostics

**Changed:**

- Provisioning dependency lists - Updated apt install commands in the ARES blue agent, lateral analyst, threat hunter, and triage Warpgate templates to include `procps` during image build setup
**Added:**

- Apt cache refresh before installing pipx on Debian-based systems
- Rescue fallback that retries apt update and installs pipx with `--fix-missing`
  to handle stale Kali rolling mirror package indexes

**Changed:**

- Documented the pipx install task as a block with separate cache refresh and
  install steps in the base role README
**Changed:**

- Updated Warpgate template source repositories from `github.com/dreadnode/ares`
  to `github.com/l50/ares` so builds clone the relocated repository
- Updated GHCR image references, build commands, badges, and usage examples from
  `ghcr.io/dreadnode` to `ghcr.io/l50`
- Switched the GPU cracker template base image to the `l50` namespace with its
  updated pinned digest
- Updated the dependent-template build workflow comment to match the active
  registry namespace
**Changed:**

- Increase the LLM preflight completion budget from 1 to 64 tokens so reasoning models have enough headroom to complete validation without failing on output limits while keeping the ping request inexpensive
dreadnode-renovate-bot Bot added the area/pre-commit, area/templates, and area/github labels May 27, 2026

codecov Bot commented May 27, 2026

Codecov Report

❌ Patch coverage is 39.13043% with 140 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.77%. Comparing base (f774ba1) to head (553b723).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| ------------------------ | ------- | ----- |
| ares-tools/src/cracker/remote.rs | 4.90% | 97 Missing ⚠️ |
| ares-core/src/telemetry/init.rs | 62.79% | 16 Missing ⚠️ |
| ares-cli/src/orchestrator/task_queue.rs | 0.00% | 15 Missing ⚠️ |
| ...s-core/src/persistent_store/queries/credentials.rs | 0.00% | 3 Missing ⚠️ |
| ares-cli/src/history/search.rs | 0.00% | 2 Missing ⚠️ |
| ares-cli/src/main.rs | 0.00% | 2 Missing ⚠️ |
| ares-cli/src/history/cost.rs | 0.00% | 1 Missing ⚠️ |
| ares-cli/src/history/list.rs | 0.00% | 1 Missing ⚠️ |
| ...c/orchestrator/tool_dispatcher/redis_dispatcher.rs | 0.00% | 1 Missing ⚠️ |
| ares-core/src/nats.rs | 75.00% | 1 Missing ⚠️ |

... and 1 more
Additional details and impacted files

@@            Coverage Diff             @@
##             main     #356      +/-   ##
==========================================
- Coverage   80.03%   78.77%   -1.26%     
==========================================
  Files         433      418      -15     
  Lines      125577   118527    -7050     
==========================================
- Hits       100500    93375    -7125     
- Misses      25077    25152      +75     

| Files with missing lines | Coverage Δ |
| ------------------------ | ---------- |
| ares-cli/src/orchestrator/tool_dispatcher/mod.rs | 96.80% <100.00%> (+0.43%) ⬆️ |
| ares-llm/src/provider/openai.rs | 68.18% <100.00%> (+6.64%) ⬆️ |
| ares-cli/src/history/cost.rs | 0.00% <0.00%> (ø) |
| ares-cli/src/history/list.rs | 0.00% <0.00%> (ø) |
| ...c/orchestrator/tool_dispatcher/redis_dispatcher.rs | 40.17% <0.00%> (ø) |
| ares-core/src/nats.rs | 76.20% <75.00%> (+0.12%) ⬆️ |
| ares-tools/src/cracker.rs | 88.28% <66.66%> (-0.20%) ⬇️ |
| ares-cli/src/history/search.rs | 0.00% <0.00%> (ø) |
| ares-cli/src/main.rs | 0.00% <0.00%> (ø) |
| ...s-core/src/persistent_store/queries/credentials.rs | 0.00% <0.00%> (ø) |

... and 3 more

... and 17 files with indirect coverage changes


Contributor Author

l50 commented May 27, 2026

Closing — will land via direct push to fork main (l50/ares) since that's where the cluster's CI image builds are sourced from. Will batch back to dreadnode upstream later.

@l50 l50 closed this May 27, 2026
@l50 l50 deleted the fix/llm-preflight-budget-for-reasoning-models branch May 27, 2026 20:24

Labels

area/github: Changes made to GitHub Actions workflows
area/pre-commit: Changes made to pre-commit hooks
area/templates: Changes made to warpgate template configurations


1 participant