← Back to root README · Claude deep reference
Multi-agent configuration for OpenAI Codex CLI (Rust implementation). This file covers agent spawn rules, model strategy, runtime profiles, execution architecture, mirrored skill usage, and Claude integration internals.
Contents
This repo (.codex/) is the source of truth. Home (~/.codex/) is a downstream copy:
```shell
cp -r .codex/ ~/.codex/   # activate globally (config_file paths are relative)
```

Run after editing any agent config, `config.toml`, hooks, or `AGENTS.md`.
Install
```shell
npm install -g @openai/codex   # install Codex CLI
cp -r .codex/ ~/.codex/        # activate globally
```

All agents are standardized on `gpt-5.4`. Differentiation comes from reasoning effort and role instructions.
| Agent | Effort | Purpose |
|---|---|---|
| sw-engineer | high | SOLID implementation, doctest-driven dev, ML pipeline architecture |
| qa-specialist | xhigh | Edge-case matrix, The Borda Standard, adversarial test review |
| squeezer | high | Profile-first optimization, GPU throughput, memory efficiency |
| doc-scribe | medium | 6-point Google/Napoleon docstrings, README stewardship, CHANGELOG |
| security-auditor | xhigh | OWASP Python, ML supply chain, secrets, CI/CD hygiene (read-only) |
| data-steward | high | Split leakage, DataLoader reproducibility, augmentation correctness |
| ci-guardian | medium | GitHub Actions, trusted PyPI publishing, pre-commit, flaky tests |
| linting-expert | medium | ruff, mypy, pre-commit config, rule progression, suppression discipline |
| oss-shepherd | high | Issue triage, PR review, SemVer, pyDeprecate, release checklist |
| solution-architect | high | System design, ADRs, API compatibility, migration planning |
| web-explorer | medium | External docs/release-note extraction and evidence gathering |
| self-mentor | medium | Config quality checks, drift/leak detection, workflow hygiene |
Codex selects agents autonomously based on task type (defined in AGENTS.md). You can also address agents by name in your prompt.
Automatic spawn patterns (from AGENTS.md):
- `sw-engineer` handles core implementation; on completion Codex can fan out to `qa-specialist` + `doc-scribe`
- `security-auditor` is used when tasks touch auth, credentials, external APIs, model weights, or deserialization
- `data-steward` is used when tasks touch data pipelines, splits, augmentation, or DataLoaders
- `squeezer` is used for profiling, throughput, and memory optimization tasks
- `ci-guardian` is used for CI workflow and publishing tasks
When to address by name vs letting Codex decide:
- Use by name when you want a specific perspective that task-type detection might not trigger
- Let Codex decide for broad tasks; orchestration can fan out automatically
Session defaults:
```toml
model = "gpt-5.4"
review_model = "gpt-5.4"
approval_policy = "on-request"
sandbox_mode = "workspace-write"
```
Reasoning allocation:
| Effort | Roles | Why |
|---|---|---|
| xhigh | qa-specialist, security-auditor | Adversarial: exhaustive search for what could go wrong |
| high | sw-engineer, squeezer, data-steward, oss-shepherd, solution-architect | Analytical: depth without unbounded budget |
| medium | doc-scribe, ci-guardian, linting-expert, web-explorer, self-mentor | Writing/config/research balance |
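Since all agents share one model, effort is the per-agent dial. As an illustrative sketch (the path and exact keys below are assumptions, not this repo's actual agent files), an adversarial role's config would pin the shared model and raise only the effort:

```toml
# Hypothetical agent config -- path and keys are illustrative assumptions
# .codex/agents/qa-specialist.toml
model = "gpt-5.4"                  # shared model across all agents
model_reasoning_effort = "xhigh"   # adversarial roles get the largest budget
```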
Four runtime profiles in `config.toml` cover common mode switches. Activate with `--profile <name>`:

```shell
codex --profile deep-review "full security audit of src/api/"
codex --profile fast-edit "fix the typo in the docstring"
```

| Profile | What changes | When to use |
|---|---|---|
| cautious | `approval_policy = "untrusted"` | Unfamiliar codebases, production systems, destructive ops |
| fast-edit | `model = "gpt-5-codex"`, medium reasoning, low verbosity, 2 threads | Narrow mechanical edits where speed > depth |
| fresh-docs | `web_search = "live"`, concise summaries | Questions about volatile docs, library versions, API changes |
| deep-review | `model = "gpt-5.4"`, xhigh reasoning, live web search | Broad/high-risk changes needing maximum review depth |
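Each row maps onto a profile entry in `config.toml`. A minimal sketch of what one such entry can look like, using standard Codex CLI profile syntax (this repo's actual profile contents may differ):

```toml
[profiles.fast-edit]
model = "gpt-5-codex"
model_reasoning_effort = "medium"
model_verbosity = "low"
```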
Codex built-in slash commands (for example /fast) work normally.
Mirrored workflow skills in .codex/skills/* are instruction assets, not custom slash commands. That means:
- `/investigate`, `/resolve`, `/review` are not recognized as Codex slash commands in this setup
- Use prompt-based invocation instead
Interactive prompt usage:
```text
run investigate on this branch and find root cause of failing CI
run resolve on the current working tree and fix high-severity findings
run review, then develop, then audit for issue #42
```
One-shot shell usage:
```shell
codex "run investigate for current failing pytest and write findings artifact"
codex "run resolve on this diff and apply required quality gates"
```

Agent targeting examples:
```text
use the qa-specialist to review tests/ for missing edge cases
use the solution-architect to produce a minimal migration plan for this API change
use the self-mentor to review .codex drift and weak gates
```
Codex hooks are enabled in config.toml:
```toml
[features]
codex_hooks = true
```

Configured hook files:

- `.codex/hooks.json`
- `.codex/hooks/rtk-enforce.js`
Behavior:
- If `rtk` is not installed, the hook is a no-op
- If the command is already `rtk ...`, the hook is a no-op
- For known RTK-eligible prefixes, the hook denies once and instructs a rerun as `rtk <cmd>`
- For excluded risky patterns (for example `git push`, destructive git deletes), it passes through normal approvals unchanged
Note: current Codex PreToolUse parsing does not apply in-place command rewrites via `updatedInput`, so deny-and-rerun is the native fallback.
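The behavior above reduces to a pure decision function. This is a minimal sketch, not the actual `rtk-enforce.js` (the prefix and exclusion lists here are invented placeholders, and the real hook's no-op check for a missing `rtk` binary is omitted):

```javascript
// Illustrative deny-and-rerun decision logic; lists are placeholder assumptions.
const RTK_PREFIXES = ["ls", "grep", "cat"];        // assumed RTK-eligible commands
const EXCLUDED = [/^git push/, /^git branch -D/];  // assumed risky pass-through patterns

function decide(cmd) {
  if (cmd.startsWith("rtk ")) return "allow";               // already wrapped: no-op
  if (EXCLUDED.some((re) => re.test(cmd))) return "allow";  // normal approvals apply
  if (RTK_PREFIXES.some((p) => cmd === p || cmd.startsWith(p + " "))) {
    return `deny: rerun as rtk ${cmd}`;                     // deny once, instruct rerun
  }
  return "allow";
}
```

A denied command is simply re-issued by the agent with the `rtk` prefix, at which point the first branch lets it through.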
Configured in config.toml:
```toml
max_threads = 4
max_depth = 2
job_max_runtime_seconds = 3600
```

How Codex schedules agents:
- The lead agent (or base session) classifies the task and decides which specialists to spawn
- Agents spawn concurrently up to `max_threads`
- Agents at depth 2 cannot spawn further (`max_depth = 2`)
- Jobs exceeding `job_max_runtime_seconds` are stopped and surfaced to the orchestrator
Mirrored workflow backbone:
- Core loop: `review`, `develop`, `resolve`, `audit`
- Extended set: `calibrate`, `release`, `investigate`, `sync`, `manage`, `analyse`, `optimize`, `research`
Shared gate references:
- `.codex/skills/_shared/quality-gates.md`
- `.codex/skills/_shared/run-gates.sh`
- `.codex/skills/_shared/write-result.sh`
- `.codex/skills/_shared/severity-map.md`
Artifact contract:
`.reports/codex/<skill>/<timestamp>/result.json`
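For illustration only, such an artifact might carry fields like the following (every field name and value here is a hypothetical example, not the actual contract):

```json
{
  "skill": "investigate",
  "timestamp": "2025-01-01T00-00-00Z",
  "status": "completed",
  "findings": [
    { "severity": "high", "summary": "example finding text" }
  ]
}
```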
Calibration runner:
`.codex/calibration/run.sh`

Codex loads agent instructions in layers, with more specific layers overriding broader ones:
- Global baseline: `~/.codex/AGENTS.md` or project `.codex/AGENTS.md`
- Project-local override: repo root `AGENTS.md`
Project-local instructions take precedence for overlapping rules.
config.toml configures:
```toml
[mcp_servers.openaiDeveloperDocs]
url = "https://developers.openai.com/mcp"
```

Purpose: live OpenAI/Codex documentation lookups for freshness-critical guidance.
→ Claude-side integration details: .claude/README.md — Integration with Codex · Full architecture: root README
Typical division:
- Codex: focused mechanical implementation, diff-scoped edits, fast in-repo execution
- Claude: long-horizon orchestration, broader review topology, final synthesis
The combined workflow catches blind spots better than either tool alone.