Cross-Model Spec Review via codex
Activates the dormant external-review path in /creview-spec so that codex (GPT-5.5) runs as a first-class adversarial spec reviewer alongside Claude’s six review agents. A different model, a different vendor, and a different failure distribution — an independent grader for the spec, in the same spirit as the core “never let an agent grade its own work” principle.
Full rules live in the spec: .correctless/specs/cross-model-spec-review.md (23 invariants). This doc covers behavior, configuration, and limitations only.
What it does
When /creview-spec runs at high+ intensity and an external model is configured, it invokes the codex CLI read-only against the whole spec and folds the returned findings into the synthesis as Source: codex (external). Codex findings are advisory only — they are renamespaced to EXT-NNN, presented to you at the Step 4 human disposition gate, and never auto-incorporated into the spec.
When no external model is configured (or the codex binary is absent), the path is silently dormant — /creview-spec runs exactly as before with its six Claude agents. This is the default: nothing turns on until you opt in.
How it works
flowchart LR
spec[Spec artifact] --> crs["/creview-spec Step 3"]
crs --> prod["external-review-run.sh review<br/>(sole writer · ABS-042)"]
prod -->|"spec on stdin<br/>--sandbox read-only (always)<br/>closed argv allowlist (INV-017)"| codex[codex CLI · GPT-5.5]
codex -->|JSON findings| gate["parse-gate + bounds + nonce-fence<br/>neutralize (INV-002/009/019)"]
gate --> synth["Claude review synthesis<br/>Source: codex (external) · EXT-NNN"]
synth --> human{"Step 4 human<br/>disposition gate"}
human -->|accept/reject/modify/defer| specout[Spec]
prod -.invocation-coupled record.-> hist[(external-review-history.json)]
The producer (scripts/external-review-run.sh) is the sole writer of .correctless/meta/external-review-history.json and the run-id-keyed codex output artifact. It injects --sandbox read-only unconditionally (a tampered config cannot run codex unsandboxed — closes the AP-022 dead-code-in-security-paths class), validates the entire invocation against a closed allowlist, and parse-gates, bounds (4 MiB ceiling), and nonce-fences the untrusted codex output before it ever reaches Claude’s reasoning context.
Configuration
In .correctless/config/workflow-config.json under workflow.external_models.codex:
| Field | Meaning |
|---|---|
bin | Path/name of the codex binary (realpath-validated) |
base_args | Base argv, e.g. ["exec","--ephemeral","--json"] — --sandbox, --output-schema, and --output-last-message are appended by the producer and rejected if present here |
model | Model id (charset-validated) |
timeout | Clamped per-run timeout |
/csetup auto-detects an installed codex CLI and offers to populate this block (with an egress disclosure). To set fields after setup, use the sanctioned writer: scripts/config-update.sh set-external-model …. require_external_review: true makes external review mandatory rather than flag-triggered.
Known limitations
/creview-spec-only. External review does not run in the/cautoautonomous pipeline — there is no human checkpoint there, so the codex path is intentionally excluded.- Full-repo egress. Running codex sends repository content (including any secrets,
.env, and git history within codex’s read scope) to OpenAI. This is disclosed at configuration time (INV-014) and again per run (INV-022). The read-only sandbox bounds writes, not egress — the egress boundary is the opt-in config gate, not the sandbox. - Advisory only. Codex findings never auto-incorporate; every one passes the human disposition gate.
Security boundaries
- TB-008 — external model output → Claude review synthesis → spec.
- TB-001c — structured external-tool config → argv (no eval, no shell).
- ABS-042 — sole-writer external-review producer.
See .correctless/ARCHITECTURE.md for the full entries.