Workflow History
2026-05-22 — Cross-Feature Intelligence Layer
Branch: feature/cross-feature-intelligence. Rules: 22 (16 invariants + 3 prohibitions + 3 boundary conditions). QA rounds: 1. Findings fixed: 0. Overrides: 2. Added scripts/cross-feature-intel.sh (876 lines) – a deterministic aggregation script that reads 6 data sources (deferred findings, Devil’s Advocate reports, override history, lens recommendations, debug investigations, phase effectiveness) and produces a single JSON brief at .correctless/meta/cross-feature-intel.json. The brief is filtered by file-scope overlap and recency (90-day staleness exclusion, 30-entry cap with per-section minimums). Modified skills/cspec/SKILL.md with Step 0a to read and present the brief during Socratic brainstorm, with Bash(*cross-feature-intel*) allowed-tools pattern and an anti-anchoring directive with calibration examples (weight-when/dismiss-when heuristics, same PMB-007 lesson). Modified skills/cstatus/SKILL.md with 3-state intelligence health reporting (no data / stale / current). Added ABS-037 architecture entry and updated TB-003 with mitigation variant (anti-anchoring directive for internal advisory content vs. UNTRUSTED fence for external untrusted content). 81 tests in tests/test-cross-feature-intel.sh.
2026-05-18 — DA-002 Debt Sprint
Branch: feature/da-002-debt-sprint. Rules: 19 (16 invariants + 3 prohibitions). QA rounds: 1. Findings fixed: 0. Overrides: 1. Decomposed hooks/workflow-advance.sh from 1,368 lines to a 533-line thin dispatcher that sources 3 modules from scripts/wf/ (transitions.sh, utility.sh, metadata.sh). Replaced the hardcoded 3,372-character test command in workflow-config.json with glob-based discovery (for f in tests/test-*.sh). Renamed tests/test.sh to tests/test-core.sh and removed its inline invocations of other test files. Updated setup, sync.sh, and lib.sh for scripts/wf/ subdirectory handling. Added SFG protection for scripts/wf/*.sh. Resolved 4 drift debt items (DRIFT-001 resolved, DRIFT-003 wont-fix, DRIFT-004 wont-fix, DRIFT-008 resolved). Added drift debt cadence check to /cspec Step 0. ABS-035 module contract added. 253 assertions in test-workflow-advance-decomp.sh. Motivated by Devil’s Advocate finding DA-002 (complexity outpacing maintainer).
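The glob-based discovery mentioned above (for f in tests/test-*.sh) can be sketched roughly as follows. This is an illustration under assumptions, not the command stored in workflow-config.json: the function name run_test_suites and the PASS/FAIL output format are hypothetical.

```shell
# Hypothetical helper: run every test-*.sh suite found in a directory.
# Only the glob pattern comes from the entry above; the rest is assumed.
run_test_suites() {
  local dir="${1:-tests}" failed=0 f
  for f in "$dir"/test-*.sh; do
    # When nothing matches, the glob stays literal, so skip non-files.
    [ -f "$f" ] || continue
    if bash "$f" >/dev/null 2>&1; then
      echo "PASS $(basename "$f")"
    else
      echo "FAIL $(basename "$f")"
      failed=$((failed + 1))
    fi
  done
  # Succeed only when no suite failed.
  [ "$failed" -eq 0 ]
}
```

Because suites are discovered by name, adding a new tests/test-*.sh file needs no change to the command itself, which is presumably what made the hardcoded list unnecessary.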
2026-05-17 — Simplify Intensity Calibration (DA-004)
Branch: feature/simplify-intensity-calibration. Rules: 9 (7 invariants + 2 prohibitions). QA rounds: 0. Findings fixed: 0. Overrides: 0. Removed dead calibration infrastructure from /cspec Step 7b: active mode (auto-raise logic), hybrid mode (passive-until-5-entries-then-active), the intensity_calibration_mode config key, and the 200K token threshold comparison. None of these ever fired in production — 21 of 22 calibration entries had actual_tokens: 0, auto-raise threshold never triggered, mode config was never set. Calibration is now always advisory: displays QA rounds, BLOCKING findings, and token averages for overlapping file paths as read-only context. Data collection by /cverify unchanged (entries still written with all fields). Updated /csetup (removed mode selection decision), config templates (removed dead key), AGENT_CONTEXT.md and FEATURES.md (removed mode descriptions). Removed 5 test functions from existing suites (tests for dead modes), updated 3 test assertions (threshold references). Net -238 LOC. 79 new tests in test-simplify-intensity-calibration.sh. Motivated by Devil’s Advocate finding DA-004 (act on known dead code).
2026-05-16 — Adversarial Probe Framework
Branch: feature/adversarial-probe-framework. Rules: 20 (14 invariants + 3 prohibitions + 3 boundary conditions). QA rounds: 1. Findings fixed: 0. Overrides: 0. Added an adversarial probe round to /ctdd between QA and mini-audit at high+ intensity. Two probe types at high (mutation testing, config/input fuzzing) and three additional at critical (dependency sabotage, permission stripping, rollback simulation). Probes run in isolated git worktrees via isolation: "worktree" on the Agent tool — no modifications touch the main working tree. Budget-controlled parallel dispatch (formula-derived probe count from commands.test_duration_estimate). Surviving mutants trigger test generation with human review gate (interactive) or auto-commit (autonomous). Probe results artifact at .correctless/artifacts/probe-results-{branch-slug}.json committed via TB-004c allowlist modification. Advisory-only — probe failures never block pipeline progression (PRH-003). Added ABS-034 (probe results artifact contract), ENV-010 (Agent tool worktree isolation contract). 53 tests.
2026-05-15 — Deferred Findings Backlog
Branch: feature/deferred-findings-backlog. Rules: 12. QA rounds: 1. Findings fixed: 0. Overrides: 0. Centralized deferred review findings into .correctless/meta/deferred-findings.json with a dual-purpose sync script (scripts/sync-deferred-backlog.sh — seed from existing artifacts and ongoing re-derivation backstop). New /ctriage skill (31st) for wizard-style bulk triage with incremental saves. Review skills (/creview-spec, /creview) write deferred findings on disposition. /cauto sweeps open findings before PR. /cstatus shows severity breakdown with 20-item threshold warning and drift detection. /cmetrics shows 30-day trend, oldest open item, severity distribution. ABS-033 multi-writer advisory contract. Backlog is visibility only — PRH-001 prohibits gate enforcement. 65 tests.
2026-05-15 — Dashboard Visual Redesign
Branch: feature/dashboard-redesign. Rules: 11. QA rounds: 1. Findings fixed: 0. Complete visual and UX redesign of the project dashboard. New visual identity: DM Sans + DM Serif Display fonts from Google Fonts CDN with onerror fallback to system fonts, warm amber/gold accent (#c8842d light / #dba14a dark). Card-based layout with shadows and hover effects. Value narrative section showing total findings caught pre-merge with pipeline phase distribution. Polished dark and light modes with distinct color palettes. file:// protocol output URL. 135 tests (89 original + 46 new).
2026-05-14 — Project Dashboard UI
Branch: feature/project-dashboard-ui. Rules: 12. QA rounds: 2. Findings fixed: 0. Replaced scripts/generate-dashboard.sh with /cdashboard skill backed by scripts/build-dashboard.sh. Two-view HTML dashboard: Metrics (all existing sections preserved) + Artifact Browser (sidebar navigation for 7 artifact categories with marked.js + DOMPurify rendering). CDN dependencies SRI-pinned. Old script and output deleted, all references migrated (R-007). ABS-032 sole-writer contract added.
2026-05-08 — Audit Findings as Escape Metrics
Branch: feature/audit-escape-metrics. Rules: 10. Invariants: 4. Prohibitions: 2. QA rounds: 1. Overrides: 1. Reframed /caudit findings as pipeline escapes with a three-gate taxonomy (per-feature / audit / production), root-cause classification (implementation/spec/non-escape per finding), severity-weighted scoring (critical=5, high=3, medium=2, low=1, info=0), and per-cycle trend tracking. Extended ABS-029 round-JSON schema with optional escape_type field validated by audit-record.sh. Updated /cmetrics to replace single “Bug escape rate” line with full escape breakdown. Dashboard escape metrics section reads from metrics artifacts. Classification happens at specialist submission time during audit, not batched after convergence.
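The severity weights above (critical=5, high=3, medium=2, low=1, info=0) lend themselves to a small worked example. The sketch below assumes the score for a cycle is the plain sum of the weights of its findings; the entry does not state the aggregation, and the function names severity_weight and escape_score are hypothetical.

```shell
# Weights are taken from the entry above; everything else is assumed.
severity_weight() {
  case "$1" in
    critical) echo 5 ;;
    high)     echo 3 ;;
    medium)   echo 2 ;;
    low)      echo 1 ;;
    info)     echo 0 ;;
    *)        echo "unknown severity: $1" >&2; return 1 ;;
  esac
}

# Assumption: a cycle's score is the sum of its findings' weights.
escape_score() {
  local total=0 sev w
  for sev in "$@"; do
    w=$(severity_weight "$sev") || return 1
    total=$((total + w))
  done
  echo "$total"
}
```

For example, escape_score critical high high low info prints 12 (5 + 3 + 3 + 1 + 0).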
2026-05-08 — Autonomous Skill Contract
Branch: feature/autonomous-skill-contract. Rules: 14. QA rounds: 3. Findings fixed: 0. Overrides: 1. Added interaction_mode frontmatter field to all 29 skills (autonomous/interactive/hybrid), ## Autonomous Defaults sections with AD-xxx entries, autonomous dispatch protocol for /cauto (mode: autonomous in Task prompt), deferred escalation for fork+hybrid skills, and ABS-030 sole-writer JSONL contract for autonomous decisions logging.
2026-04-02 — Consolidate artifacts into .correctless directory
Branch: feature/correctless-directory. Rules: 20. QA rounds: 3. Findings fixed: 5. Moved all Correctless-generated files from scattered locations (.claude/artifacts/, docs/specs/, root ARCHITECTURE.md) into a unified .correctless/ directory with auto-migration for existing installs.
2026-04-03 — Add calm reset prompts to orchestrators
Branch: feature/calm-resets. Rules: 11. QA rounds: 2. Findings fixed: 7. Added desperation-vector management to /ctdd and /caudit — conditional reset prompts fire at known spiral trigger points (3+ consecutive failures in GREEN/fix rounds, recurring BLOCKINGs across QA rounds, diverging finding counts in audit). Each reset redirects the agent to re-read source material and offers human escalation. One reset per trigger per phase, then mandatory escalation with /cdebug suggestion.
2026-04-02 — Add /crelease skill for versioning and changelog
Branch: feature/crelease-skill. Rules: 19. QA rounds: 2. Findings fixed: 7. Added /crelease skill that automates version bumping from specs (not commits), changelog generation grouped by type, sanity gate, annotated git tagging, and optional push/GitHub release. Setup now detects version files (package.json, Cargo.toml, pyproject.toml, setup.cfg, Go constants, CHANGELOG.md) with section-aware TOML parsing. Also fixed pre-existing Rust JSON escape bug and incorrect consolidation test assertion.
2026-04-03 — Merge Lite and Full into single plugin distribution
Branch: feature/dynamic-rigor-stage1. Rules: 18. QA rounds: 3. Findings fixed: 11. Merged correctless-lite (19 skills) and correctless-full (26 skills) into a single correctless/ distribution with all 26 skills. Full-only skills (caudit, cmodel, creview-spec, cupdate-arch, cpostmortem, cdevadv, credteam) are visible but gated by intensity level — they check workflow.intensity from config and warn if invoked below threshold. sync.sh, marketplace.json, setup, README, design docs, and all skill terminology updated from Lite/Full to intensity levels. Setup detects and migrates old Lite/Full directories. This is Stage 1 of the dynamic rigor system — structural change only, no detection logic.
2026-04-04 — Wire intensity into remaining pipeline skills
Branch: feature/dynamic-rigor-stage4. Rules: 22. QA rounds: 1. Findings fixed: 0. Added Intensity Configuration tables and verbatim Effective Intensity sections to /cspec, /ctdd, /cverify, /cdocs, /cstatus so each adapts behavior based on effective intensity. Each skill’s table defines what changes at standard/high/critical (e.g., cspec: Socratic→Adversarial→Exhaustive question depth; ctdd: 2→3→5 QA round caps; cverify: basic→full+Serena→mutation survivor analysis). Promoted /creview’s ### Intensity-Aware Behavior to ## to fix section boundary for R-022 verbatim extraction. All 6 pipeline skills now share character-for-character identical Effective Intensity sections. This is Stage 4 of the dynamic rigor system — completes intensity wiring for the full pipeline.
2026-04-04 — Wire intensity into /creview
Branch: feature/dynamic-rigor-stage3. Rules: 11. QA rounds: 1. Findings fixed: 5. Added Intensity Configuration table and Effective Intensity section to /creview so it adapts review thoroughness based on max(project_intensity, feature_intensity). At standard: current behavior. At high: routes to /creview-spec (4-agent adversarial). At critical: routes to /creview-spec with zero-unresolved threshold. Updated all 7 gated skills to check effective intensity instead of project-only, enabling per-feature skill unlocking. Dropped the fourth “light” level during review: three levels (standard/high/critical) are sufficient. This is Stage 3 of the dynamic rigor system: the first skill wired with intensity-variable behavior.
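The max(project_intensity, feature_intensity) rule above can be sketched as an ordering over the three levels. This is a minimal illustration, not the project's code: the function names intensity_rank and effective_intensity are hypothetical, and treating a missing value as standard is an assumption.

```shell
# Map each level to a rank so the levels can be compared numerically.
intensity_rank() {
  case "$1" in
    standard) echo 1 ;;
    high)     echo 2 ;;
    critical) echo 3 ;;
    *)        echo 0 ;;   # assumption: unknown values rank lowest
  esac
}

# Effective intensity is the higher of the project and feature levels.
effective_intensity() {
  local project="${1:-standard}" feature="${2:-standard}"
  if [ "$(intensity_rank "$feature")" -gt "$(intensity_rank "$project")" ]; then
    echo "$feature"
  else
    echo "$project"
  fi
}
```

With this ordering, a standard project with a high feature resolves to high, which is how a single feature could unlock a gated skill without raising the project-wide setting.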
2026-04-03 — Add per-feature intensity detection to /cspec
Branch: feature/dynamic-rigor-stage2. Rules: 13. QA rounds: 2. Findings fixed: 13. Added automatic intensity detection to /cspec that evaluates four signals (file paths, keywords, trust boundaries, antipattern/QA history) to recommend standard/high/critical per feature. Detection runs for all projects regardless of intensity config. Recommendation stored in spec Metadata section and cached as feature_intensity in workflow state via new set-intensity subcommand. User always sees recommendation with reasoning and can override. Configurable signals via workflow.intensity_signals. Humility qualifier for projects with fewer than 5 completed features. This is Stage 2 of the dynamic rigor system — detection and calibration only, no downstream skill behavior changes yet.
2026-04-05 — Add sensitive file protection hook
Branch: feature/sensitive-file-protection. Rules: 10 invariants + 4 prohibitions + 5 boundary conditions. QA rounds: 3. Findings fixed: 6. Added PreToolUse hook that blocks agent writes to sensitive files (.env, credentials, keys, certificates) regardless of workflow phase. Fail-closed, no overrides. Hardcoded defaults for 20 common patterns + custom patterns via config. Bash write interception with extensionless file extraction. Case-insensitive matching, path-boundary-aware full-path patterns, quote stripping. Created .correctless/ARCHITECTURE.md with trust boundary, pattern, and environment entries. 98 integration tests.
2026-04-04 — Add auto-format PostToolUse hook
Branch: feature/auto-format-hooks. Rules: 12 invariants + 4 prohibitions + 4 boundary conditions. QA rounds: 3. Findings fixed: 7. Added PostToolUse hook that auto-formats files after Edit/Write/MultiEdit using the project’s configured formatter (Prettier, ESLint, Black, Ruff, gofmt, rustfmt). Extension-based routing, exact-match allowlist validation (closes command injection), array-based execution (closes path injection), timeout 5s, auto-adds --write/-w for stdout-based formatters. /csetup detects formatters and configures the hook. 70 integration tests.
2026-04-02 — Add /cexplain skill for guided codebase exploration
Branch: feature/cexplain-skill. Rules: 19. QA rounds: 2. Findings fixed: 6. Added /cexplain skill for interactive codebase exploration using mermaid diagrams and prose walkthroughs. Signal-based exploration menus, uncertainty markers, HTML export (incremental/snapshot modes), Serena MCP integration with silent fallback, output mode selection (terminal/HTML), 30-node grouping, and no-setup operation. QA caught missing Serena integration registration and missing spec-verbatim instructions.
2026-04-05 — AI antipattern scan
Branch: feature/antipattern-checklist. Rules: 19. QA rounds: 4. Findings fixed: 17. Added deterministic antipattern scanner (scripts/antipattern-scan.sh) that runs at phase transitions — before QA in /ctdd and during /cverify. Scans changed files for empty catch blocks, debug logging, placeholder credentials, trivial assertions, TODO comments, and error suppression across 5 language families (JS/TS, Python, Go, Rust, Shell). Outputs findings as JSON via jq (never string concatenation) with hardcoded descriptions to prevent JSON injection (TB-002). Universal placeholder detection on all text files regardless of extension. Shell checks use explicit allowlists for || true and echo exemptions. Extracted branch_slug() into shared scripts/lib.sh (ABS-001). Introduced PAT-003 for phase-transition scripts. Semantic checklist at .correctless/checklists/ai-antipatterns.md referenced by ctdd, creview, cverify. POSIX grep only — no GNU extensions. 222 tests.
2026-04-07 — Mechanical token tracking via PostToolUse hook
Branch: feature/token-tracking. Rules: 12. QA rounds: 2. Findings fixed: 3. Added PostToolUse hook that captures Agent tool token usage mechanically — eliminates prompt-based tracking dependency. Hook extracts input_tokens, output_tokens, total_cost_usd, duration_ms from tool_response and appends JSONL entries to per-branch log files. Tags entries with workflow phase, skill, and subagent metadata. jq handles all JSON construction (QA-001 class fix eliminated manual escaping antipattern). Follows PAT-005 PostToolUse conventions (fail-open, no set -e, || exit 0 guards). Setup script wires hook with Agent matcher. Updated /cmetrics and shared constraints to reference JSONL format. 92 tests.
2026-04-07 — Infrastructure hardening
Branch: feature/hardening-tests. Rules: 24. QA rounds: 2. Findings fixed: 3. Added unit tests for all 7 lib.sh functions (41 tests), mkdir-based state file locking with PID stale detection and atomic mv-based lock breaking (ABS-003), and gate path exceptions for .correctless/specs/ (spec phase), .correctless/artifacts/ (all phases), and workflow-advance.sh Bash commands. Locking functions in scripts/lib.sh consumed by workflow-advance.sh (write_state) and workflow-gate.sh (override decrement via locked_update_state). QA caught TOCTOU race in stale lock recovery, MultiEdit classification poisoning from excepted paths, and test wiring gap. 169 tests.
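The locking scheme described above (mkdir as the atomic acquire, a PID file for stale detection, mv to break a stale lock) might look roughly like the sketch below. It is a simplified illustration, not the code in scripts/lib.sh: the names acquire_lock and release_lock, the pid file layout, and the retry count are all assumptions, and the sketch does not claim to close every race (the TOCTOU issue that QA caught is exactly the kind of gap a short version can leave open).

```shell
# Hypothetical helpers; lock layout and retry policy are assumptions.
acquire_lock() {
  local lock_dir="$1" tries="${2:-50}" holder stale
  while [ "$tries" -gt 0 ]; do
    tries=$((tries - 1))
    # mkdir is atomic: only one caller can create the directory.
    if mkdir "$lock_dir" 2>/dev/null; then
      echo "$$" > "$lock_dir/pid"
      return 0
    fi
    holder=$(cat "$lock_dir/pid" 2>/dev/null)
    # kill -0 sends no signal; it only tests whether the PID exists.
    if [ -n "$holder" ] && ! kill -0 "$holder" 2>/dev/null; then
      # Break the stale lock by renaming it first, so only the
      # contender whose rename succeeds removes the old directory.
      stale="$lock_dir.stale.$$"
      if mv "$lock_dir" "$stale" 2>/dev/null; then
        rm -rf "$stale"
      fi
      continue
    fi
    sleep 0.1
  done
  return 1
}

release_lock() {
  rm -rf "$1"
}
```

A caller would wrap a state update as acquire_lock path, then the write, then release_lock path. An empty pid file is treated as a live holder here, since the owner may be between mkdir and the write.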
2026-04-08 — CI completeness and hook auto-registration
Branch: feature/ci-hook-wiring. Rules: 11 (9 invariants + 2 prohibitions). QA rounds: 3. Findings fixed: 11. Added all 26 test suites to CI (was 6), updated commands.test to include all test files on disk (was missing 5), refactored setup’s install_hooks() and register_hooks() to auto-discover hooks via glob and metadata headers (HOOK_TYPE/HOOK_MATCHER). Moved auto-format.sh from .claude/hooks/ to hooks/ source directory. Added matcher drift detection for existing settings.json. Type-based timeout convention (PreToolUse=5000ms, PostToolUse=1000ms). ShellCheck CI now scans scripts/ directory. QA caught grep -oP macOS incompatibility (replaced with POSIX sed), unreachable matcher drift code path, and accidental test state leakage. 69 tests.
2026-04-08 — Deterministic hook synchronization
Branch: feature/hook-sync-enforcement. Rules: 10 (8 invariants + 2 prohibitions). QA rounds: 2. Findings fixed: 4. Replaced hardcoded hook/script lists in sync.sh with glob-based auto-discovery, and extracted duplicated _has_write_pattern() and get_target_file() into scripts/lib.sh (ABS-001). Structurally eliminates cross-hook pattern drift that was caught by 3 consecutive audits. sync.sh --check now detects stale distribution files for hooks and scripts. QA caught head-1 vs head-5 regression in get_target_file (security gap for multi-file Bash commands) and sensitive-file-guard.sh blocking Edit/Write when lib.sh is missing. 117 tests.
2026-04-06 — Shift-left review enhancement
Branch: feature/shiftleft-enhancement. Rules: 23. QA rounds: 3. Findings fixed: 9. Added historical findings data to /creview and /creview-spec — review agents now read past QA findings, Olympics audit history, and Devil’s Advocate reports to detect recurring patterns. Ephemeral LLM classification (ABS-002) with merge-broad directive. Subagent isolation preserved — orchestrator reads historical data, subagents get clean context (R-002). 10-file budget (PAT-004) using filename sort (ENV-003). Schema heterogeneity handling across JSON/markdown/free-form sources. Trust boundary TB-003 for historical findings feedback loop. Introduced [design] tag for LLM behavior rules with [unit] companions. Anti-anchoring defense honest about limitations (presentation-only for single-pass /creview, genuine isolation in /creview-spec). 146 tests.
2026-04-08 — Intensity Calibration Loop
Branch: feature/intensity-calibration-loop. Rules: 14. QA rounds: 2. Findings fixed: 4. Closed the feedback loop between intensity recommendations and actual outcomes. /cverify writes calibration entries (recommended vs actual intensity, QA rounds, BLOCKING findings, file paths) to .correctless/meta/intensity-calibration.json. /cspec reads them as a post-signal modifier during intensity detection. 3 modes: passive (advisory text with arithmetic), active (auto-raise floor at 3 QA rounds or 8 BLOCKING findings), hybrid (passive until 5 entries then active). Recommended-intensity metadata field in spec templates. Cross-skill data flow abstraction (ABS-005). 91 tests.
2026-04-09 — Token-aware intensity calibration
Branch: feature/token-aware-intensity. Rules: 10 (8 invariants + 2 prohibitions). QA rounds: 2. Findings fixed: 2. Added actual_tokens field to calibration entries — cverify sums total_tokens from token-log JSONL via deterministic jq (branch_slug file location, malformed line skipping). cspec reads actual_tokens as a post-signal modifier in Step 7b: 200K token threshold OR’d with QA rounds >= 3 and BLOCKING findings >= 8. Passive mode shows token arithmetic. cmetrics gets Per-Feature Token Cost table (skill-based category mapping) and concrete token trend computation (first-half/second-half, 20% threshold). ABS-006 documents token-log JSONL contract. 63 tests.
2026-04-09 — Add skill field to token-tracking hook JSONL
Branch: feature/token-tracking-skill-field. Rules: 8. QA rounds: 1. Findings fixed: 0. Added skill field to token-tracking PostToolUse hook JSONL output, derived from phase via hardcoded bash case statement (12 phases mapped). Enables accurate per-skill category attribution in cmetrics without fallback heuristics. R-008 sync test extracts all phases from workflow-advance.sh and verifies each is mapped. ABS-006 updated to reflect skill as standard field in hook-produced entries. 61 tests.
2026-04-09 — Auto-promote recurring antipatterns to architecture
Branch: feature/auto-recurring-patterns. Rules: 11 (9 invariants + 2 prohibitions). QA rounds: 2. Findings fixed: 3. Added auto-promotion logic to /cspec (Step 5b) and /cpostmortem (Step 3) — when antipatterns reach 3+ features in their Frequency field, skills suggest promoting them to ARCHITECTURE.md entries. Cap of 2 promotions per /cspec invocation prevents workflow hijacking. Draft includes Guards against: AP-xxx field for deduplication. Human always approves before write. Added Write(.correctless/ARCHITECTURE.md) to cpostmortem allowed-tools. 46 tests.
2026-04-09 — Semi-auto mode
Branch: feature/semi-auto-mode. Rules: 19 (R-001 through R-019) + 7 prerequisites. QA rounds: 2. Findings fixed: 9. Added /cauto orchestrator skill — automates the implementation pipeline (ctdd → simplify → cverify → cupdate-arch → cdocs → PR) after human-approved spec review. New .correctless/preferences.md surface for codifying judgment calls (QA triage, doc scope, commit style, escalation sensitivity, PR creation). Escalation with YAML frontmatter enables pipeline resumption. /simplify treated as untrusted contributor with commit-before-simplify and post-simplify validation. Fixed pre-existing is_full_mode() bug (ignored feature_intensity). Added TB-001b, TB-004, ABS-007, ABS-008, PAT-007, PAT-008, PAT-009, ENV-004 to ARCHITECTURE.md. Resolved pre-existing merge conflicts in README, ARCHITECTURE, CONTRIBUTING, .gitignore. 139 tests.
2026-04-10 — Migrate PAT-001 to path-scoped rule file
Branch: feature/path-scoped-rules-pat001-migration. Rules: 24 invariants (INV-001..011, INV-015..027) + 5 prohibitions + 3 boundary conditions + 5 environment assumptions + 7 design decisions. QA rounds: 2. Findings fixed: 2 (both NON-BLOCKING — circular self-reference in CLAUDE.md PostToolUse learning entry; stalled created_at_commit back-fill). Dogfood prototype for the rules-canonical / ARCHITECTURE.md-as-index pattern. Migrated PAT-001: PreToolUse hook conventions from .correctless/ARCHITECTURE.md into .claude/rules/hooks-pretooluse.md with paths: frontmatter scoping to hooks/workflow-gate.sh and hooks/sensitive-file-guard.sh. Added tests/test-architecture-drift.sh (1772 LOC, 55 structural checks + 10 negative-case fixtures) enforcing index-line shape, See-link target existence, paths-list set equality with PreToolUse hook discovery, in-file rule pointer comments, semantic integrity anchors (clause-5 literal + QA-R1-004/005 + persistence-year-PR anchor), and a static allowed-tools allowlist for Write(.claude/rules/*.md). Added ABS-009, ENV-005, ENV-006, Patterns reader-note blockquote to ARCHITECTURE.md. Pre-GREEN canary verified .claude/rules/ loading mechanism in a fresh Claude Code session (native Loaded .claude/rules/canary-*.md indicator + unprompted UUID marker surfacing). Dormant post-merge measurement gate (MG-001/002/003) at .correctless/meta/pat001-measurement-due.json with due_at_pr_count: 3; /cstatus has a dormant check that emits “Measurement overdue” when reached. /simplify pass extracted extract_see_link_paths, strip_shell_comments, get_learning_entry_section helpers and deleted 3 redundant sub-checks (58 → 55 checks). Class fixes for both QA findings added as structural drift checks. /cverify confirmed 22 genuine invariants covered + 2 deferred (INV-015 manual canary, INV-022 GREEN-phase fixture), all 5 prohibitions hold, 848 tests across 10 suites pass. 55 drift checks.
2026-04-12 — Auto Mode Phase 2
Branch: auto-mode-phase2. Rules: 29 (18 invariants + 5 prohibitions + 6 boundary conditions). QA rounds: 3. Findings fixed: 8 (4 BLOCKING, 4 HIGH). Adds policy-driven decision engine with tiered architecture (Tier 0 policy engine, Tier 1 worker self-resolution, Tier 2 ephemeral decision agents, Tier 3 lightweight supervisor, Tier 4 hard stop), structured DR-xxx decision requests, append-only decision record with size-regression detection, budget enforcement (token + time with warn/hard-stop), immutable intent summary with SHA-256 hash verification, Auto Run Report with hedging scan, pipeline lockfile for concurrent run prevention, and three-layer security constraint detection (PRH-001). New scripts: auto-policy.sh, auto-report.sh, budget-check.sh, cauto-lock.sh, decision-record.sh, decision-routing.sh, intent-hash.sh, security-scan.sh, workflow-state-ext.sh. New agents: supervisor.md, decision-agent.md. New template: auto-policy.json. Added ABS-011 through ABS-017, PAT-011 to ARCHITECTURE.md. 255 tests across 7 new test suites.
2026-04-13 — Auto Mode Phase 3
Branch: auto-mode-phase3. Rules: 38 (24 invariants + 6 prohibitions + 8 boundary conditions). QA rounds: 2. Findings fixed: 6 (3 BLOCKING, 3 HIGH). Extends /cauto to orchestrate from prompt to PR — spec writing, autonomous review with supervisor-triaged findings (batch activation), mandatory human spec approval gate. Supervisor mandate expansion: configurable levels (conservative/moderate/aggressive), architectural decisions with spec citation enforcement at conservative level, 7 hard-limit conditions bypass supervisor to Tier 4 (unspecced deps, security relaxation, budget/time exceeded, intent/policy tampering, CLAUDE.md mods, spec restructure). Override window scrutiny: 3-phase supervisor review (issuance → per-action → closure) with base-commit cross-check for “pre-existing” claims via git worktree, mechanical file-touch scope drift detection, spec completeness verification with concrete deliverable parsing, and Jaccard-similarity retry prevention (PRH-006, threshold ≥ 0.4). New scripts: review-triage.sh, supervisor-mandate.sh, override-scrutiny.sh, override-crosscheck.sh. Extracted sha256_hash_file to lib.sh (eliminates 3x duplication). Extended: supervisor.md (4 new activation types + Override Scrutiny Prefix), workflow-state-ext.sh (spec approval), auto-report.sh (review triage + override scrutiny sections), cauto SKILL.md (Phase 3 pipeline flow). Added ABS-018 through ABS-020, TB-004a to ARCHITECTURE.md. 213 tests across 6 new test suites.
2026-04-13 — Test Evasion Antipatterns
Branch: feature/test-evasion-antipatterns. Rules: 9. QA rounds: 2. Findings fixed: 1. Added AP-016 (test-routing around requirements), AP-017 (hand-rolled permissive mocks), AP-018 (phantom e2e execution) to the antipattern corpus based on Andrew’s external dogfooding feedback from clawker. Updated ctdd test audit prompt with checks 5/6/7 for spec-named resource verification, mock generator detection, and execution evidence verification. Added PAT-012 (wiring tests over keyword tests) and PAT-013 (doc-update invariant on refactoring) to ARCHITECTURE.md via antipattern promotion. Scanner implementation deferred to language-specific dogfooding. 52 tests.
2026-04-13 — Scanner Expansion
Branch: feature/scanner-expansion. Rules: 11. QA rounds: 1. Findings fixed: 0. Expanded scripts/antipattern-scan.sh with two new detection categories: grep portability checks (gnu-grep-p high, gnu-grep-ext medium, gnu-grep-ext-low low) detecting grep -P and GNU extensions (\s, \w, \d, \b) with line-scoped POSIX exclusion suppression, and dead-code-in-security-paths detection (dead-security-fn high) scanning security scripts for functions with zero production callers. Security scripts identified mechanically by filename patterns plus # scanner: security tag; # scanner: library tag excludes scripts called by LLM skill orchestrators (with orphan-library backstop). Added ctdd test audit check 8 (production call chain). Updated AP-001 with scanner enforcement references and added AP-022 (dead code in security paths). Content-pairing drift test ensures scanner+prompt stay in sync. Added PAT-014 (scanner tag conventions) and PAT-015 (content-pairing drift tests) to ARCHITECTURE.md. 91 new assertions across 2 test files.
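To illustrate the grep portability category above, here is a small sketch of how a gnu-grep-p check could be written using POSIX character classes only. It is not the logic in scripts/antipattern-scan.sh: the function name scan_grep_p, the regular expression, and the output format are assumptions, and the exclusion suppression described in the entry is left out.

```shell
# Hypothetical check: flag lines that invoke grep with -P or --perl-regexp.
# The regex allows other short option groups before the one containing P.
scan_grep_p() {
  local file="$1"
  grep -nE 'grep[[:space:]]+(-[[:alnum:]]+[[:space:]]+)*(-[[:alnum:]]*P|--perl-regexp)' "$file" |
    while IFS=: read -r line _; do
      # grep -n prefixes each match with "LINE:", so keep only the number.
      echo "$file:$line: gnu-grep-p (high)"
    done
}
```

Run against a script, this prints one line per offending call, for example path:2: gnu-grep-p (high), and prints nothing for calls such as grep -E.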
2026-04-13 — /cauto UX Improvements
Branch: feature/auto-ux-improvements. Rules: 7. QA rounds: 1. Findings fixed: 0. Three UX improvements to /cauto: (1) flexible phase entry — accepts any active workflow phase via a fixed phase-to-step mapping instead of requiring review/review-spec, with artifact validation for skipped phases and configurable test timeout; (2) scoped commit consolidation before PR creation — stages only known pipeline output paths with belt-and-suspenders .correctless/artifacts/ guard and protected branch check (TB-004c); (3) structured end-of-pipeline summary with findings/decisions, phase breakdown (skill-name rows, duration from audit trail, token counts from JSONL), and artifact paths with truncation at >20 severity-bearing items. Added artifact_validation_failed audit event type (8 total). 70 new assertions in test-semi-auto-mode.sh (210 total).
2026-04-13 — Scripts namespace migration
Branch: feature/scripts-namespace-migration. Rules: 7. QA rounds: 1. Findings fixed: 0. Moved installed scripts (lib.sh, antipattern-scan.sh) from scripts/ to .correctless/scripts/, eliminating the only top-level namespace collision in user projects. Setup now installs to .correctless/scripts/ and detects+migrates old layouts on upgrade (prints advisory, does not delete scripts/). All 6 hook fallback paths updated; primary resolution path (../scripts/ relative to .correctless/hooks/) already resolved correctly. Skill references to antipattern-scan.sh updated. Source-tree layout unchanged — scripts/ remains the development directory. 60 tests.
2026-04-13 — Spec Mutation Alerts
Branch: feature/mutation-alerts. Rules: 5. QA rounds: 1. Findings fixed: 0. Added spec integrity checking to workflow-advance.sh: SHA-256 hashes the spec file at review-to-tests transition (spec_hash state field), re-hashes at done transition and emits an advisory warning if the spec was modified after review approval. Includes line count delta in the warning message. The spec-update flow (R-003) re-hashes legitimately to avoid false positives. Missing spec file at check time produces a warning, not a crash (R-004). Uses sha256_hash_file from lib.sh (PAT-011 fallback chain). 20 tests.
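The hash-then-recheck flow above can be sketched as follows. This is an illustration under assumptions rather than the workflow-advance.sh code: sha256_of stands in for the project's sha256_hash_file (whose fallback chain is not shown in this history), and check_spec_integrity, its arguments, and the message wording are hypothetical.

```shell
# Stand-in hashing helper; the tool order here is an assumption.
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | cut -d' ' -f1
  elif command -v shasum >/dev/null 2>&1; then
    shasum -a 256 "$1" | cut -d' ' -f1
  else
    openssl dgst -sha256 "$1" | sed 's/^.*= //'
  fi
}

# Compare the spec against the hash and line count recorded at approval.
# Always returns 0: the check is advisory and never blocks.
check_spec_integrity() {
  local spec="$1" approved_hash="$2" approved_lines="${3:-0}" now_lines
  if [ ! -f "$spec" ]; then
    echo "WARNING: spec file not found: $spec"
    return 0
  fi
  if [ "$(sha256_of "$spec")" != "$approved_hash" ]; then
    now_lines=$(wc -l < "$spec" | tr -d ' ')
    printf 'WARNING: spec modified after review approval (line delta: %+d)\n' \
      "$((now_lines - approved_lines))"
  fi
  return 0
}
```

An unchanged spec produces no output, while an edit made after approval produces a single warning with the signed change in line count.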
2026-04-13 — Override Frequency Metrics
Branch: feature/override-freq-metrics. Rules: 6. QA rounds: 2. Findings fixed: 0. Overrides: 1. Added cross-run override pattern detection and persistent override log preservation to the /cauto pipeline. New functions in scripts/override-scrutiny.sh: preserve_override_log copies branch-filtered override entries to .correctless/meta/overrides/{task-slug}-{YYYYMMDD}.json on every terminal state (Step 9.5); _list_overrides_by_timestamp sorts preserved files by completed_at; check_cross_run_overrides detects recurring override reasons across the last 10 runs (Jaccard >= 0.4, 2+ matches triggers escalation). /cmetrics gains Override Health section (total overrides, mean per run, reason clusters via Jaccard 0.3 threshold, elevated-rate warning at >0.5/run). /cdocs includes override count in workflow-history.md entries when >0. 50-file retention cap with malformed-first eviction. ABS-020 updated with cross-run pre-check; ABS-021 added for meta/overrides/ directory contract. 40 tests.
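The Jaccard comparison above can be illustrated with a short sketch. It computes the similarity of two override reasons as the size of the intersection of their word sets divided by the size of the union, then applies the 0.4 threshold from the entry. The tokenization (lowercase, split on non-alphanumeric characters) and the function names jaccard and reasons_match are assumptions, not the behavior of scripts/override-scrutiny.sh.

```shell
# Hypothetical helper: Jaccard similarity of the word sets of two strings.
jaccard() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    na = split(tolower(a), ta, /[^a-z0-9]+/)
    nb = split(tolower(b), tb, /[^a-z0-9]+/)
    for (i = 1; i <= na; i++) if (ta[i] != "") sa[ta[i]] = 1
    for (i = 1; i <= nb; i++) if (tb[i] != "") sb[tb[i]] = 1
    inter = 0; uni = 0
    for (w in sa) { uni++; if (w in sb) inter++ }   # words in A, shared words
    for (w in sb) if (!(w in sa)) uni++             # words only in B
    printf "%.4f\n", (uni == 0 ? 0 : inter / uni)
  }'
}

# Succeeds when the similarity meets the threshold (default 0.4).
reasons_match() {
  awk -v s="$(jaccard "$1" "$2")" -v t="${3:-0.4}" 'BEGIN { exit !(s >= t) }'
}
```

As a worked case, "pre-existing test failure" and "pre-existing lint failure" share three of five distinct words, giving 0.6000, which counts as a match at the 0.4 threshold.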
2026-04-14 — Stale Hook Detection
Branch: feature/stale-hook-detection. Rules: 5. QA rounds: 1. Findings fixed: 0. Overrides: 1. Detects when installed hooks and scripts drift from their source distribution by writing a SHA-256 install manifest (.correctless/.install-manifest.json) during setup and checking it at pipeline startup (/cauto) and status display (/cstatus). Covers install-vs-manifest drift, source-ahead-of-install detection (the PR #63 failure class), missing files, and unmanifested new files. Advisory warnings only, not blocking. check_install_freshness in scripts/lib.sh is the reader; setup is the sole writer. Manifest is gitignored local state with atomic write via temp+mv. Added ABS-022 to ARCHITECTURE.md. 36 tests.
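A rough sketch of the write-then-check cycle, under simplifying assumptions: the real manifest is JSON, while this sketch uses plain "hash  path" lines to stay dependency-free, and sha256_of, write_manifest, and check_manifest are hypothetical names. It covers only drift and missing files, not the source-ahead or unmanifested-file cases.

```shell
# Hypothetical helper; picks whichever SHA-256 tool is installed.
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | cut -d' ' -f1
  else
    shasum -a 256 "$1" | cut -d' ' -f1
  fi
}

# write_manifest <manifest> <file>...
# Builds the manifest in a temp file next to the target, then renames it,
# so a reader never sees a half-written manifest.
write_manifest() {
  manifest=$1; shift
  tmp=$(mktemp "$manifest.XXXXXX") || return 1
  for f in "$@"; do
    printf '%s  %s\n' "$(sha256_of "$f")" "$f"
  done > "$tmp" && mv "$tmp" "$manifest"
}

# check_manifest <manifest>: advisory warnings only, always returns 0.
check_manifest() {
  while read -r want path; do
    if [ ! -f "$path" ]; then
      echo "WARNING: missing $path"
    elif [ "$(sha256_of "$path")" != "$want" ]; then
      echo "WARNING: drift in $path"
    fi
  done < "$1"
  return 0
}
```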
2026-04-18 — TDD Mini-Audit Phase
Branch: feature/tdd-mini-audit. Rules: 20. QA rounds: 1. Findings fixed: 0. Overrides: 1. Added tdd-audit phase to /ctdd pipeline between QA and done. Three adversarial specialist agents (cross-component interaction, hostile input, resource bounds) run in parallel per round, intensity-scaled (standard=1, high=2, critical=3). Fixed rounds, not convergence. CRITICAL/HIGH findings are blocking with fix-and-recheck loop. Added MA- prefix for mini-audit findings, UNCERTAIN severity for honest ambiguity. Modified workflow-advance.sh (new phase + transitions), workflow-gate.sh (phase gating), token-tracking.sh (phase-to-skill mapping), ctdd SKILL.md, cauto SKILL.md. 79 tests.
2026-04-18 — Integration Test Contracts in Specs
Branch: feature/integration-test-contracts. Rules: 10. QA rounds: 1. Findings fixed: 0. Overrides: 1. Added Entry/Through/Exit integration test contract format to /cspec and /ctdd. When /cspec writes a rule tagged [integration], it now defines Entry (entrypoint from ARCHITECTURE.md test_via), Through (components that must be exercised and must NOT be mocked), and Exit (observable behavior assertion) constraints. /ctdd test audit verifies contracts with tiered severity: Entry=mechanical BLOCKING, Through=semi-mechanical BLOCKING/UNCERTAIN, Exit=semantic BLOCKING/ADVISORY. Added Step 4a to cspec SKILL.md, item 9 to ctdd test audit, updated both spec templates, created ABS-024 in ARCHITECTURE.md, and strengthened ABS-023 with consumer tracking and evolution constraints. 60 tests.
2026-04-18 — /carchitect Phase 1: Entrypoint-Aware TDD
Branch: feature/carchitect-phase1. Rules: 9. QA rounds: 1. Findings fixed: 0. Overrides: 1. Added entrypoint-aware context to the /ctdd RED phase test-writing agent and internal import bypass detection (check 10) to the test audit — both within existing skills/ctdd/SKILL.md. RED phase agent now reads ARCHITECTURE.md entrypoints, Key Patterns, Layer Conventions, and Trust Boundaries before writing integration tests through documented test_via patterns. Test audit check 10 detects when [integration] tests import internal packages directly instead of going through a documented entrypoint’s scope — language-aware for Go, TypeScript/JavaScript, Python, Rust with ADVISORY skip for unsupported languages. Graceful fallback when no entrypoints are documented. Consolidated finding when check 10 and check 9 (Entry contract verification) both fire on the same test. 33 tests.
2026-04-18 — Agent Hook for Internal Import Enforcement
Branch: feature/agent-hooks. Rules: 12. QA rounds: 1. Findings fixed: 0. Overrides: 1. First agent hook in Correctless — hooks/import-guard.json is a PreToolUse agent hook that denies test file writes when the test imports internal packages covered by a documented entrypoint. JSON config file (not bash script) with type: "agent" and an embedded prompt that performs sequential checks: test file detection, ARCHITECTURE.md entrypoint parsing, workflow.test_helpers allow-list matching, language-aware import detection (Go, TS/JS, Python, Rust), and actionable deny reasons with escape hatch guidance. Setup updated to auto-discover and register JSON agent hooks alongside bash command hooks. Sync updated for JSON hook propagation with bidirectional staleness detection. Added ABS-025 (agent hook JSON contract) to ARCHITECTURE.md. 47 tests.
2026-04-20 — Dashboard Trend Insights
Branch: feature/dashboard-insights. Rules: 6. QA rounds: 1. Findings fixed: 0. Added four trend analysis sections to the project dashboard: QA Rounds Trend (horizontal bars per feature), Intensity Accuracy (agreed/raised/lowered summary from calibration data), Override Rate (per-feature bars with mean summary), and Fix Rate (N/M findings fixed with percentage). Sections inserted into the dashboard’s vertical narrative between existing sections. All new sections degrade gracefully when data is missing.
2026-04-19 — Project Dashboard
Branch: feature/project-dashboard. Rules: 9. QA rounds: 2. Findings fixed: 0. Overrides: 1. Added scripts/build-dashboard.sh — reads .correctless/ artifacts and generates a self-contained dashboard at .correctless/dashboard/index.html with longitudinal quality metrics and artifact browser. Metrics view: quality trajectory, pipeline phase distribution, antipattern health with dormancy detection, intensity calibration, cost by phase, drift debt, and dev journal. Artifact browser: sidebar navigation for specs, verifications, review findings, research briefs, architecture docs, QA findings, audit history. Uses marked.js + DOMPurify (CDN, SRI-pinned). Graceful degradation for missing data. Gitignored output. /cmetrics updated with dashboard discovery line.
2026-04-20 — Session Cost Analysis
Branch: feature/cost-tracking-enhancement. Rules: 23. QA rounds: 2. Findings fixed: 0. Overrides: 1. Added scripts/compute-session-cost.sh — reads Claude Code session transcripts from ~/.claude/projects/, deduplicates by message.id (3.14x streaming inflation), computes per-model USD cost with hardcoded pricing and config overrides, attributes cost to workflow phases via audit trail timestamps, and writes a cost-{slug}.json artifact. Dashboard “Cost by Phase” section now reads real cost artifacts instead of phantom token-log data. Updated /cverify (actual_cost_usd in calibration), /cmetrics (ROI with real USD), /cdocs (cost summary in workflow-history). Added ABS-026 (cost artifact contract), TB-006 (session transcript reads), ENV-009 (Claude Code transcript storage), updated ABS-006 (PostToolUse token zeros documented). 89 tests. Cost: $0.00 (no completed sessions matched — active session not yet flushed to transcript).
2026-04-19 — Test Harness Extraction
Branch: feature/test-harness-extraction. Rules: 8. QA rounds: 1. Findings fixed: 2. Overrides: 1. Extracted shared test boilerplate (pass/fail/section/skip/summary, counters, colors, preamble) from 14 test files into tests/test-helpers.sh, removing ~400 lines of duplication. Three migration variants (A: full extraction, B: one-liner removal, C: preamble/counters only) accommodate differing levels of inline boilerplate. Variable normalization (FAILED_INVS to FAILED_IDS) in test-architecture-drift.sh. Registration guards in CI and drift tests updated to skip the shared helper. 128 tests. Cost: $0.00 (active session not flushed).
2026-04-19 — Upgrade Compatibility Lens
Branch: feature/upgrade-compatibility-lens. Rules: 8. QA rounds: 1. Findings fixed: 0. Overrides: 1. Added a 5th adversarial agent (Upgrade Compatibility Auditor) to /creview-spec at high+ intensity and a 4th mini-audit specialist (upgrade compatibility) to /ctdd at all intensity levels. Both agents mechanically check a 5-item checklist (installation mechanism, config defaults, schema backward compatibility, migration paths, graceful degradation) motivated by PMB-003 where setup silently skipped 16 scripts. Prompts reference AP-024 and PMB-003 as concrete examples. Count updates across both skill files, LENS enum addition (upgrade-compatibility), and token tracking agent_role updates. Prompt-only changes, no new scripts or hooks. 38 tests.
2026-04-22 — Skill Path Discovery
Branch: feature/skill-path-discovery. Rules: 7. QA rounds: 2. Findings fixed: 0. Overrides: 1. Fixed 4 skills (creview-spec, cverify, cpostmortem, csummary) to use explicit path discovery via workflow-advance.sh status for spec artifacts instead of vague references like “Read the spec artifact” that cause hallucinated paths in fresh sessions. Added structural guard (DISC-001/DISC-002) in test-architecture-drift.sh that maintains MUST_HAVE_DISCOVERY and EXCLUDED_FROM_DISCOVERY lists — any new skill not classified in either list fails the test. Added skill_body() shared helper to test-helpers.sh. AP-025 added for the bug class. 59 tests + 36 guard assertions.
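The classify-or-fail guard can be illustrated with a short sketch. The MUST_HAVE_DISCOVERY contents mirror the four skills named above; the EXCLUDED_FROM_DISCOVERY contents and the unclassified_skills function are hypothetical, since the real lists live in test-architecture-drift.sh.

```shell
MUST_HAVE_DISCOVERY="creview-spec cverify cpostmortem csummary"
EXCLUDED_FROM_DISCOVERY="chelp cstatus"   # hypothetical contents

# unclassified_skills <skill>...: prints every skill found in neither list.
# A drift test would fail whenever this prints anything.
unclassified_skills() {
  for skill in "$@"; do
    case " $MUST_HAVE_DISCOVERY $EXCLUDED_FROM_DISCOVERY " in
      *" $skill "*) ;;
      *) echo "$skill" ;;
    esac
  done
}
```

The point of the two-list design is that adding a skill forces an explicit decision: a new skill directory with no entry in either list shows up in the output.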
2026-04-25 — Statusline Live Cost
Branch: feature/statusline-live-cost. Rules: 10. QA rounds: 1. Findings fixed: 0. Overrides: 1. Added per-feature cost tracking to the statusline’s workflow section (Section 4). Cost data read from a background-refreshed cache file (.correctless/artifacts/cost-cache-{slug}.json) with 30-second staleness threshold. Background refresh spawns compute-session-cost.sh --cache --phase with lock file + atomic write (mktemp + mv). Display format: $47.23 ($12.50 in GREEN) — total omitted when zero or no cache. Extended compute-session-cost.sh with --cache and --phase flags for lightweight stdout output. Extracted phase_display_name() and fmt_cost_nonzero() helpers. All existing statusline behavior unchanged. 43 tests.
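The staleness check and the locked, atomic refresh might look roughly like this. It is a sketch under assumptions: cache_is_fresh and refresh_cache are hypothetical names, and mkdir is used as the lock primitive here, whereas the entry only says "lock file".

```shell
# cache_is_fresh <file> [max_age_seconds]: true when the file exists and was
# modified at most max_age_seconds ago (default 30, as in the entry above).
cache_is_fresh() {
  [ -f "$1" ] || return 1
  now=$(date +%s)
  # GNU stat first, BSD/macOS stat as the fallback.
  mtime=$(stat -c %Y "$1" 2>/dev/null || stat -f %m "$1") || return 1
  [ $((now - mtime)) -le "${2:-30}" ]
}

# refresh_cache <file> <command...>: runs the command, captures its stdout,
# and publishes it with mktemp + mv. A concurrent refresher is skipped.
refresh_cache() {
  cache=$1; shift
  lock="$cache.lock"
  mkdir "$lock" 2>/dev/null || return 0        # another refresh is running
  tmp=$(mktemp "$cache.XXXXXX") || { rmdir "$lock"; return 1; }
  if "$@" > "$tmp"; then
    mv "$tmp" "$cache"
    status=$?
  else
    rm -f "$tmp"
    status=1
  fi
  rmdir "$lock"
  return "$status"
}
```

A statusline would typically call `cache_is_fresh "$file" || refresh_cache "$file" some-command &` and render whatever the cache currently holds, so the display never waits on the refresh.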
2026-04-11 — Fix-diff reviewer plugin agent migration
Branch: fix-diff-reviewer-migration. Rules: 20 (INV-001..020) + 5 prohibitions + 6 boundary conditions. QA rounds: 3. Findings fixed: 12 (7 non-blocking deferred to drift-debt as DRIFT-001..007). Migrated the /caudit step 6a inline fix-diff-reviewer prompt into a structured plugin agent at agents/fix-diff-reviewer.md, enabling structural enforcement of AP-012’s fix-round verification contract. First dogfood of ABS-010 (plugin-agent file contract, narrow) and ENV-007 (Claude Code plugin-agent loader). Added sync.sh propagation for agents/*.md, pinned tool allowlist {Read, Grep, Glob}, UNTRUSTED_DIFF/UNTRUSTED_RULES fences with data-treatment clause, pre-diff rule-body reads via git show $ROUND_START_SHA:... (defends against self-referential attacks), 100 KB Task-prompt budget (DD-010, fail-closed on overflow), jq -e . identity parse gate (INV-017), cardinality-1 canonical fail-closed marker (PRH-003), HTML sentinel comments delimiting step 6a (INV-020, resolves test audit B01 where heading-based extraction collided with INV-018’s required ## Path-scoped rules heading). Committed three historical fixture diffs (tests/fixtures/fix-diff-reviewer-historical-r{1,2,3}.diff) rescued from the author’s reflog with SHA-256 pins for VP-002 replay reproducibility. VP-001 + VP-002 executed live in a fresh Claude Code session after plugin reinstall — both PASS, all three responses parsed cleanly, findings_returned_per_replay: [11, 5, 3], all three PMB-002 regression layers covered with non-placeholder finding IDs. Added AP-013 (inline subagent system prompts in skill files). 139 tests (125 feature + 65 core + 57 drift + 11 allowed-tools, with 3 replay-report skips converted to PASS after VP-001/VP-002).
2026-04-26 — Harness Fingerprint + Model Upgrade Detection
Branch: feature/opus-4-7-compat. Rules: 31 (19 invariants + 6 prohibitions + 5 boundary conditions + 1 architectural). QA rounds: 1. Findings fixed: 1 (MA-HI-003 paired-flag infinite-loop class fix). Overrides: 3. Added a deterministic harness fingerprint mechanism — scripts/harness-fingerprint.sh writes the literal {model_name}|{HARNESS_VERSION} string to .correctless/meta/harness-fingerprint.json at every /cspec Step -1 and emits a one-time version_bumped advisory when the maintainer increments the integer constant in the script. New /cmodelupgrade skill (29th skill, sole writer of .correctless/meta/model-baselines.json) compares the current {model}+{HARNESS_VERSION} combination’s per-feature pipeline metrics (qa_rounds, total_tokens, total_cost_usd, phase_count) against the stored baseline using a three-tier bootstrap lookup (exact-match pool / pre-fingerprint pool / no-baseline). Strictly advisory (PRH-001) — every code path returns exit 0; sole-writer enforcement is structural via hooks/sensitive-file-guard.sh blocking Edit/Write AND Bash redirects (PRH-002 mitigates AP-022). Added ABS-027 (fingerprint store contract), ABS-028 (test-features baseline contract), templates/test-features/baseline.md reference feature scaffolded by /csetup Step 2.6, get_current_session_id() and locked_update_file() to scripts/lib.sh, harness_version field to /cverify’s calibration entries (BND-005 prerequisite), Harness: advisory line to /cstatus (Section 3a), harness warning section to /cauto’s Auto Run Report (INV-016). 110 tests in tests/test-harness-fingerprint.sh, 0 failures. Replaces the v1 LLM-probe approach (~250 LOC of probe orchestration removed during /creview-spec round 2) with a debuggable literal string and a manually-bumped integer.
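The literal-string fingerprint lends itself to a very small comparison. The sketch below is illustrative only: the HARNESS_VERSION value, the fingerprint_status name, and every status word other than version_bumped are assumptions, and the real script stores the string in a JSON file rather than taking it as an argument.

```shell
HARNESS_VERSION=3   # hypothetical value; the real integer lives in the script

# fingerprint_status <model_name> <stored fingerprint, or empty>
# Prints the current fingerprint on line 1 and a status word on line 2.
# Advisory by design: every path returns 0.
fingerprint_status() {
  current="$1|$HARNESS_VERSION"
  stored=$2
  echo "$current"
  if [ -z "$stored" ]; then
    echo "first_run"
  elif [ "$stored" = "$current" ]; then
    echo "unchanged"
  elif [ "${stored%|*}" = "$1" ]; then
    echo "version_bumped"        # same model, different integer
  else
    echo "model_changed"
  fi
  return 0
}
```

Because the fingerprint is a plain string, a mismatch can be debugged by reading two values side by side, which is the advantage claimed over the earlier probe approach.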
2026-04-28 — Harness-Fingerprint R2 Hardening
Branch: audit/harness-fingerprint. Rules: 28 (19 invariants + 5 prohibitions + 3 boundary conditions + 1 PAT-017 rule-file shape). QA rounds: 0 (implementation-only spec derived from the R2 audit transcripts; the spec itself encodes the class-level fixes that closed each bypass class found in R2). Three architectural pieces from the R2 audit of the harness-fingerprint R1 fix batch (which had a 71% defect rate). First piece: canonicalize_path in scripts/lib.sh — pure-bash segment-stack walker, total over arbitrary byte sequences, idempotent, ASCII-only . recognition (no Unicode-lookalike traversal — INV-002a), no shell expansion (literal bytes preserved — INV-004), <50ms on 1024-byte input. Second piece: hooks/sensitive-file-guard.sh refactor — replaced per-command dispatch in _extract_bash_targets (PRH-002) with an over-extracting default branch (every non-flag token becomes a candidate; the matcher filters per INV-006); redirect detection covers >, >>, 1>, 2>, &> in both whitespace-separated and inline-attached forms (INV-007); process substitution sub-tokenizes a single level (INV-007a); _has_write_pattern flags interpreter+eval-flag chains (INV-013); pattern matching pipes both target and protected pattern through canonicalize_path (INV-005, INV-008, PRH-004); a source-time v1 sentinel probe at hook load catches partial upgrades by exiting 2 fail-closed if canonicalize_path is missing or wrong (INV-005a). Third piece: --version flag and VERSION_OVERRIDE env-var stripped from scripts/harness-fingerprint.sh — HARNESS_VERSION=N becomes the sole production input (PRH-003 / INV-009). Tests now inject specific versions via a feature-specific helper tests/harness-fingerprint-test-helpers.sh make_test_harness_script (tmpdir copy + sed substitution; destination filename does NOT match the protected pattern per BND-003). Migration shipped as two commits per INV-011 — production-security decision and test-infrastructure decision are independently revertable. 
Closes AUTH-R2-001 (the testability flag was also the autonomous-bump escape hatch). Plus supporting wiring: new path-scoped rule file .claude/rules/canonicalize-path.md (PAT-017 — second dogfood of ABS-009 after PAT-001), setup upgrade-detection step that force-reinstalls pre-R2 harness-fingerprint.sh carrying VERSION_OVERRIDE (INV-014), regression test in tests/test-workflow-gate.sh for the extended _has_write_pattern (INV-013a), and PAT-016 promoted from AP-024 with mandatory glob-over-directory + count-match drift test. 533 tests pass across the 7 directly-affected suites (canonicalize-path 8, sensitive-file-guard 163, harness-fingerprint 119, workflow-gate 92, architecture-drift 106, stale-hook-detection 39, hook-sync 123). 0 failures. The R2 hardening closes the bug class, not the instances — each piece replaces an R1 patch the next R2 specialist round routed around.
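To make the segment-stack idea concrete, here is a minimal lexical normalizer written with parameter expansion only. It is a sketch, not the canonicalize_path in scripts/lib.sh: the handling of ".." above the root and of leading ".." in relative paths is an assumption, and it makes no claim about the timing bound quoted above.

```shell
# canonicalize_path <path>: collapses "//", drops ".", and pops on "..".
# No word splitting or globbing is involved, so bytes such as "*" stay literal,
# and only the ASCII strings "." and ".." are treated as special.
canonicalize_path() {
  path=$1
  case $path in /*) root=/ ;; *) root= ;; esac
  out=                                   # kept segments, joined by "/"
  rest=$path
  while [ -n "$rest" ]; do
    seg=${rest%%/*}                      # next segment (may be empty)
    case $rest in */*) rest=${rest#*/} ;; *) rest= ;; esac
    case $seg in
      ''|.) ;;                           # empty segment or ".": skip
      ..)
        case $out in
          ''|..|*/..)                    # nothing left to pop
            # Assumption: keep ".." for relative paths, clamp at "/" otherwise.
            if [ -z "$root" ]; then out=${out:+$out/}..; fi ;;
          */*) out=${out%/*} ;;          # pop the last segment
          *) out= ;;                     # pop the only segment
        esac ;;
      *) out=${out:+$out/}$seg ;;        # push
    esac
  done
  result=$root$out
  printf '%s\n' "${result:-.}"
}

# same_path <a> <b>: compare two paths after canonicalizing both sides.
same_path() {
  [ "$(canonicalize_path "$1")" = "$(canonicalize_path "$2")" ]
}
```

Canonicalizing both the candidate target and the protected pattern before comparing is what keeps a spelling such as `scripts/../scripts/./lib.sh` from slipping past a check written against `scripts/lib.sh`.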
2026-04-30 — Audit Findings Persistence Contract
Branch: feature/audit-findings-persistence-contract. Rules: 22 (10 invariants + 5 prohibitions + 2 boundary conditions + 4 environment assumptions + 1 architectural). QA rounds: 5. Findings fixed: 1. Overrides: 3. Corrective action for PMB-005. /caudit historically described its persistence step in skill prose; on 2026-04-26 the gate transitioned audit-done with no round-JSON written and /cmetrics reported the audit as 16 days stale when it had run a day prior. This feature ships three coupled mechanisms that close the gap structurally. First: cmd_audit_done precondition gate in hooks/workflow-advance.sh — refuses transition unless at least one round-JSON exists whose started_at field equals workflow state (content-based string equality, not mtime — robust to ENV-003 git-op timestamp drift). Validates .audit.type content before glob expansion against ^[a-z][a-z0-9-]{0,31}$, fails non-zero with explicit remediation message naming the expected glob and the started_at value. Second: scripts/audit-record.sh — sole writer (PAT-003 phase-transition CLI) with write-round and append-history subcommands; sources lib.sh, no mtime fallback in _state_file (MA-001 prevents cross-branch contamination), TTY-stdin guard prevents interactive hangs, trap-based tmp file cleanup, append-only history.md (PRH-004 — never truncates). Third: /cmetrics multi-signal staleness — max(history.md mtime, latest round-JSON mtime) with explicit “no data” label, plus a separate audit-done override counter for AP-023 routine-bypass detection. Sensitive-file-guard DEFAULTS extended to protect both scripts/audit-record.sh and .correctless/scripts/audit-record.sh against autonomous Edit/Write/Bash redirect (INV-009 — same harness-fingerprint sole-writer convention from CLAUDE.md learning 2026-04-26). Added ABS-029 (audit findings persistence contract), AP-026 (advisory-prose artifact-write contract), PMB-005 (the 2026-04-26 persistence gap). 
43 tests in tests/test-audit-findings-persistence.sh, 569 tests across 6 affected suites, 0 failures. Override count of 3 reflects mid-feature stale .claude/hooks/workflow-gate.sh requiring resync from source — symptom of a separate known-class issue (MA-009) deferred to follow-up.
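The precondition gate can be sketched as two steps: validate the audit type before it is ever used in a glob, then require a round file whose content carries the expected started_at. The function names, the `<dir>/<type>-*.json` layout, and the fixed-string grep are all assumptions; a real implementation would presumably extract the field with a JSON parser such as jq.

```shell
# audit_type_is_valid <type>: equivalent to ^[a-z][a-z0-9-]{0,31}$.
# Uses case patterns so an embedded newline cannot smuggle in a second line.
audit_type_is_valid() (
  LC_ALL=C
  case $1 in
    ''|[!a-z]*|*[!a-z0-9-]*) return 1 ;;
  esac
  [ "${#1}" -le 32 ]
)

# audit_done_allowed <dir> <audit_type> <started_at>
# Content-based: file mtimes are never consulted.
audit_done_allowed() {
  if ! audit_type_is_valid "$2"; then
    echo "ERROR: invalid audit type" >&2
    return 1
  fi
  for f in "$1/$2"-*.json; do
    [ -f "$f" ] || continue
    # Assumes pretty-printed JSON with one space after the colon.
    if grep -Fq -- "\"started_at\": \"$3\"" "$f"; then return 0; fi
  done
  echo "ERROR: no round JSON matching $1/$2-*.json with started_at=$3" >&2
  return 1
}
```

Validating before expansion matters because the type is interpolated into a path: a value such as `../x` would otherwise steer the glob outside the intended directory.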
2026-05-06 — carchitect Phase 3: Architecture Adherence Auditor
Branch: feature/carchitect-phase3-audit-adherence. Rules: 11. QA rounds: 1. Findings fixed: 0. Overrides: 2. Added an Architecture Adherence Checker agent to all three /caudit presets (QA, Hacker, Performance). The agent reads .correctless/ARCHITECTURE.md and mechanically checks PAT-xxx pattern compliance, ABS-xxx abstraction invariant adherence, TB-xxx trust boundary enforcement, and undocumented pattern detection. Includes dormant-signal fallback for missing/placeholder ARCHITECTURE.md, 30-day staleness warning, architecture_ref field in findings JSON for deduplication, and Regression Hunter context for recurring architecture violations. QA preset agent count updated from 4-6 to 5-7. Phase 3 of the /carchitect roadmap — the architecture document now drives auditing in addition to spec writing (Phase 2). 48 tests in tests/test-carchitect-phase3.sh.
2026-05-23 — Review Intelligence Consumer
Branch: feature/review-intelligence-consumer. Rules: 16 (11 invariants + 2 prohibitions + 3 boundary conditions). QA rounds: 1. Findings fixed: 0. Extended both review skills (/creview-spec, /creview) to read the cross-feature intelligence brief during Historical Pattern Integration/Findings, giving them aggregated historical signal from deferred findings, overrides, lens recommendations, and phase effectiveness that they previously lacked. Review skills read .correctless/meta/cross-feature-intel.json directly via jq with client-side occurrences >= 3 filtering — they never invoke the script (PRH-002), preserving the invariant that only /cspec triggers regeneration and occurrence count increments. A review-adapted anti-anchoring directive (weight-when/dismiss-when heuristics) precedes brief data to prevent cognitive anchoring. The script gains occurrence tracking with _dormant_counts metadata for entries that temporarily leave the brief, capped at 100 entries. ABS-037 updated from “idempotent” to “stateful” with both review skills listed as consumers. /cstatus gains threshold proximity reporting (occurrence-level breakdown). Review findings artifacts record intelligence consumption metadata. 58 tests in tests/test-review-intel-consumer.sh, 1 test file updated (test-allowed-tools-check.sh).
2026-05-24 — Documentation and Artifact Pruning Skill
Branch: feature/cprune-skill. Rules: 27 (19 invariants + 4 prohibitions + 4 boundary conditions + 1 architectural). QA rounds: 2. Findings fixed: 0. Overrides: 1. Added /cprune skill for periodic documentation and artifact pruning. Scanner script (scripts/prune-scan.sh, 777 lines) mechanically detects staleness candidates across 9 categories: architecture entries, antipatterns, CLAUDE.md learnings, orphaned artifacts, deferred findings, AGENT_CONTEXT.md count drift, cross-reference consistency, completed specs, and drift debt. Two execution modes: autonomous (invoked by /cauto, auto-executes low-risk actions only) and interactive (presents formatted report with per-category disposition options). Archive-not-delete design: documentation entries move to .correctless/ARCHITECTURE_DEPRECATED.md, .correctless/antipatterns-archived.md, or .correctless/CLAUDE_LEARNINGS_ARCHIVED.md with original IDs preserved. SFG protection for scanner and all three archive files (INV-016). /cauto integration at intensity-aware placement points (after /cupdate-arch at high+, after /cverify at standard). /cstatus pruning-recommended signal with threshold detection. CLAUDE.md entirely excluded from autonomous mode (PRH-002). Added ABS-038 (archive file contract). 116 tests in tests/test-cprune.sh.
2026-06-03 — Disallowed-Tools Frontmatter
Branch: feature/disallowed-tools-read-only-skills. Rules: 7. QA rounds: 2. Findings fixed: 0. Added disallowed-tools frontmatter to 12 read-only skills as defense-in-depth alongside existing allowed-tools (PAT-018 application). Group A (write-nothing: chelp, cstatus, cdashboard) disallows Edit, Write, MultiEdit, NotebookEdit, CreateFile. Group B (artifact-only: cexplain, cwtf, cmetrics, csummary, cpr-review, cmaintain, cmodel, cmodelupgrade, ctriage) disallows Edit, MultiEdit, NotebookEdit, CreateFile. R-007 drift test ensures new skills are classified. ENV-011 added for Claude Code v2.1.150 dependency. 117 tests in tests/test-disallowed-tools.sh.
2026-05-22 — Review-Driven Mini-Audit Lenses
Branch: feature/review-driven-mini-audit-lenses. Rules: 19 (13 invariants + 3 prohibitions + 3 boundary conditions). QA rounds: 1. Findings fixed: 0. Overrides: 1. Bridges review and mini-audit phases: review agents (/creview-spec at high+ intensity, /creview at standard intensity) write structured lens recommendations to .correctless/artifacts/lens-recommendations-{branch_slug}.json, /ctdd’s mini-audit consumes them via a custom lens agent template with UNTRUSTED_RECOMMENDATION fence (TB-003/TB-005 mitigation), outcomes are recorded for /cmetrics lens coverage reporting and /cwtf auditability. 8-agent budget cap per round (6 default + up to 2 recommended), priority heuristic for selection (CRITICAL/HIGH findings first, then source diversity), dormant degradation (PAT-019) when artifact absent, non-blocking cmd_done warning in scripts/wf/transitions.sh when outcomes missing. Added ABS-036 (lens recommendation artifact contract). 80 tests in tests/test-review-driven-lenses.sh, 2 test files updated (test-upgrade-compatibility-lens.sh, test-ux-review-lens.sh).
2026-06-12 — AP-031 Fixture Divergence Prevention
Branch: feature/ap031-fixture-divergence-prevention. Rules: 6. QA rounds: 4. Findings fixed: 18. Overrides: 3. Two-layer prompt-level prevention for AP-031 (test fixtures diverging from real producer output), motivated by PMB-010 (sync script silently imported 0 of 25 findings) and PMB-011 (scanner shipped with 17 false positives). Layer 1: format-pinning directive in skills/cspec/SKILL.md Step 3 — when a feature parses another Correctless tool’s output, the spec must pin the exact format (heading regex, JSON schema, field names) and cite the producer file path as the authoritative source, with trigger-detection heuristics and a worked Example/Not contrast. Layer 2: real-fixture requirement in agents/ctdd-red.md (at least one fixture must be a verbatim excerpt of a real artifact with a language-aware Source: citation — # Source: / // Source: / -- Source:; live-read-only is insufficient because .correctless/artifacts/ is gitignored and absent in CI) plus fixture provenance check 11 in the /ctdd test audit (BLOCKING for synthetic-only or live-read-only suites; orchestrator passes labeled MODIFIED_TEST_FILES:/UNTRACKED_TEST_FILES: lists with fail-loud fallback when a label is absent; repo-relative path validation; TB-003 treat-fixtures-as-data fence; 10-fixture follow budget). Writer and auditor share a producer-to-artifact reference table for dormant detection — new producer + consumer in the same PR means no real artifact exists, the requirement goes dormant, and Layer 1 is the sole guard. AP-031 antipattern entry gains a “Prevention implemented” note: a recurrence now means the prevention failed and is a postmortem trigger, not just a PAT-020 promotion. 39 block-scoped tests in tests/test-ap031-fixture-divergence.sh (awk section extraction before grepping — AP-003 mitigation), 8 QA/mini-audit class fixes pinned as regression assertions. Test count 92→93 synced in AGENT_CONTEXT.md and CONTRIBUTING.md.
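The provenance check is prompt-level in this feature, but its mechanical core is easy to picture. The sketch below is a hypothetical illustration: both function names are invented, and the exact citation grammar beyond the three comment prefixes named above is an assumption.

```shell
# has_source_citation <fixture>: true when some line is a Source: citation
# in one of the three comment styles (#, //, --) followed by a non-empty value.
has_source_citation() {
  grep -Eq '^[[:space:]]*(#|//|--)[[:space:]]*Source:[[:space:]]*[^[:space:]]' "$1"
}

# cited_path_is_repo_relative <path>: rejects empty, absolute, and
# parent-escaping paths.
cited_path_is_repo_relative() {
  case $1 in
    ''|/*|..|../*|*/..|*/../*) return 1 ;;
  esac
  return 0
}
```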
2026-06-14 — Slug-Type-Aware Artifact Classification in prune-scan.sh
Branch: feature/prune-scan-slug-aware-matching. Rules: 24 (18 invariants + 2 prohibitions + 2 boundary conditions + 1 extended environment assumption + 1 antipattern-scan rule). QA rounds: 1. Findings fixed: 10. Closes the 2nd AP-032 instance: scripts/prune-scan.sh’s scan_artifacts previously treated all artifact_patterns entries as branch-slug-named and matched filenames via substring search, which would have flagged live task-slug-named workflow artifacts (qa-findings-*.json, audit-mini-*.json) as low-risk deletion candidates for autonomous /cprune. This feature ships _classify_artifact_pattern — a total function over artifact_patterns mapping every pattern to one of branch-slug/task-slug/session-slug/unclassified — plus delimited-token slug matching ([[ $f =~ ^(.+-)?$slug([-.]|$) ]]), the structural prune-scan-substring-match rule in scripts/antipattern-scan.sh banning substring primitives, and a wrapped-object JSON schema migration {candidates, skipped_unclassified, protection_set, protection_status} (consumers /cprune and /cstatus migrated to read .candidates). Six fail-closed paths added at safety-belt boundaries: empty live-branch-slug set, empty live-task-slug set, missing realpath/readlink -f (PAT-020 — never silently lexical-fallback for symlink-equivalence decisions), workflow-state mid-write TOCTOU via content-based identity (started_at → composite → sha256 fallback chain, extending ABS-029), non-git BASE_DIR, lib.sh sourcing failure. Baseline manifest under .correctless/meta/ (ABS-040) — sole writer is prune-scan.sh --update-baseline, never auto-set as a scan side effect; newly-added patterns emit at risk: "medium" until human acknowledgement. ERE-metachar escape (_escape_ere_metachars) + slug validation (_slug_is_safe) provides defense-in-depth against malformed slug interpolation. 
Producer-pattern table in the spec is the source of truth for the (pattern to slug-type) mapping — INV-008 cross-references the table against _classify_artifact_pattern case branches and the artifact_patterns= assignment line at CI time. AP-031 satisfied by tests/fixtures/prune-scan/wfstate-real-sample.json — verbatim excerpt of a real workflow-state JSON. Added ABS-039 (slug-type classification), ABS-040 (baseline manifest), PAT-020 (fail-closed realpath probe); AP-032 frequency 1 to 2 (3rd instance promotes to PAT-xxx). 61 tests in tests/test-prune-scan-slug-aware.sh, plus antipattern-scan rule registration test.
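The difference between substring search and the delimited-token match quoted above is easiest to see side by side. The regex below is the one from the entry; escape_ere and matches_slug are hypothetical stand-ins for _escape_ere_metachars and the real matching code, and the sketch requires bash for `[[ =~ ]]`.

```shell
# escape_ere <string>: backslash-escape ERE metacharacters so the slug is
# matched literally when interpolated into a regex.
escape_ere() {
  printf '%s' "$1" | sed 's/[][\.^$*+?(){}|]/\\&/g'
}

# matches_slug <filename> <slug>
# The slug must start at the beginning or right after a "-", and must be
# followed by "-", "." or end of string.
matches_slug() {
  local f=$1 slug re
  slug=$(escape_ere "$2")
  re="^(.+-)?${slug}([-.]|\$)"
  [[ $f =~ $re ]]
}
```

With slug `add-login`, the file `qa-findings-add-login.json` matches, while `readd-login.json` and `qa-findings-add-loginpage.json` do not; a plain substring test would have accepted all three.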
2026-06-15 — Fix-diff reviewer class-shaped bug lens
Branch: feature/fix-diff-reviewer-class-shaped-bugs. Rules: 21 (CS-001..CS-021). QA rounds: 3. Findings fixed: 10 QA + 16 mini-audit (across 2 mini-audit rounds). Overrides: 3 (override-command attempts; the full suite ran green at the final audit-mini and done gates via a per-test retry wrapper). Added a class-shaped bug detection lens to the fix-diff reviewer (grep for sibling instances of a scope-narrowed fix before approving; SIBLING-DEFERRED carve-out with PR-base provenance) plus an SFG lift-and-restore backstop subsystem (sentinel + final-state check script + cmd_done gate + dedicated CI job + downstream propagation + ABS-041). The /caudit Step 6a fence producer was hardened class-wide against argv overflow and fence-delimiter injection (nonce-delimited fences, all artifact-sized data via stdin/file, reserved-tail close fences).
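A nonce-delimited fence can be sketched in a few lines. The marker syntax and the fence_untrusted name are hypothetical, and the sketch does not attempt the reserved-tail behaviour mentioned above; it only shows why a per-run nonce defeats a forged close marker and why the data arrives on stdin rather than argv.

```shell
# fence_untrusted <label>: reads untrusted data on stdin and prints it between
# open and close markers that share a fresh random nonce. Fails closed if the
# nonce happens to occur in the data. NUL bytes are not preserved.
fence_untrusted() {
  label=$1
  nonce=$(od -An -N16 -tx1 /dev/urandom | tr -d ' \n')
  data=$(cat; printf x)
  data=${data%x}                      # keeps trailing newlines intact
  case $data in
    *"$nonce"*) echo "ERROR: fence collision" >&2; return 1 ;;
  esac
  printf '<<<%s %s>>>\n%s\n<<<END %s %s>>>\n' \
    "$label" "$nonce" "$data" "$label" "$nonce"
}
```

Because the attacker-controlled diff cannot know the nonce in advance, a line inside the data that imitates a close marker is still just data to the consumer.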
2026-06-16 — Cross-Model Spec Review via codex
Branch: feature/cross-model-spec-review. Rules: 23. QA rounds: 1. Findings fixed: 9. Activates the dormant external-review path in /creview-spec so codex (GPT-5.5) runs as a first-class adversarial spec reviewer alongside Claude’s six agents. Ships scripts/external-review-run.sh — the sole-writer producer (ABS-042) that invokes codex --sandbox read-only against the whole spec on stdin, validates the config-sourced invocation against a closed allowlist (INV-017), and parse-gates + bounds (4 MiB ceiling) + nonce-fences the untrusted codex output (INV-002/009/019, reusing build-caudit-prompt.sh) before it reaches Claude’s synthesis context. codex findings are renamespaced EXT-NNN, marked Source: codex (external), advisory-only through the Step 4 human disposition gate (PRH-003) — never auto-incorporated. Egress to OpenAI is disclosed at config time (INV-014) and per run (INV-022); the path is auto-off when codex is absent (INV-005). scripts/config-update.sh is the sanctioned non-redirect writer of the external-review config fields (BND-003); /csetup auto-detects an installed codex CLI. Added TB-008 (external output → Claude synthesis), TB-001c (structured config → argv, no eval), ABS-042 (sole-writer producer). The mini-audit caught a CRITICAL the other lenses missed: a tampered config omitting --sandbox would have run codex unsandboxed — fixed by injecting --sandbox read-only unconditionally and stripping config attempts (closes the AP-022 dead-code-in-security-paths class), re-verified by a round-2 hostile-input re-attack. 110 tests in tests/test-external-review.sh.
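The fix for the unsandboxed-run finding, strip whatever the config says about the sandbox and inject the safe value unconditionally, can be sketched as below. build_codex_args is a hypothetical name, the one-argument-per-line output is a simplification (it would break on arguments containing newlines), and the allowlist validation from INV-017 is not shown.

```shell
# build_codex_args <config-supplied args...>: prints the final argument list,
# one argument per line. Config attempts to set --sandbox are dropped in both
# the "--sandbox VALUE" and "--sandbox=VALUE" forms.
build_codex_args() {
  echo "--sandbox"
  echo "read-only"
  skip_next=0
  for arg in "$@"; do
    if [ "$skip_next" -eq 1 ]; then
      skip_next=0
      continue
    fi
    case $arg in
      --sandbox)   skip_next=1 ;;       # also drop the value that follows
      --sandbox=*) ;;
      *) printf '%s\n' "$arg" ;;
    esac
  done
}
```

The design choice is that safety does not depend on the config being well-formed: a config that omits the flag and a config that sets it to something weaker both end up with the same read-only sandbox.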