Live adversarial penetration testing against a running system with source code access and a specific objective to accomplish.
/caudit hacker. Red teaming is live exploitation toward a specific objective, not a checklist. It runs after Hacker Olympics fixes are applied. The recommended sequence is: Hacker Olympics (static, finds code-level flaws), then Red Team (live, proves exploitability), then Devil’s Advocate (periodic, challenges the security approach itself). The red team finds integration gaps, runtime behavior, and defense chain failures that code reading alone does not surface.
Requires high intensity or above.
Your objective is: “Starting from an unauthenticated session, access internal service metrics.”
The agent reads the source and maps trust boundaries. It finds that the public API gateway validates auth tokens, but an internal service discovery endpoint (/internal/services) was added for health monitoring and sits outside the auth middleware chain.
The agent crafts a request to /internal/services and receives a list of internal service URLs. One service exposes a /metrics endpoint that returns system metrics including database connection strings. The agent chains: unauthenticated request to service discovery, then direct request to the metrics endpoint using the internal service URL.
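The two-step chain described above can be sketched as a short probe. This is only an illustration under stated assumptions: the response format of /internal/services (a JSON array of service base URLs), the helper names `http_get` and `find_exposed_metrics`, and the host names in the usage note are all hypothetical and not taken from the tool itself.

```python
import json
from typing import Callable, Optional
from urllib.parse import urljoin
from urllib.request import urlopen

Fetch = Callable[[str], str]


def http_get(url: str) -> str:
    """Plain GET with no token or cookie attached (unauthenticated session)."""
    with urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8", errors="replace")


def find_exposed_metrics(base_url: str, fetch: Fetch = http_get) -> Optional[str]:
    """Return the first non-empty /metrics body reachable via service discovery."""
    # Step 1: ask the discovery endpoint that sits outside the auth middleware.
    # Assumption: it answers with a JSON array of internal service base URLs.
    services = json.loads(fetch(urljoin(base_url, "/internal/services")))

    # Step 2: request /metrics on each discovered service directly.
    for service_url in services:
        try:
            body = fetch(urljoin(service_url, "/metrics"))
        except OSError:
            continue  # unreachable or refused; try the next service
        if body:
            return body
    return None
```

Against a live target, a call such as `find_exposed_metrics("http://target.example")` would run both steps with real requests; passing a stub as `fetch` lets the same logic be exercised offline.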
Objective achieved via SSRF through internal service discovery. The report includes:
/internal/services behind auth middleware, restrict metrics endpoints to loopback

| Reads | Writes |
|---|---|
| Source code (full access) | Report (.correctless/artifacts/redteam/report-{date}.md) |
| | ARCHITECTURE.md |
| | Regression tests |
| Running target (live requests) | Updated antipatterns (.correctless/antipatterns.md) |
| | Token log (.correctless/artifacts/token-log-{slug}.json) |
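The remediation named in the example report (auth in front of /internal/services, metrics limited to loopback) might be captured as a small policy function with regression checks alongside it. This is a minimal sketch, not the project's actual middleware: `is_request_allowed`, `PUBLIC_PATHS`, and the /health entry are hypothetical names chosen for illustration.

```python
import ipaddress

# Hypothetical allow-list of paths that need no token.
PUBLIC_PATHS = {"/health"}


def is_request_allowed(path: str, token_valid: bool, remote_addr: str) -> bool:
    """Decide whether a request may proceed past the gateway."""
    # Metrics endpoints answer only to loopback callers.
    if path == "/metrics" or path.startswith("/metrics/"):
        return ipaddress.ip_address(remote_addr).is_loopback
    if path in PUBLIC_PATHS:
        return True
    # Everything else, including /internal/*, requires a valid token.
    return token_valid


# Regression checks for the chain found in the example run.
assert not is_request_allowed("/internal/services", False, "203.0.113.7")
assert not is_request_allowed("/metrics", True, "10.0.0.5")
assert is_request_allowed("/metrics", False, "127.0.0.1")
```

Defaulting to "token required" means a newly added internal route is protected unless someone explicitly lists it as public, which is the failure mode the example run exploited.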
Agent structure:
| Mode | When to Use | Agents |
|---|---|---|
| Solo (default) | Most assessments | 1 agent, all angles |
| Team | Complex targets with multiple attack surfaces | 2-3 agents: Primary Interface Attacker, Control Plane Attacker, Resilience Attacker |
Overnight runs: For infrastructure targets, the red team can run unattended on an isolated VPS with --dangerously-skip-permissions and --max-turns 200. Read the report in the morning.
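An unattended run could be launched from a small wrapper script on the isolated VPS. The sketch below only assembles the argument list; it assumes the `claude` CLI in non-interactive print mode (`-p`), and the prompt text and the function name `overnight_command` are placeholders rather than the skill's documented invocation.

```python
from typing import List


def overnight_command(prompt: str, max_turns: int = 200) -> List[str]:
    """Build the argv for an unattended run; nothing is executed here."""
    return [
        "claude",                           # assumed CLI binary
        "-p", prompt,                       # assumed non-interactive mode
        "--dangerously-skip-permissions",   # only on an isolated host
        "--max-turns", str(max_turns),
    ]


cmd = overnight_command("Run the red team against the staging stack.")
```

The resulting list can be handed to `subprocess.run(cmd)` once the target stack is up; keeping command construction separate makes it easy to log exactly what was run overnight.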
docker-compose.yml that stands up the full stack (app, database, supporting services, mock third-party APIs) on an isolated Docker network.