Make reversible work autonomous; keep a human gate on anything that touches prod.
The first earned level. You let the agent run reversible work on its own, while every path to production stays behind a human gate. This is the level where most of the durable wiring gets built. Target: fewer than 10 tickets in flight.
CLAUDE.md present with basic workflow instructions. If it's absent, create one.
Issue tracking wired (Linear). Track every idea here, organized. A letter+number scheme (e.g. A1, T3) works well; marking dependencies is optional at this stage.
GitHub & CI
GitHub connected and able to push.
Auto-merge enabled after CI passes, so agents don't poll CI manually.
Credentials to check PR status and CI/CD pipelines.
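As an illustration of what "check PR status and CI" can mean in practice, here is a minimal sketch that reduces a GitHub check-runs response (the JSON returned by `GET /repos/{owner}/{repo}/commits/{ref}/check-runs`) to a single verdict. The function name `ci_verdict` and the three verdict strings are assumptions made for this example, not part of any tool named above.

```python
from typing import Any

# Conclusions treated as non-blocking. This set is an assumption for the sketch.
OK_CONCLUSIONS = {"success", "neutral", "skipped"}


def ci_verdict(check_runs_payload: dict[str, Any]) -> str:
    """Collapse a check-runs payload into 'passed', 'failed', or 'pending'."""
    runs = check_runs_payload.get("check_runs", [])
    if not runs:
        return "pending"  # nothing has reported yet

    completed = [r for r in runs if r.get("status") == "completed"]
    # Any finished run with a bad conclusion fails the whole commit,
    # even if other runs are still in progress.
    if any(r.get("conclusion") not in OK_CONCLUSIONS for r in completed):
        return "failed"
    if len(completed) < len(runs):
        return "pending"
    return "passed"


if __name__ == "__main__":
    sample = {
        "total_count": 2,
        "check_runs": [
            {"name": "lint", "status": "completed", "conclusion": "success"},
            {"name": "tests", "status": "in_progress", "conclusion": None},
        ],
    }
    print(ci_verdict(sample))
```

With auto-merge enabled the agent rarely needs this verdict to merge; it is mainly useful for reporting why a PR is still open.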
Autonomous workflows
Walk through each autonomous workflow and confirm the sandbox environment has the right tools (e.g. Playwright + browser automation). Test each workflow in-session.
Configure sandbox credentials and software.
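One way to run the sandbox check above is a small preflight script that reports which required command-line tools are missing from the PATH. This is a sketch under assumptions: the tool list is hypothetical and should be replaced with whatever your own workflows need.

```python
import shutil


def missing_tools(required: list[str]) -> list[str]:
    """Return the tools from `required` that are not found on PATH."""
    return [tool for tool in required if shutil.which(tool) is None]


if __name__ == "__main__":
    # Hypothetical requirements for a browser-automation workflow.
    required = ["git", "node", "npx"]
    missing = missing_tools(required)
    if missing:
        print("Sandbox is missing:", ", ".join(missing))
    else:
        print("All required tools found.")
```

Finding a binary on the PATH only shows it is installed, so it still pays to run each workflow once in-session as described above.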
Credentials & automation — the safety spine
This is the part that earns Warp 2. The principle is simple: agents should be able to do a lot, but never hold the keys to disaster.
Auto-deploy to production via GitHub Actions and its stored secrets, never from the agent's hands.
Don't hand agents SSH keys directly. Grant access indirectly via GitHub Actions so you can revoke it any time without leaking keys.
Route high-compute or sensitive operations through GitHub Actions (deploys to Cloudflare/AWS, even experiments on your own GPU rigs) — deterministic and auditable.
Guard against disastrous data loss on three fronts. Require approval from a second account to merge into main/prod. Give the agent a dedicated user with restricted read/write rights, and back up experiment data somewhere that user cannot delete it. Enable GitHub's deletion protection (for example, a branch protection rule or ruleset that blocks branch deletion and force pushes), and configure it from your local machine, which holds your credentials, rather than from the agent's session.
Treat GitHub Actions as the vault for any fixed script that needs secrets the agent shouldn't see raw.
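The merge and deletion guards above can be applied through GitHub's branch protection REST endpoint (`PUT /repos/{owner}/{repo}/branches/{branch}/protection`). The sketch below only builds the request, so it can be inspected before anything is sent. The function name, the single required review, and the empty status-check list are assumptions for illustration. It is meant to be run by a human from the machine that holds admin credentials, not by the agent.

```python
import json
import urllib.request


def build_protection_request(
    owner: str, repo: str, branch: str, token: str
) -> urllib.request.Request:
    """Build (but do not send) a branch protection request for GitHub."""
    payload = {
        # The API requires these four keys; null means "not configured".
        "required_status_checks": {"strict": True, "contexts": []},
        "enforce_admins": True,
        "required_pull_request_reviews": {"required_approving_review_count": 1},
        "restrictions": None,
        # The guards this section cares about most.
        "allow_deletions": False,
        "allow_force_pushes": False,
    }
    url = f"https://api.github.com/repos/{owner}/{repo}/branches/{branch}/protection"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="PUT",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Hypothetical repository; a real run would read the token from the environment.
    req = build_protection_request("example-org", "example-repo", "main", "<admin-token>")
    print(req.get_method(), req.full_url)
    print(json.loads(req.data))
    # To apply it: urllib.request.urlopen(req)
```

Because the agent's dedicated user has no admin rights, it cannot undo these settings even if a prompt tells it to.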
Experiments as a third deploy target
The same Actions path you use to ship prod also unlocks a third deploy target: experiment runs on hardware you choose. An agent triggers a workflow through the workflow_dispatch event, the job runs on the runner you picked (GitHub-hosted, or self-hosted on your own GPU rig or lab box), and the agent reads the result files back. The secrets and the runner registration stay in Actions, never in the agent's hands.
Scope it to the dispatchable, non-interactive slice: runs you can fire off with inputs and read artifacts from afterward. Live GPU debugging and GUI-driven tools stay on hands-on remote control — the two coexist.
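A dispatchable run of the kind described above can be triggered through GitHub's REST endpoint `POST /repos/{owner}/{repo}/actions/workflows/{workflow_id}/dispatches`. The following is a minimal sketch: the workflow file name `experiment.yml`, the input names, and the helper name are hypothetical, and the token is assumed to be a narrowly scoped credential that can only trigger workflows, not one that can read the secrets the job uses.

```python
import json
import urllib.request


def build_dispatch_request(
    owner: str,
    repo: str,
    workflow_file: str,
    ref: str,
    inputs: dict[str, object],
    token: str,
) -> urllib.request.Request:
    """Build (but do not send) a workflow_dispatch trigger request."""
    body = {
        "ref": ref,  # branch or tag whose workflow file should run
        # Inputs are sent as strings here; the workflow declares their types.
        "inputs": {name: str(value) for name, value in inputs.items()},
    }
    url = (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/actions/workflows/{workflow_file}/dispatches"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Hypothetical experiment with two inputs.
    req = build_dispatch_request(
        "example-org", "example-repo", "experiment.yml", "main",
        {"learning_rate": 0.001, "seed": 7}, "<dispatch-token>",
    )
    print(req.get_method(), req.full_url)
    print(json.loads(req.data))
    # To fire the run: urllib.request.urlopen(req)
```

Once the run finishes, its result files can be listed with `GET /repos/{owner}/{repo}/actions/runs/{run_id}/artifacts`, which keeps the whole loop non-interactive.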
The safety spine is non-negotiable before you climb. Warp 3 and 4 add more agents and more autonomy on top of exactly this foundation — if the blast radius isn't contained here, it only grows.