Experiments as Deployments
Run experiments on your own hardware — dispatched through Actions, results read back. The agent never holds the keys.
You already ship two things through GitHub Actions: a frontend website and a backend API. This adds a third target — local experiment runs — so an agent can launch experiments on a machine youcontrol and read the results back, without ever holding the runner's keys. It is an extension of the Warp 2 safety spine, not a new mechanism.
Three kinds of deployment
Everything ships through GitHub Actions. The agent triggers work but never holds the keys — the pipeline does. There are three deploy targets:
workflow_dispatch (agent or human), runs on a runner you choose (GitHub-hosted, or self-hosted on your own hardware), and produces result files the agent reads back.An experiment run is the backend-API-shapedone: it's invoked, runs to completion on the chosen host, and emits artifacts — rather than serving live traffic. The key property across all three: the agent stays one step removed from the credentials. It opens PRs and dispatches workflows; Actions holds the secrets and the runner registration.
What this fits — the dispatchable slice
This path is for dispatchable, non-interactiveexperiment runs: work you can express as a command with inputs, fire off, and read the artifacts from afterward. That's exactly the slice that benefits from batched async delegation — queue a set of runs, approve them once, let the agent dispatch them, and come back to results.
Triggering an experiment run
Use a GitHub Actions workflow_dispatch trigger — a manual/programmatic dispatch that takes inputsfor the things you'll vary between runs (hyperparameters, dataset path, config). The agent (or a human) dispatches it; the run lands on whichever host you've pointed it at. A minimal, illustrativetemplate (provider-agnostic — adapt it, don't ship it as-is):
# .github/workflows/experiment.yml
name: experiment
on:
workflow_dispatch:
inputs:
run_name: { description: "Label for this run", type: string, required: true }
lr: { description: "Learning rate", type: string, default: "3e-4" }
dataset: { description: "Dataset id or path", type: string, required: true }
jobs:
run:
# GitHub-hosted: `runs-on: ubuntu-latest`
# Self-hosted on YOUR hardware: target your runner's labels, e.g.:
runs-on: [self-hosted, gpu, lab-box]
steps:
- uses: actions/checkout@v4
- name: Run experiment
run: |
./run_experiment.sh \
--name "${{ inputs.run_name }}" \
--lr "${{ inputs.lr }}" \
--dataset "${{ inputs.dataset }}" \
--out ./results
# …then an upload step — see "Where results go" below.GitHub-hosted vs. self-hosted runners
The host choice is yours:
Where results go
The run produces files; where they land is up to you. There is no single mandated default — pick by how tightly you want the agent integrated vs. how much extra infra you're willing to stand up.
actions/upload-artifact) — zero extra infra, results attach straight to the run. The catch: artifacts are ephemeral / retention-limited (they expire) — good for quick iteration, not a durable record.Many teams mix these — artifacts for fast inner-loop iteration, plus a durable copy to CloudClawer storage or Drive for the runs worth keeping.
The public-upstream + private-mirror pattern
If your experiment code lives in a public repo, you must not put your secrets, environments, or self-hosted-runner registration in it. The pattern that solves this is a public upstream + per-customer private mirror:
The two streams stay in sync in two directions. Sync DOWN — pull public → private (routine):
git remote add upstream <public-repo-url> # one-time
git fetch upstream
git merge upstream/main # or: git rebase upstream/main
git push origin main # update your private mirrorSync UP — push private → public (occasional): contribute non-secret improvements back upstream via a PR from a clean branch or a cherry-pick of the specific non-secret commits — deliberately excluding anything secret. Never push a branch that contains customer config or credentials to the public remote.
Guardrails — do these before the first dispatch:
.gitignoreevery local secret file so it can't be committed.private/ directory or *.local.* files) so it never rides an upstream PR.Safety-spine alignment
This is the same Warp 2 contract as production deploys — just a third target: