devin.baby is a mini Devin focused on the core software-engineering loop: submit work, get an isolated runtime, run the agent, stream progress, and persist results in /workspace.
Sandboxes are an internal implementation detail. Users submit Tasks.
Kubernetes is the control plane. Firecracker microVMs are the execution plane. The runtime HTTP contract never changes — the agent only knows POST /run, POST /terminal, POST /git/*, and GET /events.
flowchart TB
User --> Web
Web --> Server
Server --> Scheduler
Scheduler --> Queue
Queue --> Orchestrator
Orchestrator --> SandboxCRD["Sandbox CRD"]
SandboxController --> SandboxCRD
SandboxController --> MachineCRD["FirecrackerMachine CR"]
MachineController --> MachineCRD
MachineController --> HostSelect["Firecracker Host Selection"]
HostSelect --> FCHost["firecracker-host daemon"]
FCHost --> SnapshotPool["Warm Snapshot Pool"]
SnapshotPool --> microVM["Firecracker microVM"]
microVM --> Runtime["Runtime Supervisor"]
Scheduler --> Runtime
Runtime --> Agent
Scheduler --> Events
Events --> Web
- User → `POST /api/v1/tasks` `{ "prompt": "...", "agent": "cursor" }`
- Server authenticates and forwards to Scheduler
- Scheduler enqueues work and emits `task.created`
- Worker creates a Sandbox CRD via Orchestrator (internal API)
- Sandbox controller creates a FirecrackerMachine CR (no Pods)
- Machine controller selects a FirecrackerHost, clones a warm snapshot, boots the microVM
- Runtime supervisor starts inside the VM and exposes the fixed HTTP contract
- Scheduler opens `GET /events?taskId=...` and calls `POST /run` with the selected agent
- Cursor CLI or Claude Code runs headlessly inside `/workspace`
- Agent output streams over SSE: `GET /api/v1/tasks/{id}/events`
- Scheduler deletes the sandbox when the task finishes
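The last steps of the flow above deliver agent output as Server-Sent Events. The following is a minimal sketch of how a client might split a received chunk into events. The `TaskEvent` shape and the `parseSseChunk` name are assumptions for illustration; the source names event types such as `task.created` and `agent.log` but does not specify their payloads.

```typescript
// Hypothetical event shape: only the event type names come from this README.
interface TaskEvent {
  type: string;
  data: string;
}

// Parse a chunk of text/event-stream into events. Frames are separated
// by a blank line; each frame carries optional "event:" and "data:" fields.
function parseSseChunk(chunk: string): TaskEvent[] {
  const events: TaskEvent[] = [];
  for (const frame of chunk.split(/\r?\n\r?\n/)) {
    let type = "message"; // SSE default when no "event:" field is present
    const dataLines: string[] = [];
    for (const line of frame.split(/\r?\n/)) {
      if (line.startsWith(":")) continue; // comment or keep-alive line
      if (line.startsWith("event:")) {
        type = line.slice(6).trim();
      } else if (line.startsWith("data:")) {
        dataLines.push(line.slice(5).replace(/^ /, ""));
      }
    }
    // Frames without data (keep-alives, trailing empties) are dropped.
    if (dataLines.length > 0) {
      events.push({ type, data: dataLines.join("\n") });
    }
  }
  return events;
}

const sample =
  'event: task.created\ndata: {"taskId":"t1"}\n\n' +
  ": keep-alive\n\n" +
  "event: agent.log\ndata: installing deps\n\n";

const parsed = parseSseChunk(sample);
console.log(parsed.map((e) => e.type).join(",")); // task.created,agent.log
```

This sketch assumes each chunk ends on a frame boundary. A real client would buffer partial frames across network reads before parsing.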
Tasks choose an agent provider that runs inside the sandbox microVM:
| Agent | CLI | Auth env | Runtime image |
|---|---|---|---|
| `mock` (Template) | control-plane scaffold + sandbox verify | `OPENAI_API_KEY` (draft planning) | `nextjs` (greenfield default) |
| `cursor` | `agent -p --force --trust` | `CURSOR_API_KEY` | `agent` |
| `claude` | `claude -p --bare` | `ANTHROPIC_API_KEY` | `agent` |
Greenfield tasks default to the Template agent: OpenAI generates a draft plan on the scheduler, scaffold-from-draft materializes the repo, and the nextjs microVM runs npm install plus a smoke check before push. No Cursor CLI or api2.cursor.sh egress is required.
The scheduler never shells into the host. It only talks HTTP to the runtime supervisor, which invokes the agent CLI inside the Firecracker VM (Cursor/Claude only when explicitly selected):
POST /tasks
→ Sandbox CRD (runtime=nextjs for template, runtime=agent for cursor/claude)
→ Firecracker microVM
→ POST /run { taskId, prompt, agent } (skipped for template greenfield)
→ GET /events?taskId=... (agent.log, git.*)
→ SSE /tasks/{id}/events
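The mapping from the requested agent to a runtime image, and the decision whether `POST /run` is called at all, can be sketched as a lookup table. The values come from the agent table and the flow above; the names `AgentPlan`, `AGENT_PLANS`, and `planForTask` are hypothetical and not taken from the scheduler's code.

```typescript
type Agent = "mock" | "cursor" | "claude";

interface AgentPlan {
  runtime: "nextjs" | "agent"; // runtime image requested on the Sandbox CRD
  cli: string | null;          // CLI invoked inside the microVM, if any
  authEnv: string;             // credential the provider needs
  callsRun: boolean;           // whether the scheduler calls POST /run
}

// Values mirror the agent table; the structure itself is an illustration.
const AGENT_PLANS: Record<Agent, AgentPlan> = {
  // Template greenfield: planned on the scheduler, so POST /run is skipped.
  mock: { runtime: "nextjs", cli: null, authEnv: "OPENAI_API_KEY", callsRun: false },
  cursor: {
    runtime: "agent",
    cli: "agent -p --force --trust",
    authEnv: "CURSOR_API_KEY",
    callsRun: true,
  },
  claude: {
    runtime: "agent",
    cli: "claude -p --bare",
    authEnv: "ANTHROPIC_API_KEY",
    callsRun: true,
  },
};

// Omitting the agent falls back to the Template ("mock") agent.
function planForTask(agent?: Agent): AgentPlan {
  return AGENT_PLANS[agent ?? "mock"];
}

console.log(planForTask().runtime, planForTask("cursor").runtime); // nextjs agent
```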
Create a task with Cursor or Claude Code (optional):
curl -X POST http://localhost:8080/api/v1/tasks \
-H 'Content-Type: application/json' \
-d '{"prompt":"Add JWT auth to the Next.js app","agent":"cursor"}'

For greenfield work without Cursor, omit agent or set "agent":"mock". With OPENAI_API_KEY configured, the scheduler plans the scaffold, hydrates it in the sandbox, verifies dependencies, and pushes to GitHub.
devin/
├── apps/
│ ├── web/ # Dashboard
│ ├── server/ # API gateway (auth + task proxy)
│ ├── scheduler/ # Task queue worker + SSE events
│ ├── orchestrator/ # Sandbox CRD controller + internal API
│ ├── firecracker-host/ # Node daemon: VM pool + snapshot manager
│ └── runtime/ # In-VM supervisor (PID 1)
├── packages/
│ ├── orchestrator/ # K8s reconciliation logic
│ ├── sandbox/ # Sandbox + Firecracker CRD types
│ ├── scheduler/ # Task scheduling library
│ ├── services/
│ │ ├── email/ # Resend client
│ │ └── queue/ # Task queue (memory + SQS)
│ ├── events/ # Event bus + SSE helpers
│ └── agent-sdk/ # Runtime HTTP client contract
├── deploy/
│ └── helm/ # Helm chart scaffold
└── runtime/ # agent, nextjs, go, rust, node, python → snapshots
| Namespace | Workloads |
|---|---|
| `devin-app` | web, server |
| `devin-system` | orchestrator |
| `devin-sandboxes` | Sandbox + FirecrackerMachine CRs |
| `devin-firecracker` | firecracker-host DaemonSet, scheduler DaemonSet, FirecrackerHost CRs |
Every microVM runs the same runtime supervisor:
| Method | Path | Purpose |
|---|---|---|
| POST | `/run` | Execute agent task |
| POST | `/terminal` | Shell commands |
| POST | `/git/clone` | Clone repository |
| POST | `/git/commit` | Commit changes |
| POST | `/files/write` | Write workspace files |
| POST | `/browser/open` | Browser automation |
| GET | `/health` | Liveness |
| GET | `/logs` | Supervisor logs |
| GET | `/events` | Runtime event stream |
The orchestrator never executes shell commands — it only provisions infrastructure and talks to the runtime over HTTP.
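Because the contract is fixed, a caller can describe every runtime request from a small route table. The sketch below builds request descriptors without sending them. The methods and paths come from the table above and the `/run` body fields from the flow earlier in this README; the names `RUNTIME_ROUTES` and `runtimeRequest`, and the example base URL, are assumptions.

```typescript
type Method = "GET" | "POST";

// Methods and paths are copied from the runtime contract table.
const RUNTIME_ROUTES = {
  run: { method: "POST", path: "/run" },
  terminal: { method: "POST", path: "/terminal" },
  gitClone: { method: "POST", path: "/git/clone" },
  gitCommit: { method: "POST", path: "/git/commit" },
  filesWrite: { method: "POST", path: "/files/write" },
  browserOpen: { method: "POST", path: "/browser/open" },
  health: { method: "GET", path: "/health" },
  logs: { method: "GET", path: "/logs" },
  events: { method: "GET", path: "/events" },
} as const;

type RouteName = keyof typeof RUNTIME_ROUTES;

interface HttpRequest {
  method: Method;
  url: string;
  headers: Record<string, string>;
  body?: string;
}

// Build a request descriptor; JSON bodies are attached to POST routes only.
function runtimeRequest(baseUrl: string, route: RouteName, body?: unknown): HttpRequest {
  const { method, path } = RUNTIME_ROUTES[route];
  const url = baseUrl.replace(/\/+$/, "") + path;
  if (method === "GET" || body === undefined) {
    return { method, url, headers: {} };
  }
  return {
    method,
    url,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  };
}

// The address is a placeholder for wherever the supervisor listens.
const runReq = runtimeRequest("http://10.0.0.5:8080/", "run", {
  taskId: "t1",
  prompt: "Add JWT auth to the Next.js app",
  agent: "cursor",
});
console.log(runReq.method, runReq.url); // POST http://10.0.0.5:8080/run
```

Keeping the routes in one table means a change of execution backend only changes the base URL, which matches the portability claim made for the runtime path.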
| Kind | Purpose |
|---|---|
| `Sandbox` | Task-facing sandbox intent (taskId, runtime, cpu, memory) |
| `FirecrackerMachine` | Controller-managed microVM for a sandbox |
| `FirecrackerHost` | Node capacity + firecracker-host API address |
| `Snapshot` | Golden snapshot metadata per runtime image |
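As an illustration of the `Sandbox` intent, the sketch below assembles a manifest object from the four spec fields listed above. Only the kind, the spec field names, and the `devin-sandboxes` namespace come from this README. The API group and version, the naming scheme, and the cpu and memory formats are assumptions; the real types live in `packages/sandbox`.

```typescript
interface SandboxSpec {
  taskId: string;
  runtime: string; // e.g. "nextjs" or "agent"
  cpu: number;     // assumed to be a vCPU count
  memory: string;  // assumed to be a Kubernetes-style quantity such as "2Gi"
}

interface SandboxManifest {
  apiVersion: string;
  kind: "Sandbox";
  metadata: { name: string; namespace: string };
  spec: SandboxSpec;
}

// Hypothetical group/version; the README does not state the real one.
const SANDBOX_API_VERSION = "devin.baby/v1alpha1";

function sandboxManifest(spec: SandboxSpec): SandboxManifest {
  if (spec.taskId.trim() === "") {
    throw new Error("taskId is required");
  }
  if (!Number.isFinite(spec.cpu) || spec.cpu <= 0) {
    throw new Error("cpu must be a positive number");
  }
  return {
    apiVersion: SANDBOX_API_VERSION,
    kind: "Sandbox",
    metadata: {
      // Assumed naming scheme; Kubernetes names must be lowercase.
      name: `sandbox-${spec.taskId.toLowerCase()}`,
      namespace: "devin-sandboxes",
    },
    spec: { ...spec },
  };
}

const manifest = sandboxManifest({ taskId: "T42", runtime: "nextjs", cpu: 2, memory: "2Gi" });
console.log(manifest.metadata.name); // sandbox-t42
```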
Production hosts maintain a pool of ready microVMs restored from golden snapshots (~300ms) instead of cold booting kernels (~8–12s). Each runtime/* directory builds a snapshot consumed by firecracker-host.
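The pool behaviour can be modelled in a few lines. This is an in-memory sketch only: `WarmPool` and its methods are hypothetical names, and `restore` stands in for the snapshot restore that firecracker-host would perform. A pool hit corresponds to the fast path (~300ms); a miss corresponds to waiting for a VM to be created on demand.

```typescript
class WarmPool {
  private readonly ready = new Map<string, string[]>();
  private readonly targetSize: number;
  private counter = 0;

  constructor(targetSize: number) {
    this.targetSize = targetSize;
  }

  // Stand-in for restoring a microVM from the runtime's golden snapshot.
  private restore(runtime: string): string {
    this.counter += 1;
    return `${runtime}-vm-${this.counter}`;
  }

  // Top the pool for one runtime back up to the target size.
  refill(runtime: string): number {
    const vms = this.ready.get(runtime) ?? [];
    let added = 0;
    while (vms.length < this.targetSize) {
      vms.push(this.restore(runtime));
      added += 1;
    }
    this.ready.set(runtime, vms);
    return added;
  }

  // Hand out the oldest ready VM, or create one on demand on a miss.
  acquire(runtime: string): { vmId: string; warm: boolean } {
    const vm = this.ready.get(runtime)?.shift();
    if (vm !== undefined) {
      return { vmId: vm, warm: true };
    }
    return { vmId: this.restore(runtime), warm: false };
  }

  size(runtime: string): number {
    return this.ready.get(runtime)?.length ?? 0;
  }
}

const pool = new WarmPool(2);
pool.refill("nextjs");                      // prepares nextjs-vm-1 and nextjs-vm-2
const warmVm = pool.acquire("nextjs");      // pool hit
const onDemandVm = pool.acquire("agent");   // no pool for this runtime yet
console.log(warmVm.vmId, onDemandVm.warm, pool.size("nextjs")); // nextjs-vm-1 false 1
```

After each `acquire`, a host would call `refill` in the background so the next task again finds a ready VM.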
Build snapshots on a Linux Firecracker host:
go build -o apps/runtime/bin/runtime ./apps/runtime/cmd/runtime
sudo ./scripts/build-firecracker-rootfs.sh nextjs devin-runtime-nextjs:latest
sudo ./scripts/build-firecracker-snapshot.sh nextjs

Set FIRECRACKER_DRY_RUN=false on firecracker-host to enable snapshot restore via the Firecracker SDK + CNI (fcnet).
The scheduler → HTTP → runtime path works whether the runtime lives in a Pod, Firecracker VM, Kata, or gVisor. Only the controller + host layer changes.
bun install
# terminal 1 — firecracker-host (dry-run VM pool)
bun run dev --filter=@devin/firecracker-host
# terminal 2 — orchestrator (dry-run, calls firecracker-host)
ORCHESTRATOR_DRY_RUN=true bun run dev --filter=@devin/orchestrator-app
# terminal 3 — runtime supervisor
bun run dev --filter=@devin/runtime
# terminal 4 — scheduler worker
bun run dev --filter=@devin/scheduler-app
# terminal 5 — API + web
bun run dev --filter=@devin/server
bun run dev --filter=@devin/web

Create a task:
curl -X POST http://localhost:8080/api/v1/tasks \
-H 'Content-Type: application/json' \
-d '{"prompt":"Build a Next.js auth system"}'

Stream events:
curl -N http://localhost:9091/api/v1/tasks/{taskId}/events

Kubernetes manifests live in your GitOps repository (not this app repo). See migration.md for the full manifest bundle, overlay layout, and Argo CD / Flux wiring.
Production on AWS uses Path B (EKS + external EC2 execution hosts + Neon). Operational procedures — snapshots, EC2 hosts, Neon, ingress — are in deployment.md.
Sync the control plane from GitOps:
- Path B (recommended): `apps/devin-baby/overlays/<env>-external`
- Path A (in-cluster KVM): `apps/devin-baby/overlays/<env>-in-cluster` + label workers `devin.baby/firecracker-host=true`
Set on server: DATABASE_URL to your Neon connection string; SCHEDULER_URL to your execution host scheduler URL (http://<private-ip>:9091).
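A startup check for these two settings might look like the sketch below. The variable names come from this README; the `loadServerConfig` function, the `ServerConfig` shape, and the example values are hypothetical. The function takes the environment as a plain object so that it can be exercised without touching the real process environment.

```typescript
interface ServerConfig {
  databaseUrl: string;
  schedulerUrl: string;
}

// Validate the two required settings and normalise the scheduler URL.
function loadServerConfig(env: Record<string, string | undefined>): ServerConfig {
  const databaseUrl = env["DATABASE_URL"]?.trim() ?? "";
  const schedulerUrl = env["SCHEDULER_URL"]?.trim() ?? "";

  const missing: string[] = [];
  if (databaseUrl === "") missing.push("DATABASE_URL");
  if (schedulerUrl === "") missing.push("SCHEDULER_URL");
  if (missing.length > 0) {
    throw new Error(`missing required env: ${missing.join(", ")}`);
  }
  if (!/^https?:\/\//.test(schedulerUrl)) {
    throw new Error("SCHEDULER_URL must be an http(s) URL");
  }
  // Drop trailing slashes so paths can be appended consistently.
  return { databaseUrl, schedulerUrl: schedulerUrl.replace(/\/+$/, "") };
}

// Placeholder values for illustration only.
const config = loadServerConfig({
  DATABASE_URL: "postgresql://user:pass@example.invalid/devin",
  SCHEDULER_URL: "http://10.0.1.20:9091/",
});
console.log(config.schedulerUrl); // http://10.0.1.20:9091
```

Failing fast at startup surfaces a missing connection string before the first task is submitted rather than on the first database call.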
| Command | Description |
|---|---|
| `bun run dev` | Start all apps |
| `bun run build` | Build all apps and packages |
| `bun run lint` | Lint the monorepo |
| `bun run check-types` | TypeScript type checking |