rshdhere/devin

Devin (devin.baby)

devin.baby is a mini Devin focused on the core software-engineering loop: submit work, get an isolated runtime, run the agent, stream progress, and persist results in /workspace.

Sandboxes are an internal implementation detail. Users submit Tasks.

Architecture

Kubernetes is the control plane. Firecracker microVMs are the execution plane. The runtime HTTP contract never changes — the agent only knows POST /run, POST /terminal, POST /git/*, and GET /events.

flowchart TB
  User --> Web
  Web --> Server
  Server --> Scheduler
  Scheduler --> Queue
  Queue --> Orchestrator
  Orchestrator --> SandboxCRD["Sandbox CRD"]
  SandboxController --> SandboxCRD
  SandboxController --> MachineCRD["FirecrackerMachine CR"]
  MachineController --> MachineCRD
  MachineController --> HostSelect["Firecracker Host Selection"]
  HostSelect --> FCHost["firecracker-host daemon"]
  FCHost --> SnapshotPool["Warm Snapshot Pool"]
  SnapshotPool --> microVM["Firecracker microVM"]
  microVM --> Runtime["Runtime Supervisor"]
  Scheduler --> Runtime
  Runtime --> Agent
  Scheduler --> Events
  Events --> Web

Request flow

  1. User → POST /api/v1/tasks { "prompt": "...", "agent": "cursor" }
  2. Server authenticates and forwards to Scheduler
  3. Scheduler enqueues work and emits task.created
  4. Worker creates a Sandbox CR via Orchestrator (internal API)
  5. Sandbox controller creates a FirecrackerMachine CR (no Pods)
  6. Machine controller selects a FirecrackerHost, clones a warm snapshot, boots the microVM
  7. Runtime supervisor starts inside the VM and exposes the fixed HTTP contract
  8. Scheduler opens GET /events?taskId=... and calls POST /run with the selected agent
  9. Cursor CLI or Claude Code runs headlessly inside /workspace
  10. Agent output streams over SSE: GET /api/v1/tasks/{id}/events
  11. Scheduler deletes the sandbox when the task finishes
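
Step 6 says the machine controller selects a FirecrackerHost, but the README does not say how. The sketch below assumes a simple least-loaded policy. The `HostInfo` shape, its `capacity` and `running` fields, the `selectHost` function, and the example addresses are all hypothetical, not taken from the repository.

```typescript
// Hypothetical shape: the README only says a FirecrackerHost carries
// node capacity and the firecracker-host API address.
interface HostInfo {
  hostName: string;
  apiAddress: string;
  capacity: number; // assumed field: max microVMs the host can run
  running: number; // assumed field: microVMs currently running
}

// Assumed policy: pick the host with the most free slots,
// or return null when every host is full.
function selectHost(hosts: HostInfo[]): HostInfo | null {
  let best: HostInfo | null = null;
  for (const host of hosts) {
    const free = host.capacity - host.running;
    if (free <= 0) continue; // host is full, skip it
    if (best === null || free > best.capacity - best.running) {
      best = host;
    }
  }
  return best;
}

// Example addresses and ports are placeholders.
const candidateHosts: HostInfo[] = [
  { hostName: "host-a", apiAddress: "http://10.0.0.1:9090", capacity: 8, running: 8 },
  { hostName: "host-b", apiAddress: "http://10.0.0.2:9090", capacity: 8, running: 3 },
  { hostName: "host-c", apiAddress: "http://10.0.0.3:9090", capacity: 4, running: 1 },
];

const chosenHost = selectHost(candidateHosts);
console.log(chosenHost ? chosenHost.hostName : "no capacity"); // prints "host-b"
```

A real controller would likely also weigh snapshot availability per runtime image; that is left out here.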

Agent workflow

Tasks choose an agent provider that runs inside the sandbox microVM:

| Agent | CLI | Auth env | Runtime image |
| --- | --- | --- | --- |
| mock (Template) | control-plane scaffold + sandbox verify | OPENAI_API_KEY (draft planning) | nextjs (greenfield default) |
| cursor | agent -p --force --trust | CURSOR_API_KEY | agent |
| claude | claude -p --bare | ANTHROPIC_API_KEY | agent |

Greenfield tasks default to the Template agent: OpenAI generates a draft plan on the scheduler, scaffold-from-draft materializes the repo, and the nextjs microVM runs npm install plus a smoke check before push. No Cursor CLI or api2.cursor.sh egress is required.

The scheduler never shells into the host. It only talks HTTP to the runtime supervisor, which invokes the agent CLI inside the Firecracker VM (Cursor/Claude only when explicitly selected):

POST /tasks
  → Sandbox CR (runtime=nextjs for template, runtime=agent for cursor/claude)
  → Firecracker microVM
  → POST /run { taskId, prompt, agent }  (skipped for template greenfield)
  → GET /events?taskId=...  (agent.log, git.*)
  → SSE /tasks/{id}/events
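
The mapping from agent to runtime image in the flow above can be sketched as a lookup table. This is an illustration only: `AgentChoice`, `SandboxPlan`, and `planSandbox` are hypothetical names, and the sketch assumes every `mock` task is a template greenfield task, so `POST /run` is skipped for it.

```typescript
type AgentChoice = "mock" | "cursor" | "claude";

interface SandboxPlan {
  runtime: "nextjs" | "agent"; // runtime image for the Sandbox
  authEnv: string; // environment variable holding the provider key
  callsRun: boolean; // whether the scheduler issues POST /run
}

// Values follow the agent table: the Template (mock) agent uses the nextjs
// image, while cursor and claude use the agent image.
const SANDBOX_PLANS: Record<AgentChoice, SandboxPlan> = {
  mock: { runtime: "nextjs", authEnv: "OPENAI_API_KEY", callsRun: false },
  cursor: { runtime: "agent", authEnv: "CURSOR_API_KEY", callsRun: true },
  claude: { runtime: "agent", authEnv: "ANTHROPIC_API_KEY", callsRun: true },
};

// Omitting the agent falls back to the Template agent, as for greenfield tasks.
function planSandbox(agent: AgentChoice = "mock"): SandboxPlan {
  return SANDBOX_PLANS[agent];
}

console.log(planSandbox().runtime); // prints "nextjs"
console.log(planSandbox("cursor").runtime); // prints "agent"
```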

Create a task with Cursor or Claude Code (optional):

curl -X POST http://localhost:8080/api/v1/tasks \
  -H 'Content-Type: application/json' \
  -d '{"prompt":"Add JWT auth to the Next.js app","agent":"cursor"}'

For greenfield work without Cursor, omit agent or set "agent":"mock". With OPENAI_API_KEY configured, the scheduler plans the scaffold, hydrates it in the sandbox, verifies dependencies, and pushes to GitHub.

Repository layout

devin/
├── apps/
│   ├── web/                 # Dashboard
│   ├── server/              # API gateway (auth + task proxy)
│   ├── scheduler/           # Task queue worker + SSE events
│   ├── orchestrator/        # Sandbox CRD controller + internal API
│   ├── firecracker-host/    # Node daemon: VM pool + snapshot manager
│   └── runtime/             # In-VM supervisor (PID 1)
├── packages/
│   ├── orchestrator/        # K8s reconciliation logic
│   ├── sandbox/             # Sandbox + Firecracker CRD types
│   ├── scheduler/           # Task scheduling library
│   ├── services/
│   │   ├── email/           # Resend client
│   │   └── queue/           # Task queue (memory + SQS)
│   ├── events/              # Event bus + SSE helpers
│   └── agent-sdk/           # Runtime HTTP client contract
├── deploy/
│   └── helm/                # Helm chart scaffold
└── runtime/                 # agent, nextjs, go, rust, node, python → snapshots

Kubernetes namespaces

| Namespace | Workloads |
| --- | --- |
| devin-app | web, server |
| devin-system | orchestrator |
| devin-sandboxes | Sandbox + FirecrackerMachine CRs |
| devin-firecracker | firecracker-host DaemonSet, scheduler DaemonSet, FirecrackerHost CRs |

Runtime supervisor API

Every microVM runs the same runtime supervisor:

| Method | Path | Purpose |
| --- | --- | --- |
| POST | /run | Execute agent task |
| POST | /terminal | Shell commands |
| POST | /git/clone | Clone repository |
| POST | /git/commit | Commit changes |
| POST | /files/write | Write workspace files |
| POST | /browser/open | Browser automation |
| GET | /health | Liveness |
| GET | /logs | Supervisor logs |
| GET | /events | Runtime event stream |

The orchestrator never executes shell commands — it only provisions infrastructure and talks to the runtime over HTTP.
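
Because the supervisor exposes a fixed set of routes, a caller can keep them in one typed table. The sketch below is a minimal illustration, not the contents of `packages/agent-sdk`: `RUNTIME_ROUTES`, `buildRuntimeRequest`, and the base URL are hypothetical, and the request bodies are assumed to be JSON.

```typescript
// Route table copied from the supervisor API listed above.
const RUNTIME_ROUTES = {
  run: { method: "POST", path: "/run" },
  terminal: { method: "POST", path: "/terminal" },
  gitClone: { method: "POST", path: "/git/clone" },
  gitCommit: { method: "POST", path: "/git/commit" },
  filesWrite: { method: "POST", path: "/files/write" },
  browserOpen: { method: "POST", path: "/browser/open" },
  health: { method: "GET", path: "/health" },
  logs: { method: "GET", path: "/logs" },
  events: { method: "GET", path: "/events" },
} as const;

type RouteName = keyof typeof RUNTIME_ROUTES;

interface RuntimeRequest {
  method: "GET" | "POST";
  url: string;
  body?: string;
}

// Builds a request description; GET payloads become query parameters,
// POST payloads become a JSON body (an assumption about the contract).
function buildRuntimeRequest(
  baseUrl: string,
  route: RouteName,
  payload?: Record<string, unknown>,
): RuntimeRequest {
  const { method, path } = RUNTIME_ROUTES[route];
  const base = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  if (method === "GET") {
    const query = Object.entries(payload ?? {})
      .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(String(value))}`)
      .join("&");
    return { method, url: query ? `${base}${path}?${query}` : `${base}${path}` };
  }
  return { method, url: `${base}${path}`, body: JSON.stringify(payload ?? {}) };
}

// The base URL is a placeholder for the microVM's supervisor address.
const eventsRequest = buildRuntimeRequest("http://10.0.0.5:8080", "events", { taskId: "task-123" });
console.log(eventsRequest.url); // prints "http://10.0.0.5:8080/events?taskId=task-123"
```

The returned object can be handed to `fetch` or any other HTTP client; keeping the builder pure makes the contract easy to unit test.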

CRDs

| Kind | Purpose |
| --- | --- |
| Sandbox | Task-facing sandbox intent (taskId, runtime, cpu, memory) |
| FirecrackerMachine | Controller-managed microVM for a sandbox |
| FirecrackerHost | Node capacity + firecracker-host API address |
| Snapshot | Golden snapshot metadata per runtime image |
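
As an illustration of the Sandbox kind, the sketch below builds a manifest from the four spec fields named above. The API group and version, the resource naming scheme, and the string format of `cpu` and `memory` are assumptions; only the field names and the `devin-sandboxes` namespace come from this README.

```typescript
// Spec fields as listed for the Sandbox kind; value formats are assumed.
interface SandboxSpec {
  taskId: string;
  runtime: string; // e.g. "nextjs" or "agent"
  cpu: string; // assumed Kubernetes-style quantity, e.g. "2"
  memory: string; // assumed Kubernetes-style quantity, e.g. "2Gi"
}

interface SandboxManifest {
  apiVersion: string;
  kind: "Sandbox";
  metadata: { name: string; namespace: string };
  spec: SandboxSpec;
}

// ASSUMPTION: the real API group/version is not stated in the README.
const ASSUMED_API_VERSION = "devin.baby/v1alpha1";

function buildSandboxManifest(spec: SandboxSpec): SandboxManifest {
  return {
    apiVersion: ASSUMED_API_VERSION,
    kind: "Sandbox",
    metadata: {
      // Hypothetical naming scheme; resource names must be lowercase.
      name: `sandbox-${spec.taskId.toLowerCase()}`,
      namespace: "devin-sandboxes",
    },
    spec,
  };
}

const exampleManifest = buildSandboxManifest({
  taskId: "Task-42",
  runtime: "nextjs",
  cpu: "2",
  memory: "2Gi",
});
console.log(JSON.stringify(exampleManifest, null, 2));
```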

Warm snapshots

Production hosts maintain a pool of ready microVMs restored from golden snapshots (~300ms) instead of cold booting kernels (~8–12s). Each runtime/* directory builds a snapshot consumed by firecracker-host.
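
The effect of a warm pool can be sketched with a small in-memory model. This is not the firecracker-host implementation: `WarmPool` is a hypothetical class, a pool hit is modeled as costing the ~300 ms restore, and a miss is modeled as a cold boot at the lower bound of the quoted range (8 s).

```typescript
const RESTORE_MS = 300; // approximate snapshot restore time from the text
const COLD_BOOT_MS = 8000; // lower bound of the quoted cold boot range

interface AcquiredVm {
  vmId: string;
  startupMs: number;
  warm: boolean;
}

class WarmPool {
  private ready: string[] = [];
  private nextId = 0;
  private targetSize: number;

  constructor(targetSize: number) {
    this.targetSize = targetSize;
    this.refill();
  }

  // Tops the pool back up to its target size.
  refill(): void {
    while (this.ready.length < this.targetSize) {
      this.ready.push(`vm-${this.nextId++}`);
    }
  }

  // Hands out a pooled VM if one is ready, otherwise falls back to a cold boot.
  acquire(): AcquiredVm {
    const vmId = this.ready.shift();
    if (vmId !== undefined) {
      return { vmId, startupMs: RESTORE_MS, warm: true };
    }
    return { vmId: `vm-${this.nextId++}`, startupMs: COLD_BOOT_MS, warm: false };
  }
}

const examplePool = new WarmPool(2);
const startups = [examplePool.acquire(), examplePool.acquire(), examplePool.acquire()];
console.log(startups.map((vm) => vm.startupMs)); // prints [ 300, 300, 8000 ]
```

The third request misses the pool because `refill()` was not called between acquisitions, which is why a production host would refill in the background.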

Build snapshots on a Linux Firecracker host:

go build -o apps/runtime/bin/runtime ./apps/runtime/cmd/runtime
sudo ./scripts/build-firecracker-rootfs.sh nextjs devin-runtime-nextjs:latest
sudo ./scripts/build-firecracker-snapshot.sh nextjs

Set FIRECRACKER_DRY_RUN=false on firecracker-host to enable snapshot restore via the Firecracker SDK + CNI (fcnet).

Swappable execution backends

The scheduler → HTTP → runtime path works whether the runtime lives in a Pod, Firecracker VM, Kata, or gVisor. Only the controller + host layer changes.
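
One way to picture this separation is an interface that hides how the runtime is provisioned. The sketch below is hypothetical: `ExecutionBackend`, both backend classes, `planRun`, and the example addresses are illustrative names, and provisioning is modeled synchronously for brevity.

```typescript
// Hypothetical abstraction: a backend only has to return the base URL
// of a runtime supervisor for a given task.
interface ExecutionBackend {
  readonly backendName: string;
  provision(taskId: string): string;
}

class FirecrackerBackend implements ExecutionBackend {
  readonly backendName = "firecracker";
  provision(taskId: string): string {
    return `http://fc-${taskId}.internal:8080`; // placeholder address
  }
}

class PodBackend implements ExecutionBackend {
  readonly backendName = "pod";
  provision(taskId: string): string {
    return `http://pod-${taskId}.internal:8080`; // placeholder address
  }
}

interface PlannedRequest {
  method: "POST";
  url: string;
  body: string;
}

// Scheduler-side logic: identical for every backend, only the base URL differs.
function planRun(backend: ExecutionBackend, taskId: string, prompt: string): PlannedRequest {
  const baseUrl = backend.provision(taskId);
  return { method: "POST", url: `${baseUrl}/run`, body: JSON.stringify({ taskId, prompt }) };
}

const exampleBackends: ExecutionBackend[] = [new FirecrackerBackend(), new PodBackend()];
for (const backend of exampleBackends) {
  console.log(backend.backendName, planRun(backend, "t1", "Add tests").url);
}
```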

Local development

bun install

# terminal 1 — firecracker-host (dry-run VM pool)
bun run dev --filter=@devin/firecracker-host

# terminal 2 — orchestrator (dry-run, calls firecracker-host)
ORCHESTRATOR_DRY_RUN=true bun run dev --filter=@devin/orchestrator-app

# terminal 3 — runtime supervisor
bun run dev --filter=@devin/runtime

# terminal 4 — scheduler worker
bun run dev --filter=@devin/scheduler-app

# terminal 5 — API + web
bun run dev --filter=@devin/server
bun run dev --filter=@devin/web

Create a task:

curl -X POST http://localhost:8080/api/v1/tasks \
  -H 'Content-Type: application/json' \
  -d '{"prompt":"Build a Next.js auth system"}'

Stream events:

curl -N http://localhost:9091/api/v1/tasks/{taskId}/events
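
For clients that consume the stream programmatically, the SSE wire format can be parsed with a few lines of code. The sketch below handles a complete text buffer only; a streaming client would need to buffer partial chunks. The `parseSse` function, the `git.push` event name, and the JSON payload shapes are assumptions, since the README only names `agent.log` and `git.*`.

```typescript
interface SseMessage {
  eventType: string;
  data: string;
}

// Parses Server-Sent Events text: messages are separated by blank lines,
// "event:" sets the type, and multiple "data:" lines are joined with "\n".
function parseSse(text: string): SseMessage[] {
  const messages: SseMessage[] = [];
  for (const block of text.split(/\r?\n\r?\n/)) {
    let eventType = "message"; // SSE default when no event field is present
    const dataLines: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line === "" || line.startsWith(":")) continue; // blank or comment line
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value.startsWith(" ")) value = value.slice(1); // strip one leading space
      if (field === "event") eventType = value;
      else if (field === "data") dataLines.push(value);
    }
    if (dataLines.length > 0) {
      messages.push({ eventType, data: dataLines.join("\n") });
    }
  }
  return messages;
}

// Sample payloads are invented for illustration.
const sampleStream =
  'event: agent.log\ndata: {"line":"npm install"}\n\n' +
  ": keep-alive\n\n" +
  'event: git.push\ndata: {"branch":"main"}\n\n';

for (const message of parseSse(sampleStream)) {
  console.log(message.eventType, message.data);
}
```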

Kubernetes deploy

Kubernetes manifests live in your GitOps repository (not this app repo). See migration.md for the full manifest bundle, overlay layout, and Argo CD / Flux wiring.

Production on AWS uses Path B (EKS + external EC2 execution hosts + Neon). Operational procedures — snapshots, EC2 hosts, Neon, ingress — are in deployment.md.

Sync the control plane from GitOps:

  • Path B (recommended): apps/devin-baby/overlays/<env>-external
  • Path A (in-cluster KVM): apps/devin-baby/overlays/<env>-in-cluster + label workers devin.baby/firecracker-host=true

On server, set DATABASE_URL to your Neon connection string and SCHEDULER_URL to your execution host's scheduler URL (http://<private-ip>:9091).
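
A startup check can fail fast when either variable is missing. The sketch below is a hypothetical helper, not code from `apps/server`: `loadServerConfig` and `ServerConfig` are illustrative names, and the example values are placeholders.

```typescript
interface ServerConfig {
  databaseUrl: string;
  schedulerUrl: string;
}

// Validates the two variables named above. Pass process.env (or any
// key/value map) so the function stays easy to test.
function loadServerConfig(env: Record<string, string | undefined>): ServerConfig {
  const databaseUrl = env["DATABASE_URL"];
  const schedulerUrl = env["SCHEDULER_URL"];

  const missing: string[] = [];
  if (!databaseUrl) missing.push("DATABASE_URL");
  if (!schedulerUrl) missing.push("SCHEDULER_URL");
  if (!databaseUrl || !schedulerUrl) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }

  // The scheduler is reached over HTTP, so reject other schemes early.
  if (!/^https?:\/\//.test(schedulerUrl)) {
    throw new Error("SCHEDULER_URL must be an http(s) URL");
  }
  return { databaseUrl, schedulerUrl };
}

// Placeholder values for illustration only.
const exampleConfig = loadServerConfig({
  DATABASE_URL: "postgres://user:password@example.neon.tech/devin",
  SCHEDULER_URL: "http://10.0.1.20:9091",
});
console.log(exampleConfig.schedulerUrl); // prints "http://10.0.1.20:9091"
```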

Scripts

| Command | Description |
| --- | --- |
| bun run dev | Start all apps |
| bun run build | Build all apps and packages |
| bun run lint | Lint the monorepo |
| bun run check-types | TypeScript type checking |

About

Devin is an AI software engineering platform that executes coding tasks inside secure Firecracker sandboxes orchestrated on a Kubernetes (K8s) cluster. An open-source baby Devin for your workflow.
