PandaStack

Architecture

How PandaStack fits together — control plane, data plane, and the workload microVMs.

Components

+---------------- damroo-host (single-node prod) ----------------+
|                                                                |
|  caddy (TLS terminator)                                        |
|    |--> damroo-api    (HTTP :8080, JWT/PAT, OpenAPI)           |
|    |     |--> /run/fcsandbox/agent.sock (unix domain socket)   |
|    +--> damroo-dashboard (Next.js :3000, SPA)                  |
|                                                                |
|  damroo-agent (single-node orchestrator)                       |
|    +- sqlite      /var/lib/damroo/damroo.db                    |
|    +- /var/lib/damroo/                                         |
|    |    +- kernels/vmlinux-5.10.239                            |
|    |    +- templates/{name}/rootfs.ext4                        |
|    |    +- template-snaps/{name}/{snap.bin, memory}            |
|    |    +- vms/{sandbox-id}/                                   |
|    |    |    +- rootfs.ext4   (reflink CoW from template)      |
|    |    |    +- firecracker.sock                               |
|    |    |    +- vm.log                                         |
|    |    +- volumes/{vol-id}.ext4                               |
|    +- per-sandbox netns (10.200.X.Y veth pair, DNAT for SSH)   |
|                                                                |
+----------------------------------------------------------------+

Boot path (target: <250 ms)

  1. API receives POST /v1/sandboxes. Validates JWT, picks template.
  2. API → agent over unix socket: CreateSandbox(template, cpu, mem).
  3. Agent allocates netns + IPs (NATID picker → /run/damroo/natid.db).
  4. Agent reflink-clones templates/<name>/rootfs.ext4 → vms/<id>/rootfs.ext4 (~1 ms, no data copy).
  5. Agent boots Firecracker:
    • Snapshot restore (default): FC_SNAPSHOT_LOAD memory + vmstate → kernel/userspace already past init (~80–120 ms).
    • Cold boot (no snapshot): full kernel boot (~700 ms).
  6. Agent installs DNAT rules so <host>:<random-port> → <guest>:22.
  7. Agent SSHs to confirm reachability (~5 ms over loopback).
  8. API returns {id, status:"running", boot_ms: 285}.
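
Step 4 is what keeps the rootfs copy off the critical path. The following is a minimal sketch of a reflink clone, not the agent's actual code: it uses the Linux FICLONE ioctl and, as an assumption of this sketch, falls back to a plain byte copy on filesystems that do not support reflinks. The function name, the fallback, and the sbx-1 sandbox id are hypothetical.

```python
import fcntl  # Linux/Unix only
import os
import shutil
import tempfile

# Linux ioctl request number for FICLONE (see ioctl_ficlone(2)).
FICLONE = 0x40049409


def reflink_clone(src: str, dst: str) -> str:
    """Clone src to dst, sharing extents when the filesystem allows it.

    Returns "reflink" if the CoW clone succeeded, or "copy" if the sketch
    had to fall back to a full byte copy (hypothetical fallback).
    """
    os.makedirs(os.path.dirname(dst) or ".", exist_ok=True)
    with open(src, "rb") as s, open(dst, "wb") as d:
        try:
            # Ask the filesystem to make dst share src's data blocks.
            fcntl.ioctl(d.fileno(), FICLONE, s.fileno())
            return "reflink"
        except OSError:
            pass  # e.g. tmpfs or ext4 without reflink support
    shutil.copyfile(src, dst)
    return "copy"


def demo():
    """Clone a stand-in template rootfs into a per-sandbox directory."""
    with tempfile.TemporaryDirectory() as root:
        template = os.path.join(root, "templates", "demo", "rootfs.ext4")
        os.makedirs(os.path.dirname(template))
        with open(template, "wb") as f:
            f.write(b"golden-image" * 1024)
        clone = os.path.join(root, "vms", "sbx-1", "rootfs.ext4")
        how = reflink_clone(template, clone)
        with open(template, "rb") as a, open(clone, "rb") as b:
            same = a.read() == b.read()
        return how, same
```

On a reflink-capable filesystem such as XFS or Btrfs the clone shares blocks with the template and only diverges as the sandbox writes; elsewhere the sketch degrades to a full copy, which would not meet the ~1 ms figure.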

Snapshot/fork plane

templates/code-interpreter/rootfs.ext4   <-- read-only "golden" image
                |
                | reflink clone (instant, CoW)
                v
vms/{sandbox-id}/rootfs.ext4             <-- per-sandbox writable layer

template-snaps/code-interpreter/
  +- vmstate.bin     (firecracker CPU state, ~MB)
  +- memory.bin      (guest RAM, mmap'd)
                |
                | mmap + private-CoW
                v
vms/{sandbox-id}/memory.bin              <-- per-sandbox dirty RAM only

Ten forks of one parent sandbox share about 99% of their memory pages until they diverge; a page is copied for a fork only when that fork writes to it.
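
The private copy-on-write mapping in the diagram can be illustrated with Python's mmap module. This is a sketch of the mechanism only, not of how Firecracker itself maps guest memory: the file stands in for memory.bin, and fork_view and demo are hypothetical names.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE


def fork_view(snapshot_path: str) -> mmap.mmap:
    """Map a snapshot file copy-on-write.

    ACCESS_COPY gives a private mapping: writes touch only this mapping's
    pages and are never written back to the file.
    """
    with open(snapshot_path, "rb") as f:
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY)


def demo():
    """Two forks map the same snapshot; one dirties its first page."""
    with tempfile.TemporaryDirectory() as tmp:
        snap = os.path.join(tmp, "memory.bin")
        with open(snap, "wb") as f:
            f.write(b"\x00" * (4 * PAGE))  # stand-in for guest RAM

        fork_a = fork_view(snap)
        fork_b = fork_view(snap)
        fork_a[0:5] = b"dirty"  # only fork_a's copy of page 0 changes

        seen_a = bytes(fork_a[0:5])
        seen_b = bytes(fork_b[0:5])
        with open(snap, "rb") as f:
            on_disk = f.read(5)

        fork_a.close()
        fork_b.close()
        return seen_a, seen_b, on_disk
```

After the write, fork_a sees its own bytes while fork_b and the snapshot file still hold the original contents, which is the property that lets forks share untouched pages.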

Networking (NATID)

Each sandbox gets a unique /30 carved from 10.200.0.0/14:

  Host                  netns                  guest
  veth-host  <---->  veth-ns  <---->  tap0  <---->  eth0
  10.200.X.A         10.200.X.B       (l2)         10.200.X.C
  • iptables -t nat -A PREROUTING -d 10.200.X.B -p tcp --dport <hostPort> -j DNAT --to 10.200.X.C:22
  • Outbound MASQUERADE → host's default route.
  • No cross-sandbox connectivity (each netns is isolated).
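
As a rough illustration of the address plan, the sketch below carves /30 blocks out of 10.200.0.0/14 with Python's ipaddress module and builds the argument list for the DNAT rule shown above. It does not reproduce the NATID picker: the index-based allocation and both function names are assumptions, the mapping of A/B/C onto a block is left to the caller, and the source's --to is spelled out as --to-destination.

```python
import ipaddress

POOL = ipaddress.ip_network("10.200.0.0/14")
SUBNETS_IN_POOL = 2 ** (30 - POOL.prefixlen)  # number of /30 blocks


def sandbox_subnet(index: int) -> ipaddress.IPv4Network:
    """Return the index-th /30 in the pool (0-based, hypothetical scheme)."""
    if not 0 <= index < SUBNETS_IN_POOL:
        raise ValueError(f"index must be in [0, {SUBNETS_IN_POOL})")
    base = int(POOL.network_address) + 4 * index  # each /30 spans 4 addresses
    return ipaddress.IPv4Network((base, 30))


def dnat_rule(ns_ip: str, host_port: int, guest_ip: str) -> list:
    """Build (but do not run) the iptables argv for the SSH DNAT rule."""
    return [
        "iptables", "-t", "nat", "-A", "PREROUTING",
        "-d", ns_ip, "-p", "tcp", "--dport", str(host_port),
        "-j", "DNAT", "--to-destination", f"{guest_ip}:22",
    ]


first = sandbox_subnet(0)
last = sandbox_subnet(SUBNETS_IN_POOL - 1)
rule = dnat_rule("10.200.0.1", 40022, "10.200.0.2")
```

The pool holds 65,536 such blocks, from 10.200.0.0/30 through 10.203.255.252/30. The rule is returned as a list so it can be passed to subprocess.run without shell quoting; the port 40022 and the two addresses in the example are placeholders.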

State store

Today: sqlite at /var/lib/damroo/damroo.db (single agent).
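
For orientation, here is a minimal sketch of what a single-agent sqlite store could look like. The real schema of damroo.db is not documented on this page; the table and column names below are assumptions, loosely based on the fields in the API response (id, status, boot_ms) plus the template name.

```python
import sqlite3

# Hypothetical schema; the actual damroo.db layout may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS sandboxes (
    id       TEXT PRIMARY KEY,
    template TEXT NOT NULL,
    status   TEXT NOT NULL,
    boot_ms  INTEGER
)
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the store and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_sandbox(conn, sandbox_id, template, status, boot_ms):
    """Insert one sandbox row inside a transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO sandboxes (id, template, status, boot_ms) "
            "VALUES (?, ?, ?, ?)",
            (sandbox_id, template, status, boot_ms),
        )


def running_count(conn) -> int:
    """Count sandboxes currently marked as running."""
    row = conn.execute(
        "SELECT COUNT(*) FROM sandboxes WHERE status = 'running'"
    ).fetchone()
    return row[0]
```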

Tomorrow (Phase B): Postgres + NATS-JetStream for events. See the separation plan in the repo root.

Future: multi-node

   api-edge (N)  --gRPC-->  scheduler (1)  --gRPC-->  agent-N
       |                         |                       |
       +----------- shared: pg + nats-jetstream ---------+
  • api-edge: stateless, behind CF, terminates JWT, hijack-proxies streams to the chosen agent.
  • scheduler: stateless picker (least-loaded, warm-pool for hot templates, quotas).
  • agent-N: workload nodes — what damroo-agent is today, minus state.
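
The scheduler's picking policy might look something like the sketch below: prefer an agent that already has the template warm, then the least-loaded one. This is a guess at the shape of the logic, not the planned implementation; the Agent fields and pick_agent are hypothetical, and quotas are not modeled.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Agent:
    name: str
    running: int    # sandboxes currently on this node
    capacity: int   # maximum sandboxes this node accepts
    warm_templates: Set[str] = field(default_factory=set)


def pick_agent(agents: List[Agent], template: str) -> Optional[Agent]:
    """Pick a node for a new sandbox, or None if every node is full."""
    candidates = [a for a in agents if a.running < a.capacity]
    if not candidates:
        return None
    # Sort key: warm pool first (False < True), then lowest load ratio,
    # then name so ties resolve deterministically.
    return min(
        candidates,
        key=lambda a: (
            template not in a.warm_templates,
            a.running / a.capacity,
            a.name,
        ),
    )


agents = [
    Agent("agent-1", running=3, capacity=10),
    Agent("agent-2", running=1, capacity=10),
    Agent("agent-3", running=5, capacity=10,
          warm_templates={"code-interpreter"}),
]
```

With these example nodes, a request for code-interpreter goes to agent-3 despite its higher load because the template is warm there, while any other template goes to agent-2 as the least loaded.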

Postgres holds sandboxes/leases/templates/audit. NATS streams events for fan-out. mTLS between all three planes (SPIFFE-style cert rotation).
