iii-sandbox
v0.17.0-next.1
Spawn ephemeral microVMs and expose 14 sandbox::* triggers (lifecycle + filesystem) for isolated command execution and file ops.
skill doc
Callable ids:
sandbox::create, sandbox::exec, sandbox::list, sandbox::stop, sandbox::fs::*. Pass these to agent_trigger { function: "sandbox::create" } (NOT the skill path from directory::skills::list; that's documentation, not a function id). There is no sandbox::run trigger: iii sandbox run is a CLI-only wrapper around create + exec + stop. From the agent loop, do the three calls yourself. Session q4czk143 burned ~3 turns trying to call sandbox::run and getting function_not_found.
When to use
Ephemeral microVMs owned by the iii-sandbox daemon. Each sandbox is scoped to the engine it runs inside. Reach for them to execute an LLM-generated patch against an untrusted codebase, spin up an ephemeral build worker, or provision an integration-test fixture.
| Question | Use this |
|---|---|
| Boot a fresh VM | sandbox::create |
| Run a command inside a VM | sandbox::exec |
| What's running right now? | sandbox::list |
| Tear a VM down | sandbox::stop |
| Read/write/list files inside a VM | sandbox::fs::* |
| One-shot "code in, stdout out" | create → exec → stop yourself (no sandbox::run trigger) |
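The create → exec → stop sequence can be sketched as one helper. This is an illustration under assumptions: trigger is a hypothetical async (functionId, payload) dispatcher supplied by the caller, not an iii-sdk API, and the sketch assumes the create response exposes sandbox_id as a top-level field. Check the live schemas before relying on any field name.

```javascript
// Sketch of the one-shot pattern: create, exec, then always stop.
// `trigger` is a hypothetical (functionId, payload) => Promise<result>
// dispatcher supplied by the caller; it is not part of any real SDK.
async function runOnce(trigger, image, cmd, args) {
  const created = await trigger("sandbox::create", { image });
  // Assumption: the create response carries `sandbox_id` at the top level.
  const sandboxId = created.sandbox_id;
  try {
    // The exec result is returned untouched; its shape comes from the live schema.
    return await trigger("sandbox::exec", { sandbox_id: sandboxId, cmd, args });
  } finally {
    // Tear the VM down even if exec threw.
    await trigger("sandbox::stop", { sandbox_id: sandboxId });
  }
}
```

The finally block is the point of the sketch: a failed exec should not leave a VM running.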
Fetch the live schema, don't trust this page
For exact parameter and response shapes, call:
// engine::functions::info { function_id: "sandbox::create" }
// → request_format / response_format / description JSON
Caveat for sandbox::fs::*: the daemon registers the filesystem ops with empty schemas (request_schema: {}, response_schema: {}). engine::functions::info won't tell you the shape. The README from directory::registry::workers::info { name: "iii-sandbox" } is the source of truth for those; read it once at the start of an authoring session.
The lifecycle ops (create / exec / list / stop) DO publish full schemas. This page is only for the cross-cutting behaviors that don't appear in either source — non-obvious traps that bite agents composing calls from JSON alone. Don't reconstruct field names or example payloads from this page; the engine drifts faster than the markdown.
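The split between the two schema sources can be written down as a lookup. A minimal sketch: schemaLookup is a hypothetical helper name, and it only returns the call to make (function id plus payload) using the ids and payloads named above.

```javascript
// Lifecycle ops publish full schemas; sandbox::fs::* ops register empty ones.
const LIFECYCLE = new Set([
  "sandbox::create",
  "sandbox::exec",
  "sandbox::list",
  "sandbox::stop",
]);

// Returns the call (function id + payload) that yields the authoritative shape.
function schemaLookup(functionId) {
  if (LIFECYCLE.has(functionId)) {
    return { function: "engine::functions::info", payload: { function_id: functionId } };
  }
  if (functionId.startsWith("sandbox::fs::")) {
    // Empty schemas here, so fall back to the worker README.
    return { function: "directory::registry::workers::info", payload: { name: "iii-sandbox" } };
  }
  throw new Error(`not a sandbox trigger: ${functionId}`);
}
```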
Functions
The daemon registers 14 triggers (verified via engine::functions::list { prefix: "sandbox::" } on iii-sandbox 0.13.0). The CLI ships extra one-shot wrappers (iii sandbox run) that are NOT triggers — don't try to call them via agent_trigger.
- sandbox::create: boot a fresh microVM from a catalog image; returns the sandbox_id every other op uses.
- sandbox::exec: run a command inside a running sandbox. Serialized per sandbox_id; concurrent calls return S003.
- sandbox::list: enumerate sandboxes currently running on this engine.
- sandbox::stop: destroy a sandbox VM by sandbox_id.
- sandbox::fs::ls / stat / mkdir / mv / rm / chmod: directory and metadata ops inside a running sandbox.
- sandbox::fs::grep: content search across files under a path.
- sandbox::fs::sed: regex substitution applied to a file.
- sandbox::fs::write: write a file into a sandbox. content is a StreamChannelRef only; there is no inline-string or base64 form. From the agent loop you can't easily construct one; use sandbox::exec with a heredoc instead (see "Behaviors" below).
- sandbox::fs::read: read a file out; inline string under 1 MiB, StreamChannelRef object above.
For the actual catalog of bookable images on a given engine, read sandbox.image_allowlist from directory::registry::workers::info { name: "iii-sandbox" }'s config field. There's no sandbox::catalog::list trigger either.
Reaching the host engine from inside a sandbox
Processes inside a sandbox can't see the host's localhost. To call back into the engine that spawned the sandbox, pass network: true AND set III_ENGINE_URL in the boot env — the daemon rewrites localhost / 127.0.0.1 URLs to the per-sandbox gateway IP at VM boot, so the URL you write is the URL that resolves on the host.
{
"image": "node",
"network": true,
"env": ["III_ENGINE_URL=ws://127.0.0.1:49134"]
}
Inside the guest, $III_ENGINE_URL resolves to e.g. ws://100.96.0.1:49134 on slot 0. The daemon picks the slot; callers don't. 49134 is the engine's default listen port; change it here if your engine listens elsewhere.
Things to know:
- network defaults to false. Omit it and the rewrite is skipped: the URL keeps localhost, which inside the guest resolves to the guest's own loopback, never the host. This is the single most common silent failure of "but it works on my host".
- The rewrite is generic. ANY env value containing ://localhost: or ://127.0.0.1: is rewritten the same way. Use it to hook other host services (databases, dev servers, sibling workers), not just the engine.
- Boot-only. The rewrite runs once at sandbox::create. An III_ENGINE_URL passed in sandbox::exec's env is NOT rewritten: the URL passes through verbatim and resolves to the guest's loopback. Your worker's stdout will LIE to you here: it will happily log engine=ws://127.0.0.1:49134 and "Worker registered and ready!" because those messages fire before / regardless of the WS handshake. The only authoritative signal is engine::functions::list { search: "…" }; if that stays empty for >5 s after launch, the URL wasn't rewritten. Session c007a1a7 burned ~10 turns on this: worker logged "ready", functions::list stayed empty, web::fetch returned 404. Fix: stop the sandbox and recreate with the env at create time.
- Canonical name is III_ENGINE_URL. III_URL is accepted as an alias by some daemon code paths but isn't the name to standardize on.
- Verify the rewrite worked. After launch, the worker's startup log should show engine=ws://100.96.0.x:49134 (gateway IP), NOT 127.0.0.1. If it still shows 127.0.0.1, the rewrite didn't happen: the URL was set on exec, not create, OR network: true was missing.
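The checks above can be run before and after boot. A sketch only: hasLoopbackUrl, createPayload and rewriteApplied are hypothetical helper names, and refusing a loopback URL while network is off is a choice of this sketch, not daemon behavior. The log format matched is the engine=... line quoted above.

```javascript
// True when an env entry ("KEY=value") contains a loopback URL that the
// daemon would rewrite at boot, per the two substrings named above.
function hasLoopbackUrl(entry) {
  return entry.includes("://localhost:") || entry.includes("://127.0.0.1:");
}

// Builds a sandbox::create payload and refuses the silent-failure combination:
// loopback URLs in env while network is off.
function createPayload(image, env, network = false) {
  const loopback = env.filter(hasLoopbackUrl);
  if (loopback.length > 0 && !network) {
    throw new Error(`network must be true for: ${loopback.join(", ")}`);
  }
  return { image, network, env };
}

// Checks a worker startup log line: the engine URL should no longer be loopback.
function rewriteApplied(logLine) {
  const m = /engine=(\S+)/.exec(logLine);
  return m !== null && !hasLoopbackUrl(m[1]);
}
```

rewriteApplied only reads the log line. The functions::list check remains the authoritative signal.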
Behaviors that aren't in the JSON schemas
The traps below are not visible from engine::functions::info and bite agents that compose calls from the schema alone.
Image — image is a catalog NAME, not an OCI ref
sandbox::create accepts the catalog key, not an OCI string. Bundled presets are exactly "node" and "python". Operators add more via sandbox.custom_images in iii.config.yaml (left side is the catalog name; right side is the OCI ref the daemon pulls).
| ❌ Wrong | ✅ Right |
|---|---|
| { "image": "ghcr.io/iii-hq/node:latest" } | { "image": "node" } |
| { "image": "node:20-alpine" } | { "image": "node" } |
| { "image": "docker.io/library/python:3.12" } | { "image": "python" } |
An OCI ref that's not also a registered catalog key returns S100. The error message names the allowed keys, so a wrong guess self-corrects on the next turn. To see the live set, read config.image_allowlist from directory::registry::workers::info { name: "iii-sandbox" }. An empty catalog with no presets denies every call — a deliberate fail-closed default.
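A pre-flight check can catch the OCI-ref mistake before spending a turn on S100. This is a heuristic sketch: checkImage and looksLikeOciRef are hypothetical names, and the allowlist is passed in by the caller (for example, after reading it from the worker's config as described above).

```javascript
// Heuristic only: bundled catalog keys are plain names like "node" or "python",
// while OCI refs usually carry a registry path ("/") or a tag (":").
function looksLikeOciRef(image) {
  return image.includes("/") || image.includes(":");
}

// Compares an image value against an allowlist supplied by the caller.
function checkImage(image, allowlist) {
  if (allowlist.includes(image)) return { ok: true };
  const hint = looksLikeOciRef(image)
    ? "looks like an OCI ref; pass the catalog key instead"
    : "not in the allowlist";
  return { ok: false, reason: `${image}: ${hint}` };
}
```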
Error envelope shape
Every sandbox::* op returns errors as:
{
"type": "validation|config|internal|transient|execution|filesystem|platform",
"code": "Sxxx",
"message": "...",
"docs_url": "https://.../README.md#Sxxx",
"retryable": true,
"fix": "<short hint> | { context, sandbox_id, inner_code }",
"fix_note": "one-line explainer"
}
type is one of the seven categories above, NOT the literal string "SandboxError". The S codes are independent of the W codes used by the worker::* surface; don't conflate them.
The fix field carries an actionable hint as a string (or sometimes an object when nested context is useful). Read it before retrying — the daemon usually tells you exactly what to change.
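Reading the envelope can be sketched as a small normalizer. summarizeError is a hypothetical helper; it uses only the fields shown in the envelope above and handles fix as either a string or an object.

```javascript
const CATEGORIES = [
  "validation", "config", "internal", "transient",
  "execution", "filesystem", "platform",
];

// Summarises an error envelope into something an agent loop can act on.
function summarizeError(err) {
  if (!CATEGORIES.includes(err.type)) {
    throw new Error(`unexpected error type: ${err.type}`);
  }
  // `fix` is usually a string hint, sometimes an object with nested context.
  const hint = typeof err.fix === "string" ? err.fix : JSON.stringify(err.fix);
  return { code: err.code, retryable: err.retryable === true, hint };
}
```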
sandbox::exec is serialized per sandbox_id
Only one exec runs at a time on a given sandbox. A concurrent call returns S003 while the previous one is still in flight. Consequences:
- Don't try to "tail the log while the worker runs" with a second exec; use sandbox::fs::read on the log file, or boot a second sandbox.
- To run two commands in parallel, boot two sandboxes.
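One way to avoid tripping S003 with your own calls is to queue execs per sandbox_id on the client side. A sketch under assumptions: createExecQueue is a hypothetical helper, trigger is a hypothetical async dispatcher, and the queue only covers calls made through it in this process.

```javascript
// Client-side serialisation: chains exec calls per sandbox_id so a second call
// only starts after the first settles.
// `trigger` is a hypothetical (functionId, payload) => Promise dispatcher.
function createExecQueue(trigger) {
  const tails = new Map(); // sandbox_id -> promise of the last queued exec
  return function exec(payload) {
    const prev = tails.get(payload.sandbox_id) || Promise.resolve();
    // Run after the previous call settles, whether it succeeded or failed.
    const run = () => trigger("sandbox::exec", payload);
    const next = prev.then(run, run);
    tails.set(payload.sandbox_id, next);
    return next;
  };
}
```

Calls for different sandbox ids still run in parallel, which matches the "boot two sandboxes" advice.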
cmd is one binary, not a shell line — use bash -c for && / pipes / redirects
Wrong (S001, invalid request: cmd must be a single binary, not a shell line):
{ "cmd": "cd /tmp && npm install iii-sdk" } // ❌ && is shell syntax
{ "cmd": "node app.js > out.log" } // ❌ > is shell syntax
Right: wrap shell syntax in bash -c:
{ "cmd": "bash", "args": ["-c", "cd /tmp && npm install iii-sdk"] }
{ "cmd": "bash", "args": ["-c", "node app.js > out.log"] }
Session svyuv3id burned one turn on this. The plain cmd field is shlex-parsed: quoting and whitespace word-splitting work (so cmd: "node -v" is fine; it splits into ["node", "-v"]), but &&, ||, |, >, <, $VAR, $(…), *.js are not interpreted. They're either literal text or rejected outright.
sandbox::exec accepts three command shapes. Precedence: argv wins if non-empty; a whitespace-bearing cmd is shlex-split when args/argv are empty; otherwise cmd + args is the classic POSIX pair. Setting argv together with cmd or args returns S001 (ambiguous). Setting only args without cmd returns serialization error: missing field `cmd`, so always supply cmd.
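The wrapping rule can be applied mechanically. A sketch: execPayload is a hypothetical helper, and the character set is an assumption that covers the operators listed above plus ; and &. The shell defaults to bash as in the examples above; pass "sh" for images that ship only a POSIX shell.

```javascript
// Characters the plain `cmd` field does not interpret as shell syntax.
const SHELL_SYNTAX = /[|&<>*$`;]/;

// Builds a sandbox::exec payload, wrapping the line in `<shell> -c` only when
// it contains shell syntax.
function execPayload(sandboxId, line, shell = "bash") {
  if (SHELL_SYNTAX.test(line)) {
    return { sandbox_id: sandboxId, cmd: shell, args: ["-c", line] };
  }
  // No shell syntax: let the daemon shlex-split the line itself.
  return { sandbox_id: sandboxId, cmd: line };
}
```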
Filesystem write semantics — agent loop can't call fs::write directly
sandbox::fs::write accepts content as a StreamChannelRef only — no inline-string, no base64. The agent loop has no way to construct a StreamChannelRef (it's an SDK channel handle, not a JSON shape you can write by hand). Trying { "content": { "data": "...", "type": "inline" } } or { "content_b64": "..." } returns data did not match any variant of untagged enum WriteContent (session c007a1a7 burned ~5 turns on this).
From the agent loop, write files via sandbox::exec with a heredoc:
// sandbox::exec
{
"sandbox_id": "<UUID>",
"cmd": "sh",
"args": ["-c", "cat > /tmp/worker.js << 'EOF'\n<file body>\nEOF"]
}
Use /tmp/ as the worker path. /workspace does NOT exist by default on the bundled node and python images; writing to it returns cannot create /workspace/worker.js: Directory nonexistent. /tmp exists, is world-writable, and survives across exec calls in the same sandbox. Session q4czk143 lost a turn to this. If you really want /workspace, mkdir -p /workspace first.
Use a quoted heredoc delimiter (<< 'EOF') so shell interpolation doesn't eat $VAR or backticks in the body.
Do NOT escape $ in a quoted heredoc. Sessions c007a1a7 and acxrdx6g both hit this: writing JS template literals as \${todo.id} (thinking the $ needs escaping) lands \${todo.id} in the file literally, which is invalid JS at parse time, so your worker won't boot. With << 'EOF', the shell ignores $ already; write template literals exactly as you'd write them in a source file: ${todo.id}, no backslash. The same goes for backticks: write them bare, not backslash-escaped.
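The heredoc payload can be generated instead of hand-written. A sketch: heredocWritePayload and pickDelimiter are hypothetical helpers. The delimiter check is an addition of this sketch, guarding against a body that contains a line equal to EOF.

```javascript
// Picks a heredoc delimiter that does not appear as a line in the body.
function pickDelimiter(body) {
  const lines = new Set(body.split("\n"));
  let delim = "EOF";
  for (let i = 0; lines.has(delim); i++) delim = `EOF_${i}`;
  return delim;
}

// Builds a sandbox::exec payload that writes `body` to `path` via a quoted
// heredoc, so $VAR, ${...} and backticks land in the file unchanged.
function heredocWritePayload(sandboxId, path, body) {
  if (path.includes("'")) throw new Error("path must not contain single quotes");
  const delim = pickDelimiter(body);
  const script = `cat > '${path}' << '${delim}'\n${body}\n${delim}`;
  return { sandbox_id: sandboxId, cmd: "sh", args: ["-c", script] };
}
```

The body is passed through verbatim; nothing in it is escaped, which is the behavior the quoted delimiter is there to provide.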
For files with binary or many special chars, base64-encode and pipe through base64 -d:
{
"sandbox_id": "<UUID>",
"cmd": "sh",
"args": ["-c", "echo '<base64>' | base64 -d > /tmp/blob.bin"]
}
From worker code (running inside a sandbox or on the host with the SDK), sandbox::fs::write works normally: iii.trigger(...) constructs the channel handle for you.
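The base64 variant can be built the same way. A sketch: base64WritePayload is a hypothetical helper, and it assumes the guest image ships base64, as the example above does. Encoding uses Node's Buffer.

```javascript
// Builds a sandbox::exec payload that recreates arbitrary bytes in the guest.
// Assumes the image ships `base64`; `bytes` is an array of byte values or a Buffer.
function base64WritePayload(sandboxId, path, bytes) {
  if (path.includes("'")) throw new Error("path must not contain single quotes");
  // The base64 alphabet (A-Z a-z 0-9 + / =) is safe inside single quotes.
  const b64 = Buffer.from(bytes).toString("base64");
  const script = `echo '${b64}' | base64 -d > '${path}'`;
  return { sandbox_id: sandboxId, cmd: "sh", args: ["-c", script] };
}
```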
Filesystem read
sandbox::fs::read returns content as a string inline under 1 MiB, and as a StreamChannelRef object above that. Type-check before using. From the agent loop, prefer sandbox::exec with cat for any non-trivial size — the StreamChannelRef path isn't drainable from the tool surface.
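The type check and the cat fallback can be sketched as two small helpers. Assumptions: the read response exposes the payload under a content field, as the wording above suggests, and inlineContent and catPayload are hypothetical names.

```javascript
// Returns the inline text, or null when the response carries a channel
// reference (an object) that has to be fetched another way.
function inlineContent(response) {
  return typeof response.content === "string" ? response.content : null;
}

// Fallback for the agent loop: read the file through sandbox::exec instead.
function catPayload(sandboxId, path) {
  return { sandbox_id: sandboxId, cmd: "cat", args: [path] };
}
```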
parents: true is required
Intermediate directories aren't auto-created. fs::write returns an error carrying fix: { "parents": true } — merge and resubmit. Same for mkdir — pass parents: true for mkdir -p semantics.
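"Merge and resubmit" can be sketched as a narrow helper. withParents is a hypothetical name; it only acts when the error's fix is an object carrying parents: true, and returns null otherwise so other fix shapes are not merged blindly.

```javascript
// Returns a copy of the payload with parents: true when the error asks for it,
// or null when the fix is a string hint or some other object.
function withParents(payload, err) {
  const fix = err.fix;
  if (fix && typeof fix === "object" && fix.parents === true) {
    return { ...payload, parents: true };
  }
  return null;
}
```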
The node image is bare — don't reach for ps, top, lsof
The bundled node preset ships minimal coreutils. Tools that ARE there: sh (POSIX, not bash), cat, echo, node, npm, mkdir, chmod, sleep, timeout. Tools that are NOT there: ps, top, pgrep, lsof, htop. Don't troubleshoot worker liveness with ps aux | grep node; it'll just exit 127. Use engine::functions::list { search: "…" } to confirm a worker is registered (the authoritative signal) and cat /tmp/worker.log to see its stdout.
First exec after sandbox::create can hit transient S300 — retry once
The VM keeps booting after sandbox::create returns the sandbox_id. The first sandbox::exec against a freshly-booted VM occasionally fails with S300 VM boot failed: shell connect: worker relay not accepting connections: connect(...): Connection refused. This is transient — the relay socket hasn't bound yet. Wait ~1s and retry the same exec. Don't recreate the sandbox; that just resets the boot clock.
If the second retry also fails, then the VM really did fail to boot — check sandbox::list to confirm the sandbox is still stopped: false, and look at the error's fix_note for the platform reason (commonly: missing virtualization, see the iii-sandbox README's "Host requirements"). Session q4czk143 hit S300 twice in a row on the first sandbox and unnecessarily recreated it.
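The retry-once rule can be sketched as a wrapper. Assumptions: trigger is a hypothetical async dispatcher that rejects with the error envelope (so err.code is readable), and execWithBootRetry is a hypothetical name. The default delay follows the ~1s guidance above.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Runs one exec; on S300 waits and retries the same call exactly once.
async function execWithBootRetry(trigger, payload, delayMs = 1000) {
  try {
    return await trigger("sandbox::exec", payload);
  } catch (err) {
    if (!err || err.code !== "S300") throw err;
    await sleep(delayMs);
    // A second failure propagates to the caller instead of looping.
    return trigger("sandbox::exec", payload);
  }
}
```

The wrapper never recreates the sandbox; a second S300 is surfaced so the caller can inspect sandbox::list and the error's fix_note.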
Run the worker in foreground first, then detach
A clean run order for authored workers, observed in session acxrdx6g:
- Install the SDK first: cd /tmp && npm init -y && npm install iii-sdk. The node image doesn't ship iii-sdk. Skipping this step makes the worker fail at require('iii-sdk') with Cannot find module. Session q4czk143 skipped this and burned turns on diagnosis.
- Use CommonJS by default in .js files: const { registerWorker } = require('iii-sdk'), not import { registerWorker } from 'iii-sdk'. ESM works only if the package.json sets "type": "module" (or the file is .mjs); a bare .js ESM file fails with Cannot use import statement outside a module. The npm init -y above creates "type": "commonjs" so CJS is the safe default.
- Do NOT use TypeScript syntax in .js files. Session q4czk143 wrote process.env.III_ENGINE_URL! (the TS non-null-assertion operator) and Node parsed it as JS, throwing SyntaxError: missing ) after argument list. Plain JS doesn't have ! as a type assertion; drop it.
- Foreground sanity check: timeout 5 node /tmp/worker.js 2>&1 || true. Catches syntax errors, missing imports, bad III_ENGINE_URL, etc. Output is right there in the exec response instead of buried in a log file. The || true lets the exec succeed even when timeout kills the process. (Or use node --check /tmp/worker.js for a parse-only check; it doesn't catch import resolution but is faster.)
- Detached launch: nohup node /tmp/worker.js > /tmp/worker.log 2>&1 & only after the foreground run prints "registered and ready" cleanly.
- Verify: engine::functions::list { search: "…" }. Within 1–2 s of detached launch, your function ids should appear. If they don't, read /tmp/worker.log. But if the foreground run was clean, the only remaining failure mode is the III_ENGINE_URL boot-rewrite trap (see "Reaching the host engine" above).
This five-step pattern collapses a typical 5–10 turn debug spiral into 5 turns: install deps, choose CJS, no TS syntax, foreground sanity, detach + verify. Sessions svyuv3id and c007a1a7-2116 (both local Qwen3-35B) followed this pattern end-to-end and shipped a full CRUD-plus-HTML todo app on the first deployment attempt.
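The shell-side steps of this run order can be laid out as a list of sandbox::exec payloads. A sketch: workerLaunchPlan is a hypothetical helper, the commands are the ones quoted above, and writing the worker file and the final functions::list check are left to the caller.

```javascript
// The install, foreground and detach steps as sandbox::exec payloads, in the
// order to send them. Each goes through `sh -c` because it uses shell syntax.
function workerLaunchPlan(sandboxId, workerPath = "/tmp/worker.js", logPath = "/tmp/worker.log") {
  const sh = (line) => ({ sandbox_id: sandboxId, cmd: "sh", args: ["-c", line] });
  return [
    // 1. Install the SDK; the node image does not ship it.
    sh("cd /tmp && npm init -y && npm install iii-sdk"),
    // 2. Foreground sanity check; `|| true` keeps the exec from failing on timeout.
    sh(`timeout 5 node ${workerPath} 2>&1 || true`),
    // 3. Detached launch, only once step 2 printed a clean startup.
    sh(`nohup node ${workerPath} > ${logPath} 2>&1 &`),
  ];
}
```

Send the payloads one at a time and read each response before moving on; step 3 is only worth sending if step 2 looked clean.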