Docker Workspace

The DockerWorkspace runs the agent's shell commands inside a Docker container while the agent's files live in a host tmpdir bind-mounted into that container at /workspace. The container is the process / network / resource isolation boundary; the bind mount is the shared workspace. fs reuses the hardened workspace-posix-core TmpdirFileSystem operating on the host side of the mount (full symlink-leaf / path-safety / size-cap hardening, zero new fs code); shell runs inside the container via docker exec, against the same files.

When to use

  • Reproducible, pinned userland. When the agent needs a specific image — language runtimes, system packages, a fixed toolchain — rather than whatever happens to be installed on the host. The image IS the userland.
  • cgroup-grade resource limits. Memory / CPU / pids caps that the host kernel actually enforces (memoryMb, cpus, pidsLimit).
  • A stronger, more uniform isolation boundary. Linux namespaces + seccomp + capability drop, the same on macOS and Linux (via the Docker VM). --network none by default for egress control.

Contrast with the sibling providers:

| Provider | Boundary | Dependency | Platforms |
| --- | --- | --- | --- |
| Local Bash | App-layer guards only (allowlist/metachar/env-deny) | None | POSIX |
| Local Sandbox | Host-kernel sandbox (seatbelt / bwrap) | None (zero-dep) | POSIX (macOS / Linux) |
| Docker (this page) | Container (namespaces + cgroups + seccomp + cap-drop) | A Docker daemon | macOS + Linux (uniform, via Docker VM) |

The trade-off is a Docker daemon dependency and per-command docker exec latency. If you want OS-level isolation without a container, prefer Local Sandbox. For untrusted-input production on Cloudflare Workers, use Cloudflare Sandbox.

Capabilities supported

| Capability | Supported |
| --- | --- |
| fs | ✅ |
| shell | ✅ |
| code | ❌ |
| snapshot | ❌ |

The provider advertises { fs: true, shell: true } on its WorkspaceRef.capabilities. Declaring a capability marked ❌ above causes WorkspaceFailedError at session start (the framework asserts that config.capabilities ⊆ ref.capabilities and that each declared module is present on the returned workspace). See the error-model table on the workspaces overview.
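
The subset assertion described above can be pictured with a small sketch. The type names and the `unsupportedCapabilities` helper are illustrative assumptions, not the framework's actual API; declared values may be `true` or an options object, so the check only looks at whether a capability was declared at all.

```typescript
// Hypothetical shapes for illustration; the framework's real types may differ.
type CapabilityName = 'fs' | 'shell' | 'code' | 'snapshot';
type Declared = Partial<Record<CapabilityName, unknown>>;
type Advertised = Partial<Record<CapabilityName, boolean>>;

/** Returns the declared capabilities that the provider does not advertise. */
function unsupportedCapabilities(declared: Declared, advertised: Advertised): CapabilityName[] {
  return (Object.keys(declared) as CapabilityName[]).filter(
    (name) => Boolean(declared[name]) && advertised[name] !== true
  );
}

// The Docker provider advertises fs + shell only.
const dockerCaps: Advertised = { fs: true, shell: true };

// Declaring `code` leaves a non-empty result, which is the condition under
// which the framework would raise WorkspaceFailedError at session start.
const missing = unsupportedCapabilities({ fs: true, shell: true, code: true }, dockerCaps);
console.log(missing); // [ 'code' ]
```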

snapshot via docker commit is a clean future seam — documented below, NOT built in v1.

Install

bash
npm install @helix-agents/workspace-docker

dockerode is a runtime dependency of the package; you do not install it separately.

Requirements

A running Docker daemon. dockerode talks to the default socket (e.g. /var/run/docker.sock, or the Docker Desktop socket on macOS) — no extra configuration for the common case. The provider fails closed when the daemon is unreachable: open() (and resolve()) throw WorkspaceFailedError rather than silently running commands without isolation (see Fail-closed behavior below).

Config

Per-workspace config (DockerWorkspaceConfig)

typescript
interface DockerWorkspaceConfig {
  kind: 'docker';
  /** REQUIRED — the container image. Explicit, no surprise default pull. */
  image: string;
  /** Outbound network policy. Default 'off' → NetworkMode 'none'. */
  network?: 'off' | 'allow';
  /** Memory cap (MiB) → HostConfig.Memory. Default: unset (no cap). */
  memoryMb?: number;
  /** CPU cap → NanoCpus (cpus * 1e9). Default: unset (no cap). */
  cpus?: number;
  /** Max process count → HostConfig.PidsLimit. Default 512. */
  pidsLimit?: number;
  /** Image pull policy. Default 'if-not-present'. */
  pullPolicy?: 'if-not-present' | 'never';
}

| Field | Required | Default | Notes |
| --- | --- | --- | --- |
| image | Yes | (none) | No default: a defaulted image would trigger a silent multi-hundred-MB pull on first use. |
| network | No | 'off' | 'off' → NetworkMode: 'none'; 'allow' → bridge. |
| memoryMb | No | unset (no cap) | HostConfig.Memory (bytes = memoryMb * 1024 * 1024). |
| cpus | No | unset (no cap) | NanoCpus (cpus * 1e9). |
| pidsLimit | No | 512 | HostConfig.PidsLimit. |
| pullPolicy | No | 'if-not-present' | 'never' fails closed if the image is absent (no implicit network; see gotcha 3). |

image is required, not defaulted: an implicit default would silently pull hundreds of megabytes on first use. Explicit is safer and self-documenting.
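
To make the unit conversions and defaults concrete, here is a minimal sketch of how the documented fields could map onto Docker Engine API `HostConfig` values. The `toResourceHostConfig` helper and the `ResourceHostConfig` type are hypothetical names for illustration; the provider's internal mapping is not shown on this page and may differ in detail (for example, the rounding of `NanoCpus` is an assumption).

```typescript
// Local copy of the documented config shape, trimmed to the fields used here.
interface DockerWorkspaceConfig {
  kind: 'docker';
  image: string;
  network?: 'off' | 'allow';
  memoryMb?: number;
  cpus?: number;
  pidsLimit?: number;
}

// Subset of the Docker Engine API HostConfig fields named in the table.
interface ResourceHostConfig {
  NetworkMode: 'none' | 'bridge';
  PidsLimit: number;
  Memory?: number;
  NanoCpus?: number;
}

/** Hypothetical helper: applies the documented defaults and unit conversions. */
function toResourceHostConfig(config: DockerWorkspaceConfig): ResourceHostConfig {
  const hostConfig: ResourceHostConfig = {
    NetworkMode: config.network === 'allow' ? 'bridge' : 'none',
    PidsLimit: config.pidsLimit ?? 512,
  };
  // Memory and CPU caps stay unset unless the config asks for them.
  if (config.memoryMb !== undefined) hostConfig.Memory = config.memoryMb * 1024 * 1024;
  // Rounding guards against floating-point noise (an assumption, see lead-in).
  if (config.cpus !== undefined) hostConfig.NanoCpus = Math.round(config.cpus * 1e9);
  return hostConfig;
}

const limits = toResourceHostConfig({ kind: 'docker', image: 'alpine:3', memoryMb: 512, cpus: 1.5 });
// limits.Memory is 536870912 bytes and limits.NanoCpus is 1500000000;
// NetworkMode falls back to 'none' and PidsLimit to 512.
```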

Provider options (DockerProviderOptions)

typescript
interface DockerProviderOptions {
  /** Inject for tests; defaults to a real dockerode-backed engine. */
  engine?: DockerEngine;
  /** Override the tmpdir root. Defaults to os.tmpdir(). */
  tmpdirRoot?: string;
  /** Per-process cap on concurrent opens across all sessions. Defaults to Infinity. */
  maxGlobalConcurrentOpens?: number;
  /** Logger for security warnings + lifecycle events. Defaults to silent. */
  logger?: Logger;
  /** Constraints applied to in-container shell calls (allowlist / passEnv). */
  shellConstraints?: DockerShellConstraints;
  /** Test seam — override daemon detection. */
  detect?: (engine: DockerEngine) => Promise<{ available: boolean; reason: string }>;
  /** Override the container user (default: the host's uid:gid, or the image's default user on platforms without getuid/getgid). See gotcha 1. */
  containerUser?: string;
}

shellConstraints carries the same allowedCommands allowlist as Local Bash (secure-by-default — an empty allowlist denies ALL commands). passEnv differs: it is a named allowlist only (string[], no true). Forwarding the host's entire environment into a container is a footgun, and the host's PATH / HOME / TMPDIR are meaningless inside the image — so only explicitly-named host vars are forwarded.
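
The named-allowlist behavior of passEnv can be sketched as a simple filter. `pickEnv` is a hypothetical helper, and skipping names that are unset on the host is an assumption about the edge case rather than documented behavior.

```typescript
/** Hypothetical helper: copy only explicitly named variables from the host env. */
function pickEnv(
  hostEnv: Record<string, string | undefined>,
  passEnv: readonly string[]
): Record<string, string> {
  const forwarded: Record<string, string> = {};
  for (const name of passEnv) {
    const value = hostEnv[name];
    // Assumption: unset host variables are skipped, not forwarded as empty strings.
    if (value !== undefined) forwarded[name] = value;
  }
  return forwarded;
}

const sampleHostEnv = {
  PATH: '/usr/local/bin',
  HOME: '/Users/dev',
  API_BASE_URL: 'https://example.test',
};

// Only API_BASE_URL is forwarded; PATH and HOME never reach the container.
const forwardedEnv = pickEnv(sampleHostEnv, ['API_BASE_URL', 'NOT_SET']);
```

Because there is no "forward everything" switch, adding a variable to the container is always a visible, reviewable change to the allowlist.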

Security model

The container is the isolation boundary, hardened by default at create time:

  • --cap-drop ALL — every Linux capability dropped.
  • no-new-privileges — setuid / setgid escalation disabled (SecurityOpt: ['no-new-privileges']).
  • Read-only root filesystem + /tmp tmpfs. ReadonlyRootfs: true, with a writable /tmp tmpfs for scratch.
  • --network none by default — no egress unless network: 'allow' is opted in (then bridge).
  • Non-root user — the container runs as the host's uid:gid (so the bind mount shares ownership — see gotcha 1), or the image's default user on platforms without getuid/getgid; override via containerUser.
  • pids limit: 512 by default, capping fork bombs.
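
As a rough picture of the hardening list, the sketch below assembles create-time options using Docker Engine API field names (`CapDrop`, `SecurityOpt`, `ReadonlyRootfs`, `Tmpfs`, `NetworkMode`, `PidsLimit`, `Binds`, `User`). The `hardenedCreateOptions` helper, the tmpfs mount options, and the exact values are assumptions for illustration, not the provider's actual create call.

```typescript
// Field names follow the Docker Engine API; the values are illustrative.
interface HardenedCreateOptions {
  Image: string;
  User?: string;
  HostConfig: {
    CapDrop: string[];
    SecurityOpt: string[];
    ReadonlyRootfs: boolean;
    Tmpfs: Record<string, string>;
    NetworkMode: 'none' | 'bridge';
    PidsLimit: number;
    Binds: string[];
  };
}

/** Hypothetical helper: builds hardened options for one session's container. */
function hardenedCreateOptions(image: string, hostTmpdir: string, user?: string): HardenedCreateOptions {
  return {
    Image: image,
    // Omit User entirely to fall back to the image's default user.
    ...(user !== undefined ? { User: user } : {}),
    HostConfig: {
      CapDrop: ['ALL'],
      SecurityOpt: ['no-new-privileges'],
      ReadonlyRootfs: true,
      Tmpfs: { '/tmp': 'rw' }, // writable scratch space; mount options assumed
      NetworkMode: 'none',
      PidsLimit: 512,
      Binds: [`${hostTmpdir}:/workspace`], // the only host-writable surface
    },
  };
}

// On POSIX hosts the user string would typically come from
// `${process.getuid()}:${process.getgid()}`; a fixed value is used here.
const createOptions = hardenedCreateOptions('alpine:3', '/tmp/helix-session-abc', '501:20');
```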

On top of the container boundary, the app-layer guards run host-side BEFORE the exec — the same shared ShellGuard (command allowlist, shell-metacharacter rejection, glob/brace rejection, privilege-escalation env-var denylist) that Local Bash and Local Sandbox use. A command is checked, and rejected if disallowed, before any container is touched — the container never even sees a rejected command. This is defense-in-depth on top of the container, not a replacement for it.
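
The check-before-exec ordering can be illustrated with a deliberately simplified guard. This is not the shared ShellGuard: the `checkCommand` function and both regular expressions are assumptions chosen for the example, and the real rule set (including the env-var denylist) is broader.

```typescript
type GuardResult = { ok: true } | { ok: false; reason: string };

// Illustrative patterns only; the real ShellGuard rules are not reproduced here.
const SHELL_METACHARACTERS = /[;&|<>`$()\n\\]/;
const GLOB_OR_BRACE = /[*?\[\]{}]/;

/** Hypothetical host-side check, run before any docker exec is attempted. */
function checkCommand(argv: readonly string[], allowedCommands: readonly string[]): GuardResult {
  const [command, ...args] = argv;
  // An empty allowlist denies every command.
  if (command === undefined || !allowedCommands.includes(command)) {
    return { ok: false, reason: 'command not in allowlist' };
  }
  for (const arg of args) {
    if (SHELL_METACHARACTERS.test(arg)) return { ok: false, reason: 'shell metacharacter' };
    if (GLOB_OR_BRACE.test(arg)) return { ok: false, reason: 'glob or brace pattern' };
  }
  return { ok: true };
}

const allowed = ['echo', 'cat'];
const accepted = checkCommand(['cat', 'notes.txt'], allowed);
const rejected = checkCommand(['cat', 'notes.txt; rm -rf /'], allowed);
// accepted.ok is true; rejected.ok is false, so no exec would be issued for it.
```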

The bind mount is the only host-writable surface. The host TmpdirFileSystem writes to the per-session tmpdir; the container sees those bytes at /workspace. The container's read-only rootfs means everything else inside the container is non-persistent (and /tmp is a throwaway tmpfs).

The three gotchas

  1. uid/gid on the bind mount. Files written in-container must be readable/writable by the host TmpdirFileSystem, and vice-versa. The container runs as the host's uid:gid (ContainerSpec.user) so both sides share ownership. On Docker Desktop for macOS (VirtioFS / gRPC-FUSE), uid is remapped at the VM boundary — if you hit permission errors against bind-mounted files, that remapping is the usual cause; an image that requires a fixed user can be accommodated via the containerUser provider option.
  2. Exec timeout / abort is enforced in two layers. Killing a docker exec from the host does NOT kill the in-container child — a naive host-PID kill is a no-op across PID namespaces. The engine bounds each call with an in-container timeout wrapper plus a host-side stream-destroy backstop: the in-container timeout terminates the process tree, and destroying the exec stream releases the host call even if the daemon is slow. Together they ensure a timeoutMs (or an aborted AbortSignal) actually bounds the call rather than hanging.
  3. pullPolicy: 'never' + a missing image fails closed. ensureImage honors the pull policy; 'never' makes NO network attempt and throws WorkspaceFailedError when the image is absent. This is the air-gap-safe path — there is no implicit pull.

Fail-closed behavior

When the Docker daemon is unreachable, this provider fails closed: open() (and resolve()) throw WorkspaceFailedError (docker daemon not available (...)) rather than silently running commands without isolation. The daemon probe (engine.ping()) result is cached only on success, so a transient outage does not permanently brick the provider — the next attempt re-probes. The error is flagged transient: true, so the registry retries it with backoff before surfacing.
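
The cache-only-on-success behavior can be sketched as a small wrapper. `cacheOnSuccess` is a hypothetical name; the result shape mirrors the `detect` option's `{ available, reason }` return type shown earlier on this page.

```typescript
type ProbeResult = { available: boolean; reason: string };

/** Wraps a probe so that only a successful result is remembered. */
function cacheOnSuccess(probe: () => Promise<ProbeResult>): () => Promise<ProbeResult> {
  let cached: ProbeResult | undefined;
  return async () => {
    if (cached) return cached;
    const result = await probe();
    // A failed probe is returned but not stored, so the next call re-probes.
    if (result.available) cached = result;
    return result;
  };
}
```

A caller would treat an `available: false` result as the point at which to throw, leaving the wrapper free to recover once the daemon comes back.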

Likewise, pullPolicy: 'never' with a missing image fails closed (gotcha 3) — no implicit network.

Network: off by default

Outbound network is off by default (network: 'off' → NetworkMode: 'none'). Opt in per-workspace:

typescript
workspace: {
  provider: { kind: 'docker', image: 'alpine:3', network: 'allow' }, // → bridge
  capabilities: { fs: true, shell: true },
}

network: 'allow' puts the container on the default bridge network so commands can reach out. Leave it 'off' unless the agent legitimately needs egress.

Resume

The provider uses a recreate-over-persisted-tmpdir resume model. The persisted WorkspaceRef carries everything needed to rebuild a container over the host tmpdir — the image, network policy, and resource limits — but deliberately no containerId (any prior container is gone after a process boundary).

  • open() — probes the daemon (fail-closed if down), ensureImage, creates a fresh per-session host tmpdir, then creates + starts a labelled hardened container over it and returns a serializable ref.
  • resolve() (cold resume / DO hibernation) — validates the ref schema version + payload (Zod safeParse), re-probes the daemon LIVE, validates the persisted host tmpdir, then creates a fresh container around it (files intact via the bind mount). It does NOT reconnect to a prior container.
    • tmpdir gone → WorkspaceEvictedError (the framework re-resolves; the eviction error carries a bare message that does not echo the persisted tmpdir).
    • daemon down → WorkspaceFailedError (fail-closed, same as open()).
  • close() is TERMINAL — it stops + removes the container, then removes the host tmpdir. It is idempotent (cached promise, cleared on rejection so a failed close can retry).

close() does not compose with resume. Suspend / resume does NOT call close() — it re-resolves the persisted ref over the still-present tmpdir. Once close() runs, the tmpdir is gone; a later resolve() of that ref maps to WorkspaceEvictedError. Do not close a workspace you intend to resume.
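
The idempotent close described above (a cached promise that is cleared on rejection) follows a common pattern, sketched here. `idempotentClose` and the `doClose` callback are hypothetical names; in the provider, the callback would stand for stopping and removing the container and then removing the tmpdir.

```typescript
/** Wraps a teardown so concurrent and repeated calls share one attempt. */
function idempotentClose(doClose: () => Promise<void>): () => Promise<void> {
  let inFlight: Promise<void> | undefined;
  return () => {
    if (!inFlight) {
      inFlight = doClose().catch((error: unknown) => {
        // Clear the cache on failure so a later call can retry the teardown.
        inFlight = undefined;
        throw error;
      });
    }
    return inFlight;
  };
}
```

After a successful close the cached promise stays in place, so further calls resolve immediately without touching the container or tmpdir again.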

snapshot future seam

docker commit would let a snapshot() capture the container's writable layer as a new image — a clean future seam. It is documented, not built in v1. The v1 capability set is fs + shell only (parity with Local Sandbox). For snapshots today, use Cloudflare Sandbox.

Usage

A minimal provider construction with a conservative allowlist:

typescript
import { defineAgent } from '@helix-agents/core';
import { JSAgentExecutor } from '@helix-agents/runtime-js';
import { InMemoryStateStore, InMemoryStreamManager } from '@helix-agents/store-memory';
import { DockerWorkspaceProvider } from '@helix-agents/workspace-docker';

const agent = defineAgent({
  name: 'my-agent',
  llmConfig: { model: yourModel },
  workspace: {
    // `image` is required; network defaults to 'off'.
    provider: { kind: 'docker', image: 'alpine:3' },
    capabilities: {
      fs: true,
      // Allowlist (not boolean true) so the shell tool actually runs the
      // permitted read-only commands rather than rejecting every command.
      shell: { allowedCommands: ['echo', 'cat'] },
    },
  },
});

const executor = new JSAgentExecutor(
  new InMemoryStateStore(),
  new InMemoryStreamManager(),
  yourLLMAdapter,
  {
    workspaceProviders: new Map([
      [
        'docker',
        new DockerWorkspaceProvider({
          // Defense-in-depth: same allowlist the agent's capabilities.shell
          // carries, applied to direct ws.shell.run() call sites too.
          shellConstraints: { allowedCommands: ['echo', 'cat'] },
        }),
      ],
    ]),
  }
);

Lifecycle

  • open() runs ping() (fail-closed if the daemon is unreachable) → ensureImage(image, pullPolicy) (fail-closed under 'never' + missing image) → mkdtemp a per-session host tmpdir → create + start a hardened container bind-mounting the tmpdir at /workspace → construct the DockerWorkspace honoring declared capabilities. Returns a serializable ref carrying the image, network policy, and resource limits (no containerId). Compensates (remove container + tmpdir) on any post-create failure.
  • resolve() — validates ref schema version + payload, re-probes the daemon LIVE, validates the persisted host tmpdir (missing → Evicted; invalid → Failed, both with bare messages), then recreates + starts a fresh container over the tmpdir. Capabilities come from ref.capabilities.
  • close() — stops + removes the container, then removes the host tmpdir. Idempotent and TERMINAL (see Resume).
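
The tmpdir validation step in resolve() can be sketched with Node's `fs` API. The two error classes below are local stand-ins for the framework's errors, `validatePersistedTmpdir` is a hypothetical name, and treating "exists but is not a directory" as the invalid case is an assumption; the provider's real checks may be stricter.

```typescript
import { stat } from 'node:fs/promises';

// Local stand-ins for the framework's error types, for illustration only.
class WorkspaceEvictedError extends Error {}
class WorkspaceFailedError extends Error {}

/** Hypothetical check: missing maps to Evicted, invalid maps to Failed. */
async function validatePersistedTmpdir(tmpdir: string): Promise<void> {
  let info;
  try {
    info = await stat(tmpdir);
  } catch (error) {
    // Bare messages: the persisted path is never echoed back to the caller.
    if ((error as { code?: string }).code === 'ENOENT') {
      throw new WorkspaceEvictedError('workspace evicted');
    }
    throw new WorkspaceFailedError('workspace tmpdir invalid');
  }
  if (!info.isDirectory()) {
    throw new WorkspaceFailedError('workspace tmpdir invalid');
  }
}
```

Only after this check passes would a fresh container be created over the directory, which is why files survive a resume while the container itself does not.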

Limitations (v1)

  • No code, script, or snapshot capability. Only fs + shell. For code execution or snapshots, use Cloudflare Sandbox.
  • Requires a Docker daemon — fails closed without one. This is the deliberate contrast with local-bash (which always runs) and the parallel of local-sandbox (which fails closed without a kernel backend).
  • Per-command docker exec latency. Each shell call execs into the long-lived container. For latency-sensitive workloads where a container boundary isn't required, Local Sandbox (host-kernel, no exec round-trip) may fit better.
  • macOS uid remapping (Docker Desktop VirtioFS). Bind-mount ownership can be remapped at the VM boundary — see gotcha 1.

Released under the MIT License.