Building a Provider

This page is for developers writing their own WorkspaceProvider. If you're using one of the four built-in providers, you don't need to read it.

When you'd build your own

The provider you need isn't in the built-in set (e.g., E2B, Modal, Daytona, your own Firecracker host).
You have a proprietary backing store and want to plug it into the workspace abstraction.
You're benchmarking a new platform.

The `WorkspaceProvider<TConfig>` contract

typescript

interface WorkspaceProvider<TConfig = unknown> {
  readonly providerId: string;
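  // Third argument is optional; see "Module construction strategies".
  open(
    config: TConfig,
    session: SessionRef,
    declaredCapabilities?: WorkspaceCapabilityFlags,
  ): Promise<OpenedWorkspace>;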
  resolve(ref: WorkspaceRef): Promise<Workspace>;
}

interface OpenedWorkspace {
  readonly ws: Workspace;
  readonly ref: WorkspaceRef;
}

interface WorkspaceRef {
  readonly providerId: string;
  readonly ref: unknown;  // your serializable payload
  readonly capabilities: WorkspaceCapabilityFlags;
  readonly schemaVersion?: number;  // optional; see "Ref schema versioning"
}

Three required pieces:

providerId — a string the registry uses to find your provider. The discriminator on WorkspaceConfig.provider.kind matches this.
open(config, session) — called the first time the agent uses a workspace tool. Construct the live Workspace object + a serializable WorkspaceRef for crash recovery. Both are returned.
resolve(ref) — called after a runtime boundary (DO hibernation, Temporal replay, executor restart) to reconstruct the workspace from the persisted ref.

The `WorkspaceConfig` discriminator

Your config type MUST have a kind field (string literal type) — the registry uses it to find your provider:

typescript

interface MyProviderConfig {
  readonly kind: 'my-provider';
  // ... your other config fields
}

The user declares it in defineAgent:

typescript

workspaces: {
  box: {
    provider: { kind: 'my-provider', /* ... your fields ... */ },
    capabilities: { fs: true },
  },
},
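
As a rough illustration of why the `kind` literal must match `providerId`, the lookup can be pictured as a plain map access. This is a sketch only: `ProviderLike`, `ConfigLike`, and `findProvider` are hypothetical names, not the registry's real API.

```typescript
// Hypothetical stand-ins for the framework types; only the shapes matter here.
interface ProviderLike {
  readonly providerId: string;
}

interface ConfigLike {
  readonly kind: string;
}

// Look up the provider registered under the config's `kind`.
function findProvider(
  providers: ReadonlyMap<string, ProviderLike>,
  config: ConfigLike,
): ProviderLike {
  const provider = providers.get(config.kind);
  if (!provider) {
    const known = Array.from(providers.keys()).join(', ') || '(none)';
    throw new Error(`No provider registered for kind "${config.kind}"; known kinds: ${known}`);
  }
  return provider;
}

const providers = new Map<string, ProviderLike>([
  ['my-provider', { providerId: 'my-provider' }],
]);

const found = findProvider(providers, { kind: 'my-provider' });
```

A typo in either the `kind` literal or the map key surfaces as the "No provider registered" error in this sketch, which is why keeping the two strings identical matters.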

The lifecycle

1. User declares workspaces in defineAgent({...}).
2. Framework calls executor.execute(agent, ...).
3. Agent's first tool call hits the workspace registry.
4. Registry sees no live workspace; calls provider.open(config, session).
5. You return { ws, ref }. Live ws goes in the registry; ref is persisted.
6. Subsequent tool calls reuse the cached live ws.

[runtime boundary: DO hibernation, replay, restart]

7. Framework calls executor.resume(...) on a fresh runtime.
8. Registry sees no live workspace; calls provider.resolve(ref).
9. You reconstruct the live Workspace from the ref payload + return it.
10. Tool calls resume normally.

[session end]

11. Framework calls ws.close().

Your job: implement steps 5 and 9 (and step 11 if your workspace needs cleanup).
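
The open / cache / resolve sequence above can be simulated with a toy registry. This is a simplified sketch, not the framework's registry: `MiniRegistry`, `makeRecordingProvider`, and the trimmed-down `Ws`, `Ref`, `Session`, and `Provider` types are all hypothetical.

```typescript
// Hypothetical, simplified stand-ins for the framework types on this page.
interface Ws {
  readonly id: string;
}
interface Ref {
  readonly providerId: string;
  readonly ref: unknown;
}
interface Session {
  readonly sessionId: string;
}
interface Provider {
  open(config: unknown, session: Session): Promise<{ ws: Ws; ref: Ref }>;
  resolve(ref: Ref): Promise<Ws>;
}

// Toy registry: caches the live workspace and remembers the persisted ref.
class MiniRegistry {
  private readonly provider: Provider;
  private persistedRef: Ref | undefined;
  private live: Ws | undefined;

  constructor(provider: Provider, persistedRef?: Ref) {
    this.provider = provider;
    this.persistedRef = persistedRef;
  }

  // Reuse the live workspace (step 6), else resolve from a ref (steps 8-9),
  // else open a new one and keep its ref (steps 4-5).
  async get(config: unknown, session: Session): Promise<Ws> {
    if (this.live) return this.live;
    let ws: Ws;
    if (this.persistedRef) {
      ws = await this.provider.resolve(this.persistedRef);
    } else {
      const opened = await this.provider.open(config, session);
      ws = opened.ws;
      this.persistedRef = opened.ref;
    }
    this.live = ws;
    return ws;
  }

  get ref(): Ref | undefined {
    return this.persistedRef;
  }
}

// Provider that records which contract method was called.
function makeRecordingProvider(calls: string[]): Provider {
  return {
    async open(_config, session) {
      calls.push('open');
      const id = session.sessionId;
      return { ws: { id }, ref: { providerId: 'demo', ref: { id } } };
    },
    async resolve(ref) {
      calls.push('resolve');
      return { id: (ref.ref as { id: string }).id };
    },
  };
}

async function runLifecycle(): Promise<string[]> {
  const calls: string[] = [];
  const provider = makeRecordingProvider(calls);
  const session: Session = { sessionId: 's1' };

  const first = new MiniRegistry(provider);
  await first.get({}, session); // calls open()
  await first.get({}, session); // cached: no provider call

  // Runtime boundary: a fresh registry only has the persisted ref.
  const second = new MiniRegistry(provider, first.ref);
  await second.get({}, session); // calls resolve()
  return calls;
}
```

Running `runLifecycle()` records one `open` and one `resolve`: the second `get` on the first registry is served from the cache, and the fresh registry never sees the original config-driven `open` path.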

What goes in the ref payload

Everything resolve() needs to reconstruct the live workspace WITHOUT having the original config or session available. Typical contents:

The workspace's identity (id, namespace, etc.).
Names of bindings to look up at resolve-time (e.g., R2 bucket binding name; do NOT serialize the bucket object itself).
Provider-specific options that affect how the workspace was constructed (workspaceDir, sleepAfter, etc.).

What NOT to put in the ref:

Live objects (sandbox stubs, file handles, sockets). They don't survive serialization.
Secrets. Refs may be persisted to durable storage you don't fully control.
Anything you can re-derive from the runtime context.
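
A minimal sketch of a payload that follows these rules, assuming refs are persisted as JSON (this page does not specify the storage format). `MyRefPayload`, its fields, and `parseRefPayload` are hypothetical names for illustration.

```typescript
// Hypothetical payload for a bucket-backed provider: identifiers and binding NAMES only.
interface MyRefPayload {
  readonly workspaceId: string;
  readonly bucketBindingName: string; // looked up again at resolve-time
  readonly workspaceDir: string;
}

// Narrow an untrusted, deserialized value to MyRefPayload, or throw.
function parseRefPayload(raw: unknown): MyRefPayload {
  if (typeof raw !== 'object' || raw === null) {
    throw new Error('ref payload must be an object');
  }
  const candidate = raw as Record<string, unknown>;
  for (const key of ['workspaceId', 'bucketBindingName', 'workspaceDir']) {
    if (typeof candidate[key] !== 'string') {
      throw new Error(`ref payload field "${key}" must be a string`);
    }
  }
  return {
    workspaceId: candidate['workspaceId'] as string,
    bucketBindingName: candidate['bucketBindingName'] as string,
    workspaceDir: candidate['workspaceDir'] as string,
  };
}

const payload: MyRefPayload = {
  workspaceId: 'ws-123',
  bucketBindingName: 'WORKSPACE_BUCKET',
  workspaceDir: '/workspace',
};

// Simulate persistence: the payload must survive a JSON round trip unchanged.
const restored = parseRefPayload(JSON.parse(JSON.stringify(payload)));
```

Validating the payload inside `resolve()` this way keeps a malformed or outdated ref from turning into an obscure failure later in a module method.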

Per-session state contract — providers MUST be stateless across sessions

⚠️ Read this section before writing your first provider. Failure to honor it is the single biggest source of subtle bugs in custom providers.

A WorkspaceProvider is constructed ONCE at executor / DO boot and reused across MANY sessions over its lifetime. The same provider instance services every session that lands on that process — there is no per-session provider instance.

This means: anything you store on `this` in your provider class will leak across sessions. A session-A write becomes a session-B read, with no isolation.

What goes where

State kind	Where to put it
Per-session tmpdir paths, sandbox container IDs, R2 namespace prefixes	Inside the `Workspace` returned by `open()` (closure-captured)
Per-session file caches, in-memory state maps	Inside the `Workspace`
Per-session cleanup state needed by `close()`	Inside the `Workspace`
Shared infrastructure handles (DO bindings, R2 bindings)	On the provider instance — these are process-wide
Shared config (logger, providerId, region)	On the provider instance

Bad pattern (cross-session leak)

typescript

class BadProvider implements WorkspaceProvider {
  readonly providerId = 'bad';
  // BAD: instance state retained across sessions
  private files = new Map<string, Uint8Array>();

  async open(_config, _session): Promise<OpenedWorkspace> {
    return {
      ws: {
        id: 'ws-bad',
        fs: {
          readFile: async (p) => this.files.get(p)!,           // sees other sessions' data!
          writeFile: async (p, d) => { this.files.set(p, d); }, // visible to other sessions!
          // ...
        },
        close: async () => {},
      },
      ref: { providerId: 'bad', ref: {}, capabilities: { fs: true } },
    };
  }
  async resolve() { /* ... */ }
}

Good pattern (per-session closure)

typescript

class GoodProvider implements WorkspaceProvider {
  readonly providerId = 'good';
  // OK: process-wide shared handles only
  constructor(private readonly logger: Logger) {}

  async open(_config, session): Promise<OpenedWorkspace> {
    // Per-session state captured in the Workspace closure, NOT on `this`.
    const files = new Map<string, Uint8Array>();
    return {
      ws: {
        id: `ws-${session.sessionId}`,
        fs: {
          readFile: async (p) => files.get(p)!,
          writeFile: async (p, d) => { files.set(p, d); },
          // ...
        },
        close: async () => { files.clear(); },
      },
      ref: { providerId: 'good', ref: { sessionId: session.sessionId }, capabilities: { fs: true } },
    };
  }
  async resolve() { /* ... */ }
}

The provider.test.ts suite includes a regression test for this contract — see provider.test.ts:C10.

Module construction strategies

open() accepts an OPTIONAL third arg: the agent's declared WorkspaceCapabilityFlags. You can use it (or ignore it) — both behaviors are valid.

Strategy A: always construct everything (back-compat default)

Simplest pattern; the third arg is ignored. The provider constructs every module it can support, regardless of what the agent declared. The framework's tool-injection layer wires only the declared capabilities, so unused modules are inert (allocated but never called).

typescript

async open(config, session) {
  const ws = new MyWorkspace({
    fs: new MyFs(/* ... */),
    shell: new MyShell(/* ... */),
  });
  return { ws, ref };
}

Use this strategy when modules are cheap to construct and you want simple, predictable code.

Strategy B: skip unused modules (D3 round-4)

When a module's constructor does meaningful work (allocates pools, opens sockets, primes caches), use the declaredCapabilities arg to skip construction for unused modules. The built-in cloudflare-sandbox provider uses this strategy as of D3.

typescript

async open(config, session, declaredCapabilities) {
  // declaredCapabilities is undefined for back-compat callers — fall back to "build everything".
  const wantFs = declaredCapabilities ? Boolean(declaredCapabilities.fs) : true;
  const wantShell = declaredCapabilities ? Boolean(declaredCapabilities.shell) : true;
  const ws = new MyWorkspace({
    fs: wantFs ? new MyFs(/* ... */) : undefined,
    shell: wantShell ? new MyShell(/* ... */) : undefined,
  });
  const ref: WorkspaceRef = {
    providerId: this.providerId,
    ref: { /* payload */ },
    // CRITICAL: ref.capabilities must match what you actually built.
    capabilities: { fs: wantFs, shell: wantShell },
    schemaVersion: 2,
  };
  return { ws, ref };
}

Either strategy passes the registry's invariant assertion (declared ⊆ populated). The registry's check is the single source of truth — it runs regardless of which strategy you picked.

`WorkspaceCapabilityFlags` advertisement on the ref

You MUST set capabilities on the returned WorkspaceRef to match the modules your open() actually populated:

typescript

const ref: WorkspaceRef = {
  providerId: this.providerId,
  ref: { /* your payload */ },
  capabilities: { fs: true, shell: true },  // what your live ws actually supports
};

This is AUTHORITATIVE. The registry asserts at both open() and resolve() time that:

Every capability declared in the agent's WorkspaceConfig.capabilities is also truthy on WorkspaceRef.capabilities (the ref must be a superset of the declaration), AND
Each declared module is non-undefined on the returned Workspace.

If a user declares a capability your provider doesn't support, the registry throws WorkspaceFailedError at session start (NOT at LLM tool-call time). Tool injection still reads WorkspaceConfig.capabilities; the ref's capabilities are the provider-side guarantee that the wired tools will find their module on the live Workspace.
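
The two assertions can be pictured as one pass over the declared flags. The sketch below is an illustration of the rule, not the registry's code; `CapabilityFlags`, `WorkspaceLike`, and `findMissingCapabilities` are hypothetical local names, and the four keys are the built-in capabilities named on this page.

```typescript
// Hypothetical local stand-ins; the real types live in the framework.
type CapabilityFlags = { fs?: boolean; shell?: boolean; code?: boolean; snapshot?: boolean };
type CapabilityKey = keyof CapabilityFlags;
type WorkspaceLike = Partial<Record<CapabilityKey, unknown>>;

const CAPABILITY_KEYS: readonly CapabilityKey[] = ['fs', 'shell', 'code', 'snapshot'];

// Return every declared capability that the ref does not advertise
// or that has no module on the live workspace.
function findMissingCapabilities(
  declared: CapabilityFlags,
  advertised: CapabilityFlags,
  ws: WorkspaceLike,
): CapabilityKey[] {
  return CAPABILITY_KEYS.filter(
    (key) => Boolean(declared[key]) && (!advertised[key] || ws[key] === undefined),
  );
}

// Declared is a subset of what was built: nothing is missing.
const okCase = findMissingCapabilities(
  { fs: true },
  { fs: true, shell: true },
  { fs: {}, shell: {} },
);

// The agent declared shell, but the provider only built fs.
const badCase = findMissingCapabilities(
  { fs: true, shell: true },
  { fs: true },
  { fs: {} },
);
```

In this sketch a non-empty result corresponds to the situation in which the registry would throw WorkspaceFailedError.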

Ref schema versioning (D4 round-4)

WorkspaceRef carries an optional schemaVersion: number field. Persisted refs may live across deployments; the version field is the contract that lets a deploy of N safely consume refs from N-1 (and vice versa for rollbacks).

The N±1 contract:

Each provider declares a CURRENT version N (as of D4 round-4, all built-in providers are at N = 2).
Every ref produced by open() MUST stamp schemaVersion: N.
resolve() MUST accept refs with schemaVersion:
- undefined (legacy / pre-D4 refs)
- N - 1 (one back; back-compat for in-flight rollouts and rollbacks)
- N (current)

Anything else throws WorkspaceFailedError with a message naming the unsupported version + the supported set.

The framework provides a helper:

typescript

import { assertRefSchemaVersionSupported } from '@helix-agents/core';

async resolve(ref) {
  if (ref.providerId !== this.providerId) { /* ... */ }
  assertRefSchemaVersionSupported(ref.schemaVersion, this.providerId, this.logger);
  // ... your normal payload validation
}
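
For intuition, the acceptance rule (undefined, N - 1, or N) can be written out by hand. This is a sketch of the rule only, not the implementation of `assertRefSchemaVersionSupported`; the constant and function names are hypothetical, and N = 2 is taken from the statement above about built-in providers.

```typescript
// Hypothetical constants; this page states built-in providers are at N = 2.
const CURRENT_SCHEMA_VERSION = 2;
const PREVIOUS_SCHEMA_VERSION = CURRENT_SCHEMA_VERSION - 1;

// True for undefined (legacy), N - 1, and N; false for everything else.
function isSupportedSchemaVersion(version: number | undefined): boolean {
  return (
    version === undefined ||
    version === PREVIOUS_SCHEMA_VERSION ||
    version === CURRENT_SCHEMA_VERSION
  );
}

// Throw with a message naming the unsupported version and the supported set.
function assertSupportedSchemaVersion(version: number | undefined, providerId: string): void {
  if (!isSupportedSchemaVersion(version)) {
    throw new Error(
      `${providerId}: unsupported ref schemaVersion ${version}; ` +
        `supported: undefined, ${PREVIOUS_SCHEMA_VERSION}, ${CURRENT_SCHEMA_VERSION}`,
    );
  }
}
```

A real provider would throw WorkspaceFailedError rather than a plain Error; the plain Error here only keeps the sketch self-contained.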

When a future schema change requires bumping to N+1:

Update the constants in core/workspace/utils/ref-schema-version.ts (CURRENT becomes N+1; PREVIOUS becomes N).
Stamp schemaVersion: N+1 on new refs in every built-in provider.
The N±1 window gives one deploy's worth of forward/back compat. Two-step migrations (N → N+2) require a stop on N+1 first to ensure rollback safety.

The framework calls logger.info with 'workspace ref: migrating ref from vX to vY' when an explicit lower-than-current version comes through (operator forensics for rollouts).

Capability auto-injection extension point — known limitation (D7 round-4)

The auto-injection logic in core/workspace/tool-injection.ts is hard-coded for the four built-in capabilities (fs, shell, code, snapshot). A custom provider that wants to expose a NEW capability — say git (clone/pull/push tools) or network (proxied HTTP fetch) — has no extension point today. Adding a new capability requires:

Adding the capability key to WorkspaceCapabilityFlags in core/workspace/types/config.ts.
Adding a make<Capability>Tools(name, caps) factory in core/workspace/tool-injection.ts.
Wiring the new factory into injectWorkspaceTools()'s if (caps.<key>) chain.
Releasing core.

This is intentional for v1 — the capability surface is curated to keep tool naming + LLM behavior consistent across providers. Future versions may introduce a WorkspaceCapabilityInjector extension point that lets providers register their own tool factories. Until then, file an issue if you have a use case for a new capability and we'll evaluate adding it to the built-in set.

Error model

Three error types you need to know:

`WorkspaceFailedError` — from `open()` / `resolve()`

Throw this when the workspace cannot be created or reconstructed. The registry transitions the entry to 'failed' state — subsequent tool calls fail fast with the same error.

typescript

import { WorkspaceFailedError } from '@helix-agents/core';

async open(config, session) {
  const result = await this.connectToBackend();
  if (!result.ok) {
    throw new WorkspaceFailedError(`Backend unavailable: ${result.error}`, {
      workspaceName: 'whatever',
      cause: result.cause,
    });
  }
  // ... happy path
}

Transient vs permanent errors (round-4 cluster C)

WorkspaceFailedError accepts a transient: true option. When set, the registry retries the open/resolve call with exponential backoff before transitioning the entry to 'failed':

typescript

// Known-transient cause: R2 timeout, container scheduling failure,
// network blip. Set transient: true so the registry retries.
throw new WorkspaceFailedError(`R2 read timed out after 30s`, {
  workspaceName: name,
  transient: true,
  cause: err,
});

// Permanent cause: capability mismatch, auth failure, config error.
// DO NOT set transient — retries cannot fix it.
throw new WorkspaceFailedError(`Workspace config has no R2 binding`, {
  workspaceName: name,
});

Auto-classification is unsafe — only the provider knows when an error is recoverable. Default is transient: false (no retry). Opt in per-throw for known-transient causes. The registry retries up to transientRetryAttempts times (default 3) with backoff capped at ~10s total.
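
To make "exponential backoff capped at a total" concrete, here is one way such a schedule could be computed. The registry's actual delays are not documented on this page; the 1s base delay and the function name `backoffDelaysMs` are assumptions for illustration.

```typescript
// Compute retry delays that double each attempt while keeping their sum within totalCapMs.
function backoffDelaysMs(attempts: number, baseMs: number, totalCapMs: number): number[] {
  const delays: number[] = [];
  let remaining = totalCapMs;
  for (let i = 0; i < attempts && remaining > 0; i++) {
    // Never schedule more wait time than the remaining budget allows.
    const delay = Math.min(baseMs * Math.pow(2, i), remaining);
    delays.push(delay);
    remaining -= delay;
  }
  return delays;
}

// Three attempts (the documented default) with an assumed 1s base and a 10s total budget.
const schedule = backoffDelaysMs(3, 1000, 10000); // [1000, 2000, 4000]

// More attempts than the budget allows: the last delay is truncated to fit.
const longer = backoffDelaysMs(5, 1000, 10000); // [1000, 2000, 4000, 3000]
```

The practical consequence for provider authors: a `transient: true` error delays failure by several seconds at most, so it is appropriate for blips but not for outages that need minutes to clear.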

`WorkspaceEvictedError` — from MODULE methods

Throw this from module method implementations (not from open / resolve!) when the underlying resource has been evicted and the framework should re-resolve via resolve(ref).

typescript

import { WorkspaceEvictedError } from '@helix-agents/core';

async readFile(path) {
  try {
    return await this.backend.readFile(path);
  } catch (err) {
    if (isEvictedError(err)) {
      throw new WorkspaceEvictedError(`Backend evicted`, { workspaceName: this.id });
    }
    throw err;
  }
}

The framework's withEvictionRetry (in tool-injection.ts) catches this and marks the registry entry as 'evicted'; the next tool call then invokes provider.resolve(ref) to reattach. Useful for sandboxes that auto-evict after idle, tmpdirs that get cleaned, etc.

Don't throw WorkspaceEvictedError from open() or resolve() — the registry can't handle it cleanly there. Use WorkspaceFailedError instead.
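
The `isEvictedError` predicate used in the readFile example is not defined on this page; it is whatever test fits your backend's errors. A hypothetical version, assuming the backend attaches a string `code` to its errors (both the `BackendError` shape and the codes are invented for illustration):

```typescript
// Hypothetical backend error shape: a `code` property naming the failure.
interface BackendError extends Error {
  readonly code?: string;
}

// Hypothetical codes a backend might use when the resource was reclaimed.
const EVICTION_CODES: readonly string[] = ['SANDBOX_EVICTED', 'CONTAINER_GONE'];

// True only for Error instances carrying one of the known eviction codes.
function isEvictedError(err: unknown): boolean {
  if (!(err instanceof Error)) return false;
  const code = (err as BackendError).code;
  return typeof code === 'string' && EVICTION_CODES.indexOf(code) !== -1;
}

const evicted = Object.assign(new Error('sandbox reclaimed'), { code: 'SANDBOX_EVICTED' });
const unrelated = new Error('permission denied');
```

Keeping the predicate narrow matters: an error that is wrongly classified as an eviction triggers a re-resolve that cannot fix it, while a missed eviction surfaces to the LLM as an ordinary tool error.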

Regular `Error` — from MODULE methods

Anything else propagates as a tool-error message to the LLM. The LLM sees the error message and can decide whether to retry, switch approaches, or surface it to the user. Use plain Error (or a subclass) for "the operation failed but the workspace itself is fine."

Testing patterns

Structural test doubles, not `implements`

Don't declare your test fake with implements ISandbox (or whatever the upstream interface is). That forces you to fill in every method, even ones you don't use. Instead, build a test double that covers only the methods your adapter calls and cast it via as unknown as TSomeInterface:

typescript

// In your test:
const fake = new FakeBackend();  // not `implements TBackend`
const provider = new MyProvider({ backend: fake as unknown as TBackend });

The cast is local, explicit, and only applies at the boundary. If your adapter starts using a new method, the test fails with a clear "method not implemented" error from the fake, prompting you to add it.
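
One way to get that "method not implemented" behavior is to wrap the fake in a Proxy. The sketch below is an assumption about how a fake could be built, not how FakeSandbox works; `TBackend`, `FakeBackend`, and `strictFake` are hypothetical, and the methods are synchronous only to keep the example short.

```typescript
// Hypothetical upstream interface; the adapter under test only calls readFile.
interface TBackend {
  readFile(path: string): string;
  writeFile(path: string, data: string): void;
  exec(command: string): string;
}

// Fake that implements only what the adapter calls today (no `implements TBackend`).
class FakeBackend {
  private readonly files = new Map<string, string>([['/hello.txt', 'hi']]);

  readFile(path: string): string {
    const data = this.files.get(path);
    if (data === undefined) throw new Error(`FakeBackend: no such file: ${path}`);
    return data;
  }
}

// Wrap a fake so that calling any method it lacks throws a descriptive error.
function strictFake<T>(fake: object): T {
  return new Proxy(fake, {
    get(target, prop, receiver) {
      // Pass through symbols and anything the fake actually defines.
      if (typeof prop === 'symbol' || prop in target) {
        return Reflect.get(target, prop, receiver);
      }
      return () => {
        throw new Error(`FakeBackend: method not implemented: ${prop}`);
      };
    },
  }) as unknown as T;
}

const backend = strictFake<TBackend>(new FakeBackend());
const contents = backend.readFile('/hello.txt'); // served by the fake
```

Calling `backend.exec('ls')` in this sketch throws "FakeBackend: method not implemented: exec", which points directly at the method the fake needs next.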

Reference: `FakeSandbox` from runtime-cloudflare

The @helix-agents/runtime-cloudflare/testing subpath exports FakeSandbox, an in-memory ISandbox subset used by CloudflareSandboxWorkspaceProvider's tests. It's a good worked example — covers fs (Map-backed), exec/code (canned responses), backups (Map-backed). About 600 lines.

Worked example: `MyProvider`

A minimal provider wrapping a Map-backed filesystem. Demonstrates the full contract.

typescript

// my-provider.ts
import type {
  OpenedWorkspace,
  SessionRef,
  Workspace,
  WorkspaceProvider,
  WorkspaceRef,
  WorkspaceId,
  FileSystem,
  FileEntry,
  FileStat,
  GrepOptions,
  GrepResult,
} from '@helix-agents/core';

// 1. Config type with discriminator.
export interface MyProviderConfig {
  readonly kind: 'my-provider';
  /** Optional: scope for naming inside your backend. */
  readonly namespace?: string;
}

// 2. The fs adapter.
class MyFileSystem implements FileSystem {
  constructor(private readonly files: Map<string, Uint8Array>) {}

  async readFile(path: string): Promise<Uint8Array> {
    const bytes = this.files.get(path);
    if (!bytes) throw new Error(`MyFileSystem: file not found: ${path}`);
    return bytes;
  }

  async writeFile(path: string, data: Uint8Array | string): Promise<void> {
    const bytes = typeof data === 'string' ? new TextEncoder().encode(data) : data;
    this.files.set(path, bytes);
  }

  async stat(path: string): Promise<FileStat> {
    const bytes = this.files.get(path);
    if (!bytes) throw new Error(`MyFileSystem: not found: ${path}`);
    return { path, type: 'file', size: bytes.length };
  }

  async ls(path: string): Promise<FileEntry[]> {
    const prefix = path.endsWith('/') ? path : path + '/';
    return Array.from(this.files.keys())
      .filter((k) => k.startsWith(prefix))
      .map((k) => ({
        name: k.slice(prefix.length).split('/')[0],
        path: k,
        type: 'file' as const,
        size: this.files.get(k)!.length,
      }));
  }

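  async glob(pattern: string): Promise<string[]> {
    // Escape regex metacharacters so "." etc. match literally, then expand "*".
    const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, '\\$&').replace(/\*/g, '.*');
    // Anchor the expression so the whole path must match the pattern.
    const re = new RegExp(`^${escaped}$`);
    return Array.from(this.files.keys()).filter((k) => re.test(k));
  }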

  async grep(pattern: string, opts?: GrepOptions): Promise<GrepResult[]> {
    const re = new RegExp(pattern, opts?.ignoreCase ? 'i' : '');
    const decoder = new TextDecoder();
    const out: GrepResult[] = [];
    for (const [path, bytes] of this.files) {
      if (opts?.path && !path.startsWith(opts.path)) continue;
      const lines = decoder.decode(bytes).split('\n');
      for (let i = 0; i < lines.length; i++) {
        if (re.test(lines[i])) {
          out.push({ path, lineNumber: i + 1, line: lines[i] });
          if (opts?.maxResults && out.length >= opts.maxResults) return out;
        }
      }
    }
    return out;
  }

  async rm(path: string): Promise<void> {
    if (!this.files.delete(path)) throw new Error(`MyFileSystem: not found: ${path}`);
  }

  async mkdir(): Promise<void> {
    // Implicit — directories aren't tracked separately in this toy impl.
  }
}

// 3. The Workspace aggregator.
class MyWorkspace implements Workspace {
  readonly id: WorkspaceId;
  readonly fs: FileSystem;

  constructor(id: string, fs: FileSystem) {
    this.id = id as WorkspaceId;
    this.fs = fs;
  }

  async close(): Promise<void> {
    // No-op — Map garbage-collects when references drop.
  }
}

// 4. The provider.
export class MyProvider implements WorkspaceProvider<MyProviderConfig> {
  readonly providerId = 'my-provider';

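  // Stands in for an external backing store: process-wide and keyed by namespace,
  // so resolve() can reattach. Isolation comes from the namespace key (sessionId by default).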
  private static stores = new Map<string, Map<string, Uint8Array>>();

  async open(config: MyProviderConfig, session: SessionRef): Promise<OpenedWorkspace> {
    const namespace = config.namespace ?? session.sessionId;
    let store = MyProvider.stores.get(namespace);
    if (!store) {
      store = new Map();
      MyProvider.stores.set(namespace, store);
    }
    const fs = new MyFileSystem(store);
    const ws = new MyWorkspace(namespace, fs);
    const ref: WorkspaceRef = {
      providerId: this.providerId,
      ref: { namespace },
      capabilities: { fs: true },
      schemaVersion: 2, // current N per the ref schema versioning contract
    };
    return { ws, ref };
  }

  async resolve(ref: WorkspaceRef): Promise<Workspace> {
    if (ref.providerId !== this.providerId) {
      throw new Error(`MyProvider: refusing to resolve foreign provider ref`);
    }
    const payload = ref.ref as { namespace?: string } | undefined;
    if (!payload?.namespace) {
      throw new Error(`MyProvider: ref payload missing namespace`);
    }
    let store = MyProvider.stores.get(payload.namespace);
    if (!store) {
      // Could throw here if you want to fail; or auto-create as we do.
      store = new Map();
      MyProvider.stores.set(payload.namespace, store);
    }
    const fs = new MyFileSystem(store);
    return new MyWorkspace(payload.namespace, fs);
  }
}

Wire it like any other provider:

typescript

const executor = new JSAgentExecutor(/* ... */, {
  workspaceProviders: new Map([
    ['my-provider', new MyProvider()],
  ]),
});

Reference: existing providers

Read these for full real-world examples:

@helix-agents/workspace-memory — simplest provider. fs only. ~150 lines.
@helix-agents/workspace-local-bash — POSIX tmpdir + subprocess shell. ~600 lines.
runtime-cloudflare/src/workspaces/filestore — Cloudflare DO SQLite filestore. ~400 lines.
runtime-cloudflare/src/workspaces/sandbox — full Linux container with all 4 modules. ~1500 lines including tests.

Source references

Provider contract: packages/core/src/workspace/types/provider.ts
Workspace + module interfaces: packages/core/src/workspace/types/
Registry semantics: packages/core/src/workspace/registry.ts
Tool injection + withEvictionRetry: packages/core/src/workspace/tool-injection.ts
Error types: packages/core/src/workspace/errors.ts
