Sub-Agent Orchestration

Sub-agents enable hierarchical agent systems where a parent agent can delegate specialized tasks to child agents. This is powerful for complex workflows that benefit from separation of concerns.

What Are Sub-Agents?

A sub-agent is an agent invoked by another agent as if it were a tool. The parent agent decides when to delegate, the sub-agent executes to completion, and the result flows back to the parent.

mermaid
graph TB
    Parent["Parent Agent"]
    Parent --> Tools["Uses regular tools"]
    Parent --> SubAgent["Delegates to Sub-Agent"]
    SubAgent --> Run["Sub-agent runs to completion"]
    SubAgent --> Result["Result returned to parent"]

Creating Sub-Agent Tools

Use createSubAgentTool() to turn an agent into a tool:

typescript
import { defineAgent, createSubAgentTool } from '@helix-agents/sdk';
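import { z } from 'zod';
// Assumption: `openai` is the Vercel AI SDK provider; swap in your own model provider if different.
import { openai } from '@ai-sdk/openai';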

// Define a specialized agent
const AnalyzerAgent = defineAgent({
  name: 'text-analyzer',
  description: 'Analyzes text for sentiment and topics',
  systemPrompt: 'You analyze text. Determine sentiment and extract key topics.',
  outputSchema: z.object({
    sentiment: z.enum(['positive', 'negative', 'neutral']),
    confidence: z.number(),
    topics: z.array(z.string()),
  }),
  llmConfig: { model: openai('gpt-4o-mini') },
});

// Create a tool that invokes this agent
const analyzeTool = createSubAgentTool(
  AnalyzerAgent,
  z.object({
    text: z.string().describe('Text to analyze'),
  }),
  {
    description: 'Analyze text for sentiment and key topics',
    timeoutMs: 60_000, // Optional: per-tool timeout in ms
  }
);

// Use in parent agent
const OrchestratorAgent = defineAgent({
  name: 'orchestrator',
  systemPrompt: 'You coordinate research. Use the analyzer for sentiment analysis.',
  tools: [searchTool, analyzeTool], // Mix regular tools and sub-agents
  llmConfig: { model: openai('gpt-4o') },
});

Requirements

Sub-agents must have outputSchema:

typescript
// ✓ Valid sub-agent - has outputSchema
const ValidSubAgent = defineAgent({
  name: 'valid',
  outputSchema: z.object({ result: z.string() }), // Required!
  // ...
});

// ✗ Invalid sub-agent - no outputSchema
const InvalidSubAgent = defineAgent({
  name: 'invalid',
  // No outputSchema - will throw error when used as sub-agent
  // ...
});

The outputSchema defines the contract between parent and child - what the parent receives as the tool result.

Cloudflare Durable Objects Runtime

In the DO runtime, createSubAgentTool() works transparently — no changes to your agent definitions are needed. Internally, each sub-agent is routed to a sibling DO instance with its own isolated SQLite state. You only need to add subAgentNamespace to your createAgentServer config, and register all sub-agents in the same AgentRegistry. See Sub-Agents in the DO Runtime for setup details.

HITL cascade: when a sub-agent suspends (v7)

If a sub-agent calls a client-executed tool or hits an approval gate, the v7 stateless-suspension model cascades the suspension up to the parent:

  1. The child's run loop yields RunOutcome.suspended_client_tool and persists its pending entries to its own SessionState.pendingClientToolCalls.
  2. The parent's run loop observes that a child is incomplete and yields RunOutcome.suspended_awaiting_children with a SuspendedChildWait[] payload listing each pending child.
  3. The parent state writes suspendedAwaitingChildren to its suspensionContext durably.
  4. The root-session clientToolCallOwnership map records that toolCallId is owned by the child's session — the framework persists this mapping atomically with the child's pending-entry write.

When the client submits a result, it always submits against the root sessionId (not the sub-agent's). The framework looks up clientToolCallOwnership[toolCallId] to find the owning sub-agent session, writes the response into that child's pendingClientToolCalls, and the next executor.resume({ sessionId: rootSessionId }) cascades down: the child resumes and runs to completion, then the parent's suspendedAwaitingChildren clears and the parent resumes.

This means consumers never need to know about sub-agent sessions when submitting — they always submit against the root. The cascade is internal to the runtime.
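
The submit path described above can be pictured with a small in-memory model. This is only a sketch under assumptions: the `SessionRecord` shape, the `submitClientToolResult` helper, and the plain `Map` are hypothetical stand-ins for the framework's real persistence, which the text does not show. Only the field names `clientToolCallOwnership` and `pendingClientToolCalls` come from the description, and the fallback to the root session when no owner is recorded is a guess.

```typescript
// Hypothetical in-memory model of the root-session ownership lookup.
// Only clientToolCallOwnership and pendingClientToolCalls are names from the text.
interface PendingClientToolCall {
  toolCallId: string;
  result?: unknown; // filled in once the client submits
}

interface SessionRecord {
  sessionId: string;
  pendingClientToolCalls: PendingClientToolCall[];
  // Only meaningful on the root session: toolCallId -> owning session id.
  clientToolCallOwnership: Record<string, string>;
}

function submitClientToolResult(
  sessions: Map<string, SessionRecord>,
  rootSessionId: string,
  toolCallId: string,
  result: unknown
): string {
  const root = sessions.get(rootSessionId);
  if (!root) throw new Error(`Unknown root session: ${rootSessionId}`);

  // Assumption: when no sub-agent owns the call, the root session itself does.
  const ownerId = root.clientToolCallOwnership[toolCallId] ?? rootSessionId;
  const owner = sessions.get(ownerId);
  if (!owner) throw new Error(`Unknown owning session: ${ownerId}`);

  const pending = owner.pendingClientToolCalls.find((p) => p.toolCallId === toolCallId);
  if (!pending) throw new Error(`No pending client tool call: ${toolCallId}`);

  // Write the response into the owner's pending entry; a later resume on the
  // root session would then cascade down to this owner.
  pending.result = result;
  return ownerId;
}
```

The caller only ever passes the root session ID; the owning child is resolved inside the helper, which mirrors why consumers do not need to track sub-agent sessions.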

Persistent sub-agents on Cloudflare Durable Objects

As of v7.0 (commit fb3180f6b), the Cloudflare DO runtime supports persistent sub-agents via DO-stub dispatch. Each persistent child runs in its own DO instance addressed by stable sessionId ({parent}-agent-{name}); the parent's auto-injected companion__* tools (five always, plus companion__waitForResult when a child is mode: 'blocking') translate into subAgentNamespace stub calls against the existing sub-agent endpoints. Ephemeral sub-agents (createSubAgentTool) work fully on every runtime that supports HITL.

How Sub-Agent Execution Works

When the parent LLM calls a sub-agent tool:

  1. Tool Call Detection - Framework identifies the tool as a sub-agent (prefixed with subagent__)
  2. Sub-Agent Initialization - New agent run created with:
    • Fresh runId (derived from parent's ID)
    • Same streamId as parent (for unified streaming)
    • Input converted to user message
  3. Execution - Sub-agent runs its full loop until completion
  4. Result Return - Sub-agent's output becomes the tool result
typescript
// Parent LLM calls:
{
  name: 'subagent__text-analyzer',
  arguments: { text: 'This product is amazing!' }
}

// Framework:
// 1. Creates sub-agent run
// 2. Converts input to a user message: '{"text":"This product is amazing!"}'
// 3. Runs AnalyzerAgent loop
// 4. Returns output: { sentiment: 'positive', confidence: 0.95, topics: ['product'] }

Input Mapping

The input schema defines what arguments the parent provides. These are converted to a user message for the sub-agent:

typescript
const subAgentTool = createSubAgentTool(
  SubAgent,
  z.object({
    query: z.string(), // Field names are up to you; the whole object is serialized
    context: z.string().optional(),
  })
);

// When called with { query: "analyze this", context: "..." }
// Sub-agent receives user message: '{"query":"analyze this","context":"..."}'

The full tool input is JSON-serialized and sent as the user message. The remote agent server is responsible for parsing the JSON and constructing the appropriate context for its agent.
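
As a rough illustration of the serialization described above, the mapping amounts to a single `JSON.stringify` call. The `UserMessage` type and the `toSubAgentUserMessage` helper are hypothetical names used only for this sketch; the framework's internal function is not shown in this guide.

```typescript
// Hypothetical helper mirroring the behaviour described above: the whole
// tool input object becomes one JSON string used as the user message.
interface UserMessage {
  role: 'user';
  content: string;
}

function toSubAgentUserMessage(input: Record<string, unknown>): UserMessage {
  return { role: 'user', content: JSON.stringify(input) };
}

const message = toSubAgentUserMessage({ query: 'analyze this', context: 'release notes' });
console.log(message.content);
// prints {"query":"analyze this","context":"release notes"}
```

Because `JSON.stringify` drops properties whose value is `undefined`, an omitted optional field such as `context` simply does not appear in the message.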

Streaming Integration

Sub-agent events stream alongside parent events on the same stream:

typescript
for await (const chunk of stream) {
  switch (chunk.type) {
    case 'text_delta':
      // Could be from parent or sub-agent
      console.log(`[${chunk.agentType}]`, chunk.delta);
      break;

    case 'subagent_start':
      console.log(`Starting sub-agent: ${chunk.subAgentType}`);
      break;

    case 'subagent_end':
      console.log(`Sub-agent ${chunk.subAgentType} result:`, chunk.result);
      break;

    case 'tool_start':
      // Includes sub-agent tool calls from within sub-agents
      console.log(`[${chunk.agentType}] Tool: ${chunk.toolName}`);
      break;
  }
}

Sub-agent streaming events:

  • tool_start for the parent's subagent__<name> tool call
  • subagent_start when the sub-agent begins
  • All proxied sub-agent chunks (text_delta, tool_start/tool_end for inner tools, etc.)
  • subagent_end when the sub-agent completes
  • tool_end for the parent's subagent__<name> tool call

The tool_start/tool_end pair on the parent is what closes the parent's dynamic-tool UI part for AI SDK consumers (transitions 'input-available' → 'output-available'). See Sub-agent chunk ordering for the full semantics, including failure-path behavior and the subagent_end → tool_end ordering invariant.

This enables real-time visibility into nested execution.
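
When testing a consumer, it can help to assert the lifecycle order listed above on a recorded stream. The following is a minimal sketch, assuming a simplified `Chunk` shape and a recording that contains a single sub-agent call; `hasExpectedSubAgentOrder` is a hypothetical helper, not part of the SDK.

```typescript
// Hypothetical, simplified chunk shape; real chunks carry more fields.
type ChunkType = 'tool_start' | 'subagent_start' | 'text_delta' | 'subagent_end' | 'tool_end';

interface Chunk {
  type: ChunkType;
  toolName?: string; // set on tool_start / tool_end
}

// Returns true when the four lifecycle chunks for one sub-agent tool call
// appear in the documented order:
// tool_start -> subagent_start -> subagent_end -> tool_end.
// Assumes the recording contains a single sub-agent call.
function hasExpectedSubAgentOrder(chunks: Chunk[], toolName: string): boolean {
  const toolStart = chunks.findIndex((c) => c.type === 'tool_start' && c.toolName === toolName);
  const subStart = chunks.findIndex((c) => c.type === 'subagent_start');
  const subEnd = chunks.findIndex((c) => c.type === 'subagent_end');
  const toolEnd = chunks.findIndex((c) => c.type === 'tool_end' && c.toolName === toolName);
  return toolStart >= 0 && toolStart < subStart && subStart < subEnd && subEnd < toolEnd;
}
```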

State Isolation

Sub-agents have completely isolated state:

typescript
const ParentAgent = defineAgent({
  name: 'parent',
  stateSchema: z.object({
    parentCounter: z.number().default(0),
  }),
  // ...
});

const SubAgent = defineAgent({
  name: 'child',
  stateSchema: z.object({
    childCounter: z.number().default(0), // Separate from parent
  }),
  // ...
});

Key points:

  • Sub-agent cannot read parent's custom state
  • Parent cannot read sub-agent's custom state
  • Each has its own messages, stepCount, etc.
  • Sub-agent output is the only communication channel

To share data, pass it through the input and receive it in the output.

Error Handling

Sub-Agent Failures

If a sub-agent fails, the error becomes the tool result:

typescript
// Sub-agent throws error
throw new Error('Analysis failed: text too short');

// Parent receives tool result:
{
  success: false,
  error: 'Analysis failed: text too short'
}

The parent LLM sees the error and can decide how to proceed (retry, try different approach, etc.).
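
One way to picture this conversion is a wrapper that turns a thrown error into a result envelope. This is a sketch under assumptions: `SubAgentToolResult` and `toToolResult` are hypothetical names, and the success shape `{ success: true, result }` is inferred from the process_analysis example in this section rather than from a documented type.

```typescript
// Hypothetical envelope type; the failure shape matches the example above,
// the success shape is inferred from the process_analysis example.
type SubAgentToolResult<T> =
  | { success: true; result: T }
  | { success: false; error: string };

// Runs a piece of work and converts a thrown error into a failure envelope
// instead of letting it propagate to the caller.
function toToolResult<T>(run: () => T): SubAgentToolResult<T> {
  try {
    return { success: true, result: run() };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { success: false, error: message };
  }
}
```

Modelling the envelope as a discriminated union lets TypeScript narrow on `success`, so code reading `result` has to handle the failure branch first.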

Handling Errors

Check for failures in the parent's tools or logic:

typescript
const processResultTool = defineTool({
  name: 'process_analysis',
  execute: async (input, context) => {
    // The sub-agent result may have succeeded or failed
    if (!input.analysisResult.success) {
      // Handle sub-agent failure
      return {
        processed: false,
        reason: input.analysisResult.error,
      };
    }

    // Process successful result
    const analysis = input.analysisResult.result;
    // ...
  },
});

Nested Sub-Agents

Sub-agents can themselves have sub-agents:

typescript
// Level 3: Leaf agent
const SentimentAnalyzer = defineAgent({
  name: 'sentiment',
  outputSchema: z.object({ sentiment: z.string() }),
  // ...
});

// Level 2: Uses sentiment analyzer
const TextProcessor = defineAgent({
  name: 'processor',
  tools: [createSubAgentTool(SentimentAnalyzer /* ... */)],
  outputSchema: z.object({ processed: z.string() }),
  // ...
});

// Level 1: Uses text processor
const Orchestrator = defineAgent({
  name: 'orchestrator',
  tools: [createSubAgentTool(TextProcessor /* ... */)],
  // ...
});

Stream events include all levels:

[orchestrator] text_delta: "Let me analyze..."
[orchestrator] subagent_start: processor
  [processor] text_delta: "Processing..."
  [processor] subagent_start: sentiment
    [sentiment] text_delta: "Analyzing..."
    [sentiment] output: { sentiment: "positive" }
  [processor] subagent_end: sentiment
  [processor] output: { processed: "..." }
[orchestrator] subagent_end: processor
[orchestrator] text_delta: "Based on the analysis..."

Patterns

Specialist Pattern

Delegate specific tasks to specialists:

typescript
const ResearchAgent = defineAgent({
  name: 'researcher',
  tools: [
    searchTool,
    createSubAgentTool(FactCheckerAgent /* ... */),
    createSubAgentTool(SummarizerAgent /* ... */),
  ],
  systemPrompt: `You are a research coordinator.
1. Search for information
2. Send claims to the fact-checker
3. Send findings to the summarizer
4. Compile final report`,
});

Pipeline Pattern

Chain agents in a processing pipeline:

typescript
// Each agent processes and passes to next
const ExtractorAgent = defineAgent({
  name: 'extractor',
  outputSchema: z.object({ entities: z.array(z.string()) }),
});

const EnricherAgent = defineAgent({
  name: 'enricher',
  outputSchema: z.object({
    enrichedEntities: z.array(
      z.object({
        /* ... */
      })
    ),
  }),
});

const FormatterAgent = defineAgent({
  name: 'formatter',
  outputSchema: z.object({ formatted: z.string() }),
});

// Coordinator runs the pipeline
const PipelineAgent = defineAgent({
  name: 'pipeline',
  tools: [
    createSubAgentTool(ExtractorAgent, z.object({ text: z.string() })),
    createSubAgentTool(EnricherAgent, z.object({ entities: z.array(z.string()) })),
    createSubAgentTool(FormatterAgent, z.object({ data: z.unknown() })),
  ],
  systemPrompt: `Process text through the pipeline:
1. Extract entities
2. Enrich each entity
3. Format the output`,
});

Parallel Delegation Pattern

Delegate multiple tasks simultaneously:

typescript
const MultiAnalyzerAgent = defineAgent({
  name: 'multi-analyzer',
  tools: [
    createSubAgentTool(SentimentAgent, z.object({ text: z.string() })),
    createSubAgentTool(TopicAgent, z.object({ text: z.string() })),
    createSubAgentTool(EntityAgent, z.object({ text: z.string() })),
  ],
  systemPrompt: `Analyze text from multiple angles.
You can run multiple analyses in parallel.
Combine results into a comprehensive report.`,
});

The framework executes parallel tool calls concurrently when the LLM requests them.
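
Conceptually, concurrent execution means starting every requested call before awaiting any of them. The sketch below shows that idea with `Promise.all`; `runSubAgent`, `runToolCallsConcurrently`, and the `ToolCall` shape are hypothetical stand-ins, and the framework's actual scheduler is not shown in this guide.

```typescript
// Hypothetical stand-ins for sub-agent tool executors.
interface ToolCall {
  name: string;
  text: string;
}

interface ToolCallResult {
  name: string;
  output: string;
}

async function runSubAgent(call: ToolCall): Promise<ToolCallResult> {
  // Placeholder work; a real sub-agent would run its own LLM loop here.
  return { name: call.name, output: `${call.name} analyzed ${call.text.length} chars` };
}

// Start every requested call before awaiting any of them, so they overlap.
// Promise.all preserves the input order in its result array.
async function runToolCallsConcurrently(calls: ToolCall[]): Promise<ToolCallResult[]> {
  return Promise.all(calls.map((call) => runSubAgent(call)));
}
```

Note that `Promise.all` rejects as soon as one call rejects. Since sub-agent failures are described earlier as ordinary tool results, a runtime would more plausibly catch errors per call (or use `Promise.allSettled`) so that one failure does not discard the other results.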

Conditional Delegation Pattern

Delegate based on input characteristics:

typescript
const RouterAgent = defineAgent({
  name: 'router',
  tools: [
    createSubAgentTool(SimpleQAAgent, z.object({ question: z.string() }), {
      description: 'For simple factual questions',
    }),
    createSubAgentTool(ResearchAgent, z.object({ topic: z.string() }), {
      description: 'For topics requiring deep research',
    }),
    createSubAgentTool(MathAgent, z.object({ problem: z.string() }), {
      description: 'For mathematical calculations',
    }),
  ],
  systemPrompt: `Route questions to the appropriate specialist:
- Simple facts → SimpleQA
- Complex topics → Research
- Math problems → Math

Choose the best agent for each request.`,
});

Remote Sub-Agents

For agents running on a separate HTTP service, use createRemoteSubAgentTool() instead of createSubAgentTool(). This enables cross-service and cross-runtime delegation via HTTP + SSE.

typescript
import {
  defineAgent,
  createRemoteSubAgentTool,
  HttpRemoteAgentTransport,
} from '@helix-agents/core';
import { z } from 'zod';

const transport = new HttpRemoteAgentTransport({
  url: 'http://localhost:4000',
});

const researcherTool = createRemoteSubAgentTool('researcher', {
  description: 'Delegate research to a remote specialist agent',
  inputSchema: z.object({ query: z.string() }),
  outputSchema: z.object({
    findings: z.array(z.object({ title: z.string(), snippet: z.string() })),
  }),
  transport,
  remoteAgentType: 'researcher',
  timeoutMs: 120_000,
});

const OrchestratorAgent = defineAgent({
  name: 'orchestrator',
  tools: [researcherTool], // Works like any other tool
  // ...
});

Remote sub-agents stream events using the same subagent_start/subagent_end protocol as local sub-agents, so frontends don't need to distinguish between them.

Remote sub-agents cannot use client-executed tools

A remote sub-agent that calls a client-executed tool (execute: 'client') cannot be resumed — the browser-submitted result has no route across the HTTP boundary to the remote server's pending state. This is enforced: the parent fails fast with RemoteSubAgentClientToolUnsupportedError (dispatch failureReason: 'client-tool-unsupported') instead of hanging. Tracked for future support in GitLab #107.

For the full guide — including server setup, transport configuration, and production considerations — see Remote Agents.

Persistent Sub-Agents

Overview

Persistent sub-agents are long-lived child agents that maintain state across multiple interactions. Unlike ephemeral sub-agents (created with createSubAgentTool()), persistent children can receive follow-up messages and be managed throughout the parent's lifecycle.

Configure persistent sub-agents via the persistentAgents field on AgentConfig:

typescript
import { defineAgent } from '@helix-agents/core';
import { z } from 'zod';

const ResearcherAgent = defineAgent({
  name: 'researcher',
  systemPrompt: 'You research topics.',
  outputSchema: z.object({ findings: z.string() }),
  llmConfig: { model: openai('gpt-4o-mini') },
});

const OrchestratorAgent = defineAgent({
  name: 'orchestrator',
  systemPrompt: 'You coordinate research tasks using your persistent children.',
  outputSchema: z.object({ summary: z.string() }),
  persistentAgents: [{ agent: ResearcherAgent, mode: 'blocking' }],
  llmConfig: { model: openai('gpt-4o') },
});

Two Modes

Blocking (mode: 'blocking'): Parent waits for the child to complete before continuing. Use when you need the child's result before making further decisions.

Non-blocking (mode: 'non-blocking'): Parent continues immediately after spawning. Child runs concurrently and the parent receives a completion notification later. Use for fire-and-forget background tasks.

typescript
persistentAgents: [
  { agent: ResearcherAgent, mode: 'blocking' },      // Parent waits
  { agent: BackgroundWorker, mode: 'non-blocking' },  // Fire-and-forget
],

Note: Each agent type can only appear once in persistentAgents. defineAgent() will throw if the same agent type appears multiple times. To use the same agent logic in both modes, create two separate agent definitions with distinct names.

Companion Tools

When persistentAgents is configured, companion tools are auto-injected into the parent agent (prefixed with companion__). Five are always injected; companion__waitForResult is added only when at least one persistent child is configured mode: 'blocking'. The parent's LLM decides when and how to call them based on the system prompt and conversation context — you never wire them by hand.

| Tool | Injected when | Description |
| --- | --- | --- |
| companion__spawnAgent | always | Create and start a new persistent child. |
| companion__sendMessage | always | Send a follow-up message to an active child. On all five runtimes a completed child is continued on its preserved session (memory retained) instead of erroring. |
| companion__listChildren | always | List all persistent children and their current statuses. |
| companion__getChildStatus | always | Get detailed status (and last output) of one child by name. |
| companion__terminateChild | always | Terminate a running child. |
| companion__waitForResult | only if a child blocks | Block until a child completes and return its result. |

Tool reference (arguments → result)

The argument and result schemas below are the exact Zod shapes the framework injects (packages/core/src/tools/companion/*.ts). Identical on every runtime.

companion__spawnAgent — start a new persistent child.

  • Args: { agent: string, initialMessage: string, name?: string }. agent is one of the parent's configured persistentAgents types (the LLM sees it as an enum). name is optional; omit it for auto-naming. When spawning multiple children of the same type in a single step, pass an explicit unique name for each, because concurrent auto-naming can collide.
  • Result: { name: string, status: string }. For a non-blocking child, status is 'spawned'/'running' and the call returns immediately. For a blocking child, the spawn waits for the child to reach a terminal status and the result also carries the child's output inline (delivered into the parent's tool result).
  • Errors ({ error } tool result, not a thrown exception): unknown agent type (Unknown persistent agent type), a name already in use by an active child (already running), or a spawn failure (the orphaned ref is compensated to failed so the name stays re-spawnable).

companion__sendMessage — queue a follow-up message for a running child.

  • Args: { name: string, message: string }.
  • Result: { delivered: boolean }. delivered: false (not an error) when the child's loop has already exited between the parent reading its status and the send (I1-race) — the message is intentionally not stranded against a dead loop.
  • Errors: unknown child (No child agent found), or child currently suspended on unresolved client-tool calls (unresolved client-tool calls — submit those or terminate first). For a child in a terminal status (is not active): a completed child no longer errors on any of the five runtimes — it is continued on its preserved session (memory retained, fresh output). A failed or terminated child still errors (use spawnAgent to re-spawn). Sending to an interrupted child re-spawns a fresh execution to consume it.

companion__listChildren: {} → Array<{ name, agent, status }> (persistent children only; ephemeral sub-agent refs are filtered out). Stale running refs whose underlying child has actually finished are lazily synced before returning.

companion__getChildStatus: { name: string } → { name, agent, status, lastOutput? }. lastOutput is the child's output once it has completed. Unknown name → { error: 'No child agent found...' }.

companion__waitForResult — block until a named child is terminal.

  • Args: { name: string, timeout?: number }. timeout is in milliseconds; omit it to wait indefinitely (until the child is terminal or the call is cancelled). On durable runtimes the wait is durable (Temporal activity / DBOS.sleep) and survives crashes.
  • Result: { name, status, result? }. result is the child's output for a completed child; for failed/terminated there is no result. If the timeout elapses first, status is a timeout marker (the child keeps running).
  • Returning a terminal result here marks the child's ref completionDelivered so the per-turn completion notifier does not also re-deliver it (see below).

companion__terminateChild: { name: string } → { name, terminated: boolean, status }. Terminate-truth: terminating an already-terminal child returns terminated: false and preserves its terminal ref (it does not lie by reporting a kill or clobber a completed result to terminated). Terminating a live child sets its interrupt flag, marks the ref terminated, and flags it completionDelivered so it isn't re-scanned.

All companion tools validate their arguments at dispatch; a malformed call returns a clean { error } tool result rather than throwing. The validated constraints are: name must be non-empty and ≤128 characters; the companion message (initialMessage on spawnAgent, message on sendMessage) must be non-empty (an empty message is rejected); and timeout (on waitForResult) must be positive.
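
The constraints above can be written out as a plain validator. This is a dependency-free sketch, not the framework's code (which, per the reference above, uses Zod schemas): `validateCompanionArgs`, its argument shape, and the error strings are hypothetical, and `message` stands for either initialMessage or message.

```typescript
// Hypothetical validator restating the documented constraints.
// Returns a clean { error } value instead of throwing, like the tools do.
type ValidationResult = { ok: true } | { error: string };

interface CompanionArgs {
  name?: string;
  message?: string; // initialMessage on spawnAgent, message on sendMessage
  timeout?: number; // milliseconds, waitForResult only
}

function validateCompanionArgs(args: CompanionArgs): ValidationResult {
  if (args.name !== undefined && (args.name.length === 0 || args.name.length > 128)) {
    return { error: 'name must be non-empty and at most 128 characters' };
  }
  if (args.message !== undefined && args.message.length === 0) {
    return { error: 'message must be non-empty' };
  }
  // !(x > 0) also rejects NaN, which a plain x <= 0 check would let through.
  if (args.timeout !== undefined && !(args.timeout > 0)) {
    return { error: 'timeout must be positive' };
  }
  return { ok: true };
}
```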

Child Naming

Children can be named explicitly via the name argument in companion__spawnAgent, or auto-named using the pattern {agentType}-{counter} (e.g., researcher-1, researcher-2).

typescript
// Explicit naming
spawnAgent({
  agent: 'researcher',
  initialMessage: 'Research AI safety',
  name: 'safety-researcher',
});

// Auto-naming (uses counter)
spawnAgent({ agent: 'researcher', initialMessage: 'Research quantum computing' });
// -> named 'researcher-1'
spawnAgent({ agent: 'researcher', initialMessage: 'Research fusion energy' });
// -> named 'researcher-2'

Session IDs

Each persistent child gets a deterministic session ID: {parentSessionId}-agent-{name}. This enables:

  • Stable references across parent restarts
  • Predictable state store lookups
  • Clean cleanup on parent completion
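
The two patterns above, {agentType}-{counter} for auto-names and {parentSessionId}-agent-{name} for session IDs, can be sketched as small string helpers. Both function names are hypothetical, and picking the first unused counter is an assumption: the guide does not say how the framework advances its counter.

```typescript
// Hypothetical helpers; only the two string patterns come from the text above.
function nextAutoName(agentType: string, existingNames: string[]): string {
  // Assumption: use the first unused {agentType}-{counter}, starting at 1.
  let counter = 1;
  while (existingNames.includes(`${agentType}-${counter}`)) counter += 1;
  return `${agentType}-${counter}`;
}

function persistentChildSessionId(parentSessionId: string, name: string): string {
  return `${parentSessionId}-agent-${name}`;
}

const firstName = nextAutoName('researcher', []);
const secondName = nextAutoName('researcher', [firstName]);
console.log(firstName, secondName);
// prints researcher-1 researcher-2
console.log(persistentChildSessionId('sess-42', 'safety-researcher'));
// prints sess-42-agent-safety-researcher
```

Because the session ID is a pure function of the parent session and the child name, the same child can be addressed again after a restart without storing any extra mapping.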

Re-spawning

If a child with the same name is spawned after a previous one reached a terminal status, behavior depends on the prior status and the runtime:

  • completed (all five runtimes): the spawn continues on the preserved session (the old session is not deleted, so the child's memory is retained) and the initialMessage is appended as the next turn's input. The SubSessionRef is reset to running and its completionDelivered flag is cleared so the new run's completion is delivered. On durable runtimes the continuation is replay-safe:
    • Temporal does the store-side reopen inside an activity.
    • DBOS starts a fresh persistent restart workflow with a deterministic id derived from the toolCall.id (so a workflow-body replay never double-starts it) and CAS-gates the reopen so the consult is appended exactly once.
    • Cloudflare Workflows runs the continuation as a fresh workflow instance with a unique-but-deterministic id, agent__<type>__<childSession>__continue__<stepCount>__<toolCallId>, because a Cloudflare Workflows instance id is write-once globally and the completed child's base id can't be recreated. The consult is carried as the new instance's newMessages and appended exactly once by the instance's !isResumable continuation branch (the same path as root multi-turn continuation), so the child session stays completed until the new instance reopens it.
  • failed / terminated (all runtimes): the old session is cleaned up (deleteSession, best-effort) and a new one starts fresh; the SubSessionRef is reset to running and completionDelivered is cleared.

You cannot re-use the name of a still-active child — that returns an already running error.
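
The rules above reduce to a small decision on the prior status. The sketch below covers only the four statuses this section discusses; `decideRespawn` and the `RespawnAction` labels are hypothetical, and the other statuses (such as interrupted) are deliberately left out because their spawn behavior is not described here.

```typescript
// Only the statuses covered by the re-spawn rules above.
type PriorStatus = 'running' | 'completed' | 'failed' | 'terminated';

// Hypothetical labels for the three outcomes described in this section.
type RespawnAction =
  | { kind: 'continue-session' } // keep memory, append initialMessage as next turn
  | { kind: 'delete-and-restart' } // best-effort deleteSession, then start fresh
  | { kind: 'error'; message: string };

function decideRespawn(prior: PriorStatus): RespawnAction {
  switch (prior) {
    case 'completed':
      return { kind: 'continue-session' };
    case 'failed':
    case 'terminated':
      return { kind: 'delete-and-restart' };
    case 'running':
      return { kind: 'error', message: 'already running' };
  }
}
```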

Re-consulting a persistent companion (the critic loop)

A persistent companion's defining feature is that it retains memory across rounds. The canonical use is a maker → critic loop: a parent produces an artifact, spawns a critic that returns a typed verdict, reads the verdict, fixes the artifact, then re-consults the SAME critic — which still remembers the prior round and can comment on what changed.

typescript
const CriticAgent = defineAgent({
  name: 'critic',
  systemPrompt: `You review the maker's artifact and return a verdict.
You remember prior rounds, so call out what changed since last time.`,
  outputSchema: z.object({
    verdict: z.enum(['pass', 'revise']),
    notes: z.string(),
  }),
  tools: [loadArtifactTool], // loads the current artifact BY REFERENCE (see below)
  llmConfig: { model: openai('gpt-4o-mini') },
});

const MakerAgent = defineAgent({
  name: 'maker',
  systemPrompt: `Produce an artifact, then consult the 'critic' companion.
Re-consult the SAME critic ('reviewer') after each fix until it returns
verdict: 'pass', or you hit your own round cap.`,
  outputSchema: z.object({ final: z.string() }),
  // `PersistentAgentConfig` has no `name` field — `agent` is the registered
  // agent TYPE. The stable instance name ('reviewer') is pinned at SPAWN time
  // via companion__spawnAgent({ name: 'reviewer', ... }); that name is what you
  // re-consult on later rounds.
  persistentAgents: [{ agent: CriticAgent, mode: 'blocking' }],
  // companion__spawnAgent / sendMessage / waitForResult / ... auto-injected.
});

A single round of the loop, as the maker's LLM drives it:

typescript
// Round 1 — blocking spawn returns the typed verdict INLINE on `.output`.
const r1 = companion__spawnAgent({
  agent: 'critic',
  name: 'reviewer',
  initialMessage: 'Review artifact v1.',
});
// r1 === { name: 'reviewer', status: 'completed', output: { verdict: 'revise', notes: '...' } }
if (r1.output.verdict === 'revise') {
  // ...fix the artifact (v2)...

  // Round 2 — re-consult the SAME critic by spawning with the SAME name.
  // The session is continued (not recreated): the critic still remembers v1.
  const r2 = companion__spawnAgent({
    agent: 'critic',
    name: 'reviewer',
    initialMessage: 'Review artifact v2 — I addressed your notes.',
  });
  // r2.output.verdict is the FRESH verdict; loop until 'pass' or your cap.
}

Ergonomics — prefer blocking spawnAgent for a typed re-consult. A blocking spawnAgent re-consult returns the fresh typed verdict inline as the clean .output of the tool result — this is the recommended path. companion__sendMessage re-consult returns only { delivered: true } (NOT the output); to read the verdict you must follow it with a companion__waitForResult({ name }) call in a later step. A minor cross-runtime nuance: on Temporal, a blocking child's sendMessage-continue may enrich the tool result with the output inline, but do not rely on that — for a portable typed re-consult, use blocking spawnAgent (or sendMessage + waitForResult).

Re-consulting with a revised artifact's bytes. The companion message (initialMessage on spawnAgent, message on sendMessage) is text-only — you cannot attach bytes (a revised image, a binary, etc.) to it. To re-consult over a non-text artifact, give the critic a tool that loads the artifact by reference (loadArtifactTool above, keyed by an id/path/URL the maker passes in the text message); the critic fetches the current bytes itself each round.

Known limitations:

  • Cloudflare pre-heal upgrade gap. A structured-output companion that completed under a prior release — before the per-step __finish__ heal shipped — is auto-healed on its first re-consult on JS / Temporal / DBOS, but not on Cloudflare (DO or Workflows). On Cloudflare, complete such a child once under the new release before re-consulting; otherwise its first re-consult may send a malformed transcript (a dangling __finish__ tool_use with no tool_result).
  • Same-turn waitForResult + sendMessage race. Calling companion__waitForResult and companion__sendMessage on the same completed child within one assistant message can race and return the stale prior output. Re-consult and read the result in separate turns (e.g. sendMessage/spawnAgent in one turn, read the verdict in the next).

Performance & resource considerations:

  • Per-round token cost grows with history. Each re-consult is a continuation on the preserved child session: the critic remembers every prior round (intended; that is what makes the loop converge). But the child's transcript accumulates, so the token cost of each round grows with the number of rounds. Cap your re-consult rounds in your loop policy (the maker prompt above does this by telling the LLM to stop at its own round cap) rather than looping unbounded until convergence.
  • Each re-consult round gets a fresh per-turn maxSteps budget. A re-consult is a new turn on the child session, so the child's step count restarts and the full maxSteps budget applies again each round — maxSteps bounds a single round, not the critic's lifetime across rounds. (Same per-turn budget model as root continuation; see maxSteps.)
  • Durable runtimes start a fresh child workflow/instance per round. On Temporal, DBOS, and Cloudflare Workflows, each re-consult begins a brand-new child workflow/instance for that round (the durability trade-off — every round is independently replayable/recoverable). For very high-frequency critic loops, the in-process JS runtime has the lowest per-round overhead (no workflow/instance spin-up).
  • No built-in per-parent child cap. There is no framework limit on the number of distinct persistent children a parent may spawn. Re-consulting the same named child reuses its session (no new ref), but spawning many distinctly-named children grows the parent's SubSessionRef list unboundedly — bound this in your own agent design if a parent can fan out widely.
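
To make the round cap concrete, here is the loop policy written as ordinary code. In practice the maker's LLM drives the companion calls, so this is only a sketch of the policy: `runCriticLoop`, `consult`, and `revise` are hypothetical, with `consult` standing in for a blocking re-consult that returns the typed verdict.

```typescript
interface Verdict {
  verdict: 'pass' | 'revise';
  notes: string;
}

// consult: stands in for a blocking re-consult of the same named critic.
// revise: stands in for the maker fixing the artifact between rounds.
async function runCriticLoop(
  consult: (message: string) => Promise<Verdict>,
  revise: (notes: string) => Promise<void>,
  maxRounds: number
): Promise<{ rounds: number; passed: boolean }> {
  for (let round = 1; round <= maxRounds; round += 1) {
    const result = await consult(`Review artifact v${round}.`);
    if (result.verdict === 'pass') return { rounds: round, passed: true };
    // Skip the fix after the final round; nobody would review it.
    if (round < maxRounds) await revise(result.notes);
  }
  // Cap reached without a pass: stop instead of looping until convergence.
  return { rounds: maxRounds, passed: false };
}
```

Bounding the loop caps both the number of child rounds and the growth of the critic's transcript, which is what drives the per-round token cost.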

How the parent learns a child finished

There are two ways the parent observes a child's outcome, and they cooperate so a completion is delivered exactly once:

  1. Pull (blocking spawn or companion__waitForResult) — the result is returned inline as the tool result, in the same turn. The child's ref is flagged completionDelivered: true at that moment.
  2. Push (deliver-on-next-turn) — for a non-blocking child that finishes on its own, a per-turn completion notifier injects a hidden message at the start of the parent's next turn (e.g. Sub-agent 'researcher-1' completed with result: …, or Sub-agent 'researcher-1' failed: … if it failed). This is failure-aware (a failed child yields a failure notification, not silence) and skips terminated children (the parent killed those deliberately).

The completionDelivered flag on the SubSessionRef is the durable dedup: once a child's outcome has been delivered by either path, the notifier won't re-inject it — even across a process restart, a parent re-entry, or a step retry. This is why a child waited-on via waitForResult doesn't also produce a duplicate push notification on the next turn.
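
The dedup behavior can be sketched as a notifier that skips any ref already flagged and flags whatever it delivers. This is an illustrative model, assuming a trimmed-down `ChildRef` with hypothetical `lastOutput` and `error` fields; the message wording follows the examples above, but `collectCompletionNotifications` is not the framework's implementation.

```typescript
// Trimmed-down, hypothetical view of a persistent child's ref.
interface ChildRef {
  name: string;
  status: 'running' | 'completed' | 'failed' | 'terminated';
  completionDelivered?: boolean;
  lastOutput?: string;
  error?: string;
}

// Returns the hidden messages to inject at the start of the parent's next
// turn, and flags each notified ref so it is never injected twice.
function collectCompletionNotifications(refs: ChildRef[]): string[] {
  const messages: string[] = [];
  for (const ref of refs) {
    if (ref.completionDelivered) continue; // already delivered by pull or push
    if (ref.status === 'completed') {
      messages.push(`Sub-agent '${ref.name}' completed with result: ${ref.lastOutput ?? ''}`);
    } else if (ref.status === 'failed') {
      messages.push(`Sub-agent '${ref.name}' failed: ${ref.error ?? 'unknown error'}`);
    } else {
      continue; // running children are not finished; terminated ones are skipped
    }
    ref.completionDelivered = true;
  }
  return messages;
}
```

In a real runtime the flag would be persisted with the ref, which is what makes the dedup hold across restarts and step retries; the in-memory mutation here only models that idea.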

State Tracking

Persistent children are tracked via SubSessionRef entries with mode: 'persistent':

typescript
interface SubSessionRef {
  subSessionId: string;
  agentType: string;
  parentToolCallId: string;
  status:
    | 'running'
    | 'completed'
    | 'failed'
    | 'interrupted'
    | 'terminated'
    | 'paused_awaiting_client'; // child suspended on a client-executed tool
  startedAt: number;
  completedAt?: number;
  mode: 'ephemeral' | 'persistent'; // 'persistent' for companion-managed children
  name?: string; // required when mode === 'persistent'
  completionDelivered?: boolean; // durable dedup: outcome already delivered to the parent
}

Usage tracking

A persistent companion records token, tool, and custom usage (ctx.recordUsage) under its own child session, exactly like an ephemeral sub-agent, and the parent records a discoverable kind:'subagent' entry, so getUsageRollup({ includeSubAgents: true }) on the parent aggregates the companion's cost. A companion's usage accumulates across re-consults on its preserved session and is counted exactly once. The parent-side entry is a discovery pointer (provisional duration/success); read the child-session rollup for a companion's true outcome/duration. See Usage Tracking → Persistent companions. (Exception: the Cloudflare Durable Object runtime stores usage per-DO and can't aggregate cross-DO, and the Cloudflare Workflows runtime requires a usageStore in its workflow dependencies; see that guide.)

Ephemeral vs Persistent Comparison

| Feature | Ephemeral (createSubAgentTool) | Persistent (persistentAgents) |
| --- | --- | --- |
| Created by | Parent's tool call to subagent__ | companion__spawnAgent tool |
| Lifecycle | Runs to completion, result returned | Long-lived, can receive messages |
| Follow-up messages | Not supported | Via companion__sendMessage |
| Result access | Immediate (tool result) | Via companion__getChildStatus or companion__waitForResult |
| Naming | Auto-generated | Explicit or auto-incremented |
| Mode | Always blocking | Blocking or non-blocking |
| Session ID | {parentSessionId}-sub-{callId} | {parentSessionId}-agent-{name} |
| SubSessionRef mode | 'ephemeral' | 'persistent' |
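
The two session ID formats in the comparison can be written out as plain string templates. The helper names below are hypothetical and exist only for this illustration; the formats themselves are the ones listed above.

```typescript
// Session ID for an ephemeral sub-agent: unique per tool call.
function ephemeralSessionId(parentSessionId: string, callId: string): string {
  return `${parentSessionId}-sub-${callId}`;
}

// Session ID for a persistent child: depends only on the child's name.
function persistentSessionId(parentSessionId: string, childName: string): string {
  return `${parentSessionId}-agent-${childName}`;
}

const ephemeralId = ephemeralSessionId('sess-1', 'call-9'); // 'sess-1-sub-call-9'
const persistentId = persistentSessionId('sess-1', 'researcher-1'); // 'sess-1-agent-researcher-1'
```

Since the persistent format contains no call ID, the same name maps to the same session on every message, which fits the preserved-session behavior described for companions.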

Example: Research Coordinator

typescript
const ResearcherAgent = defineAgent({
  name: 'researcher',
  systemPrompt: 'You research topics thoroughly and return findings.',
  outputSchema: z.object({
    findings: z.string(),
    sources: z.array(z.string()),
  }),
  tools: [searchTool],
  llmConfig: { model: openai('gpt-4o-mini') },
});

const CoordinatorAgent = defineAgent({
  name: 'coordinator',
  systemPrompt: `You coordinate research tasks.
You have persistent researcher children that you can spawn, send messages to, and check results.
1. Spawn researchers for different topics
2. Wait for their results
3. Compile a final summary`,
  outputSchema: z.object({ summary: z.string() }),
  persistentAgents: [
    {
      agent: ResearcherAgent,
      mode: 'blocking',
      description: 'Spawns researcher agents for deep dives',
    },
  ],
  llmConfig: { model: openai('gpt-4o') },
});

The coordinator can then:

  1. Spawn researcher-1 for topic A
  2. Spawn researcher-2 for topic B
  3. Check status or wait for results
  4. Terminate if needed
  5. Compile final output
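
The names researcher-1 and researcher-2 above follow the auto-incremented naming pattern. One way such names could be generated is sketched below. `nextChildName` is a hypothetical helper, and picking the first unused suffix is an assumption; the framework's actual numbering rule is not specified here.

```typescript
// Returns `${agentName}-${n}` for the first positive n that is not yet taken.
// Assumption: suffixes start at 1 and gaps are reused.
function nextChildName(agentName: string, existingNames: ReadonlySet<string>): string {
  let suffix = 1;
  while (existingNames.has(`${agentName}-${suffix}`)) suffix += 1;
  return `${agentName}-${suffix}`;
}

const spawnedNames = new Set<string>();
const firstChild = nextChildName('researcher', spawnedNames); // 'researcher-1'
spawnedNames.add(firstChild);
const secondChild = nextChildName('researcher', spawnedNames); // 'researcher-2'
```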

Runtime Support

Persistent sub-agents work across all five runtimes. The companion-tool surface is identical everywhere — the same tools (5 always, plus companion__waitForResult when a child is mode: 'blocking'), the same arguments, the same terminate-truth / completion-dedup semantics — because every runtime routes through the shared core dispatcher (executeCompanionToolDispatch). The differences below are in how a runtime starts and waits on a child, not in the LLM-facing behavior.

| Runtime | Blocking | Non-blocking | Workspaces on children | Notes |
| --- | --- | --- | --- | --- |
| JS | Yes | Yes | Yes | In-process execution; the parent dispatcher parks blocking spawns on the child loop. |
| Temporal | Yes | Yes | No (fail-fast) | Child workflows; blocking spawn returns the child's output inline. HITL inside children supported. |
| Cloudflare Workflows | Yes | Yes | No (fail-fast) | Nested workflow instances; HITL inside children supported via cascade. |
| Cloudflare DO | Yes | Yes | Yes | Children are sibling DOs dispatched via subAgentNamespace (v7.0, commit fb3180f6b). |
| DBOS | Yes¹ | Yes | No (fail-fast) | Durable child workflows; waitForResult uses durable DBOS.sleep; completion notifier is a @DBOS.step. |

¹ DBOS non-blocking spawn is fully supported. DBOS blocking spawn has a known limitation (it blocks until the workflow is idle and can mis-report a failed child) — tracked as FU-DBOS-BLOCKING-SPAWN-SEMANTICS. Prefer non-blocking spawn + companion__waitForResult on DBOS until that lands.

Workspaces on persistent children are supported only on the JS and Cloudflare DO runtimes. On Temporal, Cloudflare Workflows, and DBOS, a persistent child that declares a workspace (and is not inheritWorkspace: true) fails fast at spawn with a clear error (the "C8" guard) — those orchestration models can't host the stateful workspace lifecycle. See Workspaces below.
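
The fail-fast rule above amounts to a simple precondition check at spawn time. The sketch below is an assumption-laden illustration, not the framework's guard: the runtime identifiers, `PersistentChildConfig`, `declaresWorkspace`, and `assertChildWorkspaceSupported` are all hypothetical names.

```typescript
// Hypothetical runtime identifiers for this example.
type RuntimeName = 'js' | 'temporal' | 'cloudflare-workflows' | 'cloudflare-do' | 'dbos';

interface PersistentChildConfig {
  name: string;
  declaresWorkspace: boolean; // hypothetical flag: the child agent defines its own workspace
  inheritWorkspace?: boolean;
}

// Runtimes that can host a workspace on a persistent child, per the table above.
const WORKSPACE_CAPABLE: ReadonlySet<RuntimeName> = new Set<RuntimeName>(['js', 'cloudflare-do']);

// Throws at spawn time instead of failing later inside the child.
function assertChildWorkspaceSupported(runtime: RuntimeName, child: PersistentChildConfig): void {
  if (!child.declaresWorkspace || child.inheritWorkspace === true) return;
  if (!WORKSPACE_CAPABLE.has(runtime)) {
    throw new Error(
      `Persistent child '${child.name}' declares a workspace, which the '${runtime}' runtime cannot host`,
    );
  }
}
```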

Workspaces

Persistent sub-agents can use workspaces in two modes:

Per-invocation (default)

Each companion__spawnAgent and companion__sendMessage cycle opens the child's workspaces fresh and closes them when the child exits. This is the safe default, but cost-bearing providers (Cloudflare sandboxes, R2 namespaces) pay the open() cost N times for N sends.

State persistence across invocations depends on the provider:

  • InMemoryWorkspaceProvider — state is LOST between invocations (in-memory state has no backing store).
  • CloudflareFileStoreWorkspace / CloudflareSandboxWorkspace — state persists via the underlying Durable Object storage; close+reopen cycles reattach to the same R2 prefix / sandbox container.
  • LocalBashWorkspace — state is LOST (per-cycle tmpdir).
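
To make the per-invocation cost concrete, the sketch below counts how often a workspace is opened over one spawn and two sends. `CountingProvider` and `runCycle` are hypothetical stand-ins that mimic a provider with no backing store; they are not the SDK's provider interface.

```typescript
// Hypothetical stand-in for a workspace provider that counts how often it opens.
class CountingProvider {
  openCount = 0;
  private files = new Map<string, string>();

  openWorkspace(): Map<string, string> {
    this.openCount += 1;
    this.files = new Map(); // no backing store, so every open starts empty
    return this.files;
  }
}

// One spawn or send cycle: open fresh, let the child work, then close.
function runCycle<T>(
  provider: CountingProvider,
  work: (files: Map<string, string>) => T,
): T {
  const files = provider.openWorkspace();
  return work(files); // the workspace would be closed here when the child exits
}

const countingProvider = new CountingProvider();
runCycle(countingProvider, (files) => files.set('/notes/idea.md', 'draft')); // spawn
const seenOnSecondCycle = runCycle(countingProvider, (files) => files.has('/notes/idea.md')); // send
runCycle(countingProvider, () => undefined); // another send
```

After these three cycles the provider has been opened three times, and the note written during the spawn is not visible to the first send. A provider with durable backing storage would still pay the three opens but would find the note again.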

Persistent

Not yet honored in v1 — reserved

workspaceLifetime: 'persistent' is a reserved type field. Only 'per-invocation' is currently honored by the JS runtime; 'persistent' is documented as the intended option but executor support is pending (per the workspaceLifetime JSDoc on packages/core/src/types/agent.ts). Until support lands, a config setting 'persistent' behaves as 'per-invocation'. To minimize per-invocation open-cost in the meantime, use providers whose resolve() reattaches efficiently (CloudflareFileStoreWorkspace, CloudflareSandboxWorkspace).

Once supported, setting workspaceLifetime: 'persistent' on the persistentAgents entry will keep workspaces open across the child's lifetime. They would close only when the child is terminated (via companion__terminateChild) or the parent shuts down.

typescript
persistentAgents: [
  {
    agent: ResearcherAgent,
    mode: 'blocking',
    workspaceLifetime: 'persistent', // reserved: currently behaves as 'per-invocation'
  },
],

Use 'persistent' when:

  • The child's workspaces have meaningful open-cost (sandboxes, network-backed FS).
  • The child receives many sendMessage calls in quick succession.

Use 'per-invocation' (default) when:

  • The child is short-lived or rarely re-invoked.
  • Workspace state needs to be reset between invocations.
  • The provider already handles open-cost cheaply (in-memory).

Inheriting the parent workspace

Set inheritWorkspace: true on a persistentAgents entry to share the parent's workspace with the child. Same semantics as the ephemeral createSubAgentTool({ inheritWorkspace: true }) flag — the child runs against the parent's WorkspaceRegistry directly. An inheriting child must NOT also declare its own workspace; doing so throws a clear, named error at sub-agent execution time.

typescript
persistentAgents: [
  {
    agent: ResearcherAgent,
    mode: 'blocking',
    inheritWorkspace: true, // child shares the parent's WorkspaceRegistry
  },
],

When inheritWorkspace: true, the workspaceLifetime field has no effect — the child uses the parent's registry, whose lifetime is bounded by the parent's runLoop.

Status of workspaceLifetime (round-5 D10). The workspaceLifetime field on a persistentAgents entry is reserved for future use. As of v1, all values behave as 'per-invocation' — the workspace opens at sub-agent spawn/resume and closes at sub-agent exit. The 'persistent' lifetime (the workspace stays open across multiple companion__sendMessage calls) is filed as a known follow-up. Until support lands, prefer providers whose resolve() reattaches efficiently (CloudflareFileStoreWorkspace, CloudflareSandboxWorkspace) to minimize per-invocation cost.

Practical guidance. Use inheritWorkspace: true when the child should operate on the parent's workspace (a working notes workspace, a shared cache). For sub-agent operations whose workspace state must be isolated from the parent's session, leave inheritWorkspace unset and declare a workspace on the child config — the child's runLoop owns its own registry and persistRef writes to the child's session state.

Sibling workspace visibility (round-5 D17)

When TWO sibling sub-agents both opt into inheritWorkspace: true, they share the SAME physical workspace storage via the parent's registry. Concretely:

  • Sibling A writes /notes/idea.md. Sibling B can read it back.
  • Sibling B's writes are visible to Sibling A and to the parent.
  • All three (parent + A + B) operate against the same WorkspaceRegistry instance.

This is the natural consequence of registry sharing — the registry is a per-session singleton, and inheritWorkspace: true means "use the parent's registry directly." There is no per-sub-agent isolation when inheriting.

If sibling sub-agents need workspace isolation from each other, do NOT set inheritWorkspace: true on either; declare each sub-agent's workspace on the child config so each gets its own isolated registry.
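
The sharing behavior can be modeled with a plain Map standing in for the registry. This is a simplified sketch: `Registry`, `ChildEntry`, `ownsWorkspace`, and `registryFor` are hypothetical names, and a real WorkspaceRegistry is more than a key-value map.

```typescript
type Registry = Map<string, string>; // hypothetical stand-in: path -> file contents

interface ChildEntry {
  name: string;
  inheritWorkspace?: boolean;
  ownsWorkspace?: boolean; // hypothetical: the child declares its own workspace
}

function registryFor(shared: Registry, entry: ChildEntry): Registry {
  if (entry.inheritWorkspace && entry.ownsWorkspace) {
    throw new Error(`'${entry.name}' sets inheritWorkspace but also declares a workspace`);
  }
  // Inheriting children get the parent's instance itself, not a copy.
  return entry.inheritWorkspace ? shared : new Map<string, string>();
}

const parentRegistry: Registry = new Map<string, string>();
const siblingA = registryFor(parentRegistry, { name: 'a', inheritWorkspace: true });
const siblingB = registryFor(parentRegistry, { name: 'b', inheritWorkspace: true });
const isolatedC = registryFor(parentRegistry, { name: 'c', ownsWorkspace: true });

siblingA.set('/notes/idea.md', 'from A');
const readByB = siblingB.get('/notes/idea.md'); // 'from A'
const readByC = isolatedC.get('/notes/idea.md'); // undefined
```

Siblings A and B hold the same object as the parent, so every write is immediately visible to all three. Child C, which declares its own workspace, never sees the file.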

Reserved tool prefixes

The companion__ prefix is reserved by the framework for the auto-injected companion tools. User-defined tools whose name starts with companion__ cause defineAgent() to throw at build time, regardless of whether the agent declares any persistentAgents. This is enforced unconditionally so the prefix's reserved status is a stable contract — your agent code keeps working when you add a persistent sub-agent later. Use any other naming pattern (e.g. helper__listChildren, myCompanion) for your own tools.

The workspace_ prefix is similarly reserved (see Workspaces — reserved prefix).
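
If you want to catch reserved names before calling defineAgent(), for example in a lint step or unit test, a check along these lines is enough. `assertNoReservedToolNames` is a hypothetical helper written for this example; it is not part of the SDK and its error text is not the framework's.

```typescript
// Prefixes named as reserved in this section.
const RESERVED_PREFIXES = ['companion__', 'workspace_'] as const;

// Throws on the first tool name that starts with a reserved prefix.
function assertNoReservedToolNames(toolNames: readonly string[]): void {
  for (const toolName of toolNames) {
    const hit = RESERVED_PREFIXES.find((prefix) => toolName.startsWith(prefix));
    if (hit !== undefined) {
      throw new Error(`Tool name '${toolName}' uses the reserved prefix '${hit}'`);
    }
  }
}

// Names that only resemble the prefix are fine.
assertNoReservedToolNames(['helper__listChildren', 'myCompanion']);
```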

Best Practices

1. Clear Output Schemas

Define precise output schemas for clear contracts:

typescript
// Good: Specific schema
const preciseAgent = defineAgent({
  outputSchema: z.object({
    sentiment: z.enum(['positive', 'negative', 'neutral']),
    confidence: z.number().min(0).max(1),
    reasoning: z.string(),
  }),
});

// Avoid: Vague schema
const vagueAgent = defineAgent({
  outputSchema: z.object({
    result: z.unknown(), // What is this?
  }),
});

2. Descriptive Sub-Agent Tools

Help the parent LLM choose correctly:

typescript
const tool = createSubAgentTool(AnalyzerAgent, z.object({ text: z.string() }), {
  description: `Analyze text for sentiment and extract key topics.
Use when you need:
- Sentiment classification (positive/negative/neutral)
- Topic extraction from text
- Confidence scores for analysis

Returns: { sentiment, confidence, topics }`,
});

3. Appropriate Granularity

Balance specialization vs. overhead:

typescript
// Good: Meaningful specialization
const FactCheckerAgent = defineAgent({
  /* verifies claims */
});
const SummarizerAgent = defineAgent({
  /* creates summaries */
});

// Avoid: Over-granular
const CapitalizerAgent = defineAgent({
  /* just capitalizes text */
});
// ^ This is better as a regular tool or string method

4. Handle Sub-Agent Limits

Set appropriate maxSteps for sub-agents, and use timeoutMs to enforce wall-clock limits:

typescript
const SubAgent = defineAgent({
  name: 'focused-task',
  maxSteps: 5, // Sub-agents should complete quickly
  // ...
});

const subAgentTool = createSubAgentTool(SubAgent, inputSchema, {
  timeoutMs: 30_000, // 30-second wall-clock limit (important in DO runtime)
});
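
For intuition, a wall-clock limit of this kind can be modeled as a race between the task and a timer. The sketch below is a generic pattern, not the SDK's implementation of timeoutMs; `withTimeout` is a hypothetical helper.

```typescript
// Rejects if `task` has not settled within `timeoutMs` milliseconds.
// Note: this stops waiting for the task; it does not cancel the underlying work.
function withTimeout<T>(task: Promise<T>, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`Timed out after ${timeoutMs} ms`)),
      timeoutMs,
    );
    task.then(
      (value) => {
        clearTimeout(timer); // task won the race: drop the pending timer
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}
```

A step limit and a time limit guard against different failures: maxSteps bounds how many loop iterations a sub-agent may take, while a wall-clock limit also covers a single step that hangs.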

5. Test Sub-Agents Independently

Sub-agents are full agents, so test them on their own first:

typescript
// Test sub-agent directly
const subHandle = await executor.execute(AnalyzerAgent, 'Test text');
const subResult = await subHandle.result();
expect(subResult.status).toBe('completed');

// Then test in orchestration
const parentHandle = await executor.execute(OrchestratorAgent, 'Analyze this');
const parentResult = await parentHandle.result();

Limitations

No Shared State

Sub-agents cannot access parent state. Design inputs/outputs to carry needed context:

typescript
// Pass context through input
const tool = createSubAgentTool(
  SubAgent,
  z.object({
    query: z.string(),
    context: z.object({
      previousFindings: z.array(z.string()),
      constraints: z.array(z.string()),
    }),
  })
);

Sequential by Default

Steps are sequential from the parent's perspective: multiple sub-agent calls in one LLM response may execute in parallel, but the parent waits for all of them to finish before it takes its next step.
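
This pattern, where calls overlap but the parent resumes only once all have finished, can be sketched with Promise.all. `ToolCall` and `runToolCalls` are hypothetical names for this illustration and do not reflect the executor's internals.

```typescript
interface ToolCall {
  id: string;
  run: () => Promise<string>;
}

// Starts every call immediately so they overlap, then waits for all of them.
// The returned promise resolves only after the slowest call has finished.
async function runToolCalls(calls: readonly ToolCall[]): Promise<Map<string, string>> {
  const settled = await Promise.all(
    calls.map(async (call) => [call.id, await call.run()] as const),
  );
  return new Map(settled);
}
```

With this shape, total latency for one step is roughly that of the slowest sub-agent rather than the sum of all of them. Note that Promise.all rejects as soon as any call rejects; a real executor would more likely convert a failure into a tool result for the parent.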

Overhead

Each sub-agent invocation includes:

  • State initialization
  • Full agent loop (potentially multiple LLM calls)
  • State persistence

For simple transformations, prefer regular tools.

Lifecycle Hook Guarantees

Sub-agents fire their own lifecycle hooks (onAgentStart, onAgentComplete, onAgentFail) independently from the parent agent. This is important for tracing integrations that need to emit spans for each sub-agent. The parent's stream is not closed when a sub-agent completes — only its hooks fire. This behavior is consistent across all runtimes. See Sub-Agent Execution Internals for implementation details.

Released under the MIT License.