Sub-Agent Orchestration
Sub-agents enable hierarchical agent systems where a parent agent can delegate specialized tasks to child agents. This is powerful for complex workflows that benefit from separation of concerns.
What Are Sub-Agents?
A sub-agent is an agent invoked by another agent as if it were a tool. The parent agent decides when to delegate, the sub-agent executes to completion, and the result flows back to the parent.
graph TB
Parent["Parent Agent"]
Parent --> Tools["Uses regular tools"]
Parent --> SubAgent["Delegates to Sub-Agent"]
SubAgent --> Run["Sub-agent runs to completion"]
SubAgent --> Result["Result returned to parent"]
Creating Sub-Agent Tools
Use createSubAgentTool() to turn an agent into a tool:
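import { defineAgent, createSubAgentTool } from '@helix-agents/sdk';
import { z } from 'zod';
// Assumption: openai() below is the Vercel AI SDK OpenAI provider
import { openai } from '@ai-sdk/openai';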
// Define a specialized agent
const AnalyzerAgent = defineAgent({
name: 'text-analyzer',
description: 'Analyzes text for sentiment and topics',
systemPrompt: 'You analyze text. Determine sentiment and extract key topics.',
outputSchema: z.object({
sentiment: z.enum(['positive', 'negative', 'neutral']),
confidence: z.number(),
topics: z.array(z.string()),
}),
llmConfig: { model: openai('gpt-4o-mini') },
});
// Create a tool that invokes this agent
const analyzeTool = createSubAgentTool(
AnalyzerAgent,
z.object({
text: z.string().describe('Text to analyze'),
}),
{
description: 'Analyze text for sentiment and key topics',
timeoutMs: 60_000, // Optional: per-tool timeout in ms
}
);
// Use in parent agent
const OrchestratorAgent = defineAgent({
name: 'orchestrator',
systemPrompt: 'You coordinate research. Use the analyzer for sentiment analysis.',
tools: [searchTool, analyzeTool], // Mix regular tools and sub-agents
llmConfig: { model: openai('gpt-4o') },
});
Requirements
Sub-agents must have outputSchema:
// ✓ Valid sub-agent - has outputSchema
const ValidSubAgent = defineAgent({
name: 'valid',
outputSchema: z.object({ result: z.string() }), // Required!
// ...
});
// ✗ Invalid sub-agent - no outputSchema
const InvalidSubAgent = defineAgent({
name: 'invalid',
// No outputSchema - will throw error when used as sub-agent
// ...
});
The outputSchema defines the contract between parent and child: what the parent receives as the tool result.
Cloudflare Durable Objects Runtime
In the DO runtime, createSubAgentTool() works transparently — no changes to your agent definitions are needed. Internally, each sub-agent is routed to a sibling DO instance with its own isolated SQLite state. You only need to add subAgentNamespace to your createAgentServer config, and register all sub-agents in the same AgentRegistry. See Sub-Agents in the DO Runtime for setup details.
HITL cascade: when a sub-agent suspends (v7)
If a sub-agent calls a client-executed tool or hits an approval gate, the v7 stateless-suspension model cascades the suspension up to the parent:
- The child's run loop yields RunOutcome.suspended_client_tool and persists its pending entries to its own SessionState.pendingClientToolCalls.
- The parent's run loop observes that a child is incomplete and yields RunOutcome.suspended_awaiting_children with a SuspendedChildWait[] payload listing each pending child.
- The parent state writes suspendedAwaitingChildren to its suspensionContext durably.
- The root-session clientToolCallOwnership map records that toolCallId is owned by the child's session; the framework persists this mapping atomically with the child's pending-entry write.
When the client submits a result, it always submits against the root sessionId (not the sub-agent's). The framework looks up clientToolCallOwnership[toolCallId] to find the owning sub-agent session, writes the response into that child's pendingClientToolCalls, and the next executor.resume({ sessionId: rootSessionId }) cascades down: the child resumes and runs to completion, then the parent's suspendedAwaitingChildren clears and the parent resumes.
This means consumers never need to know about sub-agent sessions when submitting — they always submit against the root. The cascade is internal to the runtime.
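The routing step can be illustrated with a small in-memory model. This is a minimal sketch, not the framework's implementation: the Session shape, the submitClientToolResult function, the Map-based store, and the session ids are all hypothetical. Only the field names clientToolCallOwnership and pendingClientToolCalls come from the description above.

```typescript
// Hypothetical in-memory model of root-session submission routing.
interface PendingCall {
  toolCallId: string;
  result?: unknown;
}

interface Session {
  sessionId: string;
  pendingClientToolCalls: PendingCall[];
  // Only meaningful on the root session: toolCallId -> owning session id.
  clientToolCallOwnership: Record<string, string>;
}

function submitClientToolResult(
  store: Map<string, Session>,
  rootSessionId: string,
  toolCallId: string,
  result: unknown,
): string {
  const root = store.get(rootSessionId);
  if (!root) throw new Error(`Unknown root session: ${rootSessionId}`);

  // Assumption: when no sub-agent owns the call, the root owns it.
  const ownerId = root.clientToolCallOwnership[toolCallId] ?? rootSessionId;
  const owner = store.get(ownerId);
  if (!owner) throw new Error(`Unknown owner session: ${ownerId}`);

  const pending = owner.pendingClientToolCalls.find((p) => p.toolCallId === toolCallId);
  if (!pending) throw new Error(`No pending call: ${toolCallId}`);

  // Write the response into the owning session's pending entry.
  pending.result = result;
  return ownerId;
}

const store = new Map<string, Session>();
store.set('root', {
  sessionId: 'root',
  pendingClientToolCalls: [],
  clientToolCallOwnership: { 'call-1': 'root-sub-1' },
});
store.set('root-sub-1', {
  sessionId: 'root-sub-1',
  pendingClientToolCalls: [{ toolCallId: 'call-1' }],
  clientToolCallOwnership: {},
});

// The caller only ever names the root session.
const routedTo = submitClientToolResult(store, 'root', 'call-1', { approved: true });
```

The caller passes only the root session id; the lookup decides which session's pending entry receives the result.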
Persistent sub-agents on Cloudflare Durable Objects
As of v7.0 (commit fb3180f6b), the Cloudflare DO runtime supports persistent sub-agents via DO-stub dispatch. Each persistent child runs in its own DO instance addressed by stable sessionId ({parent}-agent-{name}); the parent's auto-injected companion__* tools (five always, plus companion__waitForResult when a child is mode: 'blocking') translate into subAgentNamespace stub calls against the existing sub-agent endpoints. Ephemeral sub-agents (createSubAgentTool) work fully on every runtime that supports HITL.
How Sub-Agent Execution Works
When the parent LLM calls a sub-agent tool:
- Tool Call Detection - Framework identifies the tool as a sub-agent (prefixed with subagent__)
- Sub-Agent Initialization - New agent run created with:
  - Fresh runId (derived from parent's ID)
  - Same streamId as parent (for unified streaming)
  - Input converted to user message
- Execution - Sub-agent runs its full loop until completion
- Result Return - Sub-agent's output becomes the tool result
// Parent LLM calls:
{
name: 'subagent__text-analyzer',
arguments: { text: 'This product is amazing!' }
}
// Framework:
// 1. Creates sub-agent run
// 2. Converts input to message: "This product is amazing!"
// 3. Runs AnalyzerAgent loop
// 4. Returns output: { sentiment: 'positive', confidence: 0.95, topics: ['product'] }
Input Mapping
The input schema defines what arguments the parent provides. These are converted to a user message for the sub-agent:
const subAgentTool = createSubAgentTool(
SubAgent,
z.object({
query: z.string(), // Common field names are recognized
context: z.string().optional(),
})
);
// When called with { query: "analyze this", context: "..." }
// Sub-agent receives user message: '{"query":"analyze this","context":"..."}'
The full tool input is JSON-serialized and sent as the user message. The remote agent server is responsible for parsing the JSON and constructing the appropriate context for its agent.
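The serialization described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the UserMessage shape and the toUserMessage helper are hypothetical names, and the only behavior taken from the text is that the whole input object is JSON-serialized into the message.

```typescript
// Hypothetical message shape; the framework's real type may differ.
interface UserMessage {
  role: 'user';
  content: string;
}

// Serialize the entire tool input, not just one "main" field.
function toUserMessage(toolInput: Record<string, unknown>): UserMessage {
  return { role: 'user', content: JSON.stringify(toolInput) };
}

const userMessage = toUserMessage({ query: 'analyze this', context: 'Q3 report' });

// The receiving side can recover the structured input by parsing the content.
const parsedInput = JSON.parse(userMessage.content) as { query: string; context?: string };
console.log(parsedInput.query);
```

Because the message is plain JSON, a sub-agent's system prompt can describe the expected fields by name.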
Streaming Integration
Sub-agent events stream alongside parent events on the same stream:
for await (const chunk of stream) {
switch (chunk.type) {
case 'text_delta':
// Could be from parent or sub-agent
console.log(`[${chunk.agentType}]`, chunk.delta);
break;
case 'subagent_start':
console.log(`Starting sub-agent: ${chunk.subAgentType}`);
break;
case 'subagent_end':
console.log(`Sub-agent ${chunk.subAgentType} result:`, chunk.result);
break;
case 'tool_start':
// Includes sub-agent tool calls from within sub-agents
console.log(`[${chunk.agentType}] Tool: ${chunk.toolName}`);
break;
}
}
Sub-agent streaming events:
- tool_start for the parent's subagent__<name> tool call
- subagent_start when the sub-agent begins
- All proxied sub-agent chunks (text_delta, tool_start/tool_end for inner tools, etc.)
- subagent_end when the sub-agent completes
- tool_end for the parent's subagent__<name> tool call
The tool_start/tool_end pair on the parent is what closes the parent's dynamic-tool UI part for AI SDK consumers (transitions 'input-available' → 'output-available'). See Sub-agent chunk ordering for the full semantics, including failure-path behavior and the subagent_end → tool_end ordering invariant.
This enables real-time visibility into nested execution.
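Since parent and sub-agent text arrives interleaved on one stream, a consumer usually needs to separate it by agent. The following is a rough sketch of that grouping. The Chunk union is a hypothetical simplification modeled on the fields read in the loop above (type, agentType, delta, subAgentType, result); the real chunk types may carry more fields.

```typescript
// Hypothetical chunk shapes, modeled on the fields read in the loop above.
type Chunk =
  | { type: 'text_delta'; agentType: string; delta: string }
  | { type: 'subagent_start'; agentType: string; subAgentType: string }
  | { type: 'subagent_end'; agentType: string; subAgentType: string; result: unknown };

// Accumulate streamed text per agent so nested output can be rendered separately.
function collectTextByAgent(chunks: Chunk[]): Record<string, string> {
  const textByAgent: Record<string, string> = {};
  for (const chunk of chunks) {
    if (chunk.type !== 'text_delta') continue;
    textByAgent[chunk.agentType] = (textByAgent[chunk.agentType] ?? '') + chunk.delta;
  }
  return textByAgent;
}

const sampleChunks: Chunk[] = [
  { type: 'text_delta', agentType: 'orchestrator', delta: 'Let me ' },
  { type: 'text_delta', agentType: 'orchestrator', delta: 'analyze...' },
  { type: 'subagent_start', agentType: 'orchestrator', subAgentType: 'text-analyzer' },
  { type: 'text_delta', agentType: 'text-analyzer', delta: 'Analyzing...' },
  {
    type: 'subagent_end',
    agentType: 'orchestrator',
    subAgentType: 'text-analyzer',
    result: { sentiment: 'positive' },
  },
];

const collected = collectTextByAgent(sampleChunks);
```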
State Isolation
Sub-agents have completely isolated state:
const ParentAgent = defineAgent({
name: 'parent',
stateSchema: z.object({
parentCounter: z.number().default(0),
}),
// ...
});
const SubAgent = defineAgent({
name: 'child',
stateSchema: z.object({
childCounter: z.number().default(0), // Separate from parent
}),
// ...
});
Key points:
- Sub-agent cannot read parent's custom state
- Parent cannot read sub-agent's custom state
- Each has its own messages, stepCount, etc.
- The tool input and the sub-agent output are the only communication channels
To share data, pass it through the input and receive it in the output.
Error Handling
Sub-Agent Failures
If a sub-agent fails, the error becomes the tool result:
// Sub-agent throws error
throw new Error('Analysis failed: text too short');
// Parent receives tool result:
{
success: false,
error: 'Analysis failed: text too short'
}
The parent LLM sees the error and can decide how to proceed (retry, try different approach, etc.).
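The conversion from a thrown error to a tool result can be sketched as a small wrapper. This is an illustration only: SubAgentToolResult and toToolResult are hypothetical names, the failure shape is the one shown above, and the success shape ({ success: true, result }) is an assumption inferred from the fields the parent reads in the next section.

```typescript
// Hypothetical result envelope. The failure shape is the one shown above; the
// success shape is an assumption inferred from the fields the parent reads.
type SubAgentToolResult<T> =
  | { success: true; result: T }
  | { success: false; error: string };

// Run a sub-agent body and convert a thrown error into a tool result.
// Synchronous for brevity; a real sub-agent run would be asynchronous.
function toToolResult<T>(run: () => T): SubAgentToolResult<T> {
  try {
    return { success: true, result: run() };
  } catch (err) {
    const error = err instanceof Error ? err.message : String(err);
    return { success: false, error };
  }
}

const succeeded = toToolResult(() => ({ sentiment: 'positive' }));
const failedRun = toToolResult<string>(() => {
  throw new Error('Analysis failed: text too short');
});
```

Modeling the result as a discriminated union lets the compiler force callers to check success before reading result.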
Handling Errors
Check for failures in parent's tools or logic:
const processResultTool = defineTool({
name: 'process_analysis',
execute: async (input, context) => {
// The sub-agent result may have succeeded or failed
if (!input.analysisResult.success) {
// Handle sub-agent failure
return {
processed: false,
reason: input.analysisResult.error,
};
}
// Process successful result
const analysis = input.analysisResult.result;
// ...
},
});
Nested Sub-Agents
Sub-agents can themselves have sub-agents:
// Level 3: Leaf agent
const SentimentAnalyzer = defineAgent({
name: 'sentiment',
outputSchema: z.object({ sentiment: z.string() }),
// ...
});
// Level 2: Uses sentiment analyzer
const TextProcessor = defineAgent({
name: 'processor',
tools: [createSubAgentTool(SentimentAnalyzer /* ... */)],
outputSchema: z.object({ processed: z.string() }),
// ...
});
// Level 1: Uses text processor
const Orchestrator = defineAgent({
name: 'orchestrator',
tools: [createSubAgentTool(TextProcessor /* ... */)],
// ...
});
Stream events include all levels:
[orchestrator] text_delta: "Let me analyze..."
[orchestrator] subagent_start: processor
[processor] text_delta: "Processing..."
[processor] subagent_start: sentiment
[sentiment] text_delta: "Analyzing..."
[sentiment] output: { sentiment: "positive" }
[processor] subagent_end: sentiment
[processor] output: { processed: "..." }
[orchestrator] subagent_end: processor
[orchestrator] text_delta: "Based on the analysis..."
Patterns
Specialist Pattern
Delegate specific tasks to specialists:
const ResearchAgent = defineAgent({
name: 'researcher',
tools: [
searchTool,
createSubAgentTool(FactCheckerAgent /* ... */),
createSubAgentTool(SummarizerAgent /* ... */),
],
systemPrompt: `You are a research coordinator.
1. Search for information
2. Send claims to the fact-checker
3. Send findings to the summarizer
4. Compile final report`,
});
Pipeline Pattern
Chain agents in a processing pipeline:
// Each agent processes and passes to next
const ExtractorAgent = defineAgent({
name: 'extractor',
outputSchema: z.object({ entities: z.array(z.string()) }),
});
const EnricherAgent = defineAgent({
name: 'enricher',
outputSchema: z.object({
enrichedEntities: z.array(
z.object({
/* ... */
})
),
}),
});
const FormatterAgent = defineAgent({
name: 'formatter',
outputSchema: z.object({ formatted: z.string() }),
});
// Coordinator runs the pipeline
const PipelineAgent = defineAgent({
name: 'pipeline',
tools: [
createSubAgentTool(ExtractorAgent, z.object({ text: z.string() })),
createSubAgentTool(EnricherAgent, z.object({ entities: z.array(z.string()) })),
createSubAgentTool(FormatterAgent, z.object({ data: z.unknown() })),
],
systemPrompt: `Process text through the pipeline:
1. Extract entities
2. Enrich each entity
3. Format the output`,
});
Parallel Delegation Pattern
Delegate multiple tasks simultaneously:
const MultiAnalyzerAgent = defineAgent({
name: 'multi-analyzer',
tools: [
createSubAgentTool(SentimentAgent, z.object({ text: z.string() })),
createSubAgentTool(TopicAgent, z.object({ text: z.string() })),
createSubAgentTool(EntityAgent, z.object({ text: z.string() })),
],
systemPrompt: `Analyze text from multiple angles.
You can run multiple analyses in parallel.
Combine results into a comprehensive report.`,
});
The framework executes parallel tool calls concurrently when the LLM requests them.
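To make the concurrency concrete, here is a standalone sketch of what running several delegated calls concurrently looks like with plain promises. It does not use the framework: analyze and runInParallel are hypothetical stand-ins for sub-agent calls, and the timings are arbitrary. How the framework schedules calls internally is not specified here.

```typescript
// Hypothetical stand-ins for three sub-agent calls; each resolves independently.
type Analysis = { kind: string; value: string };

const delay = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function analyze(kind: string, text: string, ms: number): Promise<Analysis> {
  await delay(ms);
  return { kind, value: `${kind} of "${text}"` };
}

async function runInParallel(text: string): Promise<Analysis[]> {
  // All three calls are started before any is awaited, so total latency is
  // roughly the slowest call. Promise.all preserves the input order.
  return Promise.all([
    analyze('sentiment', text, 30),
    analyze('topics', text, 20),
    analyze('entities', text, 10),
  ]);
}

const parallelRun = runInParallel('great product');
parallelRun.then((results) => console.log(results.map((r) => r.kind)));
```

Promise.all rejects as soon as any call rejects. Since sub-agent failures surface as tool results rather than exceptions, that is usually acceptable; Promise.allSettled is the alternative when individual rejections must not abort the batch.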
Conditional Delegation Pattern
Delegate based on input characteristics:
const RouterAgent = defineAgent({
name: 'router',
tools: [
createSubAgentTool(SimpleQAAgent, z.object({ question: z.string() }), {
description: 'For simple factual questions',
}),
createSubAgentTool(ResearchAgent, z.object({ topic: z.string() }), {
description: 'For topics requiring deep research',
}),
createSubAgentTool(MathAgent, z.object({ problem: z.string() }), {
description: 'For mathematical calculations',
}),
],
systemPrompt: `Route questions to the appropriate specialist:
- Simple facts → SimpleQA
- Complex topics → Research
- Math problems → Math
Choose the best agent for each request.`,
});
Remote Sub-Agents
For agents running on a separate HTTP service, use createRemoteSubAgentTool() instead of createSubAgentTool(). This enables cross-service and cross-runtime delegation via HTTP + SSE.
import {
defineAgent,
createRemoteSubAgentTool,
HttpRemoteAgentTransport,
} from '@helix-agents/core';
import { z } from 'zod';
const transport = new HttpRemoteAgentTransport({
url: 'http://localhost:4000',
});
const researcherTool = createRemoteSubAgentTool('researcher', {
description: 'Delegate research to a remote specialist agent',
inputSchema: z.object({ query: z.string() }),
outputSchema: z.object({
findings: z.array(z.object({ title: z.string(), snippet: z.string() })),
}),
transport,
remoteAgentType: 'researcher',
timeoutMs: 120_000,
});
const OrchestratorAgent = defineAgent({
name: 'orchestrator',
tools: [researcherTool], // Works like any other tool
// ...
});
Remote sub-agents stream events using the same subagent_start/subagent_end protocol as local sub-agents, so frontends don't need to distinguish between them.
Remote sub-agents cannot use client-executed tools
A remote sub-agent that calls a client-executed tool (execute: 'client') cannot be resumed — the browser-submitted result has no route across the HTTP boundary to the remote server's pending state. This is enforced: the parent fails fast with RemoteSubAgentClientToolUnsupportedError (dispatch failureReason: 'client-tool-unsupported') instead of hanging. Tracked for future support in GitLab #107.
For the full guide — including server setup, transport configuration, and production considerations — see Remote Agents.
Persistent Sub-Agents
Overview
Persistent sub-agents are long-lived child agents that maintain state across multiple interactions. Unlike ephemeral sub-agents (created with createSubAgentTool()), persistent children can receive follow-up messages and be managed throughout the parent's lifecycle.
Configure persistent sub-agents via the persistentAgents field on AgentConfig:
import { defineAgent } from '@helix-agents/core';
import { z } from 'zod';
const ResearcherAgent = defineAgent({
name: 'researcher',
systemPrompt: 'You research topics.',
outputSchema: z.object({ findings: z.string() }),
llmConfig: { model: openai('gpt-4o-mini') },
});
const OrchestratorAgent = defineAgent({
name: 'orchestrator',
systemPrompt: 'You coordinate research tasks using your persistent children.',
outputSchema: z.object({ summary: z.string() }),
persistentAgents: [{ agent: ResearcherAgent, mode: 'blocking' }],
llmConfig: { model: openai('gpt-4o') },
});
Two Modes
Blocking (mode: 'blocking'): Parent waits for the child to complete before continuing. Use when you need the child's result before making further decisions.
Non-blocking (mode: 'non-blocking'): Parent continues immediately after spawning. Child runs concurrently and the parent receives a completion notification later. Use for fire-and-forget background tasks.
persistentAgents: [
{ agent: ResearcherAgent, mode: 'blocking' }, // Parent waits
{ agent: BackgroundWorker, mode: 'non-blocking' }, // Fire-and-forget
],
Note: Each agent type can only appear once in persistentAgents. defineAgent() will throw if the same agent type appears multiple times. To use the same agent logic in both modes, create two separate agent definitions with distinct names.
Companion Tools
When persistentAgents is configured, companion tools are auto-injected into the parent agent (prefixed with companion__). Five are always injected; companion__waitForResult is added only when at least one persistent child is configured mode: 'blocking'. The parent's LLM decides when and how to call them based on the system prompt and conversation context — you never wire them by hand.
| Tool | Injected when | Description |
|---|---|---|
| companion__spawnAgent | always | Create and start a new persistent child. |
| companion__sendMessage | always | Send a follow-up message to an active child. On all five runtimes a completed child is continued on its preserved session (memory retained) instead of erroring. |
| companion__listChildren | always | List all persistent children and their current statuses. |
| companion__getChildStatus | always | Get detailed status (and last output) of one child by name. |
| companion__terminateChild | always | Terminate a running child. |
| companion__waitForResult | only if a child blocks | Block until a child completes and return its result. |
Tool reference (arguments → result)
The argument and result schemas below are the exact Zod shapes the framework injects (packages/core/src/tools/companion/*.ts). Identical on every runtime.
companion__spawnAgent — start a new persistent child.
- Args: { agent: string, initialMessage: string, name?: string }. agent is one of the parent's configured persistentAgents types (the LLM sees it as an enum). name is optional; omit it for auto-naming. When spawning multiple children of the same type in a single step, pass an explicit unique name for each, because concurrent auto-naming can collide.
- Result: { name: string, status: string }. For a non-blocking child, status is 'spawned' / 'running' and the call returns immediately. For a blocking child, the spawn waits for the child to reach a terminal status and the result also carries the child's output inline (delivered into the parent's tool result).
- Errors ({ error } tool result, not a thrown exception): unknown agent type (Unknown persistent agent type), a name already in use by an active child (already running), or a spawn failure (the orphaned ref is compensated to failed so the name stays re-spawnable).
companion__sendMessage — queue a follow-up message for a running child.
- Args: { name: string, message: string }.
- Result: { delivered: boolean }. delivered: false (not an error) when the child's loop has already exited between the parent reading its status and the send (I1-race); the message is intentionally not stranded against a dead loop.
- Errors: unknown child (No child agent found), or child currently suspended on unresolved client-tool calls (unresolved client-tool calls; submit those or terminate first). For a child in a terminal status (is not active): a completed child no longer errors on any of the five runtimes; it is continued on its preserved session (memory retained, fresh output). A failed or terminated child still errors (use spawnAgent to re-spawn). Sending to an interrupted child re-spawns a fresh execution to consume it.
companion__listChildren — {} → Array<{ name, agent, status }> (persistent children only; ephemeral sub-agent refs are filtered out). Stale running refs whose underlying child has actually finished are lazily synced before returning.
companion__getChildStatus — { name: string } → { name, agent, status, lastOutput? }. lastOutput is the child's output once it has completed. Unknown name → { error: 'No child agent found...' }.
companion__waitForResult — block until a named child is terminal.
- Args: { name: string, timeout?: number }. timeout is in milliseconds; omit it to wait indefinitely (until the child is terminal or the call is cancelled). On durable runtimes the wait is durable (Temporal activity / DBOS.sleep) and survives crashes.
- Result: { name, status, result? }. result is the child's output for a completed child; for failed / terminated there is no result. If the timeout elapses first, status is a timeout marker (the child keeps running).
- Returning a terminal result here marks the child's ref completionDelivered so the per-turn completion notifier does not also re-deliver it (see below).
companion__terminateChild — { name: string } → { name, terminated: boolean, status }. Terminate-truth: terminating an already-terminal child returns terminated: false and preserves its terminal ref (it does not lie by reporting a kill or clobber a completed result to terminated). Terminating a live child sets its interrupt flag, marks the ref terminated, and flags it completionDelivered so it isn't re-scanned.
All companion tools validate their arguments at dispatch; a malformed call returns a clean { error } tool result rather than throwing. The validated constraints are: name must be non-empty and ≤128 characters; the companion message (initialMessage on spawnAgent, message on sendMessage) must be non-empty (an empty message is rejected); and timeout (on waitForResult) must be positive.
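The constraints above can be expressed as a small validator that returns an { error } value instead of throwing. This is a hedged sketch: validateCompanionArgs, the CompanionArgs shape, and the error strings are hypothetical, and the framework itself validates with its own Zod schemas. Only the three constraints are taken from the text.

```typescript
// Hypothetical argument bag covering the validated fields across companion tools.
interface CompanionArgs {
  name?: string;
  message?: string; // initialMessage on spawnAgent, message on sendMessage
  timeout?: number; // milliseconds, waitForResult only
}

type Validation = { ok: true } | { error: string };

// Mirror the documented constraints; return an error value rather than throwing.
function validateCompanionArgs(args: CompanionArgs): Validation {
  if (args.name !== undefined && (args.name.length === 0 || args.name.length > 128)) {
    return { error: 'name must be non-empty and at most 128 characters' };
  }
  if (args.message !== undefined && args.message.length === 0) {
    return { error: 'message must be non-empty' };
  }
  // !(x > 0) also rejects NaN, which a plain x <= 0 check would let through.
  if (args.timeout !== undefined && !(args.timeout > 0)) {
    return { error: 'timeout must be positive' };
  }
  return { ok: true };
}

const validCall = validateCompanionArgs({ name: 'reviewer', message: 'Review v1.' });
const emptyMessage = validateCompanionArgs({ name: 'reviewer', message: '' });
const longName = validateCompanionArgs({ name: 'x'.repeat(129) });
const badTimeout = validateCompanionArgs({ name: 'reviewer', timeout: 0 });
```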
Child Naming
Children can be named explicitly via the name argument in companion__spawnAgent, or auto-named using the pattern {agentType}-{counter} (e.g., researcher-1, researcher-2).
// Explicit naming
spawnAgent({
agent: 'researcher',
initialMessage: 'Research AI safety',
name: 'safety-researcher',
});
// Auto-naming (uses counter)
spawnAgent({ agent: 'researcher', initialMessage: 'Research quantum computing' });
// -> named 'researcher-1'
spawnAgent({ agent: 'researcher', initialMessage: 'Research fusion energy' });
// -> named 'researcher-2'
Session IDs
Each persistent child gets a deterministic session ID: {parentSessionId}-agent-{name}. This enables:
- Stable references across parent restarts
- Predictable state store lookups
- Clean cleanup on parent completion
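The two string patterns, {agentType}-{counter} for auto-names and {parentSessionId}-agent-{name} for session IDs, can be sketched as follows. createNamer and childSessionId are hypothetical helpers; in particular, keeping one counter per agent type and leaving it untouched by explicit names is an assumption chosen to match the naming example above.

```typescript
// Hypothetical namer: one counter per agent type (an assumption), and explicit
// names do not consume a counter value.
function createNamer(): (agentType: string, explicitName?: string) => string {
  const counters: Record<string, number> = {};
  return (agentType, explicitName) => {
    if (explicitName) return explicitName;
    const next = (counters[agentType] ?? 0) + 1;
    counters[agentType] = next;
    return `${agentType}-${next}`;
  };
}

// Deterministic session id: the same parent and name always map to the same id.
function childSessionId(parentSessionId: string, childName: string): string {
  return `${parentSessionId}-agent-${childName}`;
}

const nameChild = createNamer();
const explicitChild = nameChild('researcher', 'safety-researcher');
const firstAuto = nameChild('researcher');
const secondAuto = nameChild('researcher');
const firstAutoSession = childSessionId('sess-42', firstAuto);
```

Because the id is a pure function of its inputs, a restarted parent can recompute a child's session id without storing it.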
Re-spawning
If a child with the same name is spawned after a previous one reached a terminal status, behavior depends on the prior status and the runtime:
- completed (all five runtimes): the spawn continues on the preserved session (the old session is not deleted, so the child's memory is retained) and the initialMessage is appended as the next turn's input. The SubSessionRef is reset to running and its completionDelivered flag is cleared so the new run's completion is delivered. On durable runtimes the continuation is replay-safe: Temporal does the store-side reopen inside an activity; DBOS starts a fresh persistent restart workflow with a deterministic id derived from the toolCall.id (so a workflow-body replay never double-starts it) and CAS-gates the reopen so the consult is appended exactly once. On Cloudflare Workflows the continuation runs as a fresh workflow instance with a unique-but-deterministic id (agent__<type>__<childSession>__continue__<stepCount>__<toolCallId>; a Cloudflare Workflows instance id is write-once globally, so the completed child's base id can't be recreated). The consult is carried as the new instance's newMessages and appended exactly once by the instance's !isResumable continuation branch (the same path as root multi-turn continuation), so the child session stays completed until the new instance reopens it.
- failed / terminated (all runtimes): the old session is cleaned up (deleteSession, best-effort) and a new one starts fresh; the SubSessionRef is reset to running and completionDelivered is cleared.
You cannot re-use the name of a still-active child — that returns an already running error.
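The re-spawn rules amount to a small decision table keyed on the prior child's status. The sketch below encodes only the cases this section describes; planRespawn, RespawnPlan, and the action labels are hypothetical, and statuses the section does not cover (such as interrupted) are deliberately left out.

```typescript
// Statuses covered by the re-spawn rules above; other statuses are out of scope here.
type PriorStatus = 'running' | 'completed' | 'failed' | 'terminated';

type RespawnPlan =
  | { action: 'start-fresh' } // no prior child with this name
  | { action: 'continue-session' } // keep memory, append initialMessage as next turn
  | { action: 'cleanup-then-start' } // delete the old session, start a new one
  | { action: 'reject'; error: string };

function planRespawn(prior?: PriorStatus): RespawnPlan {
  if (prior === undefined) return { action: 'start-fresh' };
  if (prior === 'completed') return { action: 'continue-session' };
  if (prior === 'running') return { action: 'reject', error: 'already running' };
  return { action: 'cleanup-then-start' }; // failed or terminated
}

const afterCompleted = planRespawn('completed');
const afterFailed = planRespawn('failed');
const whileRunning = planRespawn('running');
```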
Re-consulting a persistent companion (the critic loop)
A persistent companion's defining feature is that it retains memory across rounds. The canonical use is a maker → critic loop: a parent produces an artifact, spawns a critic that returns a typed verdict, reads the verdict, fixes the artifact, then re-consults the SAME critic — which still remembers the prior round and can comment on what changed.
const CriticAgent = defineAgent({
name: 'critic',
systemPrompt: `You review the maker's artifact and return a verdict.
You remember prior rounds, so call out what changed since last time.`,
outputSchema: z.object({
verdict: z.enum(['pass', 'revise']),
notes: z.string(),
}),
tools: [loadArtifactTool], // loads the current artifact BY REFERENCE (see below)
llmConfig: { model: openai('gpt-4o-mini') },
});
const MakerAgent = defineAgent({
name: 'maker',
systemPrompt: `Produce an artifact, then consult the 'critic' companion.
Re-consult the SAME critic ('reviewer') after each fix until it returns
verdict: 'pass', or you hit your own round cap.`,
outputSchema: z.object({ final: z.string() }),
// `PersistentAgentConfig` has no `name` field — `agent` is the registered
// agent TYPE. The stable instance name ('reviewer') is pinned at SPAWN time
// via companion__spawnAgent({ name: 'reviewer', ... }); that name is what you
// re-consult on later rounds.
persistentAgents: [{ agent: CriticAgent, mode: 'blocking' }],
// companion__spawnAgent / sendMessage / waitForResult / ... auto-injected.
});
A single round of the loop, as the maker's LLM drives it:
// Round 1 — blocking spawn returns the typed verdict INLINE on `.output`.
const r1 = companion__spawnAgent({
agent: 'critic',
name: 'reviewer',
initialMessage: 'Review artifact v1.',
});
// r1 === { name: 'reviewer', status: 'completed', output: { verdict: 'revise', notes: '...' } }
if (r1.output.verdict === 'revise') {
// ...fix the artifact (v2)...
// Round 2 — re-consult the SAME critic by spawning with the SAME name.
// The session is continued (not recreated): the critic still remembers v1.
const r2 = companion__spawnAgent({
agent: 'critic',
name: 'reviewer',
initialMessage: 'Review artifact v2 — I addressed your notes.',
});
// r2.output.verdict is the FRESH verdict; loop until 'pass' or your cap.
}
Ergonomics: prefer blocking spawnAgent for a typed re-consult. A blocking spawnAgent re-consult returns the fresh typed verdict inline as the clean .output of the tool result; this is the recommended path. A companion__sendMessage re-consult returns only { delivered: true } (NOT the output); to read the verdict you must follow it with a companion__waitForResult({ name }) call in a later step. A minor cross-runtime nuance: on Temporal, a blocking child's sendMessage-continue may enrich the tool result with the output inline, but do not rely on that. For a portable typed re-consult, use blocking spawnAgent (or sendMessage + waitForResult).
Re-consulting with a revised artifact's bytes. The companion message (initialMessage on spawnAgent, message on sendMessage) is text-only — you cannot attach bytes (a revised image, a binary, etc.) to it. To re-consult over a non-text artifact, give the critic a tool that loads the artifact by reference (loadArtifactTool above, keyed by an id/path/URL the maker passes in the text message); the critic fetches the current bytes itself each round.
Known limitations:
- Cloudflare pre-heal upgrade gap. A structured-output companion that completed under a prior release (before the per-step __finish__ heal shipped) is auto-healed on its first re-consult on JS / Temporal / DBOS, but not on Cloudflare (DO or Workflows). On Cloudflare, complete such a child once under the new release before re-consulting; otherwise its first re-consult may send a malformed transcript (a dangling __finish__ tool_use with no tool_result).
- Same-turn waitForResult + sendMessage race. Calling companion__waitForResult and companion__sendMessage on the same completed child within one assistant message can race and return the stale prior output. Re-consult and read the result in separate turns (e.g. sendMessage / spawnAgent in one turn, read the verdict in the next).
Performance & resource considerations:
- Per-round token cost grows with history. Each re-consult is a continuation on the preserved child session: the critic remembers every prior round (this is intended; it is what makes the loop converge). But the child's transcript accumulates, so the token cost of each round grows with the number of rounds. Cap your re-consult rounds in your loop policy (the critic pattern above does this with a bounded for loop / a max-rounds guard), rather than looping unbounded until convergence.
- Each re-consult round gets a fresh per-turn maxSteps budget. A re-consult is a new turn on the child session, so the child's step count restarts and the full maxSteps budget applies again each round: maxSteps bounds a single round, not the critic's lifetime across rounds. (Same per-turn budget model as root continuation; see maxSteps.)
- Durable runtimes start a fresh child workflow/instance per round. On Temporal, DBOS, and Cloudflare Workflows, each re-consult begins a brand-new child workflow/instance for that round (the durability trade-off: every round is independently replayable/recoverable). For very high-frequency critic loops, the in-process JS runtime has the lowest per-round overhead (no workflow/instance spin-up).
- No built-in per-parent child cap. There is no framework limit on the number of distinct persistent children a parent may spawn. Re-consulting the same named child reuses its session (no new ref), but spawning many distinctly-named children grows the parent's SubSessionRef list unboundedly; bound this in your own agent design if a parent can fan out widely.
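A max-rounds guard of the kind recommended above might look like the following. This is a sketch under stated assumptions: runCriticLoop and the consult callback are hypothetical, the callback stands in for one blocking re-consult of the critic, and it is synchronous only to keep the example short.

```typescript
// Verdict shape taken from the critic's outputSchema above.
interface Verdict {
  verdict: 'pass' | 'revise';
  notes: string;
}

// Consult the critic at most maxRounds times; stop early on 'pass'.
// `consult` is a hypothetical stand-in for one blocking re-consult.
function runCriticLoop(
  consult: (round: number) => Verdict,
  maxRounds: number,
): { rounds: number; passed: boolean } {
  for (let round = 1; round <= maxRounds; round++) {
    if (consult(round).verdict === 'pass') {
      return { rounds: round, passed: true };
    }
  }
  // Cap reached without a pass: give up instead of looping forever.
  return { rounds: maxRounds, passed: false };
}

// A critic that accepts the artifact on the third round.
const passOnThird = runCriticLoop(
  (round) => ({ verdict: round >= 3 ? 'pass' : 'revise', notes: `round ${round}` }),
  5,
);

// A critic that never accepts, bounded by a cap of two rounds.
const neverPasses = runCriticLoop(() => ({ verdict: 'revise', notes: 'still wrong' }), 2);
```

The cap bounds both the number of child turns and the transcript growth that drives per-round token cost.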
How the parent learns a child finished
There are two ways the parent observes a child's outcome, and they cooperate so a completion is delivered exactly once:
- Pull (blocking spawn or companion__waitForResult): the result is returned inline as the tool result, in the same turn. The child's ref is flagged completionDelivered: true at that moment.
- Push (deliver-on-next-turn): for a non-blocking child that finishes on its own, a per-turn completion notifier injects a hidden message at the start of the parent's next turn (e.g. Sub-agent 'researcher-1' completed with result: …, or Sub-agent 'researcher-1' failed: … if it failed). This is failure-aware (a failed child yields a failure notification, not silence) and skips terminated children (the parent killed those deliberately).
The completionDelivered flag on the SubSessionRef is the durable dedup: once a child's outcome has been delivered by either path, the notifier won't re-inject it — even across a process restart, a parent re-entry, or a step retry. This is why a child waited-on via waitForResult doesn't also produce a duplicate push notification on the next turn.
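The dedup behavior can be modeled with a notifier that scans child refs once per turn. This is an illustrative sketch, not the framework's notifier: ChildRef is a reduced, hypothetical version of SubSessionRef, collectNotifications is a made-up name, and the detail field is a hypothetical place to hold the output or error text.

```typescript
// Reduced, hypothetical view of a child ref; `detail` holds output or error text.
interface ChildRef {
  name: string;
  status: 'running' | 'completed' | 'failed' | 'terminated';
  completionDelivered?: boolean;
  detail?: string;
}

// Build the hidden messages for the parent's next turn and mark them delivered.
function collectNotifications(refs: ChildRef[]): string[] {
  const notes: string[] = [];
  for (const ref of refs) {
    if (ref.completionDelivered) continue; // already delivered by pull or push
    if (ref.status === 'completed') {
      notes.push(`Sub-agent '${ref.name}' completed with result: ${ref.detail ?? ''}`);
    } else if (ref.status === 'failed') {
      notes.push(`Sub-agent '${ref.name}' failed: ${ref.detail ?? 'unknown error'}`);
    } else {
      continue; // running children are not done; terminated ones are skipped
    }
    ref.completionDelivered = true; // dedup flag: never re-inject this outcome
  }
  return notes;
}

const refs: ChildRef[] = [
  { name: 'researcher-1', status: 'completed', detail: 'findings A' },
  { name: 'researcher-2', status: 'failed', detail: 'search quota exceeded' },
  { name: 'researcher-3', status: 'completed', completionDelivered: true, detail: 'findings C' },
  { name: 'researcher-4', status: 'terminated' },
  { name: 'researcher-5', status: 'running' },
];

const firstTurn = collectNotifications(refs);
const secondTurn = collectNotifications(refs);
```

In the real system the flag is persisted with the ref, which is what makes the dedup survive restarts; the in-memory mutation here only illustrates the logic.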
State Tracking
Persistent children are tracked via SubSessionRef entries with mode: 'persistent':
interface SubSessionRef {
subSessionId: string;
agentType: string;
parentToolCallId: string;
status:
| 'running'
| 'completed'
| 'failed'
| 'interrupted'
| 'terminated'
| 'paused_awaiting_client'; // child suspended on a client-executed tool
startedAt: number;
completedAt?: number;
mode: 'ephemeral' | 'persistent'; // 'persistent' for companion-managed children
name?: string; // required when mode === 'persistent'
completionDelivered?: boolean; // durable dedup: outcome already delivered to the parent
}
Usage tracking
A persistent companion records token, tool, and custom usage (ctx.recordUsage) under its own child session, exactly like an ephemeral sub-agent, and the parent records a discoverable kind:'subagent' entry, so getUsageRollup({ includeSubAgents: true }) on the parent aggregates the companion's cost. A companion's usage accumulates across re-consults on its preserved session and is counted exactly once. The parent-side entry is a discovery pointer (provisional duration/success); read the child-session rollup for a companion's true outcome/duration. See Usage Tracking → Persistent companions. (Exception: the Cloudflare Durable Object runtime stores usage per-DO and can't aggregate cross-DO; the Cloudflare Workflows runtime requires a usageStore in its workflow dependencies. See that guide.)
Ephemeral vs Persistent Comparison
| Feature | Ephemeral (createSubAgentTool) | Persistent (persistentAgents) |
|---|---|---|
| Created by | Parent's tool call to subagent__<name> | companion__spawnAgent tool |
| Lifecycle | Runs to completion, result returned | Long-lived, can receive messages |
| Follow-up messages | Not supported | Via companion__sendMessage |
| Result access | Immediate (tool result) | Via companion__getChildStatus or companion__waitForResult |
| Naming | Auto-generated | Explicit or auto-incremented |
| Mode | Always blocking | Blocking or non-blocking |
| Session ID | {parentSessionId}-sub-{callId} | {parentSessionId}-agent-{name} |
| SubSessionRef mode | 'ephemeral' | 'persistent' |
Example: Research Coordinator
const ResearcherAgent = defineAgent({
name: 'researcher',
systemPrompt: 'You research topics thoroughly and return findings.',
outputSchema: z.object({
findings: z.string(),
sources: z.array(z.string()),
}),
tools: [searchTool],
llmConfig: { model: openai('gpt-4o-mini') },
});
const CoordinatorAgent = defineAgent({
name: 'coordinator',
systemPrompt: `You coordinate research tasks.
You have persistent researcher children that you can spawn, send messages to, and check results.
1. Spawn researchers for different topics
2. Wait for their results
3. Compile a final summary`,
outputSchema: z.object({ summary: z.string() }),
persistentAgents: [
{
agent: ResearcherAgent,
mode: 'blocking',
description: 'Spawns researcher agents for deep dives',
},
],
llmConfig: { model: openai('gpt-4o') },
});
The coordinator can then:
- Spawn researcher-1 for topic A
- Spawn researcher-2 for topic B
- Check status or wait for results
- Terminate if needed
- Compile final output
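The names `researcher-1` and `researcher-2` suggest an "explicit or auto-incremented" naming rule. A minimal sketch of such a rule follows; the `{agentName}-{n}` format is inferred from the example names, and `nextChildName` is a hypothetical helper, so the framework's actual allocation logic may differ.

```typescript
// Pick a name for a new persistent child.
// An explicit name wins; otherwise use the lowest free "{agentName}-{n}" suffix.
function nextChildName(
  agentName: string,
  existing: Iterable<string>,
  explicit?: string,
): string {
  if (explicit !== undefined) return explicit;
  const taken = new Set(existing);
  let n = 1;
  while (taken.has(`${agentName}-${n}`)) n += 1;
  return `${agentName}-${n}`;
}
```

With no children yet, the helper returns `researcher-1`; with `researcher-1` already present, it returns `researcher-2`.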
Runtime Support
Persistent sub-agents work across all five runtimes. The companion-tool surface is identical everywhere — the same tools (5 always, plus companion__waitForResult when a child is mode: 'blocking'), the same arguments, the same terminate-truth / completion-dedup semantics — because every runtime routes through the shared core dispatcher (executeCompanionToolDispatch). The differences below are in how a runtime starts and waits on a child, not in the LLM-facing behavior.
| Runtime | Blocking | Non-blocking | Workspaces on children | Notes |
|---|---|---|---|---|
| JS | Yes | Yes | Yes | In-process execution; the parent dispatcher parks blocking spawns on the child loop. |
| Temporal | Yes | Yes | No (fail-fast) | Child workflows; blocking spawn returns the child's output inline. HITL inside children supported. |
| Cloudflare Workflows | Yes | Yes | No (fail-fast) | Nested workflow instances; HITL inside children supported via cascade. |
| Cloudflare DO | Yes | Yes | Yes | Children are sibling DOs dispatched via subAgentNamespace (v7.0, commit fb3180f6b). |
| DBOS | Yes¹ | Yes | No (fail-fast) | Durable child workflows; waitForResult uses durable DBOS.sleep; completion notifier is a @DBOS.step. |
¹ DBOS non-blocking spawn is fully supported. DBOS blocking spawn has a known limitation (it blocks until the workflow is idle and can mis-report a failed child) — tracked as FU-DBOS-BLOCKING-SPAWN-SEMANTICS. Prefer non-blocking spawn + companion__waitForResult on DBOS until that lands.
Workspaces on persistent children are supported only on the JS and Cloudflare DO runtimes. On Temporal, Cloudflare Workflows, and DBOS, a persistent child that declares a workspace (and is not inheritWorkspace: true) fails fast at spawn with a clear error (the "C8" guard), because those orchestration models can't host the stateful workspace lifecycle. See Workspaces below.
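The fail-fast rule can be sketched as a spawn-time check. The runtime labels, the `PersistentChildConfigModel` shape, and `assertChildWorkspaceSupported` are hypothetical stand-ins chosen for illustration; only the rule itself (own workspace allowed on JS and Cloudflare DO, rejected elsewhere unless the child inherits) comes from the table and paragraph above.

```typescript
// Hypothetical identifiers for the five runtimes in the table.
type RuntimeName = 'js' | 'temporal' | 'cloudflare-workflows' | 'cloudflare-do' | 'dbos';

interface PersistentChildConfigModel {
  name: string;
  declaresWorkspace: boolean; // the child agent declares its own workspace
  inheritWorkspace?: boolean; // the child shares the parent's registry instead
}

// Runtimes that can host a persistent child's own workspace, per the table.
const WORKSPACE_CAPABLE: ReadonlySet<RuntimeName> = new Set<RuntimeName>([
  'js',
  'cloudflare-do',
]);

// Throw at spawn time instead of failing later inside the child's run.
function assertChildWorkspaceSupported(
  runtime: RuntimeName,
  child: PersistentChildConfigModel,
): void {
  const needsOwnWorkspace = child.declaresWorkspace && child.inheritWorkspace !== true;
  if (needsOwnWorkspace && !WORKSPACE_CAPABLE.has(runtime)) {
    throw new Error(
      `Persistent child "${child.name}" declares a workspace, ` +
        `which the ${runtime} runtime does not support`,
    );
  }
}
```

Checking at spawn means a misconfigured child is rejected before any workflow or child session is created.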
Workspaces
Persistent sub-agents can use workspaces in two modes:
Per-invocation (default)
Each companion__spawnAgent and companion__sendMessage cycle opens the child's workspaces fresh and closes them when the child exits. This is the safe default, but cost-bearing providers (Cloudflare sandboxes, R2 namespaces) pay the open() cost N times for N sends.
State persistence across invocations depends on the provider:
- InMemoryWorkspaceProvider: state is LOST between invocations (in-memory state has no backing store).
- CloudflareFileStoreWorkspace / CloudflareSandboxWorkspace: state persists via the underlying Durable Object storage; close+reopen cycles reattach to the same R2 prefix / sandbox container.
- LocalBashWorkspace: state is LOST (per-cycle tmpdir).
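A toy model makes the per-invocation trade-off concrete: every invocation pays one open, and whether earlier writes survive depends on what the provider reattaches to. `WorkspaceModel` and both factory functions are hypothetical and do not mirror the real provider interfaces.

```typescript
// Hypothetical workspace model: open() returns the files visible to one invocation.
interface WorkspaceModel {
  opens: number;
  open(): Map<string, string>;
}

// In-memory style: every open() starts from an empty state.
function makeInMemoryWorkspace(): WorkspaceModel {
  return {
    opens: 0,
    open() {
      this.opens += 1;
      return new Map<string, string>();
    },
  };
}

// Backed style: every open() reattaches to the same durable store.
function makeBackedWorkspace(): WorkspaceModel {
  const durable = new Map<string, string>();
  return {
    opens: 0,
    open() {
      this.opens += 1;
      return durable;
    },
  };
}

// Run `count` invocations (one spawn plus follow-up sends), opening the workspace each time.
// Returns how many later invocations could still see the file written by the first one.
function runInvocations(ws: WorkspaceModel, count: number): number {
  let sawFirstWrite = 0;
  for (let i = 0; i < count; i += 1) {
    const files = ws.open();
    if (i === 0) files.set('/notes/idea.md', 'draft');
    else if (files.has('/notes/idea.md')) sawFirstWrite += 1;
    // The real lifecycle closes the workspace here; the model has nothing to release.
  }
  return sawFirstWrite;
}
```

Three invocations cost three opens in both models, but only the backed workspace lets the second and third invocation read what the first one wrote.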
Persistent
Not yet honored in v1 — reserved
workspaceLifetime: 'persistent' is a reserved type field. Only 'per-invocation' is currently honored by the JS runtime; 'persistent' is documented as the intended option but executor support is pending (per the workspaceLifetime JSDoc on packages/core/src/types/agent.ts). Until support lands, a config setting 'persistent' behaves as 'per-invocation'. To minimize per-invocation open-cost in the meantime, use providers whose resolve() reattaches efficiently (CloudflareFileStoreWorkspace, CloudflareSandboxWorkspace).
Once supported, setting workspaceLifetime: 'persistent' on the persistentAgents entry will keep workspaces open across the child's lifetime. They would close only when the child is terminated (via companion__terminateChild) or the parent shuts down.
persistentAgents: [
{
agent: ResearcherAgent,
mode: 'blocking',
workspaceLifetime: 'persistent', // reserved; currently behaves as 'per-invocation'
},
],
Use 'persistent' when:
- The child's workspaces have meaningful open-cost (sandboxes, network-backed FS).
- The child receives many sendMessage calls in quick succession.
Use 'per-invocation' (default) when:
- The child is short-lived or rarely re-invoked.
- Workspace state needs to be reset between invocations.
- The provider already handles open-cost cheaply (in-memory).
Inheriting the parent workspace
Set inheritWorkspace: true on a persistentAgents entry to share the parent's workspace with the child. Same semantics as the ephemeral createSubAgentTool({ inheritWorkspace: true }) flag — the child runs against the parent's WorkspaceRegistry directly. An inheriting child must NOT also declare its own workspace; doing so throws a clear, named error at sub-agent execution time.
persistentAgents: [
{
agent: ResearcherAgent,
mode: 'blocking',
inheritWorkspace: true, // child shares the parent's WorkspaceRegistry
},
],
When inheritWorkspace: true is set, the workspaceLifetime field has no effect: the child uses the parent's registry, whose lifetime is bounded by the parent's runLoop.
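The interaction between `inheritWorkspace`, a child's own workspace, and `workspaceLifetime` can be summarized as a small resolution function. This is a sketch only: `InheritWorkspaceConflictError`, `ChildWorkspaceOptions`, and `resolveChildWorkspace` are hypothetical names (the source says only that the real error is clear and named), and the v1 behaviour of treating every lifetime as `'per-invocation'` is taken from this page.

```typescript
// Hypothetical error class standing in for the framework's named error.
class InheritWorkspaceConflictError extends Error {
  constructor(childName: string) {
    super(`Sub-agent "${childName}" sets inheritWorkspace: true but also declares its own workspace`);
    this.name = 'InheritWorkspaceConflictError';
  }
}

interface ChildWorkspaceOptions {
  name: string;
  declaresWorkspace: boolean;
  inheritWorkspace?: boolean;
  workspaceLifetime?: 'per-invocation' | 'persistent';
}

type ResolvedWorkspace =
  | { source: 'parent-registry' }
  | { source: 'own-registry'; lifetime: 'per-invocation' }
  | { source: 'none' };

function resolveChildWorkspace(child: ChildWorkspaceOptions): ResolvedWorkspace {
  if (child.inheritWorkspace === true) {
    // An inheriting child must not also declare its own workspace.
    if (child.declaresWorkspace) throw new InheritWorkspaceConflictError(child.name);
    return { source: 'parent-registry' }; // workspaceLifetime is ignored here
  }
  if (!child.declaresWorkspace) return { source: 'none' };
  // As of v1, every workspaceLifetime value behaves as 'per-invocation'.
  return { source: 'own-registry', lifetime: 'per-invocation' };
}
```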
Status of workspaceLifetime (round-5 D10). The workspaceLifetime field on a persistentAgents entry is reserved for future use. As of v1, all values behave as 'per-invocation': the workspace opens at sub-agent spawn/resume and closes at sub-agent exit. The 'persistent' lifetime (the workspace stays open across multiple companion__sendMessage calls) is filed as a known follow-up. Until support lands, prefer providers whose resolve() reattaches efficiently (CloudflareFileStoreWorkspace, CloudflareSandboxWorkspace) to minimize per-invocation cost.
Practical guidance. Use inheritWorkspace: true when the child should operate on the parent's workspace (a working notes workspace, a shared cache). For sub-agent operations whose workspace state must be isolated from the parent's session, leave inheritWorkspace unset and declare a workspace on the child config — the child's runLoop owns its own registry and persistRef writes to the child's session state.
Sibling workspace visibility (round-5 D17)
When TWO sibling sub-agents both opt into inheritWorkspace: true, they share the SAME physical workspace storage via the parent's registry. Concretely:
- Sibling A writes /notes/idea.md. Sibling B can read it back.
- Sibling B's writes are visible to Sibling A and to the parent.
- All three (parent + A + B) operate against the same WorkspaceRegistry instance.
This is the natural consequence of registry sharing — the registry is a per-session singleton, and inheritWorkspace: true means "use the parent's registry directly." There is no per-sub-agent isolation when inheriting.
If sibling sub-agents need workspace isolation from each other, do NOT set inheritWorkspace: true on either; declare each sub-agent's workspace on the child config so each gets its own isolated registry.
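The sharing rule reduces to object identity: an inheriting child receives the parent's registry instance, while a non-inheriting child gets its own. The sketch below uses a hypothetical `RegistryModel` class in place of the real `WorkspaceRegistry`, whose API is not shown on this page.

```typescript
// Hypothetical stand-in for a workspace registry: one file map per instance.
class RegistryModel {
  private files = new Map<string, string>();

  write(path: string, content: string): void {
    this.files.set(path, content);
  }

  read(path: string): string | undefined {
    return this.files.get(path);
  }
}

// A child either reuses the parent's registry (inheritWorkspace: true) or gets a fresh one.
function registryForChild(parent: RegistryModel, inheritWorkspace: boolean): RegistryModel {
  return inheritWorkspace ? parent : new RegistryModel();
}

const parentRegistry = new RegistryModel();
const siblingA = registryForChild(parentRegistry, true);
const siblingB = registryForChild(parentRegistry, true);
const isolatedC = registryForChild(parentRegistry, false);

// A write by sibling A is visible to sibling B and the parent, but not to isolated child C.
siblingA.write('/notes/idea.md', 'from A');
```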
Reserved tool prefixes
The companion__ prefix is reserved by the framework for the auto-injected companion tools. User-defined tools whose name starts with companion__ cause defineAgent() to throw at build time, regardless of whether the agent declares any persistentAgents. This is enforced unconditionally so the prefix's reserved status is a stable contract — your agent code keeps working when you add a persistent sub-agent later. Use any other naming pattern (e.g. helper__listChildren, myCompanion) for your own tools.
The workspace_ prefix is similarly reserved (see Workspaces — reserved prefix).
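A build-time check of this kind might look like the following sketch. `assertNoReservedToolNames` is a hypothetical helper; the real check runs inside `defineAgent()` and its error message is not shown on this page. The two prefixes are the ones named above.

```typescript
// Prefixes named in the text above as reserved by the framework.
const RESERVED_TOOL_PREFIXES = ['companion__', 'workspace_'] as const;

// Reject any user-defined tool whose name starts with a reserved prefix.
// The check is unconditional: it does not depend on whether persistentAgents is set.
function assertNoReservedToolNames(toolNames: readonly string[]): void {
  for (const name of toolNames) {
    const prefix = RESERVED_TOOL_PREFIXES.find((p) => name.startsWith(p));
    if (prefix !== undefined) {
      throw new Error(`Tool name "${name}" uses the reserved prefix "${prefix}"`);
    }
  }
}
```

Names such as `helper__listChildren` and `myCompanion` pass the check, while `companion__listChildren` is rejected.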
Best Practices
1. Clear Output Schemas
Define precise output schemas for clear contracts:
// Good: Specific schema
const agent = defineAgent({
outputSchema: z.object({
sentiment: z.enum(['positive', 'negative', 'neutral']),
confidence: z.number().min(0).max(1),
reasoning: z.string(),
}),
});
// Avoid: Vague schema
const vagueAgent = defineAgent({
outputSchema: z.object({
result: z.unknown(), // What is this?
}),
});
2. Descriptive Sub-Agent Tools
Help the parent LLM choose correctly:
const tool = createSubAgentTool(AnalyzerAgent, z.object({ text: z.string() }), {
description: `Analyze text for sentiment and extract key topics.
Use when you need:
- Sentiment classification (positive/negative/neutral)
- Topic extraction from text
- Confidence scores for analysis
Returns: { sentiment, confidence, topics }`,
});
3. Appropriate Granularity
Balance specialization vs. overhead:
// Good: Meaningful specialization
const FactCheckerAgent = defineAgent({
/* verifies claims */
});
const SummarizerAgent = defineAgent({
/* creates summaries */
});
// Avoid: Over-granular
const CapitalizerAgent = defineAgent({
/* just capitalizes text */
});
// ^ This is better as a regular tool or string method
4. Handle Sub-Agent Limits
Set appropriate maxSteps for sub-agents, and use timeoutMs to enforce wall-clock limits:
const SubAgent = defineAgent({
name: 'focused-task',
maxSteps: 5, // Sub-agents should complete quickly
// ...
});
const subAgentTool = createSubAgentTool(SubAgent, inputSchema, {
timeoutMs: 30_000, // 30-second wall-clock limit (important in DO runtime)
});
5. Test Sub-Agents Independently
Sub-agents are full agents - test them alone first:
// Test sub-agent directly
const subHandle = await executor.execute(AnalyzerAgent, 'Test text');
const subResult = await subHandle.result();
expect(subResult.status).toBe('completed');
// Then test in orchestration
const parentHandle = await executor.execute(OrchestratorAgent, 'Analyze this');
const parentResult = await parentHandle.result();
Limitations
No Shared State
Sub-agents cannot access parent state. Design inputs/outputs to carry needed context:
// Pass context through input
const tool = createSubAgentTool(
SubAgent,
z.object({
query: z.string(),
context: z.object({
previousFindings: z.array(z.string()),
constraints: z.array(z.string()),
}),
})
);
Sequential by Default
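The parent's own loop is sequential: it does not continue until every sub-agent call in the current step has returned. Multiple sub-agent calls in one LLM response may execute in parallel with each other, but the parent waits for all of them before continuing.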
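The waiting behaviour resembles a `Promise.all` over the calls in one step. The sketch below illustrates the parallel case only; `fakeSubAgentCall` and `runParentStep` are hypothetical stand-ins that use timers instead of real agents, and this is not the executor's actual scheduling code.

```typescript
// Hypothetical stand-in for one sub-agent tool call that resolves after delayMs.
function fakeSubAgentCall(name: string, delayMs: number, log: string[]): Promise<string> {
  return new Promise((resolve) => {
    setTimeout(() => {
      log.push(`${name} done`);
      resolve(`${name}-result`);
    }, delayMs);
  });
}

// One parent step: start every call from the LLM response, then wait for all of them.
async function runParentStep(log: string[]): Promise<string[]> {
  const results = await Promise.all([
    fakeSubAgentCall('analyzer', 30, log),
    fakeSubAgentCall('summarizer', 10, log),
  ]);
  // Reached only after the slowest call has finished.
  log.push('parent continues');
  return results;
}
```

The faster call finishes first, but the parent's next step starts only after the slowest call has returned, and the results come back in call order rather than completion order.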
Overhead
Each sub-agent invocation includes:
- State initialization
- Full agent loop (potentially multiple LLM calls)
- State persistence
For simple transformations, prefer regular tools.
Lifecycle Hook Guarantees
Sub-agents fire their own lifecycle hooks (onAgentStart, onAgentComplete, onAgentFail) independently from the parent agent. This is important for tracing integrations that need to emit spans for each sub-agent. The parent's stream is not closed when a sub-agent completes — only its hooks fire. This behavior is consistent across all runtimes. See Sub-Agent Execution Internals for implementation details.
Next Steps
- Remote Agents - Delegate to agents on separate HTTP services
- Streaming - Handle sub-agent stream events
- Interrupt and Resume - How interrupts propagate through sub-agent hierarchies
- Runtimes - How different runtimes handle sub-agents
- Persistent Sub-Agents example - A runnable, offline (mock-LLM) companion-tool demo
- Examples - Real-world orchestration examples
- Hooks - Observe sub-agent execution with beforeSubAgent and afterSubAgent hooks