v6 to v7 Migration Guide — Stateless Suspension Redesign

This guide covers the v7 release of @helix-agents/*, the largest single-version change since the project began. v7 reshapes how the runtime suspends and resumes for human-in-the-loop (HITL) interactions — client-executed tools and approval-gated tools — by removing every in-memory bridge between an HTTP request and the agent loop and making the state store the single source of truth across requests.

If you have been running v6 in production with chat-style HITL agents, read this guide end-to-end before upgrading. Some changes are forward-compatible-only at the storage layer (rolling back to v6 after applying migrations is unsafe; see Rollback Semantics).


Table of contents

  1. Overview / motivation
  2. Breaking changes per package
  3. Operational guidance
  4. Code migration examples
  5. Rollback semantics
  6. Validation checklist

Overview / motivation

What changed and why

In v6, when a Helix agent reached a HITL boundary (a client-executed tool, or an approval-gated tool) the runtime parked the in-flight JavaScript loop in memory and waited for the consumer to land a submitToolResult() call on the same process. In a Cloudflare Durable Object, that meant a hibernation guard kept the DO awake. In the JS runtime, that meant a setTimeout-driven promise dangling on the heap. In Temporal it meant a workflow blocked on a condition, which was the only place this design composed cleanly.

Three problems compounded on top of that model:

  1. Determinism timeouts at the editor seam. The flagship example was the editContent client tool: long edits would deterministically blow past the v6 client-tool deadline because the suspended waiter billed wall-clock time end-to-end.
  2. Stream resumption was a gut-feel system. AI SDK v6 introduced first-class streamId HITL primitives. Plumbing them through to v6 Helix required ad-hoc transports (HelixChatTransport) that short-circuited the SDK's own stream-close-and-reopen lifecycle.
  3. Cost ratchet on long pauses. Anything that paused longer than a few seconds amplified DO/Temporal wall-time billing. Internal data showed long-pause sessions could spend ~80% of their wall-time waiting on the human.

v7 in one sentence

The agent loop is allowed to die at every HTTP request boundary; the state store is the only durable thing across requests; resumption is driven by executor.resume() reading durable suspension context, not by signaling an in-memory waiter.

This unlocks:

  • Deterministic suspension semantics — the client-tool deadline now measures wall-clock from suspension write to submission write, not from in-memory dispatch to in-memory resolve.
  • Native AI SDK v6 stream-close-and-reopen integration via prepareHelixChatRequest and useResumeClientTools.
  • Approval-gated tools as a first-class primitive (requireApproval on defineTool) rather than something every consumer had to build by hand.
  • Cost reduction on long pauses, particularly on Cloudflare DO where the v6 hibernation guard kept the object loaded.

Scope

v7.0 ships HITL support across all five currently-supported runtimes:

  • runtime-js (full stateless)
  • runtime-cloudflare (Durable Object path) (full stateless)
  • runtime-temporal (full stateless)
  • runtime-cloudflare (Workflows path) (full stateless)
  • runtime-dbos (consumer-equivalent; see DBOS divergence note)

The first four runtimes use durable-state-only suspension at every HITL boundary — the workflow exits, durable state captures the suspension, and a subsequent executor.resume() re-enters from durable state. No runtime blocks on in-memory or in-step waits during HITL.

DBOS divergence

The DBOS runtime is NOT durable-state-only at the suspension boundary. DBOS implements HITL by blocking inline on DBOS.recv() inside the workflow — the workflow itself stays alive across the suspension, and the durable state captures only the input/output of the recv call rather than the full workflow exit/resume cycle.

This is consumer-equivalent for the public surface area: executor.execute() / submitToolResult() / executor.resume() behave identically from the user's perspective. The lifecycle hooks (onAgentSuspended / onAgentResumed) fire at the same logical points, and the persisted SessionState shape matches the other runtimes.

It's NOT identical for cost & operational characteristics:

  • DBOS workflows accumulate uptime for the duration of a HITL pause (because the workflow process stays alive). Long-running pauses on DBOS cost more compute than the same pause on JS/Temporal/CF.
  • DBOS hook firing order around Promise.all-driven sub-agent dispatch differs from the other runtimes (hooks fire BEFORE the Promise.all resolves, vs after on the others). Tracing pipelines that assume the post-Promise.all timing observe DBOS as an outlier.
  • Lifecycle hooks for the awaiting_children discriminator have a different code path on DBOS (the phase1ClientToolIds.length > 0 guard previously caused them to skip entirely; fixed in v7.0-final).

The DBOS hook-firing-order parity table in docs/internals/concepts.md records these differences precisely.


Breaking changes per package

Each package follows independent semver. The branch omnara/stateless-suspension-redesign is the v7 release train; consult each package's CHANGELOG.md for the specific v7 release version (e.g. core@0.29.x, runtime-temporal@2.2.x, ai-sdk@17.0.0, runtime-js@8.0.0, runtime-cloudflare@5.1.0, runtime-dbos@3.0.0). The list below highlights only consumer-observable changes — internal refactors are documented in the package CHANGELOG.

@helix-agents/core

New types:

  • RunOutcome<TOutput> — runtime-internal discriminated union returned by every runLoop implementation. Five variants: completed, failed, suspended_client_tool, suspended_awaiting_children, suspended_step_partial. Marked @internal but visible in stack traces and structured-logger payloads, so worth knowing.
  • StepOutcome — per-step counterpart of RunOutcome, also @internal.
  • SuspendedChildWait — describes a sub-agent the parent is waiting on; one entry per pending child in suspended_awaiting_children.

New requireApproval flag on defineTool:

ts
defineTool({
  name: 'sendEmail',
  parameters: z.object({ ... }),
  requireApproval: true,           // or (input, ctx) => boolean
  execute: async (input) => { ... }
});

requireApproval is mutually exclusive with execute: 'client' and finishWith: true. The function form fails closed: an exception inside the evaluator is treated as requireApproval = true, matching the Mastra precedent.
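The fail-closed rule can be sketched as a small helper. Everything here is illustrative — evaluateRequireApproval and its shapes are hypothetical, not the runtime's internals:

```typescript
// Hypothetical sketch of fail-closed requireApproval evaluation.
type RequireApprovalFlag<TInput, TCtx> =
  | boolean
  | ((input: TInput, ctx: TCtx) => boolean);

function evaluateRequireApproval<TInput, TCtx>(
  flag: RequireApprovalFlag<TInput, TCtx> | undefined,
  input: TInput,
  ctx: TCtx
): boolean {
  // Static form (or unset): no evaluator to run.
  if (typeof flag !== 'function') return flag === true;
  try {
    return flag(input, ctx);
  } catch {
    // Fail closed: an evaluator exception means "approval required".
    return true;
  }
}
```

The load-bearing line is the catch: a throwing evaluator yields true, so a buggy predicate can never silently skip the approval gate.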

New SubmitToolResult union:

The v6 single shape ({ toolCallId, result | error }) is now the client-tool-result side of a discriminated union, joined by a new approval-response variant:

ts
type SubmitToolResult =
  | { kind: 'approval-response'; toolCallId: string; approved: boolean; reason?: string }
  | { kind: 'client-tool-result'; toolCallId: string; result: unknown }
  | { kind: 'client-tool-result'; toolCallId: string; error: string };

Note on kind: v6 callers that do not pass kind are rejected at SubmitToolResultSchema.parse() time. The schema accepts both kinds; routing to the pending tool call happens off toolCallId, while kind selects the approval vs. client-tool flow.
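A dependency-free sketch of the discrimination that parse step enforces (the real schema is zod-based and lives at packages/core/src/types/client-tool-submit.ts; the result/error variants are collapsed here for brevity):

```typescript
// Illustrative reconstruction of the kind check — not the real schema.
type Submission =
  | { kind: 'approval-response'; toolCallId: string; approved: boolean; reason?: string }
  | { kind: 'client-tool-result'; toolCallId: string; result?: unknown; error?: string };

function parseSubmission(x: unknown): Submission {
  const v = x as Record<string, unknown> | null;
  if (!v || typeof v !== 'object' || typeof v.toolCallId !== 'string') {
    throw new Error('invalid submission');
  }
  if (v.kind !== 'approval-response' && v.kind !== 'client-tool-result') {
    // v6 callers that omit `kind` land here.
    throw new Error('missing or unknown kind');
  }
  return v as unknown as Submission;
}
```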

compareAndSetStatus return shape:

Old: Promise<boolean>. New: Promise<{ ok: true; newVersion: number } | { ok: false; currentStatus: SessionStatus; currentVersion: number }>.

This is the single most-commonly-tripped breaking change in v7. Update every call site:

ts
// v6
const ok = await store.compareAndSetStatus(sessionId, ['active'], 'paused');
if (ok) { ... }

// v7
const result = await store.compareAndSetStatus(sessionId, ['active'], 'paused');
if (result.ok) {
  console.log('promoted to version', result.newVersion);
} else {
  console.log('lost CAS — store is at', result.currentStatus, 'version', result.currentVersion);
}

saveStateAndPromoteStaging is now a required interface method on SessionStateStore, and it MUST be atomic. All in-tree stores ship atomic implementations. A previously-exported defaultSaveStateAndPromoteStaging() helper (non-atomic, sequential appendMessages → saveState → promoteStaging) was removed in P3.R3-BC-FALLBACK — the crash-between-calls window it created is exactly the corruption the atomic primitive was added to prevent. Custom stores: implement using a transaction (Postgres) or compare-and-swap (Redis/DO).
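For custom-store authors, a minimal in-memory sketch of the atomicity contract, using a version-based compare-and-swap in the spirit of the Redis/DO suggestion above (shapes and method signature are illustrative, not the real SessionStateStore types):

```typescript
// Illustrative shapes — the real row layout differs per store.
interface SessionRow {
  version: number;
  state: unknown;
  messages: unknown[];
  stagingMessages: unknown[];
}

class MemorySessionStore {
  private rows = new Map<string, SessionRow>();

  seed(sessionId: string, row: SessionRow) {
    this.rows.set(sessionId, row);
  }

  // Append staged messages, save state, and clear staging as ONE versioned
  // swap — no reader can observe messages appended but staging un-promoted,
  // which is the crash window the atomic primitive exists to close.
  saveStateAndPromoteStaging(
    sessionId: string,
    state: unknown,
    expectedVersion: number
  ): { ok: boolean; version: number } {
    const row = this.rows.get(sessionId);
    if (!row || row.version !== expectedVersion) {
      return { ok: false, version: row?.version ?? -1 };
    }
    this.rows.set(sessionId, {
      version: row.version + 1,
      state,
      messages: [...row.messages, ...row.stagingMessages],
      stagingMessages: [],
    });
    return { ok: true, version: row.version + 1 };
  }

  get(sessionId: string) {
    return this.rows.get(sessionId);
  }
}
```

In Postgres the same effect comes from wrapping the three writes in one transaction; the CAS-on-version shape above maps more directly to Redis and DO SQLite.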

New SessionState.failureReason field. When a session enters status: 'failed', this discriminator distinguishes a child marked failed because its parent suspended ('parent_suspended') from a genuine child execution failure. The γ-cascade in applyResultsAndReload uses this to decide re-spawn vs. observe — children with failureReason === 'parent_suspended' are re-spawned (parent's resume gives them another chance), while genuinely failed children are drained to the parent as failure results.

Custom store authors: persist failureReason round-trip. Stores that silently drop the field will break the γ-cascade — the parent's resume will treat suspended children as genuinely failed and skip re-spawn, producing dangling sub-agent state.
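The re-spawn-vs-drain decision can be sketched as a pure function — ChildRecord is a hypothetical shape, not the real SessionState type:

```typescript
// Illustrative child record; the real per-child state carries more fields.
interface ChildRecord {
  sessionId: string;
  status: 'failed' | 'completed';
  failureReason?: string; // e.g. 'parent_suspended'
}

function partitionChildrenOnResume(children: ChildRecord[]): {
  respawn: ChildRecord[]; // failed only because the parent suspended
  drain: ChildRecord[];   // genuinely done (or failed); report to the parent
} {
  const respawn: ChildRecord[] = [];
  const drain: ChildRecord[] = [];
  for (const child of children) {
    if (child.status === 'failed' && child.failureReason === 'parent_suspended') {
      respawn.push(child);
    } else {
      drain.push(child);
    }
  }
  return { respawn, drain };
}
```

A store that drops failureReason makes every failed child land in drain, which is exactly the dangling-sub-agent failure mode described above.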

@helix-agents/runtime-js

  • The legacy JSAgentExecutor.runLoop is gone (~1725 LOC removed). The new loop is built around runStepIterator() from core and uses the new RunOutcome discriminated union.
  • handle.result() now resolves to AgentResult.status of 'suspended_client_tool' | 'suspended_awaiting_children' | 'suspended_step_partial' for HITL agents that did not complete in-process. Existing switch statements that only handled 'completed' | 'failed' | 'interrupted' will fall through silently for HITL agents.
  • The client-tool wait moved from in-memory promise + setTimeout to durable state (SessionState.suspensionContext). Submission resumes the loop via executor.resume(), not by waking an in-process promise.
  • JsClientToolResolver is no longer exported from the public surface. Internal CF DO use only.

@helix-agents/runtime-cloudflare (DO path)

  • The hibernation guard is gone (~365 LOC removed). DOs are now free to evict during HITL waits; the alarm scheduler retains only heartbeat and interrupt-poll subscribers (the client-tool-deadline alarm subscriber is gone — deadlines are enforced in findExpiredPending at request time).
  • DOAlarmTimerStrategy is removed; the runtime no longer needs a timer strategy for client-tool waits.
  • persistentAgents are now supported on the DO path (commit fb3180f6b). Each persistent child runs in its own DO instance addressed by stable sessionId ({parent}-agent-{name}); the parent's six auto-injected companion__* tools dispatch via the subAgentNamespace DO stub against the existing sub-agent endpoints. The earlier v7.0 fail-fast guard has been lifted.

@helix-agents/runtime-cloudflare (Workflows path)

Full v7 stateless HITL support. The workflow body returns early from runAgentWorkflow when the run hits a HITL boundary (client tool, approval gate, or sub-agent suspension cascade). The exit point writes durable suspension state via the commitSuspendedStep activity and returns AgentWorkflowResult with status: 'suspended_client_tool', 'suspended_awaiting_children', or 'suspended_step_partial'. The workflow instance terminates — there is no step.waitForEvent keeping it alive across the HITL pause.

executor.resume(sessionId) starts a new workflow instance with mode: 'resume'. The new instance begins with applyResultsAndReload, which drains any queued submissions (pendingClientToolCalls whose results have landed) into the conversation and resumes the agent loop from the durable state snapshot. Sub-agent suspension cascades up — a child workflow that suspends propagates suspended_awaiting_children to its parent's AgentWorkflowResult, which terminates the parent workflow and writes parent suspension state for later resume.

agent.workspaces is still rejected at run-start on CFW Workflows. That gate remains a v7.1 deferral. Workspaces require the JS or CF DO runtimes.

@helix-agents/runtime-temporal

Full v7 HITL support. The TemporalAgentExecutor API surface is unchanged — execute(), resume(), submitToolResult(), interruptAgent(), abortAgent() — but several behaviors changed:

  • handle.result() for HITL agents now resolves with 'suspended_client_tool' | 'suspended_awaiting_children' | 'suspended_step_partial'. Consumers must handle the new statuses; there is no backwards-compat shim. Exhaustive switch statements that only handled 'completed' | 'failed' | 'interrupted' will fall through silently for HITL agents.
  • executor.resume(sessionId) starts a NEW workflow instance with workflow ID ${prefix}__${agentType}__${sessionId}__resume-${N} (single-dash suffix; WorkflowIdReusePolicy.ALLOW_DUPLICATE). The prior workflow has already exited at the HITL boundary; resume's mode='resume' branch calls the applyResultsAndReload activity, which drains submitted client-tool results into messages, fires onMessage + afterTool hooks, synthesizes timeouts for expired deadlines, and drains completed sub-agent children.
  • submitToolResult accepts the SubmitToolResult discriminated union with kind: 'client-tool-result' or kind: 'approval-response' (same as runtime-js). The submission is a durable write only — no Temporal signal is sent (the workflow has already exited at the HITL boundary).
  • Sub-agents run as Temporal child workflows. On parent suspension, in-flight children are marked failed with failureReason: 'parent_suspended' and re-spawned via the __resume-N workflow ID convention on the parent's resume (γ-cascade). Completed children are drained on parent resume via recordSubSessionResult.
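The resume workflow ID convention quoted above, as a tiny helper (illustrative — the runtime builds this internally):

```typescript
// Builds `${prefix}__${agentType}__${sessionId}__resume-${N}`; each resume
// increments N, and WorkflowIdReusePolicy.ALLOW_DUPLICATE permits re-spawn
// of a terminated child under the same convention.
function resumeWorkflowId(
  prefix: string,
  agentType: string,
  sessionId: string,
  n: number
): string {
  return `${prefix}__${agentType}__${sessionId}__resume-${n}`;
}
```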

Deleted exports — clean break, no aliases. Code that imported these must remove the imports:

  • runAgentWorkflow (function) → use the new agentWorkflow from @helix-agents/runtime-temporal/workflow. The post-A.2 rewrite collapsed runAgentWorkflow + the WorkflowActivities/WorkflowDeps injection shim into a single self-contained agentWorkflow function.
  • TemporalClientToolResolver → no replacement. Client-tool suspension is owned by durable state writes; there is nothing for consumers to wire.
  • executeClientToolInWorkflow → no replacement.
  • AgentWorkflowActivities, AgentWorkflowOptions → no replacement. Activities are now provided directly by GenericActivities from @helix-agents/runtime-temporal; the workflow body proxies them internally without consumer wiring.
  • registerToolResultHandler, getSubmittedToolResult, clearSubmittedToolResult → no replacement. Submit is durable-only post-A.2; the workflow has already exited at the HITL boundary by the time submitToolResult is called, so there's no in-workflow signal handler to register. Consumers drive the next loop iteration via executor.resume().
  • Five Temporal signal definitions are removed: submitToolResultSignal, childSuspendedSignal, childWokeSignal, runResumedSignal, plus the RUN_RESUMED_SIGNAL_NAME constant. INTERRUPT_SIGNAL_NAME is retained for cross-process interrupt backward compat.

Worker setup. Register agentWorkflow directly, or wrap it in a thin delegate so the worker bundles a workflow under the conventional name your TemporalAgentExecutor.workflowName expects:

ts
// workflows.ts (bundled by the worker)
import { agentWorkflow as v7AgentWorkflow } from '@helix-agents/runtime-temporal/workflow';
import type {
  AgentWorkflowInput,
  AgentWorkflowResult,
} from '@helix-agents/runtime-temporal';

// Re-export under whatever name your TemporalAgentExecutor.workflowName
// expects. Most deployments use the convention 'agentWorkflow' directly,
// in which case you can re-export by name without aliasing.
export async function agentWorkflow(
  input: AgentWorkflowInput
): Promise<AgentWorkflowResult> {
  return v7AgentWorkflow(input);
}

The imported agentWorkflow sets up its own proxyActivities, the INTERRUPT_SIGNAL_NAME handler, and child-workflow dispatch internally. The wrapper exists only to register under a stable name.

AgentRegistry.replace() (Cloudflare + Temporal)

Both runtime-cloudflare and runtime-temporal add a public AgentRegistry.replace(config) method that returns boolean (true if a prior agent under the same name was replaced; false if it was a fresh registration). Use this to install a per-call hook variant during tests without unregistering and re-registering:

ts
// Test-clone-with-hooks pattern
const cloned = { ...AgentDef, hooks: { afterTool: spy } };
registry.replace(cloned);
// ... run test ...
registry.replace(AgentDef); // restore

Throws if the agent was originally registered as a factory (registerFactory()); use unregister() + registerFactory() for that case. Verified at packages/runtime-cloudflare/src/registry.ts:188-200 and packages/runtime-temporal/src/registry.ts:231-243.

@helix-agents/runtime-dbos

Full v7 HITL support mirroring runtime-js + runtime-temporal. The F7 resume() contract fix lands in packages/runtime-dbos/src/lifecycle/resume.ts — a single multi-status CAS replaces the previous two-step-and-sometimes-skip pattern. Resumes now move through paused_awaiting_client | paused_awaiting_children | suspended_step_partial | interrupted to running atomically.

Hooks (onAgentSuspended / onAgentResumed / beforeTool / afterTool / onMessage / onStateChange) fire from runtime-dbos workflows identically to other runtimes — wired through the lifecycle pipeline at workflows/shared.ts + workflows/standard-workflow.ts.

Workspaces are unsupported on the DBOS runtime: agents declaring workspaces silently lose those tools, with no fail-fast guard. Tracked as future work — see docs/dev/future-work.md. Workaround: use runtime-js or CF DO until that gap closes.

@helix-agents/agent-server

Five new chat handler routes layered on top of the existing seven AgentExecutor routes:

  • POST /chat — start or continue a chat turn. Always streams.
  • GET /chat/{id}/stream — reattach to an in-progress stream after a refresh.
  • POST /chat/{id}/submit-tool-result — submit a client-tool result or approval response.
  • POST /chat/{id}/interrupt — durable interrupt; replaces the v6 in-memory interrupt(handle).
  • POST /chat/{id}/abort — abort the current run.

Wire them via AgentServer.toExpressMiddleware('/chat') (Express) or the equivalent generic adapter.

INTERRUPT_NOT_LOCAL 503 is removed. v6 returned a 503 from agent-server when interrupt was issued against a session whose handle was on a different process. v7 writes a durable interrupt request to the state store; the running loop picks it up at its next checkpoint.
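Conceptually, the durable interrupt is a checkpoint poll: the loop asks the store at each step boundary whether an interrupt was requested. A hedged sketch with illustrative store method names (not the real SessionStateStore API):

```typescript
// Hypothetical slice of the store surface used for durable interrupts.
interface InterruptStore {
  readInterruptRequested(sessionId: string): Promise<boolean>;
  clearInterrupt(sessionId: string): Promise<void>;
}

// Called by the loop at each checkpoint. Returns true when the loop should
// stop and persist an 'interrupted' status — no cross-process signal needed.
async function shouldInterrupt(
  store: InterruptStore,
  sessionId: string
): Promise<boolean> {
  if (!(await store.readInterruptRequested(sessionId))) return false;
  await store.clearInterrupt(sessionId);
  return true;
}
```

Because the request is a durable write, any process hosting the loop observes it; that is what makes the v6 "handle on a different process" 503 unnecessary.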

@helix-agents/ai-sdk

  • HelixChatTransport is deleted. It existed solely to short-circuit AI SDK v5's transport in service of v6's in-memory HITL bridge. v7 uses the SDK's native stream-close-and-reopen lifecycle.
  • New helper: prepareHelixChatRequest({ api, resumeFromSequence, existingMessageId }) — pass to DefaultChatTransport to drive resume.
  • New helper: prepareHelixReconnectRequest({ api }) — pass to DefaultChatTransport.prepareReconnectToStreamRequest so the AI SDK's built-in reconnectToStream after page-refresh hits GET /chat/{id}/stream with the right X-Resume-From-Sequence / X-Existing-Message-Id headers. Without this helper, page-refresh-during-stream silently hangs tool calls in pending forever because reconnectToStream defaults to a 404 HTML page rather than the SSE endpoint. This is the single most-frequently-missed migration step.
  • New React hook: useResumeClientTools({ chat, toolHandlers }) — intercepts tool_start chunks for client-executed tools, runs the consumer-supplied handler, and submits the result back to the server. Replaces the manual submitToolResult plumbing each consumer wrote in v6.
  • New helpers: extractResumeIntent (server-side, parses the client's resume cookie/header into { resumeFromSequence, existingMessageId }) and findExpiredPending (server-side, inspects pendingClientToolCalls for entries past their deadline, used by the chat handler to emit synthetic tool_error chunks rather than letting clients spin forever).
  • New orchestrator: handleChatStream({ executor, stateStore, streamManager, agent }, params) — the canonical chat handler. Pass it to AgentServer({ chatHandler }).
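The expiry pass that findExpiredPending performs can be sketched as a pure function; the pending-call and chunk shapes here are assumptions, not the exported types:

```typescript
// Illustrative shapes — the real pendingClientToolCalls entries differ.
interface PendingCall {
  toolCallId: string;
  deadlineAt: number; // epoch ms, measured from the suspension write
}

interface ToolErrorChunk {
  type: 'tool_error';
  toolCallId: string;
  error: string;
}

// Map every past-deadline pending call to a synthetic tool_error chunk so
// the client sees a terminal state instead of spinning forever.
function expiredToolErrorChunks(
  pending: PendingCall[],
  now: number
): ToolErrorChunk[] {
  return pending
    .filter((p) => p.deadlineAt <= now)
    .map((p) => ({
      type: 'tool_error' as const,
      toolCallId: p.toolCallId,
      error: 'client_tool_timeout',
    }));
}
```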

Storage adapters

The schema-backed state stores apply a forward migration that adds a suspension_context JSONB (or TEXT for D1) column to the session state row, plus indexes on pendingClientToolCalls for efficient expiration queries.

| Package | Migration |
| --- | --- |
| @helix-agents/store-postgres | V7 |
| @helix-agents/store-redis | no schema change; consumes new RedisJSON paths (version bump only) |
| @helix-agents/store-cloudflare (D1) | V9 |
| @helix-agents/store-cloudflare (DO SQLite) | V5 |
| @helix-agents/store-memory | in-memory; no migration needed |

All in-tree stores implement atomic saveStateAndPromoteStaging. There is no non-atomic fallback in core; third-party stores must ship their own atomic implementation before consumers upgrade.

@helix-agents/tracing-langfuse

  • Trace ID seed changed from runId to sessionId. This is a deliberate behavior change: with v7's stateless-suspension model, a single conversational session can span many runs (each resume after a HITL boundary creates a new run). Seeding from runId would produce one trace per run, fragmenting the chat conversation across many traces in the Langfuse UI. Seeding from sessionId keeps the entire conversation in one trace.
  • New hook handlers: onAgentResumed and onAgentSuspended produce matching event spans inside the session-scoped trace, so you can visually see where a run paused and where it resumed.

The legacy core/tracing/tracing-hooks.ts adapter is HITL-incompatible in v7: it relies on an in-memory tracingStateMap that the stateless-suspension redesign cannot populate across process restarts. v7 fail-fasts when requireApproval or client-tool agents are run with the legacy adapter. Migrate to @helix-agents/tracing-langfuse (or implement the v7 Logger-style interface in your own adapter) before upgrading.


Operational guidance

Pre-deploy checklist

  1. Drain agent traffic so the window between applying migrations and deploying stays under 60 seconds. v7's storage migrations are forward-compatible-only. New code reading old data is fine; old code reading new data is undefined behavior.
  2. Apply storage migrations before deploying new code. Each store ships a CLI migration runner; verify with:
    sql
    SELECT version FROM __agents_migrations;
    Postgres should show V7; D1 should show V9; the DO SQLite tier should show V5.
  3. Verify no Temporal / CFW Workflows agents declare agent.workspaces. Workspaces remain unsupported on those runtimes in v7.0 (still a v7.1 follow-up). HITL primitives (requireApproval, client-executed tools) and persistent sub-agents (CF DO) are all shipped in v7.0 — only agent.workspaces is gated outside the JS and CF DO runtimes.

Post-deploy monitoring

  • client_tool_timeout count should drop to 0 (or near it). Pre-v7, long-running client tools exceeded the in-memory deadline routinely. Post-v7, the deadline measures durable wall-clock from suspension write to submission write, so genuine timeouts should only occur for clients that actually never submit.
  • Wall-time billed on long-pause sessions should drop ~80%. The CF DO hibernation guard was the dominant cost driver. With it gone, a 5-minute HITL pause now bills approximately the cost of two HTTP requests instead of 5 minutes of DO uptime.
  • Watch the new structured logger events: client_tool.suspended, client_tool.submitted, client_tool.timeout, agent.resumed, agent.suspended. These replace the v6 in-memory bookkeeping and are now your only visibility into HITL state.

Recovery from stuck sessions

If a session ends up stuck in pendingClientToolCalls with no submission landing (e.g., a client crashed mid-flow), the operator runbook is:

  1. Inspect SessionState.pendingClientToolCalls to see which tool call IDs are pending.
  2. Use findExpiredPending(state) from @helix-agents/ai-sdk to identify expired entries.
  3. Force-fail by calling executor.submitToolResult({ kind: 'client-tool-result', toolCallId, error: 'operator_force_failed' }). This emits a synthetic tool error and resumes the run.
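The three runbook steps above can be sketched as one operator script. The submission body follows this guide's submitToolResult surface; the pending-call shape and the forceFailExpired helper itself are assumptions:

```typescript
// Hypothetical operator helper — not a shipped API.
interface PendingCall {
  toolCallId: string;
  deadlineAt: number; // epoch ms
}

async function forceFailExpired(
  executor: {
    submitToolResult(s: {
      kind: 'client-tool-result';
      sessionId: string;
      toolCallId: string;
      error: string;
    }): Promise<void>;
  },
  state: { pendingClientToolCalls?: PendingCall[] },
  sessionId: string, // root sessionId, per the routing invariant
  now = Date.now()
): Promise<string[]> {
  // Steps 1-2: inspect pending calls and keep only expired entries.
  const expired = (state.pendingClientToolCalls ?? []).filter(
    (p) => p.deadlineAt <= now
  );
  // Step 3: force-fail each one; this emits a synthetic tool error and
  // resumes the run.
  for (const { toolCallId } of expired) {
    await executor.submitToolResult({
      kind: 'client-tool-result',
      sessionId,
      toolCallId,
      error: 'operator_force_failed',
    });
  }
  return expired.map((p) => p.toolCallId);
}
```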

See the v6 client-executed-tools guide ("Operating in production" section) for additional context — most of it carries forward unchanged.

Temporal cutover

The v7 cutover on Temporal is a clean break. There is no worker versioning, no v6/v7 coexistence, no v6 drain path:

  1. Deploy v7 workers. They register agentWorkflow and the v7 GenericActivities surface only — the v6 runAgentWorkflow body, the v6 activity injection shim, and TemporalClientToolResolver are gone.
  2. Terminate any v6 workflows still running. Their durable state in store-postgres / store-redis is unaffected — affected sessions can be re-executed under v7 against the same sessionId.
  3. If you skip step 2, v6 workflows fail at the next activity call once they hit the v7 worker pool, because the v6 activity names are no longer registered.

There is no rollback path on the Temporal side either: the v6 code is deleted from the package. Pin to @helix-agents/runtime-temporal@6.x in your dependency manager if you genuinely need to revert; do not attempt to patch v6 behavior back into v7.

CFW Workflows v7 stateless cutover

CFW Workflows now exits the workflow instance on HITL boundaries (client tools, approval gates, sub-agent suspensions). v6 deployments that have in-flight workflows blocking on step.waitForEvent are not upgrade-compatible — those workflows will not resume under v7.

Cutover procedure:

  1. Deploy v7 workers alongside v6 workers (Cloudflare Workflows version routing).
  2. Drain v6 traffic: route new agent runs to v7; allow existing v6 workflows to complete or terminate.
  3. After all v6 workflows have terminated (typical: minutes to hours), decommission v6 workers.

Forced cut-over: terminate any in-flight v6 workflow instances via wrangler workflows instance terminate. Their durable session state is preserved; the next consumer call against the session triggers a fresh v7 workflow instance.

Billable wall-time reduction: v6 billed for the entire HITL wait duration (workflow instance running). v7 billing is bound to the active LLM + tool work plus a few seconds for resume bootstrap. Multi-minute approvals see ~80% wall-time reduction.


Code migration examples

Server-side route handler

Before (v6):

ts
import express from 'express';
import { createFrontendHandler } from '@helix-agents/ai-sdk';
import { JSAgentExecutor } from '@helix-agents/sdk';

const executor = new JSAgentExecutor({ /* ... */ });
const handler = createFrontendHandler({ executor, agent });

const app = express();
app.post('/api/chat', handler);

After (v7):

ts
import express from 'express';
import { AgentServer } from '@helix-agents/agent-server';
import { handleChatStream } from '@helix-agents/ai-sdk';
import { JSAgentExecutor } from '@helix-agents/sdk';

const executor = new JSAgentExecutor({ stateStore, streamManager, /* ... */ });

const server = new AgentServer({
  executor,
  // No auth shown — for production, configure `authenticate` or pass
  // `allowUnauthenticated: true` (v7 fail-fasts in the constructor if
  // neither is set).
  allowUnauthenticated: true,
  chatHandler: (params) =>
    handleChatStream({ executor, stateStore, streamManager, agent }, params),
});

const app = express();
// One call wires every route under /chat:
//   POST /chat
//   GET  /chat/:id/stream
//   POST /chat/:id/submit-tool-result
//   POST /chat/:id/interrupt
//   POST /chat/:id/abort
app.use('/api', server.toExpressMiddleware('/chat'));

Client-side useChat configuration

Before (v6):

tsx
import { useChat } from 'ai/react';
import { HelixChatTransport } from '@helix-agents/ai-sdk';

const transport = new HelixChatTransport({
  api: '/api/chat',
  sessionId,
  resumeFromSequence,
});

const { messages, sendMessage } = useChat({ transport });

After (v7):

tsx
import { useChat } from 'ai/react';
import { DefaultChatTransport } from 'ai';
import {
  prepareHelixChatRequest,
  useResumeClientTools,
} from '@helix-agents/ai-sdk/react';

const transport = new DefaultChatTransport({
  api: '/api/chat',
  prepareSendMessagesRequest: prepareHelixChatRequest({
    api: '/api/chat',
    resumeFromSequence: snapshot?.streamSequence,
    existingMessageId: snapshot?.existingMessageId,
  }),
});

const chat = useChat({ transport });

useResumeClientTools({
  chat,
  toolHandlers: {
    editContent: async (input, { toolCallId }) =>
      runEditContentClientSide(input, toolCallId),
  },
});

Tool definition with built-in approval

Before (v6) — roll your own:

ts
defineTool({
  name: 'sendEmail',
  parameters: z.object({
    to: z.string(),
    subject: z.string(),
    body: z.string(),
  }),
  execute: async (input, ctx) => {
    // v6 had no built-in approval; consumers wired up their own state
    // machine, often by emitting a custom chunk and then waiting on a
    // user-supplied "confirmation" tool.
    await ctx.emit({ type: 'awaiting_approval', input });
    const approved = await waitForApprovalSomehow(ctx);
    if (!approved) throw new Error('rejected');
    await actuallySendEmail(input);
    return { ok: true };
  },
});

After (v7) — first-class flag:

ts
defineTool({
  name: 'sendEmail',
  parameters: z.object({
    to: z.string(),
    subject: z.string(),
    body: z.string(),
  }),
  // Static form — every call requires approval:
  requireApproval: true,
  // Or function form — only require approval over a threshold:
  // requireApproval: (input) => input.body.length > 1000,
  execute: async (input) => {
    // Only runs after the approval response submits with approved=true.
    await actuallySendEmail(input);
    return { ok: true };
  },
});

When the LLM calls a requireApproval tool, the runtime:

  1. Emits a tool_approval_request stream chunk with the tool name and parsed input.
  2. Suspends the run with status suspended_client_tool (the same primitive carries both client-tool and approval flows; routing happens off the kind field of the submission).
  3. On submission with { kind: 'approval-response', approved: true }, resumes and runs execute().
  4. On submission with { kind: 'approval-response', approved: false, reason }, emits a tool_error chunk ('Tool call X was not approved by the user') and skips execute().
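Client-side, the approval round-trip amounts to showing the request to the user and POSTing an approval-response to the submit route. A hedged sketch — the chunk shape and helper names are illustrative; only the route and submission body follow this guide:

```typescript
// Hypothetical glue between an approval UI and the chat handler routes.
interface ApprovalRequestChunk {
  toolCallId: string;
  toolName: string;
  input: unknown;
}

async function onApprovalRequest(
  chatId: string,
  chunk: ApprovalRequestChunk,
  // Consumer-supplied UI prompt (e.g. a confirmation dialog).
  askUser: (toolName: string, input: unknown) => Promise<{ approved: boolean; reason?: string }>,
  // Injectable POST for testing; defaults to fetch against the chat routes.
  post: (url: string, body: unknown) => Promise<void> = async (url, body) => {
    await fetch(url, {
      method: 'POST',
      headers: { 'content-type': 'application/json' },
      body: JSON.stringify(body),
    });
  }
): Promise<void> {
  const { approved, reason } = await askUser(chunk.toolName, chunk.input);
  await post(`/chat/${chatId}/submit-tool-result`, {
    kind: 'approval-response',
    toolCallId: chunk.toolCallId,
    approved,
    reason,
  });
}
```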

Consuming handle.result()

Before (v6):

ts
const result = await handle.result();
switch (result.status) {
  case 'completed': /* output ready */ break;
  case 'failed':    /* error */ break;
  case 'interrupted': /* resumable */ break;
}

After (v7):

ts
const result = await handle.result();
switch (result.status) {
  case 'completed':                    /* output ready */ break;
  case 'failed':                       /* error */ break;
  case 'interrupted':                  /* resumable */ break;
  case 'suspended_client_tool':        /* client must submit results */ break;
  case 'suspended_awaiting_children':  /* sub-agents pending */ break;
  case 'suspended_step_partial':       /* partial-step suspend */ break;
}

The three new suspended_* statuses carry a result.suspended payload with the routing info (toolCallIds, children, stepId) that the chat handler needs to drive resume. Most consumers will never read this directly — handleChatStream does it on their behalf — but it is part of the public type surface and exhaustive switches need to handle it.

Submit-tool-result for client-executed tools

Before (v6):

ts
await executor.submitToolResult({
  sessionId,
  toolCallId,
  result: { url: 'https://...' },
});

After (v7):

ts
// client-tool-result variant
await executor.submitToolResult({
  kind: 'client-tool-result',
  sessionId: rootSessionId, // still required — routes to owning sub-agent
  toolCallId,
  result: { url: 'https://...' },
});

// approval-response variant
await executor.submitToolResult({
  kind: 'approval-response',
  sessionId: rootSessionId,
  toolCallId,
  approvalId,
  approved: true,
  reason: 'optional reason string',
});

sessionId is still required in v7 — it's the root sessionId, used by the executor to route the submission to the owning sub-agent via SessionState.clientToolCallOwnership. Per the routing invariant, submissions always go against the root sessionId, even when the pending tool call was emitted by a sub-agent.

The change from v6 is structural: SubmitToolResult is now a discriminated union (kind: 'client-tool-result' | 'approval-response') where sessionId lives inside the submission object alongside toolCallId. v6 callers that omit kind are rejected at schema-parse time; the schema lives at packages/core/src/types/client-tool-submit.ts and is the source of truth for the field set.


Rollback semantics

Rolling back from v7 to v6 is unsafe by default. The Postgres V7, D1 V9, and DO V5 migrations add a suspension_context column that v6 does not know about. v6 readers will simply ignore the column, but any session that paused under v7 and is then resumed under v6 will silently lose its suspension context — the client tool will appear to never resume, and the consumer will see a hung session.

If a rollback is genuinely necessary:

  1. Drain HITL agent traffic to zero. Confirm pending_client_tool_calls is empty for every active session. Note: this is a top-level TEXT column added in V3 (NOT a JSONB field on a state column — earlier versions of this guide had the wrong path; corrected per round-3 review #6 finding P1.M1):
    sql
    SELECT count(*) FROM __agents_states
      WHERE pending_client_tool_calls IS NOT NULL
        AND pending_client_tool_calls != '{}'
        AND pending_client_tool_calls != '';
  2. Optionally drop the suspension_context column to fully revert. This is destructive — any v7-era state in the column is lost, and sessions paused under v7 cannot be revived even by re-upgrading:
    sql
    -- Postgres
    ALTER TABLE __agents_states DROP COLUMN suspension_context;
    (D1 / DO SQLite have equivalents.)
  3. For sessions stuck in pending_client_tool_calls state during rollback, clean up by force-failing each one:
    sql
    UPDATE __agents_states
    SET pending_client_tool_calls = '{}',
        status = 'failed'
    WHERE pending_client_tool_calls IS NOT NULL
      AND pending_client_tool_calls != '{}'
      AND pending_client_tool_calls != '';
    Then notify affected clients out-of-band.

If you anticipate any chance of rollback, leave the column in place during the rollback (option 1 only). v6 ignores it, and re-upgrading to v7 lets the session pick back up where it left off.


Validation checklist

Run this checklist after deploying v7 to a non-production environment.

Pre-flight (storage)

  • [ ] Postgres migration V7 applied: SELECT version FROM __agents_migrations ORDER BY version DESC LIMIT 1 returns 7 or higher.
  • [ ] D1 migration V9 applied (Cloudflare deployments): same query against the D1 binding returns 9 or higher.
  • [ ] DO SQLite migration V5 applied (Cloudflare DO deployments): same query against the DO storage returns 5 or higher.

Pre-flight (versions)

  • [ ] Every @helix-agents/* package upgraded to its v7 release version per its CHANGELOG.md. Each package follows independent semver — there is no single ^7.0.0 constraint to grep for. Consult per-package changelogs in packages/*/CHANGELOG.md for the exact v7 versions.

Pre-flight (code)

  • [ ] No imports of the deleted HelixChatTransport: grep -r "HelixChatTransport" src/ returns no hits.
  • [ ] No imports of JsClientToolResolver from app code: grep -r "JsClientToolResolver" src/ returns no hits (the symbol is internal-only in v7).
  • [ ] No agents declare agent.workspaces on the Temporal or CFW Workflows paths (if applicable). HITL primitives and persistent sub-agents are now supported on every HITL-capable runtime; only agent.workspaces outside JS / CF DO remains a v7.1 deferral.
  • [ ] No imports of the deleted Temporal v6 surface: grep -r "runAgentWorkflow\|TemporalClientToolResolver\|executeClientToolInWorkflow\|AgentWorkflowActivities\|AgentWorkflowOptions\|registerToolResultHandler" src/ returns no hits.

Functional (HITL)

  • [ ] Client tool round-trip: start a chat that calls a client- executed tool, verify the run suspends with status suspended_client_tool, submit the result, verify the run resumes and produces output. Refresh mid-suspension and verify the stream reattaches.
  • [ ] Approval flow: call a requireApproval: true tool, verify a tool_approval_request chunk emits, submit { kind: 'approval-response', approved: true }, verify execute runs. Repeat with approved: false and verify a tool_error chunk emits without running execute.
  • [ ] Stream resumption: start a long stream (10+ seconds of tokens), refresh the page mid-stream, verify the client reattaches and continues receiving tokens.
  • [ ] Long-pause cost: start a chat, pause at a HITL boundary for five minutes, then submit. Inspect billing/wall-time metrics and confirm < 1 minute of runtime cost was billed (compared to ~5 minutes under v6 on the CF DO path).

Functional (Temporal)

  • [ ] HITL on Temporal: repeat the client-tool round-trip and approval-flow tests above against TemporalAgentExecutor. Verify handle.result() resolves with the v7 'suspended_*' statuses and executor.resume() starts a NEW workflow instance with a __resume-N workflow ID suffix that drains submitted results via the applyResultsAndReload activity.
  • [ ] Worker registers agentWorkflow (or a thin delegate) and no v6 activity names remain in the worker bundle.

Functional (non-HITL)

  • [ ] CFW Workflows HITL (smoke test): client-tool round-trip and approval-flow tests against a CFW Workflows-deployed agent; verify the workflow instance exits on suspension and that executor.resume() starts a fresh instance.
  • [ ] Sub-agents (ephemeral and persistent) work on JS, CF DO, and Temporal. CF DO persistent sub-agents dispatch via the subAgentNamespace DO stub (commit fb3180f6b); ephemeral sub-agents remain unchanged.

Observability

  • [ ] client_tool_timeout counter: trending toward 0 over 24h.
  • [ ] Langfuse traces: each chat session shows up as a single trace, not one trace per run.
  • [ ] client_tool.suspended and client_tool.submitted events visible in your structured log aggregator.

Getting help

The v7 release was a large rewrite. If something behaves differently from this document, file an issue — the document is the canonical contract for what v7 should do.

Released under the MIT License.