
Future work (separate sessions)

This doc captures meaningful sub-projects that are scoped to be tackled in a future, dedicated session rather than as inline follow-ups to the current branch's work.

These are NOT in the active follow-ups backlog (docs/dev/follow-ups.md) — they're roadmap items that need their own spec/plan/implementation cycle.


Workspaces parity matrix (Temporal / CFW Workflows / DBOS)

Why future: Substantial standalone feature. Each runtime needs workspace providers (local-bash, etc.) wrapped in its own deterministic boundary (activities for Temporal, steps for DBOS, workflow steps for CFW Workflows). Cross-cuts state-store semantics, sandboxing, filesystem isolation, and per-runtime sandbox enforcement.

Current state:

  • JS runtime: ✅ Full workspace support
  • CF Durable Objects (via agent-server): ✅ Full workspace support
  • Temporal (runtime-temporal): ❌ Run-start fail-fast at executor.ts:268. Documented as "not supported" in CLAUDE.md.
  • CFW Workflows (runtime-cloudflare/src/workflow.ts:392): ❌ Run-start fail-fast.
  • DBOS (runtime-dbos): ⚠️ Silent unsupported — execute-tool.ts:164 passes workspaces: undefined; no fail-fast guard. This is worse than Temporal/CFW Workflows because consumers get silent breakage rather than an immediate error.
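A sketch of what the missing DBOS guard could look like, mirroring the Temporal/CFW fail-fast behavior — all names here (`RunInput`, `WorkspaceUnsupportedError`, `assertNoWorkspaces`) are illustrative, not the actual runtime-dbos API:

```typescript
// Hypothetical run-start fail-fast guard for runtime-dbos, mirroring the
// Temporal / CFW Workflows behavior. Names are illustrative.
interface RunInput {
  workspaces?: Record<string, unknown>;
}

class WorkspaceUnsupportedError extends Error {
  constructor(runtime: string) {
    super(
      `Workspaces are not supported on the ${runtime} runtime yet; ` +
        `remove the workspaces config or use the JS / CF DO runtime.`
    );
    this.name = "WorkspaceUnsupportedError";
  }
}

// Called at run start, before any step executes, so consumers fail loudly
// instead of silently losing their workspace config.
function assertNoWorkspaces(input: RunInput, runtime = "dbos"): void {
  if (input.workspaces && Object.keys(input.workspaces).length > 0) {
    throw new WorkspaceUnsupportedError(runtime);
  }
}
```

Per the scoping below, this guard would land first and be removed again once real workspace step wrappers ship.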

Scope for the future session:

  1. Decide architectural model:

    • Option A: workspace providers run inside activities/steps (durability boundary preserved; activities can call non-deterministic I/O).
    • Option B: workspaces become "workflow-level" abstractions with deterministic state in the workflow body and async I/O delegated to dedicated workspace activities.
    • Option C: narrow workspaces to read-only scopes initially; full mutation support comes later.
  2. Per-runtime implementation:

    • Temporal: workspace activities + remove the fail-fast guard.
    • CFW Workflows: workspace step wrappers + remove the fail-fast guard.
    • DBOS: add fail-fast guard FIRST (so silent breakage becomes loud); then workspace step wrappers.
  3. Cross-runtime parity tests in packages/e2e exercising every workspace operation against every runtime.

  4. Update CLAUDE.md's parity matrix when complete.

Estimated effort: 2-3 weeks across all four runtimes. Dependencies: None — A.2/A.3/B all complete on the HITL surface. Owner: TBD; needs its own brainstorm session to choose the architectural model before scoping the plan.


F6 — Atomic counter merge across parallel tools

Why future: Real feature work that needs an API design decision. Consistently scoped out of D, A.1, A.2, and the original B carving as "separate sub-project" (per test-infrastructure-roadmap.md F6 entry and the A.2 design spec).

Current state: Parallel scalar customState writes are last-write-wins on every runtime + every state store (verified in packages/runtime-dbos/src/__tests__/integration/staging-atomicity.integ.test.ts and packages/e2e/src/__tests__/staged-state.integ.test.ts:335-338). Arrays with arrayDeltaMode: true correctly merge via deltas. Counters / numeric scalars don't. The DBOS staging-atomicity test was deliberately weakened from toBe(2) to toBeGreaterThanOrEqual(1) during D to mask the bug, and the limit was documented in CLAUDE.md and the upgrade guide rather than fixed.

Workaround for users today: Use array-append semantics (push entries, assert array length), or serialize via a single tool, or fan out via separate child workflows.
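The array-append workaround can be sketched as follows — the state shape and function names are illustrative, assuming arrayDeltaMode: true is set on the array key:

```typescript
// Illustrative sketch of the array-append workaround: parallel writers each
// push a delta entry (which merges correctly under arrayDeltaMode), and
// readers derive the counter from the array length instead of a scalar.
type CustomState = { events: string[] };

// What each parallel tool does instead of `state.count++`:
function recordIncrement(state: CustomState, toolCallId: string): void {
  state.events.push(toolCallId); // append-only: deltas merge, scalars don't
}

// What a reader does to recover the counter:
function currentCount(state: CustomState): number {
  return state.events.length;
}

const state: CustomState = { events: [] };
recordIncrement(state, "tool-call-a");
recordIncrement(state, "tool-call-b");
```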

Scope for the future session — API decision needed:

  1. Option F6.A — new MergeChanges opcodes. Add IncrementBy / DecrementBy opcodes to the existing MergeChanges schema. Tools opt in via either Immer-equivalent diff detection (recognize "this was an increment") or a new explicit API (ctx.incrementState('count', 1)). Pros: schema-free. Cons: detection is fragile; an explicit API diverges from Immer style.

  2. Option F6.B — schema decoration for merge strategy: z.number().describe('@merge:counter'). The state store reads the schema and applies a per-key merge on commit. Pros: declarative, automatic. Cons: couples the store to the schema; describe-based markers are brittle.

  3. Option F6.C — explicit increment API only (no opcodes): ctx.incrementState(path, 1) generates the right opcode internally. Pros: simplest, explicit, no magic. Cons: diverges from the natural Immer pattern users already know.

Implementation work (regardless of API choice): ~1 week.

  • Core: new opcode types + apply logic
  • Each state store's applyMergeToCustomState (5 stores)
  • Cross-runtime parity tests
  • Re-tighten staging-atomicity.integ.test.ts from toBeGreaterThanOrEqual(1) to toBe(N) for N concurrent writers
  • Update CLAUDE.md / upgrade guide to remove the documented limit
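As a rough shape for the "new opcode types + apply logic" bullet, an IncrementBy handler might look like this — the opcode shapes are assumptions, not the existing MergeChanges schema:

```typescript
// Hypothetical opcode shapes — the real MergeChanges schema may differ.
type MergeOp =
  | { op: "set"; path: string; value: unknown }
  | { op: "incrementBy"; path: string; delta: number };

// Apply a batch of ops to a flat custom-state record. With incrementBy,
// two parallel writers each emitting { delta: 1 } compose to +2 instead
// of last-write-wins.
function applyOps(state: Record<string, unknown>, ops: MergeOp[]): void {
  for (const op of ops) {
    if (op.op === "set") {
      state[op.path] = op.value;
    } else {
      const prev =
        typeof state[op.path] === "number" ? (state[op.path] as number) : 0;
      state[op.path] = prev + op.delta;
    }
  }
}

const s: Record<string, unknown> = { count: 0 };
// Simulates two parallel tools committing their staged ops in sequence:
applyOps(s, [{ op: "incrementBy", path: "count", delta: 1 }]);
applyOps(s, [{ op: "incrementBy", path: "count", delta: 1 }]);
```

Each of the 5 stores would implement the same semantics inside its own applyMergeToCustomState, which is what the cross-runtime parity tests would pin.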

Dependencies: None — independent of HITL/sub-agent/workspace work. Owner: TBD; needs its own brainstorm session to pick the API model before scoping the plan.


CFW Workflows γ-cascade re-spawn — ✅ landed

Closed by: commit 6cbd78808, tracked as FU-A2-40 in docs/dev/follow-ups.md. See the "Done items" section there for the full closure write-up. Mirrors runtime-temporal's FU-A2-09 closure: the parent's commitSuspendedStep marks each suspendedAwaitingChildren entry as failed:'parent_suspended'; the resume branch's applyResultsAndReload surfaces them via childrenToRespawn; the workflow body re-dispatches via workflowBinding.create({ id: 'agent__<type>__<id>__respawn-<attempt>' }), polls each child's durable state until terminal, then drains via recordSubSessionResult and a final clear step that resets the parent's suspension discriminators when fully resolved.

Coverage: subagent-respawn-on-resume.integ.test.ts (3 D1+Miniflare tests). 220/220 runtime-cloudflare integ + 1048/1048 unit pass.


CF DO + CFW Workflows harness setup helpers (O5 — partial)

Status: FU-A2-38 + FU-A2-39 ✅ landed (see "Done items" in docs/dev/follow-ups.md); the residual O5 work remains open (workerd-context CF DO + CFW Workflows setup helpers proper, marked via the synthetic IMPL_PENDING_O5 env requirement on cf-do-d1 / cfw-workflows-d1 descriptors).

What landed (round-3 closure):

  • FU-A2-39: harness/backend-descriptor.ts now uses a lazyImport<T>(path: string) pattern that defeats the bundler's static analysis. Each backend exposes setupLoader (lazy) + setup (resolved accessor). getViableBackends switched the workerd-context probe from the broken D1Database global (not exposed by @cloudflare/vitest-pool-workers) to WebSocketPair (verified empirically) and now hard-filters cross-context backends. lifecycle-hooks-parity.cf.test.ts re-enabled via a direct import './lifecycle-hooks-parity.integ.test.js'.
  • FU-A2-38: wired persistentAgentTestLLM + PersistentAgentTestServer + interruptTestLLM + InterruptTestServer in packages/e2e/src/test-worker.ts, plus matching DO bindings in packages/e2e/wrangler.cloudflare.toml. Both cloudflare-do-persistent-agents.cf.test.ts (G1) and cloudflare-do-interrupt-protocol.cf.test.ts (G2) now run end-to-end against real workerd DOs.

Residual O5 — what's left for a future session:

  1. CF backend setup helpers themselves: replace the deferred throws in harness/setup-helpers/cf-do-d1.ts / cfw-workflows-d1.ts with real workerd-context implementations that build the DO runtime + D1 binding via env.* globals. The cf-do-d1 and cfw-workflows-d1 descriptors carry an IMPL_PENDING_O5 env requirement so they're filtered out of the parity matrices today; once the helpers land, drop the marker.
  2. Any new .cf.test.ts companion that wants the C2 import-companion pattern can do so today via the FU-A2-39 plumbing — the residual O5 work is ONLY needed for parity-test coverage.

Estimated remaining effort: ~1-2 days for the helpers themselves. Priority: medium — the dedicated .cf.test.ts files cover the production-supported CF DO + CFW Workflows runtimes today; O5 closure upgrades them to parity-matrix participants.


Redis customState pipeline → Lua atomic script — ✅ landed

Closed by: commit 7a2432d7a, tracked as FU-A2-41 in docs/dev/follow-ups.md. See the "Done items" section there for the full closure write-up. The SAVE_STATE_ATOMIC_SCRIPT now performs the CAS version check, main hash field write, TTL application, and the complete customState replacement (scalars + arrays + array-key index + per-key TTLs) inside a single EVAL call. The pre-fix non-atomic pipeline + orphan-recovery loop are both gone — partial state is now structurally impossible (any crash before script completion leaves the prior state untouched).
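The CAS contract the script enforces can be modeled in a few lines of plain TypeScript — a pure simulation of the version-check semantics, not the actual Lua or Redis API:

```typescript
// Pure simulation of the script's compare-and-set contract: a save applies
// only if the caller's expected version matches the stored one; otherwise
// the prior state is left untouched (no partial writes are possible).
interface Stored {
  version: number;
  customState: Record<string, unknown>;
}

function casSave(
  stored: Stored,
  expectedVersion: number,
  next: Record<string, unknown>
): boolean {
  if (stored.version !== expectedVersion) return false; // stale — reject whole save
  stored.customState = next; // full replacement, all-or-nothing
  stored.version += 1;
  return true;
}

const stored: Stored = { version: 3, customState: { a: 1 } };
const ok = casSave(stored, 3, { a: 2 }); // matches -> applied
const stale = casSave(stored, 3, { a: 99 }); // version is now 4 -> rejected
```

In the real script this entire decision runs inside one EVAL, which is what makes the "crash leaves prior state untouched" property structural rather than best-effort.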

Coverage: packages/store-redis/src/__tests__/integration/save-state-atomic.integ.test.ts covers the happy path, replacement semantics (absent-key drop), scalar ↔ array type transitions, CAS rejection on stale version, 10-way parallel-save serialization, no orphaned list keys after a sequence of array/scalar/delete transitions, and empty-customState clears.


Pre-existing Temporal integration test bisect (FU-A2-42)

Why future: Diagnostic + fix; needs git bisect across the v7 stateless-suspension commit train. Tracked in docs/dev/follow-ups.md as FU-A2-42.

Current state: packages/runtime-temporal/src/__tests__/integration/temporal.integ.test.ts has 12 failing tests when run against a live Temporal server. Verified failing on commit a7b325f67 (the commit BEFORE round-2 work started), so they predate the round-2 fixes. The simplest failure (should use provided initial state) shows that initialState: { notes: ['Pre-existing 1'] } is being lost between runner.executeWorkflow and the workflow's customState — empty array reaches the state store.

Scope for the future session:

  1. Run git bisect between the last known-good commit (the v6→v7 transition point) and a7b325f67 to identify the breaking commit.
  2. Verify initialState reaches executeWorkflow and the workflow's Immer baseline.
  3. Likely a small fix once root cause is known.

Estimated effort: 0.5–1 day. Dependencies: None. Priority: medium — these tests cover real consumer-facing surfaces (initial state, conversation continuation, branching).


P2 polish backlog (round-2 review remainders) ✅ closed

Status: closed (P2 polish bundled sweep).

All six items landed in one session — interface contracts updated, parity tests added, and (per the user's standing "no module-level state" rule) module-level mutable state across packages audited and either encapsulated or explicitly documented as intentional.

P2 backlog items:

  1. failStream idempotency guard — added to memory store, Cloudflare D1 StreamDurableObject, and DOStreamManager. RedisStreamManager already had it via its CAS_TO_TERMINAL_SCRIPT. The contract is now documented at the interface level (packages/core/src/store/stream-manager.ts): endStream/failStream are idempotent — the first terminal writer wins. Cross-store parity pinned by packages/e2e/src/__tests__/stream-terminal-state-parity.integ.test.ts.
  2. getViableBackends skip-reason visibility — the harness now prints (a) a one-line summary when ALL backends got filtered out (the "0 tests ran with no signal why" pain point), and (b) a per-call structured breakdown when HELIX_TEST_VERBOSE_SKIP=1 is set.
  3. Debug console.log cleanup in non-canonical examples — audited every example app's source; the only "debug leftover" was examples/research-assistant-cloudflare-do/src/tools/web-search.ts (a tool-body console.log). Removed. The other console.log usages in examples are legitimate CLI-banner / demo-script output and the my-agent-server.ts lifecycle-hook example is a template illustrating where users would add their own logging.
  4. chunkParseCache per-instance LRU — closed earlier as part of the Redis round-3 closures (commit 8a411c26a); now a per-instance ChunkParseCache class with proper LRU eviction.
  5. Memory-store latestSequence parity test — the new stream-terminal-state-parity.integ.test.ts asserts uniformly across memory + Redis that getStreamInfo().latestSequence stays the monotonic counter after cleanupToStep shrinks the chunks table.
  6. prepareHelixReconnectRequest getter-form docs — JSDoc cross-references the resumeFromSequence / existingMessageId function-form contract (mirroring the prepareHelixChatRequest docs) so consumers know the getter pattern works on the reconnect path too.
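The first-terminal-writer-wins contract from item 1 reduces to a small guard; a sketch with illustrative types (the real stores implement this against their own storage primitives — D1, DO storage, or the Redis CAS script):

```typescript
// Illustrative first-terminal-writer-wins guard for a stream record.
type StreamStatus = "active" | "completed" | "failed";

interface StreamRecord {
  status: StreamStatus;
  error?: string;
}

// Both endStream and failStream funnel through this: the first terminal
// transition sticks; later calls are no-ops rather than errors (idempotent).
function transitionToTerminal(
  rec: StreamRecord,
  status: "completed" | "failed",
  error?: string
): boolean {
  if (rec.status !== "active") return false; // already terminal — ignore
  rec.status = status;
  if (error !== undefined) rec.error = error;
  return true;
}

const rec: StreamRecord = { status: "active" };
const first = transitionToTerminal(rec, "failed", "boom");
const second = transitionToTerminal(rec, "completed"); // loses the race
```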

Bonus — module-level mutable state audit (per the user's "no module-level stuff" rule, applied to ALL of packages/, not just the P2 backlog scope):

  • packages/core/src/tracing/tracing-hooks.ts — tracingStateMap and cleanupCounter were module-level singletons shared across every createTracingHooks invocation in the process; rewritten as a TracingStateRegistry class instantiated per invocation. The standalone getTraceContextFromHook / injectTraceContext exports (which read the singleton) now throw with a clear migration message pointing at @helix-agents/tracing-langfuse.
  • packages/runtime-dbos/src/steps/execute-companion-tool.ts — the let _companionDeps: BindCompanionToolDeps | null plain module-level singleton converted to a CompanionToolStep class with a static deps field, mirroring the ExecuteToolStep / ExecuteSubAgentStep patterns elsewhere in the package. (True per-instance scoping isn't possible without DBOS structural changes — workflow bodies are registered globally and run outside any executor instance context — but the class form gives the bind/get pair a named container so the lifecycle is greppable.)
  • packages/runtime-cloudflare/src/client-tool-workflow-helper.ts — rootOwnershipLocks documented inline as intentional isolate-scoped serialization. Within a CFW isolate every concurrent workflow shares the lock-space for the same rootId, which is exactly the serialization required. Moving to per-instance scoping would break the cross-workflow serialization.
  • packages/runtime-cloudflare/src/workspaces/sandbox/code.ts — warnedLanguages documented inline as intentional cross-process log-spam suppression. The Set is bounded by the count of distinct non-pinned languages (currently 0).

Round-3 remaining items (deferred)

Round-3 surfaced ~50 findings across 8 review angles (security, errors, concurrency, perf, type safety, migration, observability, build/release). Batches H–N landed the P0s + high-impact P1s. The items below are real but each is bounded enough to defer.

Redis round-3 closures landed (4 commits on omnara/stateless-suspension-redesign)

The Redis-side P3.R3-CONC, P3.R3-PERF, P3.R3-OBS, P3.R3-MISC, and P2-polish items shipped in the four commits below (7a2432d7a through 296ad28e7):

  • 7a2432d7a — FU-A2-41 atomic saveState Lua (no more split customState pipeline; orphan-recovery loop deleted)
  • 11723f815 — P3.R3-CONC + P3.R3-MISC: patchMetadata, updateStatus, setInterruptFlag atomic Lua + compareAndSetStatus const hoist
  • 8a411c26a — Redis-stream hardening: initStream atomic, scan-based getStreamCount, NOT_ACTIVE log, Logger option, maxChunks=0 startup warning, per-instance LRU chunkParseCache
  • 296ad28e7 — deleteSession TOCTOU (enumerate all status indexes), listRuns per-status secondary sorted-set index, cleanupOrphanedStagingData pipelined + structured summary log

Verification: 154 unit + 484 integ tests in store-redis (incl. 4 new test files dedicated to the Redis closures with 34 new tests covering atomicity, CAS, concurrency, type transitions, structural-orphan absence, and observability). Cross-runtime e2e suite: 1390 passed | 61 skipped | 1 failed (C-3 temporal-memory pre-existing timing flake; unrelated). Full npm run test:integration matrix: only 12 FU-A2-42 runtime-temporal failures (pre-existing) — every Redis-touching package green.

The remaining bullet items below are Postgres / D1 / non-Redis or not yet closed.

P3.R3-CONC: Concurrency CAS Lua follow-ups

Status: ✅ mostly closed — 5 of 6 items landed across commits 11723f815, 8a411c26a, 296ad28e7 (also tracked as FU-A2-44 in follow-ups.md). Only the RedisLockManager fencing-token bit remains. Surfaced by: round-3 review #3 P1.C1 / P1.C3 + P2 cluster.

Closed sub-items:

  1. patchMetadata — converted to atomic PATCH_METADATA_SCRIPT (read + merge + write in one EVAL). Two concurrent patches with disjoint keys can no longer race; per-key last-write-wins is preserved.
  2. updateStatus — converted to UPDATE_STATUS_ATOMIC_SCRIPT (status field write + interrupt-context handling + secondary-index ZREM/ZADD + TTL refresh, all in one EVAL). Concurrent transitions can no longer leave a session indexed in two status:* ZSETs.
  3. initStream — converted to atomic INIT_STREAM_SCRIPT (HSETNX status + sequence/createdAt + HSET updatedAt + EXPIRE meta/chunks in one EVAL). Closed the window where a crash between HSETNX status and EXPIRE left the stream key without a TTL.
  4. deleteSession TOCTOU — cleanup ZREM list now enumerates ALL valid SessionStatus values (not just the one read back from HGETALL), so a concurrent updateStatus between the HGETALL and the index Lua can't leave the sessionId orphaned in a status index we didn't touch.
  5. setInterruptFlag — converted to SET_INTERRUPT_FLAG_SCRIPT (HSET reason + timestamp + EXPIRE in one EVAL). Closed the narrow window where a process crash between the HSET and the EXPIRE left the flag without a TTL. Cross-store contract preserved: last-writer-wins on reason.

Still open:

  • RedisLockManager's fencing-token INCR is not atomic with the lock acquisition. Redlock's mutex still bounds the race per getNextFencingToken call, so this is low priority — not blocking production. Tracked for completeness; the fix would consolidate the fencing-token bump into the same EVAL as the lock acquisition.

Effort for remaining: ~half-day. Priority: low — narrow race, bounded by Redlock's mutex.

P3.R3-PERF: Performance follow-ups

Status: ✅ partial close — P3.R3-PERF.1 (Redis listRuns status index) + P3.R3-PERF.2 (D1 promoteStaging single targeted read) + P3.R3-PERF.3 (Postgres concurrentStatements migration phase) + P3.R3-PERF.4 (Postgres + D1 message_count denormalization) + P3.R3-PERF.5 (Redis cleanupOrphanedStagingData pipelining) + P3.R3-PERF.6 (Redis maxChunks=0 warning) all landed. Remaining items are marked individually below. Surfaced by: round-3 review #4 P1.P1 / P1.P2 / P1.P3 + P2.

  1. listRuns with status filter — ✅ done (Redis: 296ad28e7; Postgres: V8 composite (session_id, status, turn) index in commit c596e67ab).
  2. D1 promoteStaging — ✅ done (commit 388691974).
  3. Postgres v7 migration CREATE INDEX CONCURRENTLY — ✅ done (commit 1d2bb369c). Three-phase migration runner now supports concurrentStatements that run outside any transaction; V7's partial expiresAt index was retagged to use the new phase. Crash safety via INVALID-index cleanup before retry.
  4. Postgres + D1 listSessions correlated COUNT(*) — ✅ done (commit 054194def). V9 (Postgres) and V10 (D1) add a denormalized message_count INTEGER NOT NULL DEFAULT 0 column to __agents_states. listSessions reads it directly; appendMessages / saveStateAndPromoteStaging / cloneSession maintain it in lockstep with the messages table. Cursor pagination is NOT yet shipped — additive follow-up if the OFFSET path remains a hot spot.
  5. cleanupOrphanedStagingData — ✅ done. Redis pipelining landed in 296ad28e7; Postgres + D1 cross-session sweep landed in commit cab06588e (mirror Redis's API + structured summary log).
  6. Redis getChunksFromSequence / getChunksFromStep / startup warning for maxChunks: 0 — ✅ done in commit 8a411c26a.

Remaining open under P3.R3-PERF:

  • Postgres listSessions cursor pagination (the second half of P3.R3-PERF.4): the denormalization commit landed the per-row perf win, but OFFSET-style pagination remains O(skipped rows) on deep pages. Adding a cursor?: { updatedAt: number; sessionId: string } option to ListSessionsOptions (additive, OFFSET path stays for back-compat) would let consumers paginate via WHERE (updated_at, session_id) < ($cursor_updated, $cursor_sid) for O(log N + page) deep-page reads. Same shape applies to D1.
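The proposed cursor amounts to a composite keyset comparison; an in-memory sketch of the semantics (row and cursor shapes as assumed above — the real change would be the SQL WHERE clause, with rows ordered by (updated_at, session_id) descending):

```typescript
// In-memory model of keyset pagination over (updatedAt DESC, sessionId DESC).
// Hypothetical row/cursor shapes matching the option described above.
interface Row {
  updatedAt: number;
  sessionId: string;
}
type Cursor = Row;

function listPage(rows: Row[], limit: number, cursor?: Cursor): Row[] {
  const sorted = [...rows].sort(
    (a, b) => b.updatedAt - a.updatedAt || b.sessionId.localeCompare(a.sessionId)
  );
  // Composite "less-than" — the same predicate the SQL row comparison
  // WHERE (updated_at, session_id) < ($cursor_updated, $cursor_sid) expresses.
  const after = cursor
    ? sorted.filter(
        (r) =>
          r.updatedAt < cursor.updatedAt ||
          (r.updatedAt === cursor.updatedAt && r.sessionId < cursor.sessionId)
      )
    : sorted;
  return after.slice(0, limit);
}

const rows: Row[] = [
  { updatedAt: 30, sessionId: "c" },
  { updatedAt: 20, sessionId: "b" },
  { updatedAt: 10, sessionId: "a" },
];
const page1 = listPage(rows, 2);
const page2 = listPage(rows, 2, page1[page1.length - 1]); // resume after last row
```

Unlike OFFSET, the predicate never re-scans skipped rows, which is where the O(log N + page) deep-page behavior comes from.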

Effort for remaining: ~1-2 days for the cursor pagination addition (Postgres + D1 + cross-runtime parity test). Priority: low — the denormalization closes the dominant cost. Cursor pagination is only a measurable win on dashboards that paginate beyond ~10k rows, which is rare in HITL deployments.

P3.R3-OBS: Observability follow-ups ✅ closed

Status: closed (P3.R3-OBS sweep).

All seven sub-items landed across the sweep. Highlights:

  1. RedisStateStore.cleanupOrphanedStagingData Logger wired through constructor options (Round-3 closures, commit 296ad28e7).
  2. safeInvokeHook logger param tightened to required Logger. The logger ?? console fallback is gone — callers without a structured logger pass noopLogger. Test asserts no console writes happen on the noop path.
  3. parentSpanSource plumbed into Langfuse span metadata via registerResumedRun({ metadata: { 'helix.tracing.parentSpanSource' } }). Operators can now filter resumed runs by recovery path ('checkpoint' / 'session-state' / 'none') on their dashboards. Test pinned at suspend-resume-hooks.test.ts.
  4. Redis stream writer NOT_ACTIVE throw logs (commit 8a411c26a).
  5. interruptAgent 504 path emits LogEvent.agentServer.interruptDeadlineExceeded warn with { sessionId, deadlineMs, reason } before throwing. abortAgent has parity.
  6. skipped_record_tokens severity now depends on cause: debug when the LLM step errored (expected downstream effect of the already-logged upstream failure), warn when there's no error cause (adapter bug worth flagging).
  7. LogEvent canonical vocabulary added at packages/core/src/logger/events.ts. Migrated call sites: safeInvokeHook, agentServer.{authenticate,interrupt,abort,submitSchemaLimits}, usage.hook.{skipped,record_tokens,record_tool,record_subagent}, langfuse.lifecycle_hook_failed. Test enforces the <domain>.<subject>.<action> snake_case convention.
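The naming convention from item 7 can be pinned with a single regex; a sketch only — the in-repo test may accept additional shapes beyond this strict three-segment form:

```typescript
// Three dot-separated snake_case segments: <domain>.<subject>.<action>.
// Assumed strict form for illustration; the real vocabulary test lives at
// packages/core/src/logger/events.ts.
const LOG_EVENT_PATTERN = /^[a-z][a-z0-9_]*\.[a-z][a-z0-9_]*\.[a-z][a-z0-9_]*$/;

function isValidLogEvent(name: string): boolean {
  return LOG_EVENT_PATTERN.test(name);
}
```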

Additional console-fallback removals folded into this sweep (consistent with the user directive "we should always be using the logger"):

  • agent-server.ts allowUnauthenticated console.warn fallback — removed; warning routes through configured logger only.
  • ai-sdk/src/react/index.ts useAutoResync / useResumableChat console.error fallbacks — removed; consumers wire onError or read resyncError from the hook return.
  • core/src/orchestration/wait-for-status-transition.ts dev-only console.warn — removed.
  • core/src/workspace/types/metrics.ts consoleMetrics constant replaced with createLoggerMetrics(logger: Logger) factory.

The only remaining production-code console.* calls live in core/src/logger/console.ts (the legitimate consoleLogger implementation operators opt into) and core/src/logger/default.ts.

P3.R3-TYPE: Type-safety polish

Status: ✅ mostly closed — items 1-5 addressed by FU-TYPE-SAFETY-2026-05 (see follow-ups.md "Done items"). Only the zod/v3 ↔ zod/v4 schema-drift migration remains. Surfaced by: round-3 review #5 P2 cluster.

Closed via FU-TYPE-SAFETY-2026-05:

  1. Tool<any, any>[] and AgentConfig<any, any> epidemic — Stage A introduced AnyTool / AnyAgentConfig aliases as Tool<z.ZodType, z.ZodType> / AgentConfig<z.ZodType, z.ZodType> and propagated across 40+ files.
  2. AgentHooks<any, any> — tightened to AgentHooks<unknown, unknown> alongside the HookManager.invoke<K extends keyof AgentHooks> tightening that removed 6+ triple-casts.
  3. AgentConfig<any, any> in PersistentAgentConfig.agent — covered by the same Stage A bulk replace.
  4. getAllToolInvocations / getToolParts return types — Stage A switched to AISDKToolPart[] using the local isToolPart guard (which now validates state + toolCallId, not just the discriminator).
  5. ToolContext.getState<T>() — documented as a caller-driven API contract rather than refactored. JSDoc on core/src/types/tool.ts explains the contract end-to-end; the recommended pattern is Schema.parse(ctx.getState()) for safety-critical tools. Threading TState through Tool generics would be a deeper refactor that diverges from the current input/output-only generic shape; the documented contract is the chosen trade-off.

Still open:

  1. zod vs zod/v4 schema drift between ErrorDetailSchema (zod) and StreamFailEventSchema's inline copy (zod/v4). Migrate error-detail.ts to zod/v4 so both surfaces agree on a single schema source.

Effort for remaining: ~half-day. Priority: low — drift is detectable by tests; no runtime breakage today.

P3.R3-SEC: Security polish

Status: open. Surfaced by: round-3 review #1 P2 cluster.

  1. Body-size caps on /chat, /start, /resume agent-server routes (only /submit-tool-result has one today).
  2. SUBAGENT_TOOL_PREFIX collision check in buildEffectiveTools (mirrors the existing workspace__ / companion__ checks).
  3. sessionId length cap + character allowlist at the HTTP boundary before Redis key interpolation (log-injection vector via ANSI escapes in long sessionIds).
  4. Demo auth.ts — strengthen the comment and add an explicit "this checks ONLY presence" note to the function body.
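Item 3's boundary check is a few lines; a sketch with assumed limits (the actual cap and allowlist would be decided when the work is scoped):

```typescript
// Assumed limits for illustration: 128 chars, URL-safe charset. Rejecting
// ANSI escapes and other control bytes up front closes the log-injection
// vector before the sessionId reaches Redis key interpolation.
const SESSION_ID_MAX_LENGTH = 128;
const SESSION_ID_PATTERN = /^[A-Za-z0-9_-]+$/;

function isValidSessionId(id: string): boolean {
  return (
    id.length > 0 &&
    id.length <= SESSION_ID_MAX_LENGTH &&
    SESSION_ID_PATTERN.test(id)
  );
}
```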

Effort: ~half-day total. Priority: medium — bounded attack surface, but body-size DoS is real on auth-disabled deployments.

P3.R3-MISC: Round-3 P3 polish

Status: open

  • expiredSessionCleanup uses logger.info?. on REQUIRED interface methods (info/warn/error are non-optional in the Logger interface; ?. hides type errors).
  • cleanupOrphanedStagingData silently deletes parse-failed keys; log the count separately from genuine orphans.
  • getStreamCount uses blocking KEYS — switch to SCAN (or add an @internal warning).
  • compareAndSetStatus Lua script string-built per call → hoist to module-level const.

Effort: ~half-day. Priority: low.


Back-compat removal pass — deferred deletions

Per round-3 inverted-review pair: 4 high-confidence deletions landed in commit 676d1b339. The following are real back-compat affordances we don't want, but each deletion is gated by a test or audit that needs sub-project scope:

P3.R3-BC-FALLBACK: defaultSaveStateAndPromoteStaging ✅ closed

Status: closed (P3.R3 back-compat sweep).

defaultSaveStateAndPromoteStaging was removed from packages/core/src/store/state-store.ts and all docs were updated to require atomic saveStateAndPromoteStaging from custom stores. The non-atomic sequential fallback opens a crash window between appendMessages → saveState → promoteStaging that defeats the purpose of the atomic primitive — the "STATE CORRUPTION RISK" doc note added in round-2 was the signal that the right answer was deletion, not "kept-and-warned." All in-tree stores (memory, redis, postgres, D1, DO) already shipped atomic implementations.

Docs updated:

  • docs/guide/state-stores.md — removed the fallback example, rephrased "two paths" as a single required atomic implementation.
  • docs/internals/session-model.md — removed "Fallback for third-party stores" subsection.
  • docs/upgrade-guides/v6-to-v7-stateless-suspension.md — clarified that the helper was removed and atomic is mandatory.

P3.R3-BC-LUA-FALLBACK: allowSequentialFallback in Redis ✅ closed

Status: closed (commit 4a47638fe).

RedisStateStoreOptions.allowSequentialFallback option deleted, private promoteStagingSequential method deleted, conditional gone. The promoteStaging catch block now unconditionally rethrows the original Lua-EVAL error — there's no quiet non-atomic fallback that could silently corrupt state on a misconfigured production deployment.

Four unit-test files that previously used ioredis-mock (which doesn't support EVAL, hence why the fallback existed at all) were migrated to integration tests against real Redis under packages/store-redis/src/__tests__/integration/. This matches the project's existing integ-suite pattern and removes the last allowSequentialFallback consumer.

The in-code comment at redis-state.ts:2919-2927 documents the removal rationale (the catch-block "STATE CORRUPTION RISK" doc note added in round-2 was the signal that the right answer was deletion, not "kept-and-warned").

P3.R3-BC-CONVERTER: isToolResultError content-shape fallback ✅ closed

Status: closed (P3.R3 back-compat sweep).

The heuristic content-inspection fallback was deleted in packages/ai-sdk/src/converter/helix-to-aisdk-converter.ts. isToolResultError now reads ONLY the explicit metadata[COMMON_METADATA_KEYS.TOOL_FAILED] flag. Absent the flag, the tool is treated as successful — the safer default for partial-success outputs like { error: '', data: [...] }. Two regression tests at helix-to-aisdk-converter.test.ts (lines marked P3.R3-BC-CONVERTER closure) lock the new behavior.
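The post-closure check reduces to a single metadata read; an illustrative sketch ('helix.toolFailed' stands in for the real COMMON_METADATA_KEYS.TOOL_FAILED value, and the types are simplified):

```typescript
// Illustrative shape; the real converter types live in packages/ai-sdk.
// 'helix.toolFailed' is a stand-in for COMMON_METADATA_KEYS.TOOL_FAILED.
const TOOL_FAILED_KEY = "helix.toolFailed";

interface ToolResultLike {
  metadata?: Record<string, unknown>;
}

// Explicit flag only — no content sniffing. Absent the flag, the result is
// treated as success, which is the safe default for partial-success payloads
// like { error: '', data: [...] }.
function isToolResultError(result: ToolResultLike): boolean {
  return result.metadata?.[TOOL_FAILED_KEY] === true;
}
```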

P3.R3-BC-FRONTENDHANDLER: FrontendHandler redundancy with handleChatStream ✅ closed

Status: closed in v8. FrontendHandler, createFrontendHandler, and createCloudflareFrontendHandler removed. The replacement surface (handleChatStream, buildSnapshot, getUIMessages, createCloudflareChatHandler) shipped in v7 and is the only public path going forward. See docs/upgrade-guides/v7-to-v8.md for the migration walkthrough and the three observable behavior gaps (missing-stream HTTP 200 vs 204, no ValidationError class on bad-request rejection, derived generateMessageId for multi-turn de-dup).

Total deleted: 9730 LOC across 9 files (the 1425-LOC handler-factory.ts, its 77-case unit test, six FrontendHandler-only integ tests, and the Cloudflare convenience factory). The FrontendHandlerError base class survives (still used by route handlers + the express adapter's catch blocks); FrontendResponse survives (still used by buildSSEResponse + express adapters).

P3.R3-BC-MISC: Smaller back-compat removals ✅ closed

Status: closed (P3.R3 back-compat sweep).

  • buildAgentInput bare-string fallback — ✅ done. The bare-string return path at handle-chat-stream.ts was removed; the function now always returns the structured AgentInputObject form ({ message: [userMsg] }). Closure documented inline at handle-chat-stream.ts:821-830.
  • ReplayContent legacy ordering branch — ✅ done. The ~65-LOC duplicate emit path that flattened text / reasoning / toolCalls into a hardcoded order was deleted. The remaining ~25-line synthesizeOrderedItemsFromFlatFields helper is called ONLY at the input boundary as a convenience normalization for callers that didn't build an orderedItems array — the emit loop has a single code path against orderedItems regardless of input shape. Closure documented inline at replay-events.ts:175-194.
  • D1 migration chain collapse — ✅ done (commit b23000542). runMigration() now detects fresh databases and applies a single collapsed schema instead of walking V1..V10 incrementally — saves ~100-500ms on every fresh-DB worker boot. Schema parity vs the incremental path is pinned by an integ test that uses PRAGMA table_info / index_list to compare both paths structurally.

(Other future work goes here as it's identified.)
