Vercel AI SDK Adapter

The Vercel AI SDK adapter (@helix-agents/llm-vercel) connects Helix Agents to any LLM provider supported by the Vercel AI SDK. This is the recommended adapter for most applications.

When to Use

Good fit:

  • Production applications
  • Multiple provider support needed
  • Streaming responses required
  • Using OpenAI, Anthropic, Google, or other major providers

Not ideal for:

  • Unit testing (use MockLLMAdapter instead)
  • Custom/private LLM APIs not in Vercel AI SDK

Installation

bash
npm install @helix-agents/llm-vercel ai

Also install provider packages for your chosen models:

bash
# OpenAI
npm install @ai-sdk/openai

# Anthropic
npm install @ai-sdk/anthropic

# Google
npm install @ai-sdk/google

Basic Usage

typescript
import { defineAgent } from '@helix-agents/core';
import { VercelAIAdapter } from '@helix-agents/llm-vercel';
import { JSAgentExecutor } from '@helix-agents/runtime-js';
import { InMemoryStateStore, InMemoryStreamManager } from '@helix-agents/store-memory';
import { openai } from '@ai-sdk/openai';

// Create adapter
const adapter = new VercelAIAdapter();

// Create executor
const executor = new JSAgentExecutor(
  new InMemoryStateStore(),
  new InMemoryStreamManager(),
  adapter
);

// Define agent with Vercel AI SDK model
const agent = defineAgent({
  name: 'assistant',
  systemPrompt: 'You are a helpful assistant.',
  llmConfig: {
    model: openai('gpt-4o'),
    temperature: 0.7,
  },
});
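
With the pieces wired up, execution follows the same pattern as any other adapter. A minimal sketch (the full handle API is shown under Complete Example below):

typescript
const handle = await executor.execute(agent, 'Hello!');
const result = await handle.result();
console.log(result.output);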

Supported Providers

The Vercel AI SDK supports many providers:

Provider | Package | Example Model
OpenAI | @ai-sdk/openai | openai('gpt-4o')
Anthropic | @ai-sdk/anthropic | anthropic('claude-sonnet-4-20250514')
Google | @ai-sdk/google | google('gemini-1.5-pro')
Cohere | @ai-sdk/cohere | cohere('command-r-plus')
Mistral | @ai-sdk/mistral | mistral('mistral-large-latest')
Amazon Bedrock | @ai-sdk/amazon-bedrock | Various models
Azure OpenAI | @ai-sdk/azure | Azure-hosted models

See the Vercel AI SDK documentation for the full list.
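
Switching providers is a one-line change to llmConfig. A sketch assuming @ai-sdk/anthropic is installed:

typescript
import { anthropic } from '@ai-sdk/anthropic';

// Same agent shape as the Basic Usage example, different provider
const agent = defineAgent({
  name: 'assistant',
  systemPrompt: 'You are a helpful assistant.',
  llmConfig: {
    model: anthropic('claude-sonnet-4-20250514'),
  },
});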

Configuration

Model Configuration

typescript
const agent = defineAgent({
  name: 'my-agent',
  systemPrompt: 'You are a helpful assistant.',
  llmConfig: {
    // Required: The model to use
    model: openai('gpt-4o'),

    // Generation parameters
    temperature: 0.7, // 0-2, higher = more creative
    maxOutputTokens: 4096, // Maximum tokens to generate
    topP: 0.95, // Nucleus sampling
    topK: 40, // Top-k sampling

    // Penalties
    presencePenalty: 0, // Reduce repetition of topics
    frequencyPenalty: 0, // Reduce repetition of tokens

    // Control
    stopSequences: ['END'], // Stop generation at these sequences
    seed: 12345, // For deterministic outputs

    // Reliability
    maxRetries: 3, // Retry on transient failures

    // Prompt caching (automatic provider-specific optimization)
    caching: 'auto',
  },
});

Provider-Specific Options

Important: Reasoning features require AI SDK provider packages v3+:

  • @ai-sdk/openai@^3.0.0
  • @ai-sdk/anthropic@^3.0.0

Earlier v2.x versions use specificationVersion: "v2", which triggers compatibility mode in AI SDK v6 and strips reasoning features.

Enable features specific to certain providers:

typescript
// OpenAI o-series reasoning
const agent = defineAgent({
  name: 'reasoning-agent',
  systemPrompt: 'Solve complex problems step by step.',
  llmConfig: {
    model: openai('o1'),
    providerOptions: {
      openai: {
        reasoningSummary: 'detailed',
        reasoningEffort: 'high',
      },
    },
  },
});

// Anthropic extended thinking
const agent = defineAgent({
  name: 'thinking-agent',
  systemPrompt: 'Think through problems carefully.',
  llmConfig: {
    model: anthropic('claude-sonnet-4-20250514'),
    providerOptions: {
      anthropic: {
        thinking: {
          type: 'enabled',
          budgetTokens: 10000,
        },
      },
    },
  },
});

Dynamic Configuration

Override LLM config based on agent state:

typescript
const agent = defineAgent({
  name: 'adaptive-agent',
  stateSchema: z.object({
    complexity: z.enum(['simple', 'complex']),
    stepCount: z.number(),
  }),
  llmConfig: {
    model: openai('gpt-4o-mini'),
    temperature: 0.5,
  },
  llmConfigOverride: (customState, stepCount) => {
    // Use more powerful model for complex tasks
    if (customState.complexity === 'complex') {
      return {
        model: openai('gpt-4o'),
        temperature: 0.2,
        maxOutputTokens: 8192,
      };
    }

    // Increase temperature over time for variety
    if (stepCount > 5) {
      return { temperature: 0.8 };
    }

    return {};
  },
});

Prompt Caching

Prompt caching reduces cost and latency by reusing cached prompt prefixes across LLM calls. Set caching: 'auto' in your agent's llmConfig to enable automatic caching:

typescript
const agent = defineAgent({
  name: 'cached-agent',
  systemPrompt: 'You are a helpful assistant with detailed instructions...',
  llmConfig: {
    model: anthropic('claude-sonnet-4-20250514'),
    caching: 'auto',
  },
});

The framework automatically detects the provider and applies the appropriate caching strategy. No provider-specific code is needed in your agent definition.
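
For example, the same flag works unchanged when the agent is moved to an OpenAI model; the framework switches from Anthropic cache markers to an OpenAI promptCacheKey on its own (see How It Works below):

typescript
const agent = defineAgent({
  name: 'cached-agent',
  systemPrompt: 'You are a helpful assistant with detailed instructions...',
  llmConfig: {
    model: openai('gpt-4o'),
    caching: 'auto', // same setting, provider-appropriate strategy
  },
});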

How It Works

When caching: 'auto' is set, the framework calls applyCacheBreakpoints() before each LLM call. This pure function inspects the model's provider metadata and applies provider-specific optimizations:

Anthropic (Claude) - Places cache_control: { type: 'ephemeral' } markers on:

  1. The last system message (caches the system prompt)
  2. The last tool definition (caches the tool schema)
  3. The conversation boundary (caches older conversation history)

typescript
// What happens automatically under the hood for Anthropic:
// messages[0].providerOptions = { anthropic: { cacheControl: { type: 'ephemeral' } } }
// tools[lastIndex].providerOptions = { anthropic: { cacheControl: { type: 'ephemeral' } } }

OpenAI (GPT-4o, o1, o3, etc.) - Sets a promptCacheKey provider option derived from the session ID, enabling cache affinity for repeated conversations within the same session:

typescript
// What happens automatically for OpenAI:
// providerOptions: { openai: { promptCacheKey: sessionId } }

Google Gemini - No action needed. Gemini uses automatic prefix caching built into the API. The framework detects Google/Vertex providers and skips annotation.

xAI/Grok - Sets the x-grok-conv-id header from the session ID for conversation-level cache routing:

typescript
// What happens automatically for xAI:
// headers: { 'x-grok-conv-id': sessionId }

Cache Token Tracking

Cache hit/miss metrics flow through the standard token usage pipeline:

typescript
// In afterLLMCall hook
hooks: {
  afterLLMCall: (payload, ctx) => {
    if (payload.usage) {
      console.log(`Prompt tokens: ${payload.usage.promptTokens}`);
      console.log(`Cached tokens: ${payload.usage.cachedTokens}`);      // Cache hits
      console.log(`Cache writes: ${payload.usage.cacheWriteTokens}`);   // New cache entries
    }
  },
}

Cache tokens also appear in:

  • Stream chunks: step_end chunks include cachedTokens and cacheWriteTokens in their usage field (see the sketch below)
  • Usage tracking: The TokenCounts rollup includes cached and cacheWrite fields
  • Langfuse tracing: Mapped to cache_read_input_tokens and cache_creation_input_tokens
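
For example, reading cache metrics off the stream might look like this sketch (field names follow the step_end usage shape described above):

typescript
for await (const chunk of stream) {
  if (chunk.type === 'step_end' && chunk.usage) {
    // Cache hits vs. newly written cache entries for this step
    console.log(
      `cached: ${chunk.usage.cachedTokens ?? 0}, writes: ${chunk.usage.cacheWriteTokens ?? 0}`
    );
  }
}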

Custom Cache Control

For advanced use cases, you can set providerOptions directly on messages, content parts, and tools instead of using caching: 'auto':

typescript
// Manual Anthropic cache control on a specific message
const messages: Message[] = [
  {
    role: 'system',
    content: 'Expensive system prompt...',
    providerOptions: {
      anthropic: { cacheControl: { type: 'ephemeral' } },
    },
  },
  { role: 'user', content: 'Hello' },
];

When using manual providerOptions, omit caching: 'auto' to avoid the framework overwriting your markers.
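
Tool definitions accept the same annotation. A sketch mirroring what caching: 'auto' applies to the last tool (searchTool stands in for any tool created with defineTool):

typescript
// Manual Anthropic cache control on a tool definition
const tools = [
  {
    ...searchTool,
    providerOptions: {
      anthropic: { cacheControl: { type: 'ephemeral' } },
    },
  },
];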

Streaming

The adapter supports real-time streaming:

typescript
// Streaming happens automatically in execute()
const handle = await executor.execute(agent, 'Research AI agents');

// Get the stream
const stream = await handle.stream();
if (stream) {
  for await (const chunk of stream) {
    switch (chunk.type) {
      case 'text_delta':
        process.stdout.write(chunk.delta);
        break;
      case 'thinking':
        console.log('[Thinking]', chunk.content);
        break;
      case 'tool_start':
        console.log(`[Tool: ${chunk.toolName}]`);
        break;
    }
  }
}

Chunk Mapping

The adapter maps Vercel AI SDK stream parts to framework chunks:

Vercel AI SDK | Framework | Notes
text-delta | text_delta | Generated text tokens
reasoning-delta | thinking | Reasoning/thinking content
tool-input-start | tool_start | Tool call begins
tool-call | tool_start | Complete tool call
tool-result | tool_end | Tool result
error | error | Generation error
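
The earlier streaming example handles text_delta, thinking, and tool_start; a sketch covering the remaining mapped types (the exact fields on tool_end and error chunks are assumptions here):

typescript
for await (const chunk of stream) {
  switch (chunk.type) {
    case 'tool_end':
      // Assumed fields: the tool's name and its result
      console.log(`[Tool finished: ${chunk.toolName}]`);
      break;
    case 'error':
      // Assumed field: the underlying error
      console.error('[Generation error]', chunk.error);
      break;
  }
}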

Thinking/Reasoning Content

Both Anthropic and OpenAI support reasoning features:

Anthropic Extended Thinking

typescript
const agent = defineAgent({
  name: 'claude-thinker',
  llmConfig: {
    model: anthropic('claude-sonnet-4-20250514'),
    providerOptions: {
      anthropic: {
        thinking: {
          type: 'enabled',
          budgetTokens: 10000, // Token budget for thinking
        },
      },
    },
  },
});

// Thinking content streams via 'thinking' chunks
for await (const chunk of stream) {
  if (chunk.type === 'thinking') {
    console.log('[Claude thinking...]', chunk.content);
  }
}

OpenAI Reasoning

typescript
const agent = defineAgent({
  name: 'o1-reasoner',
  llmConfig: {
    model: openai('o1'),
    providerOptions: {
      openai: {
        reasoningSummary: 'detailed', // or 'concise'
        reasoningEffort: 'high', // or 'medium', 'low'
      },
    },
  },
});

Message Conversion

The adapter converts framework messages to Vercel AI SDK format:

Framework → Vercel AI SDK

typescript
// Framework format
const messages: Message[] = [
  { role: 'system', content: 'You are helpful.' },
  { role: 'user', content: 'Hello' },
  {
    role: 'assistant',
    content: 'I will search for that.',
    toolCalls: [{ id: 'tc1', name: 'search', arguments: { q: 'test' } }],
  },
  {
    role: 'tool',
    toolCallId: 'tc1',
    toolName: 'search',
    content: JSON.stringify({ results: [] }),
  },
];

// Automatically converted to Vercel AI SDK ModelMessage[]

The conversion handles:

  • System, user, and assistant messages
  • Tool calls in assistant messages
  • Tool results in tool messages
  • Mixed text + tool call content

Tool Conversion

Framework tools (with Zod schemas) are converted to Vercel AI SDK tools:

typescript
// Framework tool
const searchTool = defineTool({
  name: 'search',
  description: 'Search the web',
  inputSchema: z.object({
    query: z.string(),
    limit: z.number().optional(),
  }),
  execute: async (input, ctx) => {
    // ...
  },
});

// Automatically converted to Vercel AI SDK tool format
// The Zod schema is passed directly (AI SDK 5.x supports Zod)

Error Handling

The adapter handles errors gracefully:

typescript
const adapter = new VercelAIAdapter({
  logger: console, // Optional: log warnings
});

// Errors are returned as ErrorStepResult, not thrown
const result = await adapter.generateStep(input);

if (result.type === 'error') {
  console.error('LLM error:', result.error.message);
  // Framework handles this appropriately
}

Retry Configuration

Configure retries for transient failures:

typescript
const agent = defineAgent({
  name: 'retry-agent',
  llmConfig: {
    model: openai('gpt-4o'),
    maxRetries: 5, // Retry up to 5 times on transient errors
  },
});

Logger Integration

Pass a custom logger for debug output:

typescript
import { VercelAIAdapter } from '@helix-agents/llm-vercel';

const logger = {
  debug: (msg: string) => console.debug(`[DEBUG] ${msg}`),
  info: (msg: string) => console.info(`[INFO] ${msg}`),
  warn: (msg: string) => console.warn(`[WARN] ${msg}`),
  error: (msg: string) => console.error(`[ERROR] ${msg}`),
};

const adapter = new VercelAIAdapter({ logger });

Complete Example

typescript
import { defineAgent, defineTool } from '@helix-agents/core';
import { JSAgentExecutor } from '@helix-agents/runtime-js';
import { InMemoryStateStore, InMemoryStreamManager } from '@helix-agents/store-memory';
import { VercelAIAdapter } from '@helix-agents/llm-vercel';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

// Create adapter
const adapter = new VercelAIAdapter();

// Define tool
const searchTool = defineTool({
  name: 'web_search',
  description: 'Search the web for information',
  inputSchema: z.object({
    query: z.string().describe('Search query'),
  }),
  outputSchema: z.object({
    results: z.array(z.string()),
  }),
  execute: async (input) => {
    // Simulate search
    return { results: [`Result for: ${input.query}`] };
  },
});

// Define agent
const ResearchAgent = defineAgent({
  name: 'researcher',
  description: 'Researches topics using web search',
  systemPrompt: `You are a research assistant.
Use the web_search tool to find information.
Summarize your findings clearly.`,
  tools: [searchTool],
  outputSchema: z.object({
    summary: z.string(),
    sources: z.array(z.string()),
  }),
  llmConfig: {
    model: openai('gpt-4o'),
    temperature: 0.3,
    maxOutputTokens: 2048,
  },
});

// Create executor
const executor = new JSAgentExecutor(
  new InMemoryStateStore(),
  new InMemoryStreamManager(),
  adapter
);

// Execute
async function main() {
  const handle = await executor.execute(
    ResearchAgent,
    'What are the latest developments in AI agents?'
  );

  // Stream output
  const stream = await handle.stream();
  if (stream) {
    for await (const chunk of stream) {
      if (chunk.type === 'text_delta') {
        process.stdout.write(chunk.delta);
      }
    }
  }

  // Get result
  const result = await handle.result();
  console.log('\n\nResult:', result.output);
}

main();

Limitations

Model-Specific Features

Not all features work with all models:

  • Thinking/reasoning: Only Anthropic Claude and OpenAI o-series
  • Tool calling: Most models, but check provider docs
  • JSON mode: Provider-specific implementation

Token Counting

The adapter doesn't estimate token counts before a call; usage is only reported after each LLM call via the usage pipeline. For up-front estimation, use a provider tokenizer directly.
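
A rough pre-call estimate can be computed with a standalone tokenizer. A sketch using the js-tiktoken package (an assumption, not a framework dependency; model coverage depends on the tokenizer version):

typescript
import { encodingForModel } from 'js-tiktoken';

// Rough token estimate for an OpenAI-family model
const enc = encodingForModel('gpt-4o');
const tokenCount = enc.encode('You are a helpful assistant.').length;
console.log(`~${tokenCount} prompt tokens`);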

Image/Multimodal

The framework supports file uploads (images, PDFs, etc.) via the files field in AgentInput. Files are converted to ContentPart[] alongside the text message and passed to the LLM. This works across all runtimes (JS, Temporal, Cloudflare).

typescript
await executor.execute(
  agent,
  {
    message: 'Describe this image',
    files: [
      {
        data: base64EncodedData,
        mediaType: 'image/png',
        filename: 'screenshot.png', // optional
      },
    ],
  },
  { sessionId }
);
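
Producing base64EncodedData from a local file in Node is a one-liner (standard node:fs, nothing framework-specific):

typescript
import { readFileSync } from 'node:fs';

// Base64-encode an image for the files field
const base64EncodedData = readFileSync('screenshot.png').toString('base64');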
