Custom LLM Adapters

Build your own LLM adapter to integrate with custom providers, private APIs, or specialized models. This guide covers the adapter interface and implementation patterns.

The LLMAdapter Interface

Every adapter must implement:

typescript
interface LLMAdapter {
  generateStep(input: LLMGenerateInput): Promise<StepResult<unknown>>;
}

The single method receives all context and returns a typed result.
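
A minimal skeleton looks like this (a sketch: EchoAdapter is a placeholder name, and a real adapter would call your provider inside generateStep; see the complete example below):

typescript
import type { LLMAdapter, LLMGenerateInput, StepResult } from '@helix-agents/core';

export class EchoAdapter implements LLMAdapter {
  async generateStep(input: LLMGenerateInput): Promise<StepResult<unknown>> {
    // Complete immediately with a text result that echoes the last message.
    const last = input.messages[input.messages.length - 1];
    return {
      type: 'text',
      content: `Echo: ${last?.content ?? ''}`,
      shouldStop: true,
      stopReason: 'end_turn',
    };
  }
}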

Input Structure

typescript
interface LLMGenerateInput {
  // Conversation history (includes system prompt as first message)
  messages: Message[];

  // Available tools (framework format)
  tools: Tool[];

  // LLM configuration
  config: LLMConfig;

  // Cancellation signal
  abortSignal?: AbortSignal;

  // Streaming callbacks
  callbacks?: LLMStreamCallbacks;

  // Context
  agentId: string;
  agentType: string;
}

Messages

Messages use a simple format:

typescript
type Message =
  | { role: 'system'; content: string }
  | { role: 'user'; content: string }
  | { role: 'assistant'; content: string; toolCalls?: ToolCallInfo[] }
  | { role: 'tool'; toolCallId: string; toolName: string; content: string };

Convert these to your provider's format in the adapter.
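
For example, an adapter targeting an OpenAI-style chat API might map each message like this (a sketch; exact field names depend on your provider, and the complete example below uses the same approach):

typescript
function convertMessage(msg: Message) {
  switch (msg.role) {
    case 'system':
    case 'user':
      return { role: msg.role, content: msg.content };
    case 'assistant':
      return {
        role: 'assistant',
        content: msg.content ?? '',
        tool_calls: msg.toolCalls?.map((tc) => ({
          id: tc.id,
          function: { name: tc.name, arguments: JSON.stringify(tc.arguments) },
        })),
      };
    case 'tool':
      return { role: 'tool', tool_call_id: msg.toolCallId, content: msg.content };
  }
}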

Tools

Framework tools have Zod schemas:

typescript
interface Tool {
  name: string;
  description: string;
  inputSchema: z.ZodType; // Zod schema
  outputSchema?: z.ZodType;
  execute: (input, context) => Promise<unknown>;
}

Convert to your provider's format (usually JSON Schema):

typescript
import { zodToJsonSchema } from 'zod-to-json-schema';

function convertTool(tool: Tool): ProviderTool {
  return {
    name: tool.name,
    description: tool.description,
    parameters: zodToJsonSchema(tool.inputSchema),
  };
}

Output Structure

Return one of four result types:

Text Response

typescript
return {
  type: 'text',
  content: 'The generated text',
  thinking: { content: 'Reasoning...' }, // Optional
  shouldStop: true, // true for natural completion
  stopReason: 'end_turn',
};

Tool Calls

typescript
return {
  type: 'tool_calls',
  toolCalls: [{ id: 'tc1', name: 'search', arguments: { query: 'AI' } }],
  subAgentCalls: [], // Framework handles sub-agents
  content: 'Let me search for that.', // Optional accompanying text
  thinking: undefined,
  shouldStop: false, // Never stop on tool calls
  stopReason: 'tool_use',
};

Structured Output

When the LLM calls the __finish__ tool:

typescript
return {
  type: 'structured_output',
  output: { summary: 'The result', score: 0.95 },
  thinking: undefined,
  shouldStop: true,
  stopReason: 'tool_use',
};

Error

typescript
return {
  type: 'error',
  error: new Error('Rate limit exceeded'),
  shouldStop: true,
  stopReason: 'error',
};

Stop Reason Mapping

Map your provider's finish reasons to framework stop reasons:

typescript
import type { StopReason } from '@helix-agents/core';

function mapStopReason(providerReason: string): StopReason {
  switch (providerReason) {
    case 'stop':
    case 'end':
      return 'end_turn';

    case 'tool_calls':
    case 'function_call':
      return 'tool_use';

    case 'length':
    case 'max_tokens':
      return 'max_tokens';

    case 'content_filter':
    case 'safety':
      return 'content_filter';

    default:
      return 'unknown';
  }
}

The framework handles stop reasons:

  • end_turn, stop_sequence: Agent completes successfully
  • tool_use: Continue execution (not a real stop)
  • max_tokens, content_filter, refusal, error, unknown: Agent fails

Streaming Callbacks

Invoke callbacks for real-time updates:

typescript
async generateStep(input: LLMGenerateInput): Promise<StepResult<unknown>> {
  const { callbacks } = input;

  // Stream text tokens
  for await (const token of provider.streamTokens()) {
    callbacks?.onTextDelta?.(token);
  }

  // Stream thinking content
  if (thinkingChunk) {
    callbacks?.onThinking?.(thinkingChunk, isComplete);
  }

  // Notify of tool calls
  for (const toolCall of toolCalls) {
    callbacks?.onToolCall?.(toolCall);
  }

  // Report errors
  if (error) {
    callbacks?.onError?.(error);
  }

  return result;
}
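
Based on the calls above, the callbacks object has roughly this shape (an inferred sketch; check the LLMStreamCallbacks type exported by @helix-agents/core for the authoritative definition):

typescript
interface LLMStreamCallbacks {
  onTextDelta?: (delta: string) => void;
  onThinking?: (content: string, isComplete: boolean) => void;
  onToolCall?: (toolCall: ParsedToolCall) => void;
  onError?: (error: Error) => void;
}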

Complete Example

Here's a minimal adapter for a hypothetical API:

typescript
import type {
  LLMAdapter,
  LLMGenerateInput,
  StepResult,
  ParsedToolCall,
  StopReason,
} from '@helix-agents/core';
import { FINISH_TOOL_NAME } from '@helix-agents/core';
import { zodToJsonSchema } from 'zod-to-json-schema';

export class MyProviderAdapter implements LLMAdapter {
  private apiKey: string;

  constructor(apiKey: string) {
    this.apiKey = apiKey;
  }

  async generateStep(input: LLMGenerateInput): Promise<StepResult<unknown>> {
    const { messages, tools, config, abortSignal, callbacks, agentId, agentType } = input;

    try {
      // Convert messages to provider format
      const providerMessages = this.convertMessages(messages);

      // Convert tools to provider format
      const providerTools = tools.map((t) => ({
        name: t.name,
        description: t.description,
        parameters: zodToJsonSchema(t.inputSchema),
      }));

      // Call provider API (streaming)
      const response = await fetch('https://api.myprovider.com/chat', {
        method: 'POST',
        headers: {
          Authorization: `Bearer ${this.apiKey}`,
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          messages: providerMessages,
          tools: providerTools,
          max_tokens: config.maxOutputTokens,
          temperature: config.temperature,
          stream: true,
        }),
        signal: abortSignal,
      });

      // Process streaming response
      let text = '';
      const toolCalls: ParsedToolCall[] = [];
      let finishReason: string | undefined;

      const reader = response.body?.getReader();
      if (!reader) throw new Error('No response body');

      while (true) {
        const { done, value } = await reader.read();
        if (done) break;

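        // NOTE: assumes each streamed chunk decodes to one complete JSON object
        // (fine for this hypothetical API; real providers may split frames across chunks)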
        const chunk = JSON.parse(new TextDecoder().decode(value));

        // Handle different chunk types
        if (chunk.type === 'text_delta') {
          text += chunk.text;
          callbacks?.onTextDelta?.(chunk.text);
        } else if (chunk.type === 'tool_call') {
          const toolCall = {
            id: chunk.id,
            name: chunk.name,
            arguments: chunk.arguments,
          };
          toolCalls.push(toolCall);
          callbacks?.onToolCall?.(toolCall);
        } else if (chunk.type === 'finish') {
          finishReason = chunk.reason;
        }
      }

      // Map stop reason
      const stopReason = this.mapStopReason(finishReason);

      // Handle tool calls
      if (toolCalls.length > 0) {
        // Check for __finish__ tool (structured output)
        const finishCall = toolCalls.find((tc) => tc.name === FINISH_TOOL_NAME);
        if (finishCall) {
          return {
            type: 'structured_output',
            output: finishCall.arguments,
            shouldStop: true,
            stopReason: 'tool_use',
          };
        }

        return {
          type: 'tool_calls',
          toolCalls,
          subAgentCalls: [],
          content: text || undefined,
          shouldStop: false,
          stopReason: 'tool_use',
        };
      }

      // Text response
      return {
        type: 'text',
        content: text,
        shouldStop: stopReason !== 'tool_use',
        stopReason,
      };
    } catch (error) {
      callbacks?.onError?.(error instanceof Error ? error : new Error(String(error)));

      return {
        type: 'error',
        error: error instanceof Error ? error : new Error(String(error)),
        shouldStop: true,
        stopReason: 'error',
      };
    }
  }

  private convertMessages(messages: LLMGenerateInput['messages']) {
    return messages.map((msg) => {
      switch (msg.role) {
        case 'system':
          return { role: 'system', content: msg.content };
        case 'user':
          return { role: 'user', content: msg.content };
        case 'assistant':
          if (msg.toolCalls) {
            return {
              role: 'assistant',
              content: msg.content,
              tool_calls: msg.toolCalls.map((tc) => ({
                id: tc.id,
                function: { name: tc.name, arguments: JSON.stringify(tc.arguments) },
              })),
            };
          }
          return { role: 'assistant', content: msg.content ?? '' };
        case 'tool':
          return {
            role: 'tool',
            tool_call_id: msg.toolCallId,
            content: msg.content,
          };
      }
    });
  }

  private mapStopReason(reason: string | undefined): StopReason {
    switch (reason) {
      case 'stop':
        return 'end_turn';
      case 'tool_calls':
        return 'tool_use';
      case 'length':
        return 'max_tokens';
      case 'content_filter':
        return 'content_filter';
      default:
        return 'unknown';
    }
  }
}
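
Wiring the adapter into an executor mirrors the mock-adapter example below (a sketch: stateStore, streamManager, and agent are assumed to be constructed elsewhere in your application):

typescript
const adapter = new MyProviderAdapter(process.env.MY_PROVIDER_API_KEY!);
const executor = new JSAgentExecutor(stateStore, streamManager, adapter);

const handle = await executor.execute(agent, 'Summarize the latest AI news');
const result = await handle.result();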

MockLLMAdapter for Testing

The framework includes MockLLMAdapter for testing:

typescript
import { MockLLMAdapter } from '@helix-agents/core';

// Create with pre-configured responses
const mock = new MockLLMAdapter([
  { type: 'text', content: 'Searching...', shouldStop: false },
  {
    type: 'tool_calls',
    toolCalls: [{ id: 'tc1', name: 'search', arguments: { query: 'AI' } }],
  },
  { type: 'structured_output', output: { result: 'Found it!' } },
]);

// Or add responses incrementally
mock.addResponse({ type: 'text', content: 'Hello', shouldStop: true });

// Use in tests
const executor = new JSAgentExecutor(stateStore, streamManager, mock);
const handle = await executor.execute(agent, 'Test');
const result = await handle.result();

// Verify call count
expect(mock.getCallCount()).toBe(3);

Mock Response Types

typescript
// Text response
{
  type: 'text',
  content: 'Generated text',
  shouldStop: true,
  stopReason: 'end_turn',  // Optional
}

// Tool calls
{
  type: 'tool_calls',
  toolCalls: [{ id: 'tc1', name: 'search', arguments: { q: 'test' } }],
  subAgentCalls: [],  // Optional
  content: 'Accompanying text',  // Optional
}

// Structured output
{
  type: 'structured_output',
  output: { summary: 'Done' },
}

// Error
{
  type: 'error',
  message: 'Rate limit exceeded',
  recoverable: true,  // Optional
}

Testing Patterns

typescript
import { describe, it, expect, beforeEach } from 'vitest';
import { MockLLMAdapter } from '@helix-agents/core';

describe('MyAgent', () => {
  let mock: MockLLMAdapter;

  beforeEach(() => {
    mock = new MockLLMAdapter();
  });

  it('should use tools correctly', async () => {
    // First step: LLM requests tool
    mock.addResponse({
      type: 'tool_calls',
      toolCalls: [{ id: 'tc1', name: 'lookup', arguments: { id: '123' } }],
    });

    // Second step: LLM completes with output
    mock.addResponse({
      type: 'structured_output',
      output: { status: 'success', data: 'Found' },
    });

    const handle = await executor.execute(agent, 'Look up item 123');
    const result = await handle.result();
    expect(result.output).toEqual({ status: 'success', data: 'Found' });
  });

  it('should handle errors', async () => {
    mock.addResponse({
      type: 'error',
      message: 'API unavailable',
      recoverable: false,
    });

    const handle = await executor.execute(agent, 'Test');
    const result = await handle.result();
    expect(result.status).toBe('failed');
  });
});

Best Practices

1. Handle Abort Signals

typescript
async generateStep(input: LLMGenerateInput): Promise<StepResult<unknown>> {
  const { abortSignal } = input;

  // Pass to fetch
  const response = await fetch(url, { signal: abortSignal });

  // Or check manually
  if (abortSignal?.aborted) {
    return {
      type: 'error',
      error: new Error('Request aborted'),
      shouldStop: true,
      stopReason: 'error',
    };
  }
}

2. Always Return StepResult

Never throw from generateStep(). Return error results instead:

typescript
try {
  // LLM call
} catch (error) {
  return {
    type: 'error',
    error: error instanceof Error ? error : new Error(String(error)),
    shouldStop: true,
    stopReason: 'error',
  };
}

3. Handle the __finish__ Tool

Check for the special __finish__ tool that signals structured output:

typescript
import { FINISH_TOOL_NAME } from '@helix-agents/core';

const finishCall = toolCalls.find((tc) => tc.name === FINISH_TOOL_NAME);
if (finishCall) {
  return {
    type: 'structured_output',
    output: finishCall.arguments,
    shouldStop: true,
    stopReason: 'tool_use',
  };
}

4. Preserve Accompanying Text

When the LLM returns text with tool calls, preserve it:

typescript
// LLM: "Let me search for that." + [search tool call]
return {
  type: 'tool_calls',
  toolCalls: [...],
  content: 'Let me search for that.',  // Don't discard!
  ...
};

5. Invoke Callbacks Correctly

Stream callbacks as data arrives, not in bulk at the end:

typescript
// Good: Stream as received
for await (const chunk of stream) {
  callbacks?.onTextDelta?.(chunk);
}

// Bad: Buffer then emit at end
const text = await stream.collect();
callbacks?.onTextDelta?.(text); // Not streaming!
