# Tracing & Observability
Tracing provides visibility into agent execution for debugging, performance analysis, and cost tracking. Helix Agents integrates with Langfuse for comprehensive LLM observability.
## Overview
Tracing captures:
- **Agent Runs** - Full execution lifecycle with timing and status
- **LLM Calls** - Model, tokens, latency, prompts, and responses
- **Tool Executions** - Arguments, results, and timing
- **Sub-Agent Calls** - Nested traces with parent-child relationships
- **Metadata** - User attribution, session grouping, custom tags
### Why Trace?
- **Debugging** - Understand why an agent behaved a certain way
- **Performance** - Identify slow LLM calls or inefficient tool usage
- **Cost Tracking** - Monitor token usage across users and features
- **Quality** - Evaluate agent outputs and improve prompts
- **Compliance** - Audit trail of LLM interactions
## Quick Start
### 1. Install the Package
```bash
npm install @helix-agents/tracing-langfuse langfuse
```

### 2. Set Up Langfuse
Create a Langfuse account and get your API keys:
```bash
# .env
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
```

### 3. Add Hooks to Your Agent
```typescript
import { createLangfuseHooks } from '@helix-agents/tracing-langfuse';
import { defineAgent, JSAgentExecutor } from '@helix-agents/sdk';

// Create hooks (auto-reads credentials from env)
const { hooks, flush } = createLangfuseHooks();

// Use with agent
const agent = defineAgent({
  name: 'my-agent',
  hooks,
  systemPrompt: 'You are a helpful assistant.',
  llmConfig: { model: { provider: 'openai', name: 'gpt-4o' } },
});

// Create an executor and run the agent
const executor = new JSAgentExecutor({ /* ... */ });
const handle = await executor.execute(agent, 'Hello!');
const result = await handle.result;

// Flush in serverless (optional in long-running processes)
await flush();
```

### 4. View Traces in Langfuse
Open your Langfuse dashboard to see:
- Trace timeline with all observations
- Token usage and costs
- Latency breakdown
- Error details
## Configuration
### Basic Options
```typescript
const { hooks } = createLangfuseHooks({
  // Credentials (optional if using env vars)
  publicKey: 'pk-lf-...',
  secretKey: 'sk-lf-...',
  baseUrl: 'https://cloud.langfuse.com', // or self-hosted URL

  // Version tag for filtering
  release: '1.0.0',

  // Default tags for all traces
  defaultTags: ['production', 'v2'],

  // Default metadata for all traces
  defaultMetadata: {
    service: 'chat-api',
    team: 'platform',
  },

  // Debug logging
  debug: false,
});
```

### Data Capture Options
Control what data is sent to Langfuse:
```typescript
const { hooks } = createLangfuseHooks({
  // Agent state snapshots (may be large)
  includeState: false,

  // Full conversation messages (may contain PII)
  includeMessages: false,

  // Tool arguments (default: true)
  includeToolArgs: true,

  // Tool results (may be large)
  includeToolResults: false,

  // LLM prompts (default: true)
  includeGenerationInput: true,

  // LLM responses (default: true)
  includeGenerationOutput: true,
});
```

### Privacy
For production systems handling PII, consider disabling `includeMessages`, `includeGenerationInput`, and `includeGenerationOutput` to avoid logging sensitive user data.
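For example, a locked-down setup for such systems might disable every content-capture option at once; this is a sketch using only the options documented above:

```typescript
// A privacy-focused sketch: timing, token counts, and tool names
// are still traced; message, prompt, and payload content is not.
const { hooks } = createLangfuseHooks({
  includeMessages: false,
  includeGenerationInput: false,
  includeGenerationOutput: false,
  includeState: false,
  includeToolArgs: false, // tool arguments may also embed user text
  includeToolResults: false,
});
```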
## Metadata & Tagging
Metadata enables filtering and attribution in Langfuse.
### Passing Metadata at Execution
```typescript
await executor.execute(agent, input, {
  // User attribution
  userId: 'user-123',

  // Session grouping (e.g., conversation threads)
  sessionId: 'conversation-456',

  // Tags for filtering
  tags: ['premium', 'mobile'],

  // Custom key-value metadata
  metadata: {
    environment: 'production',
    region: 'us-west-2',
    feature: 'chat',
  },
});
```

### Using the Context Builder
For better ergonomics, use the fluent builder:
```typescript
import { tracingContext } from '@helix-agents/tracing-langfuse';

const context = tracingContext()
  .user('user-123')
  .session('conversation-456')
  .tags('premium', 'mobile')
  .environment('production')
  .version('1.0.0')
  .metadata('region', 'us-west-2')
  .build();

await executor.execute(agent, input, context);
```

### Typed Metadata
For common metadata patterns, use typed interfaces:
```typescript
import { createTracingMetadata } from '@helix-agents/tracing-langfuse';

const metadata = createTracingMetadata({
  environment: 'production',
  version: '1.0.0',
  service: 'chat-api',
  region: 'us-west-2',
  tier: 'premium',
  source: 'mobile',
});

await executor.execute(agent, input, { metadata });
```

## Trace Hierarchy
Every agent run creates a trace with nested observations:
```mermaid
graph TB
    subgraph Trace ["trace: my-agent"]
        G1["generation: llm.generation<br/><i>model: gpt-4o, tokens: 1234</i>"]
        T1["span: tool:search<br/><i>args: { query: '...' }</i>"]
        T2["span: tool:calculate"]
        G2["generation: llm.generation"]
        subgraph SubAgent ["span: agent:sub-agent"]
            SG["generation: llm.generation"]
            ST["span: tool:fetch"]
        end
    end
    G1 --> T1 --> T2 --> G2 --> SubAgent
    SG --> ST
```

- **Trace** - Root container, represents the full agent run
- **Generation** - LLM call with model, tokens, timing
- **Span** - Tool or sub-agent execution
## Lifecycle Hooks
Customize observations with lifecycle hooks:
### `onAgentTraceCreated`
Called when the root trace is created:
```typescript
const { hooks } = createLangfuseHooks({
  onAgentTraceCreated: ({ runId, agentName, hookContext, updateTrace }) => {
    // Add environment info
    updateTrace({
      metadata: {
        nodeVersion: process.version,
        environment: process.env.NODE_ENV,
      },
    });
  },
});
```

### `onGenerationCreated`
Called when an LLM generation starts:
```typescript
const { hooks } = createLangfuseHooks({
  onGenerationCreated: ({ model, modelParameters, updateGeneration }) => {
    // Tag by provider
    const provider = model?.includes('gpt') ? 'openai' : 'anthropic';
    updateGeneration({
      metadata: { provider },
    });
  },
});
```

### `onToolCreated`
Called when a tool span starts:
```typescript
const { hooks } = createLangfuseHooks({
  onToolCreated: ({ toolName, toolCallId, updateTool }) => {
    // Categorize tools
    const category = toolName.startsWith('db_') ? 'database' : 'external';
    updateTool({
      metadata: { category },
    });
  },
});
```

### `onObservationEnding`
Called before any observation ends:
```typescript
const { hooks } = createLangfuseHooks({
  onObservationEnding: ({ type, observationId, durationMs, success, error }) => {
    if (!success) {
      console.error(`${type} failed after ${durationMs}ms:`, error);
    }
  },
});
```
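Beyond error logging, the same hook can feed lightweight in-process metrics. The following sketch (the aggregation logic is illustrative, not part of the package) tallies durations per observation type using only the fields shown above:

```typescript
// Illustrative only: aggregate per-type latency in memory from the
// type, durationMs, and success fields the hook receives.
const durations = new Map<string, number[]>();

const { hooks } = createLangfuseHooks({
  onObservationEnding: ({ type, durationMs, success }) => {
    if (success && typeof durationMs === 'number') {
      const bucket = durations.get(type) ?? [];
      bucket.push(durationMs);
      durations.set(type, bucket);
    }
  },
});
```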
## Custom Attribute Extraction

Extract attributes from hook context for all observations:
```typescript
const { hooks } = createLangfuseHooks({
  extractAttributes: (context) => ({
    stepCount: String(context.stepCount),
    hasParent: String(!!context.parentSessionId),
    // Access execution metadata
    region: context.metadata?.region,
  }),
});
```

## Sub-Agent Tracing
Sub-agents automatically inherit tracing context:
```typescript
// Assumes createSubAgentTool is exported from the SDK package alongside defineAgent
import { defineAgent, createSubAgentTool } from '@helix-agents/sdk';

const researchAgent = defineAgent({
  name: 'researcher',
  // ... config
});

const orchestrator = defineAgent({
  name: 'orchestrator',
  hooks, // Langfuse hooks
  tools: [
    createSubAgentTool({
      name: 'research',
      agent: researchAgent,
      description: 'Delegate research tasks',
    }),
  ],
});
```

In Langfuse, you'll see:
```mermaid
graph TB
    subgraph Trace ["trace: orchestrator"]
        G1["generation: llm.generation"]
        subgraph SubAgent ["span: agent:researcher"]
            SG["generation: llm.generation"]
            ST["span: tool:search"]
        end
    end
    G1 --> SubAgent
    SG --> ST
```

Sub-agents inherit `userId`, `sessionId`, `tags`, and `metadata` from the parent.
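For example, executing the orchestrator with execution-time context (the same options shown under "Passing Metadata at Execution", and assuming an executor set up as in the Quick Start) attributes the researcher's nested observations to the same user and session; the input string here is illustrative:

```typescript
// Metadata passed at execute() time lands on the orchestrator's trace;
// the researcher's span and generations inherit it automatically.
const handle = await executor.execute(orchestrator, 'Summarize recent LLM papers', {
  userId: 'user-123',
  sessionId: 'conversation-456',
  tags: ['research'],
});
const result = await handle.result;
```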
## Serverless Considerations
Langfuse batches events and sends them asynchronously. In serverless environments, flush before the function returns:
```typescript
// AWS Lambda / Vercel / Cloudflare Workers
export async function handler(event) {
  const { hooks, flush } = createLangfuseHooks();
  const agent = defineAgent({ hooks, ... });
  const executor = new JSAgentExecutor({ ... });

  const handle = await executor.execute(agent, event.message);
  const result = await handle.result;

  // IMPORTANT: Flush before returning
  await flush();

  return { statusCode: 200, body: JSON.stringify(result) };
}
```

For graceful shutdown in long-running processes:
```typescript
const { hooks, shutdown } = createLangfuseHooks();

process.on('SIGTERM', async () => {
  await shutdown(); // Flushes and closes
  process.exit(0);
});
```

## Self-Hosted Langfuse
To use a self-hosted Langfuse instance:
```typescript
const { hooks } = createLangfuseHooks({
  baseUrl: 'https://langfuse.your-company.com',
  publicKey: 'pk-...',
  secretKey: 'sk-...',
});
```

Or via environment variables:
```bash
LANGFUSE_BASEURL=https://langfuse.your-company.com
LANGFUSE_PUBLIC_KEY=pk-...
LANGFUSE_SECRET_KEY=sk-...
```

## Troubleshooting
### Traces Not Appearing
- **Check credentials**: Ensure `LANGFUSE_PUBLIC_KEY` and `LANGFUSE_SECRET_KEY` are set
- **Enable debug mode**: `createLangfuseHooks({ debug: true })` (see the sketch after this list)
- **Flush in serverless**: Call `await flush()` before the function returns
- **Check network**: Verify connectivity to `cloud.langfuse.com`
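A quick way to run these checks together is a one-off verification script; this sketch uses only calls shown earlier in this guide:

```typescript
import { createLangfuseHooks } from '@helix-agents/tracing-langfuse';

// Debug logging prints what the client does; flush() forces delivery,
// so credential or connectivity problems surface immediately.
const { hooks, flush } = createLangfuseHooks({ debug: true });
// ... run an agent that uses `hooks` ...
await flush();
```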
### Missing Metadata
Metadata must be passed at `execute()` time, not in the agent definition:
```typescript
// WRONG: Agent definition doesn't support execution metadata
const agent = defineAgent({
  metadata: { userId: '123' }, // This won't work!
});

// CORRECT: Pass at execution time
await executor.execute(agent, input, {
  userId: '123',
  metadata: { custom: 'value' },
});
```

### High Memory Usage
If tracing increases memory usage:
- **Disable state capture**: `includeState: false`
- **Disable message capture**: `includeMessages: false`
- **Disable result capture**: `includeToolResults: false` (the sketch below combines all three)
- **Check for stale runs** (cleanup happens after 1 hour idle)
## Next Steps
- **API Reference** - Full API documentation
- **Hooks Guide** - Learn about the hooks system
- **Langfuse Docs** - Langfuse platform documentation