Skills

Skills give an agent a library of specialized capabilities — workflows, runbooks, reference protocols — without paying for all of them on every request. They implement Anthropic's Agent Skills pattern (3-level progressive disclosure) on Helix's append-only, cacheable substrate.

Why skills

A capable agent often needs many specialized playbooks: "how to process a PDF", "how to run the deploy runbook", "the schema for the billing API". Stuffing every playbook into the system prompt works, but it costs tokens on every single request — even the turns where none of them are relevant — and it bloats the context the model has to reason over.

Skills solve this by disclosing capability progressively:

The model always sees a small catalog — just each skill's name + description (tens of tokens each).
It loads a skill's full instructions only when a task actually matches.
It reads a skill's bundled resource files only when those instructions point at them.

Because every load is append-only (catalog in the cached system-prompt prefix; bodies and files arrive as tool results), enabling skills does not invalidate prompt caching. You get a large capability library at near-zero idle token cost.

Skills disclose content; they do not execute code. Level-3 "scripts" are readable text — running them is delegated to the agent's own shell/workspace tools. This keeps the feature decoupled from any execution environment, so it works on every runtime.
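To make the idle-cost argument above concrete, here is a small arithmetic sketch. Every token figure and the `SkillSize` shape are hypothetical, chosen only to illustrate the comparison; they are not measurements of Helix or of any model.

```typescript
// All figures below are hypothetical and exist only to illustrate the arithmetic.
interface SkillSize {
  name: string;
  catalogTokens: number; // name + description (Level 1)
  bodyTokens: number; // full instructions (Level 2)
}

// Tokens paid on every request when every body is stuffed into the system prompt.
function eagerCost(skills: SkillSize[]): number {
  return skills.reduce((sum, s) => sum + s.bodyTokens, 0);
}

// Tokens paid on every request when only the catalog is resident.
function idleCatalogCost(skills: SkillSize[]): number {
  return skills.reduce((sum, s) => sum + s.catalogTokens, 0);
}

const library: SkillSize[] = [
  { name: 'pdf-processing', catalogTokens: 40, bodyTokens: 1500 },
  { name: 'deploy-runbook', catalogTokens: 35, bodyTokens: 2200 },
  { name: 'billing-schema', catalogTokens: 30, bodyTokens: 3000 },
];

console.log(eagerCost(library)); // 6700
console.log(idleCatalogCost(library)); // 105
```

With these made-up sizes, a turn that needs no skill carries 105 catalog tokens instead of 6700 body tokens; a turn that does need one pays for that single body only.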

The 3-level progressive-disclosure model

mermaid

graph TB
    L1["Level 1 — Catalog (always resident)<br/>name + description for every skill<br/>rendered into the system prompt"]
    L2["Level 2 — Body (loaded on demand)<br/>full skill instructions<br/>returned by load_skill"]
    L3["Level 3 — Resource files (read on demand)<br/>references/ scripts/ assets/<br/>returned by read_skill_file"]
    L1 -->|"model matches a task<br/>calls load_skill(name)"| L2
    L2 -->|"body points at a file<br/>calls read_skill_file(skill, path)"| L3

Level 1 — catalog (always resident). Every skill's name + description is rendered into a ## Skills section appended to the system prompt. The catalog is deterministically sorted by name so it is byte-stable within a session — it lives in the cached prefix and never invalidates it. The catalog carries a load-bearing guardrail ("A Skill is NOT a tool — call load_skill"), because models otherwise try to invoke skill names as if they were tools.
Level 2 — body (loaded on demand). The full skill body is loaded by the auto-injected load_skill tool, which returns the body as the tool result — the model acts on it in the same continuation, with no wasted round-trip. Because tool results only ever append to history, this is append-only and cache-safe. load_skill also emits an informational skill_loaded custom stream event.
Level 3 — resource files (read on demand). Bundled reference/script/asset files are read by the auto-injected read_skill_file tool, which supports optional startLine/endLine ranges and a path-traversal guard.
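The byte-stability requirement on the Level-1 catalog can be illustrated with a minimal sketch. `renderCatalog`, the `CatalogEntry` shape, the bullet layout, and the guardrail wording are all assumptions made for illustration; the markup Helix actually emits may differ.

```typescript
interface CatalogEntry {
  name: string;
  description: string;
}

// Hypothetical renderer: sorts by name using plain code-unit comparison so the
// output does not depend on input order or on the host's locale settings.
function renderCatalog(entries: CatalogEntry[]): string {
  const sorted = [...entries].sort((a, b) =>
    a.name < b.name ? -1 : a.name > b.name ? 1 : 0,
  );
  const lines = sorted.map((e) => `- ${e.name}: ${e.description}`);
  return [
    '## Skills',
    // Paraphrased guardrail; the real wording is not reproduced here.
    'A Skill is NOT a tool. Call load_skill with the skill name to load it.',
    ...lines,
  ].join('\n');
}

const pdf = { name: 'pdf-processing', description: 'Use when working with PDF files.' };
const deploy = { name: 'deploy-runbook', description: 'Use when deploying a release.' };

// Same bytes regardless of the order the provider returned the skills in.
console.log(renderCatalog([pdf, deploy]) === renderCatalog([deploy, pdf])); // true
```

The sketch avoids `localeCompare` on purpose: a locale-dependent sort could order names differently across hosts, which would defeat the point of a byte-stable prefix.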

Providers

Skills resolve to plain data behind a small async interface (SkillProvider). Two providers ship in v1.

| Provider | Package | Backing | Runs on |
| --- | --- | --- | --- |
| `inCodeSkillProvider` | `@helix-agents/core` | TypeScript data bundled with the agent | Everywhere (Workers-safe) |
| `fileSystemSkillProvider` | `@helix-agents/skill-fs` | `SKILL.md` directories on disk (Anthropic format) | Node only |

`inCodeSkillProvider` (the "plain data" mode)

Skills are TypeScript data bundled with the agent. Dependency-free and Workers-safe (no node:fs). Use this on Cloudflare Workers, or anywhere you want skills version-controlled alongside your agent code. You rarely call inCodeSkillProvider directly — passing an array of skill definitions to skills does it for you (see below).

`fileSystemSkillProvider` (Node only)

Reads Anthropic-format SKILL.md directories from disk. Node only (uses node:fs/promises + yaml); not usable on Cloudflare Workers. Install the package:

bash

npm install @helix-agents/skill-fs

See the @helix-agents/skill-fs reference for options and behavior.

Want to author skills as remote packages but still ship them through the Workers-safe in-code provider? See Loading remote skill packages (build-time bake) below.

Loading remote skill packages (build-time bake)

The two providers above cover skills bundled in your own code and skills read from local disk. To pull in skills published as remote packages — a git repo, or a Claude plugin marketplace (e.g. hyperframes) — use the build-time baker, @helix-agents/skill-cli.

The approach is to resolve and pin at build time, then ship through the in-code provider:

You declare remote sources, pinned to a version, in a helix.skills.json manifest.
helix-skills sync fetches the selected skills and writes a generated TypeScript module exporting SkillDefinition[] — plus a committed lockfile with a per-skill sha256 integrity hash.
You import { skills } from that module and pass it to defineAgent({ skills }).

Because the output is plain in-code data, there is zero runtime fetch and no node:fs — the baked skills run everywhere, including Cloudflare Workers. The CLI itself is Node-only and runs only at build time; it is never imported by your agent.

bash

npm i -D @helix-agents/skill-cli

helix.skills.json:

json

{
  "skills": {
    "hyperframes": {
      "type": "git",
      "url": "https://github.com/heygen-com/hyperframes.git",
      "version": "0.6.70",
      "include": ["hyperframes", "hyperframes-media"]
    }
  }
}

bash

npx helix-skills sync          # bake → src/skills.generated.ts + helix.skills.lock
npx helix-skills sync --check  # CI: fail if the lockfile would change

typescript

import { defineAgent } from '@helix-agents/core';
import { skills } from './src/skills.generated';

const agent = defineAgent({
  name: 'video-agent',
  systemPrompt: 'You are a video-composition assistant.',
  llmConfig: { model },
  skills, // baked in-code skills — zero runtime fetch, Workers-safe
});

Recommended policy: gitignore the generated src/skills.generated.ts (it is a regenerable build artifact), commit helix.skills.lock, and run helix-skills sync --check in CI so manifest/lockfile drift fails the build. The manifest also supports claude-marketplace sources and version / ref / sha pinning — see the @helix-agents/skill-cli reference for the full manifest + lockfile schemas, the programmatic API (bakeSkills, parseManifest, GitRepoSource, ClaudeMarketplaceSource), and the v1 limitations.
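The lockfile drift check described above can be pictured with a short sketch. How `helix-skills` actually derives its per-skill sha256 is not specified here, so `skillDigest`, `findDrift`, and the `BakedSkill` shape are hypothetical; only `createHash` from `node:crypto` is a real API.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical shape for a baked skill; not the real SkillDefinition type.
interface BakedSkill {
  name: string;
  body: string;
  resources: Record<string, string>;
}

// Hypothetical digest: hashes the name, body, and resources in sorted path
// order, so the result does not depend on object key order.
function skillDigest(skill: BakedSkill): string {
  const hash = createHash('sha256');
  hash.update(JSON.stringify([skill.name, skill.body]));
  for (const path of Object.keys(skill.resources).sort()) {
    hash.update(JSON.stringify([path, skill.resources[path]]));
  }
  return hash.digest('hex');
}

// Names of skills whose freshly computed digest differs from the locked one.
function findDrift(skills: BakedSkill[], locked: Record<string, string>): string[] {
  return skills.filter((s) => locked[s.name] !== skillDigest(s)).map((s) => s.name);
}

const fetched: BakedSkill = {
  name: 'hyperframes',
  body: '# Hyperframes\n...',
  resources: { 'references/api.md': '# API\n...' },
};
const lock = { hyperframes: skillDigest(fetched) };

console.log(findDrift([fetched], lock)); // []
console.log(findDrift([{ ...fetched, body: '# Changed' }], lock)); // [ 'hyperframes' ]
```

A CI step in the spirit of `sync --check` would fail the build whenever the drift list is non-empty.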

Defining skills

Set AgentConfig.skills to either a SkillProvider or an array of in-code SkillDefinitions. The array form is sugar for inCodeSkillProvider.

In-code (array sugar)

typescript

import { defineAgent } from '@helix-agents/core';

const agent = defineAgent({
  name: 'assistant',
  systemPrompt: 'You are a helpful assistant.',
  llmConfig: { model },
  skills: [
    {
      name: 'pdf-processing',
      description:
        'Extract text and tables from PDFs, fill forms, merge documents. Use when working with PDF files.',
      body: '# PDF processing\n…full instructions…',
    },
  ],
});

When skills is present, the framework appends the catalog to the system prompt and auto-injects load_skill + read_skill_file into the tool list. An empty/unset skills is a total no-op.

`defineSkill` (validation helper)

defineSkill(def) validates a skill definition (name rules, non-empty body) and returns it unchanged. Useful for defining skills in their own modules with eager validation:

typescript

import { readFileSync } from 'node:fs';
import { defineAgent, defineSkill } from '@helix-agents/core';

export const deployRunbook = defineSkill({
  name: 'deploy-runbook',
  description:
    'Step-by-step production deploy + rollback procedure. Use when deploying or rolling back a release.',
  body: '# Deploy runbook\n1. …',
  resources: {
    'references/rollback.md': '# Rollback\n…',
    // A lazy loader is also allowed (string | () => string | Promise<string>).
    // This loader reads from disk via node:fs, so it is not Workers-safe.
    'scripts/healthcheck.sh': () => readFileSync('./hc.sh', 'utf8'),
  },
});

const agent = defineAgent({ /* … */ skills: [deployRunbook] });

defineAgent() also validates the array form at build time — it throws on an invalid name/description/body or a duplicate skill name.

Filesystem (one line)

typescript

import { fileSystemSkillProvider } from '@helix-agents/skill-fs';

const agent = defineAgent({
  // …
  skills: fileSystemSkillProvider({ roots: ['./skills'] }),
});

The `SKILL.md` format

fileSystemSkillProvider scans each root for <root>/<skill-name>/SKILL.md. Each SKILL.md is YAML frontmatter + a markdown body, optionally accompanied by references/, scripts/, and assets/ files in the same directory.

skills/
└── pdf-processing/
    ├── SKILL.md
    ├── references/
    │   └── forms.md
    └── scripts/
        └── extract.py

markdown

---
name: pdf-processing
description: Extract text and tables from PDFs, fill forms, merge documents. Use when working with PDF files.
license: MIT
---

# PDF processing

Full instructions for working with PDFs…

Frontmatter fields:

| Field | Required | Notes |
| --- | --- | --- |
| `name` | Yes | Lowercase `a-z` / `0-9` with single hyphens; ≤64 chars; must equal the directory name. |
| `description` | Yes | ≤1024 chars. Triggers only; see Writing good descriptions. |
| `license` | No | Free-form string. |
| `compatibility` | No | Free-form string (≤500 chars). |
| `metadata` | No | `Record<string, string>`. |
| `allowed-tools` | No | Open-standard field; parsed and carried in v1 but not enforced. |

Any file under the skill directory other than SKILL.md is surfaced as a Level-3 resource (skill-relative path). See the @helix-agents/skill-fs reference for the resource read behavior (binary refusal, 64 KB cap, line ranges, traversal guard).
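A traversal guard of the purely lexical kind can be sketched as follows. `resolveInsideSkill` is a hypothetical helper, not the `@helix-agents/skill-fs` implementation, and rejecting backslashes outright is a simplifying assumption of this sketch.

```typescript
// Hypothetical helper: returns the normalized skill-relative path, or null if
// the path is absolute, empty, or climbs out of the skill directory.
// It inspects the path string only and never touches the filesystem.
function resolveInsideSkill(relPath: string): string | null {
  if (relPath.startsWith('/') || relPath.includes('\\')) return null;
  const stack: string[] = [];
  for (const segment of relPath.split('/')) {
    if (segment === '' || segment === '.') continue;
    if (segment === '..') {
      if (stack.length === 0) return null; // would escape the skill root
      stack.pop();
    } else {
      stack.push(segment);
    }
  }
  return stack.length > 0 ? stack.join('/') : null;
}

console.log(resolveInsideSkill('references/forms.md')); // references/forms.md
console.log(resolveInsideSkill('references/../../etc/passwd')); // null
```

Because the check works on the string alone, it cannot see where a symlink inside the skill directory points; catching that would need a real-path comparison against the filesystem.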

The auto-injected tools

When an agent declares skills, two tools are auto-injected. Their names are reserved — user tools cannot shadow them, and skill names use [a-z0-9-] (no underscores) so they can never collide.

`load_skill`

load_skill({ name: "pdf-processing" })

Returns the full skill body as the tool result, wrapped in a <skill name="…">…</skill> block (plus a <skill_resources> listing if the skill bundles files). Emits an informational skill_loaded custom stream event ({ name }, surfaced as a data-skill_loaded AI-SDK event) that consumers MAY render. An unknown name returns a not-found message listing the available skills (it does not throw).
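The result shapes described above can be sketched roughly like this. `renderLoadSkillResult` and the `LoadedSkill` shape are hypothetical, and the exact wording of the not-found message and the layout of the resource listing are assumptions; only the `<skill name="…">` and `<skill_resources>` wrappers come from the description above.

```typescript
// Hypothetical shape; not the real SkillDefinition type.
interface LoadedSkill {
  name: string;
  body: string;
  resourcePaths: string[];
}

// Returns a string in every case: an unknown name yields a message, not a throw.
function renderLoadSkillResult(requested: string, skills: LoadedSkill[]): string {
  const skill = skills.find((s) => s.name === requested);
  if (!skill) {
    const available = skills.map((s) => s.name).sort().join(', ');
    return `Skill "${requested}" not found. Available skills: ${available}`;
  }
  const parts = [`<skill name="${skill.name}">`, skill.body, '</skill>'];
  if (skill.resourcePaths.length > 0) {
    // Only skills that bundle files get a resource listing.
    parts.push('<skill_resources>', ...skill.resourcePaths, '</skill_resources>');
  }
  return parts.join('\n');
}

const known: LoadedSkill[] = [
  { name: 'pdf-processing', body: '# PDF processing', resourcePaths: ['references/forms.md'] },
  { name: 'deploy-runbook', body: '# Deploy runbook', resourcePaths: [] },
];

console.log(renderLoadSkillResult('pdf-processing', known));
console.log(renderLoadSkillResult('pdf', known));
```

Returning the not-found case as ordinary tool output lets the model read the list of available names and retry within the same run.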

`read_skill_file`

read_skill_file({ skill: "pdf-processing", path: "references/forms.md", startLine: 1, endLine: 40 })

Returns the file contents wrapped in a <skill_file … untrusted="true"> block with a note instructing the model to treat the content as untrusted data. startLine/endLine are optional, 1-indexed, inclusive.
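The 1-indexed, inclusive range semantics can be illustrated with a small sketch. `sliceLines` is a hypothetical helper, and clamping out-of-range values rather than rejecting them is an assumption of this sketch, not documented behavior.

```typescript
// Hypothetical helper: returns lines startLine..endLine (1-indexed, inclusive).
// Missing bounds default to the first and last line; out-of-range bounds are clamped.
function sliceLines(text: string, startLine?: number, endLine?: number): string {
  const lines = text.split('\n');
  const start = Math.max(1, startLine ?? 1);
  const end = Math.min(lines.length, endLine ?? lines.length);
  if (start > end) return '';
  // slice() is 0-indexed and end-exclusive, hence start - 1 and end.
  return lines.slice(start - 1, end).join('\n');
}

const file = 'line 1\nline 2\nline 3\nline 4';
console.log(sliceLines(file, 2, 3)); // "line 2" and "line 3"
console.log(sliceLines(file, 4)); // "line 4"
```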

Preloaded skills

Some skills are relevant on every turn (a house style guide, a domain glossary, the one runbook this agent exists to run). For those, skip lazy loading and inject the body up front with preloadSkills.

AgentConfig.preloadSkills?: string[] injects the named skills' full bodies into the system prompt on every step — always in context, no load_skill call needed. The bodies render as an ### Active Skills (already loaded) block inside the same deterministically-sorted, cache-stable fragment as the catalog, so the prefix stays byte-stable per session.

typescript

defineAgent({
  // …
  skills: [deployRunbook, pdfProcessing],
  preloadSkills: ['deploy-runbook'], // body always in context
});

Behavior:

Preloaded skills also appear in the loadable catalog marked loaded="true" — a static marker (cache-safe, decided at config time) that tells the model not to reload them, while load_skill remains a recovery path (e.g. if history compaction later drops the system-prompt-injected body, load_skill can re-fetch it).
Each name must resolve in the agent's provider. Unknown names warn-and-skip at resolution time — one bad name never crashes the agent or breaks the rest. (defineAgent() additionally throws at build time if a preloadSkills name isn't among the in-code skills array, or if a name is duplicated.)
Sub-agents do NOT inherit a parent's preloadSkills (nor its skills) — a sub-agent uses skills only if its own config declares them.
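The warn-and-skip behavior for unknown preload names can be sketched as below. `resolvePreloads`, the `PreloadResolution` shape, and the warning text are hypothetical and only illustrate the resolution-time rule; the build-time checks in `defineAgent()` are not modeled.

```typescript
interface PreloadResolution {
  resolved: string[]; // names whose bodies would be injected
  warnings: string[]; // one entry per skipped name
}

// Hypothetical resolution step: unknown names are reported and skipped, so one
// bad name never prevents the remaining skills from being preloaded.
function resolvePreloads(preloadNames: string[], availableNames: string[]): PreloadResolution {
  const available = new Set(availableNames);
  const resolved: string[] = [];
  const warnings: string[] = [];
  for (const name of preloadNames) {
    if (available.has(name)) {
      resolved.push(name);
    } else {
      warnings.push(`preloadSkills: unknown skill "${name}" skipped`);
    }
  }
  resolved.sort(); // deterministic order keeps the rendered block byte-stable
  return { resolved, warnings };
}

const result = resolvePreloads(
  ['pdf-processing', 'style-guide', 'deploy-runbook'],
  ['deploy-runbook', 'pdf-processing'],
);
console.log(result.resolved); // [ 'deploy-runbook', 'pdf-processing' ]
console.log(result.warnings.length); // 1
```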

When to preload vs lazy-load. Preload a skill when it is relevant to (nearly) every turn and the body is small enough to justify always-on cost. Lazy-load when relevance is occasional — the catalog entry is cheap, and the model loads the body only when a task matches.

Writing good descriptions

The catalog description is the only thing the model sees before deciding to load a skill, so it is selection metadata, not documentation. Write it as a trigger, not a summary of the workflow.

What + "Use when…". State what the skill does, then the conditions that should trigger it. Example: "Extract text and tables from PDFs, fill forms, merge documents. Use when working with PDF files."
Triggers only. Do NOT summarize the step-by-step workflow — that belongs in the body, which the model gets after loading.
Third person. Describe the skill, not the model ("Extracts…", not "You should extract…").
Name rules. Skill name is lowercase a-z/0-9 with single hyphens between segments (no leading/trailing/double hyphen, no underscores), ≤64 chars, and must not contain the words anthropic/claude. For the filesystem provider, the name must equal the skill's directory name.
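The name rules above translate into a compact check. `validateSkillName` and its error strings are hypothetical and are not the validation that `defineSkill` or `defineAgent()` actually runs; the sketch only encodes the rules as stated.

```typescript
// One or more [a-z0-9] segments joined by single hyphens: this rules out
// leading, trailing, and double hyphens as well as underscores and uppercase.
const NAME_PATTERN = /^[a-z0-9]+(?:-[a-z0-9]+)*$/;
const RESERVED_WORDS = ['anthropic', 'claude'];

// Returns null for a valid name, otherwise a short reason.
function validateSkillName(name: string): string | null {
  if (name.length === 0 || name.length > 64) {
    return 'name must be 1 to 64 characters';
  }
  if (!NAME_PATTERN.test(name)) {
    return 'name must be lowercase a-z/0-9 segments joined by single hyphens';
  }
  const reserved = RESERVED_WORDS.find((word) => name.includes(word));
  if (reserved) {
    return `name must not contain "${reserved}"`;
  }
  return null;
}

console.log(validateSkillName('pdf-processing')); // null
console.log(validateSkillName('pdf_processing') !== null); // true
```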

Cross-runtime support

All skills logic lives in shared core, plus a one-line per-run catalog-resolution hook at each runtime's message-build call site (the same place memory retrieval resolves — where IO / non-determinism is allowed). The two tools therefore work on all runtimes; the catalog is threaded on JS / Temporal / Cloudflare / DBOS.

| Runtime | In-code provider | Filesystem provider |
| --- | --- | --- |
| JS | ✅ | ✅ |
| Temporal | ✅ | ✅ (resolve catalog + tool reads in activities) |
| DBOS | ✅ | ✅ (resolve in steps) |
| Cloudflare DO / Workflows | ✅ | ❌ (no `node:fs`; use the in-code provider) |

fileSystemSkillProvider works wherever node:fs exists — i.e. NOT Cloudflare Workers; use the in-code provider there (or a future workspace/D1-backed provider). On Temporal/DBOS, filesystem IO must run where IO is allowed (the per-step activity/step), never in workflow code; in-code catalogs are deterministic data and need no special handling.

Cache behavior

Skill loading is purely additive by construction:

The Level-1 catalog is a stable, deterministically-sorted system-prompt fragment that is NEVER annotated with per-skill loaded-state (no "✓ loaded" marks at runtime) — annotating it would make the cached prefix volatile and bust the cache on every load. "Already loaded" handling lives entirely in the load_skill result, never in the catalog. (Preloaded skills' loaded="true" marker is static — decided at config time — so it does not break stability.)
Level-2 bodies and Level-3 resources arrive append-only as tool results, covered by the cache strategy's breakpoints (the system anchor + the latest turn's tool-result batch). The skills feature required no changes to the cache strategy or the LLM adapter.

The only inherited caveat (identical to memory injection): on the turn a body lands, the latest-turn breakpoint sits on/after it, so that one breakpoint won't cache-hit across that turn boundary; the system anchor still does.
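The difference between appending a tool result and annotating the catalog can be made concrete with a toy model. Real prompt caches match on token prefixes inside the provider; the segment arrays, the `sharedPrefixLength` helper, and the message strings here are simplifying assumptions.

```typescript
// Number of leading segments two requests have in common.
function sharedPrefixLength(prev: string[], next: string[]): number {
  let i = 0;
  while (i < prev.length && i < next.length && prev[i] === next[i]) i++;
  return i;
}

const catalog = '## Skills\n- pdf-processing: Use when working with PDF files.';
const turn1 = [catalog, 'user: merge these PDFs'];

// Append-only: the skill body arrives as a new tool result at the end.
const appended = [...turn1, 'tool(load_skill): <skill name="pdf-processing">...</skill>'];

// Counter-example: marking the skill as loaded rewrites the first segment.
const annotated = [catalog + ' (loaded)', 'user: merge these PDFs', 'tool(load_skill): ...'];

console.log(sharedPrefixLength(turn1, appended)); // 2
console.log(sharedPrefixLength(turn1, annotated)); // 0
```

In the append-only case the whole previous request is still a prefix of the next one; in the annotated case nothing is, because the change sits at the very start.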

Limitations (v1)

Re-loading a skill returns the body again. ToolContext exposes no transcript access, so the load_skill tool cannot dedup on its own — v1 returns the body on every call (correct + cache-safe; rarely triggered because the model sees its own prior load_skill results). collectLoadedSkillNames(messages) ships as the building block for programmatic dedup, but the dispatch-layer short-circuit is deferred.
The filesystem traversal guard is lexical. It rejects paths that resolve outside the skill directory, but it does not resolve symlinks, so a symlink inside a skill directory that points elsewhere is not caught. This is safe for operator-provisioned skill directories.
The filesystem staleness re-scan detects root-entry additions and removals, not in-place edits. Editing an existing skill's files in place won't be picked up until a restart or a touch of the root directory.
No token budget. There is no cap on catalog size or preloaded-body size (future work).
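Programmatic dedup of repeated loads, as mentioned in the first limitation above, could be built on a transcript scan along these lines. This is not the library's `collectLoadedSkillNames`; `loadedSkillNames` and the `SketchMessage` / `SketchToolCall` shapes are hypothetical stand-ins, because the real message types are not shown on this page.

```typescript
// Hypothetical message shapes, for illustration only.
interface SketchToolCall {
  toolName: string;
  args: { name?: string };
}
interface SketchMessage {
  role: 'user' | 'assistant' | 'tool';
  toolCalls?: SketchToolCall[];
}

// Collects the names passed to load_skill anywhere in the transcript.
function loadedSkillNames(messages: SketchMessage[]): Set<string> {
  const names = new Set<string>();
  for (const message of messages) {
    for (const call of message.toolCalls ?? []) {
      if (call.toolName === 'load_skill' && call.args.name) {
        names.add(call.args.name);
      }
    }
  }
  return names;
}

const transcript: SketchMessage[] = [
  { role: 'user' },
  { role: 'assistant', toolCalls: [{ toolName: 'load_skill', args: { name: 'pdf-processing' } }] },
  { role: 'tool' },
  { role: 'assistant', toolCalls: [{ toolName: 'read_skill_file', args: {} }] },
];

console.log(loadedSkillNames(transcript).has('pdf-processing')); // true
console.log(loadedSkillNames(transcript).size); // 1
```

A caller with transcript access could consult such a set before dispatching another `load_skill` call for the same name.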

Next steps

Skills (Progressive Disclosure) — internals — the design deep-dive.
@helix-agents/core reference — AgentConfig.skills, the types, and the public helpers.
@helix-agents/skill-fs reference — fileSystemSkillProvider options and behavior.
@helix-agents/skill-cli reference — the build-time baker for remote skill packages.
Sub-Agents — sub-agents scope their own skills (no inheritance).
Workspaces — run the scripts a skill discloses via shell/code tools.
