Skills
Skills give an agent a library of specialized capabilities — workflows, runbooks, reference protocols — without paying for all of them on every request. They implement Anthropic's Agent Skills pattern (3-level progressive disclosure) on Helix's append-only, cacheable substrate.
Why skills
A capable agent often needs many specialized playbooks: "how to process a PDF", "how to run the deploy runbook", "the schema for the billing API". Stuffing every playbook into the system prompt works, but it costs tokens on every single request — even the turns where none of them are relevant — and it bloats the context the model has to reason over.
Skills solve this by disclosing capability progressively:
- The model always sees a small catalog: just each skill's name + description (tens of tokens each).
- It loads a skill's full instructions only when a task actually matches.
- It reads a skill's bundled resource files only when those instructions point at them.
Because every load is append-only (catalog in the cached system-prompt prefix; bodies and files arrive as tool results), enabling skills does not invalidate prompt caching. You get a large capability library at near-zero idle token cost.
Skills disclose content; they do not execute code. Level-3 "scripts" are readable text — running them is delegated to the agent's own shell/workspace tools. This keeps the feature decoupled from any execution environment, so it works on every runtime.
The 3-level progressive-disclosure model
graph TB
L1["Level 1 — Catalog (always resident)<br/>name + description for every skill<br/>rendered into the system prompt"]
L2["Level 2 — Body (loaded on demand)<br/>full skill instructions<br/>returned by load_skill"]
L3["Level 3 — Resource files (read on demand)<br/>references/ scripts/ assets/<br/>returned by read_skill_file"]
L1 -->|"model matches a task<br/>calls load_skill(name)"| L2
L2 -->|"body points at a file<br/>calls read_skill_file(skill, path)"| L3

- Level 1: catalog (always resident). Every skill's name + description is rendered into a ## Skills section appended to the system prompt. The catalog is deterministically sorted by name so it is byte-stable within a session; it lives in the cached prefix and never invalidates it. The catalog carries a load-bearing guardrail ("A Skill is NOT a tool — call load_skill"), because models otherwise try to invoke skill names as if they were tools.
- Level 2: body (loaded on demand). The full skill body is loaded by the auto-injected load_skill tool, which returns the body as the tool result. The model acts on it in the same continuation, with no wasted round-trip. Because tool results only ever append to history, this is append-only and cache-safe. load_skill also emits an informational skill_loaded custom stream event.
- Level 3: resource files (read on demand). Bundled reference/script/asset files are read by the auto-injected read_skill_file tool, which supports optional startLine/endLine ranges and a path-traversal guard.
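The byte-stability claim for the Level-1 catalog can be illustrated with a minimal sketch. The `CatalogEntry` type, the `renderCatalog` function, and the line format are hypothetical (the guide does not show the real rendering); the only point is that sorting by name makes the output independent of declaration order.

```typescript
// Hypothetical shape: only name and description matter for the Level-1 catalog.
interface CatalogEntry {
  name: string;
  description: string;
}

// Render a catalog fragment whose bytes depend only on the set of skills,
// not on the order in which they were declared. The line format is made up.
function renderCatalog(entries: CatalogEntry[]): string {
  const sorted = [...entries].sort((a, b) =>
    a.name < b.name ? -1 : a.name > b.name ? 1 : 0,
  );
  const lines = sorted.map((e) => `- ${e.name}: ${e.description}`);
  return ['## Skills', ...lines].join('\n');
}

const catalogA = renderCatalog([
  { name: 'pdf-processing', description: 'Use when working with PDF files.' },
  { name: 'deploy-runbook', description: 'Use when deploying a release.' },
]);
const catalogB = renderCatalog([
  { name: 'deploy-runbook', description: 'Use when deploying a release.' },
  { name: 'pdf-processing', description: 'Use when working with PDF files.' },
]);

console.log(catalogA === catalogB); // true: declaration order does not change the output
```

A code-unit comparison is used instead of `localeCompare` so the order cannot vary with the host locale.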
Providers
Skills resolve to plain data behind a small async interface (SkillProvider). Two providers ship in v1.
| Provider | Package | Backing | Runs on |
|---|---|---|---|
| inCodeSkillProvider | @helix-agents/core | TypeScript data bundled with the agent | Everywhere (Workers-safe) |
| fileSystemSkillProvider | @helix-agents/skill-fs | SKILL.md directories on disk (Anthropic format) | Node only |
inCodeSkillProvider (the "plain data" mode)
Skills are TypeScript data bundled with the agent. Dependency-free and Workers-safe (no node:fs). Use this on Cloudflare Workers, or anywhere you want skills version-controlled alongside your agent code. You rarely call inCodeSkillProvider directly — passing an array of skill definitions to skills does it for you (see below).
fileSystemSkillProvider (Node only)
Reads Anthropic-format SKILL.md directories from disk. Node only (uses node:fs/promises + yaml); not usable on Cloudflare Workers. Install the package:
npm install @helix-agents/skill-fs

See the @helix-agents/skill-fs reference for options and behavior.
Want to author skills as remote packages but still ship them through the Workers-safe in-code provider? See Loading remote skill packages (build-time bake) below.
Loading remote skill packages (build-time bake)
The two providers above cover skills bundled in your own code and skills read from local disk. To pull in skills published as remote packages — a git repo, or a Claude plugin marketplace (e.g. hyperframes) — use the build-time baker, @helix-agents/skill-cli.
The model is resolve + pin at build time, then ship through the in-code provider:
- You declare remote sources, pinned to a version, in a helix.skills.json manifest.
- helix-skills sync fetches the selected skills and writes a generated TypeScript module exporting SkillDefinition[], plus a committed lockfile with a per-skill sha256 integrity hash.
- You import { skills } from that module and pass it to defineAgent({ skills }).
Because the output is plain in-code data, there is zero runtime fetch and no node:fs — the baked skills run everywhere, including Cloudflare Workers. The CLI itself is Node-only and runs only at build time; it is never imported by your agent.
npm i -D @helix-agents/skill-cli

helix.skills.json:
{
"skills": {
"hyperframes": {
"type": "git",
"url": "https://github.com/heygen-com/hyperframes.git",
"version": "0.6.70",
"include": ["hyperframes", "hyperframes-media"]
}
}
}

npx helix-skills sync         # bake → src/skills.generated.ts + helix.skills.lock
npx helix-skills sync --check # CI: fail if the lockfile would change

import { defineAgent } from '@helix-agents/core';
import { skills } from './src/skills.generated';
const agent = defineAgent({
name: 'video-agent',
systemPrompt: 'You are a video-composition assistant.',
llmConfig: { model },
skills, // baked in-code skills — zero runtime fetch, Workers-safe
});

Recommended policy: gitignore the generated src/skills.generated.ts (it is a regenerable build artifact), commit helix.skills.lock, and run helix-skills sync --check in CI so manifest/lockfile drift fails the build. The manifest also supports claude-marketplace sources and version / ref / sha pinning. See the @helix-agents/skill-cli reference for the full manifest + lockfile schemas, the programmatic API (bakeSkills, parseManifest, GitRepoSource, ClaudeMarketplaceSource), and the v1 limitations.
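To make the idea of a per-skill sha256 integrity check concrete, here is a minimal sketch of drift detection. It is not how helix-skills computes or stores its hashes: the `BakedSkill` and `LockEntries` shapes, the canonical encoding, and both function names are assumptions for illustration. Only `createHash` from `node:crypto` is a real API.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical shapes: the real lockfile schema lives in the skill-cli reference.
interface BakedSkill {
  name: string;
  description: string;
  body: string;
}
type LockEntries = Record<string, string>; // skill name -> sha256 hex digest

// Hash a canonical JSON encoding so the digest is stable across runs.
function integrityOf(skill: BakedSkill): string {
  const canonical = JSON.stringify([skill.name, skill.description, skill.body]);
  return createHash('sha256').update(canonical, 'utf8').digest('hex');
}

// Return the names of skills whose content no longer matches the lockfile.
function findDrift(skills: BakedSkill[], lock: LockEntries): string[] {
  return skills
    .filter((s) => lock[s.name] !== integrityOf(s))
    .map((s) => s.name);
}

const hyperframes: BakedSkill = {
  name: 'hyperframes',
  description: 'Example description.',
  body: '# Body v1',
};
const lockEntries: LockEntries = { hyperframes: integrityOf(hyperframes) };

console.log(findDrift([hyperframes], lockEntries)); // []
console.log(findDrift([{ ...hyperframes, body: '# Body v2' }], lockEntries)); // [ 'hyperframes' ]
```

A CI check in the spirit of sync --check would fail the build whenever the drift list is non-empty.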
Defining skills
Set AgentConfig.skills to either a SkillProvider or an array of in-code SkillDefinitions. The array form is sugar for inCodeSkillProvider.
In-code (array sugar)
import { defineAgent } from '@helix-agents/core';
const agent = defineAgent({
name: 'assistant',
systemPrompt: 'You are a helpful assistant.',
llmConfig: { model },
skills: [
{
name: 'pdf-processing',
description:
'Extract text and tables from PDFs, fill forms, merge documents. Use when working with PDF files.',
body: '# PDF processing\n…full instructions…',
},
],
});

When skills is present, the framework appends the catalog to the system prompt and auto-injects load_skill + read_skill_file into the tool list. An empty/unset skills is a total no-op.
defineSkill (validation helper)
defineSkill(def) validates a skill definition (name rules, non-empty body) and returns it unchanged. Useful for defining skills in their own modules with eager validation:
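import { readFileSync } from 'node:fs'; // Node-only; used by the lazy resource loader below
import { defineAgent, defineSkill } from '@helix-agents/core';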
export const deployRunbook = defineSkill({
name: 'deploy-runbook',
description:
'Step-by-step production deploy + rollback procedure. Use when deploying or rolling back a release.',
body: '# Deploy runbook\n1. …',
resources: {
'references/rollback.md': '# Rollback\n…',
// A lazy loader is also allowed (string | () => string | Promise<string>):
'scripts/healthcheck.sh': () => readFileSync('./hc.sh', 'utf8'),
},
});
const agent = defineAgent({ /* … */ skills: [deployRunbook] });

defineAgent() also validates the array form at build time: it throws on an invalid name/description/body or a duplicate skill name.
Filesystem (one line)
import { fileSystemSkillProvider } from '@helix-agents/skill-fs';
const agent = defineAgent({
// …
skills: fileSystemSkillProvider({ roots: ['./skills'] }),
});

The SKILL.md format
fileSystemSkillProvider scans each root for <root>/<skill-name>/SKILL.md. Each SKILL.md is YAML frontmatter + a markdown body, optionally accompanied by references/, scripts/, and assets/ files in the same directory.
skills/
└── pdf-processing/
├── SKILL.md
├── references/
│ └── forms.md
└── scripts/
└── extract.py

---
name: pdf-processing
description: Extract text and tables from PDFs, fill forms, merge documents. Use when working with PDF files.
license: MIT
---
# PDF processing
Full instructions for working with PDFs…

Frontmatter fields:
| Field | Required | Notes |
|---|---|---|
| name | Yes | Lowercase a-z / 0-9 with single hyphens; ≤64 chars; must equal the directory name. |
| description | Yes | ≤1024 chars. Triggers-only; see Writing good descriptions. |
| license | No | Free-form string. |
| compatibility | No | Free-form string (≤500 chars). |
| metadata | No | Record<string, string>. |
| allowed-tools | No | Open-standard field; parsed and carried in v1 but not enforced. |
Any file under the skill directory other than SKILL.md is surfaced as a Level-3 resource (skill-relative path). See the @helix-agents/skill-fs reference for the resource read behavior (binary refusal, 64 KB cap, line ranges, traversal guard).
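The frontmatter-plus-body layout can be sketched as follows. This is a deliberate simplification, not the provider's parser (which uses the yaml package): it only handles flat "key: value" lines, and the `splitSkillMd` name and return shape are hypothetical.

```typescript
// Split a SKILL.md document into flat frontmatter fields and the markdown body.
// Simplification: no real YAML parsing, so nested values such as metadata are not supported.
function splitSkillMd(source: string): { fields: Record<string, string>; body: string } {
  const lines = source.split(/\r?\n/);
  if (lines[0] !== '---') {
    throw new Error('SKILL.md must start with a --- frontmatter fence');
  }
  const end = lines.indexOf('---', 1);
  if (end === -1) {
    throw new Error('unterminated frontmatter');
  }
  const fields: Record<string, string> = {};
  for (const line of lines.slice(1, end)) {
    const colon = line.indexOf(':');
    if (colon === -1) continue; // skip blank or non key/value lines
    fields[line.slice(0, colon).trim()] = line.slice(colon + 1).trim();
  }
  return { fields, body: lines.slice(end + 1).join('\n') };
}

const skillMd = [
  '---',
  'name: pdf-processing',
  'description: Extract text and tables from PDFs. Use when working with PDF files.',
  'license: MIT',
  '---',
  '# PDF processing',
  'Full instructions for working with PDFs…',
].join('\n');

const parsed = splitSkillMd(skillMd);
console.log(parsed.fields.name); // pdf-processing
console.log(parsed.body.split('\n')[0]); // # PDF processing
```

The required-field rules from the table (name equal to the directory name, description of at most 1024 characters) would be checked on `parsed.fields` after this split.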
The auto-injected tools
When an agent declares skills, two tools are auto-injected. Their names are reserved — user tools cannot shadow them, and skill names use [a-z0-9-] (no underscores) so they can never collide.
load_skill
load_skill({ name: "pdf-processing" })

Returns the full skill body as the tool result, wrapped in a <skill name="…">…</skill> block (plus a <skill_resources> listing if the skill bundles files). Emits an informational skill_loaded custom stream event ({ name }, surfaced as a data-skill_loaded AI-SDK event) that consumers MAY render. An unknown name returns a not-found message listing the available skills (it does not throw).
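The two outcomes described for load_skill (a wrapped body, or a non-throwing not-found message) might look like the sketch below. The `LoadableSkill` type, the `loadSkillResult` function, the exact not-found wording, and the layout inside the resource listing are all assumptions; only the `<skill name="…">` wrapper and the `<skill_resources>` tag name come from the description above.

```typescript
// Hypothetical in-memory shape for this sketch.
interface LoadableSkill {
  name: string;
  body: string;
  resources?: string[]; // skill-relative paths of bundled files
}

// Build the tool result text. Unknown names produce a message, never an exception.
function loadSkillResult(skills: LoadableSkill[], name: string): string {
  const skill = skills.find((s) => s.name === name);
  if (!skill) {
    const available = skills.map((s) => s.name).sort().join(', ');
    return `Skill "${name}" not found. Available skills: ${available}`;
  }
  const parts = [`<skill name="${skill.name}">`, skill.body, '</skill>'];
  if (skill.resources && skill.resources.length > 0) {
    parts.push('<skill_resources>', ...skill.resources, '</skill_resources>');
  }
  return parts.join('\n');
}

const library: LoadableSkill[] = [
  { name: 'pdf-processing', body: '# PDF processing', resources: ['references/forms.md'] },
  { name: 'deploy-runbook', body: '# Deploy runbook' },
];

console.log(loadSkillResult(library, 'deploy-runbook'));
console.log(loadSkillResult(library, 'missing-skill'));
```

Returning a message instead of throwing lets the model recover in the same turn by picking a name from the list.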
read_skill_file
read_skill_file({ skill: "pdf-processing", path: "references/forms.md", startLine: 1, endLine: 40 })

Returns the file contents wrapped in a <skill_file … untrusted="true"> block with a note instructing the model to treat the content as untrusted data. startLine/endLine are optional, 1-indexed, inclusive.
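The 1-indexed, inclusive range semantics are easy to get off by one, so here is a small sketch. The `sliceLines` helper is hypothetical, and clamping out-of-range values to the file bounds is an assumption; the guide does not say how the real tool treats them.

```typescript
// Return lines startLine..endLine (1-indexed, inclusive). Both bounds are optional.
// Assumption: out-of-range bounds are clamped rather than rejected.
function sliceLines(text: string, startLine?: number, endLine?: number): string {
  const lines = text.split('\n');
  const start = Math.max(1, startLine ?? 1);
  const end = Math.min(lines.length, endLine ?? lines.length);
  if (start > end) return '';
  // Array.prototype.slice is 0-indexed with an exclusive end, hence start - 1 and end.
  return lines.slice(start - 1, end).join('\n');
}

const sampleFile = 'line one\nline two\nline three\nline four';

console.log(sliceLines(sampleFile, 2, 3)); // prints "line two" and "line three"
console.log(sliceLines(sampleFile, 4)); // prints "line four"
```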
Preloaded skills
Some skills are relevant on every turn (a house style guide, a domain glossary, the one runbook this agent exists to run). For those, skip lazy loading and inject the body up front with preloadSkills.
AgentConfig.preloadSkills?: string[] injects the named skills' full bodies into the system prompt on every step — always in context, no load_skill call needed. The bodies render as an ### Active Skills (already loaded) block inside the same deterministically-sorted, cache-stable fragment as the catalog, so the prefix stays byte-stable per session.
defineAgent({
// …
skills: [deployRunbook, pdfProcessing],
preloadSkills: ['deploy-runbook'], // body always in context
});

Behavior:
- Preloaded skills also appear in the loadable catalog marked loaded="true", a static marker (cache-safe, decided at config time) that tells the model not to reload them, while load_skill remains a recovery path (e.g. if history compaction later drops the system-prompt-injected body, load_skill can re-fetch it).
- Each name must resolve in the agent's provider. Unknown names warn-and-skip at resolution time, so one bad name never crashes the agent or breaks the rest. (defineAgent() additionally throws at build time if a preloadSkills name isn't among the in-code skills array, or if a name is duplicated.)
- Sub-agents do NOT inherit a parent's preloadSkills (nor its skills); a sub-agent uses skills only if its own config declares them.
When to preload vs lazy-load. Preload a skill when it is relevant to (nearly) every turn and the body is small enough to justify always-on cost. Lazy-load when relevance is occasional — the catalog entry is cheap, and the model loads the body only when a task matches.
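The warn-and-skip behavior for unknown preloadSkills names can be sketched like this. The `resolvePreloads` function, its return shape, and the warning text are hypothetical; the sketch only mirrors the described policy that one bad name is skipped and the rest still resolve.

```typescript
interface PreloadResolution {
  resolved: string[]; // names whose bodies would be injected up front
  warnings: string[]; // one message per skipped name
}

// Resolve requested preload names against the names the provider actually has.
function resolvePreloads(
  requested: string[],
  available: ReadonlySet<string>,
): PreloadResolution {
  const resolved: string[] = [];
  const warnings: string[] = [];
  for (const name of requested) {
    if (available.has(name)) {
      resolved.push(name);
    } else {
      warnings.push(`preloadSkills: unknown skill "${name}" skipped`);
    }
  }
  resolved.sort(); // deterministic order keeps the rendered fragment stable
  return { resolved, warnings };
}

const resolution = resolvePreloads(
  ['pdf-processing', 'no-such-skill', 'deploy-runbook'],
  new Set(['deploy-runbook', 'pdf-processing']),
);

console.log(resolution.resolved); // [ 'deploy-runbook', 'pdf-processing' ]
console.log(resolution.warnings.length); // 1
```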
Writing good descriptions
The catalog description is the only thing the model sees before deciding to load a skill, so it is selection metadata, not documentation. Write it as a trigger, not a summary of the workflow.
- What + "Use when…". State what the skill does, then the conditions that should trigger it. Example: "Extract text and tables from PDFs, fill forms, merge documents. Use when working with PDF files."
- Triggers only. Do NOT summarize the step-by-step workflow — that belongs in the body, which the model gets after loading.
- Third person. Describe the skill, not the model ("Extracts…", not "You should extract…").
- Name rules. Skill name is lowercase a-z / 0-9 with single hyphens between segments (no leading/trailing/double hyphen, no underscores), ≤64 chars, and must not contain the words anthropic / claude. For the filesystem provider, the name must equal the skill's directory name.
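The name rules above translate directly into a small validator. This is a sketch of those rules as stated, not the framework's own validation code; `skillNameProblems` and the message wording are hypothetical.

```typescript
// One or more lowercase alphanumeric segments joined by single hyphens.
const SKILL_NAME_PATTERN = /^[a-z0-9]+(?:-[a-z0-9]+)*$/;
const RESERVED_WORDS = ['anthropic', 'claude'];

// Return a list of rule violations; an empty list means the name is valid.
function skillNameProblems(name: string): string[] {
  const problems: string[] = [];
  if (name.length > 64) {
    problems.push('longer than 64 characters');
  }
  if (!SKILL_NAME_PATTERN.test(name)) {
    problems.push('must be lowercase a-z/0-9 segments joined by single hyphens');
  }
  for (const word of RESERVED_WORDS) {
    if (name.includes(word)) {
      problems.push(`must not contain "${word}"`);
    }
  }
  return problems;
}

console.log(skillNameProblems('pdf-processing')); // []
console.log(skillNameProblems('deploy_runbook').length); // 1
console.log(skillNameProblems('claude-helper').length); // 1
```

Because underscores never match the pattern, a valid skill name can never equal load_skill or read_skill_file.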
Cross-runtime support
All skills logic lives in shared core, plus a one-line per-run catalog-resolution hook at each runtime's message-build call site (the same place memory retrieval resolves — where IO / non-determinism is allowed). The two tools therefore work on all runtimes; the catalog is threaded on JS / Temporal / Cloudflare / DBOS.
| Runtime | In-code provider | Filesystem provider |
|---|---|---|
| JS | ✅ | ✅ |
| Temporal | ✅ | ✅ (resolve catalog + tool reads in activities) |
| DBOS | ✅ | ✅ (resolve in steps) |
| Cloudflare DO / Workflows | ✅ | ❌ (no node:fs — use the in-code provider) |
fileSystemSkillProvider works wherever node:fs exists — i.e. NOT Cloudflare Workers; use the in-code provider there (or a future workspace/D1-backed provider). On Temporal/DBOS, filesystem IO must run where IO is allowed (the per-step activity/step), never in workflow code; in-code catalogs are deterministic data and need no special handling.
Cache behavior
Skill loading is purely additive by construction:
- The Level-1 catalog is a stable, deterministically-sorted system-prompt fragment that is NEVER annotated with per-skill loaded-state (no "✓ loaded" marks at runtime). Annotating it would make the cached prefix volatile and bust the cache on every load. "Already loaded" handling lives entirely in the load_skill result, never in the catalog. (Preloaded skills' loaded="true" marker is static, decided at config time, so it does not break stability.)
- Level-2 bodies and Level-3 resources arrive append-only as tool results, covered by the existing rolling cache breakpoints. There are zero changes to the cache strategy or the LLM adapter.
The only inherited caveat (identical to memory injection): on the turn a body lands, the rolling latest-turn breakpoint sits on/after it, so that one breakpoint won't cache-hit across that turn boundary; the system/tools/previous-turn breakpoints still do.
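A toy model shows why append-only loading keeps a prefix cache valid while annotating the catalog would not. Everything here is illustrative: the `Message` type and `serialize` function are hypothetical stand-ins, and real prompt caching operates on provider-side token prefixes rather than on strings like these.

```typescript
type Message = { role: 'system' | 'user' | 'assistant' | 'tool'; content: string };

// Stand-in for "the bytes the provider sees": one line per message.
function serialize(messages: Message[]): string {
  return messages.map((m) => `${m.role}:${m.content}`).join('\n');
}

const catalog = '## Skills - deploy-runbook - pdf-processing';
const beforeLoad: Message[] = [
  { role: 'system', content: catalog },
  { role: 'user', content: 'Please merge these two PDFs.' },
];

// Append-only: the skill body arrives as a tool result at the end of history.
const afterLoad: Message[] = [
  ...beforeLoad,
  { role: 'assistant', content: 'load_skill(pdf-processing)' },
  { role: 'tool', content: '<skill name="pdf-processing">…</skill>' },
];

// Counter-example: marking the catalog entry as loaded rewrites the system message.
const annotated: Message[] = [
  { role: 'system', content: `${catalog} ✓ loaded` },
  ...afterLoad.slice(1),
];

console.log(serialize(afterLoad).startsWith(serialize(beforeLoad))); // true
console.log(serialize(annotated).startsWith(serialize(beforeLoad))); // false
```

In the first case everything previously sent is still a prefix of the new request; in the second the change sits at the very start, so nothing after it could be reused.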
Limitations (v1)
- Re-loading a skill returns the body again. ToolContext exposes no transcript access, so the load_skill tool cannot dedup on its own. v1 returns the body on every call (correct + cache-safe; rarely triggered because the model sees its own prior load_skill results). collectLoadedSkillNames(messages) ships as the building block for programmatic dedup, but the dispatch-layer short-circuit is deferred.
- The filesystem traversal guard is lexical. It rejects resolved paths that escape the skill dir but does NOT follow symlinks, so a symlink inside the skill directory that points outside it is not detected. This is safe for operator-provisioned skill directories.
- fs staleness re-scan detects root-entry add/remove, not in-place edits. Editing an existing skill's files in place won't be picked up until restart or a touch of the root directory.
- No token budget. There is no cap on catalog size or preloaded-body size (future work).
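For the traversal-guard limitation, a lexical check of the kind described might look like the sketch below. It is an illustration, not the skill-fs implementation: `resolveInsideSkillDir` is a hypothetical name, and POSIX-style paths are assumed. The `posix.resolve` and `posix.relative` calls come from Node's `node:path` module.

```typescript
import { posix } from 'node:path';

// Resolve a requested skill-relative path and reject anything that lexically
// escapes the skill directory. Symlinks are NOT followed, so a link inside the
// directory that points elsewhere would still pass this check.
function resolveInsideSkillDir(skillDir: string, requested: string): string | undefined {
  const root = posix.resolve(skillDir);
  const target = posix.resolve(root, requested);
  const rel = posix.relative(root, target);
  const escapes =
    rel === '' || // the directory itself is not a readable file
    rel === '..' ||
    rel.startsWith('../') ||
    posix.isAbsolute(rel);
  return escapes ? undefined : target;
}

const skillDir = '/srv/skills/pdf-processing';

console.log(resolveInsideSkillDir(skillDir, 'references/forms.md'));
// /srv/skills/pdf-processing/references/forms.md
console.log(resolveInsideSkillDir(skillDir, '../deploy-runbook/SKILL.md')); // undefined
console.log(resolveInsideSkillDir(skillDir, '/etc/passwd')); // undefined
```

Closing the symlink gap would require resolving real paths on disk (for example with `fs.realpath`) before the comparison, which this lexical version intentionally skips.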
Next steps
- Skills (Progressive Disclosure) — internals — the design deep-dive.
- @helix-agents/core reference: AgentConfig.skills, the types, and the public helpers.
- @helix-agents/skill-fs reference: fileSystemSkillProvider options and behavior.
- @helix-agents/skill-cli reference: the build-time baker for remote skill packages.
- Sub-Agents: sub-agents scope their own skills (no inheritance).
- Workspaces — run the scripts a skill discloses via shell/code tools.