the quality gate for Agent Skills
Does your skill trip on
the right prompts?
Everything you need to confirm a skill works and follows best
practices — in one tool. Tripwire lints your SKILL.md, probes which
prompts actually activate it with real Claude sessions, and gates every PR in CI.
npm install -g tripwire-skills
Best-practice lint
Instant, offline checks: description starts with “Use when”, stays under 1024 chars, and avoids workflow summaries; name is kebab-case; the body has real examples and isn’t a stub.
Real activation coverage
Generates a prompt matrix across four zones and runs real Claude sessions to map what fires, what silently misses, and what false-triggers — the check a linter can’t do.
CI-ready regression gate
Commit the generated scenarios and rerun them deterministically — in CI or locally — to catch activation regressions before they ship.
the part a linter can’t catch
Activation coverage, measured against real runs
A skill’s description decides whether Claude invokes it. Tripwire generates
prompts across four zones, runs them through real sessions, and reports exactly where your
skill over- or under-fires.
Coverage Report: brainstorming Core triggers 7/8 activated (88%) ✓ Adjacent/edge 3/8 activated (38%) ⚠ Negatives 1/8 activated (13%) ✓ Keyword variants 4/5 activated (80%) ✓ ─ GAPS ─────────────────────────── ✗ "what's the best way to approach X?" [adjacent — miss] ✗ "brainstorm ideas for Y" [variant — miss] ─ SUGGESTIONS ───────────────────── 1. Add "brainstorm", "think through" to the description 2. Cover "approach" questions — add "planning how to approach"
runs on every pull request
The same checks, as a GitHub Action
Gate skill changes the way you gate lint and tests. Add one workflow file — it lints changed skills, runs the coverage probe with your own API key in your own runner, annotates the diff, and posts a sticky summary. The key never leaves your CI.
# .github/workflows/tripwire.yml name: Tripwire on: pull_request jobs: skills: runs-on: ubuntu-latest permissions: { contents: read, pull-requests: write } steps: - uses: actions/checkout@v4 with: { fetch-depth: 0 } - uses: bharath31/tripwire@v1 with: probe: true anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }}
No key on a fork PR? The probe skips with a notice — lint still runs and gates the PR.
as your skills grow
More than lint, and more than one skill at a time
Skill-set conflict detection
Two skills that each lint clean can still fight at runtime. Scan a whole directory for name collisions and description overlap — skills shadowing each other.
Outcome evals
Did the skill actually do the right thing, not just fire? Author assertions plus an optional LLM-judge rubric per case.
Model-drift detection
A scenarios file that passed in March can silently regress in June. Re-probe on a schedule and fail loudly when a skill’s activation behavior drifts.
Cross-agent probing
Skills now run across 30+ agents on one SKILL.md. Check that yours fires in each agent you ship to — not just Claude Code.
Pluggable rules
extends: tripwire:recommended, ESLint-style. Turn rules off, change severity, or add org-specific checks as plain JS — applied everywhere, including the Action.
Badges & auto-fix
A live README coverage badge, plus one-command fixes for the mechanically-safe lint issues. Tripwire also ships as a skill and as an early VS Code extension.
Try the lint check, right here
The real lint engine, bundled to your browser — an instant taste. The full activation-coverage probe runs from the CLI and the Action (it needs real Claude sessions). Nothing leaves this page.