06-reference

superpowers plugin analysis

Fri Apr 03 2026 · analysis · source: claude-plugins-official/superpowers

Claude Code Official Plugins — Analysis

There is no “Superpowers” plugin. The name doesn’t exist in the official marketplace. What does exist is a rich set of individual plugins at ~/.claude/plugins/marketplaces/claude-plugins-official/plugins/. This analysis covers the key ones and what patterns we can borrow.

Plugin Inventory

32 plugins total. The ones relevant to our work:

| Plugin | Type | What it does |
| --- | --- | --- |
| skill-creator | Skill | Meta-skill for creating, evaluating, and iterating on skills |
| feature-dev | Command + Agents | Phased feature development with parallel subagents |
| code-review | Command | PR review with 5 parallel Sonnet agents + confidence scoring |
| code-simplifier | Agent | Post-edit code cleanup (Opus model) |
| playground | Skill | Interactive single-file HTML playground builder |
| plugin-dev | Skills (7) | Full plugin/skill/agent authoring reference |
| ralph-loop | Command + Hook | Self-referential iteration loop via Stop hook |
| frontend-design | Skill | Anti-AI-slop frontend generation |

Architecture Patterns

1. Three-Level Progressive Disclosure

The central organizing principle for skills:

  1. Metadata (name + description) — always loaded, ~100 words. This is the trigger surface.
  2. SKILL.md body — loaded when skill triggers, target <500 lines / <5k words.
  3. Bundled resources (scripts/, references/, assets/) — loaded on demand, unlimited size. Scripts can execute without being read into context.
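As a concrete sketch of how the three levels map onto disk (directory and file names here are illustrative, not taken from any actual plugin):

```
my-skill/
├── SKILL.md          # level 1 (frontmatter: name + description) + level 2 (body)
├── scripts/          # level 3: executed directly, never loaded into context
│   └── validate.sh
├── references/       # level 3: read only when Claude decides it needs them
│   └── edge-cases.md
└── assets/           # level 3: templates, images, and other bundled files
```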

This is genuinely well-designed. The key insight: scripts in scripts/ save tokens because they execute without needing to be loaded into the context window. Reference docs in references/ only load when Claude determines they’re needed.

What we can borrow: We already do the SKILL.md pattern. We don’t consistently use references/ for overflow content or scripts/ for deterministic work. We should adopt both.

2. Subagent Fan-Out Pattern

Feature-dev and code-review both use the same pattern: spawn multiple specialized subagents in parallel, each with a different focus, then synthesize results.

Feature-dev phases:

Code-review agents:

  1. CLAUDE.md compliance auditor
  2. Shallow bug scanner (changes only, no extra context)
  3. Git blame/history analyst
  4. Prior PR comment checker
  5. Code comment compliance checker

Then a second wave of Haiku agents scores each finding 0-100, and only issues >= 80 get reported.
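The two-wave filter above can be sketched in a few lines. This is a minimal illustration of the pattern, not the plugin's implementation; the field names and rubric wording are assumptions, and only the anchor levels (0/25/50/75/100) and the >= 80 threshold come from the source.

```python
# Two-wave review sketch: a first (stronger) pass produces findings, a
# cheaper second pass scores each one 0-100, and only high-confidence
# issues are reported. Names and rubric text here are illustrative.
from dataclasses import dataclass

@dataclass
class Finding:
    description: str
    confidence: int = 0  # 0-100, assigned by the second-wave scorer

# Illustrative rubric anchored at the five levels the plugin uses.
RUBRIC = {
    0: "false positive / cannot reproduce",
    25: "plausible but speculative",
    50: "likely real, low impact",
    75: "verified against the diff",
    100: "certain, reproducible, high impact",
}

THRESHOLD = 80  # only findings scored >= 80 get reported

def report(findings: list[Finding]) -> list[Finding]:
    """Second-wave filter: drop anything below the confidence threshold."""
    return [f for f in findings if f.confidence >= THRESHOLD]

findings = [
    Finding("unused import", confidence=25),
    Finding("null deref on error path", confidence=100),
]
print([f.description for f in report(findings)])  # only the null deref survives
```

The point of the threshold is that a cheap scorer only has to be good at rejecting weak findings, not at discovering new ones.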

What’s novel: The two-wave pattern (detect with Sonnet, then score with Haiku) is a smart cost/quality tradeoff. The confidence scoring rubric (0/25/50/75/100 with specific criteria at each level) is worth stealing wholesale.

What we can borrow: The confidence-scored filtering pattern. We do subagent fan-out already, but we don’t do the second-pass scoring to filter false positives. That’s a meaningful quality improvement for any review or analysis task.

3. Eval-Driven Skill Development (skill-creator)

This is the most sophisticated plugin. The loop:

  1. Capture intent through interview
  2. Write SKILL.md draft
  3. Create 2-3 test prompts
  4. Spawn parallel subagents: with-skill vs. baseline (no skill or old skill)
  5. While runs execute, draft quantitative assertions
  6. Grade results with a grader subagent
  7. Aggregate benchmarks (pass rate, time, tokens, with mean +/- stddev)
  8. Launch HTML eval viewer for human review
  9. Read feedback, improve skill, repeat
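Step 7 above (aggregate pass rate, time, and tokens with mean +/- stddev) is mechanical enough to sketch. The record shape and field names below are assumptions for illustration; only the metrics themselves come from the source.

```python
# Aggregate repeated eval runs into pass rate and mean +/- stddev.
# The run records here are fabricated examples; field names are assumed.
from statistics import mean, stdev

runs = [
    {"passed": True,  "seconds": 41.0, "tokens": 12_300},
    {"passed": True,  "seconds": 38.5, "tokens": 11_900},
    {"passed": False, "seconds": 55.2, "tokens": 15_800},
]

pass_rate = sum(r["passed"] for r in runs) / len(runs)
for field in ("seconds", "tokens"):
    vals = [r[field] for r in runs]
    print(f"{field}: {mean(vals):.1f} +/- {stdev(vals):.1f}")
print(f"pass rate: {pass_rate:.0%}")
```

Running each query 3x (as the plugin does for trigger rates) is the minimum for stdev to be meaningful at all.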

Key details:

What’s novel: The grader-critiques-the-evals pattern. Having the grader not just pass/fail assertions but also flag weak assertions and suggest missing ones creates a self-improving eval loop. Also: running each eval query 3x to get reliable trigger rates, and the train/test split for description optimization to avoid overfitting.

What we can borrow: The description optimization loop is directly applicable to our skills. Our skill descriptions are hand-written and probably undertrigger. The “pushy description” guidance is a quick win — making descriptions slightly aggressive to combat Claude’s tendency to not invoke skills.
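To make the "pushy description" idea concrete, here is a hypothetical before/after in SKILL.md frontmatter. The skill name and every phrase are invented for illustration; the principle (enumerate trigger phrases, tell Claude to fire eagerly) is the source's.

```yaml
# Hypothetical frontmatter. A passive description like
#   description: Helps with CSV parsing.
# tends to undertrigger. The pushy version:
name: csv-wrangler
description: >-
  Use this skill whenever the user mentions CSV, TSV, spreadsheets,
  delimited files, or asks to clean, merge, or reshape tabular data.
  Trigger even when the request only implies tabular data.
```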

4. Self-Referential Iteration (ralph-loop)

Uses a Stop hook to intercept Claude’s exit and feed the same prompt back in. The key insight: the prompt stays constant but the file system changes between iterations, so Claude sees its own prior work.

What’s novel: Using hooks.json to create a feedback loop without an external bash wrapper. Elegant.
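A minimal sketch of what such a hook configuration looks like. The schema below is an approximation from memory of Claude Code's hook settings, and `ralph-check.sh` is a hypothetical script name; consult the hooks documentation before relying on the exact shape.

```json
{
  "hooks": {
    "Stop": [
      {
        "hooks": [
          { "type": "command", "command": "./scripts/ralph-check.sh" }
        ]
      }
    ]
  }
}
```

The script decides whether to let the session end or to block the stop and feed the original prompt back in, which is what makes the loop self-referential without any external wrapper.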

What we already do: We have the /loop skill for recurring tasks. Ralph-loop is specifically for “run to completion” tasks with clear success criteria (tests passing, etc). Different use case.

Specific Skill Analysis

Playground Builder

Generates self-contained single-file HTML tools with:

Has 6 templates: design, data-explorer, concept-map, document-critique, diff-review, code-map. Each template defines layout, control types, rendering approach, and prompt output format.

The concept-map template is particularly interesting — it creates a canvas-based knowledge explorer where users drag nodes, draw relationship edges, mark their knowledge level (know/fuzzy/unknown), and it generates a targeted learning prompt from their markings.

What we can borrow: The “playground as prompt builder” pattern. Build a visual tool, user configures it, output is a natural-language prompt they copy back into Claude. This is a great pattern for complex configuration tasks.

Frontend Design

An anti-AI-slop manifesto. Key rules:

What we can borrow: The framing. If we ever build frontend skills, this is the right attitude. But more broadly, the “commit to a bold direction” principle applies to any creative output skill.

Code Simplifier

Runs as an Opus-model agent that automatically reviews recently modified code for:

What we can borrow: The “autonomous post-edit cleanup” pattern. We have /simplify registered but could use this as a model for what it should actually check.

Writing Style Patterns

Two distinct conventions depending on the component type:

| Component | Voice | Example |
| --- | --- | --- |
| Skills (SKILL.md body) | Imperative/infinitive | “Parse the frontmatter using sed.” |
| Agents (system prompts) | Second person | “You are an expert code analyst…” |
| Descriptions | Third person | “This skill should be used when the user asks to…” |

The skill-creator’s writing guidance is the most interesting part:

What’s Genuinely Novel vs. What We Already Do

Novel (worth adopting)

  1. Confidence-scored two-pass review — detect with strong model, score with cheap model, filter below threshold
  2. Grader that critiques its own evals — self-improving evaluation loop
  3. Description optimization loop — systematic trigger testing with train/test split
  4. Progressive disclosure with scripts/ — deterministic scripts that execute without context window cost
  5. Pushy descriptions — deliberately aggressive skill descriptions to combat undertriggering
  6. Playground-as-prompt-builder — visual configuration that outputs natural language prompts

Already doing (validated by seeing it here)

  1. Subagent fan-out for parallel analysis
  2. SKILL.md as the skill format
  3. Phased development workflows (discovery -> design -> implement -> review)
  4. Loop-based iteration patterns
  5. Markdown-based agent definitions with frontmatter

Not relevant to us

  1. .skill packaging/distribution (we’re a single-org setup)
  2. HTML eval viewer (our eval loop is conversational)
  3. Claude.ai/Cowork compatibility layers
  4. Plugin marketplace structure

Action Items

  1. Audit our skill descriptions — apply the “pushy description” pattern. Add more trigger phrases. Test whether skills actually fire on realistic prompts.
  2. Add references/ and scripts/ dirs to skills that have grown large. Move overflow content out of SKILL.md.
  3. Adopt confidence scoring for any review/analysis skill. The 0-100 rubric with specific criteria at each level is directly reusable.
  4. Consider a description optimization pass on our highest-value skills. The systematic approach (generate test queries, iterate on description, measure trigger rate) is worth doing even manually.
  5. Bundle repeated scripts. If we notice Claude writing the same helper code across invocations of a skill, freeze that script into scripts/.