Plan: Create refine-prompt skill

Objective

Ship a refine-prompt skill that interviews users about their intent and constraints, then generates a well-structured, copy-pasteable AI prompt grounded in Anthropic’s official Claude Prompting Best Practices — applying the right combination of clarity, XML structure, role assignment, few-shot examples, chain-of-thought scaffolding, and output formatting for each prompt type — plus recommendations for installed skills, agents, and workflows that can achieve the goal directly.

Implement

Read and implement all steps in the plan at docs/plans/create-refine-prompt-skill.html — Ship a refine-prompt skill that interviews users and generates optimized AI prompts

Workflow

Run a workflow to implement the plan at docs/plans/create-refine-prompt-skill.html — Ship a refine-prompt skill that interviews users and generates optimized AI prompts

File create-prompt-refiner-skill.html

Path docs/plans/create-prompt-refiner-skill.html

Acceptance criteria 9 / 9 done

Context

Anthropic publishes a comprehensive Claude Prompting Best Practices guide covering clarity, XML structuring, few-shot examples, role assignment, thinking/CoT, output formatting, and agentic patterns. Most users never read it — they write vague instructions, skip context, omit output format specs, and don't leverage techniques like <example> tags, <thinking> scaffolds, or document-grounding patterns that the guide recommends.

The agentics marketplace already ships 12 plugins with dozens of skills and agents. A refine-prompt skill closes the loop: instead of the user guessing the right prompt or the right technique from the guide, the skill interviews them, applies the relevant best practices automatically (clarity → XML structure → role → examples → CoT → output format), and surfaces installed capabilities that match their intent — turning every user into a power user without requiring them to study the guide.

This skill lives inside the existing plan-agent plugin at kit/plugins/plan-agent/skills/refine-prompt/, since both plan-agent and refine-prompt are about structuring user intent into well-formed output. Adding it to plan-agent avoids creating a standalone plugin for a single skill and follows the domain-container naming convention. The references/ directory includes a distilled version of the Anthropic best practices organized by technique, so the skill can cite specific guidelines as it builds the prompt.

Steps

done Add the refine-prompt skill directory inside the existing plan-agent plugin

The plan-agent plugin already exists with its own plugin.json, skills, and marketplace entry. Adding a new skill directory under kit/plugins/plan-agent/skills/refine-prompt/ (with references/ for templates) is all that’s needed — no new plugin scaffold, no new plugin.json, no marketplace registration.

Verify

Run ls kit/plugins/plan-agent/skills/refine-prompt/ and confirm the directory exists with a references/ subdirectory. Confirm kit/plugins/plan-agent/.claude-plugin/plugin.json still has "name": "plan-agent" and was not modified.

done Write the core SKILL.md with the six-phase best-practices pipeline

The SKILL.md is the runtime artifact — its frontmatter controls activation and its body defines the entire workflow. The pipeline maps directly to Anthropic’s best practices guide: Classify (identify prompt type and select applicable techniques) → Interview (gather context per “Add context to improve performance”) → Structure (apply XML tags, role, examples per “Structure prompts with XML tags” and “Give Claude a role”) → Draft (assemble the prompt with CoT scaffolding per “Leverage thinking capabilities”) → Recommend (surface matching skills/agents) → Deliver (present in a copy-pasteable fenced block).

Verify

Read kit/plugins/plan-agent/skills/refine-prompt/SKILL.md and confirm: (1) YAML frontmatter includes name, description, allowed-tools (with ToolSearch listed for the Recommend phase), and disable-model-invocation: true; (2) body contains all six workflow phases with explicit references to the Anthropic best practices each phase implements; (3) AskUserQuestion is used in the Interview phase with batched questions.

done Build the prompt-type classifier with a best-practices layer matrix

The Anthropic guide documents seven core techniques (clarity, context/motivation, XML tags, role assignment, few-shot examples, thinking/CoT, output formatting). Different prompt types need different subsets. The classifier maps each type to its applicable layers so the Draft phase assembles only the relevant techniques. For example: a system prompt applies role + XML structure + guardrails + output formatting; a task prompt applies clarity + CoT + examples + output formatting; a creative prompt applies role + tone + context; an analytical/research prompt applies long-context patterns + quote grounding + CoT + self-check.

Verify

Confirm the Classify phase in SKILL.md lists at least four prompt types (system, task, creative, analytical) and defines a technique matrix mapping each type to applicable best-practice layers (clarity, XML structure, role, examples, CoT, output format, self-check). Each type must have a distinct combination — no two types should have identical layer sets.

done Build the clarifying-questions engine grounded in “Add context to improve performance”

The Anthropic guide’s “Add context to improve performance” principle states that providing motivation behind instructions helps Claude deliver more targeted responses. The interview phase implements this by extracting the user’s why, not just their what. Questions are derived from the classified type and its technique matrix: a system prompt interview asks about persona, tone, boundaries, and safety constraints (feeding the Role and XML layers); a task prompt asks about input/output formats, edge cases, and success criteria (feeding the Clarity, CoT, and Output Format layers). Progressive depth means: ask the critical 2–3 questions first, then offer to go deeper if the user wants a more refined prompt.

Verify

Confirm the Interview phase uses AskUserQuestion with type-specific questions (not generic ones) that map to the technique matrix from Step 3. Verify progressive depth: the first AskUserQuestion batch asks 2–3 essential questions, and a follow-up batch is offered only when the user opts in. Confirm each prompt type has its own question set, and that at least one question per type asks why (the motivation/context behind the request).

done Write the prompt-generation templates in references/ implementing specific best-practices techniques

The Anthropic guide’s techniques must be baked directly into each template as structural scaffolding — not left to the Draft phase to improvise. Each template implements the techniques from its row in the Step 3 matrix: the system-prompt template uses <role> tags (“Give Claude a role”), <instructions> and <constraints> XML wrappers (“Structure prompts with XML tags”), and output-format indicators (“Be direct about the desired output”). The task-prompt template adds <thinking> scaffolding (“Leverage thinking capabilities”), <example> tag slots (“Provide examples”), and step-by-step instruction blocks (“Be clear and direct”). The analytical template includes document-grounding patterns with <document> tags and quote-extraction (“Long-context prompting”) plus self-check instructions. The creative template uses tone-setting role assignments and positive-instruction framing. A separate references/best-practices-reference.md file provides a distilled summary of all seven techniques for the SKILL.md to cite.

Verify

Run ls kit/plugins/plan-agent/skills/refine-prompt/references/ and confirm at least five files: system-prompt-template.md, task-prompt-template.md, creative-prompt-template.md, analytical-prompt-template.md, and best-practices-reference.md. Read each template and confirm: (1) system template uses <role>, <instructions>, <constraints> XML tags; (2) task template includes <thinking> scaffolding and <example> slots; (3) analytical template includes <document> grounding and self-check blocks; (4) creative template uses role-based tone-setting. Confirm best-practices-reference.md covers all seven core techniques from the Anthropic guide.

done Implement the skill/agent/workflow recommendation engine grounded in “Tool use and agentic patterns”

The Anthropic guide’s “Tool use” and “Agentic coding” sections describe how Claude works best when given concrete tools with clear descriptions. The recommendation engine applies this principle in reverse: it queries the session’s live tool registry via ToolSearch with keyword queries derived from the user’s intent, and surfaces the most relevant tools. ToolSearch is session-scoped (only returns tools actually loaded), includes MCP tools, and requires no filesystem path assumptions. This turns a generated prompt into an actionable recommendation — “you could also run /code-review for this” — and when the recommended tool is itself the better path, prevents the user from needing a prompt at all.

Verify

Confirm the Recommend phase in SKILL.md extracts 2–3 keyword queries from the user's stated intent and calls ToolSearch (keyword search mode, max_results: 5) for each. Verify results are deduplicated and filtered to skills/commands/agents (excluding low-level tools like Read/Edit). Verify the output format: a ranked list of 1–3 recommendations with invocation syntax (e.g. /plugin:command) and a one-line explanation of why each matches. Confirm the skill explains when a recommended tool supersedes the prompt entirely (i.e. the user’s intent is best served by running the tool directly rather than crafting a prompt). Confirm ToolSearch is listed in allowed-tools.

done Bump plan-agent version in .claude-plugin/marketplace.json and update CHANGELOG.md

Adding a new skill is a minor version bump per the marketplace versioning convention. The plan-agent plugin is already registered — no new entry is needed, just a version increment and a changelog entry documenting the new refine-prompt skill.

Verify

Read .claude-plugin/marketplace.json and confirm the plan-agent entry’s version has been bumped (minor increment). Read kit/plugins/plan-agent/CHANGELOG.md and confirm a new entry documents the addition of the refine-prompt skill. Run cat .claude-plugin/marketplace.json | python3 -m json.tool to confirm valid JSON.

done Add a refine-prompt section to plan-agent’s existing README

The plan-agent README already documents existing skills. Adding a section for the new refine-prompt skill keeps all plugin documentation in one place. The section must explain that prompts are generated using Anthropic’s Claude Prompting Best Practices as the foundation, list the seven techniques, and show a concrete before/after example.

Verify

Read kit/plugins/plan-agent/README.md and confirm it contains a refine-prompt section with: overview (mentioning and linking to the Anthropic best practices guide), usage (at least one before/after example showing a vague request transformed into a structured prompt), and a feature matrix (table mapping prompt types to which of the seven best-practices techniques each type applies).

Acceptance Criteria

kit/plugins/plan-agent/skills/refine-prompt/ directory exists with SKILL.md and references/ subdirectory inside the existing plan-agent plugin
kit/plugins/plan-agent/skills/refine-prompt/SKILL.md has valid frontmatter (name, description, allowed-tools, disable-model-invocation: true) and body under 500 lines
The skill classifies prompts into at least four types (system, task, creative, analytical) with a technique matrix mapping each type to applicable best-practices layers from the Anthropic guide
The interview phase uses AskUserQuestion with type-specific questions derived from the technique matrix, asks the user’s why (per “Add context to improve performance”), and offers progressive depth
Generated prompts use XML tags (<instructions>, <context>, <example>) per “Structure prompts with XML tags”, role assignment per “Give Claude a role”, <thinking> scaffolding per “Leverage thinking capabilities”, and output format indicators per “Be direct about the desired output” — each applied only when the technique matrix selects it for the classified prompt type
The recommendation engine uses ToolSearch to query the session’s live tool registry and surfaces 1–3 matching tools with invocation syntax
.claude-plugin/marketplace.json has the plan-agent version bumped (minor increment) and kit/plugins/plan-agent/CHANGELOG.md documents the new skill
The final prompt output is presented in a fenced code block the user can copy-paste directly into any AI agent, and uses positive instructions (per “Be direct about the desired output”) rather than telling Claude what not to do
A references/best-practices-reference.md file exists containing a distilled summary of all seven core techniques from the Anthropic guide, organized by technique name with actionable implementation notes

Verification

Load the plugin with claude --plugin-dir ./kit/plugins/plan-agent and run four end-to-end tests, each verifying that the generated prompt applies the specific Anthropic best practices techniques selected by the technique matrix:

System prompt flow: Tell the skill you want a system prompt for a customer support chatbot. Confirm it asks about persona, tone, boundaries, and safety constraints (per “Add context”). Verify the output uses <role> tags (per “Give Claude a role”), <instructions> and <constraints> XML wrappers (per “Structure prompts with XML tags”), and output-format indicators (per “Be direct about the desired output”).
Task prompt flow: Ask for a prompt to refactor a Python module. Confirm it asks about input code, target patterns, and the why behind the refactor. Verify the output includes <thinking> scaffolding (per “Leverage thinking capabilities”), <example> tag slots (per “Provide examples”), and step-by-step instructions (per “Be clear and direct”).
Analytical prompt flow: Ask for a prompt to analyze a research paper. Confirm it asks about the document context and desired analysis depth. Verify the output includes <document> grounding tags (per “Long-context prompting”), quote-extraction instructions, and a self-check block.
Recommendation check: In all flows above, verify the skill surfaces at least one relevant installed skill (e.g. /code-review for the refactoring task). Confirm the recommendation includes the invocation command and a brief rationale explaining when the tool supersedes the prompt.

Completion Checklist

Required

All step TODOs marked as done
All acceptance criteria verified and checked off
Plan status updated to completed

Completion Report

Implemented as the craft-prompt skill (PR #248, commit 2672be2; Phase 7 “Save” added in PR #249), then renamed to refine-prompt — the plan’s original name — in v2.0.0 on 2026-06-10. The skill now lives at kit/plugins/plan-agent/skills/refine-prompt/, invoked as /plan-agent:refine-prompt. All functional requirements met. Verification evidence per check (recorded pre-rename; craft-prompt paths below now correspond to skills/refine-prompt/):

step-1-directory: kit/plugins/plan-agent/skills/craft-prompt/ exists with SKILL.md and a references/ subdirectory containing 5 files (system-prompt-template.md, task-prompt-template.md, creative-prompt-template.md, analytical-prompt-template.md, best-practices-reference.md) — naming delta only: plan says refine-prompt, implemented as craft-prompt per PR #248 mapping. kit/plugins/plan-agent/.claude-plugin/plugin.json:2 has "name": "plan-agent", parses as valid JSON, and grep confirms no "version" key anywhere in the file; version is correctly held only in .claude-plugin/marketplace.json:253 ("version": "1.11.0").
step-2-skill-md-pipeline: kit/plugins/plan-agent/skills/craft-prompt/SKILL.md satisfies all three requirements: (1) frontmatter has name (line 2), description (lines 3-6), disable-model-invocation: true (line 7), and allowed-tools listing ToolSearch (lines 9-10); (2) all six phases are present as headings — Classify (line 30), Interview (line 64), Structure (line 115), Draft (line 143), Recommend (line 180), Deliver (line 201) — with explicit Anthropic best-practice references: the Classify technique matrix is labeled "Anthropic best-practice layers" naming role assignment/XML structure/thinking-CoT/output format (lines 42-50), Interview cites "Add context to improve performance" verbatim (line 67), Structure applies the matrix's XML/role/examples/CoT layers (lines 117-139), and Draft cites "Be direct about the desired output" verbatim under "writing rules from Anthropic's best practices" (lines 170-173); (3) the Interview phase uses AskUserQuestion "with a batched set of 2-3 essential questions" per type (line 70) plus an optional second batch (lines 109-111). Naming delta only: the skill ships as craft-prompt rather than the plan's refine-prompt path, which passes per the naming-mapping rule; the extra Phase 7 — Save (line 228) is the explicitly-acceptable addition.
step-3-classifier-matrix: Phase 1 — Classify in kit/plugins/plan-agent/skills/craft-prompt/SKILL.md lists the four required types (lines 35-40) and defines a "technique matrix" table (lines 42-50) with four distinct layer sets: system = role assignment + XML structure (instructions/constraints) + output format + guardrails (line 47); task = clarity/directness + XML structure (context/example) + thinking/CoT + output format (line 48); creative = role assignment + tone/voice + context/motivation + output format + positive framing (line 49); analytical = long-context patterns (document/quote) + thinking/CoT + self-check + output format (line 50). No two sets are identical, and they match the plan's own exemplar mapping in Step 3 (docs/plans/create-prompt-refiner-skill.html:1124), covering the best-practice pool (clarity, XML structure, role, examples via the <example> tag, CoT, output format, self-check). Naming delta only: the skill ships as craft-prompt instead of the plan's refine-prompt path — substance fully present.
step-4-interview-engine: Phase 2 of kit/plugins/plan-agent/skills/craft-prompt/SKILL.md uses AskUserQuestion with "a batched set of 2-3 essential questions determined by the classified type" (line 70) and offers a follow-up batch only on user confirmation ("Only run a second AskUserQuestion batch if the user confirms", lines 109-111). All four types have their own question set — system (lines 73-79), task (81-88), creative (90-98), analytical (100-107) — each question annotated "(feeds Role/Clarity/CoT/Output Format/Long-context...)" mapping directly to the Phase 1 technique matrix (lines 45-50), and each set contains an explicit why question (lines 78, 87, 98, 106) per "Add context to improve performance" (cited at line 67). Naming delta only: the skill is craft-prompt rather than the plan's refine-prompt path — substance matches plan Step 4 (plan HTML lines 1141-1144); minor non-blocking note: the creative set lists 4 candidate questions while line 70 caps the first batch at 2-3.
step-5-templates: All five required files exist in kit/plugins/plan-agent/skills/craft-prompt/references/ (system-prompt-template.md, task-prompt-template.md, creative-prompt-template.md, analytical-prompt-template.md, best-practices-reference.md). Substance verified per template: system-prompt-template.md has <role>/<instructions>/<constraints> (lines 12/16/25); task-prompt-template.md has <example> input/output slots (lines 16-19) and <thinking> scaffolding (lines 21-26); analytical-prompt-template.md has <document><content> grounding (lines 12-15) and a "Self-check before responding:" block (lines 29-32); creative-prompt-template.md sets tone via <role>You are {{CREATIVE_ROLE}}. Your voice is {{VOICE_DESCRIPTION}}.</role> (lines 12-14) with style/tone requirements (lines 24-28); best-practices-reference.md covers all seven core techniques as sections 1-7 (Clarity and Directness line 7, Context and Motivation line 21, XML Tags line 35, Role Assignment line 52, Few-Shot Examples line 66, Thinking/CoT line 80, Output Formatting line 94). Naming delta only: templates say "Used by `craft-prompt`" (e.g., system-prompt-template.md line 3) instead of the plan's "refine-prompt" — naming difference, not a substance gap.
step-6-recommend-engine: Phase 5 — Recommend in /Users/shawnsandy/devbox/agentics/.claude/worktrees/trusting-grothendieck-b678fe/kit/plugins/plan-agent/skills/craft-prompt/SKILL.md satisfies every requirement: lines 186-191 require "ToolSearch with 2–3 keyword queries derived from the user's intent" run "for each keyword phrase" (keyword search mode); lines 192-193 require "Deduplicate results, filter to skills/commands/agents only (not filesystem tools)"; lines 194-195 require "Present the top 1–3 matches with: the invocation command, a one-sentence description, and a note on when it supersedes a custom prompt" (concrete invocation-syntax example "/code-review — Reviews code for bugs and quality..." at lines 219-220); lines 182-184 state matches "may supersede the need for a custom prompt entirely"; and ToolSearch is listed in allowed-tools at lines 9-10. The literal `max_results: 5` parameter from the plan is not written out, but the phase gives explicit result-count guidance ("top 1–3 matches", line 194), so per the check rule this is a noted delta, not a defect. Naming delta only: the skill ships as craft-prompt (kit/plugins/plan-agent/skills/craft-prompt/) instead of the plan's refine-prompt — substance is fully present, so this passes.
step-7-version-changelog: Commit 2672be2 bumped the plan-agent entry in .claude-plugin/marketplace.json from 1.3.2 to 1.4.0 (minor) in the same commit that added skills/craft-prompt/ (SKILL.md + 5 references), and kit/plugins/plan-agent/CHANGELOG.md:147 contains "## v1.4.0 — 2026-06-04 — Add craft-prompt skill" with an Added section (lines 151-162) documenting all six phases and five reference files. `python3 -m json.tool .claude-plugin/marketplace.json` printed VALID, and the current plan-agent version is 1.11.0 (.claude-plugin/marketplace.json:253), which is >= the 1.4.0 that shipped the skill. Naming delta only: the plan's "refine-prompt" shipped as "craft-prompt" — version/changelog substance is fully present under the shipped name.
step-8-readme: kit/plugins/plan-agent/README.md documents the skill under "#### craft-prompt — Manual invoke only" (line 210) with all three required elements: (1) line 212 overview mentions and links Anthropic's guide via [Anthropic's official Claude Prompting Best Practices](https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/claude-prompting-best-practices); (2) lines 220-249 give a "Before / after" usage example transforming the vague "write me a prompt to summarize stuff" into a structured task prompt with <context>/<thinking>/<example> and an output-format rule; (3) lines 260-267 provide a "Technique matrix" table mapping all four prompt types (system, task, creative, analytical) to their applied best-practices techniques. Naming delta only: the plan's "refine-prompt" is documented as "craft-prompt", which passes per the naming mapping.
acceptance-criteria-sweep: All nine criteria hold in substance under the naming delta (plan's refine-prompt shipped as craft-prompt at kit/plugins/plan-agent/skills/craft-prompt/ — name/path-only mismatch, passes per mapping). (1) `ls` shows skills/craft-prompt/ with SKILL.md and references/ holding all five files. (2) Frontmatter parses as valid YAML (Ruby YAML.safe_load) with name, description, allowed-tools, disable-model-invocation: true (SKILL.md:1-11); wc -l = 289 lines < 500. (3) SKILL.md:35-40 classifies four types (system/task/creative/analytical) and SKILL.md:45-50 is the technique matrix mapping each type to best-practices layers. (4) SKILL.md:70-71 "Use **AskUserQuestion** with a batched set of 2–3 essential questions determined by the classified type"; type-specific questions at 73-107 each annotated with the matrix technique they feed; a "_Why_" question per type (lines 79, 87, 98, 106) per "Add context to improve performance" (66-68); progressive depth opt-in at 109-111. (5) SKILL.md:115-139 maps <role>, <instructions>/<constraints>, <context>, <example>, <thinking>, <document>, self-check to matrix types with "Skip any layer whose type is not in the technique matrix" (139) and placeholder removal for unselected techniques (168); templates carry matching tags — system-prompt-template.md:12/16/25, task-prompt-template.md:12/16/21/30, creative-prompt-template.md:12/18/24, analytical-prompt-template.md:12/18/29/34. (6) SKILL.md:180-197 Phase 5 searches "the session's live tool registry" via ToolSearch (allowed-tools, line 10) and presents "the top 1–3 matches with: the invocation command" (194-195). (7) git show 2672be2 (PR #248, the commit adding the skill) bumps marketplace.json plan-agent "1.3.2" -> "1.4.0" (minor), and kit/plugins/plan-agent/CHANGELOG.md:147-161 ("v1.4.0 — Add craft-prompt skill") documents the skill and references. (8) SKILL.md:203-204 "Present the assembled prompt in a fenced code block the user can copy-paste directly" with ```text format example at 212-221; positive framing instructed at SKILL.md:172 ('Use positive framing ("Do X" not "Don't do Y")'). (9) references/best-practices-reference.md exists (119 lines) with technique-name sections 1-7 (Clarity, Context/Motivation, XML Tags, Role Assignment, Few-Shot Examples, Thinking/CoT, Output Formatting — exactly the seven the plan enumerates at HTML line 1124) plus a bonus eighth (Long-Context), each with actionable "Implementation:" bullets.

Next Steps

Add a /plan-agent:refine command for explicit invocation

Paste this prompt into Claude to execute this follow-up:

Create a command file at kit/plugins/plan-agent/commands/refine.md that explicitly invokes the refine-prompt skill. The command should accept $ARGUMENTS as the initial prompt text or intent description, pass it to the skill's Classify phase, and skip the initial "what do you need?" question when arguments are provided. Follow the command pattern in .claude/rules/plugin-patterns.md. Include YAML frontmatter with a description under 80 characters.

Add prompt-history persistence so users can iterate on past prompts

Paste this prompt into Claude to execute this follow-up:

Add a prompt-history feature to the plan-agent plugin at kit/plugins/plan-agent/. After the skill generates a prompt, save it to ~/.claude/plan-agent/prompt-history/ as a timestamped markdown file with YAML frontmatter (type, intent summary, date). Add a /plan-agent:history command that lists recent prompts and lets the user pick one to refine further. Use Read and Glob to scan the history directory. Limit history to the 50 most recent entries.

Run /skill-reviewer:reviewing-skills to audit the SKILL.md quality

Paste this prompt into Claude to execute this follow-up:

Run /skill-reviewer:reviewing-skills on kit/plugins/plan-agent/skills/refine-prompt/SKILL.md. Apply any fixes from the audit report that score below 4/5 in any dimension. Then run /skill-reviewer:auditing-allowed-tools on the same file to verify the allowed-tools list matches actual tool usage.

Wish List

Add a best-practices version tracker to detect guide updates

Paste this prompt into Claude to execute this follow-up:

Add a version-tracking mechanism to the plan-agent plugin at kit/plugins/plan-agent/. Store the current Anthropic best-practices guide URL and a content hash in references/best-practices-reference.md frontmatter (e.g. guide-url, guide-hash, last-checked). Add a /plan-agent:check-guide command that fetches https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/claude-prompting-best-practices via WebFetch, compares the content hash, and reports whether the guide has changed since the reference was last updated. If changed, list which sections differ and suggest updates to the templates and technique matrix.

Prompt quality scoring grounded in the seven best-practices dimensions Wish List

Speculative / blue-sky idea — not on the critical path. Paste into Claude when ready to explore:

Add a prompt-quality scoring system to the refine-prompt skill grounded in Anthropic's seven core prompting best practices. After generating a prompt, score it on 7 dimensions matching the guide: (1) Clarity/directness, (2) Context/motivation, (3) XML structure, (4) Role assignment, (5) Examples, (6) Thinking/CoT, (7) Output formatting. Each dimension scores 0 (not applicable), 1 (missing but should be present per the technique matrix), or 2 (present and well-formed). If any applicable dimension scores 1, automatically suggest targeted improvements citing the specific guide section and offer to regenerate. Store scoring rubrics in references/scoring-rubric.md with links to the corresponding Anthropic guide sections.

Multi-model prompt adaptation (Claude, GPT, Gemini, local LLMs) Wish List

Speculative / blue-sky idea — not on the critical path. Paste into Claude when ready to explore:

Extend the refine-prompt skill to adapt prompts for different AI models. Add a "target model" question to the interview phase (default: Claude). Store model-specific best practices in references/model-guides/ (e.g. claude-guide.md, gpt-guide.md, gemini-guide.md). Each guide documents the model's strengths, preferred prompt patterns, token limits, and quirks (e.g. Claude prefers XML tags for structure, GPT prefers markdown headers). The Draft phase selects the appropriate guide and adapts the prompt accordingly. Include a "universal" option that generates a model-agnostic prompt when the user doesn't know which model they'll use.

Unresolved Questions

~~Should the skill auto-activate or require explicit invocation?~~ resolved

Resolved: command-only with disable-model-invocation: true. “Prompt” is too common a word in coding contexts for safe auto-activation, and the six-phase interview pipeline is too heavyweight for a false trigger. Follows the marketplace pattern where multi-step workflows (tdd-loop, tdd-fix, issue-agent) are manual-invoke only. Users invoke via /plan-agent:refine-prompt. Discoverability comes from tab completion, /help, the plan-agent README, and the Recommend phase of other runs.

~~How deep should the recommendation engine scan?~~ resolved

Resolved: use ToolSearch (the harness’s live tool registry) instead of filesystem scanning. ToolSearch is session-scoped (returns only loaded tools), includes MCP tools, and requires no path assumptions. No flag needed — ToolSearch is the sole discovery mechanism. The Recommend phase extracts 2–3 keyword queries from user intent, calls ToolSearch for each, deduplicates, filters to skills/commands/agents, and presents the top 3 matches.