Ship a mandatory Tests section in every implementation plan that specifies real-world application tests — actual unit, integration, E2E, and other test files written for the application or feature, run by the project's test runner, and committed to the codebase. Every plan must also include an objective-verification mock/smoke test: a real test file that runs against the application and confirms the plan's stated goal is actually accomplished. The section uses a two-tier system: Tier 1 (code-touching plans) requires all applicable test sub-sections; Tier 2 (non-code plans like docs or file-move chores) requires only the mandatory objective-verification test. The author selects the tier based on what the steps actually do, not the type: field.
Read and implement all steps in the plan at docs/plans/add-tests-section-to-plans.html — Add a Tests section to the plan template with unit, integration, E2E, and objective-verification tests
Context
The current implementation plan template (plan-mode.md, reference/SKELETON.md, and the HTML SKELETON.html) defines a robust verification chain: per-step verification confirms each step ran correctly, acceptance criteria confirm the plan meets the definition of done, and end-to-end verification confirms the whole plan executed correctly. However, none of these layers require real application tests — they are all prose-based assertions written inside the plan document itself.
This gap means a plan can be marked "completed" without any automated proof that the changes actually work in the running application. A developer could implement all steps, check all criteria boxes, and ship code that has zero test coverage. The missing layer is a Tests section that specifies real-world tests — actual test files (unit, integration, E2E) that are written for the application or feature, run by the project's test runner, and ship with the codebase. These are not plan-level prose checks; they are the same tests a developer would write and commit alongside feature code.
Additionally, every plan must include a mandatory objective-verification test — a real mock or smoke test that runs against the actual application and directly asserts the plan's stated objective is accomplished. Like all tests in this section, the objective-verification test is a real executable test file, not a prose assertion in the plan document. This closes the loop between "what we said we'd do" and "what the application actually does."
Tier decision (resolved): Investigation of 22 existing plans (17 feature, 4 chore, 1 fix, 1 refactor, 0 docs) showed that chore plans span a wide spectrum — from pure file moves (clean-up-docs) to code-mutating changes (enable-model-invocation). A conditionally-present rule (skip Tests for docs/chore) would either under-test code-touching chores or force a fuzzy "is this code-touching?" classification on every author. Instead, the Tests section is always present with a two-tier depth model: Tier 1 (code-touching plans) includes all applicable sub-sections (unit, integration, E2E) plus the mandatory objective-verification test; Tier 2 (non-code plans: docs, file-move chores) requires only the objective-verification test, with other sub-sections omitted entirely rather than left as empty stubs. The author selects the tier based on what the steps actually do, not the type: frontmatter field — a type: chore plan that changes import paths is Tier 1, while a type: chore plan that moves directories is Tier 2.
Files to Modify
- ~/.claude/rules/
  - plan-mode.md (modified): add tests to Required Structure
- ~/.claude/rules/reference/
  - SKELETON.md (modified): add Tests section template
- kit/plugins/plan-agent/skills/implementation-plan/
  - SKILL.md (modified): add Tests to Required Structure and HTML Output
  - reference/SKELETON.html (modified): add Tests section HTML, CSS, and nav link
Diagram
Per-step verification → Tests section → Acceptance criteria → End-to-end verification
| Test Type | Scope | Tier | Include When | Required? |
|---|---|---|---|---|
| Unit | Real test file targeting a single function, method, or module in isolation | Tier 1 only | Plan adds or modifies business logic, utilities, helpers, data transformations | When applicable |
| Integration | Real test file exercising multiple modules, services, or layers working together | Tier 1 only | Plan touches API routes, database queries, middleware chains, service boundaries | When applicable |
| E2E | Real test file driving a full user flow through the running application | Tier 1 only | Plan affects user-facing flows, page navigation, form submissions, auth flows | When applicable |
| Objective-verification | Real mock/smoke test file that runs against the application and asserts the plan's stated objective is accomplished | Both tiers | Always — every plan must include one; always appears first in the Tests section as a highlighted hero card | Mandatory |
| Tier | When to use | Required sub-sections | Examples |
|---|---|---|---|
| Tier 1 (code-touching) | Any step creates, modifies, or deletes application source files, configs that affect runtime, or test files | Unit, Integration, E2E (when applicable) + Objective-verification (always) | feature, fix, refactor; chores that change imports, bump deps with code changes |
| Tier 2 (non-code) | Steps only move, rename, or delete files; write docs; update metadata; or perform other non-code operations | Objective-verification only (other sub-sections omitted, not left as empty stubs) | docs plans; chores that move directories, rename files, update non-runtime metadata |
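The tier rule in the table above can be sketched as a small classifier. This is a minimal illustration, not the plan's implementation: the `FileChange` record, the `classify_tier` name, and the `CODE_SUFFIXES` set are all hypothetical, and the plan itself expects the author to make this judgment by reading the steps.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath

# Assumption: which suffixes count as "application source or runtime config".
# The plan does not define this list; adjust it per project.
CODE_SUFFIXES = {".py", ".ts", ".tsx", ".js", ".json", ".yaml", ".yml", ".toml"}


@dataclass(frozen=True)
class FileChange:
    """One file touched by a plan step (hypothetical structure)."""
    path: str
    action: str  # "create", "modify", "delete", "move", or "rename"


def classify_tier(changes: list[FileChange]) -> int:
    """Return 1 if any step creates, modifies, or deletes a code file, else 2.

    The plan's `type:` field is deliberately not an input: the tier is
    driven only by what the steps actually do.
    """
    for change in changes:
        is_code = PurePosixPath(change.path).suffix in CODE_SUFFIXES
        if is_code and change.action in {"create", "modify", "delete"}:
            return 1
    return 2
```

Under these assumptions, a chore that rewrites import paths in `src/app/main.py` classifies as Tier 1, while a chore that only moves `docs/old/guide.md` classifies as Tier 2.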
Steps
Add tests to the Required Structure in plan-mode.md
Verify
Re-read ~/.claude/rules/plan-mode.md § Required Structure. Confirm a new tests bullet exists between steps and acceptance-criteria, and that it mentions the two-tier depth model (Tier 1 for code-touching plans, Tier 2 for non-code plans). At this point it only needs to name the section, its purpose, and the tier distinction — the detailed specification (sub-categories, mandatory vs. conditional rules) is Step 2's deliverable. No other sections should be altered.
plan-mode.md
Verify
Read the new tests bullet and confirm it specifies: (1) the two-tier model — Tier 1 (code-touching) includes unit/integration/E2E "when applicable" plus mandatory objective-verification; Tier 2 (non-code) includes only the mandatory objective-verification test with other sub-sections omitted entirely; (2) the author selects the tier based on what the steps do, not the type: field; (3) the objective-verification test is a mandatory real mock/smoke test that runs against the application and asserts the plan's stated objective; (4) the objective-verification test always appears first in the Tests section as a highlighted hero card, before any unit/integration/E2E sub-sections; (5) the Tests section is explicitly distinct from per-step verification (prose) and end-to-end verification (prose) — Tests are real test files that ship with the code.
reference/SKELETON.md
Verify
Open ~/.claude/rules/reference/SKELETON.md and confirm a ## Tests section exists between ## Steps and ## Acceptance Criteria. The objective-verification test placeholder must appear first (before unit/integration/E2E sub-headings), reflecting the hero-card-first layout. Confirm no existing sections were removed or reordered.
SKELETON.html
Verify
Open kit/plugins/plan-agent/skills/implementation-plan/reference/SKELETON.html <style> block and confirm new CSS rules exist for: .test-list, .test-card, .test-badge (with variants .test-badge-unit, .test-badge-integration, .test-badge-e2e, .test-badge-objective), and .objective-test-card. Confirm the objective test card has a visually distinct treatment (e.g. accent border, highlighted background).
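As a rough aid for this check, the presence of the listed selectors can be confirmed with a short script. This is a sketch under the assumption that the rules live in inline `<style>` blocks; `missing_selectors` is a hypothetical helper, and it does not judge whether the objective card is visually distinct.

```python
import re

# Class selectors named in this step's verification text.
REQUIRED_SELECTORS = [
    ".test-list",
    ".test-card",
    ".test-badge",
    ".test-badge-unit",
    ".test-badge-integration",
    ".test-badge-e2e",
    ".test-badge-objective",
    ".objective-test-card",
]


def missing_selectors(html: str) -> list[str]:
    """Return the required selectors that appear in no inline <style> block."""
    styles = "\n".join(
        re.findall(r"<style[^>]*>(.*?)</style>", html, flags=re.DOTALL | re.IGNORECASE)
    )
    missing = []
    for selector in REQUIRED_SELECTORS:
        # The lookahead stops ".test-badge" from matching inside ".test-badge-unit".
        pattern = re.escape(selector) + r"(?![\w-])"
        if not re.search(pattern, styles):
            missing.append(selector)
    return missing
```

An empty list means every selector was found at least once.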
SKELETON.html
Verify
Open kit/plugins/plan-agent/skills/implementation-plan/reference/SKELETON.html and confirm: (1) a <section class="section-card card-tests" id="tests"> element exists between the Steps and Acceptance Criteria sections; (2) the .objective-test-card appears first inside the section as a highlighted hero card, before the .test-list container with unit/integration/E2E placeholder .test-card items; (3) a nav sidebar <li> with href="#tests" and a beaker icon is present between Steps and Criteria links.
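The structural checks above can be approximated with the standard-library HTML parser. The sketch assumes the neighbouring sections carry the ids `steps` and `acceptance-criteria`, which this plan does not state; it also only confirms that a `#tests` link exists, not its position in the sidebar or its icon.

```python
from html.parser import HTMLParser


class PlanOutline(HTMLParser):
    """Collect section ids, link targets, and the class order inside #tests."""

    def __init__(self):
        super().__init__()
        self.section_ids = []
        self.hrefs = []
        self.tests_classes = []
        self._depth = 0  # <section> nesting depth while inside #tests

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "section":
            if self._depth:  # nested section inside #tests
                self._depth += 1
            elif attr.get("id") == "tests":
                self._depth = 1
            if attr.get("id"):
                self.section_ids.append(attr["id"])
        if tag == "a" and attr.get("href"):
            self.hrefs.append(attr["href"])
        if self._depth:
            self.tests_classes.extend((attr.get("class") or "").split())

    def handle_endtag(self, tag):
        if tag == "section" and self._depth:
            self._depth -= 1


def check_tests_section(html: str) -> bool:
    """True if #tests is placed correctly and the hero card precedes .test-list."""
    outline = PlanOutline()
    outline.feed(html)
    ids, classes = outline.section_ids, outline.tests_classes
    if "tests" not in ids or "#tests" not in outline.hrefs:
        return False
    if "objective-test-card" not in classes or "test-list" not in classes:
        return False
    i = ids.index("tests")
    placed = (
        0 < i < len(ids) - 1
        and ids[i - 1] == "steps"  # assumed id
        and ids[i + 1] == "acceptance-criteria"  # assumed id
    )
    hero_first = classes.index("objective-test-card") < classes.index("test-list")
    return placed and hero_first
```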
SKILL.md Required Structure and HTML Output Requirements
Verify
Read kit/plugins/plan-agent/skills/implementation-plan/SKILL.md Required Structure and HTML Output Requirements sections. Confirm: (1) tests is listed as a required section between steps and acceptance-criteria; (2) the description specifies the two-tier model — Tier 1 (code-touching) includes unit/integration/E2E as conditional plus objective-verification as mandatory; Tier 2 (non-code) includes only objective-verification with other sub-sections omitted; (3) the tier is selected by step content analysis, not the type: field; (4) the HTML Output Requirements mention the new CSS classes and the section's placement.
SKILL.md workflow
Verify
Read the Workflow section in kit/plugins/plan-agent/skills/implementation-plan/SKILL.md and confirm a test-generation instruction exists (either as a sub-step of an existing step or as a new step). It should describe: (1) scanning step content to classify the tier — Tier 1 if any step touches application source files, Tier 2 otherwise; (2) for Tier 1, selecting applicable test types (unit/integration/E2E) based on what the steps modify; (3) for Tier 2, omitting unit/integration/E2E sub-sections entirely; (4) always generating an objective-verification test entry regardless of tier; (5) populating the {test-items} and {objective-test} placeholders in the skeleton.
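The test-generation instruction described above might behave roughly like the following sketch. Only the `{test-items}` and `{objective-test}` placeholder names come from this plan; the function names, the entry format, and the rendered markup are illustrative assumptions.

```python
# Fixed display order for the optional Tier 1 sub-sections.
ORDER = ["unit", "integration", "e2e"]


def build_test_entries(tier: int, applicable: set[str], objective: str) -> list[dict]:
    """Return test entries with the objective-verification entry always first.

    Tier 2 omits unit/integration/E2E entirely instead of emitting empty stubs.
    """
    entries = [{"type": "objective", "description": objective}]
    if tier == 1:
        entries += [
            {"type": t, "description": f"{t} tests for the modified code"}
            for t in ORDER
            if t in applicable
        ]
    return entries


def fill_skeleton(skeleton: str, entries: list[dict]) -> str:
    """Populate the two placeholders; the first entry is the hero card."""
    hero, rest = entries[0], entries[1:]
    items = "\n".join(f"- [{e['type']}] {e['description']}" for e in rest)
    # str.replace is used because the placeholder names contain hyphens,
    # which str.format cannot handle as field names.
    return skeleton.replace("{objective-test}", hero["description"]).replace(
        "{test-items}", items
    )
```

For a Tier 2 plan the `{test-items}` placeholder is replaced with an empty string, so no unit, integration, or E2E stubs are rendered.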
Acceptance Criteria
Verification
After implementing all steps, generate a new plan using /plan-agent:implementation-plan for any feature objective (e.g. "Add a search bar to the dashboard"). Open the resulting HTML plan and confirm:
- A Tests section appears between Steps and Acceptance Criteria in the rendered page.
- The section contains at least one test card for the applicable test types — each describing a real test file to be written for the application (the search-bar plan should include unit tests for the search logic, integration tests for the API, and E2E tests for the user flow).
- An objective-verification test hero card is always present, visually distinct, positioned first in the Tests section (before any unit/integration/E2E sub-sections), and describes a real mock/smoke test file that runs against the application and asserts the plan's objective statement is accomplished.
- The sidebar navigation includes a "Tests" link that scrolls to the correct section.
- Tier 1 verification: Generating a plan for a feature or fix (e.g. "Add a search bar") produces a Tier 1 Tests section with unit/integration/E2E sub-sections populated based on step analysis, plus the mandatory objective-verification test.
- Tier 2 verification: Generating a plan for a non-code change (e.g. a docs-only plan or a file-move chore) produces a Tier 2 Tests section with only the mandatory objective-verification test — unit/integration/E2E sub-sections are omitted entirely, not rendered as empty stubs.
- Tier boundary test: Generating a type: chore plan whose steps modify application source files (e.g. "Bump dependency and update import paths") correctly selects Tier 1, not Tier 2, confirming that the tier is driven by step content, not the type: field.
Separately, create a markdown plan using the updated SKELETON.md and confirm the ## Tests section template renders with all four test-type placeholders.
Completion Checklist
Completion Report
No items to report — all requirements met.
Unresolved Questions
- Should non-code plans (docs, chore) require the full Tests section? (Resolved)
  Decision: Always-present Tests section with a two-tier depth model. Tier 1 (code-touching plans) includes all applicable sub-sections; Tier 2 (non-code plans) requires only the objective-verification test. The author selects the tier based on step content, not the type: field.
  Rationale: chore plans span a wide spectrum (file moves to code mutations), making a conditionally-present rule unreliable; the objective-verification test is always mandatory so the section is never truly empty; and consistency reduces cognitive load for authors.
- Where should the objective-verification test appear when tests exist in multiple categories? (Resolved)
  Decision: Always first, as a highlighted hero card. The objective-verification test is the one test you'd write if you could only write one, and placing it at the top mirrors that priority. A fixed position eliminates ambiguity: "last" sends the wrong signal (capstone implies "do after everything else" when it should be written first), and "categorized" forces a scope classification ("is this E2E or integration?") that adds friction without value while making the mandatory test harder to find. The layout reads: hero card (the goal) → unit tests (details) → integration tests → E2E tests, mirroring how the plan's objective appears at the top of the document before context and steps.