Plan Review: Add Research Agent Team

Plan: docs/plans/add-research-agent-team.html
Reviewed: 2026-06-08 13:51 UTC
Reviewers: 7 (5 core + 2 UI-conditional)
Verdict: Sound with Revisions

Executive Summary

The plan is sound with revisions. All 7 reviewers agree the design follows the proven review-plan Agent Team pattern and inserts cleanly into the existing pipeline as Step 0c with proper graceful degradation. However, the team identified three high-priority issues that should be resolved before implementation:

1. The dual-artifact redundancy between agent definition files and research-prompts.md, flagged by 5 of 7 reviewers, needs a single source of truth.
2. The complexity auto-detection heuristic is underspecified, leaving the most common invocation path (no flags) with an undefined branch.
3. The test infrastructure is incomplete: test files are declared, but no runner, framework, or directory creation steps exist.

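The first two issues are addressed by the inline edits applied to the source plan. The third is only partly covered: the edits add directory pre-checks, but none of them names a test runner or adds test creation steps.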

Role-by-Role Findings

Architecture Review

Fit: Follows the established Agent Team pattern with well-scoped component boundaries and proper graceful degradation.

medium — Dual-artifact redundancy between agent definitions and research-prompts.md; recommends agent definitions as single source of truth
medium — Context injection is implicit and unstructured; Research Brief handoff between 0c and Step 1 has no defined serialization boundary
medium — Complexity auto-detection heuristic underspecified; no programmatic detection logic defined
low — Unverified TeamCreate/TeamDelete tool names
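
The serialization-boundary finding could be made concrete by giving the Research Brief an explicit, versioned shape that Step 0c writes and Step 1 validates. The sketch below is one possible form; every field name (`schemaVersion`, `findings`, `missingRoles`) is an assumption for illustration and does not come from the plan.

```typescript
// Hypothetical shape for the Research Brief handed from Step 0c to Step 1.
// All field names here are illustrative assumptions, not taken from the plan.
interface ResearchFinding {
  role: string;    // researcher role that produced the finding
  summary: string; // condensed finding text
}

interface ResearchBrief {
  schemaVersion: 1;
  findings: ResearchFinding[];
  missingRoles: string[]; // roles that returned nothing
}

// Serialize at the Step 0c boundary.
function serializeBrief(brief: ResearchBrief): string {
  return JSON.stringify(brief);
}

// Parse and validate at the Step 1 boundary; throws on malformed input.
function parseBrief(raw: string): ResearchBrief {
  const data = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("Research Brief must be a JSON object");
  }
  if (data.schemaVersion !== 1) {
    throw new Error("Unsupported Research Brief schema version");
  }
  if (!Array.isArray(data.findings) || !Array.isArray(data.missingRoles)) {
    throw new Error("Research Brief needs findings and missingRoles arrays");
  }
  for (const finding of data.findings) {
    if (typeof finding.role !== "string" || typeof finding.summary !== "string") {
      throw new Error("Each finding needs a string role and summary");
    }
  }
  return data as ResearchBrief;
}
```

Validating on the consuming side means a malformed or outdated brief fails loudly at the boundary instead of silently degrading the plan generated in Step 1.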

Completeness Review

Completeness: Well-structured with 12 steps covering all named files, but gaps could cause implementation friction.

high — Auto-detection heuristic buried in Unresolved Question, not in Step 6
high — Test files declared but no creation steps exist in Steps section
high — Two Unresolved Questions directly affect Step 6 implementation
medium — No directory creation steps for agents/ and references/
medium — No step to read existing SKILL.md before editing
medium — No test runner specified for .ts test files

Testability Review

Test coverage: Adequate for Tier 1, but critical paths lack automated coverage.

high — No test for auto-detection heuristic
high — Objective-verification test is manual CLI invocation, not automated
high — No test runner or framework specified
medium — No test for conditional researcher spawning based on UI signals
medium — Integration test doesn’t specify env var manipulation

Risk Review

Risk level: Medium overall.

high — No partial-failure handling for researcher agents; synthesis blocks if one hangs
high — Unverified tool names in allowed-tools could cause silent permission failures
medium — Dual-artifact prompt drift risk
medium — No token budget ceiling for 4–6 parallel researchers
medium — Experimental flag dependency; silent removal could break feature
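
For the token-budget finding, a simple starting point is a fixed total ceiling divided evenly across however many researchers are spawned. The sketch below assumes an even split; the function name and the ceiling value used in the usage note are hypothetical, since the plan defines no budget.

```typescript
// Split a total token ceiling evenly across parallel researchers.
// The even-split policy is an assumption; the plan does not define a budget.
function perResearcherBudget(totalCeiling: number, researcherCount: number): number {
  if (!Number.isInteger(researcherCount) || researcherCount < 1) {
    throw new RangeError("researcherCount must be a positive integer");
  }
  if (totalCeiling < 0) {
    throw new RangeError("totalCeiling must not be negative");
  }
  // Round down so the sum of all shares never exceeds the ceiling.
  return Math.floor(totalCeiling / researcherCount);
}
```

With an illustrative ceiling of 120000 tokens, four researchers would each get 30000 and six would each get 20000, so spawning the two UI-conditional researchers shrinks every share rather than raising total spend.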

Conventions Review

Fit: Closely follows established project patterns.

medium — Redundant prompt storage conflicts with review-plan single-source pattern
medium — Test file placement in non-existent tests/plan-agent/ directory
medium — allowed-tools casing for TeamCreate/TeamDelete needs verification

UX Review

User fit: The plan is well structured, but the research phase gives users no progress feedback and auto-detection behavior is not predictable.

high — No progress indication during research phase; users face silent wait
high — Auto-detection UX undefined; users cannot predict when research spawns
medium — Graceful degradation is too quiet when --research was explicitly requested
low — Research Brief hidden by default in collapsed <details>
low — No empty-state handling for partial researcher failure in the Brief
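
The degradation finding amounts to a rule that the fallback message should be louder when the user explicitly asked for research. A minimal sketch of that rule follows; the helper name, the notice levels, and the message wording are all assumptions, and only the `--research` flag comes from the plan.

```typescript
// Hypothetical helper: decide how loudly to report that research was skipped.
type Notice = { level: "warning" | "info"; text: string };

function degradationNotice(explicitlyRequested: boolean, reason: string): Notice {
  if (explicitlyRequested) {
    // The user passed --research, so a silent fallback would be surprising.
    return {
      level: "warning",
      text: `Research was requested with --research but could not run (${reason}). Continuing without a Research Brief.`,
    };
  }
  // Auto-detected research that cannot run only needs a quiet note.
  return { level: "info", text: `Research skipped (${reason}).` };
}
```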

Accessibility Review

A11y compliance: Largely WCAG 2.1 AA compliant, but the HTML template has five medium-severity gaps.

medium — status-badge lacks accessible label tying it to plan title
medium — <details>/<summary> marker removal may break VoiceOver
medium — Disabled checkboxes removed from tab order without aria-disabled
medium — compare-grid uses divs instead of semantic table
medium — Pulse-dot animation missing prefers-reduced-motion suppression

Agreements & Conflicts

Confirmed Concerns (multiple reviewers agree)

Dual-artifact redundancy — 5/7 reviewers (architecture, completeness, risk, conventions, accessibility). Consensus: use agent definitions as single source of truth.

Auto-detection heuristic underspecified — 4/7 reviewers (architecture, completeness, UX, risk). Consensus: define concrete rules, default to conservative opt-in bias.
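
To illustrate what "concrete rules with a conservative opt-in bias" might look like, the sketch below lets an explicit flag always win and otherwise requires at least two independent complexity signals before spawning research. The signal names, the threshold of five files, and the `--no-research` flag are assumptions for illustration; the review does not specify them.

```typescript
// Hypothetical inputs; none of these signal names come from the plan.
interface TaskSignals {
  // true for --research, false for a hypothetical --no-research,
  // undefined when no flag was passed.
  explicitResearchFlag?: boolean;
  estimatedFilesTouched: number;
  mentionsUnfamiliarDependency: boolean;
  crossesComponentBoundary: boolean;
}

type ResearchDecision = { spawn: boolean; reason: string };

function shouldSpawnResearch(signals: TaskSignals): ResearchDecision {
  // An explicit flag always overrides detection.
  if (signals.explicitResearchFlag !== undefined) {
    return { spawn: signals.explicitResearchFlag, reason: "explicit flag" };
  }
  const matched = [
    signals.estimatedFilesTouched >= 5, // illustrative threshold
    signals.mentionsUnfamiliarDependency,
    signals.crossesComponentBoundary,
  ].filter(Boolean).length;
  // Conservative bias: a single signal is not enough to spawn researchers.
  if (matched >= 2) {
    return { spawn: true, reason: `${matched} complexity signals matched` };
  }
  return { spawn: false, reason: "below threshold; pass --research to opt in" };
}
```

Returning a reason alongside the decision gives the caller something to show the user, which speaks to the predictability concern raised in the UX review.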

Unverified tool names — 3/7 reviewers (architecture, risk, conventions). Consensus: verify against review-plan’s allowed-tools before implementation.
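
The verification step could be automated with a small comparison between the tool names the new skill declares and a known-good list copied from review-plan's allowed-tools. The sketch below is an assumption about how such a check might look; it also reports case-only mismatches, since the conventions review flagged casing specifically.

```typescript
interface UnverifiedTool {
  name: string;
  // Set when a known-good name differs from the declared one only by case.
  caseMismatchOf: string | undefined;
}

// Report declared tool names that do not appear verbatim in the known-good list.
function findUnverifiedTools(declared: string[], knownGood: string[]): UnverifiedTool[] {
  const exact = new Set(knownGood);
  const byLowerCase = new Map<string, string>();
  for (const tool of knownGood) {
    byLowerCase.set(tool.toLowerCase(), tool);
  }
  const unverified: UnverifiedTool[] = [];
  for (const name of declared) {
    if (exact.has(name)) continue; // exact match, nothing to report
    unverified.push({ name, caseMismatchOf: byLowerCase.get(name.toLowerCase()) });
  }
  return unverified;
}
```

The known-good list has to come from a source that is itself verified. Whether `TeamCreate` and `TeamDelete` are the real names is exactly what this review leaves open, so the check only helps once that list is confirmed.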

No partial-failure handling — 3/7 reviewers (risk, UX, architecture). Consensus: proceed with available findings after turn limit, note missing roles.
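
The consensus rule can be sketched as a gather step that never blocks on a single researcher. The example below substitutes a wall-clock deadline for the turn limit, which is a simplification: the review talks about agent turns, not milliseconds. The `Researcher` type and all function names are hypothetical.

```typescript
// Hypothetical researcher handle: a role name plus a function that runs it.
type Researcher = { role: string; run: () => Promise<string> };

interface SynthesisInput {
  findings: { role: string; text: string }[];
  missingRoles: string[]; // roles that failed or hit the deadline
}

// Resolve to the researcher's text, or to null on rejection or deadline.
function settleWithDeadline(
  researcher: Researcher,
  deadlineMs: number,
): Promise<{ role: string; text: string | null }> {
  return new Promise((resolve) => {
    const timer = setTimeout(() => resolve({ role: researcher.role, text: null }), deadlineMs);
    // Promise.resolve().then(...) also converts a synchronous throw into a rejection.
    Promise.resolve()
      .then(() => researcher.run())
      .then(
        (text) => { clearTimeout(timer); resolve({ role: researcher.role, text }); },
        () => { clearTimeout(timer); resolve({ role: researcher.role, text: null }); },
      );
  });
}

// Wait for every researcher to finish, fail, or time out, then split the results.
async function gatherFindings(
  researchers: Researcher[],
  deadlineMs: number,
): Promise<SynthesisInput> {
  const settled = await Promise.all(
    researchers.map((r) => settleWithDeadline(r, deadlineMs)),
  );
  const input: SynthesisInput = { findings: [], missingRoles: [] };
  for (const { role, text } of settled) {
    if (text === null) input.missingRoles.push(role);
    else input.findings.push({ role, text });
  }
  return input;
}
```

Synthesis then proceeds with `findings` and records `missingRoles` in the Research Brief, which would also cover the empty-state gap noted in the UX review.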

Test infrastructure incomplete — 2/7 reviewers (completeness, testability). Consensus: specify test runner and add test creation steps.

Conflicts

None. All reviewers were directionally aligned with no contradictory recommendations.

Highest-Risk Issues (priority order)

1. Dual-artifact prompt drift (5 reviewers): resolve before implementation by making agent definitions the single canonical prompt source.
2. Auto-detection heuristic undefined (4 reviewers): embed concrete detection rules in Step 6 with a conservative default.
3. No partial-failure or timeout handling (3 reviewers): add a turn-limit rule and missing-role notation to synthesis.
4. Unverified TeamCreate/TeamDelete tool names (3 reviewers): cross-reference with the actual Agent Teams API before writing allowed-tools.
5. Test infrastructure gaps (2 reviewers): specify a test runner, add test creation steps, and clarify manual vs automated expectations.

Inline Edits Applied to Source Plan

Target	Action	Change
Step 1 `.step-why`	edit	Added panel recommendation: agent definitions as single source of truth for role prompts
Step 6 `.step-why`	edit	Added concrete auto-detection heuristic, partial-failure rule, progress reporting, stronger degradation warning
Step 5 `.step-why`	edit	Added last-wins conflict resolution documentation requirement
Step 7 `.step-why`	edit	Changed `<details>` to `<details open>` for Research Brief discoverability
Step 8 `.step-why`	edit	Added tool name verification requirement against Agent Teams API
Step 1 `.verify-body`	edit	Added directory pre-check for `references/`
Step 3 `.verify-body`	edit	Added directory pre-check for `agents/`
`#criteria-list`	append	Added AC11 (partial-failure handling), AC12 (progress reporting), AC13 (flag conflict docs)
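
The last-wins rule that the Step 5 edit asks to have documented can be stated precisely in a few lines. The sketch below assumes the conflicting flags are `--research` and a hypothetical `--no-research`; the review names only the former.

```typescript
// Resolve conflicting research flags: the last occurrence on the command line wins.
// Returns undefined when neither flag is present, leaving the choice to auto-detection.
function resolveResearchFlag(argv: string[]): boolean | undefined {
  let decision: boolean | undefined;
  for (const arg of argv) {
    if (arg === "--research") decision = true;
    else if (arg === "--no-research") decision = false; // hypothetical flag name
  }
  return decision;
}
```

Under this rule, `--research --no-research` disables research and `--no-research --research` enables it, so a later flag can override an earlier default set by an alias or wrapper script.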