
Plan Review: Add Research Agent Team

Plan: docs/plans/add-research-agent-team.html
Reviewed: 2026-06-08 13:51 UTC
Reviewers: 7 (5 core + 2 UI-conditional)
Verdict: Sound with Revisions

Executive Summary

The plan is sound with revisions. All 7 reviewers agree the design follows the proven review-plan Agent Team pattern and inserts cleanly into the existing pipeline as Step 0c with proper graceful degradation. However, the team identified three high-priority issues that should be resolved before implementation:

  1. The dual-artifact redundancy between agent definition files and research-prompts.md — flagged by 5 of 7 reviewers — needs a single source of truth
  2. The complexity auto-detection heuristic is underspecified, leaving the most common invocation path (no flags) with an undefined branch
  3. The test infrastructure is incomplete — test files are declared but no runner, framework, or directory creation steps exist

All three issues have been addressed in the inline edits applied to the source plan.


Role-by-Role Findings

Architecture Review

Fit: Follows the established Agent Team pattern with well-scoped component boundaries and proper graceful degradation.

Completeness Review

Completeness: Well-structured with 12 steps covering all named files, but gaps could cause implementation friction.

Testability Review

Test coverage: Adequate for Tier 1, but critical paths lack automated coverage.

Risk Review

Risk level: Medium overall.

Conventions Review

Fit: Closely follows established project patterns.

UX Review

User fit: Well-structured plan, but the research phase gives no progress feedback and auto-detection behavior is hard to predict.

Accessibility Review

A11y compliance: Largely WCAG 2.1 AA compliant, but with medium-severity gaps in the HTML template.


Agreements & Conflicts

Confirmed Concerns (multiple reviewers agree)

Dual-artifact redundancy — 5/7 reviewers (architecture, completeness, risk, conventions, accessibility). Consensus: use agent definitions as single source of truth.
Auto-detection heuristic underspecified — 4/7 reviewers (architecture, completeness, UX, risk). Consensus: define concrete rules, default to conservative opt-in bias.
Unverified tool names — 3/7 reviewers (architecture, risk, conventions). Consensus: verify against review-plan’s allowed-tools before implementation.
No partial-failure handling — 3/7 reviewers (risk, UX, architecture). Consensus: proceed with available findings after turn limit, note missing roles.
Test infrastructure incomplete — 2/7 reviewers (completeness, testability). Consensus: specify test runner and add test creation steps.
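
The partial-failure consensus above (proceed with available findings after the turn limit, note missing roles) could be sketched roughly as follows. This is an illustrative Python sketch under assumptions, not the plan's implementation: the `Synthesis` type, the `synthesize` function, and the role names in the usage lines are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Synthesis:
    """Result of merging whatever role reports arrived before the turn limit."""
    findings: Dict[str, str]                      # role -> report text that arrived
    missing_roles: List[str] = field(default_factory=list)

    @property
    def is_partial(self) -> bool:
        # True when at least one expected role never reported.
        return bool(self.missing_roles)


def synthesize(expected_roles: List[str],
               reports: Dict[str, Optional[str]]) -> Synthesis:
    """Keep the reports that arrived; record roles with no (or empty) report."""
    findings: Dict[str, str] = {}
    missing: List[str] = []
    for role in expected_roles:
        report = reports.get(role)
        if report:
            findings[role] = report
        else:
            missing.append(role)
    return Synthesis(findings, missing)


# Hypothetical role names, for illustration only.
result = synthesize(
    ["codebase", "prior-art", "risks"],
    {"codebase": "Existing pattern found.", "risks": None},
)
print(result.is_partial, result.missing_roles)
```

The point of the sketch is that a missing role degrades the output rather than aborting it: synthesis still runs, and the gap is reported explicitly.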

Conflicts

None. All reviewers were directionally aligned with no contradictory recommendations.


Highest-Risk Issues (priority order)

  1. Dual-artifact prompt drift (5 reviewers) — resolve before implementation by making agent definitions the single canonical prompt source
  2. Auto-detection heuristic undefined (4 reviewers) — embed concrete detection rules in Step 6 with conservative default
  3. No partial-failure / timeout handling (3 reviewers) — add turn-limit rule and missing-role notation to synthesis
  4. Unverified TeamCreate/TeamDelete tool names (3 reviewers) — cross-reference with actual Agent Teams API before writing allowed-tools
  5. Test infrastructure gaps (2 reviewers) — specify test runner, add test creation steps, clarify manual vs automated expectations
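
One way to read "concrete detection rules with a conservative default" is sketched below. This review does not reproduce the plan's actual rules, so the signals (`files_touched`, `new_dependencies`) and the threshold are placeholders chosen purely for illustration. The only parts taken from the review are that an explicit choice takes precedence and that the no-flag path must have a defined, conservative outcome.

```python
from typing import Optional


def should_run_research(explicit: Optional[bool],
                        files_touched: int,
                        new_dependencies: int,
                        file_threshold: int = 10) -> bool:
    """Decide whether to run the research phase.

    explicit: True/False if the user passed a flag, None if no flag was given.
    The remaining parameters are hypothetical complexity signals.
    """
    # An explicit user choice always wins over auto-detection.
    if explicit is not None:
        return explicit
    # Conservative default: with no flag, run research only when every
    # signal agrees the task is complex; otherwise skip it.
    return files_touched >= file_threshold and new_dependencies > 0


# No flag and a small change: research is skipped.
print(should_run_research(None, files_touched=2, new_dependencies=0))
```

Requiring all signals to agree, rather than any one of them, is what makes the default conservative here; a real heuristic might weigh its signals differently.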

Inline Edits Applied to Source Plan

Target | Action | Change
Step 1 .step-why | edit | Added panel recommendation: agent definitions as single source of truth for role prompts
Step 6 .step-why | edit | Added concrete auto-detection heuristic, partial-failure rule, progress reporting, stronger degradation warning
Step 5 .step-why | edit | Added last-wins conflict resolution documentation requirement
Step 7 .step-why | edit | Changed <details> to <details open> for Research Brief discoverability
Step 8 .step-why | edit | Added tool name verification requirement against Agent Teams API
Step 1 .verify-body | edit | Added directory pre-check for references/
Step 3 .verify-body | edit | Added directory pre-check for agents/
#criteria-list | append | Added AC11 (partial-failure handling), AC12 (progress reporting), AC13 (flag conflict docs)
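
The last-wins conflict resolution noted for Step 5 can be illustrated with a short sketch. The flag names `--research` and `--no-research` are assumptions made for this example; the review does not list the plan's actual flags.

```python
from typing import List, Optional


def resolve_research_flag(argv: List[str]) -> Optional[bool]:
    """Return True/False for the last research flag seen, or None if absent.

    Flag names are hypothetical; only the last-wins rule comes from the review.
    """
    choice: Optional[bool] = None
    for arg in argv:
        if arg == "--research":
            choice = True
        elif arg == "--no-research":
            choice = False
        # Unrelated arguments are ignored and do not reset the choice.
    return choice


# When both flags appear, the later one decides.
print(resolve_research_flag(["--research", "plan.html", "--no-research"]))
```

Returning `None` when neither flag is present keeps the "no flags" path distinct from an explicit opt-out, so auto-detection can take over in that case.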