Make /git-agent:ship-autonomous immune to claude-code#43809 — chain branch-agent, commit-agent, and pr-agent by Reading their SKILL.md files instead of gated Skill-tool calls, keeping disable-model-invocation: true (the guard against unsolicited git mutations) fully intact.
Read and implement all steps in the plan at docs/plans/fix-ship-autonomous-skill-chaining.html — Make ship-autonomous immune to issue #43809 via Read-based skill chaining
Context
claude-code#43809 reports that disable-model-invocation: true blocks every model-driven Skill-tool invocation — not just unsolicited auto-triggering. Any nested call fails with Error: Skill <name> cannot be used with Skill tool due to disable-model-invocation, even when the user explicitly requested the parent workflow. The issue was filed against Claude Code 2.1.92 and auto-closed as inactive — the behavior was never fixed.
The flag is deliberate in git-agent: commit-agent, pr-agent, branch-agent, and ship stage all changes and mutate git state, so they are command-only — explicit /git-agent:<name> invocation, never fuzzy intent matching. Keeping that guard is a hard requirement of this plan.
The exposure: ship-autonomous chains its pipeline by instructing Skill-tool invocations of three flagged skills at five sites — Step 2 branch-agent (line 89), Step 3 commit-agent (line 99), Step 4 pr-agent (line 109), and the autofix loop's commit references (lines 229 and 238). The user typed /git-agent:ship-autonomous, not the child commands, so every nested call is model-driven and refusable. The plain ship skill and the agent-commit background agent are immune only because they inline duplicate copies of the same workflows — a maintenance tax this plan avoids.
The fix: Read-based chaining. The flag gates only the Skill tool — SKILL.md files are plain markdown on disk, and ${CLAUDE_PLUGIN_ROOT} expands inside skill content at load time (verified empirically this session). Reading keeps a single source of truth. One behavioral consequence: under Skill-tool chaining each child ran under its own allowed-tools; under Read-chaining the parent's list governs — so ship-autonomous must absorb Bash(date *) for branch-agent's auto-naming. (Branch-agent's model: Haiku pin likewise no longer applies — inlined steps run on the session model.)
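The allowed-tools flattening described above can be checked mechanically. The following is a minimal sketch, not the plan's own tooling. It assumes each SKILL.md declares allowed-tools on a single comma-separated frontmatter line (an assumption; the real files may use a YAML list), and it compares entries by exact string, so a broader parent pattern is not recognized as covering a narrower child pattern. The function names allowed_tools and missing_tools are hypothetical.

```shell
# Print one tool per line from a skill's allowed-tools frontmatter.
# ASSUMPTION: a single line such as "allowed-tools: Read, Bash(git *)".
allowed_tools() {
  sed -n 's/^allowed-tools:[[:space:]]*//p' "$1" | head -n 1 | tr ',' '\n' |
    sed 's/^[[:space:]]*//; s/[[:space:]]*$//' | grep -v '^$'
}

# Print every child tool the parent does not list verbatim,
# skipping the intentional Bash(glab *) exclusion named in the plan.
missing_tools() {
  local parent="$1" child="$2" tool
  allowed_tools "$child" | grep -vxF 'Bash(glab *)' | while IFS= read -r tool; do
    allowed_tools "$parent" | grep -qxF -- "$tool" || printf '%s\n' "$tool"
  done
}
```

Called as missing_tools <parent SKILL.md> <child SKILL.md>, any output line is a tool the parent would still need to absorb; empty output means the parent's list already covers that child under this exact-match rule.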
Files to Modify
| File | Status | Change |
|---|---|---|
| kit/plugins/git-agent/skills/ship-autonomous/SKILL.md | modified | swap 5 Skill-tool calls to Read chaining; update allowed-tools |
| kit/plugins/git-agent/CHANGELOG.md | modified | add v3.10.7 entry |
| .claude-plugin/marketplace.json | modified | bump git-agent 3.10.6 → 3.10.7 |
| tests/plugins/test-ship-autonomous-read-chaining.sh | new | chaining-contract smoke test |
Diagram
Skill-tool chaining (before):
- Invoke git-agent:commit-agent via Skill tool
- Gated: refuses disable-model-invocation skills
- "cannot be used with Skill tool" (#43809)
- Child skill's allowed-tools govern the sub-flow
Read-based chaining (after):
- Read ${CLAUDE_PLUGIN_ROOT}/skills/<name>/SKILL.md
- Ungated: Read is a plain file read
- Single source of truth — no duplicated logic
- Parent allowed-tools govern (+ Bash(date *))
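The contrast between the two chaining styles can be turned into a quick scan for leftover gated references. This is a sketch only, using the plan's own rule that a git-agent:<child> mention is acceptable only on a line that also contains Read or ${CLAUDE_PLUGIN_ROOT}; the function name gated_references is hypothetical.

```shell
# Print (with line numbers) every line that names a gated child skill
# without also containing "Read" or "CLAUDE_PLUGIN_ROOT".
# Exit status is 0 when such lines exist and 1 when none remain.
gated_references() {
  grep -nE 'git-agent:(branch|commit|pr)-agent' "$1" |
    grep -v -e 'Read' -e 'CLAUDE_PLUGIN_ROOT'
}
```

Because the final grep succeeds only when an offending line is found, a caller would treat a zero exit status as a failure of the contract, for example `if gated_references "$file"; then echo FAIL; fi`.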
Steps
Step 1: Convert kit/plugins/git-agent/skills/ship-autonomous/SKILL.md to Read-based chaining. At the branch-agent (~line 89), commit-agent (~line 99), and pr-agent (~line 109) sites, replace "Invoke the existing git-agent:<name> skill" with:
"Read ${CLAUDE_PLUGIN_ROOT}/skills/<name>/SKILL.md and execute its steps inline — do not use the Skill tool. Treat every STOP in the child's content — including its terminal "do not take further action" clause — as ending that sub-workflow only, then continue this pipeline; skip its Step 0 (plan-mode exit is already handled here). If ${CLAUDE_PLUGIN_ROOT} is unexpanded, locate the file with Glob **/git-agent/skills/<name>/SKILL.md (on multiple matches prefer the running plugin's own root; if still ambiguous, STOP and report). If the file cannot be Read at all, STOP the pipeline and report — never improvise the child workflow from memory."
Also tighten the PR URL capture (lines ~117–118): take the URL from the gh pr create output, with gh pr view --json url as fallback, instead of "pr-agent's final output". Special-case re-runs: when pr-agent's inlined steps find an existing OPEN PR, capture that PR's URL and continue to Step 5 (CI watch) — its "STOP, do not create a duplicate" is not a pipeline halt.
Why: the Skill tool refuses any disable-model-invocation: true skill (claude-code#43809); the Read tool is ungated and preserves a single source of truth instead of inlining duplicate copies of the child workflows.
Verify
grep -ci "invoke the existing" kit/plugins/git-agent/skills/ship-autonomous/SKILL.md prints 0 — case-insensitive, because line 89's lowercase "invoke" would slip past a capital-I grep; a Read instruction referencing ${CLAUDE_PLUGIN_ROOT}/skills/<name>/SKILL.md exists for each of branch-agent, commit-agent, and pr-agent (three explicit checks, not an aggregate count); the failed-Read STOP rule is present; the PR URL capture references the gh pr create output rather than "pr-agent's final output"; the existing-OPEN-PR continue-to-CI-watch rule is present.
Step 2: Reword the autofix loop's commit references (lines 229 and 238) to Read-phrased form, ending "… ${CLAUDE_PLUGIN_ROOT}/skills/commit-agent/SKILL.md again if its steps are no longer in context)".
Verify
grep -n "commit-agent" kit/plugins/git-agent/skills/ship-autonomous/SKILL.md shows only Read-phrased references — no git-agent:commit-agent mention on a line lacking Read or ${CLAUDE_PLUGIN_ROOT} (lines 229/238's "commit via git-agent:commit-agent" phrasing would slip past the Step 1 grep alone); the file-wide case-insensitive count of "invoke the existing" is 0.
Step 3: Drop Skill from allowed-tools and add Bash(date *); below the intro add one note: child skills are command-only (disable-model-invocation: true) — chain them by Reading their SKILL.md, never via the Skill tool (claude-code#43809). The note also records three standing facts: the Bash(glab *) exclusion is intentional (GitHub-only pipeline); branch-agent's model: Haiku pin does not apply under Read-chaining; and any future addition to a child's allowed-tools must be mirrored here.
Why: under Skill-tool chaining each child ran under its own allowed-tools; under Read-chaining the parent's list governs, and branch-agent's auto-naming runs date. Dropping Skill enforces least privilege; the note stops future edits from regressing. Bash(glab *) is deliberately not added — ship-autonomous is GitHub-only (Step 1 hard-requires gh auth), so pr-agent's GitLab path is unreachable here.
Verify
head -5 kit/plugins/git-agent/skills/ship-autonomous/SKILL.md shows allowed-tools containing Bash(date *) and no Skill token, with all previously listed tools otherwise intact; the guard note appears in the body and names the glab exclusion, the Haiku-pin loss, and the mirror obligation.
Step 4: Create tests/plugins/test-ship-autonomous-read-chaining.sh, modeled on tests/plugins/test-step8-review-option.sh (bash, set -euo pipefail, numbered PASS/FAIL checks, FAILURES counter, non-zero exit on failure). Assertions: (1) case-insensitive count of "invoke the existing" is 0; (2) no git-agent:(branch|commit|pr)-agent reference on a line lacking Read or ${CLAUDE_PLUGIN_ROOT}; (3) a literal ${CLAUDE_PLUGIN_ROOT}/skills/<name>/SKILL.md Read reference per child — three explicit checks, not an aggregate count; (4) allowed-tools has Bash(date *) and no Skill token; (5) each child still declares disable-model-invocation: true and is unchanged from main (git diff --quiet origin/main -- <child paths>); (6) the failed-Read STOP rule is present; (7) the parent's allowed-tools is a superset of each child's, excluding the intentional Bash(glab *). Note: tests/plugins/ is manual-run by repo convention — not wired into CI.
Verify
bash tests/plugins/test-ship-autonomous-read-chaining.sh prints PASS for all seven assertion groups and exits 0; spot-check the guard by deliberately breaking one contract line (e.g. rewording a Read reference) and confirming the test fails before reverting.
Step 5: Bump git-agent 3.10.6 → 3.10.7 in .claude-plugin/marketplace.json.
Why: … .claude/rules/marketplace.md; the new value must be higher than main's, and the settings hook auto-validates JSON syntax on save.
Verify
jq -r '.plugins[] | select(.name=="git-agent").version' .claude-plugin/marketplace.json prints 3.10.7; the same query against git show origin/main:.claude-plugin/marketplace.json prints 3.10.6.
Step 6: Add an entry headed "## v3.10.7 — 2026-06-10 — Read-based skill chaining in ship-autonomous", with a ### Fixed list covering the five swapped sites, the allowed-tools change, the guard note, the new smoke test, and a link to claude-code#43809, to kit/plugins/git-agent/CHANGELOG.md. Include a one-line note that 3.10.6 shipped without a CHANGELOG entry — the v3.10.7 → v3.10.5 lineage gap predates this fix and is acknowledged, not introduced, here.
Verify
head -15 kit/plugins/git-agent/CHANGELOG.md shows the v3.10.7 entry above v3.10.5, dated 2026-06-10, containing the issue link and the one-line v3.10.6 gap acknowledgement.
Tests
File: tests/plugins/test-ship-autonomous-read-chaining.sh
Type: smoke test — static contract assertions, run with bash like the rest of tests/plugins/
Asserts: case-insensitive zero count of "invoke the existing"; no gated git-agent:<child> reference outside a Read line; a literal ${CLAUDE_PLUGIN_ROOT}/skills/<name>/SKILL.md reference per child (three explicit checks); allowed-tools drops Skill, gains Bash(date *), and is a superset of each child's tools (minus the intentional Bash(glab *) exclusion); all three children unchanged from main and still disable-model-invocation: true; the failed-Read STOP rule is present.
Run: bash tests/plugins/test-ship-autonomous-read-chaining.sh
Unit, integration, and E2E suites are not applicable — the change edits declarative skill markdown and plugin metadata with no executable application code. The smoke test above is the executable contract; a live pipeline run is covered in Verification.
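To make the shape of the smoke test concrete, here is a minimal sketch of two of the assertion groups (the "invoke the existing" zero-count and the per-child Read references). It is an illustration under stated assumptions, not the script the plan ships: the helper names check, no_invoke_phrase, has_read_reference, and run_checks are hypothetical, and the sketch omits set -euo pipefail, which the real script would add per Step 4. The FAILURES counter and PASS/FAIL lines follow the plan's description.

```shell
FAILURES=0

# Run a predicate and print a numbered PASS/FAIL line, counting failures.
check() {
  local num="$1" desc="$2"
  shift 2
  if "$@"; then
    echo "PASS [$num] $desc"
  else
    echo "FAIL [$num] $desc"
    FAILURES=$((FAILURES + 1))
  fi
}

# Assertion 1: the phrase must not appear in any letter case.
# grep -q is used instead of grep -c because grep exits 1 on zero
# matches, which would abort a set -e script if called bare.
no_invoke_phrase() {
  ! grep -qi 'invoke the existing' "$1"
}

# Assertion 3: a literal ${CLAUDE_PLUGIN_ROOT}/skills/<child>/SKILL.md
# reference exists for the given child (fixed-string match, no expansion).
has_read_reference() {
  grep -qF "\${CLAUDE_PLUGIN_ROOT}/skills/$2/SKILL.md" "$1"
}

# Run the sketched checks against one SKILL.md; succeed only if all pass.
run_checks() {
  local skill="$1" child
  check 1 'no "invoke the existing" phrasing' no_invoke_phrase "$skill"
  for child in branch-agent commit-agent pr-agent; do
    check 3 "Read reference for $child" has_read_reference "$skill" "$child"
  done
  [ "$FAILURES" -eq 0 ]
}
```

A full script would call run_checks kit/plugins/git-agent/skills/ship-autonomous/SKILL.md, add the remaining assertion groups in the same style, and exit non-zero when FAILURES is greater than 0. Checking each child in its own call keeps the three positive assertions explicit rather than relying on an aggregate count.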
Acceptance Criteria
Verification
Run the smoke test and the metadata checks, then exercise the live pipeline (the live smoke is required, not optional):
- bash tests/plugins/test-ship-autonomous-read-chaining.sh — every check prints PASS, exit code 0.
- jq -r '.plugins[] | select(.name=="git-agent").version' .claude-plugin/marketplace.json prints 3.10.7; the same query against git show origin/main:.claude-plugin/marketplace.json prints 3.10.6.
- git diff --name-only origin/main lists exactly: the ship-autonomous SKILL.md, the git-agent CHANGELOG, .claude-plugin/marketplace.json, the new test script, this plan file, and the auto-rebuilt docs/plans/index.html.
- Live smoke (required — the static test cannot catch runtime failure modes like a STOP-halt mid-pipeline or an unexpanded ${CLAUDE_PLUGIN_ROOT}): run claude --plugin-dir ./kit/plugins/git-agent on a scratch branch with a trivial change, invoke /git-agent:ship-autonomous, and confirm Steps 2–4 Read the child SKILL.md files — no "cannot be used with Skill tool due to disable-model-invocation" error — and the pipeline reaches PR creation without halting at any inlined child STOP. Re-run once on a branch that already has an OPEN PR and confirm the pipeline continues to CI watch instead of halting.
Commit all changed files together — repo convention ships the plan file with the plugin change.
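The two version checks above can be combined into a single "higher than main" comparison. This is a sketch under assumptions: version_gt and git_agent_version are hypothetical helper names, and sort -V (version sort) is a GNU coreutils extension that may be missing on some systems. The jq query is the one given in the plan.

```shell
# Succeed when dotted version $1 is strictly higher than $2.
# ASSUMPTION: sort -V is available (GNU coreutils and recent BSD/macOS).
version_gt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Print git-agent's version from marketplace JSON supplied on stdin.
git_agent_version() {
  jq -r '.plugins[] | select(.name=="git-agent").version'
}
```

From the repository root, a check could then read: version_gt "$(git_agent_version < .claude-plugin/marketplace.json)" "$(git show origin/main:.claude-plugin/marketplace.json | git_agent_version)". Comparing with a version sort rather than plain string order matters once a component reaches two digits, since 3.10.10 sorts below 3.10.9 lexically.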
Completion Checklist
Completion Report
No items to report — all requirements met.
Team Review (2026-06-10 22:32:41 UTC) — 5 core reviewers, plan updated in place
Executive Summary
Sound with revisions — now applied. Five core reviewers (architecture, completeness, testability, risk, conventions; no UI signals, so UX/accessibility reviewers were not spawned) found no objection to the Read-based chaining direction; architecture confirmed it strictly improves on the repo's existing duplicate-the-workflow pattern. Two high-severity findings drove the revisions: the plan's own verification grep (capital-I "Invoke the existing") matched only 2 of the 5 sites it guards, and the inlined child skills' emphatic STOP directives (14/5/8 occurrences) risk halting the pipeline mid-run. Both are addressed in the updated plan, along with a newly surfaced re-entrancy bug (pr-agent's existing-OPEN-PR STOP) and a documented CHANGELOG lineage gap.
Role-by-Role Findings
- Architecture: Mechanism sound and consistent with the codebase; flagged the permanent allowed-tools flattening (parent must carry the union of child tool needs) and recommended a machine-checkable superset assertion in the smoke test, plus recording the Bash(glab *) exclusion as intentional and disambiguating the Glob fallback in multi-checkout layouts. All adopted.
- Completeness: Steps highly specific; line numbers and version arithmetic verified accurate against the repo. Found the high-severity case-sensitivity blind spot at line 89 (lowercase "invoke"), the unpinned Read anchor string between edit and test, the missing executable check for ac4, and the pre-existing missing v3.10.6 CHANGELOG entry. All adopted.
- Testability: Confirmed the line-89 and lines-229/238 blind spots live in the file; recommended case-insensitive zero-count, a no-gated-reference-outside-a-Read-line check, per-child positive assertions, and pinning the failed-Read STOP rule. Noted tests/plugins/ is not CI-wired (manual-run convention, now stated in Step 4). All adopted.
- Risk (level: medium): Highest risk is the model obeying an inlined child's terminal STOP and ending the pipeline; second, pr-agent's existing-OPEN-PR STOP aborting re-runs; third, environment-dependent ${CLAUDE_PLUGIN_ROOT} expansion in skill content. Mitigations adopted: strengthened STOP-override wording, the continue-to-CI-watch rule (new ac9), and the live --plugin-dir smoke promoted from optional to required. Rollback risk assessed low (declarative files only; marketplace merge driver covers version conflicts).
- Conventions: Version arithmetic, CHANGELOG header format, test naming, and restricted Bash(...) allowed-tools style all match repo practice. Flagged the v3.10.6 changelog gap (now acknowledged in Step 6) and, cosmetically, Step 1's multi-clause density (declined — see Conflicts).
Agreements & Conflicts
Confirmed concerns (multiple reviewers): the case-sensitive grep blind spot (completeness + testability, both high); the v3.10.6 CHANGELOG gap (completeness + risk + conventions); per-child positive Read assertions over an aggregate count (completeness + testability, with architecture's superset check as the stronger variant — both adopted).
Conflict resolved: conventions suggested splitting Step 1 into two cards for one-idea-per-item style; declined to preserve the user-aligned six-step structure — the added safeguards stay consolidated in Step 1, accepted as a readability tradeoff.
Highest-Risk Issues (priority order)
- Verification blind spot (high, completeness + testability) — capital-I grep missed 3 of 5 guarded sites. Fixed: case-insensitive counts, per-child literal Read assertions, and a no-gated-reference-outside-a-Read-line check across Step 1/2 verifies, ac1, and the smoke test.
- Inlined child STOP halts the pipeline (high, risk) — children contain 14/5/8 STOP tokens. Fixed: Step 1 override now reads "treat every STOP in the child's content — including its terminal clause — as ending that sub-workflow only"; live smoke (now required) is the runtime check.
- pr-agent existing-OPEN-PR abort (medium, risk) — re-runs would halt at "do not create a duplicate". Fixed: explicit capture-URL-and-continue-to-CI-watch rule in Step 1 plus new acceptance criterion ac9.
- ${CLAUDE_PLUGIN_ROOT} expansion is environment-dependent (medium, risk + architecture) — mitigated by the Glob fallback (now with multi-match disambiguation), the failed-Read STOP rule, and the required live smoke.
- allowed-tools superset drift (medium, architecture) — fixed: smoke-test assertion 7 (parent superset of each child, minus intentional glab) and the guard note's mirror obligation.
- CHANGELOG v3.10.6 lineage gap (low, three reviewers) — acknowledged with a one-line note in the v3.10.7 entry rather than a fabricated backfill.
Inline Edits Applied
| Target | Action | Change |
|---|---|---|
| Step 1 action | edit | Strengthened STOP override (every STOP is sub-workflow-local), Glob multi-match disambiguation, existing-OPEN-PR continue-to-CI-watch rule |
| Step 1 verify | edit | Case-insensitive zero-count; per-child Read assertions; OPEN-PR rule check |
| Step 2 verify | edit | No gated commit-agent mention outside a Read line; case-insensitive count |
| Step 3 action + verify | edit | Guard note now records glab exclusion, Haiku-pin loss, and the mirror obligation |
| Step 4 action + verify | edit | Seven explicit test assertions incl. parent-superset check and children-unchanged git diff; manual-run convention noted |
| Step 6 action + verify | edit | One-line acknowledgement of the pre-existing v3.10.6 CHANGELOG gap |
| Objective test card | edit | Asserts rewritten to match the seven assertion groups |
| #criteria-list ac1, ac4 | edit | ac1 case-insensitive + outside-Read-line clause; ac4 gains executable git diff command |
| #criteria-list | append | New ac9: OPEN-PR re-run continues to CI watch (progress count 8 → 9) |
| #verification item 4 | edit | Live --plugin-dir smoke promoted from optional to required, plus an OPEN-PR re-run check |
| #context | append | Noted branch-agent's model: Haiku pin does not apply under Read-chaining |