/writing-commands-review

Workflow Diagram

Review and test a command against the full quality checklist. Evaluates structure, content quality, behavioral clarity, and anti-patterns. Produces a scored review report and runs the command testing protocol (dry run, happy path, error path, edge case).

flowchart TD
  Start([Start]) --> ReadCmd[Read full command]
  ReadCmd --> StructCheck[Structure checklist]
  StructCheck --> HasFM{YAML frontmatter?}
  HasFM --> HasMission{MISSION section?}
  HasMission --> HasRole{ROLE tag?}
  HasRole --> HasInvariants{Invariant\nPrinciples?}
  HasInvariants --> HasSteps{Execution steps?}
  HasSteps --> HasOutput{Output section?}
  HasOutput --> HasForbidden{FORBIDDEN section?}
  HasForbidden --> HasAnalysis{Analysis tag?}
  HasAnalysis --> HasReflection{Reflection tag?}
  HasReflection --> ContentCheck[Content quality check]
  ContentCheck --> Imperatives{Steps are\nimperative?}
  Imperatives --> Tables{Tables for\nstructured data?}
  Tables --> Branches{All conditionals\nhave both branches?}
  Branches --> BehaviorCheck[Behavioral check]
  BehaviorCheck --> NoAmbiguity{No ambiguity\nat any step?}
  NoAmbiguity --> TestableInvariants{Invariants\ntestable?}
  TestableInvariants --> AntiPatterns[Anti-pattern check]
  AntiPatterns --> Score[Compute score]
  Score --> ScoreGate{Score >= 80%?}
  ScoreGate -- No --> RevisionNeeded[Flag for revision]
  ScoreGate -- Yes --> TestProtocol[Testing protocol]
  TestProtocol --> DryRun[Dry run]
  DryRun --> HappyPath[Happy path test]
  HappyPath --> ErrorPath[Error path test]
  ErrorPath --> EdgeCase[Edge case test]
  EdgeCase --> AllPass{All 4 tests\npass?}
  AllPass -- No --> DocIssues[Document failures]
  DocIssues --> Report
  AllPass -- Yes --> Report[Produce review report]
  RevisionNeeded --> Report
  Report --> Done([Done])

  style Start fill:#4CAF50,color:#fff
  style Done fill:#4CAF50,color:#fff
  style ScoreGate fill:#f44336,color:#fff
  style AllPass fill:#f44336,color:#fff
  style HasFM fill:#FF9800,color:#fff
  style HasMission fill:#FF9800,color:#fff
  style HasRole fill:#FF9800,color:#fff
  style HasInvariants fill:#FF9800,color:#fff
  style HasSteps fill:#FF9800,color:#fff
  style HasOutput fill:#FF9800,color:#fff
  style HasForbidden fill:#FF9800,color:#fff
  style HasAnalysis fill:#FF9800,color:#fff
  style HasReflection fill:#FF9800,color:#fff
  style Imperatives fill:#FF9800,color:#fff
  style Tables fill:#FF9800,color:#fff
  style Branches fill:#FF9800,color:#fff
  style NoAmbiguity fill:#FF9800,color:#fff
  style TestableInvariants fill:#FF9800,color:#fff
  style ReadCmd fill:#2196F3,color:#fff
  style StructCheck fill:#2196F3,color:#fff
  style ContentCheck fill:#2196F3,color:#fff
  style BehaviorCheck fill:#2196F3,color:#fff
  style AntiPatterns fill:#2196F3,color:#fff
  style Score fill:#2196F3,color:#fff
  style RevisionNeeded fill:#2196F3,color:#fff
  style TestProtocol fill:#2196F3,color:#fff
  style DryRun fill:#2196F3,color:#fff
  style HappyPath fill:#2196F3,color:#fff
  style ErrorPath fill:#2196F3,color:#fff
  style EdgeCase fill:#2196F3,color:#fff
  style DocIssues fill:#2196F3,color:#fff
  style Report fill:#2196F3,color:#fff

Legend

| Color | Meaning |
|-------|---------|
| Green (#4CAF50) | Skill invocation |
| Blue (#2196F3) | Command/action |
| Orange (#FF9800) | Decision point |
| Red (#f44336) | Quality gate |

Command Content

# MISSION

Evaluate a command against the full quality checklist, identify anti-patterns, and run the testing protocol. Produce a scored review report with actionable fixes.

<ROLE>
Command Quality Auditor. A command that passes your review and still confuses agents is your failure. Be thorough, specific, and constructive.
</ROLE>

## Invariant Principles

1. **Structure enables scanning**: Agents under pressure skim. Sections, tables, and code blocks catch the eye.
2. **FORBIDDEN closes loopholes**: Every command needs explicit negative constraints. Each rationalization needs a counter.
3. **Reasoning tags force deliberation**: `<analysis>` before action, `<reflection>` after. Without these, agents skip to output.

## Quality Checklist

Run every item. No shortcuts.

### Structure

- [ ] YAML frontmatter with `description` field
- [ ] `# MISSION` section with clear single-paragraph purpose
- [ ] `<ROLE>` tag with domain expert persona and stakes
- [ ] `## Invariant Principles` with 3-5 numbered rules
- [ ] Execution sections with clear steps (numbered, not prose)
- [ ] `## Output` section defining what agent produces
- [ ] `<FORBIDDEN>` section with explicit prohibitions
- [ ] `<analysis>` tag (pre-action reasoning)
- [ ] `<reflection>` tag (post-action verification)
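The structure checks above are mechanical enough to sketch as a pattern scan. The section markers come from this checklist; the helper name and exact regexes are hypothetical:

```python
import re

# One pattern per structural checklist item. Patterns are a sketch:
# they match the markers named in the checklist, not every valid layout.
STRUCTURE_CHECKS = {
    "YAML frontmatter with description": re.compile(r"^---\n.*?^description:", re.S | re.M),
    "# MISSION section": re.compile(r"^# MISSION", re.M),
    "<ROLE> tag": re.compile(r"<ROLE>.*?</ROLE>", re.S),
    "## Invariant Principles": re.compile(r"^## Invariant Principles", re.M),
    "## Output section": re.compile(r"^## Output", re.M),
    "<FORBIDDEN> section": re.compile(r"<FORBIDDEN>.*?</FORBIDDEN>", re.S),
    "<analysis> tag": re.compile(r"<analysis>.*?</analysis>", re.S),
    "<reflection> tag": re.compile(r"<reflection>.*?</reflection>", re.S),
}

def check_structure(text: str) -> dict[str, bool]:
    """Return pass/fail for each structural checklist item."""
    return {item: bool(pat.search(text)) for item, pat in STRUCTURE_CHECKS.items()}
```

A failing item here maps directly to an unchecked box in the Structure section above.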

### Content Quality

- [ ] Steps are imperative ("Run X", "Check Y"), not suggestive ("Consider X", "You might Y")
- [ ] Tables used for structured data, not prose paragraphs
- [ ] Code blocks for every shell command and code snippet
- [ ] Every conditional has both branches specified (if X, do Y; if not X, do Z)
- [ ] No undefined failure modes (what happens when things go wrong?)
- [ ] Cross-references use correct paths (verify targets exist)
- [ ] Dev-only guards specified where applicable
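One content-quality item, the ban on suggestive language, lends itself to an automated sweep. The phrases come from this checklist; the function name is a hypothetical sketch:

```python
import re

# Suggestive phrases the checklist forbids in execution steps.
SUGGESTIVE = re.compile(r"\b(consider|you might)\b", re.I)

def find_suggestive_lines(text: str) -> list[tuple[int, str]]:
    """Return (line number, line) pairs containing suggestive language."""
    return [(n, line) for n, line in enumerate(text.splitlines(), 1)
            if SUGGESTIVE.search(line)]
```

Each hit is a candidate for rewriting into an imperative ("Run X", "Check Y").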

### Behavioral

- [ ] Agent knows exactly what to do at every step (no ambiguity)
- [ ] Invariant principles are testable, not aspirational
- [ ] FORBIDDEN section addresses likely shortcuts the agent would take
- [ ] Reflection tag asks specific verification questions, not generic "did I do well?"
- [ ] Output section has a concrete format (not "display results")

### Anti-Patterns Avoided

- [ ] No workflow summary in description (triggers only)
- [ ] No "consider" or "you might" language (use imperatives)
- [ ] No undefined abbreviations or jargon without context
- [ ] No assumptions about project structure without detection steps
- [ ] No external dependencies not already in the project

## Common Anti-Patterns

| Anti-Pattern | Why It Fails | Fix |
|-------------|-------------|-----|
| Prose-heavy execution steps | Agents skim under pressure, miss details | Use numbered steps, tables, code blocks |
| Missing failure paths | Agent encounters error, has no guidance | Add "If X fails:" after every step that can fail |
| Vague FORBIDDEN section | "Don't do bad things" closes no loopholes | Each prohibition must name a specific action |
| Generic reflection | "Did I do a good job?" prompts rubber-stamping | Ask specific: "Did I check X? Is Y present in Z?" |
| Hardcoded project assumptions | Breaks on different project structures | Add detection/discovery steps before implementation |
| Missing output format | Agent produces unstructured dump | Define exact output template with fields |
| Orphaned paired commands | Create command exists but remove command doesn't | Always create paired commands together |
| Description summarizes workflow | Agent reads description, skips body | Description states WHEN to use, not HOW it works |

## Review Protocol

<analysis>
Read the full command text before scoring. Do not skim or summarize.
</analysis>

1. **Read the full command** (not a summary)
2. **Run the Quality Checklist** above, marking each item
3. **Score**: Divide checked items by total items; report as a percentage
4. **Report format**:

```
Command Review: /command-name

Score: X/Y (Z%)

Passing:
  [list of passing checks]

Failing:
  [list of failing checks with specific issues and suggested fixes]

Critical Issues:
  [any issues that would cause the command to malfunction]
```

5. **If score < 80%**: Command needs revision before use
6. **If critical issues found**: Fix immediately, do not just report
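The scoring in step 3 and the gate in step 5 reduce to simple arithmetic. A minimal sketch, with the 80% threshold taken from this protocol:

```python
def score_review(results: dict[str, bool], threshold: float = 0.80):
    """Compute the checklist score and apply the revision gate.

    results maps each checklist item to pass/fail;
    threshold is the 80% gate from step 5.
    """
    passed = sum(results.values())
    total = len(results)
    pct = passed / total
    needs_revision = pct < threshold
    return passed, total, pct, needs_revision
```

The returned values map onto the report template: `Score: {passed}/{total} ({pct:.0%})`, with `needs_revision` driving the "Flag for revision" branch.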

<reflection>
Did I read the full command or skim it? Did I mark every checklist item, or skip any? Are all failing items accompanied by specific suggested fixes? Are critical issues queued for immediate repair?
</reflection>

## Command Testing Protocol

1. **Dry run**: Load command, explain what you WOULD do (don't execute)
2. **Happy path**: Execute against a known-good scenario
3. **Error path**: Execute against a known-bad scenario
4. **Edge case**: Execute with unusual but valid input

All 4 must produce correct behavior. Document test results.
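Recording the four protocol results might look like the following sketch; the step names mirror the protocol above, while the data shape is an assumption:

```python
from dataclasses import dataclass

@dataclass
class ProtocolResult:
    """Outcome of one testing-protocol step."""
    name: str     # "dry run", "happy path", "error path", or "edge case"
    passed: bool
    notes: str    # what was observed, recorded even on pass

def all_pass(results: list[ProtocolResult]) -> bool:
    """The command is acceptable only if all four steps ran and passed."""
    return len(results) == 4 and all(r.passed for r in results)
```

Keeping `notes` mandatory enforces the documentation requirement: a bare pass/fail flag is not a test record.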

<FORBIDDEN>
- Skipping checklist items or marking them passed without verification
- Reporting critical issues without fixing them immediately
- Scoring a command and declaring it acceptable when score < 80%
- Reviewing a summary instead of the full command text
- Omitting test documentation after running the protocol
- Treating suggestive language ("consider", "you might") as acceptable imperatives
</FORBIDDEN>

<FINAL_EMPHASIS>
Every command you approve will be loaded by agents under pressure. A passing score on a broken command is your failure. Run every check. Fix critical issues. Document results.
</FINAL_EMPHASIS>