code-review¶
Quick code review covering correctness, style, and common issues across four modes: self-review before PRs, processing received feedback, reviewing others' code, and deep audit passes. Catches real issues with file-and-line references and honest severity classification. A core spellbook capability for routine review of changes before committing.
Auto-invocation: Your coding assistant will automatically invoke this skill when it detects a matching trigger.
Use when reviewing code. Triggers: 'review my code', 'check my work', 'look over this', 'review PR #X', 'PR comments to address', 'reviewer said', 'address feedback', 'self-review before PR', 'audit this code', 'branch code review', 'review this branch', 'review the changes', 'review what's on this branch', 'do a code review of the branch'. For heavyweight multi-phase analysis, use advanced-code-review instead. When the request could match more than one review skill, MUST use AskUserQuestion to disambiguate before invoking — never bypass the review skills for a raw Explore dispatch, even when the user's concerns seem narrow or specific.
Workflow Diagram¶
The existing diagram is complete and verified against the source SKILL.md and all three referenced command files (code-review-give, code-review-feedback, code-review-tarot). Below is the verified diagram content.
Diagram: code-review¶
Workflow diagrams for the code-review skill: mode routing, self/audit inline modes, give/feedback/tarot command workflows, and cross-reference index.
Overview¶
High-level mode routing, modifiers, and terminal outputs for all four modes.
flowchart TD
classDef dispatch fill:#4a9eff,color:#000
classDef gate fill:#ff6b6b,color:#000
classDef success fill:#51cf66,color:#000
INVOKE["code-review invoked"]
INVOKE --> MODCHECK{"Modifier flags?"}
MODCHECK -->|"--tarot"| TAROT["code-review-tarot\n(wraps active mode)"]:::dispatch
MODCHECK -->|"--pr <num>"| PRFETCH["pr_fetch / pr_diff\n(MCP tools; fallback:\ngh CLI → local diff → paste)"]:::dispatch
MODCHECK -->|"none"| MODECHECK{"Mode flag?"}
TAROT --> MODECHECK{"Mode flag?"}
PRFETCH --> MODECHECK
MODECHECK -->|"--self / -s\nor no flag"| SELF["Self Mode\n(inline)"]:::dispatch
MODECHECK -->|"--feedback / -f"| FEEDBACK["code-review-feedback\ncommand"]:::dispatch
MODECHECK -->|"--give <target>"| GIVE["code-review-give\ncommand"]:::dispatch
MODECHECK -->|"--audit [scope]"| AUDIT["Audit Mode\n(inline)"]:::dispatch
SELF --> SELFOUT(["PASS / WARN / FAIL\n(severity gate)"]):::success
FEEDBACK --> FBOUT(["All items addressed\n+ self-review clean"]):::success
GIVE --> GIVEOUT(["APPROVE /\nREQUEST_CHANGES /\nCOMMENT"]):::success
AUDIT --> AUDITOUT(["Executive Summary\n+ Risk Assessment\nLOW / MEDIUM / HIGH / CRITICAL"]):::success
subgraph LEGEND["Legend"]
LL1["Process"]
LL2{"Decision"}
LL3(["Terminal"]):::success
LL4["Dispatch node"]:::dispatch
LL5["Quality gate"]:::gate
end
Detail A — Self Mode and Audit Mode¶
Both inline modes: Self on the left, Audit on the right.
flowchart TD
classDef dispatch fill:#4a9eff,color:#000
classDef gate fill:#ff6b6b,color:#000
classDef success fill:#51cf66,color:#000
subgraph SELF["Self Mode (--self / -s / default)"]
S1["git diff\n$(git merge-base origin/main HEAD)..HEAD"]
S2["Pass 1: Logic"]
S3["Pass 2: Integration"]
S4["Pass 3: Security"]
S5["Pass 4: Style"]
S6["Generate findings\n(severity + file:line)"]
SGATE{"Highest severity?"}:::gate
SFAIL(["FAIL\n(Critical finding)"]):::success
SWARN(["WARN\n(Important finding)"]):::success
SPASS(["PASS\n(Minor only)"]):::success
S1 --> S2 --> S3 --> S4 --> S5 --> S6 --> SGATE
SGATE -->|"Critical"| SFAIL
SGATE -->|"Important"| SWARN
SGATE -->|"Minor only"| SPASS
end
subgraph AUDIT["Audit Mode (--audit [scope])"]
ASCOPE{"Scope\nresolution"}:::gate
ANONE["none → branch changes"]
AFILE["file.py → single file"]
ADIR["dir/ → directory"]
ASEC["security → security-only"]
AALL["all → full codebase"]
AP1["Pass 1: Correctness\n+ API Hallucination Detection\nchecklist"]
AP2["Pass 2: Security"]
AP3["Pass 3: Performance"]
AP4["Pass 4: Maintainability"]
AP5["Pass 5: Edge Cases"]
AOUT(["Executive Summary\n+ findings by category\n+ Risk Assessment\nLOW / MEDIUM / HIGH / CRITICAL"]):::success
ASCOPE -->|"none"| ANONE
ASCOPE -->|"file.py"| AFILE
ASCOPE -->|"dir/"| ADIR
ASCOPE -->|"security"| ASEC
ASCOPE -->|"all"| AALL
ANONE --> AP1
AFILE --> AP1
ADIR --> AP1
ASEC --> AP2
AALL --> AP1
AP1 --> AP2 --> AP3 --> AP4 --> AP5 --> AOUT
end
subgraph LEGEND["Legend"]
LL1["Process"]
LL2{"Decision"}
LL3(["Terminal"]):::success
LL4["Dispatch node"]:::dispatch
LL5["Quality gate"]:::gate
end
The API Hallucination Detection checklist (method existence, signature match, real config keys, resolvable imports, return-type contracts) runs inside the Correctness pass and is elevated to HIGH severity for AI-generated code.
Detail B — Give Mode (code-review-give)¶
Full step 0–3 workflow with all quality gates.
flowchart TD
classDef dispatch fill:#4a9eff,color:#000
classDef gate fill:#ff6b6b,color:#000
classDef success fill:#51cf66,color:#000
subgraph STEP0["Step 0: Load Project Conventions (BEFORE any review)"]
C1["Read CLAUDE.md /\n.claude/CLAUDE.md"]
C2["Read style config\n(pyproject.toml / .eslintrc /\nbiome.json / setup.cfg)"]
C3["Check docs/code-review-instructions.md\nor .github/code-review-instructions.md"]
C4["Sample 1-2 adjacent files\nNOT in PR changed file set"]
CNOTE["CRITICAL: NEVER read local versions\nof files in the PR changed file set\n(local version is old code)"]:::gate
C1 --> C2 --> C3 --> C4 --> CNOTE
end
subgraph STEP1["Step 1: Fetch and Inventory"]
SRC{"Source type?"}
SRCPR["PR# / URL\npr_fetch + pr_diff MCP\n+ gh pr diff"]:::dispatch
SRCBR["Branch name\ngit diff merge-base..HEAD"]:::dispatch
MANIFEST["Build coverage manifest\nfrom ALL changed files\n(quality gate — must complete\nBEFORE beginning review)"]:::gate
PRIOR["Fetch prior review comments\ngh api pulls/number/comments\ngh api pulls/number/reviews"]:::dispatch
CLASSIFY["Classify each prior item:\nADDRESSED or STILL_OPEN"]
SRC -->|"PR# or URL"| SRCPR
SRC -->|"branch name"| SRCBR
SRCPR --> MANIFEST
SRCBR --> MANIFEST
MANIFEST --> PRIOR --> CLASSIFY
end
subgraph STEP2["Step 2: Multi-Pass Review"]
MANDATORY["Mandatory dimensions — every changed file\n(skip any = coverage failure)"]:::gate
D1["1. Correctness\n(logic errors, off-by-ones,\nnull handling, return types)"]
D2["2. Security\n(injection, auth, secrets,\nSSRF, input length)"]
D3["3. Error Handling\n(missing catches, swallowed errors,\nnull safety, interrupt handling)"]
D4["4. Data Integrity\n(race conditions, non-atomic writes,\nstate mutations)"]
D5["5. API Contracts\n(breaking changes, validation,\nschema drift)"]
D6["6. Test Coverage\n(changes tested, edge cases,\nmeaningful assertions)"]
CONDCHECK{"Conditional\ndimensions\ntriggered?"}
PERF["Performance pass\n(N+1 queries, missing indexes,\nallocations)"]:::dispatch
CONC["Concurrency/Async pass\n(REQUIRED when triggered)\nevent loop blocking, thread safety,\nrace conditions, lock ordering"]:::dispatch
A11Y["Accessibility pass\n(ARIA, keyboard nav,\nscreen readers)"]:::dispatch
SECPASS["Security Pass — runs for every review\nInput validation, path traversal,\nhardcoded secrets, auth/authz,\ninjection, SSRF"]:::gate
MANDATORY --> D1 --> D2 --> D3 --> D4 --> D5 --> D6 --> CONDCHECK
CONDCHECK -->|"hot paths / DB ops"| PERF
CONDCHECK -->|"async / threading present"| CONC
CONDCHECK -->|"UI / HTML / templates"| A11Y
CONDCHECK -->|"none triggered"| SECPASS
PERF --> SECPASS
CONC --> SECPASS
A11Y --> SECPASS
end
subgraph STEP3["Step 3: Output and Post-Review Reflection Gate"]
FORMAT["Format output:\nSummary\n→ Coverage Manifest (N/N files, gaps)\n→ Prior Feedback Reconciliation\n→ Findings (CRITICAL/IMPORTANT/MINOR/QUESTION\nwith file:line + dimension)\n→ Recommendation"]
REFLECT{"Post-review\nreflection gate"}:::gate
RCHECK["Self-check:\nAll files in manifest evaluated?\nAll 6 mandatory dimensions checked?\nSecurity pass (all 6 checks)?\nIf async/threading: concurrency pass?\nAll prior items reconciled?\nSeverity ratings honest (impact-based)?"]
RFIX["Address gaps\nbefore output"]
TERMINAL(["APPROVE /\nREQUEST_CHANGES /\nCOMMENT"]):::success
FORMAT --> REFLECT
REFLECT -->|"gaps found"| RCHECK --> RFIX --> REFLECT
REFLECT -->|"all checks pass"| TERMINAL
end
STEP0 --> STEP1 --> STEP2 --> STEP3
subgraph LEGEND["Legend"]
LL1["Process"]
LL2{"Decision"}
LL3(["Terminal"]):::success
LL4["Dispatch node"]:::dispatch
LL5["Quality gate"]:::gate
end
Detail C — Feedback Mode (code-review-feedback)¶
Full categorize/decide/execute workflow.
flowchart TD
classDef dispatch fill:#4a9eff,color:#000
classDef gate fill:#ff6b6b,color:#000
classDef success fill:#51cf66,color:#000
GATHER["Gather ALL feedback holistically\nacross related PRs\nbefore responding to any item"]:::gate
CATEGORIZE["Categorize each item:\nbug / style / question / suggestion / nit"]
DECIDE{"Decision\nper item"}
ACCEPT["Accept:\nmake the change"]
PUSHBACK["Push back:\ndisagree with evidence"]
CLARIFY["Clarify:\nask specific question"]
DEFER["Defer:\nacknowledge + scope reason"]
RATIONALE["Document rationale\n(WHY for each decision\nbefore responding)"]:::gate
FACTCHECK["Fact-check technical claims\nbefore accepting or disputing"]:::gate
EXECUTE["Execute accepted fixes"]:::dispatch
SELFREV["Re-run self-review\n(--self mode)"]:::dispatch
SELFGATE{"Self-review\nclean?"}:::gate
FIXMORE["Address new findings"]
subgraph TEMPLATES["Response Templates"]
T1["Accept: Fixed in <SHA>"]
T2["Push back: tradeoff + evidence"]
T3["Clarify: specific question"]
T4["Defer: scope + reason"]
end
RESPOND["Respond using templates"]
DONE(["All items addressed\n+ self-review clean"]):::success
GATHER --> CATEGORIZE --> DECIDE
DECIDE -->|"correct, improves code"| ACCEPT
DECIDE -->|"incorrect or harmful"| PUSHBACK
DECIDE -->|"ambiguous"| CLARIFY
DECIDE -->|"valid but out of scope"| DEFER
ACCEPT --> RATIONALE
PUSHBACK --> RATIONALE
CLARIFY --> RATIONALE
DEFER --> RATIONALE
RATIONALE --> FACTCHECK --> EXECUTE --> SELFREV --> SELFGATE
SELFGATE -->|"findings"| FIXMORE --> SELFREV
SELFGATE -->|"clean"| RESPOND --> TEMPLATES --> DONE
subgraph LEGEND["Legend"]
LL1["Process"]
LL2{"Decision"}
LL3(["Terminal"]):::success
LL4["Dispatch node"]:::dispatch
LL5["Quality gate"]:::gate
end
Detail D — Tarot Overlay (code-review-tarot)¶
Persona mapping, roundtable format, and audit integration.
flowchart TD
classDef dispatch fill:#4a9eff,color:#000
classDef gate fill:#ff6b6b,color:#000
classDef success fill:#51cf66,color:#000
OPTIN["Opt-in: --tarot modifier\n(compatible with all modes)"]
subgraph PERSONAS["Persona Mapping"]
PH["Hermit\nRole: Security reviewer\nFocus: Input validation, injection\nStakes: Do NOT trust inputs"]
PP["Priestess\nRole: Architecture reviewer\nFocus: Design patterns, coupling\nStakes: Do NOT commit early"]
PF["Fool\nRole: Assumption challenger\nFocus: Hidden assumptions, edge cases\nStakes: Do NOT accept hidden complexity"]
PM["Magician\nRole: Synthesis / verdict\nFocus: Final assessment\nStakes: Clarity determines everything"]
end
subgraph ROUNDTABLE["Roundtable Format"]
RT1["Magician opens"]
RT2["Hermit speaks\n(security findings)"]
RT3["Priestess speaks\n(architecture findings)"]
RT4["Fool speaks\n(assumption challenges)"]
RT5["Magician synthesizes\nby evidence weight\nNOT majority vote"]:::gate
RT1 --> RT2 --> RT3 --> RT4 --> RT5
end
subgraph AUDITINT["Audit + Tarot Integration\n(parallel subagent prompts)"]
AI1["Security pass\n→ Hermit persona"]:::dispatch
AI2["Architecture pass\n→ Priestess persona"]:::dispatch
AI3["Assumption pass\n→ Fool persona"]:::dispatch
AI4["Synthesis\n→ Magician persona"]:::dispatch
AI1 --> AI4
AI2 --> AI4
AI3 --> AI4
end
RULES["Critical rules:\nPersona dialogue appears ONLY in dialogue sections\nNEVER in code suggestions or formal findings\nSynthesis by evidence weight (NOT majority vote)\nfile:line citations required even in persona dialogue"]:::gate
TOUT(["Tarot-annotated\nreview output"]):::success
OPTIN --> PERSONAS
OPTIN --> ROUNDTABLE
OPTIN --> AUDITINT
PERSONAS --> RULES
ROUNDTABLE --> RULES
AUDITINT --> RULES
RULES --> TOUT
subgraph LEGEND["Legend"]
LL1["Process"]
LL2{"Decision"}
LL3(["Terminal"]):::success
LL4["Dispatch node"]:::dispatch
LL5["Quality gate"]:::gate
end
Cross-Reference Table¶
| Overview node | Detail diagram | Section |
|---|---|---|
| Self Mode (inline) | Detail A | Self Mode subgraph |
| Audit Mode (inline) | Detail A | Audit Mode subgraph |
| code-review-give command | Detail B | Steps 0–3 |
| code-review-feedback command | Detail C | Full workflow |
| code-review-tarot command | Detail D | Personas + Roundtable + Audit integration |
| pr_fetch / pr_diff modifier | Detail B | Step 1: Fetch and Inventory |
| Severity gate | Detail A | Self Mode — highest severity decision node |
| Coverage manifest gate | Detail B | Step 1: Build coverage manifest |
| Post-review reflection gate | Detail B | Step 3: Reflection gate |
| Synthesis by evidence weight | Detail D | Roundtable — Magician synthesizes node |
Skill Content¶
# Code Review
<ROLE>
Code Review Specialist. Catch real issues. Respect developer time.
</ROLE>
<analysis>
Unified skill routes to specialized handlers via mode flags.
Self-review catches issues early. Feedback mode processes received comments. Give mode provides helpful reviews. Audit mode does deep security/quality passes.
</analysis>
## Invariant Principles
1. **Evidence Over Assertion** - Every finding needs file:line reference
2. **Severity Honesty** - Critical=security/data loss; Important=correctness; Minor=style
3. **Context Awareness** - Same code may warrant different severity in different contexts
4. **Respect Time** - False positives erode trust; prioritize signal
## Inputs
| Input | Required | Description |
|-------|----------|-------------|
| `args` | Yes | Mode flags and targets |
| `git diff` | Auto | Changed files |
| `PR data` | If --pr | PR metadata via GitHub |
## Outputs
| Output | Type | Description |
|--------|------|-------------|
| `findings` | List | Issues with severity, file:line |
| `status` | Enum | PASS/WARN/FAIL or APPROVE/REQUEST_CHANGES |
## Mode Router
| Flag | Mode | Command File |
|------|------|-------------|
| `--self`, `-s`, (default: no flag given) | Pre-PR self-review | (inline below) |
| `--feedback`, `-f` | Process received feedback | `code-review-feedback` |
| `--give <target>` | Review someone else's code | `code-review-give` |
| `--audit [scope]` | Multi-pass deep-dive | (inline below) |
**Modifiers:** `--tarot` (roundtable dialogue via `code-review-tarot`), `--pr <num>` (PR source)
---
## MCP Tool Integration
| Tool | Purpose |
|------|---------|
| `pr_fetch(num_or_url)` | Fetch PR metadata and diff |
| `pr_diff(raw_diff)` | Parse diff into FileDiff objects |
| `pr_match_patterns(files, root)` | Heuristic pre-filtering |
| `pr_files(pr_result)` | Extract file list |
MCP tools for read/analyze. `gh` CLI for write operations (posting reviews, replies). Fallback: MCP unavailable -> gh CLI -> local diff -> manual paste.
---
## Self Mode (`--self`)
<reflection>
Self-review finds what you missed. Assume bugs exist. Hunt them.
</reflection>
**Workflow:**
1. Get diff: `git diff $(git merge-base origin/main HEAD)..HEAD`
2. Multi-pass: Logic > Integration > Security > Style
3. Generate findings with severity, file:line, description
Example finding: `src/auth/login.py:42 [Critical] Token written to log — data exposure risk`
4. Gate: Critical=FAIL, Important=WARN, Minor only=PASS
---
## Audit Mode (`--audit [scope]`)
Scopes: (none)=branch changes, file.py, dir/, security, all
**Passes:** Correctness > Security > Performance > Maintainability > Edge Cases
**API Hallucination Detection (Correctness Pass):**
During the Correctness pass, check for API hallucination patterns:
- [ ] Method calls use APIs that exist in the imported library version (not invented methods)
- [ ] Function signatures match actual library definitions (parameter names, types, order)
- [ ] Configuration keys and environment variables are real (not plausible-sounding inventions)
- [ ] Import paths resolve to actual modules (not hallucinated package structures)
- [ ] Return types match actual API contracts (not assumed shapes)
When reviewing AI-generated code, these checks are elevated to HIGH severity. LLMs frequently generate syntactically valid but non-existent API calls that pass linting but fail at runtime.
Output: Executive Summary, findings by category (same severity thresholds as Self Mode), Risk Assessment (LOW/MEDIUM/HIGH/CRITICAL)
---
<FORBIDDEN>
- Skip self-review for "small" changes
- Ignore Critical findings
- Dismiss feedback without evidence
- Give vague feedback without file:line
- Approve to avoid conflict
- Rate severity by effort instead of impact
</FORBIDDEN>
## Self-Check
- [ ] Correct mode identified
- [ ] All findings have file:line
- [ ] Severity based on impact, not effort
- [ ] Output matches mode spec
<FINAL_EMPHASIS>
Every finding without file:line is noise. Every severity inflated by effort is a lie. Your credibility as a reviewer depends on signal quality — accurate severity, concrete evidence, zero false positives that waste developer time.
</FINAL_EMPHASIS>