Skip to content

code-review

Quick code review covering correctness, style, and common issues across four modes: self-review before PRs, processing received feedback, reviewing others' code, and deep audit passes. Catches real issues with file-and-line references and honest severity classification. A core spellbook capability for routine review of changes before committing.

Auto-invocation: Your coding assistant will automatically invoke this skill when it detects a matching trigger.

Use when reviewing code. Triggers: 'review my code', 'check my work', 'look over this', 'review PR #X', 'PR comments to address', 'reviewer said', 'address feedback', 'self-review before PR', 'audit this code'. For heavyweight multi-phase analysis, use advanced-code-review instead.

Workflow Diagram

Code Review Skill Diagrams

Overview: Mode Router and High-Level Flow

flowchart TD
    subgraph Legend
        L1[Process]
        L2{Decision}
        L3([Terminal])
        L4[/"Subagent Dispatch"/]
        L5[[Quality Gate]]
    end

    style L4 fill:#4a9eff,color:#fff
    style L5 fill:#ff6b6b,color:#fff
    style L3 fill:#51cf66,color:#fff

    START([code-review invoked]) --> PARSE[Parse args:<br>mode flags + modifiers]
    PARSE --> TAROT_CHECK{--tarot<br>modifier?}
    TAROT_CHECK -->|Yes| LOAD_TAROT[Load code-review-tarot<br>persona mapping]
    TAROT_CHECK -->|No| MODE
    LOAD_TAROT --> MODE

    MODE{Mode?}
    MODE -->|--self / default| SELF[Self Mode<br>Pre-PR self-review]
    MODE -->|--feedback| FEEDBACK[/"Dispatch:<br>code-review-feedback"/]
    MODE -->|"--give target"| GIVE[/"Dispatch:<br>code-review-give"/]
    MODE -->|"--audit [scope]"| AUDIT[Audit Mode<br>Deep single-pass]

    SELF --> SELF_GATE[[Gate:<br>Critical=FAIL<br>Important=WARN<br>Minor=PASS]]
    FEEDBACK --> FB_OUT([Categorized responses<br>+ re-run self-review])
    GIVE --> GIVE_OUT[[Recommendation:<br>APPROVE /<br>REQUEST_CHANGES /<br>COMMENT]]
    AUDIT --> AUDIT_GATE[[Gate:<br>Risk Assessment<br>LOW/MED/HIGH/CRITICAL]]

    SELF_GATE --> DONE([Review Complete])
    FB_OUT --> DONE
    GIVE_OUT --> DONE
    AUDIT_GATE --> DONE

    style FEEDBACK fill:#4a9eff,color:#fff
    style GIVE fill:#4a9eff,color:#fff
    style SELF_GATE fill:#ff6b6b,color:#fff
    style GIVE_OUT fill:#ff6b6b,color:#fff
    style AUDIT_GATE fill:#ff6b6b,color:#fff
    style DONE fill:#51cf66,color:#fff
    style START fill:#51cf66,color:#fff

Cross-Reference Table

Overview Node Detail Diagram
Self Mode Self Mode Detail
Feedback (code-review-feedback) Feedback Mode Detail
Give (code-review-give) Give Mode Detail
Audit Mode Audit Mode Detail
Tarot Integration Tarot Modifier Detail

Self Mode Detail

flowchart TD
    subgraph Legend
        L1[Process]
        L2{Decision}
        L3([Terminal])
        L5[[Quality Gate]]
    end
    style L5 fill:#ff6b6b,color:#fff
    style L3 fill:#51cf66,color:#fff

    S0([Self Mode Start]) --> S1["Get diff:<br>git diff merge-base..HEAD"]
    S1 --> S2["Memory Priming:<br>memory_recall(review findings)"]
    S2 --> S2B{spellbook-memory<br>context from files?}
    S2B -->|Yes| S2C[Incorporate file-specific<br>+ project-wide patterns]
    S2B -->|No| S3
    S2C --> S3

    S3[Multi-Pass Review]
    S3 --> P1[Pass 1: Logic]
    P1 --> P2[Pass 2: Integration]
    P2 --> P3[Pass 3: Security]
    P3 --> P4[Pass 4: Style]

    P4 --> S4[Generate findings:<br>severity + file:line + description]
    S4 --> S5["Persist significant findings:<br>memory_store_memories()"]
    S5 --> S5A{Finding type?}
    S5A -->|Confirmed issue| S5B["Store as antipattern<br>(warns future reviewers)"]
    S5A -->|False positive| S5C["Store as fact<br>tag: false-positive"]
    S5A -->|Minor / one-off| S5D[Do not store]
    S5B --> S6
    S5C --> S6
    S5D --> S6

    S6[[Severity Gate]]
    S6 --> G1{Highest severity?}
    G1 -->|Critical| FAIL([FAIL])
    G1 -->|Important| WARN([WARN])
    G1 -->|Minor only| PASS([PASS])

    style S0 fill:#51cf66,color:#fff
    style S6 fill:#ff6b6b,color:#fff
    style FAIL fill:#ff6b6b,color:#fff
    style WARN fill:#f59f00,color:#fff
    style PASS fill:#51cf66,color:#fff

Feedback Mode Detail

Source: commands/code-review-feedback.md

flowchart TD
    subgraph Legend
        L1[Process]
        L2{Decision}
        L3([Terminal])
        L4[/"Subagent / Command"/]
    end
    style L4 fill:#4a9eff,color:#fff
    style L3 fill:#51cf66,color:#fff

    F0([Feedback Mode Start]) --> F1[Gather ALL feedback<br>across related PRs]
    F1 --> F2[Categorize each item:<br>bug / style / question<br>/ suggestion / nit]

    F2 --> F3{Decision<br>per item}
    F3 -->|Correct, improves code| ACCEPT[Accept:<br>make the change]
    F3 -->|Incorrect or harmful| PUSH[Push back:<br>disagree with evidence]
    F3 -->|Ambiguous| CLARIFY[Clarify:<br>ask questions]
    F3 -->|Valid but out of scope| DEFER[Defer:<br>acknowledge + follow-up]

    ACCEPT --> F4
    PUSH --> F4
    CLARIFY --> F4
    DEFER --> F4

    F4[Document rationale<br>for each decision]
    F4 --> F5[Fact-check:<br>verify technical claims]
    F5 --> F6[Execute fixes]
    F6 --> F7[/"Re-run self-review<br>(Self Mode)"/]
    F7 --> DONE([Responses sent<br>with templates])

    style F0 fill:#51cf66,color:#fff
    style F7 fill:#4a9eff,color:#fff
    style DONE fill:#51cf66,color:#fff

Give Mode Detail

Source: commands/code-review-give.md

flowchart TD
    subgraph Legend
        L1[Process]
        L2{Decision}
        L3([Terminal])
        L5[[Quality Gate]]
    end
    style L5 fill:#ff6b6b,color:#fff
    style L3 fill:#51cf66,color:#fff

    G0([Give Mode Start]) --> G0A["Parse target:<br>PR#, URL, branch"]

    G0A --> G0B["Step 0: Load Project Conventions<br>Read CLAUDE.md, style configs,<br>code-review-instructions.md"]
    G0B --> G0C["Sample adjacent files<br>(NOT in changed file set)"]

    G0C --> G1["Step 1: Fetch and Inventory"]
    G1 --> G1A["Get diff via gh pr diff<br>or git diff"]
    G1A --> G1B["Build Coverage Manifest:<br>ALL changed files"]
    G1B --> G1C["Fetch prior PR feedback<br>via gh api"]
    G1C --> G1D{Prior feedback<br>exists?}
    G1D -->|Yes| G1E[Classify each:<br>ADDRESSED / STILL_OPEN]
    G1D -->|No| G2
    G1E --> G2

    G2["Step 2: Multi-Pass Review"]
    G2 --> MAND["Mandatory Dimensions<br>(all 6 per file)"]

    MAND --> D1["Correctness"]
    MAND --> D2["Security"]
    MAND --> D3["Error handling"]
    MAND --> D4["Data integrity"]
    MAND --> D5["API contracts"]
    MAND --> D6["Test coverage"]

    D1 & D2 & D3 & D4 & D5 & D6 --> COND{Conditional<br>triggers?}

    COND -->|Hot paths / DB ops| PERF[Performance pass]
    COND -->|Async / threading| CONC[Concurrency pass]
    COND -->|UI / frontend| A11Y[Accessibility pass]
    COND -->|None triggered| SEC

    PERF --> SEC
    CONC --> SEC
    A11Y --> SEC

    SEC["Security Pass<br>(always required):<br>Input validation, path traversal,<br>secrets, auth, injection, SSRF"]

    SEC --> CONC_CHECK{Async/threading<br>in diff?}
    CONC_CHECK -->|Yes| CONC_PASS["Concurrency Pass:<br>Event loop blocking,<br>thread safety, races,<br>interrupt handling,<br>lock ordering"]
    CONC_CHECK -->|No| G3

    CONC_PASS --> G3

    G3["Step 3: Output"]
    G3 --> MANIFEST[[Coverage Manifest Check:<br>all files evaluated?]]
    MANIFEST --> GAP{Gaps?}
    GAP -->|Yes| REPORT_GAP[Report coverage gaps]
    GAP -->|No| RECONCILE
    REPORT_GAP --> RECONCILE

    RECONCILE[Prior Feedback<br>Reconciliation]
    RECONCILE --> FINDINGS[Format findings:<br>severity + file:line<br>+ dimension + suggestion]

    FINDINGS --> VERDICT[[Recommendation Gate]]
    VERDICT --> V1{Verdict?}
    V1 -->|No issues / minor| APPROVE([APPROVE])
    V1 -->|Significant issues| REQ_CHANGES([REQUEST_CHANGES])
    V1 -->|Needs discussion| COMMENT([COMMENT])

    style G0 fill:#51cf66,color:#fff
    style MANIFEST fill:#ff6b6b,color:#fff
    style VERDICT fill:#ff6b6b,color:#fff
    style APPROVE fill:#51cf66,color:#fff
    style REQ_CHANGES fill:#ff6b6b,color:#fff
    style COMMENT fill:#f59f00,color:#fff

Audit Mode Detail

flowchart TD
    subgraph Legend
        L1[Process]
        L2{Decision}
        L3([Terminal])
        L5[[Quality Gate]]
    end
    style L5 fill:#ff6b6b,color:#fff
    style L3 fill:#51cf66,color:#fff

    A0([Audit Mode Start]) --> A0A{Scope?}
    A0A -->|"(none)"| BRANCH[Branch changes]
    A0A -->|file.py| FILE[Single file]
    A0A -->|"dir/"| DIR[Directory]
    A0A -->|security| SECONLY[Security-only audit]
    A0A -->|all| ALL[Full codebase]

    BRANCH & FILE & DIR & SECONLY & ALL --> A1

    A1["Memory Priming:<br>memory_recall(review findings)"]
    A1 --> A1B{spellbook-memory<br>context?}
    A1B -->|Yes| A1C[Incorporate patterns]
    A1B -->|No| A2
    A1C --> A2

    A2[Multi-Pass Audit]
    A2 --> P1[Pass 1: Correctness]
    P1 --> P2[Pass 2: Security]
    P2 --> P3[Pass 3: Performance]
    P3 --> P4[Pass 4: Maintainability]
    P4 --> P5[Pass 5: Edge Cases]

    P5 --> A3[Generate output:<br>Executive Summary +<br>Findings by category]

    A3 --> A4["Persist significant findings:<br>memory_store_memories()"]
    A4 --> A5[[Risk Assessment Gate]]
    A5 --> R{Risk level?}
    R -->|No issues| LOW([LOW])
    R -->|Minor concerns| MED([MEDIUM])
    R -->|Significant issues| HIGH([HIGH])
    R -->|Security / data loss| CRIT([CRITICAL])

    style A0 fill:#51cf66,color:#fff
    style A5 fill:#ff6b6b,color:#fff
    style LOW fill:#51cf66,color:#fff
    style MED fill:#f59f00,color:#fff
    style HIGH fill:#ff6b6b,color:#fff
    style CRIT fill:#ff6b6b,color:#fff

Tarot Modifier Detail

Source: commands/code-review-tarot.md

Applied as an overlay when --tarot flag is present on any mode.

flowchart TD
    subgraph Legend
        L1[Process]
        L2{Decision}
        L3([Terminal])
        L4[/"Subagent Dispatch"/]
    end
    style L4 fill:#4a9eff,color:#fff
    style L3 fill:#51cf66,color:#fff

    T0([--tarot flag detected]) --> T1{Which mode?}

    T1 -->|--self| TS[Self Mode +<br>Tarot dialogue wrapper]
    T1 -->|--give| TG[Give Mode +<br>Tarot dialogue wrapper]
    T1 -->|--audit| TA[Audit Mode +<br>Persona-per-pass]

    TA --> TA1[/"Hermit subagent:<br>Security Pass"/]
    TA --> TA2[/"Priestess subagent:<br>Architecture Pass"/]
    TA --> TA3[/"Fool subagent:<br>Assumption Pass"/]

    TA1 & TA2 & TA3 --> TA4[Magician: Synthesize<br>by evidence weight<br>(not majority vote)]

    TS --> DIALOG
    TG --> DIALOG

    DIALOG[Roundtable Dialogue Format]
    DIALOG --> D1["Magician opens:<br>Review convenes"]
    D1 --> D2["Hermit examines:<br>Security findings"]
    D2 --> D3["Priestess studies:<br>Architecture findings"]
    D3 --> D4["Fool challenges:<br>Hidden assumptions"]
    D4 --> D5["Magician synthesizes:<br>Final verdict"]

    TA4 --> SEP
    D5 --> SEP

    SEP[Code Output Separation:<br>Persona in dialogue ONLY,<br>formal findings persona-free]
    SEP --> DONE([Continue to<br>mode-specific output])

    style T0 fill:#51cf66,color:#fff
    style TA1 fill:#4a9eff,color:#fff
    style TA2 fill:#4a9eff,color:#fff
    style TA3 fill:#4a9eff,color:#fff
    style DONE fill:#51cf66,color:#fff

MCP Tool Integration

flowchart LR
    subgraph "MCP Tools - Read/Analyze"
        PR_FETCH["pr_fetch(num_or_url)"]
        PR_DIFF["pr_diff(raw_diff)"]
        PR_MATCH["pr_match_patterns(files, root)"]
        PR_FILES["pr_files(pr_result)"]
    end

    subgraph "CLI - Write Operations"
        GH["gh CLI:<br>post reviews, replies"]
    end

    subgraph "Memory Tools"
        RECALL["memory_recall()"]
        STORE["memory_store_memories()"]
    end

    subgraph "Fallback Chain"
        direction TB
        F1{MCP available?} -->|Yes| F2[Use MCP tools]
        F1 -->|No| F3{gh CLI available?}
        F3 -->|Yes| F4[Use gh CLI]
        F3 -->|No| F5{Local diff available?}
        F5 -->|Yes| F6[Use git diff]
        F5 -->|No| F7[Manual paste]
    end

Skill Content

# Code Review

<ROLE>
Code Review Specialist. Catch real issues. Respect developer time.
</ROLE>

<analysis>
Unified skill routes to specialized handlers via mode flags.
Self-review catches issues early. Feedback mode processes received comments. Give mode provides helpful reviews. Audit mode does deep security/quality passes.
</analysis>

## Invariant Principles

1. **Evidence Over Assertion** - Every finding needs file:line reference
2. **Severity Honesty** - Critical=security/data loss; Important=correctness; Minor=style
3. **Context Awareness** - Same code may warrant different severity in different contexts
4. **Respect Time** - False positives erode trust; prioritize signal

## Inputs

| Input | Required | Description |
|-------|----------|-------------|
| `args` | Yes | Mode flags and targets |
| `git diff` | Auto | Changed files |
| `PR data` | If --pr | PR metadata via GitHub |

## Outputs

| Output | Type | Description |
|--------|------|-------------|
| `findings` | List | Issues with severity, file:line |
| `status` | Enum | PASS/WARN/FAIL or APPROVE/REQUEST_CHANGES |

## Mode Router

| Flag | Mode | Command File |
|------|------|-------------|
| `--self`, `-s`, (default: no flag given) | Pre-PR self-review | (inline below) |
| `--feedback`, `-f` | Process received feedback | `code-review-feedback` |
| `--give <target>` | Review someone else's code | `code-review-give` |
| `--audit [scope]` | Multi-pass deep-dive | (inline below) |

**Modifiers:** `--tarot` (roundtable dialogue via `code-review-tarot`), `--pr <num>` (PR source)

---

## MCP Tool Integration

| Tool | Purpose |
|------|---------|
| `pr_fetch(num_or_url)` | Fetch PR metadata and diff |
| `pr_diff(raw_diff)` | Parse diff into FileDiff objects |
| `pr_match_patterns(files, root)` | Heuristic pre-filtering |
| `pr_files(pr_result)` | Extract file list |

MCP tools for read/analyze. `gh` CLI for write operations (posting reviews, replies). Fallback: MCP unavailable -> gh CLI -> local diff -> manual paste.

---

## Self Mode (`--self`)

<reflection>
Self-review finds what you missed. Assume bugs exist. Hunt them.
</reflection>

**Workflow:**
1. Get diff: `git diff $(git merge-base origin/main HEAD)..HEAD`
2. **Memory Priming:** Before starting review passes, call `memory_recall(query="review finding [project_or_module]")` to surface:
   - Recurring issues in this codebase (focus review effort here)
   - Known false positives (avoid re-flagging accepted patterns)
   - Prior review decisions (respect precedent unless circumstances changed)
   If you received `<spellbook-memory>` context from reading the files under review, incorporate that as well. The explicit recall supplements auto-injection by surfacing project-wide patterns, not just file-specific ones.
3. Multi-pass: Logic > Integration > Security > Style
4. Generate findings with severity, file:line, description

Example finding: `src/auth/login.py:42 [Critical] Token written to log — data exposure risk`

5. **Persist Review Findings:** After finalizing findings, store significant ones for future reviews:
   ```
   memory_store_memories(memories='{"memories": [{"content": "[Finding description]. Severity: [level]. Status: [confirmed/false_positive/deferred].", "memory_type": "[fact or antipattern]", "tags": ["review", "[finding_category]", "[module]"], "citations": [{"file_path": "[reviewed_file]", "line_range": "[lines]"}]}]}')
   ```
   - Confirmed issues: memory_type = "antipattern" (warns future reviewers)
   - Confirmed false positives: memory_type = "fact" with tag "false-positive" (prevents re-flagging)
   - Do NOT store every minor finding. Store only: recurring patterns, surprising discoveries, and false positive determinations.
6. Gate: Critical=FAIL, Important=WARN, Minor only=PASS

---

## Audit Mode (`--audit [scope]`)

Scopes: (none)=branch changes, file.py, dir/, security, all

**Memory Priming:** Before starting audit passes, call `memory_recall(query="review finding [project_or_module]")` to surface recurring issues, known false positives, and prior review decisions. Incorporate any `<spellbook-memory>` context from files under audit as well.

**Passes:** Correctness > Security > Performance > Maintainability > Edge Cases

**API Hallucination Detection (Correctness Pass):**

During the Correctness pass, check for API hallucination patterns:

- [ ] Method calls use APIs that exist in the imported library version (not invented methods)
- [ ] Function signatures match actual library definitions (parameter names, types, order)
- [ ] Configuration keys and environment variables are real (not plausible-sounding inventions)
- [ ] Import paths resolve to actual modules (not hallucinated package structures)
- [ ] Return types match actual API contracts (not assumed shapes)

When reviewing AI-generated code, these checks are elevated to HIGH severity. LLMs frequently generate syntactically valid but non-existent API calls that pass linting but fail at runtime.

Output: Executive Summary, findings by category (same severity thresholds as Self Mode), Risk Assessment (LOW/MEDIUM/HIGH/CRITICAL)

**Persist Review Findings:** After finalizing audit findings, store significant ones using the same protocol as Self Mode (see step 5 above). Audit findings are especially valuable to persist given the depth of analysis.

---

<FORBIDDEN>
- Skip self-review for "small" changes
- Ignore Critical findings
- Dismiss feedback without evidence
- Give vague feedback without file:line
- Approve to avoid conflict
- Rate severity by effort instead of impact
</FORBIDDEN>

## Self-Check

- [ ] Correct mode identified
- [ ] All findings have file:line
- [ ] Severity based on impact, not effort
- [ ] Output matches mode spec

<FINAL_EMPHASIS>
Every finding without file:line is noise. Every severity inflated by effort is a lie. Your credibility as a reviewer depends on signal quality — accurate severity, concrete evidence, zero false positives that waste developer time.
</FINAL_EMPHASIS>