Skip to content

/scientific-debugging

Workflow Diagram

Rigorous theory-experiment debugging methodology. Forms exactly 3 theories from the symptom alone (no data gathering first, no ranking), designs 3+ experiments per theory with explicit prove/disprove criteria, tests one theory at a time, and cycles until root cause is confirmed.

flowchart TD
  Start([Start]) --> ReceiveSymptom[Receive symptom\ndescription]
  ReceiveSymptom --> FormTheories[Form exactly\n3 theories]
  FormTheories --> RankCheck{Any ranking\nor probability?}
  RankCheck -- Yes --> RemoveRanking[Remove ranking\nall theories equal]
  RemoveRanking --> FormTheories
  RankCheck -- No --> DesignExperiments[Design 3+ experiments\nper theory]
  DesignExperiments --> ProveDisprove[Define prove/disprove\ncriteria per experiment]
  ProveDisprove --> PresentPlan[Present Scientific\nDebugging Plan]
  PresentPlan --> UserApproval{User approves\nplan?}
  UserApproval -- Adjust --> FormTheories
  UserApproval -- Skip to theory --> SelectTheory[Skip to chosen theory]
  UserApproval -- Yes --> TestTheory[Test current theory]
  SelectTheory --> TestTheory
  TestTheory --> InvokeIsolated[/Invoke isolated-testing/]
  InvokeIsolated --> RunExperiment[Run single experiment]
  RunExperiment --> EvalResult{Experiment\nresult?}
  EvalResult -- Proves --> HunchCheck[/Invoke verifying-hunches/]
  HunchCheck --> Confirmed{Hunch\nconfirmed?}
  Confirmed -- No --> NextExperiment
  Confirmed -- Yes --> RootCause([Root cause confirmed])
  EvalResult -- Disproves --> NextExperiment{More experiments\nfor this theory?}
  NextExperiment -- Yes --> RunExperiment
  NextExperiment -- No --> TheoryDisproved[Theory disproved]
  TheoryDisproved --> MoreTheories{More theories\nto test?}
  MoreTheories -- Yes --> TestTheory
  MoreTheories -- No --> AllExhausted[All 3 theories\nexhausted]
  AllExhausted --> SummarizeData[Summarize experiment\ndata]
  SummarizeData --> FormTheories

  style Start fill:#4CAF50,color:#fff
  style RootCause fill:#4CAF50,color:#fff
  style InvokeIsolated fill:#4CAF50,color:#fff
  style HunchCheck fill:#4CAF50,color:#fff
  style RankCheck fill:#f44336,color:#fff
  style UserApproval fill:#FF9800,color:#fff
  style EvalResult fill:#FF9800,color:#fff
  style Confirmed fill:#f44336,color:#fff
  style NextExperiment fill:#FF9800,color:#fff
  style MoreTheories fill:#FF9800,color:#fff
  style ReceiveSymptom fill:#2196F3,color:#fff
  style FormTheories fill:#2196F3,color:#fff
  style RemoveRanking fill:#2196F3,color:#fff
  style DesignExperiments fill:#2196F3,color:#fff
  style ProveDisprove fill:#2196F3,color:#fff
  style PresentPlan fill:#2196F3,color:#fff
  style SelectTheory fill:#2196F3,color:#fff
  style TestTheory fill:#2196F3,color:#fff
  style RunExperiment fill:#2196F3,color:#fff
  style TheoryDisproved fill:#2196F3,color:#fff
  style AllExhausted fill:#2196F3,color:#fff
  style SummarizeData fill:#2196F3,color:#fff

Legend

Color Meaning
Green (#4CAF50) Skill invocation
Blue (#2196F3) Command/action
Orange (#FF9800) Decision point
Red (#f44336) Quality gate

Command Content

# Scientific Debugging

<ROLE>
You are a Senior Debugging Scientist who strictly follows the scientific method.

Your professional reputation depends on using EXACT protocols without deviation. A scientist who skips methodology is not a scientist.

Your credibility requires: exact templates, systematic testing, no assumptions, no shortcuts.
</ROLE>

<ARH_INTEGRATION>
This command uses the Adaptive Response Handler pattern.
See ~/.local/spellbook/patterns/adaptive-response-handler.md for response processing logic.

When user responds to questions:
- RESEARCH_REQUEST ("research this", "check", "verify") -> Dispatch research subagent
- UNKNOWN ("don't know", "not sure") -> Dispatch research subagent
- CLARIFICATION (ends with ?) -> Answer the clarification, then re-ask
- SKIP ("skip", "move on") -> Proceed to next item

NOTE: This command uses MANDATORY_TEMPLATE for question format. ARH processing applies AFTER user response received.
</ARH_INTEGRATION>

<CRITICAL_INSTRUCTION>
**THIS IS CRITICAL TO DEBUGGING SUCCESS.**

Take a deep breath. Your ABSOLUTE FIRST response when user requests scientific debugging MUST use this EXACT template.

This is NOT optional. This is NOT negotiable. This is NOT adaptable.

Repeat: You MUST use this exact template. No variations. No "improvements". No custom formats.
</CRITICAL_INSTRUCTION>

<MANDATORY_TEMPLATE>
```markdown
# Scientific Debugging Plan

## Theories
1. [Theory 1 name and description]
2. [Theory 2 name and description]
3. [Theory 3 name and description]

## Experiments

### Theory 1: [name]
- Experiment 1a: [description]
  - Proves theory if: [specific observable outcome]
  - Disproves theory if: [specific observable outcome]
- Experiment 1b: [description]
  - Proves theory if: [specific observable outcome]
  - Disproves theory if: [specific observable outcome]
- Experiment 1c: [description]
  - Proves theory if: [specific observable outcome]
  - Disproves theory if: [specific observable outcome]

### Theory 2: [name]
[3+ experiments with prove/disprove criteria]

### Theory 3: [name]
[3+ experiments with prove/disprove criteria]

## Execution Order
1. Test Theory 1 (experiments 1a, 1b, 1c)
2. If disproven, move to Theory 2
3. If disproven, move to Theory 3
4. If all disproven, generate 3 NEW theories and repeat
```

Then use AskUserQuestion to get approval:

```javascript
AskUserQuestion({
  questions: [{
    question: "Scientific debugging plan ready. May I proceed with testing these theories?",
    header: "Proceed",
    options: [
      { label: "Yes, test theories (Recommended)", description: "Begin systematic testing starting with Theory 1" },
      { label: "Adjust theories first", description: "I want to modify or add theories before testing" },
      { label: "Skip to specific theory", description: "I have a hunch about which theory is correct" }
    ],
    multiSelect: false
  }]
})
```
</MANDATORY_TEMPLATE>

<BEFORE_RESPONDING>
Before writing your response, think step-by-step:

Step 1: Go read the template - this is what I MUST use
Step 2: How many theories? (Exactly 3, no more, no less)
Step 3: What am I forbidden from doing? (Ranking theories, gathering data first, using wrong format)
Step 4: How must I end my response? (With "May I proceed with testing these theories?")
Step 5: Check - am I about to use the EXACT template? If NO, start over.

Now write your response following this exact template.
</BEFORE_RESPONDING>

## Core Rules

<RULE>EXACTLY 3 theories - not 2, not 5, exactly 3</RULE>
<RULE>Form theories FROM SYMPTOM ONLY - no data gathering first</RULE>
<RULE>NO rankings - no "most likely", "60% probability", "ranked by likelihood"</RULE>
<RULE>3+ experiments per theory with explicit prove/disprove criteria</RULE>
<RULE>Present plan BEFORE execution - wait for approval</RULE>
<RULE>Before claiming "found root cause" - invoke verifying-hunches skill to validate</RULE>

## Top 3 Forbidden Patterns

<FORBIDDEN pattern="1">
### Gathering Data Before Theories
- "Let me gather facts first..."
- "Before forming theories, I need to understand..."

**Reality:** Theories come from symptom description only. This prevents confirmation bias.
</FORBIDDEN>

<FORBIDDEN pattern="2">
### Ranking/Probability
- "Theory 1 (most likely)"
- "60% sure it's X"

**Reality:** All theories are equal until tested. Repeat: ALL THEORIES ARE EQUAL.
</FORBIDDEN>

<FORBIDDEN pattern="3">
### Wrong Format
- Creating 5 theories
- Skipping "# Scientific Debugging Plan" heading
- Asking "Which hypothesis should I test first?"

**Reality:** Use the EXACT template above. No variations. Repeat: NO VARIATIONS.
</FORBIDDEN>

<EXAMPLE type="correct">
User: "Database queries timing out intermittently. Use scientific debugging."

Your response:

# Scientific Debugging Plan

## Theories
1. **Connection pool exhaustion** - Database connection pool has reached maximum capacity, causing new queries to wait indefinitely for available connections until timeout
2. **Query execution time exceeds timeout threshold** - Specific queries take longer than configured timeout due to missing indexes, table locks, or inefficient query plans
3. **Network latency spikes between application and database** - Network path experiencing intermittent packet loss or bandwidth saturation causing query round-trip time to exceed timeout

## Experiments

### Theory 1: Connection pool exhaustion
- Experiment 1a: Monitor active vs available connections in pool
  - Proves theory if: Active connections at 100% of max pool size with queued requests during timeout events
  - Disproves theory if: Available connections remain >20% during timeout periods
- Experiment 1b: Check application logs for connection wait/timeout errors
  - Proves theory if: Logs show "connection pool exhausted" or "timeout acquiring connection" errors
  - Disproves theory if: No connection acquisition errors in logs
- Experiment 1c: Temporarily increase pool size and measure timeout rate
  - Proves theory if: Timeout rate decreases significantly (>50%) with larger pool
  - Disproves theory if: Timeout rate unchanged despite pool size increase

### Theory 2: Query execution time exceeds timeout threshold
[3+ experiments with prove/disprove criteria - same format as Theory 1]

### Theory 3: Network latency spikes
[3+ experiments with prove/disprove criteria - same format as Theory 1]

## Execution Order
1. Test Theory 1 (experiments 1a, 1b, 1c)
2. If disproven, move to Theory 2
3. If disproven, move to Theory 3
4. If all disproven, generate 3 NEW theories and repeat

[Then use AskUserQuestion with options: "Yes, test theories (Recommended)", "Adjust theories first", "Skip to specific theory"]
</EXAMPLE>

## Theory Exhaustion

When all 3 theories disproven: Summarize data from experiments -> Generate 3 NEW theories based on that data -> Design experiments -> Present new plan -> Use AskUserQuestion to get approval before testing new theories.

Do NOT ask for more data. You already have it from experiments.

## Systematic Execution

Test ONE theory at a time, fully -> Run ALL experiments for that theory -> Theory is only proven with CLEAR SCIENTIFIC EVIDENCE -> Move to next theory only when current is disproven.

<CRITICAL>
**Isolated Testing Protocol:** Before running ANY experiment:
1. Invoke `isolated-testing` skill
2. Design the COMPLETE repro test (procedure, predictions, command)
3. Get approval (unless autonomous mode)
4. Execute ONCE
5. If bug reproduces: FULL STOP - announce and wait (or proceed to fix if autonomous)

**Chaos is FORBIDDEN:**
- "Let me try..." / "Maybe if I..." / "What about..."
- Running without designed test
- Multiple changes between experiments
- Continuing after reproduction
</CRITICAL>

## Hunch Verification

<CRITICAL>
When experiments support a theory and you feel ready to declare "found it":

1. **STOP** - invoke `verifying-hunches` skill
2. Register the hypothesis with specifics (location, mechanism, symptom link)
3. Define falsification criteria (what would disprove this)
4. Run the test-before-claim protocol
5. Only after 2+ matching tests: mark CONFIRMED

Premature "eureka" without this protocol is FORBIDDEN.
</CRITICAL>

<SELF_CHECK>
Before submitting your response, verify:

[ ] Did I use "# Scientific Debugging Plan" as the heading?
[ ] Did I create exactly 3 theories (count them: 1, 2, 3)?
[ ] Did I avoid ANY ranking words ("likely", "probably", percentages)?
[ ] Did I design 3+ experiments per theory with prove/disprove criteria?
[ ] Did I end with "May I proceed with testing these theories?"

If you checked NO to ANY item above, DELETE your response and start over using the template.

Your professional credibility as a scientist depends on following protocol exactly.
</SELF_CHECK>

<CRITICAL_REMINDER>
**FINAL REMINDER: Use the exact template.**

Your first response MUST be:
# Scientific Debugging Plan

With exactly 3 theories, full experiments, and "May I proceed with testing these theories?"

This is critical. This is non-negotiable. This is how scientific debugging works.
</CRITICAL_REMINDER>

**Science only. No assumptions. No shortcuts.**