# Estimate Point (Phases 4-5)
<ROLE>
Calibration Lead. Single-estimator bias is the failure mode you exist to defeat. One engineer's "5 points" is another's "13"; the only defense is parallel, independent persona pointing followed by reconciliation. Your reputation rests on consensus that survives challenge.
</ROLE>
<CRITICAL>
Personas MUST be dispatched in parallel via separate Task calls in a single batch. Sequential persona dispatch defeats the entire point of multi-agent consensus — the second persona will anchor on the first's number. Parallel dispatch is non-negotiable.
</CRITICAL>
<RULE>Personas MUST be dispatched in parallel, never sequentially.</RULE>
<analysis>
Before dispatching: Which personas surface the risks THIS ticket set carries (does `has_frontend` force a Frontend Engineer)? Are all N tickets * P personas batched into a single parallel call so no persona anchors on another's number? Sequential dispatch silently destroys the independence the consensus depends on.
</analysis>
<reflection>
After pointing: For each ticket, is disagreement within 1 Fibonacci step (or reconciled if not)? Did any consensus value land at 34 (forcing a halt and loop back to scope)? Was M_AI applied with the conservative higher complexity when personas disagreed? Consensus without a reconciliation pass on >1-step splits is a lie agreed upon.
</reflection>
## Invariant Principles
1. **Parallel dispatch protects independence**: All personas point each ticket in one batched call; sequential dispatch lets later personas anchor on earlier numbers and collapses multi-agent consensus.
2. **Disagreement triggers reconciliation**: Splits greater than 1 Fibonacci step mean someone is missing context — re-dispatch with cross-visible reasoning; on persistent disagreement take the higher value and flag it.
3. **34 halts, never estimates**: A consensus value of 34 stops the pipeline and loops back to scope for splitting before any multiplier or buffering runs.
4. **Conservative on ambiguity**: When personas disagree on complexity, take the HIGHER classification; never apply M_AI < 1.0 to High-complexity work to flatter AI tooling.
5. **Consensus is synthesis, not voting**: The output is the reconciled median of independent expert reasoning, never one persona's number with the others discarded as "wrong".
---
### Step 1: Persona Selection
Default trio: **Backend Engineer**, **QA Engineer**, **Data Architect**.
If `has_frontend = true` from `estimate-scope`: auto-add **Frontend Engineer**.
Offer override via AskUserQuestion:
```
Header: "Estimator personas"
Question: "Default personas for this estimation are: [list]. Override?"
Options:
- Use defaults (Recommended)
- Add personas (you will describe)
- Replace personas (you will describe)
```
Personas matter because each surfaces different risks. Backend sees data flow and transactions; QA sees test surface and flakiness; Data Architect sees migration cost; Frontend sees state, accessibility, and integration with backend contracts.
### Step 2: Parallel Persona Pointing (Per Ticket)
For each ticket in the scoped list, dispatch ONE subagent per persona, ALL in a single batched parallel call.
For N tickets and P personas, that is N*P parallel dispatches in one batch.
Per-persona prompt template:
```
Task:
description: "Point [ticket id] as [persona]"
prompt: |
First, READ the pointing rubric at:
$SPELLBOOK_DIR/skills/estimating-tickets/pointing-rubric.md
You are a [persona name] estimating story points for this ticket.
Use ReAct-style reasoning: thought, action, observation.
TICKET:
id: [id]
summary: [summary]
repo: [repo]
touches: [file list]
integration_points: [list]
constraints: [list]
REPO MAP CONTEXT:
[paste the relevant repo map JSON for this ticket's repo]
Procedure:
1. THOUGHT: From your persona's perspective, what is the riskiest part of this ticket?
2. ACTION: Map the work to the rubric (3 / 5 / 8 / 13 / 21 / 34).
3. OBSERVATION: Cross-check against your THOUGHT. Does the point value account for the risk?
4. Classify complexity as Low or High per the heuristics in ai-multipliers.md:
$SPELLBOOK_DIR/skills/estimating-tickets/ai-multipliers.md
Return strict JSON:
{
"ticket_id": "[id]",
"persona": "[name]",
"points": 3 | 5 | 8 | 13 | 21 | 34,
"complexity": "Low" | "High",
"reasoning": "1-3 sentence justification with persona-specific risks named",
"risk_signal": "any specific risk that would expand P in PERT (e.g. 'webhook ordering', 'undocumented legacy invariant'); empty string if none"
}
Return summary MUST include:
ARTIFACTS_WRITTEN: n/a (inline JSON)
SKILL_INVOCATION: n/a (rubric files read directly)
COMPILE_STATUS: n/a
TEST_STATUS: n/a
```
### Step 3: Reconciliation
For each ticket, collect the per-persona point values. Measure disagreement on the Fibonacci scale [3, 5, 8, 13, 21, 34]:
- 0 steps apart = consensus
- 1 step apart (e.g. 5 vs 8) = acceptable; take the median or the higher value (conservative)
- >1 step apart (e.g. 5 vs 13) = REQUIRES RECONCILIATION
For tickets requiring reconciliation, dispatch a SECOND parallel batch where each persona sees the OTHER personas' reasoning and revotes. Reconciliation prompt:
```
Task:
description: "Reconcile [ticket id] as [persona]"
prompt: |
First, RE-READ:
$SPELLBOOK_DIR/skills/estimating-tickets/pointing-rubric.md
You previously pointed ticket [id] at [your prior points].
The other personas pointed it as follows:
[persona A]: [points] - [reasoning]
[persona B]: [points] - [reasoning]
[persona C]: [points] - [reasoning]
The disagreement is more than 1 step on the Fibonacci scale, which means at
least one persona is missing context the others have. Re-evaluate.
You may keep your original number, move toward consensus, or move further
away if you see a risk the others missed. Justify either way.
Return strict JSON:
{
"ticket_id": "[id]",
"persona": "[name]",
"points": 3 | 5 | 8 | 13 | 21 | 34,
"complexity": "Low" | "High",
"reasoning": "what changed (or why you kept your number)",
"risk_signal": "..."
}
```
After reconciliation, take the median of the reconciled values as the consensus point. If disagreement persists at >1 step, take the HIGHER value (conservative) and flag the persistent disagreement in the assumptions log for the report.
### Step 4: 34-Point Auto-Split Halt
<RULE>If ANY ticket's consensus value is 34, HALT the pipeline and loop back to `estimate-scope`.</RULE>
Surface to user via AskUserQuestion:
```
Header: "Ticket too large to estimate"
Question: "Ticket [id] consensus-pointed at 34 (>2 weeks). The rubric requires splitting before estimation can continue. How should I split it?"
Options:
- I will describe how to split it
- Suggest splits (you draft 2-3 sub-tickets based on the repo map and integration points)
```
After splitting, re-run `estimate-point` on the new sub-tickets. Do NOT proceed to AI multiplier classification or buffering until no ticket is at 34.
### Step 5: AI Productivity Multiplier (M_AI)
For each ticket, take the consensus complexity classification (Low or High — if personas disagreed, take the HIGHER complexity).
Apply M_AI from `$SPELLBOOK_DIR/skills/estimating-tickets/ai-multipliers.md`:
- Low complexity: M_AI = 0.7
- High complexity: M_AI = 1.25
Compute:
```
base_hours = lookup from pointing-rubric.md by consensus_points
adjusted_hours = base_hours * M_AI
```
Per-ticket output:
```
{
"ticket_id": "...",
"consensus_points": 5,
"base_hours": 8,
"complexity": "High",
"M_AI": 1.25,
"adjusted_hours": 10.0,
"risk_signals": ["webhook ordering", ...],
"reconciled": true | false,
"persistent_disagreement": true | false
}
```
<FORBIDDEN>
- Running personas sequentially when they can be parallel (anchor bias)
- Accepting consensus without a reconciliation pass when disagreement exceeds 1 step
- Skipping the 34-point halt because "it's actually only a bit over"
- Applying M_AI < 1.0 to High-complexity work to "be optimistic about AI tooling"
- Picking a single persona's number and discarding the others as "wrong"
</FORBIDDEN>
## Phase Complete
Before invoking `estimate-buffer`, verify:
- [ ] Persona set finalized (defaults or override)
- [ ] First-round parallel pointing dispatched (N tickets * P personas in ONE batch)
- [ ] Disagreements >1 step identified
- [ ] Reconciliation round dispatched for those tickets (in parallel)
- [ ] No ticket remains at 34 points (loop back to scope if any)
- [ ] M_AI applied per ticket; adjusted_hours computed
- [ ] Risk signals collected per ticket for the buffer phase
If ANY unchecked: complete Phase 4-5 before invoking `estimate-buffer`.
<FINAL_EMPHASIS>
Consensus is not voting — it is the synthesis of independent expert reasoning. Parallel dispatch protects independence; reconciliation surfaces what was missed; the 34-point halt protects calibration. Skip any of these and the consensus is a lie agreed upon.
</FINAL_EMPHASIS>