diff --git a/docs/superpowers/specs/2026-03-24-conversational-branching-design.md b/docs/superpowers/specs/2026-03-24-conversational-branching-design.md
index 898288dc..b0d35fef 100644
--- a/docs/superpowers/specs/2026-03-24-conversational-branching-design.md
+++ b/docs/superpowers/specs/2026-03-24-conversational-branching-design.md
@@ -47,7 +47,7 @@ The design is **additive and removable** — all branching state lives in new ta
 | RAG search | `rag_service.py` | Branch messages use same RAG pipeline |
 | Image upload + S3 storage | `file_uploads` table + `storage_service.py` | Extended with 5 new columns, no new table |
 | PSA push | `psa_documentation_service.py` | Resolution outputs and handoff notes push through existing service |
-| Escalation package | `_build_escalation_package_enhanced()` | HandoffManager reuses this for snapshot generation |
+| Escalation package structure | `_build_escalation_package_enhanced()` pattern | HandoffManager builds its own branch-aware snapshot (the existing function is not branch-aware and uses a different AI call path). The snapshot JSONB follows the same general structure for compatibility. |
 | Escalation queue | Existing `ai_sessions` query + frontend | Dual-write keeps old queue working |
 | Session lifecycle | `flowpilot_engine.py` resolve/escalate/pause | Extended, not replaced |
 
@@ -76,7 +76,7 @@ The design is **additive and removable** — all branching state lives in new ta
 | Column | Type | Default | Notes |
 |---|---|---|---|
 | `is_branching` | BOOLEAN | FALSE | Whether branching is active |
-| `active_branch_id` | UUID FK NULLABLE | NULL | Currently viewed branch |
+| `active_branch_id` | UUID NULLABLE (no FK) | NULL | Currently viewed branch. Soft pointer: no FK constraint to `session_branches`, because that would create a circular reference (session → branch → session). Integrity is enforced at the application level. |
 | `handoff_count` | INTEGER | 0 | Times handed off |
 | `total_active_seconds` | INTEGER | 0 | Cumulative active time |
 | `total_parked_seconds` | INTEGER | 0 | Cumulative parked time |
@@ -101,7 +101,7 @@ The design is **additive and removable** — all branching state lives in new ta
 
 All columns nullable. Existing rows unaffected. `ai_description` is always generated on upload (not just branching sessions) — useful for search, exports, PSA notes, Knowledge Flywheel.
 
-Also add `'fork'` to `ai_session_steps.step_type` check constraint.
+Also add `'fork'` to `ai_session_steps.step_type` check constraint. **Note:** modifying a PostgreSQL CHECK constraint requires `DROP CONSTRAINT` then `ADD CONSTRAINT`, so this is NOT an additive operation. The Alembic migration must be written manually per CLAUDE.md lesson 77, using `op.drop_constraint('ck_ai_session_steps_step_type', 'ai_session_steps')` (the table name is a required argument) then `op.create_check_constraint(...)` with the new values list.
 
 ### New tables
 
@@ -118,7 +118,7 @@ Also add `'fork'` to `ai_session_steps.step_type` check constraint.
 | `status` | VARCHAR(20) | `active`, `dead_end`, `solved`, `untried`, `revived` |
 | `status_reason` | TEXT NULLABLE | AI-generated reason for status |
 | `status_changed_at` | TIMESTAMP NULLABLE | |
-| `status_changed_by` | UUID FK NULLABLE | |
+| `status_changed_by` | UUID FK → `users.id`, ondelete SET NULL, NULLABLE | |
 | `conversation_messages` | JSONB | LLM message history scoped to this branch |
 | `context_summary` | JSONB | `{tried: [], concluded: str, artifacts: []}` |
 | `evidence_from_branch_id` | UUID FK NULLABLE | If revived, evidence source |
@@ -184,7 +184,7 @@ Dual-write: on create, also populates `ai_sessions.escalation_package` and `esca
 | `created_at` | TIMESTAMP | |
 | `updated_at` | TIMESTAMP | |
 
-Constraints: `UNIQUE(session_id, output_type)`, check constraints on `output_type` and `status`.
+Constraints: `UNIQUE(session_id, output_type)`, check constraints on `output_type` and `status`. The unique constraint enforces one row per output type and is the conflict target for the upsert: `generate_all()` uses `INSERT ... ON CONFLICT (session_id, output_type) DO UPDATE` so outputs can be regenerated (replaced in place) if a session is re-opened after resolution.
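+
+A minimal sketch of the upsert, assuming a hypothetical table name `session_resolution_outputs`, a hypothetical `content` column, and a made-up `output_type` value. In-memory SQLite stands in for PostgreSQL so the snippet runs without a server; both accept this `ON CONFLICT ... DO UPDATE` form, although a PostgreSQL driver would use its own placeholder style instead of `?`.
+
+```python
+import sqlite3
+
+# SQLite (3.24+) stands in for PostgreSQL; table and column names are hypothetical.
+conn = sqlite3.connect(":memory:")
+conn.execute(
+    """
+    CREATE TABLE session_resolution_outputs (
+        session_id  TEXT NOT NULL,
+        output_type TEXT NOT NULL,
+        content     TEXT NOT NULL,
+        UNIQUE (session_id, output_type)
+    )
+    """
+)
+
+UPSERT = """
+INSERT INTO session_resolution_outputs (session_id, output_type, content)
+VALUES (?, ?, ?)
+ON CONFLICT (session_id, output_type)
+DO UPDATE SET content = excluded.content
+"""
+
+
+def save_output(session_id: str, output_type: str, content: str) -> None:
+    """Insert the output, or replace it if this (session, type) pair already exists."""
+    with conn:  # commits on success, rolls back on error
+        conn.execute(UPSERT, (session_id, output_type, content))
+
+
+save_output("s1", "psa_note", "first draft")
+save_output("s1", "psa_note", "regenerated after re-open")
+rows = conn.execute(
+    "SELECT output_type, content FROM session_resolution_outputs"
+).fetchall()
+```
+
+Without the unique constraint the `ON CONFLICT` target has nothing to match and the statement is rejected, which is why the constraint stays even though rows are replaceable.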
 
 ### Entity Relationships
 
@@ -256,7 +256,7 @@ Branch lifecycle management. Pure data operations + one LLM call pattern (contex
 | Method | What it does | LLM call? |
 |---|---|---|
 | `create_root_branch(session_id)` | Creates root branch, sets `is_branching=True`, copies `session.conversation_messages` into root branch. Session-level field kept as pre-branching snapshot. | No |
-| `create_fork(session_id, parent_branch_id, trigger_step_id, fork_reason, options[])` | Creates `ForkPoint` + N `SessionBranch` rows. Sets `is_fork_point=True` on trigger step. Unexplored options get status `untried`. | No |
+| `create_fork(session_id, parent_branch_id, trigger_step_id, fork_reason, options[])` | Pre-generates all branch UUIDs in Python, then inserts `ForkPoint` (with branch_ids in options JSONB) + N `SessionBranch` rows in a single transaction. Sets `is_fork_point=True` on trigger step. Unexplored options get status `untried`. | No |
 | `switch_branch(session_id, target_branch_id)` | Updates `session.active_branch_id`. Returns branch with context. | No |
 | `mark_branch_status(branch_id, status, reason)` | Updates status. Generates `context_summary` via `_call_ai`. | Yes — summary |
 | `revive_branch(branch_id, evidence_from_branch_id, evidence_description)` | Sets status `revived`, records evidence source, prepends revival context to branch messages. | No |
@@ -269,17 +269,22 @@ Pure function — takes data, returns assembled prompt components. No DB access,
 
 **Single method:** `build(branch, sibling_summaries, session_context, attachments, token_budget)`
 
-Returns: `{system_prompt: str, history: list[dict], new_message: str, images: list[dict]}`
+Returns: `{system_base: str, rag_context: str, history: list[dict], new_message: str, images: list[dict]}`
 
-Preserves `_call_ai`'s cache breakpoint behavior by separating history from new message.
+**Important:** Return keys match `_call_ai`'s parameter names exactly. `system_base` is the stable system prompt (cached by Anthropic). `rag_context` contains the cross-branch summaries + attachment descriptions (NOT cached — changes per query). This preserves prompt caching: the base prompt is a cache hit across turns, while cross-branch context varies.
+
+Callers invoke: `_call_ai(**builder.build(...))`.
+
+**Data fetching responsibility:** The API endpoint (or a coordinator method `BranchManager.build_prompt_inputs(session_id, branch_id, db)`) pre-fetches all data from the DB — branch messages, sibling summaries via `build_cross_branch_context()`, session context, attachment descriptions — then passes the assembled data to `build()`. The builder itself does no DB queries.
 
 **Assembly order:**
-1. Session context (~2,000 tokens) — problem summary, domain, client info, PSA data
-2. Cross-branch summaries (~3,000 token cap) — prioritized: active > untried > revived > dead_end
-3. Revival context — if branch was revived, prepend evidence
-4. Attachment descriptions (~1,000 tokens) — `ai_description` from other branches' uploads
-5. Branch messages (remaining budget) — last 10-15 turns verbatim, older summarized
-6. Token budget enforcement — compress: old messages → dead-end summaries → file content → never drop system prompt, last 5 messages, branch status map
+1. `system_base` — `ASSISTANT_SYSTEM_PROMPT` + session context (~2,000 tokens). Problem summary, domain, client info, PSA data. This is stable across turns and gets cached.
+2. `rag_context` — cross-branch summaries (~3,000 token cap, prioritized: active > untried > revived > dead_end) + attachment descriptions from other branches (~1,000 tokens). This changes per query and is NOT cached.
+3. Revival context — if branch was revived, prepend evidence to `rag_context`.
+4. `history` — branch's `conversation_messages` minus the last user message. Last 10-15 turns verbatim, older summarized.
+5. `new_message` — the current user message (latest turn).
+6. `images` — image references from current branch uploads.
+7. Token budget enforcement: compress old messages first, then dead-end summaries, then file content. Never drop `system_base`, the last 5 messages, or the branch status map.
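+
+The assembly order above could be sketched as a pure function. This is a simplified illustration, not the spec's `build(branch, sibling_summaries, session_context, attachments, token_budget)` signature: the parameter names, the `STATUS_PRIORITY` table, and the four-characters-per-token estimate are assumptions, and summarizing older turns is left out. It also assumes `messages` is non-empty and ends with the current user message.
+
+```python
+# Lower number = higher priority when the summary cap is tight (order from the spec).
+STATUS_PRIORITY = {"active": 0, "untried": 1, "revived": 2, "dead_end": 3}
+
+
+def estimate_tokens(text: str) -> int:
+    # Crude stand-in (about 4 characters per token); a real tokenizer would replace this.
+    return len(text) // 4
+
+
+def build(system_base, messages, sibling_summaries, revival_evidence=None,
+          images=(), summary_cap=3000):
+    """Assemble the five return keys from data the caller has already fetched."""
+    ordered = sorted(
+        sibling_summaries, key=lambda s: STATUS_PRIORITY.get(s["status"], 99)
+    )
+    parts, used = [], 0
+    for s in ordered:
+        cost = estimate_tokens(s["summary"])
+        if used + cost > summary_cap:
+            break  # stop at the cap; remaining lower-priority summaries are dropped
+        parts.append(f"[{s['status']}] {s['summary']}")
+        used += cost
+    if revival_evidence:
+        # Revival evidence is prepended to the per-query (uncached) context.
+        parts.insert(0, f"Revival evidence: {revival_evidence}")
+    return {
+        "system_base": system_base,
+        "rag_context": "\n".join(parts),
+        "history": messages[:-1],              # everything before the latest turn
+        "new_message": messages[-1]["content"],
+        "images": list(images),
+    }
+```
+
+Because the function touches no database and makes no LLM call, it can be unit-tested with plain dictionaries.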
 
 ### `services/handoff_manager.py`
 
@@ -288,7 +293,7 @@ Unified park/escalate with dual-write backward compatibility.
 | Method | What it does | LLM call? |
 |---|---|---|
 | `create_handoff(session_id, intent, engineer_notes, user_id)` | Creates `SessionHandoff`. Calls `generate_snapshot()`. If escalate, calls `generate_ai_assessment()`. Dual-writes to `session.escalation_package` + `escalated_to_id`. | Escalate only |
-| `generate_snapshot(session_id)` | Serializes branch tree into snapshot JSONB. Reuses `_build_escalation_package_enhanced()` for steps-tried data. | No |
+| `generate_snapshot(session_id)` | Serializes branch tree into snapshot JSONB. Builds its own branch-aware steps-tried data (the existing `_build_escalation_package_enhanced()` is not branch-aware — it iterates all steps without branch attribution and uses a different AI call path). Follows the same general snapshot structure for compatibility with the existing escalation queue. | No |
 | `generate_ai_assessment(session_id)` | Full session + branch context → diagnostic assessment. | Yes |
 | `generate_briefing(handoff_id, claiming_user_id)` | Natural-language handoff summary for claiming engineer. | Yes |
 | `claim_session(handoff_id, claiming_user_id)` | Updates `claimed_by/at`, sets session `active`. Dual-writes `escalation_package`. | No |
@@ -311,12 +316,14 @@ Three LLM calls on resolve, each through `_call_ai`.
 
 Not a new service — extends the existing upload endpoint in `uploads.py`:
 
-1. Upload completes, response returned immediately.
+1. Upload completes, response returned immediately (non-blocking).
 2. Background task (via `asyncio.create_task`) calls `_call_ai` with image + prompt: "Describe this in one sentence for a troubleshooting context log."
 3. Result written to `file_uploads.ai_description`.
-4. For text files: extract content directly, call `_call_ai` for summary if >2,000 tokens.
+4. For text files: extract content directly (no LLM), call `_call_ai` for summary only if content >2,000 tokens.
 5. Always runs (not just branching sessions) — useful for search, exports, PSA notes.
 
+**Safeguards:** The background task must catch and log all exceptions so that a failure never affects the upload itself (the response has already been returned). If `_call_ai` fails (rate limit, timeout), `ai_description` stays NULL; the upload is still usable, just without cross-branch context. Cost is ~$0.005 per image on Sonnet (~1,650 input tokens + 50 output tokens). No additional rate limiting needed beyond the existing upload rate limit (10/minute).
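+
+One way the safeguard could look, as a sketch only. The `call_ai` parameter is a stand-in for the real `_call_ai` (whose signature is not reproduced here), the upload is modeled as a plain dict, and `describe_upload` / `schedule_description` are hypothetical names.
+
+```python
+import asyncio
+import logging
+
+logger = logging.getLogger("uploads")
+_background_tasks = set()
+
+
+async def describe_upload(upload: dict, call_ai) -> None:
+    """Fill in ai_description; on any failure log it and leave the field as None."""
+    try:
+        upload["ai_description"] = await call_ai(upload["path"])
+    except Exception:
+        logger.exception("ai_description failed for %s", upload["path"])
+
+
+def schedule_description(upload: dict, call_ai) -> asyncio.Task:
+    task = asyncio.create_task(describe_upload(upload, call_ai))
+    # The event loop holds only weak references to tasks, so keep a strong one.
+    _background_tasks.add(task)
+    task.add_done_callback(_background_tasks.discard)
+    return task
+
+
+async def _demo() -> dict:
+    async def failing_ai(path):  # simulates a rate limit or timeout
+        raise TimeoutError("simulated")
+
+    upload = {"path": "screenshot.png", "ai_description": None}
+    await schedule_description(upload, failing_ai)
+    return upload
+
+
+result = asyncio.run(_demo())
+```
+
+In the demo the simulated timeout is logged and `ai_description` stays `None`, which corresponds to the NULL column described above.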
+
 ---
 
 ## API Endpoints
@@ -356,16 +363,35 @@ Note: endpoints nest under `/ai-sessions` (not `/api/v1/sessions` as the origina
 
 ---
 
+## Integration Surface — Existing Code Changes
+
+**Critical:** The following existing code paths must check `session.is_branching` to avoid data divergence:
+
+### `unified_chat_service.send_chat_message()`
+Currently appends messages to `session.conversation_messages`. When `is_branching=True`, this must instead:
+- Route the message to `session_branches[active_branch_id].conversation_messages`
+- Use `BranchAwarePromptBuilder` for context assembly instead of the flat message history
+- Still call `_call_ai` for the actual LLM interaction (same call path)
+
+### `flowpilot_engine` step creation
+Currently creates `AISessionStep` with no `branch_id`. When `is_branching=True`, must set `branch_id` to the active branch.
+
+### Existing `/ai-sessions/{id}/chat` endpoint
+Must detect `is_branching` and delegate to the branch message endpoint logic. Linear sessions continue through the existing path unchanged.
+
+**Pattern:** Each integration point is a simple `if session.is_branching:` guard that delegates to branching services. The existing code path is the `else` — completely untouched. If the feature is rolled back, remove the guards and the else paths remain.
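+
+The guard pattern, reduced to a toy example. The `Session` and `Branch` dataclasses and `append_message` are hypothetical stand-ins for the ORM models and for the message-append step inside `send_chat_message()`; only the `if session.is_branching:` shape is taken from the text.
+
+```python
+from dataclasses import dataclass, field
+from typing import Optional
+
+
+@dataclass
+class Branch:
+    conversation_messages: list = field(default_factory=list)
+
+
+@dataclass
+class Session:
+    is_branching: bool = False
+    active_branch_id: Optional[str] = None
+    conversation_messages: list = field(default_factory=list)
+    branches: dict = field(default_factory=dict)  # stand-in for session_branches rows
+
+
+def append_message(session: Session, message: dict) -> None:
+    if session.is_branching:
+        # Branching path: write to the active branch only.
+        branch = session.branches[session.active_branch_id]
+        branch.conversation_messages.append(message)
+    else:
+        # Original linear path, unchanged.
+        session.conversation_messages.append(message)
+```
+
+Removing the guard leaves exactly the body of the `else`, which is what makes the rollback mechanical.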
+
+---
+
 ## Token Budget Strategy
 
 ### Budget Allocation
 
 | Context Layer | Budget | Strategy |
 |---|---|---|
-| System prompt + session context | ~2,000 tokens | Fixed |
-| Cross-branch summaries | ~3,000 tokens | Scales with branch count. Each summary ~200-500 tokens. Cap at 3,000. |
+| System prompt + session context | ~2,000 tokens | Fixed. Lives in `system_base` (cached). |
+| Cross-branch summaries + attachment descriptions | ~4,000 tokens combined | Scales with branch count. Each branch summary ~200-500 tokens + attachment descriptions ~100 tokens each. Lives in `rag_context` (not cached). Cap at 4,000 total. |
 | Current branch messages | Remaining budget | Last 10-15 turns verbatim. Older summarized. |
-| Attachment descriptions | ~1,000 tokens | Included in cross-branch summaries |
 
 ### Graceful Degradation (in order)
 
@@ -445,7 +471,8 @@ backend/app/
 ### Phase 1: Data Foundation (est. 2 days)
 - Create 4 new models + Pydantic schemas
 - Add columns to `ai_sessions`, `ai_session_steps`, `file_uploads`
-- Single Alembic migration (all additive)
+- Manual Alembic migration (per CLAUDE.md lesson 77). Mostly additive — new tables + nullable columns. One non-additive operation: `step_type` CHECK constraint must be dropped and recreated with `'fork'` added. Use `op.drop_constraint` / `op.create_check_constraint`.
+- `active_branch_id` on `ai_sessions` is a plain UUID column with no FK constraint (avoids circular FK with `session_branches`)
 - Unit tests for model creation and relationships
 
 ### Phase 2: Branch Engine (est. 2-3 days)
@@ -496,3 +523,5 @@ If this feature is pulled:
 7. Delete endpoint files: `session_branches.py`, `session_handoffs.py`, `session_resolutions.py`
 8. Delete frontend components/hooks/API clients listed above
 9. Existing escalation flow, upload pipeline, chat service, PSA integration — **all untouched**
+10. **Dual-write rollback note:** Sessions that were escalated via `HandoffManager` (while branching was active) will have `escalation_package` in the branching snapshot format (includes `branch_map`). The existing escalation queue UI should handle both the old flat format and the branching format gracefully — check for `branch_map` key presence. This is the one data shape difference that persists after rollback.
+11. Remove `if session.is_branching:` guards from `unified_chat_service`, `flowpilot_engine`, and the chat endpoint. The else paths are the original code — unchanged.
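+
+The `branch_map` check from item 10 amounts to a key-presence test. The sketch below shows it in Python for illustration; `package_format` is a hypothetical helper, and the queue UI would apply the same test in its own language.
+
+```python
+from typing import Optional
+
+
+def package_format(escalation_package: Optional[dict]) -> str:
+    """Classify a stored escalation_package by shape."""
+    if not escalation_package:
+        return "none"
+    # Only packages written by HandoffManager carry a branch_map key.
+    return "branching" if "branch_map" in escalation_package else "flat"
+```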