# CURRENT_TASK.md **Task:** Build **Escalation Mode** — the wedge for ResolutionFlow's GTM (first paying-customer push). When a junior tech escalates a FlowPilot session, the senior tech sees structured handoff context in seconds instead of running a 5-minute verbal "tell me what you tried" call. **Status:** in-flight on `feat/escalation-metric-endpoint`. Backend is **feature-complete and test-stabilized**. **Frontend SSE subscription is shipped** (`EscalationQueue.tsx` now subscribes via fetch-based ReadableStream, prepends new arrivals with the locked 200ms slide-in, flashes tab title when backgrounded, respects `prefers-reduced-motion`, exponential-backoff reconnect). **Next:** magic-moment handoff-context screen, then push + draft PR. **Plan:** [`docs/plans/2026-04-27-escalation-mode-wedge-design.md`](../docs/plans/2026-04-27-escalation-mode-wedge-design.md). Reviewed by `/office-hours`, `/plan-eng-review`, `/plan-design-review`, `/codex review`. Eng + Design CLEARED. Codex's two-metric correction + claim role gate + per-channel notification model + SSE bus diagnostics all applied. **Test plan artifact:** [`docs/plans/2026-04-27-escalation-mode-wedge-test-plan.md`](../docs/plans/2026-04-27-escalation-mode-wedge-test-plan.md) — primary input for `/qa` once feature-complete. ## Done on `feat/escalation-metric-endpoint` (8 commits, branched from `main` @ `c0ed6d9`) | Commit | What it ships | |---|---| | `d51e95c` | Plan + test-plan artifacts | | `52f6d03` | `GET /analytics/flowpilot/escalations` — in-product time-to-first-action; account-scoped, engineer-or-admin gated | | `7a5b853` | Role-gate POST `/handoffs/{id}/claim` to engineer-or-admin | | `07d0db9` | `HandoffManager.dispatch_escalation_notifications` — emails engineer/admin teammates on intent=escalate; graceful-degradation regression | | `9f0bfd4` | `EscalationMetricCard` mounted above the queue list | | `a283d0d` | `.ai/` mid-flight refresh | | `87bd0b7` | **WIP** marker for the SSE backend slice (paused for Codex pass) | | `bc15952` | Codex: stabilize SSE backend tests — `Depends(..., scope="function")` releases auth DB deps before the long-lived stream body; SSE handshake test calls the generator directly; AI-assessment stub fixture; bus normalizes string vs UUID account_id | | `fff8338` | Doc-only: track escalation assessment latency follow-up | | `9bdd995` | Bound escalation assessment latency to `ESCALATION_AI_ASSESSMENT_TIMEOUT_SECONDS` (default 5s); handoff still creates if assessment times out | | _pending_ | Frontend SSE subscription in `EscalationQueue.tsx` — fetch-based `ReadableStream` reader, `handoff_created` triggers refetch + prepend with locked 200ms slide-in, exponential-backoff reconnect, tab-title flash when backgrounded, `prefers-reduced-motion` honored, ARIA live-region | **Test status:** focused subset (`test_escalation_bus`, `test_handoff_manager`, `test_session_handoffs_api`, `test_flowpilot_analytics_escalations`) → `32 passed in 18.91s` with `-n auto`. Frontend `tsc -b` clean. Live-arrival smoke test against the running dev stack confirmed the SSE handshake delivers the `ready` frame on connect and a `handoff_created` frame with the expected payload after posting a handoff. Branch not pushed. ## Remaining work on this branch 1. **Magic-moment handoff-context screen** — 4-section view (problem summary / what's been tried / AI assessment / Start here CTA) that loads on Pick Up before dissolving into the regular FlowPilot session view. ~1.5-2 days. Must render gracefully when `ai_assessment` is `None` (assessment timed out — see `9bdd995`). 2. **Owner-facing analytics page** at `/analytics/escalations` — period selector, conversion-rate, trend chart. ~0.5d. Optional for v1 demo. 3. **Playwright e2e** for the magic-moment demo flow (junior escalates → senior receives → senior claims → opens session). Critical for the GTM Loom not to crash mid-recording. ## Two-metric framing — read this before quoting numbers to anyone The in-product endpoint measures *post-claim time-to-first-action*. The "minutes recovered" sales claim is `manual_baseline − in_product_metric`. Manual baseline comes from the founder's stopwatch on the next 5 escalations (The Assignment in the design doc). Don't roll the in-product number alone into "minutes recovered" — that's the apples-to-oranges miscount Codex caught. ## Kill-switch Week 8: if 0 of 3 pilots produce a verifiable hours-saved-per-week number above 1.0, revisit the wedge. The design doc names the alternative direction (deterministic-ops territory) for context, but data lands first. ## Previous task — closed out **Task:** Land PR #153 — fix the `AssistantChatPage` prefill `currentChatRef` bug. **Status:** complete (2026-04-26). Merged as `68fcdc6` on `main`. **Background CI item, not blocking:** promoting `CI / e2e (pull_request)` to required on `main`. Two consecutive green runs cleared the threshold. Ops-only.