Capture the in-flight state of the Escalation Mode wedge build so the next session (or Codex resume) picks up cleanly without re-deriving context:

- CURRENT_TASK now describes the wedge, what's done across the 5 commits on this branch, what remains (WebSocket push, magic-moment screen, analytics page, e2e), and the two-metric framing readers MUST internalize before quoting numbers
- HANDOFF resume point is the WebSocket/SSE push (live-arrival half of the notification dual-path); includes suggested first slice + watch-outs (no user_id on ai_session_step, denormalized account_id, peer-escalation still gated to session owner)
- Both files reference the design doc and the kill-switch criterion

No code change.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
CURRENT_TASK.md
Task: Build Escalation Mode — the wedge for ResolutionFlow's GTM (first paying-customer push). When a junior tech escalates a FlowPilot session, the senior tech sees structured handoff context in seconds instead of running a 5-minute verbal "tell me what you tried" call.
Status: in-flight on feat/escalation-mode (currently feat/escalation-metric-endpoint). Backend metric + role gate + email notification shipped. Frontend stat-card mounted. Next: WebSocket/SSE push (live-arrival half of the dual-path) and the magic-moment handoff-context screen.
Plan: docs/plans/2026-04-27-escalation-mode-wedge-design.md. Reviewed by /office-hours, /plan-eng-review, /plan-design-review, /codex review. Eng + Design CLEARED. Codex's two-metric correction + claim-role-gate + per-channel notification model all applied to the plan and the code.
Test plan artifact: docs/plans/2026-04-27-escalation-mode-wedge-test-plan.md — primary input for /qa once the build is feature-complete.
Done so far on feat/escalation-metric-endpoint
| Commit | What it ships |
|---|---|
| d51e95c | Plan + test-plan artifacts checked in |
| 52f6d03 | GET /analytics/flowpilot/escalations: in-product time-to-first-action; account-scoped, engineer-or-admin gated; 9 tests including multi-tenant isolation |
| 7a5b853 | Role-gate POST /handoffs/{id}/claim to engineer-or-admin (was viewer-claimable); 2 tests |
| 07d0db9 | HandoffManager.dispatch_escalation_notifications: emails engineer/admin teammates on intent=escalate; graceful-degradation regression test; 4 tests |
| 9f0bfd4 | EscalationMetricCard mounted above the queue list; consumes the new endpoint; matches DESIGN-SYSTEM tokens |

20 backend tests green across handoff_manager + session_handoffs_api + flowpilot_analytics_escalations. Frontend tsc -b clean. Nothing pushed yet.
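As a rough illustration of the behaviour the notification commit describes (engineer/admin recipients only, triggered on intent=escalate, and graceful degradation when a send fails), here is a minimal Python sketch. It is not the actual `HandoffManager.dispatch_escalation_notifications` code: the function name `notify_escalation_sketch`, the teammate dict shape, and the injected `send_email` callable are all hypothetical, and Python itself is an assumption about the backend language.

```python
import logging
from typing import Callable, Iterable, Mapping

logger = logging.getLogger(__name__)

# Roles assumed to receive escalation emails, per the commit summary.
NOTIFY_ROLES = frozenset({"engineer", "admin"})


def notify_escalation_sketch(
    intent: str,
    teammates: Iterable[Mapping[str, str]],
    send_email: Callable[[str], None],
) -> int:
    """Email eligible teammates and return how many sends succeeded.

    Hypothetical helper: a failed send is logged and skipped so the
    escalation itself never fails because the mail provider is down.
    """
    if intent != "escalate":
        return 0  # other handoff intents do not notify
    sent = 0
    for member in teammates:
        if member["role"] not in NOTIFY_ROLES:
            continue  # viewers and other roles are not notified
        try:
            send_email(member["email"])
            sent += 1
        except Exception:
            # Graceful degradation: record the failure, keep going.
            logger.exception("escalation email failed for %s", member["email"])
    return sent
```

Returning the success count instead of raising keeps the handoff write path independent of email delivery, which is the property a graceful-degradation regression test would pin down.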
Remaining work on this branch
- WebSocket/SSE push for live escalation arrival in the queue — the second half of the notification dual-path. Senior already on the queue page sees a new card slide in within ~1s of the junior hitting Escalate. ~3-4 days of work split across multiple commits (connection manager, auth-scoped fan-out, frontend EventSource handling, reconnect, slide-in animation, tab-title flash).
- Magic-moment handoff-context screen — 4-section view (problem summary / what's been tried / AI assessment / Start here CTA) that loads on Pick Up before dissolving into the regular FlowPilot session view. ~1.5-2 days.
- Owner-facing analytics page at /analytics/escalations: period selector, conversion-rate, trend chart. ~0.5d.
- Playwright e2e for the magic-moment demo flow (junior escalates → senior receives → senior claims → opens session). Critical for the GTM Loom not to crash mid-recording.
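The "auth-scoped fan-out" piece of the WebSocket/SSE item could start from something like the sketch below: an in-memory registry that delivers an escalation event only to subscribers in the same account. Everything here is an assumption for illustration. The class name `EscalationFanout`, the per-subscriber `asyncio.Queue`, and reusing the engineer-or-admin gate for live subscriptions are not taken from the plan, and a real connection manager would also need the reconnect and transport handling listed above.

```python
import asyncio
from collections import defaultdict

# Assumption: live arrival uses the same role gate as claim and email.
ALLOWED_ROLES = frozenset({"engineer", "admin"})


class EscalationFanout:
    """Hypothetical in-memory fan-out keyed by account_id."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(set)  # account_id -> set of queues

    def subscribe(self, account_id: str, role: str) -> asyncio.Queue:
        """Register one open connection; reject roles that cannot claim."""
        if role not in ALLOWED_ROLES:
            raise PermissionError(f"role {role!r} may not watch the queue")
        queue: asyncio.Queue = asyncio.Queue()
        self._subscribers[account_id].add(queue)
        return queue

    def unsubscribe(self, account_id: str, queue: asyncio.Queue) -> None:
        """Drop a connection, e.g. when the client disconnects."""
        self._subscribers[account_id].discard(queue)

    def publish(self, account_id: str, event: dict) -> int:
        """Deliver the event to same-account subscribers; return the count."""
        delivered = 0
        for queue in self._subscribers.get(account_id, ()):
            queue.put_nowait(event)
            delivered += 1
        return delivered
```

Each SSE or WebSocket handler would then await `queue.get()` in a loop and forward events to its client. Keying the registry by account keeps tenant isolation structural, so a publish call cannot reach another account's connections.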
Two-metric framing — read this before quoting numbers to anyone
The in-product endpoint measures post-claim time-to-first-action. The "minutes recovered" sales claim is manual_baseline − in_product_metric. Manual baseline comes from the founder's stopwatch on the next 5 escalations (The Assignment in the design doc). Don't roll the in-product number alone into "minutes recovered" — that's the apples-to-oranges miscount Codex caught.
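To make the subtraction concrete, a small Python sketch of the two-metric arithmetic follows. The function name, the choice of the median as the baseline summary, and the sample numbers in the usage note are all illustrative assumptions; the design doc may specify a different aggregation, and no real stopwatch data exists yet.

```python
from statistics import median
from typing import Sequence


def minutes_recovered(
    manual_baseline_minutes: Sequence[float],
    in_product_minutes: float,
) -> float:
    """minutes recovered = manual_baseline - in_product_metric.

    manual_baseline_minutes: stopwatch timings of verbal handoffs.
    in_product_minutes: post-claim time-to-first-action from the endpoint.
    Using the median of the samples is an assumption of this sketch.
    """
    if not manual_baseline_minutes:
        # Without a baseline the in-product number cannot be turned
        # into a "minutes recovered" claim at all.
        raise ValueError("manual baseline samples are required")
    return median(manual_baseline_minutes) - in_product_minutes
```

With made-up timings of 5.0, 6.0, 4.0, 5.5 and 4.5 minutes and an in-product value of 0.5 minutes, the function returns 4.5. Reporting the 0.5 on its own would describe only the post-claim half of the comparison.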
Kill-switch
Week 8: if 0 of 3 pilots produce a verifiable hours-saved-per-week number above 1.0, revisit the wedge. The design doc names the alternative (deterministic-ops territory) for context, but don't pivot before the data lands.
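The criterion can be written as a small predicate, sketched here in Python. The function name and the convention of using `None` for a pilot with no verifiable number are assumptions made for illustration; "above 1.0" is read as a strict comparison.

```python
from typing import Optional, Sequence


def should_revisit_wedge(
    hours_saved_per_week: Sequence[Optional[float]],
    threshold: float = 1.0,
) -> bool:
    """Return True when no pilot has a verifiable number above threshold.

    Each entry is one pilot's hours-saved-per-week figure, or None when
    the pilot produced no verifiable number (hypothetical convention).
    """
    verified = [hours for hours in hours_saved_per_week if hours is not None]
    return not any(hours > threshold for hours in verified)
```

A single pilot above the threshold is enough to keep the wedge, matching the "0 of 3" wording; a pilot at exactly 1.0 does not count.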
Previous task — closed out
Task: Land PR #153: fix the AssistantChatPage prefill currentChatRef bug.
Status: complete (2026-04-26). Merged as 68fcdc6 on main. E2e regression test now in the suite.
Background CI item, not blocking: promoting CI / e2e (pull_request) to required on main. Two consecutive green PR runs (#150 and #153) cleared the threshold. Ops-only.