Files

Michael Chihlas f65b65790c docs(ai): handoff state after frontend SSE slice lands

Marks the SSE subscription as shipped, points the next-session resume
target at the magic-moment handoff-context screen, and logs the live
end-to-end verification.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-04-27 20:57:20 -04:00

4.9 KiB

Raw Blame History

CURRENT_TASK.md

Task: Build Escalation Mode — the wedge for ResolutionFlow's GTM (first paying-customer push). When a junior tech escalates a FlowPilot session, the senior tech sees structured handoff context in seconds instead of running a 5-minute verbal "tell me what you tried" call.

Status: in-flight on feat/escalation-metric-endpoint. Backend is feature-complete and test-stabilized. Frontend SSE subscription is shipped (EscalationQueue.tsx now subscribes via fetch-based ReadableStream, prepends new arrivals with the locked 200ms slide-in, flashes tab title when backgrounded, respects prefers-reduced-motion, exponential-backoff reconnect). Next: magic-moment handoff-context screen, then push + draft PR.

Plan: docs/plans/2026-04-27-escalation-mode-wedge-design.md. Reviewed by /office-hours, /plan-eng-review, /plan-design-review, /codex review. Eng + Design CLEARED. Codex's two-metric correction + claim role gate + per-channel notification model + SSE bus diagnostics all applied.

Test plan artifact: docs/plans/2026-04-27-escalation-mode-wedge-test-plan.md — primary input for /qa once feature-complete.

Done on `feat/escalation-metric-endpoint` (8 commits, branched from `main` @ `c0ed6d9`)

Commit	What it ships
`d51e95c`	Plan + test-plan artifacts
`52f6d03`	`GET /analytics/flowpilot/escalations` — in-product time-to-first-action; account-scoped, engineer-or-admin gated
`7a5b853`	Role-gate POST `/handoffs/{id}/claim` to engineer-or-admin
`07d0db9`	`HandoffManager.dispatch_escalation_notifications` — emails engineer/admin teammates on intent=escalate; graceful-degradation regression
`9f0bfd4`	`EscalationMetricCard` mounted above the queue list
`a283d0d`	`.ai/` mid-flight refresh
`87bd0b7`	WIP marker for the SSE backend slice (paused for Codex pass)
`bc15952`	Codex: stabilize SSE backend tests — `Depends(..., scope="function")` releases auth DB deps before the long-lived stream body; SSE handshake test calls the generator directly; AI-assessment stub fixture; bus normalizes string vs UUID account_id
`fff8338`	Doc-only: track escalation assessment latency follow-up
`9bdd995`	Bound escalation assessment latency to `ESCALATION_AI_ASSESSMENT_TIMEOUT_SECONDS` (default 5s); handoff still creates if assessment times out
pending	Frontend SSE subscription in `EscalationQueue.tsx` — fetch-based `ReadableStream` reader, `handoff_created` triggers refetch + prepend with locked 200ms slide-in, exponential-backoff reconnect, tab-title flash when backgrounded, `prefers-reduced-motion` honored, ARIA live-region

Test status: focused subset (test_escalation_bus, test_handoff_manager, test_session_handoffs_api, test_flowpilot_analytics_escalations) → 32 passed in 18.91s with -n auto. Frontend tsc -b clean. Live-arrival smoke test against the running dev stack confirmed the SSE handshake delivers the ready frame on connect and a handoff_created frame with the expected payload after posting a handoff. Branch not pushed.

Remaining work on this branch

Magic-moment handoff-context screen — 4-section view (problem summary / what's been tried / AI assessment / Start here CTA) that loads on Pick Up before dissolving into the regular FlowPilot session view. ~1.5-2 days. Must render gracefully when ai_assessment is None (assessment timed out — see 9bdd995).
Owner-facing analytics page at /analytics/escalations — period selector, conversion-rate, trend chart. ~0.5d. Optional for v1 demo.
Playwright e2e for the magic-moment demo flow (junior escalates → senior receives → senior claims → opens session). Critical for the GTM Loom not to crash mid-recording.

Two-metric framing — read this before quoting numbers to anyone

The in-product endpoint measures post-claim time-to-first-action. The "minutes recovered" sales claim is manual_baseline − in_product_metric. Manual baseline comes from the founder's stopwatch on the next 5 escalations (The Assignment in the design doc). Don't roll the in-product number alone into "minutes recovered" — that's the apples-to-oranges miscount Codex caught.

Kill-switch

Week 8: if 0 of 3 pilots produce a verifiable hours-saved-per-week number above 1.0, revisit the wedge. The design doc names the alternative direction (deterministic-ops territory) for context, but data lands first.

Previous task — closed out

Task: Land PR #153 — fix the AssistantChatPage prefill currentChatRef bug. Status: complete (2026-04-26). Merged as 68fcdc6 on main.

Background CI item, not blocking: promoting CI / e2e (pull_request) to required on main. Two consecutive green runs cleared the threshold. Ops-only.

4.9 KiB Raw Blame History Unescape Escape