resolutionflow/.ai/CURRENT_TASK.md
Michael Chihlas c194ba4a43 docs(ai): handoff state after magic-moment screen lands
Marks the magic-moment handoff-context screen as shipped, points the
next session at visual QA + push + draft PR, and captures the deferred
follow-ups (suggested-step chips, snapshot expansion, toolbar button
on revisits, owner analytics, Playwright e2e).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-27 21:08:07 -04:00

# CURRENT_TASK.md
**Task:** Build **Escalation Mode** — the wedge for ResolutionFlow's GTM (first paying-customer push). When a junior tech escalates a FlowPilot session, the senior tech sees structured handoff context in seconds instead of running a 5-minute verbal "tell me what you tried" call.
**Status:** in-flight on `feat/escalation-metric-endpoint`. Backend is **feature-complete and test-stabilized**. **Frontend live-arrival SSE subscription is shipped** (`EscalationQueue.tsx` subscribes via fetch-based ReadableStream, prepends new arrivals with the locked 200ms slide-in, flashes tab title when backgrounded, respects `prefers-reduced-motion`, exponential-backoff reconnect). **Magic-moment handoff-context screen is shipped** (`HandoffContextScreen.tsx` + integration in `FlowPilotSessionPage.tsx` — renders on Pick Up before claim, claims on "Start here", re-openable from toolbar, gracefully handles null AI assessment). **Next:** push + draft PR, then optional analytics page + Playwright e2e + chat-input suggested-step chips.
**Plan:** [`docs/plans/2026-04-27-escalation-mode-wedge-design.md`](../docs/plans/2026-04-27-escalation-mode-wedge-design.md). Reviewed by `/office-hours`, `/plan-eng-review`, `/plan-design-review`, `/codex review`. Eng + Design CLEARED. Codex's two-metric correction + claim role gate + per-channel notification model + SSE bus diagnostics all applied.
**Test plan artifact:** [`docs/plans/2026-04-27-escalation-mode-wedge-test-plan.md`](../docs/plans/2026-04-27-escalation-mode-wedge-test-plan.md) — primary input for `/qa` once feature-complete.
## Done on `feat/escalation-metric-endpoint` (13 commits, branched from `main` @ `c0ed6d9`)
| Commit | What it ships |
|---|---|
| `d51e95c` | Plan + test-plan artifacts |
| `52f6d03` | `GET /analytics/flowpilot/escalations` — in-product time-to-first-action; account-scoped, engineer-or-admin gated |
| `7a5b853` | Role-gate POST `/handoffs/{id}/claim` to engineer-or-admin |
| `07d0db9` | `HandoffManager.dispatch_escalation_notifications` — emails engineer/admin teammates on intent=escalate; graceful-degradation regression |
| `9f0bfd4` | `EscalationMetricCard` mounted above the queue list |
| `a283d0d` | `.ai/` mid-flight refresh |
| `87bd0b7` | **WIP** marker for the SSE backend slice (paused for Codex pass) |
| `bc15952` | Codex: stabilize SSE backend tests — `Depends(..., scope="function")` releases auth DB deps before the long-lived stream body; SSE handshake test calls the generator directly; AI-assessment stub fixture; bus normalizes string vs UUID account_id |
| `fff8338` | Doc-only: track escalation assessment latency follow-up |
| `9bdd995` | Bound escalation assessment latency to `ESCALATION_AI_ASSESSMENT_TIMEOUT_SECONDS` (default 5s); handoff still creates if assessment times out |
| `b8627f4` | Frontend SSE subscription in `EscalationQueue.tsx` — fetch-based `ReadableStream` reader; `handoff_created` triggers refetch + prepend with locked 200ms slide-in; exponential-backoff reconnect; tab-title flash when backgrounded; `prefers-reduced-motion` honored; ARIA live-region |
| `f65b657` | Handoff state docs after frontend SSE slice lands |
| `8e9d22e` | Magic-moment handoff-context screen on pickup — `HandoffContextScreen.tsx` (4 sections, graceful null AI assessment, focus management, prefers-reduced-motion); `FlowPilotSessionPage.tsx` integration (pre-claim handoff fetch, claim on Start here, toolbar re-open overlay) |
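
The timeout bound from `9bdd995` could look roughly like the sketch below. Only `ESCALATION_AI_ASSESSMENT_TIMEOUT_SECONDS`, its 5s default, and the `ai_assessment_data` field name come from this file; `assess_with_timeout`, `create_handoff`, the stub coroutines, and the dict shapes are hypothetical stand-ins, not the actual `HandoffManager` code.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional

# Name and default taken from the commit table above.
ESCALATION_AI_ASSESSMENT_TIMEOUT_SECONDS = 5.0

Assessment = dict[str, Any]


async def assess_with_timeout(
    run_assessment: Callable[[], Awaitable[Assessment]],
    timeout_seconds: float = ESCALATION_AI_ASSESSMENT_TIMEOUT_SECONDS,
) -> Optional[Assessment]:
    """Return the assessment, or None if it does not finish in time."""
    try:
        return await asyncio.wait_for(run_assessment(), timeout=timeout_seconds)
    except asyncio.TimeoutError:
        return None


async def create_handoff(
    run_assessment: Callable[[], Awaitable[Assessment]],
    timeout_seconds: float = ESCALATION_AI_ASSESSMENT_TIMEOUT_SECONDS,
) -> dict[str, Any]:
    """Hypothetical handoff creation: always succeeds, assessment may be None."""
    assessment = await assess_with_timeout(run_assessment, timeout_seconds)
    return {"status": "created", "ai_assessment_data": assessment}


# Hypothetical stubs standing in for the real AI call.
async def slow_assessment() -> Assessment:
    await asyncio.sleep(0.2)
    return {"suggested_steps": ["restart the service"]}


async def fast_assessment() -> Assessment:
    return {"suggested_steps": ["restart the service"]}


timed_out = asyncio.run(create_handoff(slow_assessment, timeout_seconds=0.01))
completed = asyncio.run(create_handoff(fast_assessment, timeout_seconds=1.0))
```

In this sketch a timed-out assessment surfaces as `None`, which is the case the handoff-context screen already handles as a null AI assessment.
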
**Test status:** focused subset (`test_escalation_bus`, `test_handoff_manager`, `test_session_handoffs_api`, `test_flowpilot_analytics_escalations`) → `32 passed in 18.91s` with `-n auto`. Frontend `tsc -b` clean. End-to-end smoke test against the running dev stack confirmed: SSE handshake delivers `ready` frame on connect and `handoff_created` after a posted handoff; `listHandoffs` returns the unclaimed handoff for a senior pre-claim; `claimHandoff` flips session status from `escalated` → `active` and `escalated_to_id` is set so subsequent GET succeeds. Branch not pushed.
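
For repeating the SSE part of that smoke test by hand, a minimal frame parser is sketched below. It follows the standard `text/event-stream` field rules; the event names `ready` and `handoff_created` come from this file, while `parse_sse_frames`, the sample stream, and the `handoff_id` payload key are assumptions for illustration.

```python
from typing import Iterator


def parse_sse_frames(raw: str) -> Iterator[tuple[str, str]]:
    """Yield (event, data) pairs from a text/event-stream body."""
    for block in raw.replace("\r\n", "\n").split("\n\n"):
        event = "message"  # default event type when no "event:" field is sent
        data_lines: list[str] = []
        for line in block.split("\n"):
            if not line or line.startswith(":"):
                continue  # empty line or comment (often used as keep-alive)
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # a single leading space is stripped
            if field == "event":
                event = value
            elif field == "data":
                data_lines.append(value)
        if data_lines:  # frames without data are not dispatched
            yield event, "\n".join(data_lines)


# Hypothetical stream: payload shapes are assumed, not taken from the backend.
sample_stream = (
    "event: ready\ndata: {}\n\n"
    ": keep-alive\n\n"
    'event: handoff_created\ndata: {"handoff_id": "h1"}\n\n'
)
frames = list(parse_sse_frames(sample_stream))
```

A check along these lines would assert that the first frame is `ready` and that a `handoff_created` frame follows once a handoff is posted.
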
## Remaining work on this branch
1. **Push + draft PR** — branch is unpushed. Open against `main`.
2. **Suggested-step chips below the chat input** (Codex correction, design plan locks this) — surfaces `ai_assessment_data.suggested_steps[]` as clickable chips in `FlowPilotMessageBar` that prefill the input. Threading through `FlowPilotSession` → message bar.
3. **Snapshot expansion in `HandoffManager._generate_snapshot`** — include the recent diagnostic steps / conversation tail so the magic-moment screen's "What's been tried" section can render the actual timeline pre-claim instead of "full timeline available after pickup".
4. **Toolbar Context button on legacy-arrival sessions** — currently the button only appears when the senior arrived via the magic-moment flow this session. Lazy-fetching the handoff list on session-load (when status was-escalated) would make it work on revisits.
5. **Owner-facing analytics page** at `/analytics/escalations` — period selector, conversion-rate, trend chart. ~0.5d. Optional for v1 demo.
6. **Playwright e2e** for the magic-moment demo flow (junior escalates → senior receives via SSE → senior claims → opens session). Critical for the GTM Loom not to crash mid-recording.
## Two-metric framing — read this before quoting numbers to anyone
The in-product endpoint measures *post-claim time-to-first-action*. The "minutes recovered" sales claim is `manual_baseline - in_product_metric`. The manual baseline comes from the founder's stopwatch on the next 5 escalations ("The Assignment" in the design doc). Don't roll the in-product number alone into "minutes recovered"; that's the apples-to-oranges miscount Codex caught.
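
As a worked example of the two-metric arithmetic, assuming the sales claim is the manual baseline minus the in-product metric: the sketch below uses made-up timings, and the choice of the median as the summary statistic is an assumption, not something the design doc is known to specify.

```python
from statistics import median
from typing import Sequence


def minutes_recovered(
    manual_baseline_seconds: Sequence[float],
    in_product_seconds: Sequence[float],
) -> float:
    """Difference of the two medians, in minutes (median is an assumption)."""
    baseline = median(manual_baseline_seconds)
    in_product = median(in_product_seconds)
    return (baseline - in_product) / 60


# Hypothetical data: five stopwatch timings and three in-product timings.
stopwatch_samples = [300, 280, 320, 310, 290]  # seconds per manual handoff call
in_product_samples = [45, 60, 75]  # seconds of post-claim time-to-first-action

recovered = minutes_recovered(stopwatch_samples, in_product_samples)
```

With these invented numbers the medians are 300s and 60s, so the recovered time is 4 minutes per escalation. Quoting the 60s figure by itself would say nothing about time recovered.
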
## Kill-switch
Week 8: if 0 of 3 pilots produce a verifiable hours-saved-per-week number above 1.0, revisit the wedge. The design doc names the alternative direction (deterministic-ops territory) for context, but data lands first.
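
The Week 8 check can be written down as a one-line rule. In this sketch `should_revisit_wedge` is a hypothetical helper, and `None` is an assumed convention for a pilot that has no verifiable number yet.

```python
from typing import Optional, Sequence


def should_revisit_wedge(
    hours_saved_per_week: Sequence[Optional[float]],
    threshold: float = 1.0,
) -> bool:
    """True when no pilot has a verified number strictly above the threshold."""
    verified = [h for h in hours_saved_per_week if h is not None]
    return not any(h > threshold for h in verified)


# Hypothetical pilot results: one unverified, two verified.
no_pilot_clears = should_revisit_wedge([None, 0.5, 1.0])
one_pilot_clears = should_revisit_wedge([None, 0.5, 1.2])
```

A pilot at exactly 1.0 does not clear the bar here, since the rule reads "above 1.0".
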
## Previous task — closed out
**Task:** Land PR #153 — fix the `AssistantChatPage` prefill `currentChatRef` bug. **Status:** complete (2026-04-26). Merged as `68fcdc6` on `main`.
**Background CI item, not blocking:** promoting `CI / e2e (pull_request)` to required on `main`. Two consecutive green runs cleared the threshold. Ops-only.