resolutionflow/.ai/CURRENT_TASK.md
Michael Chihlas 0f00ee5e01 feat(escalations): close out plan-locked wedge polish
Four items from the design-plan audit, all flagged as locked-design or
Codex corrections, shipped together so the GTM demo path covers them
end-to-end before bug bash.

1. Live AI assessment refresh on the magic-moment screen. Backend already
   publishes handoff_assessment_ready when enrich_escalation_async commits;
   wire the frontend listener so the senior sees the assessment populate
   without a manual reopen. New event type + onAssessmentReady handler on
   streamEscalations; AssistantChatPage opens a scoped SSE subscription
   whenever it tracks a handoff missing its assessment, refetches on match,
   and replaces magicHandoff / overlayHandoff in place. Closes the loop on
   the async-assessment commit e8ba74e.

2. Suggested-step chips below the chat input. Locked design from the plan
   (Codex correction). Chip strip renders above the composer post-claim
   when ai_assessment_data.suggested_steps[] is non-empty. Click prefills
   the input and focuses; first send or explicit X hides for the session.

3. Unread 6px dot on EscalationQueue cards. localStorage-persisted seen
   set (rf-escalation-seen, capped 200). Dot top-right when not seen.
   Cleared on open (card click) or claim (Pick Up) — NOT on hover, per
   Codex correction. Pick Up stops propagation so it doesn't double-fire.

4. Race-condition toast on claim conflict. The /claim endpoint previously
   silently overwrote claimed_by — both seniors thought they owned the
   session. New HandoffAlreadyClaimedError carries the winner's id/name/
   timestamp; claim_session rejects different-user re-claims (same-user is
   idempotent for double-click safety); endpoint returns 409 with
   structured detail. AssistantChatPage.handleStartHere extracts and
   surfaces "Already claimed by {name} {time_ago}." via toast, drops
   ?pickup=true, dismisses magic-moment so the loser flows back to queue.

Tests: 2 new unit tests in test_handoff_manager.py (conflict raises,
same-user idempotent). Full handoff + escalation suite (34 tests) green.
Frontend tsc -b clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-28 01:59:28 -04:00

# CURRENT_TASK.md
**Task:** Build **Escalation Mode** — the wedge for ResolutionFlow's GTM (first paying-customer push). When a junior tech escalates a FlowPilot session, the senior tech sees structured handoff context in seconds instead of running a 5-minute verbal "tell me what you tried" call.
**Status:** in-flight on `feat/escalation-metric-endpoint`. Branch is pushed; **draft PR #155** is open against `main` ([gitea.resolutionflow.com/chihlasm/resolutionflow/pulls/155](https://gitea.resolutionflow.com/chihlasm/resolutionflow/pulls/155)). Backend is **feature-complete and test-stabilized**. **Frontend live-arrival SSE subscription**, **magic-moment handoff-context screen**, and **bell-icon notification fix** have all shipped. **`/escalate` and `/handoff` are now unified** through `HandoffManager`: every escalation creates a SessionHandoff, persists an AppNotification, fans out on the SSE bus, dispatches Slack/Teams via `notify()`, and sends per-user emails, regardless of which URL it entered through. **Next:** visual QA via `/qa`, then optional follow-ups (snapshot expansion, analytics page, Playwright e2e).
**Plan:** [`docs/plans/2026-04-27-escalation-mode-wedge-design.md`](../docs/plans/2026-04-27-escalation-mode-wedge-design.md). Reviewed by `/office-hours`, `/plan-eng-review`, `/plan-design-review`, `/codex review`. Eng + Design CLEARED. Codex's two-metric correction + claim role gate + per-channel notification model + SSE bus diagnostics all applied.
**Test plan artifact:** [`docs/plans/2026-04-27-escalation-mode-wedge-test-plan.md`](../docs/plans/2026-04-27-escalation-mode-wedge-test-plan.md) — primary input for `/qa` once feature-complete.
## Done on `feat/escalation-metric-endpoint` (17 commits listed below, branched from `main` @ `c0ed6d9`)
| Commit | What it ships |
|---|---|
| `d51e95c` | Plan + test-plan artifacts |
| `52f6d03` | `GET /analytics/flowpilot/escalations` — in-product time-to-first-action; account-scoped, engineer-or-admin gated |
| `7a5b853` | Role-gate POST `/handoffs/{id}/claim` to engineer-or-admin |
| `07d0db9` | `HandoffManager.dispatch_escalation_notifications` — emails engineer/admin teammates on intent=escalate; graceful-degradation regression |
| `9f0bfd4` | `EscalationMetricCard` mounted above the queue list |
| `a283d0d` | `.ai/` mid-flight refresh |
| `87bd0b7` | **WIP** marker for the SSE backend slice (paused for Codex pass) |
| `bc15952` | Codex: stabilize SSE backend tests — `Depends(..., scope="function")` releases auth DB deps before the long-lived stream body; SSE handshake test calls the generator directly; AI-assessment stub fixture; bus normalizes string vs UUID account_id |
| `fff8338` | Doc-only: track escalation assessment latency follow-up |
| `9bdd995` | Bound escalation assessment latency to `ESCALATION_AI_ASSESSMENT_TIMEOUT_SECONDS` (default 5s); handoff still creates if assessment times out |
| `b8627f4` | Frontend SSE subscription in `EscalationQueue.tsx` — fetch-based `ReadableStream` reader; `handoff_created` triggers refetch + prepend with locked 200ms slide-in; exponential-backoff reconnect; tab-title flash when backgrounded; `prefers-reduced-motion` honored; ARIA live-region |
| `f65b657` | Handoff state docs after frontend SSE slice lands |
| `8e9d22e` | Magic-moment handoff-context screen on pickup — `HandoffContextScreen.tsx` (4 sections, graceful null AI assessment, focus management, prefers-reduced-motion); `FlowPilotSessionPage.tsx` integration (pre-claim handoff fetch, claim on Start here, toolbar re-open overlay) |
| `c194ba4` | Handoff state docs after magic-moment screen lands |
| `641853a` | Bell-icon notification opens the pickup flow — notification link template adds `?pickup=true`; GET `/ai-sessions/{id}` allows account-scoped read for `requesting_escalation` / `escalated` states |
| `2a2329a` | Handoff state docs after bell-icon fix; record draft PR #155 |
| `029680a` | Unify `/escalate` through `HandoffManager` — single canonical path for every escalation. `HandoffCreateRequest.target_user_id`, `create_handoff` does the legacy enriched-package work + sets `escalation_reason`, `finalize_escalation` runs documentation + PSA push + `notify()` pre-commit, `dispatch_escalation_notifications` keeps only fire-and-forget IO post-commit. `pickup_session` accepts either status for in-flight migration. `flowpilot_engine.escalate_session` no longer called from any endpoint |
**Test status:** full backend suite → `1103 passed in 259.63s` with `-n auto` after the unification. Frontend `tsc -b` clean. End-to-end smoke test against the running dev stack confirmed: SSE handshake delivers `ready` + `handoff_created` frames; `listHandoffs` returns the unclaimed handoff for a senior pre-claim; `claimHandoff` flips session status `escalated` → `active`; senior (non-owner, non-target) can `GET` an in-transit session detail; **a single legacy `/escalate` call now produces status='escalated', SessionDocumentation, SessionHandoff row, AppNotification with link `/pilot/{id}?pickup=true` for the team admin, and a PSA push attempt**, all from one funneled `HandoffManager` call. Branch pushed; draft PR #155 open.
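
The bounded-latency behavior from commit `9bdd995` in the table above (the assessment is capped by `ESCALATION_AI_ASSESSMENT_TIMEOUT_SECONDS`, and the handoff is still created on timeout) can be sketched roughly as follows. This is a minimal illustration, not the repository's code: `assess_with_timeout`, `create_handoff_sketch`, and the dict shape are hypothetical names chosen for the example.

```python
import asyncio
from typing import Awaitable, Callable, Optional

# Setting name and 5s default come from the commit note; everything else
# in this sketch is hypothetical.
ESCALATION_AI_ASSESSMENT_TIMEOUT_SECONDS = 5.0


async def assess_with_timeout(
    run_assessment: Callable[[], Awaitable[dict]],
    timeout_seconds: float = ESCALATION_AI_ASSESSMENT_TIMEOUT_SECONDS,
) -> Optional[dict]:
    """Return the assessment, or None if it does not finish in time."""
    try:
        return await asyncio.wait_for(run_assessment(), timeout=timeout_seconds)
    except asyncio.TimeoutError:
        # Timing out is not an error for the caller: the handoff proceeds
        # without an assessment.
        return None


async def create_handoff_sketch(
    run_assessment: Callable[[], Awaitable[dict]],
    timeout_seconds: float = ESCALATION_AI_ASSESSMENT_TIMEOUT_SECONDS,
) -> dict:
    """Always produce a handoff record; the assessment field may be None."""
    assessment = await assess_with_timeout(run_assessment, timeout_seconds)
    return {"status": "created", "ai_assessment_data": assessment}
```

A consumer such as the magic-moment screen would then need to tolerate `ai_assessment_data` being `None`, which matches the "graceful null AI assessment" note on `8e9d22e`.
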
## Remaining work on this branch
1. **Visual QA + bug bash** in a real browser — full pickup demo path with the four new pieces below; this is the next active step.
2. **Snapshot expansion in `HandoffManager._generate_snapshot`** — include the recent diagnostic steps / conversation tail so the magic-moment screen's "What's been tried" section can render the actual timeline pre-claim instead of "full timeline available after pickup".
3. **Toolbar Context button on legacy-arrival sessions** — currently the button only appears when the senior arrived via the magic-moment flow this session. Lazy-fetching the handoff list on session load (when the session was previously escalated) would make it work on revisits.
4. **Owner-facing analytics page** at `/analytics/escalations` — period selector, conversion-rate, trend chart. ~0.5d. Optional for v1 demo.
5. **Playwright e2e** for the magic-moment demo flow (junior escalates → senior receives via SSE → senior claims → opens session). Critical for the GTM Loom not to crash mid-recording.
## Just shipped (4 plan-locked items, this session)
- **Live AI assessment refresh on the magic-moment screen.** New `HandoffAssessmentReadyEvent` type + `onAssessmentReady` handler on `streamEscalations`. `AssistantChatPage` opens a scoped SSE subscription whenever it has a tracked handoff with no AI assessment yet; on a matching event it refetches and replaces both `magicHandoff` and `overlayHandoff` in place. Closes the loop on the async-assessment commit `e8ba74e`.
- **Suggested-step chips below the chat input.** New `chipsHidden` state in `AssistantChatPage` defaulting to false; a chip strip renders above the composer when `magicHandoff?.ai_assessment_data?.suggested_steps[]` is non-empty and the magic-moment has dissolved. Clicking a chip prefills the input and focuses it; the first send hides the strip, as does the explicit X. The hidden state lasts for the session (locked design, per Codex correction).
- **Unread 6px dot on `EscalationQueue` cards.** localStorage-persisted seen set (`rf-escalation-seen`, capped 200). Dot renders top-right of any card not yet seen. Cleared on **open (card click) or claim (Pick Up)** — NOT on hover (Codex correction). Pick Up onClick now stops propagation so the wrapper's open handler isn't double-fired.
- **Race-condition toast on claim conflict.** New `HandoffAlreadyClaimedError` exception class in `handoff_manager.py`. `claim_session` now eager-loads `claimed_by_user`, rejects different-user re-claims (idempotent for same-user), and raises with the winner's id/name/timestamp. Endpoint translates to 409 with structured detail. `AssistantChatPage.handleStartHere` extracts the detail, formats `"Already claimed by {name} {time_ago}."` via `timeAgo()`, drops `?pickup=true`, and dismisses the magic-moment so the loser flows back to the queue. Backed by 2 new unit tests in `test_handoff_manager.py`.
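
The claim-conflict rule in the last bullet above (first claim wins, same-user re-claim is idempotent, different-user re-claim raises and maps to a 409) can be sketched as below. Only `HandoffAlreadyClaimedError` and `claim_session` are names from this document; the `Handoff` dataclass, the `to_conflict_response` helper, and the detail field names are assumptions made for illustration. The sketch is in-memory only and does not cover database-level atomicity.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional, Tuple


class HandoffAlreadyClaimedError(Exception):
    """Raised when a different user already holds the claim."""

    def __init__(self, claimed_by_id: str, claimed_by_name: str, claimed_at: datetime):
        super().__init__(f"Already claimed by {claimed_by_name}")
        self.claimed_by_id = claimed_by_id
        self.claimed_by_name = claimed_by_name
        self.claimed_at = claimed_at


@dataclass
class Handoff:
    """Hypothetical stand-in for the persisted handoff row."""
    id: str
    claimed_by: Optional[str] = None
    claimed_by_name: Optional[str] = None
    claimed_at: Optional[datetime] = None


def claim_session(handoff: Handoff, user_id: str, user_name: str) -> Handoff:
    if handoff.claimed_by is None:
        # First claim wins.
        handoff.claimed_by = user_id
        handoff.claimed_by_name = user_name
        handoff.claimed_at = datetime.now(timezone.utc)
        return handoff
    if handoff.claimed_by == user_id:
        # Same-user re-claim (e.g. a double-click) is a no-op.
        return handoff
    # Different user: refuse and report who won.
    raise HandoffAlreadyClaimedError(
        handoff.claimed_by, handoff.claimed_by_name or "", handoff.claimed_at
    )


def to_conflict_response(err: HandoffAlreadyClaimedError) -> Tuple[int, dict]:
    """Translate the error into an HTTP status and structured detail."""
    return 409, {
        "claimed_by_id": err.claimed_by_id,
        "claimed_by_name": err.claimed_by_name,
        "claimed_at": err.claimed_at.isoformat(),
    }
```

The frontend can then build the "Already claimed by {name} {time_ago}." toast from the structured detail instead of parsing an error string.
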
## Two-metric framing — read this before quoting numbers to anyone
The in-product endpoint measures *post-claim time-to-first-action*. The "minutes recovered" sales claim is `manual_baseline - in_product_metric`. Manual baseline comes from the founder's stopwatch on the next 5 escalations (The Assignment in the design doc). Don't roll the in-product number alone into "minutes recovered": that's the apples-to-oranges miscount Codex caught.
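
As a worked example of the framing above, the arithmetic can be sketched like this, assuming the sales claim is the manual baseline minus the in-product metric and that both are expressed in minutes. The function name and all numbers are hypothetical; they are not measurements from any pilot.

```python
from statistics import mean
from typing import Sequence


def minutes_recovered(
    stopwatch_minutes: Sequence[float], in_product_minutes: float
) -> float:
    """Manual baseline (mean of stopwatch samples) minus the in-product metric."""
    manual_baseline = mean(stopwatch_minutes)
    return manual_baseline - in_product_minutes


# Illustrative numbers only: five stopwatch timings of the manual handoff
# and a made-up in-product time-to-first-action.
samples = [6.0, 5.0, 4.0, 5.0, 5.0]
recovered = minutes_recovered(samples, in_product_minutes=1.5)
```

With these made-up inputs the baseline is 5.0 minutes and the recovered figure is 3.5 minutes; quoting the 1.5 on its own would describe a different quantity.
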
## Kill-switch
Week 8: if 0 of 3 pilots produce a verifiable hours-saved-per-week number above 1.0, revisit the wedge. The design doc names the alternative direction (deterministic-ops territory) for context, but data lands first.
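
The kill-switch rule can be read as a simple predicate. The sketch below assumes that "verifiable" means a number exists at all (represented here as a non-`None` value) and that "above 1.0" is a strict comparison; the function name is hypothetical.

```python
from typing import Optional, Sequence


def should_revisit_wedge(
    hours_saved_per_week: Sequence[Optional[float]], threshold: float = 1.0
) -> bool:
    """True when no pilot has a verifiable number strictly above the threshold."""
    qualifying = [
        h for h in hours_saved_per_week if h is not None and h > threshold
    ]
    return len(qualifying) == 0
```

For example, `should_revisit_wedge([None, 0.8, 1.0])` is `True` (exactly 1.0 does not count), while `should_revisit_wedge([None, 1.2, 0.5])` is `False`.
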
## Previous task — closed out
**Task:** Land PR #153 — fix the `AssistantChatPage` prefill `currentChatRef` bug. **Status:** complete (2026-04-26). Merged as `68fcdc6` on `main`.
**Background CI item, not blocking:** promoting `CI / e2e (pull_request)` to required on `main`. Two consecutive green runs cleared the threshold. Ops-only.