Files
resolutionflow/.ai/CURRENT_TASK.md
Michael Chihlas b7d7ff06d2
All checks were successful
Mirror to GitHub / mirror (push) Successful in 5s
CI / frontend (pull_request) Successful in 5m8s
CI / backend (pull_request) Successful in 9m46s
CI / e2e (pull_request) Successful in 10m16s
docs(ai): refresh handoff for compute swap
- HANDOFF: rewritten resume point. First action on resume is `git push`
  (commits 0f00ee5 and 665530f are local-only). Visual QA + bug bash is
  the active work; 4 plan-locked items + the structural task-lane fix
  all need real-browser verification.
- CURRENT_TASK: add 0f00ee5 and 665530f to the commit table; reframe
  "Just shipped" as a per-commit summary; flag the task-lane fix as
  needing visual confirmation.
- SESSION_LOG: chronological entry for this session with full detail
  (audit, four polish items, race-condition wiring, structural
  task-lane fix, test status, files touched).
- DECISIONS: new entry "Tag the task-lane state with an owner chatId"
  documenting the structural pattern, what was rejected, and the
  forward implication that future task-lane state slices follow the
  same owner-tagging pattern.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-28 08:21:23 -04:00

70 lines
10 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# CURRENT_TASK.md
**Task:** Build **Escalation Mode** — the wedge for ResolutionFlow's GTM (first paying-customer push). When a junior tech escalates a FlowPilot session, the senior tech sees structured handoff context in seconds instead of running a 5-minute verbal "tell me what you tried" call.
**Status:** in-flight on `feat/escalation-metric-endpoint`. Branch is pushed; **draft PR #155** is open against `main` ([gitea.resolutionflow.com/chihlasm/resolutionflow/pulls/155](https://gitea.resolutionflow.com/chihlasm/resolutionflow/pulls/155)). Backend is **feature-complete and test-stabilized**. **Frontend live-arrival SSE subscription**, **magic-moment handoff-context screen**, and **bell-icon notification fix** all shipped. **`/escalate` and `/handoff` are now unified** through `HandoffManager` — every escalation creates a SessionHandoff, persists an AppNotification, fans out on the SSE bus, dispatches Slack/Teams via `notify()`, and emails per-user, regardless of which URL it entered through. **Next:** visual QA via `/qa`, then optional follow-ups (suggested-step chips, snapshot expansion, analytics page, Playwright e2e).
**Plan:** [`docs/plans/2026-04-27-escalation-mode-wedge-design.md`](../docs/plans/2026-04-27-escalation-mode-wedge-design.md). Reviewed by `/office-hours`, `/plan-eng-review`, `/plan-design-review`, `/codex review`. Eng + Design CLEARED. Codex's two-metric correction + claim role gate + per-channel notification model + SSE bus diagnostics all applied.
**Test plan artifact:** [`docs/plans/2026-04-27-escalation-mode-wedge-test-plan.md`](../docs/plans/2026-04-27-escalation-mode-wedge-test-plan.md) — primary input for `/qa` once feature-complete.
## Done on `feat/escalation-metric-endpoint` (branched from `main` @ `c0ed6d9`)
| Commit | What it ships |
|---|---|
| `d51e95c` | Plan + test-plan artifacts |
| `52f6d03` | `GET /analytics/flowpilot/escalations` — in-product time-to-first-action; account-scoped, engineer-or-admin gated |
| `7a5b853` | Role-gate POST `/handoffs/{id}/claim` to engineer-or-admin |
| `07d0db9` | `HandoffManager.dispatch_escalation_notifications` — emails engineer/admin teammates on intent=escalate; graceful-degradation regression |
| `9f0bfd4` | `EscalationMetricCard` mounted above the queue list |
| `a283d0d` | `.ai/` mid-flight refresh |
| `87bd0b7` | **WIP** marker for the SSE backend slice (paused for Codex pass) |
| `bc15952` | Codex: stabilize SSE backend tests — `Depends(..., scope="function")` releases auth DB deps before the long-lived stream body; SSE handshake test calls the generator directly; AI-assessment stub fixture; bus normalizes string vs UUID account_id |
| `fff8338` | Doc-only: track escalation assessment latency follow-up |
| `9bdd995` | Bound escalation assessment latency to `ESCALATION_AI_ASSESSMENT_TIMEOUT_SECONDS` (default 5s); handoff still creates if assessment times out |
| `b8627f4` | Frontend SSE subscription in `EscalationQueue.tsx` — fetch-based `ReadableStream` reader; `handoff_created` triggers refetch + prepend with locked 200ms slide-in; exponential-backoff reconnect; tab-title flash when backgrounded; `prefers-reduced-motion` honored; ARIA live-region |
| `f65b657` | Handoff state docs after frontend SSE slice lands |
| `8e9d22e` | Magic-moment handoff-context screen on pickup — `HandoffContextScreen.tsx` (4 sections, graceful null AI assessment, focus management, prefers-reduced-motion); `FlowPilotSessionPage.tsx` integration (pre-claim handoff fetch, claim on Start here, toolbar re-open overlay) |
| `c194ba4` | Handoff state docs after magic-moment screen lands |
| `641853a` | Bell-icon notification opens the pickup flow — notification link template adds `?pickup=true`; GET `/ai-sessions/{id}` allows account-scoped read for `requesting_escalation` / `escalated` states |
| `2a2329a` | Handoff state docs after bell-icon fix; record draft PR #155 |
| `029680a` | Unify `/escalate` through `HandoffManager` — single canonical path for every escalation. `HandoffCreateRequest.target_user_id`, `create_handoff` does the legacy enriched-package work + sets `escalation_reason`, `finalize_escalation` runs documentation + PSA push + `notify()` pre-commit, `dispatch_escalation_notifications` keeps only fire-and-forget IO post-commit. `pickup_session` accepts either status for in-flight migration. `flowpilot_engine.escalate_session` no longer called from any endpoint |
| `8914391` | First task-lane race fix — initializer-time guards (`incomingPrefill || isPickup`) + eager `sessionStorage.removeItem` in `resetSessionDerivedState`. Insufficient (only covered mount-time entry paths) |
| `0f00ee5` | Four plan-locked wedge polish items in one commit — see "Just shipped" section below |
| `665530f` | **Structural fix for the task-lane stale-flash bug.** `taskLaneOwnerChatId` state tags the chatId the in-memory questions/actions belong to. Set at every populate site (sendPrefill, selectChat, handleSend, handleTaskSubmit, handleResumeNew, refreshFacts, handleApplyFix); cleared in `resetSessionDerivedState`. Persistence effect now writes `chatId: ownerChatId` (was `activeChatId` — that was the original write-side bug). Render gate `taskLaneIsForActiveChat = ownerChatId === activeChatId` ANDed into all three render conditions. Stale data is now structurally unable to display. See DECISIONS entry for full rationale |
**Test status:** full backend suite → `1103 passed in 259.63s` with `-n auto` after the unification. Frontend `tsc -b` clean. End-to-end smoke test against the running dev stack confirmed: SSE handshake delivers `ready` + `handoff_created` frames; `listHandoffs` returns the unclaimed handoff for a senior pre-claim; `claimHandoff` flips session status `escalated``active`; senior (non-owner, non-target) can `GET` an in-transit session detail; **a single legacy `/escalate` call now produces status='escalated', SessionDocumentation, SessionHandoff row, AppNotification with link `/pilot/{id}?pickup=true` for the team admin, and a PSA push attempt** — all from one funneled HandoffManager call. Branch pushed; draft PR #155 open.
## Remaining work on this branch
1. **Visual QA + bug bash** in a real browser — full pickup demo path with the four new pieces below; this is the next active step.
2. **Snapshot expansion in `HandoffManager._generate_snapshot`** — include the recent diagnostic steps / conversation tail so the magic-moment screen's "What's been tried" section can render the actual timeline pre-claim instead of "full timeline available after pickup".
3. **Toolbar Context button on legacy-arrival sessions** — currently the button only appears when the senior arrived via the magic-moment flow this session. Lazy-fetching the handoff list on session-load (when status was-escalated) would make it work on revisits.
4. **Owner-facing analytics page** at `/analytics/escalations` — period selector, conversion-rate, trend chart. ~0.5d. Optional for v1 demo.
5. **Playwright e2e** for the magic-moment demo flow (junior escalates → senior receives via SSE → senior claims → opens session). Critical for the GTM Loom not to crash mid-recording.
## Just shipped (this session — 2 commits)
**Commit `0f00ee5`** — four plan-locked wedge polish items:
- **Live AI assessment refresh on the magic-moment screen.** New `HandoffAssessmentReadyEvent` type + `onAssessmentReady` handler on `streamEscalations`. `AssistantChatPage` opens a scoped SSE subscription whenever it has a tracked handoff with no AI assessment yet; on a matching event it refetches and replaces both `magicHandoff` and `overlayHandoff` in place. Closes the loop on the async-assessment commit `e8ba74e`.
- **Suggested-step chips below the chat input.** New `chipsHidden` state in `AssistantChatPage` defaulting to false; a chip strip renders above the composer when `magicHandoff?.ai_assessment_data?.suggested_steps[]` is non-empty and the magic-moment has dissolved. Click prefills input + focus; first send hides the strip; explicit X also hides. Per-session lifetime (Codex correction locked design).
- **Unread 6px dot on `EscalationQueue` cards.** localStorage-persisted seen set (`rf-escalation-seen`, capped 200). Dot renders top-right of any card not yet seen. Cleared on **open (card click) or claim (Pick Up)** — NOT on hover (Codex correction). Pick Up onClick now stops propagation so the wrapper's open handler isn't double-fired.
- **Race-condition toast on claim conflict.** New `HandoffAlreadyClaimedError` exception class in `handoff_manager.py`. `claim_session` now eager-loads `claimed_by_user`, rejects different-user re-claims (idempotent for same-user), and raises with the winner's id/name/timestamp. Endpoint translates to 409 with structured detail. `AssistantChatPage.handleStartHere` extracts the detail, formats `"Already claimed by {name} {time_ago}."` via `timeAgo()`, drops `?pickup=true`, and dismisses the magic-moment so the loser flows back to the queue. Backed by 2 new unit tests in `test_handoff_manager.py`.
**Commit `665530f`** — structural fix for the recurring stale-task-lane bug. Owner-tagging pattern applied to `activeQuestions` / `activeActions` / `showTaskLane`. See [`DECISIONS.md`](DECISIONS.md) for the architecture write-up. **User-reported on next session: needs visual verification.**
## Two-metric framing — read this before quoting numbers to anyone
The in-product endpoint measures *post-claim time-to-first-action*. The "minutes recovered" sales claim is `manual_baseline in_product_metric`. Manual baseline comes from the founder's stopwatch on the next 5 escalations (The Assignment in the design doc). Don't roll the in-product number alone into "minutes recovered" — that's the apples-to-oranges miscount Codex caught.
## Kill-switch
Week 8: if 0 of 3 pilots produce a verifiable hours-saved-per-week number above 1.0, revisit the wedge. The design doc names the alternative direction (deterministic-ops territory) for context, but data lands first.
## Previous task — closed out
**Task:** Land PR #153 — fix the `AssistantChatPage` prefill `currentChatRef` bug. **Status:** complete (2026-04-26). Merged as `68fcdc6` on `main`.
**Background CI item, not blocking:** promoting `CI / e2e (pull_request)` to required on `main`. Two consecutive green runs cleared the threshold. Ops-only.