resolutionflow/.ai/CURRENT_TASK.md
Michael Chihlas 02d5c6c08c docs(ai): refresh handoff state for next-session pickup under 200k context
Default Claude Code model is being switched from Opus 4.7 1M-context to
Opus 4.7 (200k). Tighten the per-session pickup docs so they're
self-sufficient under the smaller window:

- CURRENT_TASK now reflects the post-Codex state: 8 commits on the
  branch (5 feat + WIP SSE + 2 Codex test/latency fixes + 1 doc
  refresh), 32/32 backend tests with -n auto, frontend tsc -b clean.
  Remaining work re-scoped: the SSE backend half is feature-complete
  and tested, so what's left is the FRONTEND SSE subscription in
  EscalationQueue.tsx, then the magic-moment handoff-context screen,
  then push + draft PR.
- Session log gets a Claude Code entry covering today's planning →
  build → pause-for-Codex arc, the design decisions locked into the
  doc and code, the two TODOs added (peer-tech escalation, mobile
  responsive), and the model-switch context for the next session.
- HANDOFF.md needs no change — Codex's update in 9bdd995 already
  describes the resume point and watch-outs cleanly.

No code change.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-27 20:13:40 -04:00

# CURRENT_TASK.md
**Task:** Build **Escalation Mode** — the wedge for ResolutionFlow's GTM (first paying-customer push). When a junior tech escalates a FlowPilot session, the senior tech sees structured handoff context in seconds instead of running a 5-minute verbal "tell me what you tried" call.
**Status:** in-flight on `feat/escalation-metric-endpoint`. Backend is **feature-complete and test-stabilized**. **Next:** frontend SSE subscription in `EscalationQueue.tsx`, then the magic-moment handoff-context screen, then push + draft PR.
**Plan:** [`docs/plans/2026-04-27-escalation-mode-wedge-design.md`](../docs/plans/2026-04-27-escalation-mode-wedge-design.md). Reviewed by `/office-hours`, `/plan-eng-review`, `/plan-design-review`, `/codex review`. Eng + Design CLEARED. Codex's two-metric correction + claim role gate + per-channel notification model + SSE bus diagnostics all applied.
**Test plan artifact:** [`docs/plans/2026-04-27-escalation-mode-wedge-test-plan.md`](../docs/plans/2026-04-27-escalation-mode-wedge-test-plan.md) — primary input for `/qa` once feature-complete.
## Done on `feat/escalation-metric-endpoint` (8 commits, branched from `main` @ `c0ed6d9`)
| Commit | What it ships |
|---|---|
| `d51e95c` | Plan + test-plan artifacts |
| `52f6d03` | `GET /analytics/flowpilot/escalations` — in-product time-to-first-action; account-scoped, engineer-or-admin gated |
| `7a5b853` | Role-gate `POST /handoffs/{id}/claim` to engineer-or-admin |
| `07d0db9` | `HandoffManager.dispatch_escalation_notifications` emails engineer/admin teammates on intent=escalate; includes a graceful-degradation regression test |
| `9f0bfd4` | `EscalationMetricCard` mounted above the queue list |
| `a283d0d` | `.ai/` mid-flight refresh |
| `87bd0b7` | **WIP** marker for the SSE backend slice (paused for Codex pass) |
| `bc15952` | Codex: stabilize SSE backend tests — `Depends(..., scope="function")` releases auth DB deps before the long-lived stream body; SSE handshake test calls the generator directly; AI-assessment stub fixture; bus normalizes string vs UUID account_id |
| `fff8338` | Doc-only: track escalation assessment latency follow-up |
| `9bdd995` | Bound escalation assessment latency to `ESCALATION_AI_ASSESSMENT_TIMEOUT_SECONDS` (default 5s); handoff still creates if assessment times out |
**Test status:** focused subset (`test_escalation_bus`, `test_handoff_manager`, `test_session_handoffs_api`, `test_flowpilot_analytics_escalations`) → `32 passed in 17.77s` with `-n auto`. Frontend `tsc -b` clean. Branch not pushed.
## Remaining work on this branch
1. **Frontend SSE subscription** in `EscalationQueue.tsx`. Use a fetch-based `ReadableStream` reader (matching [`frontend/src/api/aiSessions.ts`](../frontend/src/api/aiSessions.ts) `streamDocumentation` — native `EventSource` can't send auth headers). Prepend new cards with the locked 200ms slide-in. Reconnect with backoff. Tab-title flash when backgrounded. Respect `prefers-reduced-motion`.
2. **Magic-moment handoff-context screen** — 4-section view (problem summary / what's been tried / AI assessment / Start here CTA) that loads on Pick Up before dissolving into the regular FlowPilot session view. ~1.5-2 days. Must render gracefully when `ai_assessment` is `None` (assessment timed out — see `9bdd995`).
3. **Owner-facing analytics page** at `/analytics/escalations` — period selector, conversion-rate, trend chart. ~0.5d. Optional for v1 demo.
4. **Playwright e2e** for the magic-moment demo flow (junior escalates → senior receives → senior claims → opens session). Critical so the GTM Loom recording doesn't crash mid-demo.
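
As a rough sketch of item 1, the fetch-based reader needs two pure pieces: a parser that splits buffered text into complete SSE frames, and a capped exponential backoff for reconnects. All names here (`SseEvent`, `parseSseFrames`, `backoffDelayMs`) and the 500 ms / 30 s defaults are assumptions for illustration; none of them are taken from `aiSessions.ts` or the backend.

```typescript
// Sketch only: SseEvent, parseSseFrames and backoffDelayMs are hypothetical
// names, not part of the ResolutionFlow codebase.

interface SseEvent {
  event: string; // "message" when the frame carries no `event:` field
  data: string; // multiple `data:` lines joined with "\n"
}

// Split a text buffer into complete SSE frames (each ends with a blank line).
// Returns the parsed events plus the trailing partial frame, which the caller
// keeps and prepends to the next chunk.
function parseSseFrames(buffer: string): { events: SseEvent[]; rest: string } {
  const frames = buffer.replace(/\r\n/g, "\n").split("\n\n");
  const rest = frames.pop() ?? ""; // last piece is incomplete (or empty)
  const events: SseEvent[] = [];

  for (const frame of frames) {
    let event = "message";
    const dataLines: string[] = [];

    for (const line of frame.split("\n")) {
      if (line === "" || line[0] === ":") continue; // comment or keep-alive
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value[0] === " ") value = value.slice(1); // drop one leading space

      if (field === "event") event = value;
      else if (field === "data") dataLines.push(value);
    }

    // Frames without any data line (e.g. pure comments) produce no event.
    if (dataLines.length > 0) {
      events.push({ event, data: dataLines.join("\n") });
    }
  }
  return { events, rest };
}

// Capped exponential backoff; attempt 0 is the first retry.
// The base and cap defaults are illustrative assumptions.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 30000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}
```

In the component, each decoded chunk from `response.body.getReader()` would be appended to the previous `rest` before calling `parseSseFrames` again, and a dropped connection would wait `backoffDelayMs(attempt)` before re-issuing the `fetch` with the auth header.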
## Two-metric framing — read this before quoting numbers to anyone
The in-product endpoint measures *post-claim time-to-first-action*. The "minutes recovered" sales claim is `manual_baseline - in_product_metric`. The manual baseline comes from the founder's stopwatch on the next 5 escalations ("The Assignment" in the design doc). Don't roll the in-product number alone into "minutes recovered"; that's the apples-to-oranges miscount Codex caught.
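
A minimal sketch of the arithmetic, reading the sales claim as manual baseline minus the in-product metric. The helper names are hypothetical, and averaging the stopwatch samples with a plain mean is an assumption; the design doc may specify a different aggregate.

```typescript
// Hypothetical helpers; names and the choice of mean are illustrative only.

// Average the founder's stopwatch samples (in minutes) into one baseline.
function meanMinutes(samples: number[]): number {
  if (samples.length === 0) {
    throw new Error("need at least one stopwatch sample");
  }
  const total = samples.reduce((sum, m) => sum + m, 0);
  return total / samples.length;
}

// "Minutes recovered" needs BOTH numbers: the manual baseline and the
// in-product post-claim time-to-first-action. Neither alone is the claim.
function minutesRecovered(manualBaselineMin: number, inProductMin: number): number {
  return manualBaselineMin - inProductMin;
}
```

With made-up inputs, five stopwatch samples of 5, 6, 4, 5, and 5 minutes average to a 5-minute baseline; paired with an in-product figure of 0.5 minutes, the claim would be 4.5 minutes recovered per escalation.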
## Kill-switch
Week 8: if 0 of 3 pilots produce a verifiable hours-saved-per-week number above 1.0, revisit the wedge. The design doc names the alternative direction (deterministic-ops territory) for context, but data lands first.
## Previous task — closed out
**Task:** Land PR #153 — fix the `AssistantChatPage` prefill `currentChatRef` bug. **Status:** complete (2026-04-26). Merged as `68fcdc6` on `main`.
**Background CI item, not blocking:** promoting `CI / e2e (pull_request)` to required on `main`. Two consecutive green runs cleared the threshold. Ops-only.