resolutionflow/.ai/CURRENT_TASK.md
Michael Chihlas b7d7ff06d2
docs(ai): refresh handoff for compute swap
- HANDOFF: rewritten resume point. First action on resume is `git push`
  (commits 0f00ee5 and 665530f are local-only). Visual QA + bug bash is
  the active work; 4 plan-locked items + the structural task-lane fix
  all need real-browser verification.
- CURRENT_TASK: add 0f00ee5 and 665530f to the commit table; reframe
  "Just shipped" as a per-commit summary; flag the task-lane fix as
  needing visual confirmation.
- SESSION_LOG: chronological entry for this session with full detail
  (audit, four polish items, race-condition wiring, structural
  task-lane fix, test status, files touched).
- DECISIONS: new entry "Tag the task-lane state with an owner chatId"
  documenting the structural pattern, what was rejected, and the
  forward implication that future task-lane state slices follow the
  same owner-tagging pattern.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-28 08:21:23 -04:00


CURRENT_TASK.md

Task: Build Escalation Mode — the wedge for ResolutionFlow's GTM (first paying-customer push). When a junior tech escalates a FlowPilot session, the senior tech sees structured handoff context in seconds instead of running a 5-minute verbal "tell me what you tried" call.

Status: in-flight on feat/escalation-metric-endpoint. Branch is pushed; draft PR #155 is open against main (gitea.resolutionflow.com/chihlasm/resolutionflow/pulls/155). Backend is feature-complete and test-stabilized. Frontend live-arrival SSE subscription, magic-moment handoff-context screen, and bell-icon notification fix all shipped. /escalate and /handoff are now unified through HandoffManager — every escalation creates a SessionHandoff, persists an AppNotification, fans out on the SSE bus, dispatches Slack/Teams via notify(), and emails per-user, regardless of which URL it entered through. Next: visual QA via /qa, then optional follow-ups (suggested-step chips, snapshot expansion, analytics page, Playwright e2e).

Plan: docs/plans/2026-04-27-escalation-mode-wedge-design.md. Reviewed by /office-hours, /plan-eng-review, /plan-design-review, /codex review. Eng + Design CLEARED. Codex's two-metric correction + claim role gate + per-channel notification model + SSE bus diagnostics all applied.

Test plan artifact: docs/plans/2026-04-27-escalation-mode-wedge-test-plan.md — primary input for /qa once feature-complete.

Done on feat/escalation-metric-endpoint (branched from main @ c0ed6d9)

Commit What it ships
d51e95c Plan + test-plan artifacts
52f6d03 GET /analytics/flowpilot/escalations — in-product time-to-first-action; account-scoped, engineer-or-admin gated
7a5b853 Role-gate POST /handoffs/{id}/claim to engineer-or-admin
07d0db9 HandoffManager.dispatch_escalation_notifications — emails engineer/admin teammates on intent=escalate; graceful-degradation regression
9f0bfd4 EscalationMetricCard mounted above the queue list
a283d0d .ai/ mid-flight refresh
87bd0b7 WIP marker for the SSE backend slice (paused for Codex pass)
bc15952 Codex: stabilize SSE backend tests — Depends(..., scope="function") releases auth DB deps before the long-lived stream body; SSE handshake test calls the generator directly; AI-assessment stub fixture; bus normalizes string vs UUID account_id
fff8338 Doc-only: track escalation assessment latency follow-up
9bdd995 Bound escalation assessment latency to ESCALATION_AI_ASSESSMENT_TIMEOUT_SECONDS (default 5s); handoff still creates if assessment times out
b8627f4 Frontend SSE subscription in EscalationQueue.tsx — fetch-based ReadableStream reader; handoff_created triggers refetch + prepend with locked 200ms slide-in; exponential-backoff reconnect; tab-title flash when backgrounded; prefers-reduced-motion honored; ARIA live-region
f65b657 Handoff state docs after frontend SSE slice lands
8e9d22e Magic-moment handoff-context screen on pickup — HandoffContextScreen.tsx (4 sections, graceful null AI assessment, focus management, prefers-reduced-motion); FlowPilotSessionPage.tsx integration (pre-claim handoff fetch, claim on Start here, toolbar re-open overlay)
c194ba4 Handoff state docs after magic-moment screen lands
641853a Bell-icon notification opens the pickup flow — notification link template adds ?pickup=true; GET /ai-sessions/{id} allows account-scoped read for requesting_escalation / escalated states
2a2329a Handoff state docs after bell-icon fix; record draft PR #155
029680a Unify /escalate through HandoffManager — single canonical path for every escalation. HandoffCreateRequest.target_user_id, create_handoff does the legacy enriched-package work + sets escalation_reason, finalize_escalation runs documentation + PSA push + notify() pre-commit, dispatch_escalation_notifications keeps only fire-and-forget IO post-commit. pickup_session accepts either status for in-flight migration. flowpilot_engine.escalate_session no longer called from any endpoint
8914391 First task-lane race fix — initializer-time guards (`incomingPrefill
0f00ee5 Four plan-locked wedge polish items in one commit — see "Just shipped" section below
665530f Structural fix for the task-lane stale-flash bug. taskLaneOwnerChatId state tags the chatId the in-memory questions/actions belong to. Set at every populate site (sendPrefill, selectChat, handleSend, handleTaskSubmit, handleResumeNew, refreshFacts, handleApplyFix); cleared in resetSessionDerivedState. Persistence effect now writes chatId: ownerChatId (was activeChatId — that was the original write-side bug). Render gate taskLaneIsForActiveChat = ownerChatId === activeChatId ANDed into all three render conditions. Stale data is now structurally unable to display. See DECISIONS entry for full rationale
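
Commit 9bdd995 in the table above bounds the AI assessment so that a slow model call cannot block handoff creation. A minimal sketch of that pattern, assuming an asyncio code path. The names `assess` and `create_handoff_with_bounded_assessment` and the dict return shape are hypothetical; only the environment variable name and its 5s default come from the table.

```python
import asyncio
import os

# Only the variable name and the 5s default come from the commit table.
TIMEOUT_SECONDS = float(os.environ.get("ESCALATION_AI_ASSESSMENT_TIMEOUT_SECONDS", "5"))


async def create_handoff_with_bounded_assessment(assess, timeout=TIMEOUT_SECONDS):
    """Create a handoff even when the assessment call is too slow.

    `assess` is a hypothetical zero-argument coroutine function standing in
    for the real AI assessment call.
    """
    try:
        assessment = await asyncio.wait_for(assess(), timeout=timeout)
    except asyncio.TimeoutError:
        # Degrade gracefully: the handoff is still created, just without
        # an assessment attached.
        assessment = None
    return {"status": "created", "ai_assessment": assessment}


async def _fast():
    return {"summary": "ok"}


async def _slow():
    await asyncio.sleep(1)
    return {"summary": "too late"}


fast_result = asyncio.run(create_handoff_with_bounded_assessment(_fast, timeout=0.5))
slow_result = asyncio.run(create_handoff_with_bounded_assessment(_slow, timeout=0.05))
```

In this sketch a timed-out assessment is recorded as `None`, which lines up with the "graceful null AI assessment" handling listed for HandoffContextScreen.tsx.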

Test status: full backend suite → 1103 passed in 259.63s with -n auto after the unification. Frontend tsc -b clean. End-to-end smoke test against the running dev stack confirmed: SSE handshake delivers ready + handoff_created frames; listHandoffs returns the unclaimed handoff for a senior pre-claim; claimHandoff flips session status escalated → active; senior (non-owner, non-target) can GET an in-transit session detail; a single legacy /escalate call now produces status='escalated', SessionDocumentation, SessionHandoff row, AppNotification with link /pilot/{id}?pickup=true for the team admin, and a PSA push attempt, all from one funneled HandoffManager call. Branch pushed; draft PR #155 open.
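
The smoke test checks that the stream delivers a `ready` frame followed by a `handoff_created` frame. For reference, a minimal sketch of splitting a Server-Sent Events body into named frames, using the standard `event:` / `data:` wire format. The function name and the `handoff_id` payload field are hypothetical; the real frame contents are not shown in this document.

```python
import json


def parse_sse_frames(body: str):
    """Split an SSE body into (event_name, data) tuples.

    Handles only the `event:` and `data:` fields; comments, `id:` and
    `retry:` lines are ignored in this sketch.
    """
    frames = []
    for block in body.replace("\r\n", "\n").split("\n\n"):
        event, data_lines = "message", []  # "message" is the SSE default event type
        for line in block.split("\n"):
            if line.startswith("event:"):
                event = line[len("event:"):].strip()
            elif line.startswith("data:"):
                # Drop leading spaces after the colon before parsing.
                data_lines.append(line[len("data:"):].lstrip(" "))
        if data_lines:
            frames.append((event, json.loads("\n".join(data_lines))))
    return frames


# Hypothetical payloads, for illustration only.
sample = (
    'event: ready\ndata: {}\n\n'
    'event: handoff_created\ndata: {"handoff_id": "h-1"}\n\n'
)
frames = parse_sse_frames(sample)
```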

Remaining work on this branch

  1. Visual QA + bug bash in a real browser — full pickup demo path with the four new pieces below; this is the next active step.
  2. Snapshot expansion in HandoffManager._generate_snapshot — include the recent diagnostic steps / conversation tail so the magic-moment screen's "What's been tried" section can render the actual timeline pre-claim instead of "full timeline available after pickup".
  3. Toolbar Context button on legacy-arrival sessions — currently the button only appears when the senior arrived via the magic-moment flow this session. Lazy-fetching the handoff list on session-load (when status was-escalated) would make it work on revisits.
  4. Owner-facing analytics page at /analytics/escalations — period selector, conversion-rate, trend chart. ~0.5d. Optional for v1 demo.
  5. Playwright e2e for the magic-moment demo flow (junior escalates → senior receives via SSE → senior claims → opens session). Critical for the GTM Loom not to crash mid-recording.

Just shipped (this session — 2 commits)

Commit 0f00ee5 — four plan-locked wedge polish items:

  • Live AI assessment refresh on the magic-moment screen. New HandoffAssessmentReadyEvent type + onAssessmentReady handler on streamEscalations. AssistantChatPage opens a scoped SSE subscription whenever it has a tracked handoff with no AI assessment yet; on a matching event it refetches and replaces both magicHandoff and overlayHandoff in place. Closes the loop on the async-assessment commit e8ba74e.
  • Suggested-step chips below the chat input. New chipsHidden state in AssistantChatPage defaulting to false; a chip strip renders above the composer when magicHandoff?.ai_assessment_data?.suggested_steps[] is non-empty and the magic-moment has dissolved. Click prefills input + focus; first send hides the strip; explicit X also hides. Per-session lifetime (Codex correction locked design).
  • Unread 6px dot on EscalationQueue cards. localStorage-persisted seen set (rf-escalation-seen, capped 200). Dot renders top-right of any card not yet seen. Cleared on open (card click) or claim (Pick Up) — NOT on hover (Codex correction). Pick Up onClick now stops propagation so the wrapper's open handler isn't double-fired.
  • Race-condition toast on claim conflict. New HandoffAlreadyClaimedError exception class in handoff_manager.py. claim_session now eager-loads claimed_by_user, rejects different-user re-claims (idempotent for same-user), and raises with the winner's id/name/timestamp. Endpoint translates to 409 with structured detail. AssistantChatPage.handleStartHere extracts the detail, formats "Already claimed by {name} {time_ago}." via timeAgo(), drops ?pickup=true, and dismisses the magic-moment so the loser flows back to the queue. Backed by 2 new unit tests in test_handoff_manager.py.
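
The claim-conflict rule in the last bullet (first claimer wins, same-user re-claims are idempotent, a different user gets a 409 carrying the winner's details) can be sketched as below. The names `HandoffAlreadyClaimedError` and `claim_session` come from the bullet; the in-memory `Handoff` dataclass, the `to_http` helper, the detail keys, and the user names are hypothetical stand-ins, not the shipped code.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


class HandoffAlreadyClaimedError(Exception):
    """Raised when a different user already holds the claim."""

    def __init__(self, user_id: str, name: str, claimed_at: datetime):
        super().__init__(f"Already claimed by {name}")
        self.detail = {
            "claimed_by_id": user_id,
            "claimed_by_name": name,
            "claimed_at": claimed_at.isoformat(),
        }


@dataclass
class Handoff:  # hypothetical in-memory stand-in for the SessionHandoff row
    claimed_by_id: Optional[str] = None
    claimed_by_name: Optional[str] = None
    claimed_at: Optional[datetime] = None


def claim_session(handoff: Handoff, user_id: str, name: str, now: datetime) -> Handoff:
    if handoff.claimed_by_id is None:
        handoff.claimed_by_id, handoff.claimed_by_name, handoff.claimed_at = user_id, name, now
    elif handoff.claimed_by_id != user_id:
        # A different user won the race: surface who and when.
        raise HandoffAlreadyClaimedError(
            handoff.claimed_by_id, handoff.claimed_by_name, handoff.claimed_at
        )
    # Same-user re-claim falls through unchanged (idempotent).
    return handoff


def to_http(exc: HandoffAlreadyClaimedError):
    """Hypothetical translation step: status code plus structured detail."""
    return 409, exc.detail


t0 = datetime(2026, 4, 28, 12, 0, tzinfo=timezone.utc)
h = claim_session(Handoff(), "u1", "Ana", t0)
claim_session(h, "u1", "Ana", t0)  # same user again: no error
try:
    claim_session(h, "u2", "Ben", t0)
    conflict = None
except HandoffAlreadyClaimedError as exc:
    conflict = to_http(exc)
```

This covers the decision logic only. Against a real database the check-and-set would also need to be atomic (for example via a row lock or a conditional update), which this document does not describe.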

Commit 665530f — structural fix for the recurring stale-task-lane bug. Owner-tagging pattern applied to activeQuestions / activeActions / showTaskLane. See DECISIONS.md for the architecture write-up. User-reported on next session: needs visual verification.
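
The owner-tagging rule can be modelled independently of React. The sketch below is a Python model of the invariant only: the real state lives in TypeScript component state, and the class, method, and chat id names here are hypothetical. It shows why data tagged for one chat cannot render while another chat is active.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TaskLaneState:
    """Hypothetical model of the task-lane slice; the real state lives in React."""

    active_chat_id: Optional[str] = None
    owner_chat_id: Optional[str] = None  # mirrors taskLaneOwnerChatId
    questions: List[str] = field(default_factory=list)

    def populate(self, chat_id: str, questions: List[str]) -> None:
        # Every populate site tags the data with the chat it belongs to.
        self.questions = list(questions)
        self.owner_chat_id = chat_id

    def reset(self) -> None:
        # Mirrors resetSessionDerivedState: clear data and tag together.
        self.questions = []
        self.owner_chat_id = None

    def visible_questions(self) -> List[str]:
        # Render gate: ownerChatId === activeChatId.
        if self.owner_chat_id == self.active_chat_id:
            return self.questions
        return []


state = TaskLaneState(active_chat_id="chat-a")
state.populate("chat-a", ["Is the VPN up?"])
shown_for_a = state.visible_questions()

state.active_chat_id = "chat-b"  # user switches chats before the data is refreshed
shown_for_b = state.visible_questions()
```

The stale questions are still in memory after the switch, but the gate keeps them off screen until a populate call re-tags the lane for the active chat.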

Two-metric framing — read this before quoting numbers to anyone

The in-product endpoint measures post-claim time-to-first-action. The "minutes recovered" sales claim is manual_baseline - in_product_metric. Manual baseline comes from the founder's stopwatch on the next 5 escalations (The Assignment in the design doc). Don't roll the in-product number alone into "minutes recovered": that's the apples-to-oranges miscount Codex caught.
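
A worked example of the arithmetic, reading the sales claim as manual baseline minus in-product metric. Every number below is made up for illustration and is not a measured result. Averaging the five stopwatch samples is an assumption of this sketch; the document does not say how the baseline is aggregated.

```python
def minutes_recovered(manual_baseline_min: float, in_product_min: float) -> float:
    """Minutes recovered per escalation = manual baseline - in-product metric."""
    return manual_baseline_min - in_product_min


# Illustrative numbers only; neither value is a measured result.
stopwatch_samples_min = [5.0, 6.0, 4.0, 5.5, 4.5]  # hypothetical: five timed escalations
manual_baseline = sum(stopwatch_samples_min) / len(stopwatch_samples_min)
in_product = 1.0  # hypothetical post-claim time-to-first-action, in minutes

recovered = minutes_recovered(manual_baseline, in_product)
```

Quoting `in_product` on its own as "minutes recovered" would skip the subtraction entirely, which is the miscount the paragraph warns about.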

Kill-switch

Week 8: if 0 of 3 pilots produce a verifiable hours-saved-per-week number above 1.0, revisit the wedge. The design doc names the alternative direction (deterministic-ops territory) for context, but data lands first.
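
The kill-switch condition is simple enough to state as a predicate. A minimal sketch, assuming each pilot reports either an hours-saved-per-week figure or nothing verifiable; the function name and the use of `None` for a missing figure are hypothetical.

```python
from typing import List, Optional

THRESHOLD_HOURS_PER_WEEK = 1.0  # from the kill-switch rule


def should_revisit_wedge(pilot_hours_saved: List[Optional[float]]) -> bool:
    """True when no pilot produced a verifiable number above the threshold.

    `None` stands for a pilot with no verifiable number (an assumption of
    this sketch about how missing data would be recorded).
    """
    qualifying = [
        h for h in pilot_hours_saved
        if h is not None and h > THRESHOLD_HOURS_PER_WEEK
    ]
    return len(qualifying) == 0
```

Note the strict comparison: a pilot at exactly 1.0 does not count as "above 1.0" here.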

Previous task — closed out

Task: Land PR #153 — fix the AssistantChatPage prefill currentChatRef bug. Status: complete (2026-04-26). Merged as 68fcdc6 on main.

Background CI item, not blocking: promoting CI / e2e (pull_request) to required on main. Two consecutive green runs cleared the threshold. Ops-only.