Files
resolutionflow/.ai/HANDOFF.md
Michael Chihlas a283d0d3fd docs(ai): refresh handoff state mid-flight on Escalation Mode build
Capture the in-flight state of the Escalation Mode wedge build so the next
session (or Codex resume) picks up cleanly without re-deriving context:

- CURRENT_TASK now describes the wedge, what's done across the 5 commits on
  this branch, what remains (WebSocket push, magic-moment screen, analytics
  page, e2e), and the two-metric framing readers MUST internalize before
  quoting numbers
- HANDOFF resume point is the WebSocket/SSE push (live-arrival half of the
  notification dual-path); includes suggested first slice + watch-outs
  (no user_id on ai_session_step, denormalized account_id, peer-escalation
  still gated to session owner)
- Both files reference the design doc and the kill-switch criterion

No code change.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-27 16:38:14 -04:00

4.8 KiB
Raw Blame History

HANDOFF.md

Last updated: 2026-04-27 EDT

Active task: Escalation Mode wedge build. See CURRENT_TASK.md for the full status; this file holds the resume point only.

Branch: feat/escalation-metric-endpoint — five commits stacked on top of main (c0ed6d9). Nothing pushed yet.

9f0bfd4  feat(escalations): mount time-to-first-action stat-card on /escalations
07d0db9  feat(handoff): email engineer-or-admin teammates on escalation
7a5b853  feat(api): role-gate handoff claim to engineer-or-admin
52f6d03  feat(analytics): add escalation time-to-first-action metric endpoint
d51e95c  docs(plans): add escalation-mode wedge design + test plan

Resume point

Pick up the WebSocket/SSE push — the live-arrival half of the notification dual-path. Email is already wired (commit 07d0db9); push is the second channel that makes the demo's "30-second magic moment" undeniable when the receiving senior is online and on the queue page.

Suggested first slice: a thin server-side SSE endpoint scoped to current_user.account_id, fan out from HandoffManager.dispatch_escalation_notifications (alongside email), and hook the frontend EscalationQueue to subscribe and prepend new cards with the locked 200ms slide-in. Reconnect logic, tab-title flash, and prefers-reduced-motion respect are part of this slice per the locked UI spec in the design doc.

After the dual-path is feature-complete, the magic-moment handoff-context screen is next (4 sections, dissolves into the FlowPilot session view on first action).

Where things stand

  • CI on main still healthy. Branch protection: CI / frontend (pull_request) required, CI / backend (pull_request) required, CI / e2e (pull_request) not yet required (ops-only follow-up — two consecutive green runs cleared the threshold).
  • 20 backend tests green on this branch (handoff_manager, session_handoffs_api, flowpilot_analytics_escalations). Frontend tsc -b clean. Branch has not been pushed; no CI runs yet.
  • The plan doc at docs/plans/2026-04-27-escalation-mode-wedge-design.md is the source of truth for every UI / metric / scope decision. The embedded GSTACK REVIEW REPORT at the bottom shows Eng + Design CLEARED and Codex INFO with the disposition of all 12 of its findings.

Useful breadcrumbs

  • New endpoint: backend/app/api/endpoints/flowpilot_analytics.pyget_escalation_metrics at the bottom of the file.
  • Notification dispatch: backend/app/services/handoff_manager.pydispatch_escalation_notifications. Wired in backend/app/api/endpoints/session_handoffs.py after db.commit() so a rolled-back handoff never emails.
  • Frontend stat-card: frontend/src/components/flowpilot/EscalationMetricCard.tsx. Renders n_with_action / n_claimed, avg + median, and the metric_definition disclaimer.
  • Two-metric framing — required reading before quoting any number to a pilot. The in-product endpoint measures post-claim time-to-first-action; the savings claim is manual_baseline in_product. Manual baseline comes from the founder's stopwatch on the next 5 escalations (The Assignment in the design doc).
  • The notification_sent boolean is intentionally NOT being written. Per Codex's correction it should be replaced by per-channel delivery records; v1.x story. For now, application logs are the audit trail.
  • Two TODOs added during this session: peer-tech escalation (deferred to v2) and the (already moved-in-scope) claim role gate. See TODO.md.

Watch-outs

  • ai_session_step has NO user_id column — the metric query keys "first action by senior" off session_id + created_at > claimed_at, which is fine because session activity post-claim IS the senior's activity (the session is reactivated under escalated_to_id). If a future change adds user_id to ai_session_step, the metric query can become more precise.
  • account_id is denormalized on ai_session_step (Phase 4 RLS pattern). The metric query and any new SSE subscription scoping must use it directly, not join through ai_sessions.
  • POST /handoff still requires the session owner to be the escalator (AISession.user_id == current_user.id). Peer-tech escalation is captured as a v2 TODO. Don't widen this without a UX decision.

Kill-switch (week 8)

If 0 of 3 pilots produce a verifiable hours-saved-per-week number above 1.0, revisit the wedge. The design doc names the alternative direction (deterministic-ops territory) but data lands first.