Files
resolutionflow/.ai/CURRENT_TASK.md
Michael Chihlas 7cee7228dc
All checks were successful
Mirror to GitHub / mirror (push) Successful in 3s
CI / frontend (pull_request) Successful in 5m9s
CI / backend (pull_request) Successful in 9m51s
CI / e2e (pull_request) Successful in 9m22s
docs(ai): refresh handoff for PR #156 — pending-verification feature
Closes out Escalation Mode (PR #155 merged) and pivots active task to
the new applied_pending suggested-fix outcome on PR #156.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-30 17:37:08 -04:00

3.9 KiB
Raw Blame History

CURRENT_TASK.md

Task: Add a fourth, non-terminal outcome to the suggested-fix banner — Awaiting verification (applied_pending). Today the verifying banner forces a synchronous verdict (worked / didn't / partial), but a lot of real fixes are async — engineer ran the script but is waiting on the client to power-cycle, AD replication, an O365 license sync. Without a fourth state the banner sits stale or the engineer guesses wrong.

Status: Engineering complete; PR #156 open. Backend tests + tsc clean, Alembic migration applies. Pending browser QA.

Branch: feat/fix-pending-verification (off main after the Escalation Mode merge).

PR: #156

What ships

Layer Change
Schema New FixStatus="applied_pending" + new pending_reason Text column on session_suggested_fixes.
Migration c0f3a4b7e91d — adds pending_reason, extends status CHECK constraint.
API PATCH /suggested-fixes/{id}/outcome accepts applied_pending, requires notes, stamps applied_at only (NOT verified_at). Pending in/out transitions allowed (parked, like partial).
Generators resolution_note_generator and escalation_package_generator system prompts handle the new status: resolution note frames the fix as provisional; escalation package surfaces pending verification as the leading hypothesis with reference to what's being waited on.
Frontend New BannerMode='pending' + PendingBanner component (info-tone, mirrors PartialBanner); "Waiting to verify…" overflow option in VerifyingBanner; nudge "Still checking" now records pending with a reason instead of just silencing; AssistantChatPage banner-mode derivation maps applied_pending → 'pending'.
Tests 4 new integration tests in test_fix_outcome_endpoint.py: pending-requires-notes, pending stores reason + applied_at-not-verified_at, pending→success transition, pending_reason update on re-PATCH. 21/21 pass.

Out of scope (intentionally)

  • Cross-session "Follow-ups" dashboard rollup. The chat-anchored PendingBanner is the per-session reminder. Add the dashboard surface only if engineers report losing track across multiple pending sessions.
  • Optional follow-up timer ("remind me in 30m"). Nice but not the wedge.

Resume point — DO THIS NEXT

Browser QA: verify the new flow end-to-end in dev:

  1. Trigger a suggested fix, click Apply, then open the verifying banner overflow → "Waiting to verify…" → enter a reason → confirm PendingBanner renders with the reason.
  2. From PendingBanner, click "It worked" → confirm transition to terminal success and banner dismissal.
  3. From PendingBanner, click "Update reason" → confirm reason updates server-side.
  4. Trigger the nudge state (3+ post-apply messages) → click "Still checking" → enter a reason → confirm pending state takes over.

After QA passes, merge PR #156.

Just-shipped (2026-04-30)

PR #155 — Escalation Mode wedge merged into main as ac42f97. The wedge for ResolutionFlow's GTM (first paying-customer push). Senior tech sees structured handoff context in seconds via the magic-moment screen. Plan: docs/plans/2026-04-27-escalation-mode-wedge-design.md.

Two-metric framing (Escalation Mode — read before quoting numbers)

The in-product GET /analytics/flowpilot/escalations endpoint measures post-claim time-to-first-action. The "minutes recovered" sales claim is manual_baseline in_product_metric. Manual baseline comes from the founder's stopwatch on the next 5 escalations. Don't roll the in-product number alone into "minutes recovered" — that's the apples-to-oranges miscount Codex caught.

Kill-switch (Escalation Mode)

Week 8: if 0 of 3 pilots produce a verifiable hours-saved-per-week number above 1.0, revisit the wedge.