chihlasm/resolutionflow

Fork 0

Files

Michael Chihlas 7cee7228dc

Mirror to GitHub / mirror (push) Successful in 3s

Details

CI / frontend (pull_request) Successful in 5m9s

Details

CI / backend (pull_request) Successful in 9m51s

Details

CI / e2e (pull_request) Successful in 9m22s

Details

docs(ai): refresh handoff for PR #156 — pending-verification feature

Closes out Escalation Mode (PR #155 merged) and pivots active task to
the new applied_pending suggested-fix outcome on PR #156.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-04-30 17:37:08 -04:00

3.9 KiB

Raw Blame History

CURRENT_TASK.md

Task: Add a fourth, non-terminal outcome to the suggested-fix banner — Awaiting verification (applied_pending). Today the verifying banner forces a synchronous verdict (worked / didn't / partial), but a lot of real fixes are async — engineer ran the script but is waiting on the client to power-cycle, AD replication, an O365 license sync. Without a fourth state the banner sits stale or the engineer guesses wrong.

Status: ✅ Engineering complete; PR #156 open. Backend tests + tsc clean, Alembic migration applies. Pending browser QA.

Branch: feat/fix-pending-verification (off main after the Escalation Mode merge).

PR: #156

What ships

Layer	Change
Schema	New `FixStatus="applied_pending"` + new `pending_reason` Text column on `session_suggested_fixes`.
Migration	`c0f3a4b7e91d` — adds `pending_reason`, extends status CHECK constraint.
API	`PATCH /suggested-fixes/{id}/outcome` accepts `applied_pending`, requires `notes`, stamps `applied_at` only (NOT `verified_at`). Pending in/out transitions allowed (parked, like partial).
Generators	`resolution_note_generator` and `escalation_package_generator` system prompts handle the new status: resolution note frames the fix as provisional; escalation package surfaces pending verification as the leading hypothesis with reference to what's being waited on.
Frontend	New `BannerMode='pending'` + `PendingBanner` component (info-tone, mirrors `PartialBanner`); "Waiting to verify…" overflow option in `VerifyingBanner`; nudge "Still checking" now records pending with a reason instead of just silencing; `AssistantChatPage` banner-mode derivation maps `applied_pending → 'pending'`.
Tests	4 new integration tests in `test_fix_outcome_endpoint.py`: pending-requires-notes, pending stores reason + applied_at-not-verified_at, pending→success transition, pending_reason update on re-PATCH. 21/21 pass.

Out of scope (intentionally)

Cross-session "Follow-ups" dashboard rollup. The chat-anchored PendingBanner is the per-session reminder. Add the dashboard surface only if engineers report losing track across multiple pending sessions.
Optional follow-up timer ("remind me in 30m"). Nice but not the wedge.

Resume point — DO THIS NEXT

Browser QA: verify the new flow end-to-end in dev:

Trigger a suggested fix, click Apply, then open the verifying banner overflow → "Waiting to verify…" → enter a reason → confirm PendingBanner renders with the reason.
From PendingBanner, click "It worked" → confirm transition to terminal success and banner dismissal.
From PendingBanner, click "Update reason" → confirm reason updates server-side.
Trigger the nudge state (3+ post-apply messages) → click "Still checking" → enter a reason → confirm pending state takes over.

After QA passes, merge PR #156.

Just-shipped (2026-04-30)

PR #155 — Escalation Mode wedge merged into main as ac42f97. The wedge for ResolutionFlow's GTM (first paying-customer push). Senior tech sees structured handoff context in seconds via the magic-moment screen. Plan: docs/plans/2026-04-27-escalation-mode-wedge-design.md.

Two-metric framing (Escalation Mode — read before quoting numbers)

The in-product GET /analytics/flowpilot/escalations endpoint measures post-claim time-to-first-action. The "minutes recovered" sales claim is manual_baseline − in_product_metric. Manual baseline comes from the founder's stopwatch on the next 5 escalations. Don't roll the in-product number alone into "minutes recovered" — that's the apples-to-oranges miscount Codex caught.

Kill-switch (Escalation Mode)

Week 8: if 0 of 3 pilots produce a verifiable hours-saved-per-week number above 1.0, revisit the wedge.

3.9 KiB Raw Blame History Unescape Escape

CURRENT_TASK.md

What ships

Out of scope (intentionally)

Resume point — DO THIS NEXT

Just-shipped (2026-04-30)

Two-metric framing (Escalation Mode — read before quoting numbers)

Kill-switch (Escalation Mode)

3.9 KiB

Raw Blame History