Files
resolutionflow/.ai/CURRENT_TASK.md
Michael Chihlas 5bee264d70 fix(suggested-fix-pending): apply PR #156 review fixes
- Page-level Resolve patches applied_pending → applied_success before
  opening the resolution flow, so resolved sessions don't carry a
  provisional pending fix.
- Page-level Escalate intercept now catches applied_pending in addition
  to verifying/partial; intercept copy generalized from "Verifying state"
  to "still needs an outcome."
- PendingBanner gains a Dismiss action, matching the PR body and the
  backend's allowed pending → dismissed transition.
- resolution_note_generator and escalation_package_generator system
  prompts no longer include real-looking pending examples (anti-parrot
  guardrail compliance).

Verified via Docker: prompt anti-parrot 2/2, suggested-fix outcome suite
21/21, frontend tsc -b clean, npm run build clean.

Co-Authored-By: Codex <noreply@openai.com>
2026-04-30 23:02:46 -04:00

51 lines
4.5 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# CURRENT_TASK.md
**Task:** Add a fourth, non-terminal outcome to the suggested-fix banner — **Awaiting verification** (`applied_pending`). Today the verifying banner forces a synchronous verdict (worked / didn't / partial), but a lot of real fixes are async — engineer ran the script but is waiting on the client to power-cycle, AD replication, an O365 license sync. Without a fourth state the banner sits stale or the engineer guesses wrong.
**Status:****Engineering complete; PR #156 open.** Backend tests, prompt guardrail, frontend tsc/build clean; Alembic migration applies. Pending browser QA.
**Branch:** `feat/fix-pending-verification` (off `main` after the Escalation Mode merge).
**PR:** https://gitea.resolutionflow.com/chihlasm/resolutionflow/pulls/156
## What ships
| Layer | Change |
|---|---|
| Schema | New `FixStatus="applied_pending"` + new `pending_reason` Text column on `session_suggested_fixes`. |
| Migration | `c0f3a4b7e91d` — adds `pending_reason`, extends status CHECK constraint. |
| API | `PATCH /suggested-fixes/{id}/outcome` accepts `applied_pending`, requires `notes`, stamps `applied_at` only (NOT `verified_at`). Pending in/out transitions allowed (parked, like partial). |
| Generators | `resolution_note_generator` and `escalation_package_generator` system prompts handle the new status without real-looking examples; resolution note frames the fix as provisional; escalation package surfaces pending verification as the leading hypothesis with reference to what's being waited on. |
| Frontend | New `BannerMode='pending'` + `PendingBanner` component (info-tone, mirrors `PartialBanner`) with worked / didn't / update reason / dismiss actions; "Waiting to verify…" overflow option in `VerifyingBanner`; nudge "Still checking" now records pending with a reason instead of just silencing; `AssistantChatPage` banner-mode derivation maps `applied_pending → 'pending'`; page-level Resolve/Escalate now handle pending honestly. |
| Tests | 4 new integration tests in `test_fix_outcome_endpoint.py`: pending-requires-notes, pending stores reason + applied_at-not-verified_at, pending→success transition, pending_reason update on re-PATCH. 21/21 pass. Prompt anti-parrot guardrail passes. Frontend `tsc -b` and Vite build pass via Docker. |
## Out of scope (intentionally)
- Cross-session "Follow-ups" dashboard rollup. The chat-anchored `PendingBanner` is the per-session reminder. Add the dashboard surface only if engineers report losing track across multiple pending sessions.
- Optional follow-up timer ("remind me in 30m"). Nice but not the wedge.
## Resume point — DO THIS NEXT
**Browser QA:** verify the new flow end-to-end in dev:
1. Trigger a suggested fix, click Apply, then open the verifying banner overflow → "Waiting to verify…" → enter a reason → confirm `PendingBanner` renders with the reason.
2. From `PendingBanner`, click "It worked" → confirm transition to terminal success and banner dismissal.
3. From `PendingBanner`, click "Update reason" → confirm reason updates server-side.
4. From `PendingBanner`, click "Dismiss" → confirm banner dismissal and no terminal timestamps.
5. Trigger the nudge state (3+ post-apply messages) → click "Still checking" → enter a reason → confirm pending state takes over.
6. From a pending fix, use page-level Resolve → confirm the fix is patched to `applied_success` before the resolution note.
7. From a pending fix, use page-level Escalate → confirm the outcome intercept appears before the conclude modal.
After QA passes, commit/push the local review fixes and merge PR #156.
## Just-shipped (2026-04-30)
**PR #155 — Escalation Mode wedge** merged into main as `ac42f97`. The wedge for ResolutionFlow's GTM (first paying-customer push). Senior tech sees structured handoff context in seconds via the magic-moment screen. Plan: [`docs/plans/2026-04-27-escalation-mode-wedge-design.md`](../docs/plans/2026-04-27-escalation-mode-wedge-design.md).
## Two-metric framing (Escalation Mode — read before quoting numbers)
The in-product `GET /analytics/flowpilot/escalations` endpoint measures *post-claim time-to-first-action*. The "minutes recovered" sales claim is `manual_baseline in_product_metric`. Manual baseline comes from the founder's stopwatch on the next 5 escalations. Don't roll the in-product number alone into "minutes recovered" — that's the apples-to-oranges miscount Codex caught.
## Kill-switch (Escalation Mode)
Week 8: if 0 of 3 pilots produce a verifiable hours-saved-per-week number above 1.0, revisit the wedge.