Files

Mirror to GitHub / mirror (push) Successful in 5s

Details

chore(ai): post-#156 handoff — feature shipped, QA report attached

Updates the .ai/ handoff trio after PR #156 merge:
- CURRENT_TASK.md: clear active task; record #156 in Recently shipped
  alongside #155 with one-line summary and QA-report pointer.
- HANDOFF.md: rewrite resume point as "pick next from TODO/roadmap";
  document carry-forward env quirks (CONTAINER=1 for Chromium,
  docker-01 hosts entry, multi-head alembic state).
- SESSION_LOG.md: append session entry for QA + merge.

Also includes the .gstack/qa-reports/ artifacts (report + 8 screenshots).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-04-30 23:45:10 -04:00

2.6 KiB

Raw Blame History

CURRENT_TASK.md

Active task: None — pick next from .ai/TODO.md or 03-DEVELOPMENT-ROADMAP.md.

Recently shipped

2026-05-01 — PR #156 Suggested-fix applied_pending non-terminal outcome. Merged into main as 3ba4532. Adds:
- Schema/API: FixStatus="applied_pending", pending_reason Text column, migration c0f3a4b7e91d. PATCH /suggested-fixes/{id}/outcome accepts pending, requires notes, stamps applied_at only.
- UI: PendingBanner (info-tone, worked / didn't / update reason / dismiss). "Waiting to verify…" overflow option in VerifyingBanner. Nudge "Still checking" records pending with a reason. Page-level Resolve auto-patches pending → success before resolution flow; page-level Escalate intercepts pending the same way verifying/partial does.
- Generators: resolution_note_generator and escalation_package_generator system prompts handle the new status without real-looking examples.
- Tests: 4 new in test_fix_outcome_endpoint.py (21/21 suite green); prompt anti-parrot guardrail green; tsc + Vite build clean.
- QA report: .gstack/qa-reports/qa-report-pending-verification-2026-04-30.md (5/7 scripted checks PASS with concrete evidence; 2 entry-path checks deferred — same handlers verified via tested transitions).
2026-04-30 — PR #155 Escalation Mode wedge merged as ac42f97. Senior-tech magic-moment screen. Plan: docs/plans/2026-04-27-escalation-mode-wedge-design.md.

Two-metric framing (Escalation Mode — read before quoting numbers)

The in-product GET /analytics/flowpilot/escalations endpoint measures post-claim time-to-first-action. The "minutes recovered" sales claim is manual_baseline − in_product_metric. Manual baseline comes from the founder's stopwatch on the next 5 escalations. Don't roll the in-product number alone into "minutes recovered" — that's the apples-to-oranges miscount Codex caught.

Kill-switch (Escalation Mode)

Week 8: if 0 of 3 pilots produce a verifiable hours-saved-per-week number above 1.0, revisit the wedge.

Notes for next session

Drive checks 1 (VerifyingBanner overflow → "Waiting to verify…") and 5 (nudge "Still checking" with 3+ post-apply messages) in real pilot usage to close the QA gap left by /qa (the tested handlers cover the same mutations, but the entry-path UI rendering wasn't exercised end-to-end).
Consider monitoring how often pending fixes get parked vs resolved — if engineers report losing track across sessions, revisit the cross-session "Follow-ups" dashboard rollup that was scoped out.

2.6 KiB Raw Blame History Unescape Escape