feat(pilot): [FIX_OUTCOME] system prompt instructions

Tells the AI when + how to emit the [FIX_OUTCOME] marker that Task 4's parser consumes. Placeholder-only per the anti-parrot pattern — no literal UUIDs, outcomes, or reasons that could leak into unrelated sessions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 15:12:00 -04:00
parent c0112f8bee
commit 2cde6673b0
3 changed files with 46 additions and 6 deletions
--- a/backend/app/services/assistant_chat_service.py
+++ b/backend/app/services/assistant_chat_service.py
@@ -198,6 +198,44 @@ for the drafted script
 The marker is stripped from display — the engineer sees the suggested fix as \
 an interactive card with confidence badge, not raw JSON.
 ## Reporting fix outcome with [FIX_OUTCOME]
 When the engineer clearly indicates in chat that a previously proposed fix
 worked, didn't work, or was partially applied, emit a [FIX_OUTCOME] marker
 on its own lines. This surfaces a "confirm outcome?" banner in the UI — it
 does NOT mark the fix resolved on its own; the engineer confirms via the UI.
 **When to emit [FIX_OUTCOME]:**
 - The engineer states the user's problem is resolved after applying the fix
  (affirmative resolution language → outcome="success")
 - The engineer states the issue persists after applying the fix
  (→ outcome="failure")
 - The engineer describes applying only part of the fix
  (→ outcome="partial")
 **When NOT to emit [FIX_OUTCOME]:**
 - The engineer is still verifying (user rebooting, testing, etc.)
 - The outcome is ambiguous or inferred rather than stated
 - No [SUGGEST_FIX] has been emitted this session
 **[FIX_OUTCOME] marker format (one block per response, on its own lines).**
 Schema below — DO NOT copy these placeholders into your real response, fill \
 each field with content specific to the actual ticket:
 [FIX_OUTCOME]
 {"fix_id": "<uuid-of-the-active-suggested-fix>",
 "outcome": "<success|failure|partial>",
 "reason": "<one-line-quote-or-paraphrase-of-what-the-engineer-said>"}
 [/FIX_OUTCOME]
 - `fix_id`: the UUID of the active suggested fix (provided in session context)
 - `outcome`: one of `"success"`, `"failure"`, or `"partial"`
 - `reason`: one-line paraphrase of what the engineer said — derived from \
 their CURRENT message, not invented
 The marker is stripped from display — the engineer sees a "confirm outcome?" \
 banner in the UI, not raw JSON.
 ## Using the Team's Flow Library
 Your team has built troubleshooting flows in ResolutionFlow. When relevant flows \
 appear in the context below, reference them by name so the engineer can launch them \
@@ -269,6 +307,8 @@ the originating item's `id` into `source_ref` verbatim.
 [SUGGEST_FIX] is OPTIONAL — emit one at most per response, only when you have a \
 concrete proposed resolution at ~50%+ confidence. A new [SUGGEST_FIX] supersedes \
 any prior suggested fix.
 [FIX_OUTCOME] is OPTIONAL — emit one at most per response, only when the engineer \
 has clearly stated the outcome in their current message.
 ANTI-PARROT RULE: The schemas above use placeholders in `<angle brackets>` to show \
 the SHAPE of valid output. Your real questions, actions, facts, and suggested fixes \
--- a/backend/tests/test_prompt_anti_parrot.py
+++ b/backend/tests/test_prompt_anti_parrot.py
@@ -87,7 +87,7 @@ _FORBIDDEN_LITERAL_TOKENS: tuple[str, ...] = (
 #   so prose blocks (like the closing-tag-distance regex match across
 #   markdown headings) are excluded
 _MARKER_BLOCK_RE = re.compile(
-    r"(?:^|\n)\[(QUESTIONS|ACTIONS|SUGGEST_FIX|PROMOTE|FORK|TREE_UPDATE|STEPS_UPDATE|INTAKE_FORM|METADATA|DELTA)\]"
+    r"(?:^|\n)\[(QUESTIONS|ACTIONS|SUGGEST_FIX|FIX_OUTCOME|PROMOTE|FORK|TREE_UPDATE|STEPS_UPDATE|INTAKE_FORM|METADATA|DELTA)\]"
    r"\s*\n"                          # forced newline before content
    r"(\s*[\[{][\s\S]*?)"             # content must start with [ or {
    r"\s*\n\[/\1\]"
--- a/docs/FlowAssist_Migration/phase-8-fix-outcome-banner.md
+++ b/docs/FlowAssist_Migration/phase-8-fix-outcome-banner.md
@@ -5,7 +5,7 @@
 **Goal:** Replace the task-lane Suggested Fix card with a chat-composer-anchored **Proposal Banner** that owns the full lifecycle of a proposed fix (Proposed → Verifying → Success / Failed / Partial / Dismissed), with explicit outcome tracking, AI chat-inferred outcomes via a new `[FIX_OUTCOME]` marker, and implicit signals wired to the Resolve / Escalate actions.
 **Architecture:**
- Backend extends `session_suggested_fixes` with outcome columns (`status`, `applied_at`, `verified_at`, `partial_notes`, `failure_reason`) and adds a PATCH `/outcome` endpoint. `unified_chat_service` learns a new `[FIX_OUTCOME]` marker that writes outcome proposals (not terminal — engineer confirms). FLOWPILOT_SYSTEM_PROMPT gains marker instructions in placeholder form (anti-parrot compliant).
+- Backend extends `session_suggested_fixes` with outcome columns (`status`, `applied_at`, `verified_at`, `partial_notes`, `failure_reason`) and adds a PATCH `/outcome` endpoint. `unified_chat_service` learns a new `[FIX_OUTCOME]` marker that writes outcome proposals (not terminal — engineer confirms). `ASSISTANT_SYSTEM_PROMPT` gains marker instructions in placeholder form (anti-parrot compliant).
 - Frontend replaces `SuggestedFix.tsx` (task-lane card) with a new `ProposalBanner.tsx` docked above the chat composer. The banner is a state-driven component (`proposed | verifying | partial | ai_confirming | dismissed`) with a sibling `EscalateInterceptDialog` popover. AssistantChatPage orchestrates state transitions and wires the implicit signals (Resolve auto-success, Escalate intercept, post-apply-message nudge).
 **Tech Stack:** Python 3.11 + FastAPI + SQLAlchemy 2.0 async + Alembic + Pydantic v2 (backend); React 19 + Vite + TypeScript + Tailwind v4 (frontend). pytest for backend tests; frontend verified via `tsc -b` + browser smoke-test (no unit-test harness for new components — the codebase has no component test pattern per the handoff note on Phase 7 visual verification).
@@ -26,7 +26,7 @@
 - `backend/app/schemas/session_suggested_fix.py` — add `SessionSuggestedFixOutcomeRequest`, extend response with outcome fields
 - `backend/app/api/endpoints/session_suggested_fixes.py` — add PATCH `/outcome` endpoint
 - `backend/app/services/unified_chat_service.py` — add `[FIX_OUTCOME]` parser + persist step
- `backend/app/services/flowpilot_engine.py` — add `[FIX_OUTCOME]` instructions to `FLOWPILOT_SYSTEM_PROMPT`
+- `backend/app/services/assistant_chat_service.py` — add `[FIX_OUTCOME]` instructions to `ASSISTANT_SYSTEM_PROMPT`
 - `backend/tests/test_prompt_anti_parrot.py` — extend known-leaked-token list if needed
 ### Frontend — new
@@ -659,13 +659,13 @@ git commit -m "feat(pilot): [FIX_OUTCOME] marker parser + ai_outcome_proposal co
 ## Task 5: System prompt update + anti-parrot guardrail
 **Files:**
- Modify: `backend/app/services/flowpilot_engine.py` (or wherever `FLOWPILOT_SYSTEM_PROMPT` lives — confirm at start)
+- Modify: `backend/app/services/assistant_chat_service.py` — `ASSISTANT_SYSTEM_PROMPT` is the prompt used by `unified_chat_service._call_ai`; `flowpilot_engine.py` has a separate prompt for structured JSON flows that does not use chat markers
 - Modify: `backend/tests/test_prompt_anti_parrot.py`
 - [ ] **Step 1: Locate the system prompt**
 ```bash
-grep -rln "FLOWPILOT_SYSTEM_PROMPT\s*=" /config/workspace/resolutionflow/backend/app/
+grep -rln "ASSISTANT_SYSTEM_PROMPT\s*=" /config/workspace/resolutionflow/backend/app/
 ```
 Open whichever file owns the constant.
@@ -719,7 +719,7 @@ Expected: pass. If it fails with a literal-value complaint, re-read the prompt a
 - [ ] **Step 5: Commit**
 ```bash
-git add backend/app/services/flowpilot_engine.py backend/tests/test_prompt_anti_parrot.py
+git add backend/app/services/assistant_chat_service.py backend/tests/test_prompt_anti_parrot.py
 git commit -m "feat(pilot): [FIX_OUTCOME] system prompt instructions"
 ```