feat(pilot): Phase 2 — What we know (facts) with stable task-lane IDs

Adds the load-bearing structural feature of the FlowPilot migration: a "What we know" panel that holds confirmed facts for a session, fed by AI [PROMOTE] markers and engineer-added notes. Facts feed the resolution note preview (Phase 3) and survive across turns via stable UUIDs assigned to pending_task_lane items. Backend: - FactSynthesisService: create/update/soft-delete facts with atomic state_version bumps; LLM-backed synthesize_from_question/check on the fact_synthesis (Haiku) action tier per Section 6.6. - /api/v1/ai-sessions/{id}/facts CRUD + /facts/promote (proposed_text or via synthesis). PATCH returns 403 for question/diagnostic_check facts (edit the source item instead, Section 7.3). - unified_chat_service: [PROMOTE] marker parser (JSON-block per Section 8.1 spec drift note), stable-UUID assignment for pending_task_lane questions/actions preserved by exact text/label match across turns. - ASSISTANT_SYSTEM_PROMPT: documents [PROMOTE] format, when to/not to emit, hallucination guardrails, source_ref handling. - 17 tests covering parser, stable IDs, service validation, CRUD, editability rule, both promote modes, 422 null-synthesis path, state_version invariant. Frontend: - src/components/pilot/sections/{WhatWeKnow,WhatWeKnowItem,AddNoteButton} — green-gradient section above Questions, dashed-circle check, inline edit/delete gated by the server's editable flag. - TaskLane gains a whatWeKnowSlot prop (existing assistant/ folder kept per the doc's "rename is opportunistic" guidance). - AssistantChatPage fetches facts on selectChat and refetches after each chat send (so [PROMOTE]-synthesized facts appear immediately); auto- opens the lane when facts exist. Verification: end-to-end smoke against the local docker stack confirms all five endpoints (list/create/patch/delete/promote) plus the 403 editability rule. pytest suite verifies the same with mocked LLM. Live [PROMOTE] flow remains untested until used in the UI — the marker shape is covered by parser tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 21:13:44 -04:00
parent 19cfd71995
commit 625dba7548
15 changed files with 1922 additions and 21 deletions
--- a/backend/app/services/assistant_chat_service.py
+++ b/backend/app/services/assistant_chat_service.py
@@ -62,9 +62,12 @@ Every response you write MUST follow this exact structure:
 1. **1-3 sentences of analysis** (what the symptoms tell you)
 2. **[QUESTIONS] marker** with 1-3 questions for the engineer (if you need info)
 3. **[ACTIONS] marker** with 1-4 diagnostic commands to run (if applicable)
+4. **[PROMOTE] marker(s)** when the engineer's most recent message confirmed a fact \
+worth recording (optional; see "Promoting facts" below)

 You MUST include at least one marker ([QUESTIONS] or [ACTIONS]) in every response. \
-A response with only prose and no markers is INVALID and will break the UI.
+A response with only prose and no markers is INVALID and will break the UI. \
+[PROMOTE] is optional and IN ADDITION to the required markers, never a replacement.

 ### Complete example of a correct first response:

@@ -112,6 +115,50 @@ information is no longer needed to resolve the issue. Default to keeping them.
 **Both markers are stripped from display** — the engineer sees them as interactive UI cards, \
 not raw JSON. Put analysis BEFORE markers. Markers go at the END of your response.

+## Promoting facts to "What we know"
+
+The engineer has a "What we know" panel that holds confirmed facts about this \
+session. Each confirmed fact stays visible to the engineer for the rest of the \
+session and feeds the resolution note posted to the customer ticket. Surface \
+facts there using a `[PROMOTE]` marker.
+
+**When to emit [PROMOTE]:**
+- The engineer just answered a [QUESTIONS] item with a substantive answer that \
+rules something in or out
+- The engineer just shared diagnostic-check output that confirmed a finding
+- You synthesized a new conclusion from two or more prior facts
+
+**When NOT to emit [PROMOTE]:**
+- The engineer's answer was "unknown", "I don't know", or a clarifying question \
+back to you
+- The diagnostic output was empty, errored, or inconclusive
+- You're re-stating something already in What we know
+- The "fact" is your own hypothesis, not something the engineer confirmed
+
+**[PROMOTE] marker format:**
+Each fact is its own block. You may emit multiple blocks per response.
+
+[PROMOTE]
+{"source_type": "question", "source_ref": "<task_lane_item_id>", "text": "<one short past-tense sentence stating what is now confirmed>", "summary": "<3-7 word provenance label, e.g. 'rules out tenant/license'>"}
+[/PROMOTE]
+
+- `source_type` is one of: `"question"` (fact derived from a question's answer), \
+`"diagnostic_check"` (fact derived from a check's output), or `"ai_synthesis"` \
+(you combined prior facts).
+- `source_ref` is the `id` field of the originating task-lane item — the \
+[QUESTIONS] and [ACTIONS] payloads you receive in conversation context include \
+an `id` for each item. Copy that UUID verbatim. For `ai_synthesis`, OMIT \
+`source_ref` (or set it to null).
+- `text` is a short past-tense sentence ("OWA login confirmed working for \
+jsmith"). Use ONLY information present in the engineer's message — never invent \
+specifics.
+- `summary` names the diagnostic value (what the fact rules in or out), 3-7 \
+words, no period.
+
+**Strict rule:** [PROMOTE] is for confirmed facts only. If you're not certain \
+the engineer's message confirms the fact, do not emit a [PROMOTE]. Hallucinated \
+facts get posted to customer tickets and will erode trust in the system.
+
 ## Using the Team's Flow Library
 Your team has built troubleshooting flows in ResolutionFlow. When relevant flows \
 appear in the context below, reference them by name so the engineer can launch them \
@@ -182,6 +229,9 @@ No exceptions. Not even when forking. A response without at least one of these m
 will crash the UI. If you are unsure, include both. The markers are REQUIRED output, not optional.
 If any tasks in the engineer's message are marked `_(not yet completed)_`, re-include them \
 in your markers unless you are ≥75% confident that information is no longer relevant.
+[PROMOTE] markers are OPTIONAL and IN ADDITION to the required ones — emit them only \
+when the engineer's most recent message confirmed something worth recording, and copy \
+the originating item's `id` into `source_ref` verbatim.
 """