Adds the load-bearing structural feature of the FlowPilot migration: a
"What we know" panel that holds confirmed facts for a session, fed by AI
[PROMOTE] markers and engineer-added notes. Facts feed the resolution
note preview (Phase 3) and survive across turns via stable UUIDs assigned
to pending_task_lane items.
Backend:
- FactSynthesisService: create/update/soft-delete facts with atomic
state_version bumps; LLM-backed synthesize_from_question/check on the
fact_synthesis (Haiku) action tier per Section 6.6.
- /api/v1/ai-sessions/{id}/facts CRUD + /facts/promote (proposed_text or
via synthesis). PATCH returns 403 for question/diagnostic_check facts
(edit the source item instead, Section 7.3).
- unified_chat_service: [PROMOTE] marker parser (JSON-block per Section
8.1 spec drift note), stable-UUID assignment for pending_task_lane
questions/actions preserved by exact text/label match across turns.
- ASSISTANT_SYSTEM_PROMPT: documents [PROMOTE] format, when to/not to
emit, hallucination guardrails, source_ref handling.
- 17 tests covering parser, stable IDs, service validation, CRUD,
editability rule, both promote modes, 422 null-synthesis path,
state_version invariant.
Frontend:
- src/components/pilot/sections/{WhatWeKnow,WhatWeKnowItem,AddNoteButton}
— green-gradient section above Questions, dashed-circle check, inline
edit/delete gated by the server's editable flag.
- TaskLane gains a whatWeKnowSlot prop (existing assistant/ folder kept
per the doc's "rename is opportunistic" guidance).
- AssistantChatPage fetches facts on selectChat and refetches after each
chat send (so [PROMOTE]-synthesized facts appear immediately); auto-
opens the lane when facts exist.
Verification: end-to-end smoke against the local docker stack confirms
all five endpoints (list/create/patch/delete/promote) plus the 403
editability rule. pytest suite verifies the same with mocked LLM. Live
[PROMOTE] flow remains untested until used in the UI — the marker shape
is covered by parser tests.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
286 lines
11 KiB
Python
286 lines
11 KiB
Python
"""FactSynthesisService — converts engineer answers and check output into facts.
|
|
|
|
Two paths feed this service:
|
|
|
|
1. **AI marker path (the common case).** When the model emits a `[PROMOTE]`
|
|
marker in the chat stream, `unified_chat_service` parses the marker (which
|
|
already contains the engineer-readable `text` and short provenance `summary`)
|
|
and calls `create_fact` directly. No LLM call is needed — the model already
|
|
wrote the fact.
|
|
|
|
2. **Engineer-driven synthesize path.** The "+ Promote to What we know" affordance
|
|
in the UI sends a raw answer or check output and asks the server to draft
|
|
`text` + `summary` for review. `synthesize_from_question` /
|
|
`synthesize_from_check` make a small Haiku call for that draft. The engineer
|
|
confirms (or edits) before persistence, so the LLM output is never
|
|
silently posted to a customer ticket.
|
|
|
|
Either way, persistence funnels through `create_fact`, which ALSO bumps
|
|
`ai_sessions.state_version` so the resolution-note preview cache invalidates
|
|
(see FLOWPILOT-MIGRATION.md Section 5.5).
|
|
|
|
Model tier is `fact_synthesis` in `settings.ACTION_MODEL_MAP` (Haiku per
|
|
Section 6.6). MCP is intentionally disabled for synthesis — these are
|
|
pure transformations of input, not research calls.
|
|
"""
|
|
from __future__ import annotations
|
|
|
|
import json
|
|
import logging
|
|
import re
|
|
from typing import Any
|
|
from uuid import UUID
|
|
|
|
from sqlalchemy import select, update
|
|
from sqlalchemy.ext.asyncio import AsyncSession
|
|
|
|
from app.core.ai_provider import get_ai_provider
|
|
from app.core.config import settings
|
|
from app.models.ai_session import AISession
|
|
from app.models.session_fact import SessionFact
|
|
|
|
logger = logging.getLogger(__name__)
|
|
|
|
|
|
# Conservative synthesis prompt. Hallucinated specifics are a trust-killer
|
|
# because facts feed the resolution note posted to customer tickets — the
|
|
# prompt makes "no fact" an explicit, valid output.
|
|
_SYNTHESIS_SYSTEM_PROMPT = """\
|
|
You convert one engineer answer or one diagnostic-check output into a single \
|
|
candidate fact for a troubleshooting session's "What we know" log.
|
|
|
|
Return strict JSON with this shape:
|
|
{
|
|
"text": "<one short sentence stating what is now known, in past tense>",
|
|
"summary": "<3-7 word provenance label, e.g. 'rules out tenant/license'>"
|
|
}
|
|
|
|
If the answer/output does NOT contain a substantive fact (e.g. the engineer \
|
|
typed 'unknown', the command failed, the output is empty), return:
|
|
{
|
|
"text": null,
|
|
"summary": null
|
|
}
|
|
|
|
Strict rules:
|
|
- Use ONLY information present in the input. Never add details that were not stated.
|
|
- Do not speculate, infer causes, or extrapolate. State only what the input proves.
|
|
- The text is a fact a colleague could verify by looking at the original answer/output.
|
|
- The summary names the diagnostic value (what this fact rules in or out), not the topic.
|
|
- Output ONLY the JSON object, no prose, no markdown fences.
|
|
"""
|
|
|
|
|
|
class FactSynthesisService:
|
|
"""Persists session facts and (optionally) drafts them via an LLM call.
|
|
|
|
Methods that touch the database take an `AsyncSession` and assume the
|
|
caller commits. `create_fact` flushes so the returned row has a primary key.
|
|
"""
|
|
|
|
def __init__(self, db: AsyncSession) -> None:
|
|
self.db = db
|
|
|
|
# ── Persistence ────────────────────────────────────────────────────────
|
|
|
|
async def create_fact(
|
|
self,
|
|
*,
|
|
session_id: UUID,
|
|
account_id: UUID,
|
|
user_id: UUID,
|
|
source_type: str,
|
|
text: str,
|
|
summary: str | None = None,
|
|
source_ref: UUID | None = None,
|
|
) -> SessionFact:
|
|
"""Persist a fact and bump the session's preview-cache version.
|
|
|
|
`source_ref` MUST be None for `user_note` and `ai_synthesis` sources;
|
|
for `question` and `diagnostic_check` it should point at the stable
|
|
UUID of the originating task-lane item. The DB has no FK constraint
|
|
on `source_ref` (the target lives inside JSONB) — integrity is enforced
|
|
here.
|
|
"""
|
|
if source_type not in ("question", "diagnostic_check", "user_note", "ai_synthesis"):
|
|
raise ValueError(f"Invalid source_type: {source_type}")
|
|
|
|
if source_type in ("user_note", "ai_synthesis") and source_ref is not None:
|
|
# `source_ref` is a back-pointer to a question/check; user notes
|
|
# and AI-synthesized facts have no source item to point at.
|
|
raise ValueError(
|
|
f"source_ref must be None for source_type={source_type}"
|
|
)
|
|
|
|
text = (text or "").strip()
|
|
if not text:
|
|
raise ValueError("Fact text cannot be empty")
|
|
|
|
fact = SessionFact(
|
|
session_id=session_id,
|
|
account_id=account_id,
|
|
text=text,
|
|
source_type=source_type,
|
|
source_ref=source_ref,
|
|
source_summary=(summary or "").strip() or None,
|
|
created_by=user_id,
|
|
)
|
|
self.db.add(fact)
|
|
|
|
# Invalidate any preview cached against the prior state_version.
|
|
# Single UPDATE so the bump is atomic relative to the fact insert
|
|
# within this transaction; concurrent writers serialize on the row.
|
|
await self.db.execute(
|
|
update(AISession)
|
|
.where(AISession.id == session_id)
|
|
.values(state_version=AISession.state_version + 1)
|
|
)
|
|
await self.db.flush()
|
|
return fact
|
|
|
|
async def soft_delete_fact(self, fact: SessionFact) -> None:
|
|
"""Mark a fact deleted and bump state_version."""
|
|
from datetime import datetime, timezone
|
|
|
|
fact.deleted_at = datetime.now(timezone.utc)
|
|
await self.db.execute(
|
|
update(AISession)
|
|
.where(AISession.id == fact.session_id)
|
|
.values(state_version=AISession.state_version + 1)
|
|
)
|
|
await self.db.flush()
|
|
|
|
async def update_fact(
|
|
self,
|
|
fact: SessionFact,
|
|
*,
|
|
text: str | None = None,
|
|
summary: str | None = None,
|
|
) -> SessionFact:
|
|
"""Update an editable fact and bump state_version.
|
|
|
|
Caller is responsible for the editability check — only `user_note`
|
|
and `ai_synthesis` facts may be edited at the card level. The
|
|
endpoint enforces this and returns 403 for the read-only types.
|
|
"""
|
|
if text is not None:
|
|
stripped = text.strip()
|
|
if not stripped:
|
|
raise ValueError("Fact text cannot be empty")
|
|
fact.text = stripped
|
|
if summary is not None:
|
|
fact.source_summary = summary.strip() or None
|
|
|
|
await self.db.execute(
|
|
update(AISession)
|
|
.where(AISession.id == fact.session_id)
|
|
.values(state_version=AISession.state_version + 1)
|
|
)
|
|
await self.db.flush()
|
|
return fact
|
|
|
|
# ── LLM-backed drafting ────────────────────────────────────────────────
|
|
|
|
async def synthesize_from_question(
|
|
self, *, question_text: str, raw_answer: str
|
|
) -> dict[str, str | None]:
|
|
"""Draft `{text, summary}` from a question + engineer's free-text answer.
|
|
|
|
Returns `{"text": None, "summary": None}` when the answer doesn't
|
|
contain a substantive fact — caller should not persist in that case.
|
|
"""
|
|
return await self._synthesize(
|
|
user_input=(
|
|
f"Question asked: {question_text.strip()}\n"
|
|
f"Engineer's answer: {raw_answer.strip()}"
|
|
),
|
|
)
|
|
|
|
async def synthesize_from_check(
|
|
self, *, check_label: str, check_output: str
|
|
) -> dict[str, str | None]:
|
|
"""Draft `{text, summary}` from a diagnostic check label + its output."""
|
|
return await self._synthesize(
|
|
user_input=(
|
|
f"Diagnostic check: {check_label.strip()}\n"
|
|
f"Output:\n{check_output.strip()}"
|
|
),
|
|
)
|
|
|
|
async def _synthesize(self, *, user_input: str) -> dict[str, str | None]:
|
|
"""Single Haiku call with the conservative synthesis prompt."""
|
|
model = settings.get_model_for_action("fact_synthesis")
|
|
provider = get_ai_provider(model=model)
|
|
|
|
# Cache the system prompt — it's identical across every synthesis call.
|
|
system_blocks: list[dict[str, Any]] = [
|
|
{
|
|
"type": "text",
|
|
"text": _SYNTHESIS_SYSTEM_PROMPT,
|
|
"cache_control": {"type": "ephemeral"},
|
|
# cacheable: identical across all fact-synthesis calls
|
|
},
|
|
]
|
|
|
|
try:
|
|
text, _in, _out = await provider.generate_json(
|
|
system_prompt=system_blocks,
|
|
messages=[{"role": "user", "content": user_input}],
|
|
max_tokens=200,
|
|
)
|
|
except Exception:
|
|
logger.exception("Fact synthesis LLM call failed")
|
|
return {"text": None, "summary": None}
|
|
|
|
return self._parse_synthesis_response(text)
|
|
|
|
@staticmethod
|
|
def _parse_synthesis_response(raw: str) -> dict[str, str | None]:
|
|
"""Tolerant parse: strip fences, accept null fields, ignore extras."""
|
|
cleaned = raw.strip()
|
|
if cleaned.startswith("```"):
|
|
cleaned = re.sub(r"^```(?:json)?\s*", "", cleaned)
|
|
cleaned = re.sub(r"\s*```$", "", cleaned)
|
|
|
|
try:
|
|
data = json.loads(cleaned)
|
|
except (json.JSONDecodeError, ValueError):
|
|
logger.warning("Fact synthesis returned non-JSON: %r", raw[:200])
|
|
return {"text": None, "summary": None}
|
|
|
|
if not isinstance(data, dict):
|
|
return {"text": None, "summary": None}
|
|
|
|
text = data.get("text")
|
|
summary = data.get("summary")
|
|
if text is not None and not isinstance(text, str):
|
|
text = None
|
|
if summary is not None and not isinstance(summary, str):
|
|
summary = None
|
|
|
|
# Treat empty strings the same as null — "no substantive fact".
|
|
if isinstance(text, str) and not text.strip():
|
|
text = None
|
|
if isinstance(summary, str) and not summary.strip():
|
|
summary = None
|
|
|
|
return {"text": text, "summary": summary}
|
|
|
|
|
|
async def list_facts_for_session(
|
|
db: AsyncSession, session_id: UUID
|
|
) -> list[SessionFact]:
|
|
"""List non-deleted facts for a session, oldest first.
|
|
|
|
RLS filters by tenant; the explicit account_id check is unnecessary here.
|
|
"""
|
|
result = await db.execute(
|
|
select(SessionFact)
|
|
.where(
|
|
SessionFact.session_id == session_id,
|
|
SessionFact.deleted_at.is_(None),
|
|
)
|
|
.order_by(SessionFact.created_at.asc())
|
|
)
|
|
return list(result.scalars().all())
|