feat(pilot): Phase 3 — Suggested fix tracking + Resolve preview with state_version cache

Adds the AI-proposed resolution path and the inline preview of the
markdown that will be posted to the customer ticket on Resolve. The
preview is keyed on (session_id, ai_sessions.state_version) so back-to-
back fetches against unchanged state hit an in-process cache instead
of paying for a Sonnet call.

Backend:
- preview_cache: in-process LRU keyed on (kind, session_id, state_version).
  No TTL — state_version is the source of truth. Soft-cap 5000 entries.
- unified_chat_service: [SUGGEST_FIX] parser (last-block-wins, JSON
  payload, confidence clamped 0-100), supersession persistence (sets
  superseded_at on prior active row), atomic state_version bump.
- ResolutionNoteGeneratorService: pulls session, facts, active fix, and
  redacted script_generations into a structured input bundle for Sonnet;
  produces the four-section markdown (Problem / What we confirmed /
  Root cause / Resolution). Sensitive script parameters redacted via
  ScriptTemplateEngine.redact_sensitive driven by the template's
  parameters_schema.
- /api/v1/ai-sessions/{id}/suggested-fixes/active — 200 with the active
  fix, or 404 when the session has no active fix.
- /api/v1/ai-sessions/{id}/suggested-fixes/{fix_id}/decision — records
  one_off / draft_template / build_template / dismissed; dismiss
  supersedes; bumps state_version. 409 on dismissing an already-
  superseded fix.
- /api/v1/ai-sessions/{id}/resolution-note/preview — generates or returns
  cached markdown; from_cache flag in payload signals cache hit.
- scripts.py POST /generate now bumps state_version on the linked
  ai_session_id when present (third source of preview-cache invalidation
  per Section 5.5).
- ASSISTANT_SYSTEM_PROMPT documents [SUGGEST_FIX] (when to/not to emit,
  format, supersession semantics).
- 12 tests covering the parser (well-formed, last-wins, malformed,
  confidence clamping), supersession + state_version invariant, all
  decision branches, preview cache hit-on-no-change + miss-after-write.
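
A minimal sketch of the preview_cache idea above, assuming an
OrderedDict-backed LRU. The class and method names are hypothetical; only
the key shape (kind, session_id, state_version), the absence of a TTL, and
the 5000-entry soft cap come from this commit.

```python
from collections import OrderedDict
from typing import Optional

# (kind, session_id, state_version); the alias name is an assumption.
CacheKey = tuple[str, str, int]


class PreviewCache:
    """In-process LRU with a soft entry cap and no TTL (hypothetical sketch)."""

    def __init__(self, max_entries: int = 5000) -> None:
        self._max_entries = max_entries
        self._entries: OrderedDict[CacheKey, str] = OrderedDict()

    def get(self, key: CacheKey) -> Optional[str]:
        value = self._entries.get(key)
        if value is not None:
            # Mark the entry as most recently used.
            self._entries.move_to_end(key)
        return value

    def put(self, key: CacheKey, value: str) -> None:
        self._entries[key] = value
        self._entries.move_to_end(key)
        # Evict least recently used entries once the cap is exceeded.
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)

    def __len__(self) -> int:
        return len(self._entries)
```

Because state_version is part of the key, a write that bumps it makes the
older entries unreachable; they are reclaimed by LRU eviction rather than
by expiry.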

Frontend:
- src/components/pilot/sections/SuggestedFix.tsx — amber-accented card
  with confidence badge; dismiss action wired to the decision endpoint.
- src/components/pilot/ResolutionNotePreview.tsx — popover with refresh,
  loading state, cached/fresh indicator, ticket-ref display.
- src/api/sessionSuggestedFixes.ts — typed client; getActive normalizes
  404 to null so callers don't have to special-case.
- TaskLane gains suggestedFixSlot + bottomSlot props (rendered after
  Diagnostic Checks; bottomSlot anchors the Resolve action).
- AssistantChatPage: refreshSessionDerived helper batches fact + fix
  refresh; fact mutations and chat sends both schedule a 500ms-debounced
  preview refresh per the Section 5.5 spec.

Verified end-to-end against the dev stack with a real Sonnet call:
- /active 404 → fact create → preview generates four-section markdown
  grounded only in provided facts → second preview call hits cache
  (from_cache=true, no LLM call) → second fact write → cache miss,
  preview regenerates.
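
The verification sequence can be reduced to a self-contained sketch.
FakeSession, add_fact, preview, and stub_generate are hypothetical
stand-ins (the real path goes through the API and a Sonnet call); the
sketch only illustrates why an unchanged state_version yields
from_cache=true and why a fact write forces regeneration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class FakeSession:
    """Stand-in for an ai_sessions row; only the fields the sketch needs."""
    id: str
    state_version: int = 0
    facts: list[str] = field(default_factory=list)


def add_fact(session: FakeSession, fact: str) -> None:
    # Every write that can change the preview bumps state_version.
    session.facts.append(fact)
    session.state_version += 1


def preview(
    session: FakeSession,
    cache: dict,
    generate: Callable[[FakeSession], str],
) -> dict:
    key = (session.id, session.state_version)
    if key in cache:
        return {"markdown": cache[key], "from_cache": True}
    markdown = generate(session)  # the expensive call in the real service
    cache[key] = markdown
    return {"markdown": markdown, "from_cache": False}


calls: list[int] = []


def stub_generate(session: FakeSession) -> str:
    # Records each invocation so the number of "LLM calls" can be counted.
    calls.append(session.state_version)
    return "## Problem\n" + "; ".join(session.facts)


session = FakeSession(id="s1")
cache: dict = {}
add_fact(session, "Printer offline since Monday")
first = preview(session, cache, stub_generate)   # generates
second = preview(session, cache, stub_generate)  # same version: cache hit
add_fact(session, "Spooler service stopped")
third = preview(session, cache, stub_generate)   # new version: regenerates
```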

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 21:45:52 -04:00
parent 625dba7548
commit 66e592096c
16 changed files with 1617 additions and 22 deletions

@@ -11,6 +11,9 @@ infrastructure and system prompt from assistant_chat_service.
Items in pending_task_lane carry stable UUIDs (assigned here) so PROMOTE
source_refs survive across turns even when the model re-emits the same
question/action.
- `[SUGGEST_FIX]` (Phase 3) — proposes a resolution path for the session.
Each new emission supersedes the previous active row (sets superseded_at),
so a session has at most one active fix at a time.
"""
import json
import logging
@@ -22,7 +25,13 @@ from uuid import UUID
from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession
from datetime import datetime, timezone
from sqlalchemy import update
from app.models.ai_session import AISession
from app.models.script_template import ScriptTemplate
from app.models.session_suggested_fix import SessionSuggestedFix
from app.services.assistant_chat_service import (
    ASSISTANT_SYSTEM_PROMPT,
    _call_ai,
@@ -287,6 +296,125 @@ def _assign_stable_task_lane_ids(
    return out_questions, out_actions


def _parse_suggest_fix_marker(
    ai_content: str,
) -> tuple[str, dict[str, Any] | None]:
    """Extract a single [SUGGEST_FIX]...[/SUGGEST_FIX] JSON block from AI response.

    The block contains:
        {"title": "...", "description": "...", "confidence": 0..100,
         "script_template_slug": "..." | null,
         "ai_drafted_script": "..." | null,
         "ai_drafted_parameters": {...} | null}

    Per FLOWPILOT-MIGRATION.md Section 8.2. Only the LAST block in the response
    is honored: if the model emits multiple, only its final view of the fix
    matters; earlier ones in the same turn are stale even before persistence.

    Returns (cleaned_content, fix_dict_or_None). Every [SUGGEST_FIX] block is
    stripped from the display content, including when the payload is rejected.
    """
    blocks = list(re.finditer(r"\[SUGGEST_FIX\]\s*([\s\S]*?)\s*\[/SUGGEST_FIX\]", ai_content))
    if not blocks:
        return ai_content, None

    # Strip every block from the display text once; all return paths below reuse it.
    cleaned = re.sub(r"\[SUGGEST_FIX\]\s*[\s\S]*?\s*\[/SUGGEST_FIX\]", "", ai_content).strip()

    # Take the last block: the most recent intent wins within a single turn.
    last = blocks[-1]
    raw = last.group(1).strip()
    # Tolerate a payload wrapped in a Markdown code fence.
    if raw.startswith("```"):
        raw = re.sub(r"^```(?:json)?\s*", "", raw)
        raw = re.sub(r"\s*```$", "", raw)

    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, ValueError) as e:
        logger.warning("Failed to parse [SUGGEST_FIX] block: %s", e)
        return cleaned, None

    if not isinstance(data, dict):
        return cleaned, None

    # str() guards against non-string values, which would have no .strip().
    title = str(data.get("title") or "").strip()
    description = str(data.get("description") or "").strip()
    confidence = data.get("confidence")
    # bool is a subclass of int, so reject it explicitly.
    if (
        not title
        or not description
        or isinstance(confidence, bool)
        or not isinstance(confidence, (int, float))
    ):
        logger.warning("[SUGGEST_FIX] missing required fields, dropping")
        return cleaned, None

    try:
        confidence_int = max(0, min(100, int(round(float(confidence)))))
    except (ValueError, OverflowError):
        # json.loads accepts NaN and Infinity, which cannot be rounded to an int.
        logger.warning("[SUGGEST_FIX] non-finite confidence, dropping")
        return cleaned, None

    ai_drafted_parameters = data.get("ai_drafted_parameters")
    parsed = {
        "title": title[:200],
        "description": description,
        "confidence_pct": confidence_int,
        "script_template_slug": (data.get("script_template_slug") or None),
        "ai_drafted_script": (data.get("ai_drafted_script") or None),
        "ai_drafted_parameters": ai_drafted_parameters if isinstance(ai_drafted_parameters, dict) else None,
    }
    return cleaned, parsed


async def _persist_suggested_fix(
    *,
    db: AsyncSession,
    session: AISession,
    fix: dict[str, Any],
) -> None:
    """Supersede the prior active fix and insert the new one. Bumps state_version.

    A session has at most one active suggested fix (`superseded_at IS NULL`).
    Emitting [SUGGEST_FIX] is the only way to introduce a new one; the
    engineer's user_decision is recorded via the decision endpoint.
    """
    now = datetime.now(timezone.utc)

    # Mark any prior active rows for this session as superseded.
    await db.execute(
        update(SessionSuggestedFix)
        .where(
            SessionSuggestedFix.session_id == session.id,
            SessionSuggestedFix.superseded_at.is_(None),
        )
        .values(superseded_at=now)
    )

    # Resolve script_template_slug → script_template_id if provided.
    script_template_id = None
    slug = fix.get("script_template_slug")
    if slug:
        result = await db.execute(
            select(ScriptTemplate).where(ScriptTemplate.slug == slug)
        )
        tpl = result.scalar_one_or_none()
        if tpl is not None:
            script_template_id = tpl.id
        else:
            # The adjacent string literals are concatenated, so the first one
            # needs its own trailing separator.
            logger.warning(
                "SUGGEST_FIX referenced unknown script_template_slug=%r; "
                "treating as no template match", slug,
            )

    new_fix = SessionSuggestedFix(
        session_id=session.id,
        account_id=session.account_id,
        title=fix["title"],
        description=fix["description"],
        confidence_pct=fix["confidence_pct"],
        script_template_id=script_template_id,
        ai_drafted_script=fix.get("ai_drafted_script"),
        ai_drafted_parameters=fix.get("ai_drafted_parameters"),
    )
    db.add(new_fix)

    # Bump the preview-cache version in the same transaction as the
    # supersession and insert, so the three changes land together.
    await db.execute(
        update(AISession)
        .where(AISession.id == session.id)
        .values(state_version=AISession.state_version + 1)
    )
    await db.flush()


async def _persist_promote_items(
*,
db: AsyncSession,
@@ -431,11 +559,13 @@ async def send_chat_message(
if session.status == "paused":
    session.status = "active"
# Check for fork, actions, questions, promote, and suggest_fix markers
# in branch response too
branch_display, branch_fork_data = _parse_fork_marker(ai_content)
branch_display, branch_actions_data = _parse_actions_marker(branch_display)
branch_display, branch_questions_data = _parse_questions_marker(branch_display)
branch_display, branch_promote_items = _parse_promote_marker(branch_display)
branch_display, branch_suggest_fix = _parse_suggest_fix_marker(branch_display)
if branch_display != ai_content:
    # Store stripped content in branch history
    msgs[-1] = {"role": "assistant", "content": branch_display}
@@ -493,6 +623,12 @@ async def send_chat_message(
db=db, session=session, user_id=user_id, items=branch_promote_items,
)
# Persist a [SUGGEST_FIX] if the branch turn included one.
if branch_suggest_fix:
    await _persist_suggested_fix(
        db=db, session=session, fix=branch_suggest_fix,
    )
suggested_flows = extract_suggested_flows(
    await rag_search(query=message, account_id=account_id, db=db, limit=8)
)
@@ -542,10 +678,14 @@ async def send_chat_message(
# Check for promote markers — facts the AI is surfacing to What we know.
display_content, promote_items = _parse_promote_marker(display_content)
# Check for a [SUGGEST_FIX] marker — supersedes the prior active fix.
display_content, suggest_fix_data = _parse_suggest_fix_marker(display_content)
logger.info(
    "Marker parsing results — actions: %s, questions: %s, fork: %s, "
    "promote: %d, suggest_fix: %s, raw_length: %d, display_length: %d",
    bool(actions_data), bool(questions_data), bool(fork_data),
    len(promote_items or []), bool(suggest_fix_data),
    len(ai_content), len(display_content),
)
@@ -630,6 +770,10 @@ async def send_chat_message(
db=db, session=session, user_id=user_id, items=promote_items,
)
# Persist a [SUGGEST_FIX] if this turn included one — supersedes prior fix.
if suggest_fix_data:
    await _persist_suggested_fix(db=db, session=session, fix=suggest_fix_data)
suggested_flows = extract_suggested_flows(rag_results)
return display_content, suggested_flows, session, fork_metadata, actions_data, questions_data
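
The last-block-wins and clamping behavior of the [SUGGEST_FIX] parser can be
exercised in isolation. The sketch below is a simplified standalone stand-in,
not the function from this diff: it reuses the same regular expression and
clamp expression but skips the code-fence stripping and required-field
validation. last_fix_payload and the sample reply are hypothetical.

```python
import json
import re
from typing import Any, Optional

SUGGEST_FIX_RE = re.compile(r"\[SUGGEST_FIX\]\s*([\s\S]*?)\s*\[/SUGGEST_FIX\]")


def last_fix_payload(ai_content: str) -> tuple[str, Optional[dict[str, Any]]]:
    """Return (display text without markers, payload of the last block or None)."""
    payloads = SUGGEST_FIX_RE.findall(ai_content)
    if not payloads:
        return ai_content, None
    # Every block is removed from the display text, not only the last one.
    cleaned = SUGGEST_FIX_RE.sub("", ai_content).strip()
    try:
        data = json.loads(payloads[-1])
    except json.JSONDecodeError:
        return cleaned, None
    if not isinstance(data, dict):
        return cleaned, None
    # Same clamp as the diff: round, then pin into the 0..100 range.
    # Defaulting a missing confidence to 0 is a simplification of this sketch.
    data["confidence_pct"] = max(0, min(100, int(round(float(data.get("confidence", 0))))))
    return cleaned, data


reply = (
    "Try restarting the spooler.\n"
    '[SUGGEST_FIX]{"title": "Restart spooler", "description": "d1", "confidence": 40}[/SUGGEST_FIX]\n'
    "On reflection, clear the queue first.\n"
    '[SUGGEST_FIX]{"title": "Clear print queue", "description": "d2", "confidence": 140}[/SUGGEST_FIX]'
)
cleaned, fix = last_fix_payload(reply)
```

Here the second block supplies the payload, its out-of-range confidence is
pinned to the upper bound, and neither marker remains in the display text.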