10 Commits

Author SHA1 Message Date
f3c3ee5b57 feat(pilot): unify AI troubleshooting surface at /pilot, redirect /assistant (Phase 1)
Collapses the pre-existing dual-surface setup (AssistantChatPage at /assistant,
FlowPilotSessionPage at /pilot) into a single chat-primary surface per
architectural claim #1 of FLOWPILOT-MIGRATION.md.

Router changes (frontend/src/router.tsx):
- /pilot and /pilot/:sessionId now render AssistantChatPage.
- /assistant redirects permanently to /pilot via <Navigate replace>.
- /assistant/:sessionId redirects to /pilot/:sessionId preserving the ID
  via an AssistantSessionRedirect helper that reads the param.
- FlowPilotSessionPage is no longer imported or mounted. Per the
  beta-history-disposable decision, the file stays on disk for reference
  but is unreachable; delete once nothing else in the tree imports it.

Dispatcher de-branching — previously these sites routed by session_type
(chat -> /assistant, otherwise -> /pilot). All now unconditionally go to
/pilot/:id since session_type is no longer used for frontend routing:
- components/dashboard/ActiveFlowPilotSessions.tsx
- components/dashboard/RecentFlowPilotSessions.tsx
- components/flowpilot/AISessionListItem.tsx
  (keeps isChat for icon selection, but linkTo is unconditional)

User-facing label + navigation updates:
- components/layout/CommandPalette.tsx: "AI Assistant" palette entry
  becomes "FlowPilot" pointing to /pilot; the sparkles quick-action also
  routes to /pilot.
- components/dashboard/StartSessionInput.tsx: both navigate() call sites
  now go to /pilot instead of /assistant.
- lib/routePrefetch.ts: prefetch entry for AssistantChatPage keyed to
  /pilot (the real surface) rather than /assistant (now redirect-only).

Preserved intentionally (not user-facing routes):
- Backend /assistant/retention API path and the assistantChatApi module
  name — those are internal API and module identifiers, not SPA routes.
- src/components/assistant/* and src/types/assistant-chat — TypeScript
  module paths, not routes.
- Sidebar.tsx — no top-level AI entry existed to rename; /pilot is
  already in the History group's matchPaths. Whether FlowPilot deserves
  its own rail entry is a future UX decision, not Phase 1 scope.
- FlowPilotAnalyticsPage at /analytics/flowpilot — analytics for the
  unified product, not guided-only, per the agreed Q16 interpretation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 18:48:00 +00:00
b49772f1a1 feat(models): Phase 1 SQLAlchemy models — SessionFact, SessionSuggestedFix, DraftTemplate, AccountSettings
Backs the schema added in 210d310 with SQLAlchemy 2.0 models.

- SessionFact: "What we know" facts with polymorphic source_ref pointing
  at task-lane item UUIDs inside ai_sessions.pending_task_lane (not a FK
  per Section 4.2).
- SessionSuggestedFix: AI-proposed resolutions with supersession tracking
  and the full user_decision state machine.
- DraftTemplate: post-resolve templatization queue with promotion to
  script_templates.
- AccountSettings: per-account JSONB preferences grab-bag with async
  classmethod helpers — get_setting(db, account_id, key, default) reads
  without creating, set_setting(db, account_id, key, value) upserts via
  Postgres ON CONFLICT + jsonb `||` merge so existing keys are preserved.
  Lazy row creation matches the Phase 1 design.

Column additions on existing models to mirror the migration:
- AISession: resolution_note_* / escalation_package_* / state_version
  (the preview-cache-invalidation counter consumed by Phase 3).
- ScriptTemplate: source_session_id / source_user_id / source_ticket_ref
  (provenance for templates promoted from DraftTemplate).

All four new models registered in app.models.__init__ and __all__.
TYPE_CHECKING-guarded relationship imports throughout, matching the
repo's existing model style.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 18:35:00 +00:00
210d310fb2 feat(db): Phase 1 schema — session_facts, suggested_fixes, draft_templates, account_settings
Adds the backing store for the FlowPilot unified session surface, per
the FLOWPILOT-MIGRATION.md Phase 1 deliverable. Descends from production
head 074 (add_network_diagrams_table).

New tables (all tenant-scoped, all RLS-enabled + forced):
- session_facts — "What we know" facts. source_ref is a polymorphic
  pointer to a task-lane item inside ai_sessions.pending_task_lane
  (no DB-level FK; integrity enforced at service layer per Section 4.2
  of the design doc). Soft-delete via deleted_at; active-facts partial
  index excludes deleted rows.
- session_suggested_fixes — AI-proposed resolutions. One active per
  session at a time (supersession tracked via superseded_at; partial
  index on (session_id) WHERE superseded_at IS NULL powers the
  "find active fix" query).
- draft_templates — scripts pending post-resolve templatization.
  Partial index on (account_id) WHERE status='pending' supports the
  "N scripts ready to review" Script Library badge.
- account_settings — new per-account table with JSONB preferences
  grab-bag. Rows created lazily on first write; get_setting returns
  default when no row exists.

Column additions on ai_sessions:
- resolution_note_markdown / resolution_note_posted_at /
  resolution_note_external_id
- escalation_package_markdown / escalation_package_posted_at /
  escalation_package_external_id
- state_version (INTEGER NOT NULL DEFAULT 0) — incremented atomically
  by any write that invalidates the resolution note preview cache
  per Section 5.5. Phase 3 consumes this.

Column additions on script_templates:
- source_session_id, source_user_id, source_ticket_ref — powers the
  "generated from CW #X · resolved by Y · used N times" provenance
  chip in the Script Library.

RLS pattern matches the repo convention (074 / network_diagrams is the
nearest template): ENABLE + FORCE, USING + WITH CHECK comparing
`account_id` against the `app.current_account_id` session setting
(nil UUID when the setting is unset or empty). Downgrade is reversible:
drops in the inverse order of creation so FK dependencies unwind.

No runtime verification from code-server; migration apply + downgrade
will be verified on the new dev environment per the standing deferral.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 18:14:26 +00:00
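The state_version counter described in the commit above can be illustrated with a small sketch. It assumes SQLite (from the standard library) as a stand-in for Postgres, and the `bump_state_version`, `get_preview`, and `preview_cache` names are hypothetical; the real service layer and the Phase 3 cache are not shown in this diff. The point is only that the increment happens inside one UPDATE statement and that the cache key includes the version.

```python
from __future__ import annotations

import sqlite3
from typing import Callable

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ai_sessions ("
    " id TEXT PRIMARY KEY,"
    " state_version INTEGER NOT NULL DEFAULT 0)"
)
conn.execute("INSERT INTO ai_sessions (id) VALUES ('s1')")


def current_version(conn: sqlite3.Connection, session_id: str) -> int:
    row = conn.execute(
        "SELECT state_version FROM ai_sessions WHERE id = ?", (session_id,)
    ).fetchone()
    return row[0]


def bump_state_version(conn: sqlite3.Connection, session_id: str) -> int:
    """Increment inside a single UPDATE so the database does the arithmetic."""
    conn.execute(
        "UPDATE ai_sessions SET state_version = state_version + 1 WHERE id = ?",
        (session_id,),
    )
    return current_version(conn, session_id)


# Hypothetical preview cache keyed by (session_id, state_version).
preview_cache: dict[tuple[str, int], str] = {}


def get_preview(
    conn: sqlite3.Connection, session_id: str, render: Callable[[int], str]
) -> str:
    key = (session_id, current_version(conn, session_id))
    if key not in preview_cache:
        preview_cache[key] = render(key[1])
    return preview_cache[key]


render_calls: list[int] = []


def render(version: int) -> str:
    render_calls.append(version)
    return f"preview@v{version}"


first = get_preview(conn, "s1", render)
again = get_preview(conn, "s1", render)      # served from the cache
new_version = bump_state_version(conn, "s1")  # e.g. a fact was edited
after = get_preview(conn, "s1", render)      # new key, so it re-renders
```

Because stale entries are never looked up again once the version moves, no explicit invalidation call is needed in this sketch; eviction of old keys is left out.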
92fadfb90a docs(flowpilot-migration): integrate Codex plan review + Phase 0 audit findings
Significant rewrite of FLOWPILOT-MIGRATION.md after post-Codex plan review
and the Phase 0 in-flight audit. Archives the pre-rewrite version as
FLOWPILOT-MIGRATION-v1.md and keeps the Codex review under
CODEX-FlowAssist-Migration-PLAN.md for traceability.

Substantive changes that affect implementation:

- Section 0.1 adds a spec-drift note listing corrections integrated into
  this revision (API namespace, task-lane item UUIDs, account_settings
  creation, missing /tickets/ai-parse endpoint).
- Section 2 adds "Task lane item ID" terminology — stable UUID assigned
  to items inside ai_sessions.pending_task_lane so session_facts.source_ref
  has something reliable to point to.
- Section 4.1 adds ai_sessions.state_version (INTEGER NOT NULL DEFAULT 0)
  and escalation_package_external_id. state_version drives preview cache
  invalidation; incremented atomically on writes to facts / suggested
  fixes / script_generations.
- Section 4.6 creates account_settings as a new table with JSONB
  preferences column, lazy row creation, and a promotion rule for when a
  setting should graduate to a typed column.
- Section 5 namespaces all session-scoped routes under
  /api/v1/ai-sessions/{id}/... to match the existing codebase pattern.
- Section 5.5 documents the preview caching strategy (state_version
  keyed, 500ms client debounce, Redis planned).
- Section 6.6 adds per-service MCP capability flags alongside the model
  tier flags.
- Section 7.1 makes the /assistant -> /pilot redirect include the
  session-deep-link path and preserve the session ID.
- Section 8.2 adds supersession semantics for [SUGGEST_FIX] markers.
- Section 9 Phase 1 now explicitly includes account_settings and
  state_version; Phase 3 uses state_version-keyed caching; Phase 5
  mentions MCP inheritance via chat_call_cached wrapper.
- Section 11 adds a dedicated test plan (migrations, backend, frontend,
  manual QA).
- Section 14 captures the eight planning decisions made during the
  Phase 0 conversation so they are traceable.

No code changes in this commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 17:05:04 +00:00
3f0a132058 refactor(ai): rename _call_anthropic_cached → chat_call_cached; extract cache plumbing (Phase 0.4)
Renames the chat caller to a name that signals its actual purpose, and
factors the reusable cached-system-block + cached-history + cache-usage-log
primitives out to app.core.ai_provider so they can be shared with the
provider-generic path without pulling MCP/beta/images into the abstract
interface.

Helpers added to ai_provider.py:
- `build_anthropic_chat_messages(history, new_message, images, format_reminder)`
  — owns: copy history, apply cache_control to last history message,
  append format reminder to new message, render images as multimodal blocks.
  Anthropic-shaped by design; do not call from Gemini paths.

chat_call_cached keeps exactly the concerns that are unique to the one
MCP/beta/multimodal chat caller:
- Anthropic beta endpoint invocation
- Microsoft Learn MCP server wiring (ENABLE_MCP_MICROSOFT_LEARN)
- Retry-without-MCP fallback
- Format-reminder content string (declared as module constant)
- Phase 0.5 telemetry (mcp.turn, mcp.fallback)

Documents in the module docstring AND at the function site that this is
the ONE MCP/beta chat caller and should not become the general provider
path. MCP/beta/images are features of exactly one optional Anthropic beta
endpoint; routing them through AnthropicProvider would leak a provider-
specific concern into the abstract interface that also serves Gemini.

Behavior change: chat_call_cached now reuses the singleton AnthropicProvider
HTTP client via `_get_anthropic_client(...)` instead of instantiating a new
`anthropic.AsyncAnthropic(...)` per call. Matches the provider's own pattern
and avoids burning connections per-turn. No user-visible difference.

No runtime verification from code-server. TODO(phase0-verify) in
ai_provider.py tracks the cache-hit verification owed on the new dev env.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 17:03:09 +00:00
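The history-caching shape that the commit above attributes to `build_anthropic_chat_messages` can be shown with a reduced, text-only sketch. `sketch_chat_messages` is a hypothetical name, images are left out, and the sample conversation and reminder string are invented for illustration.

```python
from __future__ import annotations

from typing import Any


def sketch_chat_messages(
    history: list[dict[str, Any]],
    new_message: str,
    format_reminder: str | None = None,
) -> list[dict[str, Any]]:
    """Text-only sketch: cache breakpoint on the last history message."""
    messages = [{"role": m["role"], "content": m["content"]} for m in history]
    if messages:
        last = messages[-1]
        # Rewrap the last history message as a block list so it can carry
        # cache_control; everything up to and including it becomes the
        # cacheable prefix.
        messages[-1] = {
            "role": last["role"],
            "content": [
                {
                    "type": "text",
                    "text": last["content"],
                    "cache_control": {"type": "ephemeral"},
                }
            ],
        }
    # The new user turn stays a plain string (it changes every turn).
    messages.append(
        {"role": "user", "content": new_message + (format_reminder or "")}
    )
    return messages


history = [
    {"role": "user", "content": "Outlook will not open."},
    {"role": "assistant", "content": "Does it show an error code?"},
]
payload = sketch_chat_messages(
    history, "Yes, it shows one.", format_reminder="\n\n(reminder text)"
)
```

Only the second history entry is rewrapped; the caller's `history` list is left untouched because each message is copied before modification.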
da93ae55c3 feat(ai): opt-in structured-system-block caching for one-shot generators (Phase 0.3)
Wraps each static system prompt in a single-block list so Phase 0.1's
AnthropicProvider applies cache_control: ephemeral automatically (policy α,
first block gets marked when no caller-authored cache_control is present).

Call sites:
- ai_tree_generator.scaffold_branches: SCAFFOLD_SYSTEM_PROMPT (~1k tokens)
- ai_tree_generator.generate_branch_detail: BRANCH_DETAIL_SYSTEM_PROMPT
  (~2.5k tokens with few-shot example); retries inside the same function
  re-read the cached block instead of paying full input cost on each attempt
- kb_conversion.convert_document: TROUBLESHOOTING or PROCEDURAL prompt
  (each caches independently by text content)
- ai_fix.generate_fixes: FIX_SYSTEM_PROMPT on first attempt + corrective retry
- script_builder.send_message: SYSTEM_PROMPT_TEMPLATE (per-session language
  substitution — same-language sessions share cache entries)

Each edit includes an inline comment explaining why the block is cacheable
(stable-constant, retry-reuse, per-language variant) so a future dev can
see the intent at the cache_control marker site.

script_builder history caching deliberately deferred — per Phase 0.1
decision (option i), AnthropicProvider does not automatically cache the
message list. If script_builder's growing 20-message history turns out
to be a visible cost driver via the anthropic.cache telemetry, route
that caller through the 0.4 chat wrapper which handles history caching.

No runtime verification from code-server; cache-hit behavior will be
confirmed against the new dev environment when it's up, per the inline
TODO(phase0-verify) in ai_provider.py.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 16:29:45 +00:00
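The single-block wrapping described in the commit above is small enough to sketch directly. The template text and the `cacheable_system` helper are hypothetical; only the block shape `{"type": "text", "text": ...}` comes from the diff. The sketch also shows why same-language sessions can share a cache entry: the rendered text is identical, while a different language yields a different prefix.

```python
from __future__ import annotations

from typing import Any

# Hypothetical stand-in for script_builder's SYSTEM_PROMPT_TEMPLATE.
SYSTEM_PROMPT_TEMPLATE = "You write {language} scripts for IT technicians."


def cacheable_system(prompt_text: str) -> list[dict[str, Any]]:
    """Wrap a static prompt in a one-block list.

    No cache_control is set here; per policy α the provider is expected
    to mark the first block itself.
    """
    return [{"type": "text", "text": prompt_text}]


powershell_a = cacheable_system(SYSTEM_PROMPT_TEMPLATE.format(language="PowerShell"))
powershell_b = cacheable_system(SYSTEM_PROMPT_TEMPLATE.format(language="PowerShell"))
bash = cacheable_system(SYSTEM_PROMPT_TEMPLATE.format(language="Bash"))
```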
56fd440b16 docs(flowpilot-migration): flag Phase 0.2 as pending-endpoint; target not yet built
The /tickets/ai-parse endpoint named in Phase 0.2 does not exist in the
codebase (verified: zero matches for ai-parse/ai_parse across endpoints,
services, models, and all branches/commit messages). integrations.py:557
is get_ticket_statuses — a CW passthrough with no AI call.

Adding a block-quoted note under the 0.2 deliverable that flags the
drift, records the cached-system-block pattern to apply when the endpoint
is built, and instructs the next editor to remove the note once applied.
No implementation change this commit — guidance only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 16:24:33 +00:00
b3be66652e feat(ai): structured-system-block caching in AnthropicProvider (Phase 0.1)
Widens AIProvider.generate_json / generate_text / generate_text_stream
signatures to accept `system_prompt: str | list[SystemBlock]`:

- `str` (the existing call shape): passes through uncached, unchanged
  behavior. Every existing caller stays on the uncached path — no silent
  behavior change.
- `list[SystemBlock]`: enables Anthropic prompt caching via structured
  system blocks. Caller-authored `cache_control` is honored verbatim
  (policy α); if no block carries it, the provider applies
  `cache_control: {"type": "ephemeral"}` to the first block only.

Gemini ignores cache_control and concatenates list entries into one
system string — the widened signature is strictly additive on that path.

Adds `anthropic.cache` structured-log telemetry: on every Anthropic
response (streaming included, via `stream.get_final_message()`), logs
`cache_read_input_tokens` and `cache_creation_input_tokens`. Telemetry
failure in streaming is swallowed so the user-facing stream never breaks.

Verification deferred: cannot run from code-server (no Python, no DB,
no dev env). TODO(phase0-verify) left inline in the module docstring.
First verification task on the new dev environment is to hit any
FlowPilot endpoint twice within 5 minutes and confirm the second call
shows cache_read_input_tokens > 0 in the `anthropic.cache` log event.
If verification fails, that's a debug task on the new env — not a
blocker for continuing Phase 0.2/0.3/0.4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 16:17:12 +00:00
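The policy α rule from the commit above can be exercised in isolation. The function below is a standalone restatement for illustration under the name `apply_policy_alpha`, which is hypothetical; the provider's own helper appears later in the diff. The three calls cover the plain-string path, the automatic first-block marking, and the caller-authored case.

```python
from __future__ import annotations

from typing import Any

SystemBlock = dict[str, Any]


def apply_policy_alpha(
    system_prompt: str | list[SystemBlock],
) -> str | list[SystemBlock]:
    """Standalone restatement of the policy α normalization step."""
    if isinstance(system_prompt, str):
        return system_prompt  # plain string: uncached path, unchanged
    if not system_prompt:
        return ""
    blocks = [dict(b) for b in system_prompt]  # copy; caller's list is not mutated
    if not any("cache_control" in b for b in blocks):
        blocks[0]["cache_control"] = {"type": "ephemeral"}
    return blocks


plain = apply_policy_alpha("You are a helper.")

auto = apply_policy_alpha(
    [{"type": "text", "text": "static"}, {"type": "text", "text": "dynamic"}]
)

authored = apply_policy_alpha(
    [
        {"type": "text", "text": "static"},
        {"type": "text", "text": "reference", "cache_control": {"type": "ephemeral"}},
    ]
)
```

In the `authored` case the first block stays unmarked, because any caller-supplied `cache_control` anywhere in the list switches the automatic marking off.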
0fbc1e0a57 feat(telemetry): add MCP per-turn structured-log telemetry (Phase 0.5)
Emits structured `mcp.turn` log events on every Anthropic-path chat turn,
capturing whether MCP was wired in (mcp_available), whether the model
actually invoked an MCP tool (mcp_invoked), which tool names fired,
and whether the silent retry-without-MCP fallback was triggered.
Adds a separate `mcp.fallback` event with error type/message for
fallback occurrences.

Establishes baseline data for deciding whether MCP investment is earning
its keep before Phase 2+ expands the product footprint. Scope: the one
MCP-using code path (`_call_anthropic_cached`) — not a general
instrumentation layer.

No new dependencies, no schema changes, no behavior change. Standard
library `logging` is the sink; PostHog is not wired on the backend.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 15:57:13 +00:00
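A structured `mcp.turn` event of the kind described in the commit above can be emitted with the standard library alone. This is a sketch under assumptions: `mcp_available` and `mcp_invoked` are named in the commit message, while `mcp_tool_names`, `mcp_fallback`, the logger name, and `example_tool` are hypothetical. The list handler exists only to make the record inspectable here.

```python
from __future__ import annotations

import logging

logger = logging.getLogger("mcp_telemetry_sketch")
logger.setLevel(logging.INFO)


class ListHandler(logging.Handler):
    """Collects records in memory so the sketch can inspect them."""

    def __init__(self) -> None:
        super().__init__()
        self.records: list[logging.LogRecord] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


def log_mcp_turn(
    mcp_available: bool, tool_names: list[str], fallback_triggered: bool
) -> None:
    # Keys passed via `extra` become attributes on the LogRecord.
    logger.info(
        "mcp.turn",
        extra={
            "event": "mcp.turn",
            "mcp_available": mcp_available,
            "mcp_invoked": bool(tool_names),
            "mcp_tool_names": list(tool_names),  # hypothetical field name
            "mcp_fallback": fallback_triggered,  # hypothetical field name
        },
    )


handler = ListHandler()
logger.addHandler(handler)

log_mcp_turn(mcp_available=True, tool_names=["example_tool"], fallback_triggered=False)
record = handler.records[0]
```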
46291f30b9 docs: add FlowPilot migration design doc and mockups
Brings the locked FlowPilot migration design onto the branch that will
implement it. Includes the annotated target UI mockups (primary session
view + three Script Generator integration states) and the superseded
FLOWPILOT-AND-RESOLUTIONASSIST.md for historical reference.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 15:22:39 +00:00
31 changed files with 6111 additions and 124 deletions

@@ -0,0 +1,404 @@
"""FlowPilot migration Phase 1 — schema for the unified session surface.

Revision ID: f07010f17b01
Revises: 074
Create Date: 2026-04-17

Creates the backing store for the FlowPilot unified session surface:

- `session_facts`: "What we know" facts, keyed to a session, with a polymorphic
  `source_ref` pointing at a task-lane item inside `ai_sessions.pending_task_lane`
  (no DB-level FK; integrity enforced at the service layer per the design doc).
- `session_suggested_fixes`: AI-proposed resolution paths. Only one active
  (`superseded_at IS NULL`) per session at a time.
- `draft_templates`: scripts pending post-resolve templatization
  (Option 2 in the three-option dialog).
- `account_settings`: new per-account key/value settings table with a JSONB
  `preferences` grab-bag. Rows are created lazily on first write.
- Column additions to `ai_sessions`: resolution/escalation markdown + external IDs,
  plus `state_version` (incremented by any write that invalidates the resolution
  note preview cache).
- Column additions to `script_templates`: provenance fields for templates
  promoted from draft_templates.

All four new tenant-scoped tables have RLS enabled + forced with a
`tenant_isolation` policy matching the repo pattern (USING + WITH CHECK on
`account_id = app.current_account_id`). Downgrade is reversible: drops in the
inverse order of creation.

Chained from `074` (add_network_diagrams_table) per the single-head state of
production; the other local heads on feat/flowpilot-migration are branch
artifacts not present in production.
"""
from alembic import op
import sqlalchemy as sa
from sqlalchemy.dialects.postgresql import UUID, JSONB

revision = "f07010f17b01"
down_revision = "074"
branch_labels = None
depends_on = None

_CURRENT_ACCOUNT = (
    "COALESCE("
    "NULLIF(current_setting('app.current_account_id', TRUE), ''), "
    "'00000000-0000-0000-0000-000000000000'"
    ")::uuid"
)


def upgrade() -> None:
    # ── ai_sessions: resolution / escalation columns + state_version ───────
    op.add_column(
        "ai_sessions",
        sa.Column("resolution_note_markdown", sa.Text(), nullable=True),
    )
    op.add_column(
        "ai_sessions",
        sa.Column("resolution_note_posted_at", sa.DateTime(timezone=True), nullable=True),
    )
    op.add_column(
        "ai_sessions",
        sa.Column("resolution_note_external_id", sa.String(128), nullable=True),
    )
    op.add_column(
        "ai_sessions",
        sa.Column("escalation_package_markdown", sa.Text(), nullable=True),
    )
    op.add_column(
        "ai_sessions",
        sa.Column("escalation_package_posted_at", sa.DateTime(timezone=True), nullable=True),
    )
    op.add_column(
        "ai_sessions",
        sa.Column("escalation_package_external_id", sa.String(128), nullable=True),
    )
    op.add_column(
        "ai_sessions",
        sa.Column(
            "state_version",
            sa.Integer(),
            nullable=False,
            server_default=sa.text("0"),
        ),
    )

    # ── script_templates: provenance for post-resolve promotion ────────────
    op.add_column(
        "script_templates",
        sa.Column(
            "source_session_id",
            UUID(as_uuid=True),
            sa.ForeignKey("ai_sessions.id"),
            nullable=True,
        ),
    )
    op.add_column(
        "script_templates",
        sa.Column(
            "source_user_id",
            UUID(as_uuid=True),
            sa.ForeignKey("users.id"),
            nullable=True,
        ),
    )
    op.add_column(
        "script_templates",
        sa.Column("source_ticket_ref", sa.String(64), nullable=True),
    )

    # ── session_facts ──────────────────────────────────────────────────────
    op.create_table(
        "session_facts",
        sa.Column(
            "id",
            UUID(as_uuid=True),
            primary_key=True,
            server_default=sa.text("gen_random_uuid()"),
        ),
        sa.Column(
            "session_id",
            UUID(as_uuid=True),
            sa.ForeignKey("ai_sessions.id", ondelete="CASCADE"),
            nullable=False,
        ),
        sa.Column(
            "account_id",
            UUID(as_uuid=True),
            sa.ForeignKey("accounts.id"),
            nullable=False,
        ),
        sa.Column("text", sa.Text(), nullable=False),
        sa.Column("source_type", sa.String(32), nullable=False),
        # `source_ref` is a polymorphic pointer to a task-lane item inside
        # ai_sessions.pending_task_lane JSON, NOT a FK to any table.
        # Integrity enforced at the service layer per Section 4.2 of the
        # migration design doc.
        sa.Column("source_ref", UUID(as_uuid=True), nullable=True),
        sa.Column("source_summary", sa.Text(), nullable=True),
        sa.Column(
            "created_by",
            UUID(as_uuid=True),
            sa.ForeignKey("users.id"),
            nullable=False,
        ),
        sa.Column(
            "created_at",
            sa.DateTime(timezone=True),
            nullable=False,
            server_default=sa.text("now()"),
        ),
        sa.Column(
            "updated_at",
            sa.DateTime(timezone=True),
            nullable=False,
            server_default=sa.text("now()"),
        ),
        sa.Column("deleted_at", sa.DateTime(timezone=True), nullable=True),
        sa.CheckConstraint(
            "source_type IN ('question', 'diagnostic_check', 'user_note', 'ai_synthesis')",
            name="ck_session_facts_source_type",
        ),
    )
    # Active-facts-per-session; partial index excludes soft-deleted rows.
    op.create_index(
        "idx_session_facts_session",
        "session_facts",
        ["session_id"],
        postgresql_where=sa.text("deleted_at IS NULL"),
    )
    op.create_index(
        "idx_session_facts_account",
        "session_facts",
        ["account_id"],
    )
    op.execute("ALTER TABLE session_facts ENABLE ROW LEVEL SECURITY")
    op.execute("ALTER TABLE session_facts FORCE ROW LEVEL SECURITY")
    op.execute(f"""
        CREATE POLICY tenant_isolation ON session_facts
        USING (account_id = {_CURRENT_ACCOUNT})
        WITH CHECK (account_id = {_CURRENT_ACCOUNT})
    """)

    # ── session_suggested_fixes ────────────────────────────────────────────
    op.create_table(
        "session_suggested_fixes",
        sa.Column(
            "id",
            UUID(as_uuid=True),
            primary_key=True,
            server_default=sa.text("gen_random_uuid()"),
        ),
        sa.Column(
            "session_id",
            UUID(as_uuid=True),
            sa.ForeignKey("ai_sessions.id", ondelete="CASCADE"),
            nullable=False,
        ),
        sa.Column(
            "account_id",
            UUID(as_uuid=True),
            sa.ForeignKey("accounts.id"),
            nullable=False,
        ),
        sa.Column("title", sa.String(200), nullable=False),
        sa.Column("description", sa.Text(), nullable=False),
        sa.Column("confidence_pct", sa.Integer(), nullable=False),
        sa.Column(
            "script_template_id",
            UUID(as_uuid=True),
            sa.ForeignKey("script_templates.id"),
            nullable=True,
        ),
        sa.Column("ai_drafted_script", sa.Text(), nullable=True),
        sa.Column("ai_drafted_parameters", JSONB(), nullable=True),
        sa.Column("user_decision", sa.String(32), nullable=True),
        sa.Column("superseded_at", sa.DateTime(timezone=True), nullable=True),
        sa.Column(
            "created_at",
            sa.DateTime(timezone=True),
            nullable=False,
            server_default=sa.text("now()"),
        ),
        sa.CheckConstraint(
            "confidence_pct BETWEEN 0 AND 100",
            name="ck_session_suggested_fixes_confidence_pct",
        ),
        sa.CheckConstraint(
            "user_decision IS NULL OR user_decision IN ("
            "'one_off', 'draft_template', 'build_template', 'dismissed')",
            name="ck_session_suggested_fixes_user_decision",
        ),
    )
    # Only-one-active-per-session is enforced by service-layer supersession;
    # this partial index serves the "find active fix" query.
    op.create_index(
        "idx_session_suggested_fixes_session_active",
        "session_suggested_fixes",
        ["session_id"],
        postgresql_where=sa.text("superseded_at IS NULL"),
    )
    op.execute("ALTER TABLE session_suggested_fixes ENABLE ROW LEVEL SECURITY")
    op.execute("ALTER TABLE session_suggested_fixes FORCE ROW LEVEL SECURITY")
    op.execute(f"""
        CREATE POLICY tenant_isolation ON session_suggested_fixes
        USING (account_id = {_CURRENT_ACCOUNT})
        WITH CHECK (account_id = {_CURRENT_ACCOUNT})
    """)

    # ── draft_templates ────────────────────────────────────────────────────
    op.create_table(
        "draft_templates",
        sa.Column(
            "id",
            UUID(as_uuid=True),
            primary_key=True,
            server_default=sa.text("gen_random_uuid()"),
        ),
        sa.Column(
            "account_id",
            UUID(as_uuid=True),
            sa.ForeignKey("accounts.id"),
            nullable=False,
        ),
        sa.Column(
            "source_session_id",
            UUID(as_uuid=True),
            sa.ForeignKey("ai_sessions.id"),
            nullable=False,
        ),
        sa.Column(
            "source_user_id",
            UUID(as_uuid=True),
            sa.ForeignKey("users.id"),
            nullable=False,
        ),
        sa.Column("script_body", sa.Text(), nullable=False),
        sa.Column("proposed_parameters", JSONB(), nullable=False),
        sa.Column("proposed_name", sa.String(200), nullable=True),
        sa.Column(
            "proposed_category_id",
            UUID(as_uuid=True),
            sa.ForeignKey("script_categories.id"),
            nullable=True,
        ),
        sa.Column(
            "status",
            sa.String(32),
            nullable=False,
            server_default=sa.text("'pending'"),
        ),
        sa.Column("resolved_at", sa.DateTime(timezone=True), nullable=True),
        sa.Column(
            "promoted_template_id",
            UUID(as_uuid=True),
            sa.ForeignKey("script_templates.id"),
            nullable=True,
        ),
        sa.Column(
            "created_at",
            sa.DateTime(timezone=True),
            nullable=False,
            server_default=sa.text("now()"),
        ),
        sa.CheckConstraint(
            "status IN ('pending', 'accepted', 'rejected')",
            name="ck_draft_templates_status",
        ),
    )
    # Supports the Script Library "N scripts ready to review" badge.
    op.create_index(
        "idx_draft_templates_account_pending",
        "draft_templates",
        ["account_id"],
        postgresql_where=sa.text("status = 'pending'"),
    )
    op.execute("ALTER TABLE draft_templates ENABLE ROW LEVEL SECURITY")
    op.execute("ALTER TABLE draft_templates FORCE ROW LEVEL SECURITY")
    op.execute(f"""
        CREATE POLICY tenant_isolation ON draft_templates
        USING (account_id = {_CURRENT_ACCOUNT})
        WITH CHECK (account_id = {_CURRENT_ACCOUNT})
    """)

    # ── account_settings ───────────────────────────────────────────────────
    # One row per account, created lazily on first write. The `preferences`
    # JSONB is a grab-bag for simple settings (e.g. templatize_prompt_enabled).
    # Settings graduate to typed columns via future migrations when they meet
    # the promotion criteria in Section 4.6 of the design doc (hot path /
    # validation / joins).
    op.create_table(
        "account_settings",
        sa.Column(
            "account_id",
            UUID(as_uuid=True),
            sa.ForeignKey("accounts.id", ondelete="CASCADE"),
            primary_key=True,
        ),
        sa.Column(
            "preferences",
            JSONB(),
            nullable=False,
            server_default=sa.text("'{}'::jsonb"),
        ),
        sa.Column(
            "created_at",
            sa.DateTime(timezone=True),
            nullable=False,
            server_default=sa.text("now()"),
        ),
        sa.Column(
            "updated_at",
            sa.DateTime(timezone=True),
            nullable=False,
            server_default=sa.text("now()"),
        ),
    )
    op.execute("ALTER TABLE account_settings ENABLE ROW LEVEL SECURITY")
    op.execute("ALTER TABLE account_settings FORCE ROW LEVEL SECURITY")
    op.execute(f"""
        CREATE POLICY tenant_isolation ON account_settings
        USING (account_id = {_CURRENT_ACCOUNT})
        WITH CHECK (account_id = {_CURRENT_ACCOUNT})
    """)


def downgrade() -> None:
    # Drop in reverse order so FK dependencies unwind cleanly.
    op.execute("DROP POLICY IF EXISTS tenant_isolation ON account_settings")
    op.execute("ALTER TABLE account_settings DISABLE ROW LEVEL SECURITY")
    op.drop_table("account_settings")

    op.execute("DROP POLICY IF EXISTS tenant_isolation ON draft_templates")
    op.execute("ALTER TABLE draft_templates DISABLE ROW LEVEL SECURITY")
    op.drop_index("idx_draft_templates_account_pending", table_name="draft_templates")
    op.drop_table("draft_templates")

    op.execute("DROP POLICY IF EXISTS tenant_isolation ON session_suggested_fixes")
    op.execute("ALTER TABLE session_suggested_fixes DISABLE ROW LEVEL SECURITY")
    op.drop_index(
        "idx_session_suggested_fixes_session_active",
        table_name="session_suggested_fixes",
    )
    op.drop_table("session_suggested_fixes")

    op.execute("DROP POLICY IF EXISTS tenant_isolation ON session_facts")
    op.execute("ALTER TABLE session_facts DISABLE ROW LEVEL SECURITY")
    op.drop_index("idx_session_facts_account", table_name="session_facts")
    op.drop_index("idx_session_facts_session", table_name="session_facts")
    op.drop_table("session_facts")

    op.drop_column("script_templates", "source_ticket_ref")
    op.drop_column("script_templates", "source_user_id")
    op.drop_column("script_templates", "source_session_id")

    op.drop_column("ai_sessions", "state_version")
    op.drop_column("ai_sessions", "escalation_package_external_id")
    op.drop_column("ai_sessions", "escalation_package_posted_at")
    op.drop_column("ai_sessions", "escalation_package_markdown")
    op.drop_column("ai_sessions", "resolution_note_external_id")
    op.drop_column("ai_sessions", "resolution_note_posted_at")
    op.drop_column("ai_sessions", "resolution_note_markdown")
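
The migration leaves the "only one active fix per session" rule to the service layer and provides a partial index for the lookup. A minimal sketch of that supersession step follows, assuming SQLite from the standard library as a stand-in for Postgres and a cut-down table; `propose_fix`, `active_fix`, and the sample titles are hypothetical and not taken from the repository.

```python
from __future__ import annotations

import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE session_suggested_fixes (
        id INTEGER PRIMARY KEY,
        session_id TEXT NOT NULL,
        title TEXT NOT NULL,
        superseded_at TEXT
    );
    -- Partial index: only rows that are still active are indexed.
    CREATE INDEX idx_fixes_session_active
        ON session_suggested_fixes (session_id)
        WHERE superseded_at IS NULL;
    """
)


def propose_fix(conn: sqlite3.Connection, session_id: str, title: str) -> int:
    """Supersede any active fix, then insert the new one, in one transaction."""
    with conn:  # commits on success, rolls back on exception
        conn.execute(
            "UPDATE session_suggested_fixes"
            " SET superseded_at = datetime('now')"
            " WHERE session_id = ? AND superseded_at IS NULL",
            (session_id,),
        )
        cur = conn.execute(
            "INSERT INTO session_suggested_fixes (session_id, title) VALUES (?, ?)",
            (session_id, title),
        )
    return cur.lastrowid


def active_fix(conn: sqlite3.Connection, session_id: str) -> str | None:
    row = conn.execute(
        "SELECT title FROM session_suggested_fixes"
        " WHERE session_id = ? AND superseded_at IS NULL",
        (session_id,),
    ).fetchone()
    return row[0] if row else None


propose_fix(conn, "s1", "Restart the print spooler")
propose_fix(conn, "s1", "Reinstall the printer driver")

current = active_fix(conn, "s1")
active_count = conn.execute(
    "SELECT COUNT(*) FROM session_suggested_fixes WHERE superseded_at IS NULL"
).fetchone()[0]
total_count = conn.execute("SELECT COUNT(*) FROM session_suggested_fixes").fetchone()[0]
```

Both rows are retained, so the history of superseded proposals stays queryable while the active lookup touches a single row.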

@@ -199,7 +199,10 @@ async def generate_fixes(
         try:
             text, in_tok, out_tok = await provider.generate_json(
-                system_prompt=FIX_SYSTEM_PROMPT,
+                system_prompt=[
+                    {"type": "text", "text": FIX_SYSTEM_PROMPT},
+                    # cacheable: stable constant across all fix attempts
+                ],
                 messages=messages,
                 max_tokens=2048,
             )
@@ -232,7 +235,11 @@ async def generate_fixes(
         try:
             text2, in_tok2, out_tok2 = await provider.generate_json(
-                system_prompt=FIX_SYSTEM_PROMPT,
+                system_prompt=[
+                    {"type": "text", "text": FIX_SYSTEM_PROMPT},
+                    # cacheable: stable constant; retry reads the cached
+                    # system block from the first attempt above
+                ],
                 messages=messages,
                 max_tokens=2048,
             )

@@ -3,16 +3,169 @@ AI Provider abstraction layer.
Supports Gemini (google-genai) and Anthropic (anthropic) as interchangeable
backends for JSON generation used by the AI Flow Builder.

## Prompt caching (Anthropic only)

Callers may pass `system_prompt` as either:

- `str`: backward-compatible, uncached.
- `list[SystemBlock]`: Anthropic structured system blocks. Each block is a
  dict of shape `{"type": "text", "text": str, "cache_control": {...}?}`.

Caching policy (policy α, per Phase 0.1 design):

- If any block in the list carries an explicit `cache_control` key, that
  caller-authored configuration is honored verbatim.
- If no block carries `cache_control`, the provider applies
  `cache_control: {"type": "ephemeral"}` to the first block only. First block
  is the common "large static prefix" case (e.g. system prompt, reference data).

Gemini ignores cache_control and concatenates list blocks into one system
string, so callers should not rely on Gemini for cache-hit behavior.

TODO(phase0-verify): When a dev environment is available, verify cache-hit
behavior by hitting any FlowPilot endpoint twice within the 5-minute
ephemeral TTL. First call should emit `anthropic.cache` with
`cache_creation_input_tokens > 0`; second call with `cache_read_input_tokens > 0`.
If the second call returns zero reads, inspect the prefix for silent
invalidators (timestamps, unsorted JSON keys, varying tool list ordering).
"""

import logging
from abc import ABC, abstractmethod
from collections.abc import AsyncIterator
from typing import Any

from app.core.config import settings

logger = logging.getLogger(__name__)

# Anthropic structured system block. See module docstring for caching policy.
SystemBlock = dict[str, Any]


def _normalize_system_for_anthropic(
    system_prompt: str | list[SystemBlock],
) -> str | list[SystemBlock]:
    """Return the value to pass as the `system=` parameter to the Anthropic API.

    - Plain strings pass through untouched (uncached path).
    - Lists are returned as structured system blocks. If no block in the list
      carries an explicit `cache_control`, `cache_control: {"type": "ephemeral"}`
      is applied to the FIRST block only (policy α).
    - Caller-authored `cache_control` is never overwritten.
    """
    if isinstance(system_prompt, str):
        return system_prompt
    if not system_prompt:
        # Empty list is not a meaningful system prompt; pass empty string so
        # Anthropic treats this as "no system prompt" rather than erroring.
        return ""
    blocks = [dict(b) for b in system_prompt]
    already_cached = any("cache_control" in b for b in blocks)
    if not already_cached:
        blocks[0]["cache_control"] = {"type": "ephemeral"}
    return blocks


def _flatten_system_for_gemini(
    system_prompt: str | list[SystemBlock],
) -> str:
    """Gemini has no structured system blocks; concatenate list entries."""
    if isinstance(system_prompt, str):
        return system_prompt
    return "\n\n".join(b.get("text", "") for b in system_prompt)


def build_anthropic_chat_messages(
    history: list[dict[str, Any]],
    new_message: str,
    images: list[dict[str, Any]] | None = None,
    format_reminder: str | None = None,
) -> list[dict[str, Any]]:
    """Construct the Anthropic `messages` payload for a cached multi-turn chat.

    Responsibilities:

    - Copy the valid history messages in order.
    - Apply `cache_control: ephemeral` to the LAST history message so the entire
      conversation prefix is cached across turns. The new user message stays
      uncached (it changes each turn).
    - Append `format_reminder` to the new user message if provided. The reminder
      is invisible to storage (caller's concern) but helps enforce structured
      output compliance at generation time.
    - If `images` are provided, render the new user message as a multimodal
      content block list (images first, then text). Otherwise, render it as
      a plain string.

    This helper is Anthropic-specific: the cache-breakpoint pattern, ephemeral
    cache_control, and multimodal block shape are all Anthropic conventions.
    Do not call it from Gemini code paths.
    """
    messages: list[dict[str, Any]] = []
    for msg in history:
        messages.append({"role": msg["role"], "content": msg["content"]})

    # Cache breakpoint on the last existing history message so the entire
    # conversation prefix is cached across turns. Safe only when there IS a
    # history message; otherwise the new message is the only message.
    if messages:
        last = messages[-1]
        messages[-1] = {
            "role": last["role"],
            "content": [
                {
                    "type": "text",
                    "text": last["content"],
                    "cache_control": {"type": "ephemeral"},
                }
            ],
        }

    effective_text = new_message + (format_reminder or "")
    if images:
        content_blocks: list[dict[str, Any]] = []
        for img in images:
            content_blocks.append(
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": img["media_type"],
                        "data": img["data"],
                    },
                }
            )
        content_blocks.append({"type": "text", "text": effective_text})
        messages.append({"role": "user", "content": content_blocks})
    else:
        messages.append({"role": "user", "content": effective_text})
    return messages

def _log_anthropic_cache_usage(usage: Any, model: str) -> None:
"""Emit a structured log line capturing cache_read / cache_creation tokens."""
cache_read = getattr(usage, "cache_read_input_tokens", 0) or 0
cache_creation = getattr(usage, "cache_creation_input_tokens", 0) or 0
input_tokens = getattr(usage, "input_tokens", 0) or 0
output_tokens = getattr(usage, "output_tokens", 0) or 0
if cache_read or cache_creation:
logger.info(
"anthropic.cache",
extra={
"event": "anthropic.cache",
"model": model,
"cache_read_input_tokens": cache_read,
"cache_creation_input_tokens": cache_creation,
"input_tokens": input_tokens,
"output_tokens": output_tokens,
},
)
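
Reviewer note: the caching policy documented on `_normalize_system_for_anthropic` (policy α) can be exercised without the Anthropic SDK. The sketch below is a standalone restatement of that helper under a different name (`normalize_system`, hypothetical), and the sample prompt strings are made up. It only illustrates the three cases the docstring lists; it is not the module itself.

```python
from __future__ import annotations

from typing import Any

SystemBlock = dict[str, Any]


def normalize_system(
    system_prompt: str | list[SystemBlock],
) -> str | list[SystemBlock]:
    """Illustrative copy of policy α: mark the first block only if nothing is marked."""
    if isinstance(system_prompt, str):
        return system_prompt  # plain strings pass through (uncached path)
    if not system_prompt:
        return ""  # empty list is treated as "no system prompt"
    blocks = [dict(b) for b in system_prompt]  # shallow copies; caller's dicts stay untouched
    if not any("cache_control" in b for b in blocks):
        blocks[0]["cache_control"] = {"type": "ephemeral"}
    return blocks


# Hypothetical input: a static base prompt followed by a per-query block.
base = [
    {"type": "text", "text": "static prompt"},
    {"type": "text", "text": "per-query context"},
]
marked = normalize_system(base)
```

Because the blocks are copied before being modified, a module-level constant list can be passed on every call without accumulating `cache_control` keys.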
class AIProvider(ABC):
"""Abstract base class for AI providers."""
@@ -20,14 +173,16 @@ class AIProvider(ABC):
@abstractmethod
async def generate_json(
self,
-system_prompt: str,
-messages: list[dict[str, str]],
+system_prompt: str | list[SystemBlock],
+messages: list[dict[str, Any]],
max_tokens: int = 4096,
) -> tuple[str, int, int]:
"""Generate a JSON response from the AI model.
Args:
-system_prompt: System-level instruction for the model.
+system_prompt: System-level instruction. Plain `str` is uncached
+(Anthropic) or used as-is (Gemini). `list[SystemBlock]` enables
+Anthropic prompt caching per module-docstring policy.
messages: List of message dicts with "role" and "content" keys.
max_tokens: Maximum output tokens.
@@ -39,37 +194,25 @@ class AIProvider(ABC):
@abstractmethod
async def generate_text(
self,
-system_prompt: str,
-messages: list[dict[str, str]],
+system_prompt: str | list[SystemBlock],
+messages: list[dict[str, Any]],
max_tokens: int = 4096,
) -> tuple[str, int, int]:
"""Generate a text response from the AI model (no JSON constraint).
-Args:
-system_prompt: System-level instruction for the model.
-messages: List of message dicts with "role" and "content" keys.
-max_tokens: Maximum output tokens.
-Returns:
-Tuple of (response_text, input_tokens, output_tokens).
+See `generate_json` for argument semantics.
"""
...
async def generate_text_stream(
self,
-system_prompt: str,
-messages: list[dict[str, str]],
+system_prompt: str | list[SystemBlock],
+messages: list[dict[str, Any]],
max_tokens: int = 4096,
) -> "AsyncIterator[str]":
"""Stream a text response token by token.
-Args:
-system_prompt: System-level instruction for the model.
-messages: List of message dicts with "role" and "content" keys.
-max_tokens: Maximum output tokens.
-Yields:
-Text chunks as they are generated.
+See `generate_json` for argument semantics.
"""
raise NotImplementedError("Streaming not supported for this provider")
# Make this an async generator to satisfy type checker
@@ -85,14 +228,15 @@ class GeminiProvider(AIProvider):
async def generate_json(
self,
-system_prompt: str,
-messages: list[dict[str, str]],
+system_prompt: str | list[SystemBlock],
+messages: list[dict[str, Any]],
max_tokens: int = 4096,
) -> tuple[str, int, int]:
from google import genai
from google.genai import types as genai_types
client = genai.Client(api_key=self._api_key)
+system_text = _flatten_system_for_gemini(system_prompt)
# Convert messages to Gemini Content format
contents: list[genai_types.Content] = []
@@ -106,7 +250,7 @@ class GeminiProvider(AIProvider):
)
config = genai_types.GenerateContentConfig(
-system_instruction=system_prompt,
+system_instruction=system_text,
max_output_tokens=max_tokens,
response_mime_type="application/json",
)
@@ -137,14 +281,15 @@ class GeminiProvider(AIProvider):
async def generate_text(
self,
system_prompt: str,
messages: list[dict[str, str]],
system_prompt: str | list[SystemBlock],
messages: list[dict[str, Any]],
max_tokens: int = 4096,
) -> tuple[str, int, int]:
from google import genai
from google.genai import types as genai_types
client = genai.Client(api_key=self._api_key)
+system_text = _flatten_system_for_gemini(system_prompt)
contents: list[genai_types.Content] = []
for msg in messages:
@@ -157,7 +302,7 @@ class GeminiProvider(AIProvider):
)
config = genai_types.GenerateContentConfig(
-system_instruction=system_prompt,
+system_instruction=system_text,
max_output_tokens=max_tokens,
# No response_mime_type — allow free-form text
)
@@ -214,16 +359,17 @@ class AnthropicProvider(AIProvider):
async def generate_json(
self,
-system_prompt: str,
-messages: list[dict[str, str]],
+system_prompt: str | list[SystemBlock],
+messages: list[dict[str, Any]],
max_tokens: int = 4096,
) -> tuple[str, int, int]:
client = _get_anthropic_client(self._api_key, self._timeout)
+normalized_system = _normalize_system_for_anthropic(system_prompt)
response = await client.messages.create(
model=self._model,
max_tokens=max_tokens,
-system=system_prompt,
+system=normalized_system,
messages=messages,
)
@@ -231,12 +377,14 @@ class AnthropicProvider(AIProvider):
input_tokens = response.usage.input_tokens
output_tokens = response.usage.output_tokens
+_log_anthropic_cache_usage(response.usage, self._model)
return text, input_tokens, output_tokens
async def generate_text(
self,
-system_prompt: str,
-messages: list[dict[str, str]],
+system_prompt: str | list[SystemBlock],
+messages: list[dict[str, Any]],
max_tokens: int = 4096,
) -> tuple[str, int, int]:
# Anthropic doesn't differentiate between JSON and text mode
@@ -244,20 +392,28 @@ class AnthropicProvider(AIProvider):
async def generate_text_stream(
self,
-system_prompt: str,
-messages: list[dict[str, str]],
+system_prompt: str | list[SystemBlock],
+messages: list[dict[str, Any]],
max_tokens: int = 4096,
) -> AsyncIterator[str]:
client = _get_anthropic_client(self._api_key, self._timeout)
+normalized_system = _normalize_system_for_anthropic(system_prompt)
async with client.messages.stream(
model=self._model,
max_tokens=max_tokens,
-system=system_prompt,
+system=normalized_system,
messages=messages,
) as stream:
async for text in stream.text_stream:
yield text
+# Per Anthropic SDK, get_final_message() resolves the stream's
+# final usage object (including cache_read/cache_creation tokens).
+try:
+final = await stream.get_final_message()
+_log_anthropic_cache_usage(final.usage, self._model)
+except Exception as exc: # best-effort telemetry, never fail the stream
+logger.debug("anthropic.cache streaming usage unavailable: %s", exc)
def get_ai_provider(model: str | None = None) -> AIProvider:
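
Reviewer note: the history cache breakpoint that `build_anthropic_chat_messages` applies is easiest to see on a concrete payload. The following is a text-only reduction of that helper (no images, no format reminder) under a hypothetical name, with an invented two-message history. It is a sketch of the message shape, not a replacement for the function in this diff.

```python
from __future__ import annotations

from typing import Any


def with_history_breakpoint(
    history: list[dict[str, Any]], new_text: str
) -> list[dict[str, Any]]:
    """Copy history, put the cache breakpoint on its last message, append the new turn."""
    messages = [{"role": m["role"], "content": m["content"]} for m in history]
    if messages:
        last = messages[-1]
        # The last history message becomes a one-element block list so it can
        # carry cache_control; earlier messages keep their plain-string content.
        messages[-1] = {
            "role": last["role"],
            "content": [
                {
                    "type": "text",
                    "text": last["content"],
                    "cache_control": {"type": "ephemeral"},
                }
            ],
        }
    # The new user message is left unmarked because it differs on every turn.
    messages.append({"role": "user", "content": new_text})
    return messages


history = [
    {"role": "user", "content": "Printer is offline"},
    {"role": "assistant", "content": "Is the spooler service running?"},
]
payload = with_history_breakpoint(history, "Yes, it is running")
```

With an empty history the function returns a single unmarked user message, which matches the "safe only when there IS a history message" comment in the helper.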


@@ -146,7 +146,10 @@ async def scaffold_branches(
user_message += f"Environment: {', '.join(tags)}\n"
raw_text, input_tokens, output_tokens = await provider.generate_json(
-system_prompt=SCAFFOLD_SYSTEM_PROMPT,
+system_prompt=[
+{"type": "text", "text": SCAFFOLD_SYSTEM_PROMPT},
+# cacheable: stable constant across all scaffold calls
+],
messages=[{"role": "user", "content": user_message}],
max_tokens=2048,
)
@@ -207,7 +210,13 @@ async def generate_branch_detail(
for attempt in range(3):
raw_text, input_tokens, output_tokens = await provider.generate_json(
-system_prompt=BRANCH_DETAIL_SYSTEM_PROMPT,
+system_prompt=[
+{"type": "text", "text": BRANCH_DETAIL_SYSTEM_PROMPT},
+# cacheable: stable constant. Retries in this loop re-read the
+# cached system block rather than paying full input cost each
+# attempt — the ~2.5k-token prompt with few-shot example is
+# the dominant cost here.
+],
messages=messages,
max_tokens=8192,
)


@@ -425,7 +425,12 @@ async def convert_document(
try:
raw_text, input_tokens, output_tokens = await provider.generate_json(
-system_prompt=system_prompt,
+system_prompt=[
+{"type": "text", "text": system_prompt},
+# cacheable: one of two stable constants (TROUBLESHOOTING_SYSTEM_PROMPT
+# or PROCEDURAL_SYSTEM_PROMPT) selected by target_type. Each
+# variant caches independently by text content.
+],
messages=[{"role": "user", "content": user_message}],
max_tokens=16384,
)


@@ -58,6 +58,10 @@ from .template_tree import TemplateTree
from .platform_step import PlatformStep
from .device_type import DeviceType
from .network_diagram import NetworkDiagram
+from .session_fact import SessionFact
+from .session_suggested_fix import SessionSuggestedFix
+from .draft_template import DraftTemplate
+from .account_settings import AccountSettings
__all__ = [
"User",
@@ -130,4 +134,8 @@ __all__ = [
"PlatformStep",
"DeviceType",
"NetworkDiagram",
+"SessionFact",
+"SessionSuggestedFix",
+"DraftTemplate",
+"AccountSettings",
]


@@ -0,0 +1,99 @@
"""Per-account settings with a JSONB preferences grab-bag.

Rows are created lazily on first write. Reads of a missing row return the
caller-supplied default — no upfront row creation per account.

Settings live in `preferences` until they meet the promotion criteria in
Section 4.6 of FLOWPILOT-MIGRATION.md (hot path / validation / joins), at
which point a future migration adds a typed column and the helpers prefer it.
"""

from __future__ import annotations

import uuid
from datetime import datetime, timezone
from typing import Any, TYPE_CHECKING

from sqlalchemy import DateTime, ForeignKey, text
from sqlalchemy.orm import Mapped, mapped_column, relationship
from sqlalchemy.dialects.postgresql import UUID, JSONB, insert as pg_insert
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy.sql import select

from app.core.database import Base

if TYPE_CHECKING:
    from app.models.account import Account


class AccountSettings(Base):
    """One row per account. Created lazily on first `set_setting` call."""

    __tablename__ = "account_settings"

    account_id: Mapped[uuid.UUID] = mapped_column(
        UUID(as_uuid=True),
        ForeignKey("accounts.id", ondelete="CASCADE"),
        primary_key=True,
    )
    preferences: Mapped[dict[str, Any]] = mapped_column(
        JSONB, nullable=False, default=dict, server_default=text("'{}'::jsonb")
    )
    created_at: Mapped[datetime] = mapped_column(
        DateTime(timezone=True), default=lambda: datetime.now(timezone.utc)
    )
    updated_at: Mapped[datetime] = mapped_column(
        DateTime(timezone=True),
        default=lambda: datetime.now(timezone.utc),
        onupdate=lambda: datetime.now(timezone.utc),
    )

    account: Mapped["Account"] = relationship("Account", foreign_keys=[account_id])

    @classmethod
    async def get_setting(
        cls,
        db: AsyncSession,
        account_id: uuid.UUID,
        key: str,
        default: Any = None,
    ) -> Any:
        """Return preferences[key] for the account, or `default` if no row/no key.

        Never creates a row — this is the pure-read path.
        """
        result = await db.execute(
            select(cls.preferences).where(cls.account_id == account_id)
        )
        prefs = result.scalar_one_or_none()
        if prefs is None:
            return default
        return prefs.get(key, default)

    @classmethod
    async def set_setting(
        cls,
        db: AsyncSession,
        account_id: uuid.UUID,
        key: str,
        value: Any,
    ) -> None:
        """Upsert preferences[key] = value for the account.

        Creates the row on first write; on subsequent writes, merges the key
        into the existing preferences JSON without clobbering other keys.
        Uses PostgreSQL's `||` jsonb merge operator via ON CONFLICT DO UPDATE.
        """
        stmt = pg_insert(cls).values(
            account_id=account_id,
            preferences={key: value},
        )
        stmt = stmt.on_conflict_do_update(
            index_elements=[cls.account_id],
            set_={
                # Merge the new {key: value} into the existing preferences.
                # The `||` operator on jsonb overwrites matching keys and keeps
                # all other keys intact.
                "preferences": cls.preferences.op("||")(stmt.excluded.preferences),
                "updated_at": text("now()"),
            },
        )
        await db.execute(stmt)
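
Reviewer note: the `||` merge used by `set_setting` works on top-level keys only. When both sides are jsonb objects, a matching key is replaced wholesale, including when its value is itself an object. The sketch below mimics that behaviour with plain Python dicts so the consequence for nested preferences is visible. The function name and the preference keys are hypothetical, and no database is involved.

```python
from typing import Any


def jsonb_concat(existing: dict[str, Any], incoming: dict[str, Any]) -> dict[str, Any]:
    """Mimic `existing || incoming` for two jsonb objects (top-level merge only)."""
    return {**existing, **incoming}


# Hypothetical stored preferences for one account.
stored = {"theme": "dark", "notifications": {"email": True, "sms": False}}

# Writing a new top-level key leaves the other keys intact.
after_simple = jsonb_concat(stored, {"locale": "en-GB"})

# Writing an existing key replaces its whole value; the nested "sms" flag is lost.
after_nested = jsonb_concat(stored, {"notifications": {"email": False}})
```

A caller that needs to change a single nested field would therefore have to read the nested object, modify it, and write the whole object back under its key.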


@@ -214,6 +214,38 @@ class AISession(Base):
comment="Current task lane state: {questions: [...], actions: [...]}",
)
+# ── Resolution / Escalation artifacts (Phase 1 — FlowPilot migration) ──
+# Markdown of the posted note + PSA external ID for round-trip traceability.
+resolution_note_markdown: Mapped[Optional[str]] = mapped_column(
+Text, nullable=True,
+comment="Final Resolve note markdown, as posted to the PSA",
+)
+resolution_note_posted_at: Mapped[Optional[datetime]] = mapped_column(
+DateTime(timezone=True), nullable=True,
+)
+resolution_note_external_id: Mapped[Optional[str]] = mapped_column(
+String(128), nullable=True,
+comment="PSA (e.g. CW) ticket-note ID returned at post time",
+)
+escalation_package_markdown: Mapped[Optional[str]] = mapped_column(
+Text, nullable=True,
+comment="Final Escalate handoff package markdown, as posted to the PSA",
+)
+escalation_package_posted_at: Mapped[Optional[datetime]] = mapped_column(
+DateTime(timezone=True), nullable=True,
+)
+escalation_package_external_id: Mapped[Optional[str]] = mapped_column(
+String(128), nullable=True,
+comment="PSA ticket-note ID for the escalation package",
+)
+# Incremented atomically by any write that invalidates the resolution
+# note preview cache (facts, suggested fixes, script generations).
+# See FLOWPILOT-MIGRATION.md Section 5.5.
+state_version: Mapped[int] = mapped_column(
+Integer, nullable=False, default=0, server_default=sa.text("0"),
+comment="Monotonic preview-cache version; bumped on state-changing writes",
+)
# ── Branching ──
is_branching: Mapped[bool] = mapped_column(
default=False,
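
Reviewer note: the `state_version` comment says the counter is "incremented atomically". One common way to get that property is to let the database compute `state_version + 1` in a single `UPDATE`, rather than reading the value into Python and writing it back. The sketch below shows the idea with the standard-library `sqlite3` module and a two-column stand-in table. The helper name is hypothetical, and the project's real write path (PostgreSQL through SQLAlchemy) is not shown in this diff.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced stand-in for the ai_sessions table: only the columns the sketch needs.
conn.execute(
    "CREATE TABLE ai_sessions ("
    "id TEXT PRIMARY KEY, state_version INTEGER NOT NULL DEFAULT 0)"
)
conn.execute("INSERT INTO ai_sessions (id) VALUES ('s1')")


def bump_state_version(db: sqlite3.Connection, session_id: str) -> int:
    """Increment the counter inside the UPDATE itself and return the stored value."""
    db.execute(
        "UPDATE ai_sessions SET state_version = state_version + 1 WHERE id = ?",
        (session_id,),
    )
    row = db.execute(
        "SELECT state_version FROM ai_sessions WHERE id = ?", (session_id,)
    ).fetchone()
    return row[0]


first = bump_state_version(conn, "s1")
second = bump_state_version(conn, "s1")
```

On PostgreSQL the follow-up `SELECT` can be folded into the statement with `UPDATE ... RETURNING state_version`.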


@@ -0,0 +1,91 @@
"""Draft template model — scripts generated during a session, pending templatization.

Created when an engineer picks "Run now, templatize after resolve" in the
three-option dialog. Post-resolve, the TemplatizePrompt component reads pending
drafts and lets the engineer accept (promotes to `script_templates`) or reject.
"""

import uuid
from datetime import datetime, timezone
from typing import Any, TYPE_CHECKING

from sqlalchemy import (
    Text, DateTime, ForeignKey, String, CheckConstraint,
)
from sqlalchemy.orm import Mapped, mapped_column, relationship
from sqlalchemy.dialects.postgresql import UUID, JSONB

from app.core.database import Base

if TYPE_CHECKING:
    from app.models.account import Account
    from app.models.ai_session import AISession
    from app.models.user import User
    from app.models.script_template import ScriptCategory, ScriptTemplate


class DraftTemplate(Base):
    """A session-generated script pending conversion to a reusable template."""

    __tablename__ = "draft_templates"
    __table_args__ = (
        CheckConstraint(
            "status IN ('pending', 'accepted', 'rejected')",
            name="ck_draft_templates_status",
        ),
    )

    id: Mapped[uuid.UUID] = mapped_column(
        UUID(as_uuid=True), primary_key=True, default=uuid.uuid4
    )
    account_id: Mapped[uuid.UUID] = mapped_column(
        UUID(as_uuid=True),
        ForeignKey("accounts.id"),
        nullable=False,
    )
    source_session_id: Mapped[uuid.UUID] = mapped_column(
        UUID(as_uuid=True),
        ForeignKey("ai_sessions.id"),
        nullable=False,
    )
    source_user_id: Mapped[uuid.UUID] = mapped_column(
        UUID(as_uuid=True),
        ForeignKey("users.id"),
        nullable=False,
    )
    script_body: Mapped[str] = mapped_column(Text, nullable=False)
    proposed_parameters: Mapped[dict[str, Any]] = mapped_column(
        JSONB, nullable=False
    )
    proposed_name: Mapped[str | None] = mapped_column(String(200), nullable=True)
    proposed_category_id: Mapped[uuid.UUID | None] = mapped_column(
        UUID(as_uuid=True),
        ForeignKey("script_categories.id"),
        nullable=True,
    )
    status: Mapped[str] = mapped_column(
        String(32), nullable=False, default="pending"
    )
    resolved_at: Mapped[datetime | None] = mapped_column(
        DateTime(timezone=True), nullable=True
    )
    # Set when status transitions to 'accepted' and the draft is promoted
    # to a real script_templates row.
    promoted_template_id: Mapped[uuid.UUID | None] = mapped_column(
        UUID(as_uuid=True),
        ForeignKey("script_templates.id"),
        nullable=True,
    )
    created_at: Mapped[datetime] = mapped_column(
        DateTime(timezone=True), default=lambda: datetime.now(timezone.utc)
    )

    account: Mapped["Account"] = relationship("Account", foreign_keys=[account_id])
    source_session: Mapped["AISession"] = relationship(
        "AISession", foreign_keys=[source_session_id]
    )
    source_user: Mapped["User"] = relationship("User", foreign_keys=[source_user_id])
    proposed_category: Mapped["ScriptCategory | None"] = relationship(
        "ScriptCategory", foreign_keys=[proposed_category_id]
    )
    promoted_template: Mapped["ScriptTemplate | None"] = relationship(
        "ScriptTemplate", foreign_keys=[promoted_template_id]
    )


@@ -78,6 +78,20 @@ class ScriptTemplate(Base):
is_gallery_featured: Mapped[bool] = mapped_column(Boolean, nullable=False, default=False, server_default=text("false"), index=True)
gallery_sort_order: Mapped[int] = mapped_column(Integer, nullable=False, default=0, server_default=text("0"))
usage_count: Mapped[int] = mapped_column(Integer, nullable=False, default=0, server_default=text("0"))
+# ── Provenance (Phase 1 — FlowPilot migration) ──
+# Populated when a template is promoted from a post-resolve draft_templates row.
+# Powers the Script Library provenance chip:
+# "generated from CW #X · resolved by Y · used N times"
+source_session_id: Mapped[Optional[uuid.UUID]] = mapped_column(
+UUID(as_uuid=True), ForeignKey("ai_sessions.id"), nullable=True,
+)
+source_user_id: Mapped[Optional[uuid.UUID]] = mapped_column(
+UUID(as_uuid=True), ForeignKey("users.id"), nullable=True,
+)
+source_ticket_ref: Mapped[Optional[str]] = mapped_column(
+String(64), nullable=True,
+comment="Human-readable PSA ticket ref for display, e.g. 'CW #48307'",
+)
created_at: Mapped[datetime] = mapped_column(
DateTime(timezone=True), default=lambda: datetime.now(timezone.utc)
)


@@ -0,0 +1,79 @@
"""Session fact model — the "What we know" backing store for a FlowPilot session.

A fact is an atomic, engineer-readable statement of what has been confirmed
during troubleshooting. Facts accumulate across the session and drive the
resolution note preview.

`source_ref` is a polymorphic pointer to a task-lane item inside
`ai_sessions.pending_task_lane` JSON — it is NOT a FK. Integrity is enforced
at the service layer per the FLOWPILOT-MIGRATION design doc Section 4.2.
Phase 2 assigns stable UUIDs to those task-lane items so `source_ref` has
something reliable to point to.
"""

import uuid
from datetime import datetime, timezone
from typing import TYPE_CHECKING

from sqlalchemy import Text, DateTime, ForeignKey, String, CheckConstraint
from sqlalchemy.orm import Mapped, mapped_column, relationship
from sqlalchemy.dialects.postgresql import UUID

from app.core.database import Base

if TYPE_CHECKING:
    from app.models.ai_session import AISession
    from app.models.account import Account
    from app.models.user import User


class SessionFact(Base):
    """A single fact in the What-we-know section of a session's task lane."""

    __tablename__ = "session_facts"
    __table_args__ = (
        CheckConstraint(
            "source_type IN ('question', 'diagnostic_check', 'user_note', 'ai_synthesis')",
            name="ck_session_facts_source_type",
        ),
    )

    id: Mapped[uuid.UUID] = mapped_column(
        UUID(as_uuid=True), primary_key=True, default=uuid.uuid4
    )
    session_id: Mapped[uuid.UUID] = mapped_column(
        UUID(as_uuid=True),
        ForeignKey("ai_sessions.id", ondelete="CASCADE"),
        nullable=False,
    )
    account_id: Mapped[uuid.UUID] = mapped_column(
        UUID(as_uuid=True),
        ForeignKey("accounts.id"),
        nullable=False,
    )
    text: Mapped[str] = mapped_column(Text, nullable=False)
    source_type: Mapped[str] = mapped_column(String(32), nullable=False)
    # Pointer to a task-lane item UUID inside ai_sessions.pending_task_lane.
    # NOT a FK. Null for `user_note` and `ai_synthesis` sources.
    source_ref: Mapped[uuid.UUID | None] = mapped_column(
        UUID(as_uuid=True), nullable=True
    )
    source_summary: Mapped[str | None] = mapped_column(Text, nullable=True)
    created_by: Mapped[uuid.UUID] = mapped_column(
        UUID(as_uuid=True),
        ForeignKey("users.id"),
        nullable=False,
    )
    created_at: Mapped[datetime] = mapped_column(
        DateTime(timezone=True), default=lambda: datetime.now(timezone.utc)
    )
    updated_at: Mapped[datetime] = mapped_column(
        DateTime(timezone=True),
        default=lambda: datetime.now(timezone.utc),
        onupdate=lambda: datetime.now(timezone.utc),
    )
    deleted_at: Mapped[datetime | None] = mapped_column(
        DateTime(timezone=True), nullable=True
    )

    session: Mapped["AISession"] = relationship("AISession", foreign_keys=[session_id])
    account: Mapped["Account"] = relationship("Account", foreign_keys=[account_id])
    creator: Mapped["User"] = relationship("User", foreign_keys=[created_by])
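
Reviewer note: since `source_ref` is not a foreign key, the service layer has to confirm that it points at a real task-lane item before a fact is stored. The sketch below is one possible shape for that check, not the project's implementation. It assumes the task lane is the `{questions: [...], actions: [...]}` JSON described on `AISession`, and that each item carries its UUID as a string under an `id` key. Both the `id` key and the function name are assumptions.

```python
from __future__ import annotations

import uuid
from typing import Any


def source_ref_exists(task_lane: dict[str, Any], source_ref: uuid.UUID | None) -> bool:
    """Return True when source_ref is None or matches an item id in the task lane."""
    if source_ref is None:
        # user_note and ai_synthesis facts carry no pointer, so there is nothing to check.
        return True
    items = task_lane.get("questions", []) + task_lane.get("actions", [])
    return any(item.get("id") == str(source_ref) for item in items)


# Hypothetical task lane with one question and one action.
question_id = uuid.uuid4()
lane = {
    "questions": [{"id": str(question_id), "text": "Is the spooler running?"}],
    "actions": [{"id": str(uuid.uuid4()), "text": "Restart the spooler"}],
}
known = source_ref_exists(lane, question_id)
unknown = source_ref_exists(lane, uuid.uuid4())
```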


@@ -0,0 +1,80 @@
"""Session suggested-fix model — AI-proposed resolution path for a session.

A session can have multiple suggested fixes over its lifetime as the AI's
understanding evolves. Only one is active at a time (superseded_at IS NULL);
emitting a new [SUGGEST_FIX] marker supersedes the prior active one.
"""

import uuid
from datetime import datetime, timezone
from typing import Any, TYPE_CHECKING

from sqlalchemy import (
    Text, DateTime, ForeignKey, String, Integer, CheckConstraint,
)
from sqlalchemy.orm import Mapped, mapped_column, relationship
from sqlalchemy.dialects.postgresql import UUID, JSONB

from app.core.database import Base

if TYPE_CHECKING:
    from app.models.ai_session import AISession
    from app.models.account import Account
    from app.models.script_template import ScriptTemplate


class SessionSuggestedFix(Base):
    """One AI-proposed fix for a FlowPilot session."""

    __tablename__ = "session_suggested_fixes"
    __table_args__ = (
        CheckConstraint(
            "confidence_pct BETWEEN 0 AND 100",
            name="ck_session_suggested_fixes_confidence_pct",
        ),
        CheckConstraint(
            "user_decision IS NULL OR user_decision IN ("
            "'one_off', 'draft_template', 'build_template', 'dismissed')",
            name="ck_session_suggested_fixes_user_decision",
        ),
    )

    id: Mapped[uuid.UUID] = mapped_column(
        UUID(as_uuid=True), primary_key=True, default=uuid.uuid4
    )
    session_id: Mapped[uuid.UUID] = mapped_column(
        UUID(as_uuid=True),
        ForeignKey("ai_sessions.id", ondelete="CASCADE"),
        nullable=False,
    )
    account_id: Mapped[uuid.UUID] = mapped_column(
        UUID(as_uuid=True),
        ForeignKey("accounts.id"),
        nullable=False,
    )
    title: Mapped[str] = mapped_column(String(200), nullable=False)
    description: Mapped[str] = mapped_column(Text, nullable=False)
    confidence_pct: Mapped[int] = mapped_column(Integer, nullable=False)
    script_template_id: Mapped[uuid.UUID | None] = mapped_column(
        UUID(as_uuid=True),
        ForeignKey("script_templates.id"),
        nullable=True,
    )
    # Populated only when there's no matching template and the AI has
    # drafted a session-specific script.
    ai_drafted_script: Mapped[str | None] = mapped_column(Text, nullable=True)
    ai_drafted_parameters: Mapped[dict[str, Any] | None] = mapped_column(
        JSONB, nullable=True
    )
    user_decision: Mapped[str | None] = mapped_column(String(32), nullable=True)
    # Set when a newer suggested fix supersedes this one.
    superseded_at: Mapped[datetime | None] = mapped_column(
        DateTime(timezone=True), nullable=True
    )
    created_at: Mapped[datetime] = mapped_column(
        DateTime(timezone=True), default=lambda: datetime.now(timezone.utc)
    )

    session: Mapped["AISession"] = relationship("AISession", foreign_keys=[session_id])
    account: Mapped["Account"] = relationship("Account", foreign_keys=[account_id])
    script_template: Mapped["ScriptTemplate | None"] = relationship(
        "ScriptTemplate", foreign_keys=[script_template_id]
    )
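
Reviewer note: the "only one active fix at a time" rule in the module docstring is a service-layer invariant, and the model itself carries no constraint that enforces it. The sketch below shows the supersede step on an in-memory list of stand-in objects. The `Fix` dataclass, the `emit_fix` helper, and the sample titles are hypothetical, and the real code would operate on `SessionSuggestedFix` rows instead.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Fix:
    """Stand-in for a suggested-fix row: active while superseded_at is None."""

    title: str
    superseded_at: datetime | None = None


def emit_fix(fixes: list[Fix], title: str) -> Fix:
    """Stamp any currently active fix as superseded, then append the new active fix."""
    now = datetime.now(timezone.utc)
    for fix in fixes:
        if fix.superseded_at is None:
            fix.superseded_at = now
    new_fix = Fix(title=title)
    fixes.append(new_fix)
    return new_fix


fixes: list[Fix] = []
emit_fix(fixes, "Restart print spooler")
emit_fix(fixes, "Reinstall printer driver")
active = [f for f in fixes if f.superseded_at is None]
```

Against a database, both steps would need to run in one transaction so that a reader never observes zero or two active fixes.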


@@ -10,10 +10,32 @@ Uses Anthropic prompt caching to reduce cost on multi-turn conversations:
Optionally connects to Microsoft Learn via Anthropic's MCP connector
for real-time documentation lookups (controlled by ENABLE_MCP_MICROSOFT_LEARN).
+## Architectural note — this module is the one MCP/beta chat caller
+`chat_call_cached` below is the ONLY caller in the codebase that uses
+Anthropic's `client.beta.messages.create` endpoint, MCP servers, multimodal
+user messages, and the retry-without-MCP fallback. It is deliberately NOT
+routed through `AnthropicProvider` — MCP/beta/images are features of exactly
+one optional Anthropic beta endpoint and do not belong in a provider-agnostic
+abstraction that also serves Gemini.
+If a new caller needs the same (MCP, beta, images, history caching) bundle,
+call `chat_call_cached` directly rather than pushing those concerns into
+`AnthropicProvider`. Cached-system-block plumbing is shared with the provider
+via `_normalize_system_for_anthropic` / `build_anthropic_chat_messages` /
+`_log_anthropic_cache_usage` in `app.core.ai_provider` — cache primitives are
+reusable, but the MCP/beta orchestration stays here.
"""
import logging
from typing import Any
+from app.core.ai_provider import (
+_get_anthropic_client,
+_log_anthropic_cache_usage,
+_normalize_system_for_anthropic,
+build_anthropic_chat_messages,
+)
from app.core.config import settings
logger = logging.getLogger(__name__)
@@ -184,7 +206,7 @@ async def _call_ai(
to include alongside the new_message as vision content.
"""
if settings.AI_PROVIDER == "anthropic" and settings.ANTHROPIC_API_KEY:
-return await _call_anthropic_cached(
+return await chat_call_cached(
system_base, rag_context, history, new_message, max_tokens,
images=images,
)
@@ -202,7 +224,18 @@ async def _call_ai(
)
-async def _call_anthropic_cached(
+# Appended to every chat turn's user message immediately before generation.
+# Invisible to storage (unified_chat_service strips markers before persisting),
+# but critical for structured output compliance — the model emits invalid
+# responses often enough without it that removing this reminder regresses UX.
+_CHAT_FORMAT_REMINDER = (
+"\n\n[SYSTEM: Remember — your response MUST end with [QUESTIONS] "
+"and/or [ACTIONS] markers containing valid JSON arrays. "
+"Responses without markers break the UI.]"
+)
+async def chat_call_cached(
system_base: str,
rag_context: str,
history: list[dict[str, Any]],
@@ -210,79 +243,56 @@ async def _call_anthropic_cached(
max_tokens: int,
images: list[dict[str, Any]] | None = None,
) -> tuple[str, int, int]:
-"""Call Anthropic with prompt caching on system prompt and history.
+"""Call Anthropic's chat surface with caching, MCP, images, and retry-without-MCP.
-Uses structured system blocks so the static base prompt is cached
-independently from the per-query RAG context. Optionally connects
-to Microsoft Learn via MCP for real-time documentation lookups.
+This is the ONE MCP/beta/multimodal chat caller. It is deliberately NOT
+routed through `AnthropicProvider`. See module docstring for rationale.
+Responsibilities unique to this function (not in the provider):
+- Anthropic beta endpoint (`client.beta.messages.create`)
+- Microsoft Learn MCP connector wiring (optional via ENABLE_MCP_MICROSOFT_LEARN)
+- Retry-without-MCP fallback when the MCP server misbehaves
+- Multimodal image blocks in the user message
+- Format-reminder append for structured-output compliance
+- Telemetry (`mcp.turn`, `mcp.fallback`) for Phase 0.5 MCP usage signal
+Cache plumbing is shared with the provider via helpers in `ai_provider`:
+`_normalize_system_for_anthropic` (policy α — ephemeral on first block if
+none specified), `build_anthropic_chat_messages` (history cache breakpoint +
+multimodal user message + format reminder), `_log_anthropic_cache_usage`.
"""
import anthropic
-client = anthropic.AsyncAnthropic(
-api_key=settings.ANTHROPIC_API_KEY,
+client = _get_anthropic_client(
+settings.ANTHROPIC_API_KEY,
timeout=settings.AI_REQUEST_TIMEOUT_SECONDS,
)
-# System prompt as structured blocks:
-# Block 1: static base prompt (cached)
-# Block 2: RAG context (changes per query, not cached)
+# System prompt as structured blocks. The static base is cacheable; the
+# RAG context changes per query and must NOT be cached — so we mark the
+# base explicitly and leave the RAG block unmarked. `_normalize_system`
+# honors caller-authored cache_control verbatim (policy α).
system_blocks: list[dict[str, Any]] = [
{
"type": "text",
"text": system_base,
"cache_control": {"type": "ephemeral"},
+# cacheable: static system prompt, stable across all turns of all sessions
},
]
if rag_context:
-system_blocks.append({"type": "text", "text": rag_context})
+system_blocks.append(
+{"type": "text", "text": rag_context}
+# uncached: RAG retrieval varies per query
+)
+normalized_system = _normalize_system_for_anthropic(system_blocks)
# Build messages with cache breakpoint on conversation history
-messages: list[dict[str, Any]] = []
-for msg in history:
-messages.append({"role": msg["role"], "content": msg["content"]})
-# Place cache breakpoint on the last history message so the entire
-# conversation prefix is cached across turns
-if messages:
-last = messages[-1]
-messages[-1] = {
-"role": last["role"],
-"content": [
-{
-"type": "text",
-"text": last["content"],
-"cache_control": {"type": "ephemeral"},
-}
-],
-}
-# Add the new user message (uncached — it's new each turn)
-# Append a format reminder to the user message so the model sees it
-# immediately before generating. This is invisible to the user (stripped
-# before storage) but critical for structured output compliance.
-format_reminder = (
-"\n\n[SYSTEM: Remember — your response MUST end with [QUESTIONS] "
-"and/or [ACTIONS] markers containing valid JSON arrays. "
-"Responses without markers break the UI.]"
+messages = build_anthropic_chat_messages(
+history=history,
+new_message=new_message,
+images=images,
+format_reminder=_CHAT_FORMAT_REMINDER,
)
-reminded_message = new_message + format_reminder
-# If images are attached, build multimodal content blocks
-if images:
-content_blocks: list[dict[str, Any]] = []
-for img in images:
-content_blocks.append({
-"type": "image",
-"source": {
-"type": "base64",
-"media_type": img["media_type"],
-"data": img["data"],
-},
-})
-content_blocks.append({"type": "text", "text": reminded_message})
-messages.append({"role": "user", "content": content_blocks})
-else:
-messages.append({"role": "user", "content": reminded_message})
# MCP server config (optional — controlled by settings)
mcp_servers = anthropic.NOT_GIVEN
@@ -304,12 +314,13 @@ async def _call_anthropic_cached(
]
_mcp_active = mcp_servers is not anthropic.NOT_GIVEN
+_mcp_fallback_triggered = False
try:
response = await client.beta.messages.create(
model=settings.AI_MODEL_ANTHROPIC,
max_tokens=max_tokens,
-system=system_blocks,
+system=normalized_system,
messages=messages,
mcp_servers=mcp_servers,
tools=tools,
@@ -326,14 +337,24 @@ async def _call_anthropic_cached(
or isinstance(e, (anthropic.BadRequestError, anthropic.APIStatusError))
)
if _is_mcp_error:
+_mcp_fallback_triggered = True
logger.warning(
"MCP server error (%s), retrying without MCP: %s",
type(e).__name__, e,
)
+# Phase 0.5 telemetry: per-turn fallback event.
+logger.info(
+"mcp.fallback",
+extra={
+"event": "mcp.fallback",
+"mcp_error_type": type(e).__name__,
+"mcp_error_message": str(e)[:500],
+},
+)
response = await client.messages.create(
model=settings.AI_MODEL_ANTHROPIC,
max_tokens=max_tokens,
-system=system_blocks,
+system=normalized_system,
messages=messages,
)
else:
@@ -355,18 +376,27 @@ async def _call_anthropic_cached(
input_tokens = usage.input_tokens
output_tokens = usage.output_tokens
-# Log MCP tool usage
+# Phase 0.5 telemetry: per-turn MCP event. Emitted for every turn that
+# reached this code path (i.e., AI_PROVIDER=anthropic chat). `mcp_available`
+# reflects whether MCP was actually wired into the request (scope (ii) from
+# the Phase 0.5 design — Anthropic code path AND flag on). `mcp_invoked`
+# reflects whether the model chose to call an MCP tool on this turn.
+logger.info(
+"mcp.turn",
+extra={
+"event": "mcp.turn",
+"mcp_available": _mcp_active,
+"mcp_invoked": bool(mcp_tools_used),
+"mcp_tools": mcp_tools_used,
+"mcp_fallback_triggered": _mcp_fallback_triggered,
+},
+)
+# Human-readable log retained for grep-based inspection.
if mcp_tools_used:
logger.info("MCP tools used: %s", ", ".join(mcp_tools_used))
# Log cache performance
cache_read = getattr(usage, "cache_read_input_tokens", 0) or 0
cache_creation = getattr(usage, "cache_creation_input_tokens", 0) or 0
if cache_read or cache_creation:
logger.info(
"Anthropic cache: read=%d creation=%d input=%d output=%d",
cache_read, cache_creation, input_tokens, output_tokens,
)
_log_anthropic_cache_usage(usage, settings.AI_MODEL_ANTHROPIC)
return text, input_tokens, output_tokens


@@ -220,7 +220,15 @@ async def send_message(
model = settings.get_model_for_action("script_build")
provider = get_ai_provider(model=model)
ai_text, input_tokens, output_tokens = await provider.generate_text(
system_prompt=system_prompt,
system_prompt=[
{"type": "text", "text": system_prompt},
# cacheable: SYSTEM_PROMPT_TEMPLATE with a per-session language
# substitution. Two sessions on the same language share a cache
# entry; different languages cache independently. Conversation
# history (ai_messages) is NOT cached at this layer — if that
# becomes a cost driver, route script_builder through the chat
# wrapper (0.4) which handles history caching.
],
messages=ai_messages,
max_tokens=8192,
)


@@ -0,0 +1,163 @@
# FlowPilot & ResolutionAssist
> ResolutionFlow offers two AI-driven troubleshooting modes that share the same session backend but present very different interaction styles. Both work standalone and become richer when paired with a PSA connection.
---
## At a glance
| | **FlowPilot** | **ResolutionAssist** |
|---|---|---|
| **Style** | Guided, structured | Conversational, freeform |
| **Entry** | `/pilot` | `/assistant` |
| **Interaction** | Questions → Actions → Resolution, one step at a time | Natural chat with inline questions/actions |
| **Best for** | Reproducible workflows, low-context engineers, handoffs | Exploratory problems, quick lookups, rubber-ducking |
| **Lifecycle** | Active → Paused → Resolved / Escalated / Abandoned | Active → Resolved / Abandoned (lightweight) |
| **Confidence tracking** | Yes — drives tier transitions | No — always responsive to user direction |
| **Navigation guard** | Yes — prevents accidental loss | No — free to leave and return |
Both modes share the `ai_sessions` table (discriminated by `session_type`), the same multimodal AI backend (image uploads, markdown, cached prompts), and the same `[QUESTIONS]` / `[ACTIONS]` / `[FORK]` marker vocabulary that renders inline TaskLane elements.
---
## FlowPilot — guided troubleshooting
FlowPilot is a wizard-style AI engineer that walks you through a problem one diagnostic step at a time. It runs on confidence tiers:
- **Discovery** (confidence < 0.4) — asking broad, open-ended questions to characterize the problem
- **Exploring** (0.4–0.8) — proposing targeted actions and narrowing hypotheses
- **Guided** (≥ 0.8) — recommending a specific fix with steps to verify
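
The tier boundaries above can be sketched as a small lookup. This is a minimal illustration, not the project's implementation: the function name `confidence_tier` and the lowercase tier strings are hypothetical, and it assumes the Exploring band runs from 0.4 up to, but not including, 0.8.

```python
def confidence_tier(confidence: float) -> str:
    """Map a confidence score in [0, 1] to a FlowPilot tier name.

    Assumed boundaries: Discovery < 0.4, Exploring 0.4 to < 0.8, Guided >= 0.8.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be within [0, 1], got {confidence}")
    if confidence < 0.4:
        return "discovery"
    if confidence < 0.8:
        return "exploring"
    return "guided"
```

Keeping the thresholds in one function means the boundary cases (exactly 0.4 and exactly 0.8) are decided in a single place.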
### The FlowPilot session flow
1. **Intake.** You start from `/pilot` or from the dashboard "New Session" button. The intake screen accepts free-text description, PSA ticket context, screenshots, or log pastes.
2. **Preference check.** Before suggesting any fix, the AI asks whether you want a **GUI** or **script** approach. This is enforced in the system prompt so you never get steps you can't execute.
3. **Step-by-step progression.** Each AI response is either a question (with clickable options), an action (with "Done" / "Didn't work" buttons), a `[FORK]` (two distinct paths to try), or a final resolution suggestion. You respond, the AI updates its confidence, and the next step is generated.
4. **Action bar.** The session header always shows **Pause & Leave**, **Resolve**, **Escalate**, **Share Update**, and **Close**. Pausing freezes the session; resuming restores the full context.
5. **Resolve / Escalate.** *Resolve* marks the ticket fixed and generates a clean summary of what worked. *Escalate* packages the problem summary and steps tried into an **escalation package** that the next engineer (or the PSA ticket) inherits.
### Why FlowPilot exists
- **New engineers** get senior-engineer-level diagnostic rigor without needing the experience to know what to ask next.
- **Documented resolutions** — every step is captured, so the generated note on the ticket is substantive (not just "fixed it").
- **Handoff-friendly** — escalation packages mean the next person doesn't start from zero.
---
## ResolutionAssist — conversational AI
ResolutionAssist is a chat with an expert IT systems engineer. It's less structured than FlowPilot but still surfaces interactive elements when the AI wants structured input.
### The ResolutionAssist flow
1. **Open a chat.** From `/assistant` or the dashboard. Sessions show up in the left sidebar just like any messaging app.
2. **Send a message.** Freeform prose. Attach up to 3 images per message (screenshots, error dialogs, network diagrams). Paste logs, code, or PowerShell output.
3. **AI responds.** The response is prose, but any `[QUESTIONS]` or `[ACTIONS]` blocks render as a **TaskLane** — a side panel with clickable options and action buttons. You can answer via chat or click the TaskLane elements.
4. **Branching (`[FORK]`).** If the AI proposes two paths ("check cable or restart switch?"), the fork renders as a choice. Picking one continues the conversation down that path.
5. **Resume later.** Unlike FlowPilot, there's no navigation guard. Leave mid-conversation; every message is stored.
### Why ResolutionAssist exists
- **Unstructured problems** — "I have no idea where to start, here's a screenshot" works great.
- **Reference lookups** — "what's the right PowerShell command to check Exchange health" is faster in chat than through an intake form.
- **Senior engineers** — when you already know what you're doing and just want a second opinion or a syntax check.
---
## Without a PSA connection
Both modes work standalone. Without ConnectWise connected:
- Sessions live entirely in ResolutionFlow. They're listed in your session history, searchable, and shareable via public share links (`/shared/sessions/:token`).
- Summaries generated on Resolve are saved to the session record but **not** written anywhere else. You can copy/paste into whatever ticketing or documentation system you use.
- Escalating a FlowPilot session routes the escalation package to another ResolutionFlow engineer on your team — not to an external PSA ticket.
- No ticket context is injected into the AI prompt, so the AI starts cold with only what you provide in the intake or first message.
**Standalone use cases:**
- Evaluating ResolutionFlow before committing to a PSA integration
- Troubleshooting internal IT issues that aren't client-facing
- Teams using a PSA ResolutionFlow doesn't integrate with yet
- Knowledge-base research ("what are my options for X") that doesn't map to a ticket
---
## With a PSA connection (ConnectWise)
When ConnectWise is connected, both modes become ticket-aware and write back to the PSA as a first-class client.
### FlowPilot + PSA
**Starting from a ticket:**
- Click a ticket row (from `/tickets` or the dashboard queue) and pick "Start FlowPilot." The ticket's problem description, recent notes, configurations, company details, and related tickets are auto-injected into the AI's context. No manual retyping.
- The session shows the linked ticket badge in the header.
**During the session:**
- **Share Update** — posts an interim note to the CW ticket with the current AI summary, so stakeholders can see progress without interrupting you.
- **Status changes** — the detail panel and session header let you move the ticket through statuses (New → In Progress → Waiting on Customer → Resolved) directly from ResolutionFlow. Status writes are verified against CW so you're never told "success" when CW silently rejected the change.
- **Resource assignment** — add yourself or a teammate as a co-assignee without touching the owner. If the ticket has no owner yet, assigning sets owner; if there's already an owner, you're added as an additional resource via a CW schedule entry.
**On Resolve:**
- Final summary is posted as a ticket note.
- Ticket status can auto-update to Resolved (per your team's settings).
**On Escalate:**
- The escalation package (problem summary + steps tried) is posted as a note.
- The ticket can be routed via CW's normal escalation rules.
- The next engineer picking up the ticket can auto-start a new session with the full escalation context pre-filled.
**Spin-off tickets (new):**
- During any session, if you discover a separate issue, the AI can propose `create_spin_off_ticket`. Accepting opens the New Ticket modal pre-filled with the current ticket's company and board, so a second ticket is one click away without leaving your session.
### ResolutionAssist + PSA
**Starting from a ticket:**
- Same ticket-context injection as FlowPilot. When opened with a linked ticket, the AI sees company, configs, notes, and related tickets.
- A "New Ticket" button appears in the header — lets you spawn a separate ticket mid-conversation (same flow as FlowPilot's spin-off).
**During the chat:**
- Ask the AI about the ticket directly: *"Summarize what's been tried," "What configs does this company have?"* — the AI already has that context loaded.
- `[ACTIONS]` can include `create_spin_off_ticket` when the AI detects a separate issue surfaced in the conversation.
**Writing back:**
- ResolutionAssist is a lighter-weight mode, so it doesn't auto-post on resolve. You can manually copy the conversation summary to a ticket note if useful.
- Status updates and resource assignment are done via the `/tickets` page rather than the chat UI.
---
## Choosing between them
| I want to… | Use |
|---|---|
| Walk through a known issue type with step-by-step rigor | **FlowPilot** |
| Document every action taken for audit or handoff | **FlowPilot** |
| Escalate with a full context package | **FlowPilot** |
| Ask a question, get an answer, move on | **ResolutionAssist** |
| Paste a screenshot and say "what's wrong here?" | **ResolutionAssist** |
| Stay on the ticket for 2 minutes, not 20 | **ResolutionAssist** |
| Troubleshoot without breaking flow to switch pages | Either, with the linked ticket panel open alongside |
The two modes aren't competitive. A common workflow is to start in ResolutionAssist to scope the problem, then kick off a FlowPilot session when you realize the issue is going to take real diagnosis. Both show up in the unified session history.
---
## Tickets page — the PSA hub
`/tickets` is the CW ticket manager built into ResolutionFlow. With a PSA connection:
- Search and filter tickets by assignment (me / unassigned / specific member via searchable picker), board, status, priority, company, open/closed.
- Slide-out detail panel shows notes, configurations, related tickets, and assignees — all fetched in parallel for fast hydration.
- From the detail panel: change status, add/remove assignees, post notes, or "Start FlowPilot" / "Open in ResolutionAssist" with full context.
- New Ticket modal offers both AI-parse ("Create a high-priority ticket for Acme — Outlook not syncing for jsmith") and a traditional form.
Without a PSA connection, `/tickets` is hidden from the sidebar entirely — there's nothing to show.
---
## Summary
- **FlowPilot** = guided, structured, lifecycle-heavy, ideal for resolvable issues and handoffs.
- **ResolutionAssist** = freeform chat, ideal for scoping and quick answers.
- **Without PSA** = both work, sessions live in ResolutionFlow, summaries are yours to export.
- **With PSA** = both become ticket-aware, write back to CW (notes, status, resources), and can spawn spin-off tickets mid-session.
The AI is the same under the hood. The difference is how much structure you want around the conversation — and how deeply the result needs to integrate with your ticketing system.


@@ -0,0 +1,178 @@
# FlowPilot Migration Plan: Phase 0 Through Phase 7
## Summary
- Stay code-change-free until execution is explicitly requested.
- Implement in commit-sized phases, with Phase 0 as a prerequisite for AI-heavy Phases 2+.
- Use this repo's existing `/api/v1/ai-sessions` API namespace instead of the doc's generic `/sessions` path.
- Move the existing chat-first `AssistantChatPage` to `/pilot`; `/assistant` becomes a permanent redirect.
- Keep `ai_sessions.session_type` for compatibility, but the user-facing product becomes one FlowPilot surface.
## Phase 0: Prompt Caching Infrastructure
- Consolidate Anthropic prompt caching into `backend/app/core/ai_provider.py`, then route all Anthropic calls through that provider.
- Preserve the existing cached behavior from `assistant_chat_service`, but remove the private duplicate cached implementation once provider parity exists.
- Add cache-control blocks for static system prompt sections and stable tool/context prefixes; keep volatile user messages outside the cached prefix.
- Update one-shot AI generators and `/tickets/ai-parse` to separate stable context from changing request content.
- Instrument every Anthropic response with `cache_read_input_tokens` and `cache_creation_input_tokens`.
- Acceptance: two identical FlowPilot/provider-backed calls within 5 minutes show creation tokens on the first call and read tokens on the second.
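
As a rough sketch of the cached-prefix idea in this phase: the static system prompt carries a `cache_control` marker (the Anthropic Messages API's prompt-caching field, with type `ephemeral`), while volatile content follows it unmarked. The helper names `build_system_blocks` and `cache_usage` are hypothetical and are not the provider's actual functions; only the two usage field names come from this plan.

```python
from typing import Any


def build_system_blocks(static_prompt: str, volatile_context: str = "") -> list[dict[str, Any]]:
    """Build system blocks with the stable prefix marked as cacheable.

    The cache breakpoint sits on the static block, so anything appended
    after it (per-request context) stays outside the cached prefix.
    """
    blocks: list[dict[str, Any]] = [
        {
            "type": "text",
            "text": static_prompt,
            "cache_control": {"type": "ephemeral"},
        }
    ]
    if volatile_context:
        blocks.append({"type": "text", "text": volatile_context})
    return blocks


def cache_usage(usage: Any) -> dict[str, int]:
    """Read cache counters from a response usage object, treating missing or None as 0."""
    return {
        "cache_read_input_tokens": getattr(usage, "cache_read_input_tokens", 0) or 0,
        "cache_creation_input_tokens": getattr(usage, "cache_creation_input_tokens", 0) or 0,
    }
```

Under these assumptions, the acceptance check amounts to logging `cache_usage(response.usage)` on two identical calls and comparing the creation counter on the first with the read counter on the second.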
## Phase 1: Schema + Route Rename
- Add Alembic migration after current repo head with:
  - `session_facts`
  - `session_suggested_fixes`
  - `draft_templates`
  - `account_settings`
  - new artifact columns on `ai_sessions`
  - provenance columns on `script_templates`
- Create `account_settings` as one lazy row per account:
  - `account_id` primary key, FK to `accounts(id)` with cascade delete
  - `preferences JSONB NOT NULL DEFAULT '{}'`
  - timestamps
  - `get_setting(key, default)` helper on the SQLAlchemy model
  - `templatize_prompt_enabled` default read as `true` when the row/key is absent
- Apply RLS to all new tenant-scoped tables using the repo's `app.current_account_id` policy pattern.
- Route `/pilot` and `/pilot/:sessionId` to the existing chat UI; redirect `/assistant` and `/assistant/:sessionId` permanently.
- Update sidebar, command palette, dashboard cards, session list links, and visible labels from “AI Assistant”/ResolutionAssist to “FlowPilot” where they describe the troubleshooting surface.
- Acceptance: `/pilot` renders the chat UI, `/assistant` redirects, RLS grep/check passes, and no Phase 2 UI is introduced yet.
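
The lazy-row behavior of `account_settings` could look roughly like the following. This is a plain-dataclass stand-in rather than the SQLAlchemy model, and `templatize_prompt_enabled()` as a free function is a hypothetical helper; only `get_setting(key, default)`, the `preferences` field, and the default-to-true rule come from the plan.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class AccountSettings:
    """Stand-in for the account_settings row (one lazy row per account)."""

    account_id: str
    preferences: dict[str, Any] = field(default_factory=dict)

    def get_setting(self, key: str, default: Any = None) -> Any:
        """Return a preference value, or the default when the key is absent."""
        return self.preferences.get(key, default)


def templatize_prompt_enabled(row: Optional[AccountSettings]) -> bool:
    """Read the flag as true when either the row or the key is missing."""
    if row is None:
        return True
    return bool(row.get_setting("templatize_prompt_enabled", True))
```

Because the row is created lazily, callers must handle `None` as well as a missing key; folding both cases into one helper keeps the default consistent.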
## Phase 2: What We Know
- Add `FactSynthesisService` for conservative conversion of answers/check outputs into facts.
- Add fact APIs under `/api/v1/ai-sessions/{id}/facts`:
  - list, create manual note, update editable fact, soft-delete, promote source item.
- Extend `unified_chat_service` marker parsing with `[PROMOTE]`; do not create a separate marker pipeline.
- Because current questions/checks live in `ai_sessions.pending_task_lane` JSON, Phase 2 must assign stable UUIDs to task-lane questions/actions/checks when they are first persisted. `session_facts.source_ref` points to those stable JSON item IDs; it remains polymorphic and unconstrained at the DB level.
- Add frontend task lane components under the new FlowPilot component namespace:
  - `TaskLane`
  - `WhatWeKnow`
  - `WhatWeKnowItem`
  - `AddNoteButton`
  - moved/refactored Questions and Diagnostic Checks sections
- Place What We Know above Questions. Facts from questions/checks are read-only at the fact card level; manual and AI-synthesis facts are editable.
- Acceptance: answering a question or completing a check can promote a fact within 2 seconds; manual notes persist; page reload preserves facts; cross-account access is blocked.
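
One way to picture the stable-ID requirement for `pending_task_lane` items is an idempotent pass that fills in a UUID only where one is missing. The JSON shape used here (top-level `questions`, `actions`, and `checks` lists of dicts with an `id` key) is an assumption for illustration; the real structure of `pending_task_lane` is not shown in this plan.

```python
import uuid
from typing import Any

# Hypothetical section keys; the real pending_task_lane layout may differ.
ITEM_SECTIONS = ("questions", "actions", "checks")


def assign_stable_ids(task_lane: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the task lane where every item has an `id`.

    Items that already carry an ID are left untouched, so running this on
    every persist never changes an ID that `session_facts.source_ref` may
    already point to.
    """
    result = dict(task_lane)
    for section in ITEM_SECTIONS:
        items = task_lane.get(section)
        if not isinstance(items, list):
            continue
        result[section] = [
            item if item.get("id") else {**item, "id": str(uuid.uuid4())}
            for item in items
        ]
    return result
```

Since `source_ref` is unconstrained at the DB level, this idempotence is what keeps fact provenance from dangling after a re-save.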
## Phase 3: Suggested Fix + Resolve Preview
- Add suggested-fix APIs under `/api/v1/ai-sessions/{id}/suggested-fixes`:
  - get active suggested fix
  - record decision for one-off/draft-template/build-template/dismissed
- Extend `unified_chat_service` marker parsing with `[SUGGEST_FIX]`; supersede the prior active fix when a new one is persisted.
- Add `ResolutionNoteGeneratorService` that builds the fixed markdown shape:
  - Problem
  - What we confirmed
  - Root cause
  - Resolution
- Add preview endpoint at `/api/v1/ai-sessions/{id}/resolution-note/preview`.
- Generate the preview from `ai_sessions`, `session_facts`, active suggested fix, and linked script generations; redact sensitive script parameters.
- Cache preview output by session-state version or content hash; invalidate on fact/suggested-fix/script-generation writes.
- Add `SuggestedFix`, `ResolveButton`, and `ResolutionNotePreview` popover. Debounce preview refresh to 500ms.
- Acceptance: a session with facts and a suggested fix shows a four-section preview; editing a fact refreshes preview; human review confirms no unsupported claims.
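
A minimal sketch of the fixed four-section note and a content-hash cache key, assuming facts arrive as plain strings. `build_resolution_note` and `preview_cache_key` are hypothetical names, not the `ResolutionNoteGeneratorService` API, and redaction of sensitive script parameters is deliberately left out.

```python
import hashlib
import json

SECTIONS = ("Problem", "What we confirmed", "Root cause", "Resolution")


def build_resolution_note(problem: str, facts: list[str], root_cause: str, resolution: str) -> str:
    """Render the fixed Problem / What we confirmed / Root cause / Resolution shape."""
    confirmed = "\n".join(f"- {fact}" for fact in facts) or "- (none recorded)"
    bodies = (problem, confirmed, root_cause, resolution)
    return "\n\n".join(f"## {title}\n{body}" for title, body in zip(SECTIONS, bodies))


def preview_cache_key(session_id: str, facts: list[str], suggested_fix: str) -> str:
    """Derive a cache key from the inputs that determine the preview.

    Any fact or suggested-fix change produces a different key, which is one
    way to get invalidation on write without tracking versions explicitly.
    """
    payload = json.dumps(
        {"session": session_id, "facts": facts, "fix": suggested_fix},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

A session-state version counter would serve the same purpose; the hash variant simply avoids needing a new column.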
## Phase 4: Resolve + Escalate Writebacks
- Add `EscalationPackageGeneratorService` with handoff-oriented markdown:
  - Problem
  - What we've confirmed
  - What we've tried
  - Current hypothesis
  - Suggested next steps
- Add preview/post endpoints under `/api/v1/ai-sessions/{id}`:
  - `/resolution-note/preview`
  - `/resolution-note/post`
  - `/escalation-package/preview`
  - `/escalation-package/post`
- Extend PSA writeback service using the existing PSA provider registry and `post_note` seam.
- Implement “confirm and fire”: engineer edits preview, clicks Confirm & post, then server posts to PSA and stores result metadata.
- Ticket status transitions must verify by re-fetching status; failed verification is surfaced as an error, not silent success.
- Resolving without a linked PSA ticket stores markdown and marks the session resolved without external posting.
- Acceptance: ConnectWise test ticket receives the note/package, status verification works, and unlinked sessions resolve locally.
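
The verify-by-re-fetch rule for status transitions can be illustrated as follows. `TicketClient`, `set_status_verified`, `StatusVerificationError`, and the in-memory fake are all hypothetical; the real PSA provider registry interface is not shown in this plan.

```python
from dataclasses import dataclass
from typing import Protocol


class StatusVerificationError(RuntimeError):
    """Raised when the PSA does not report the status that was requested."""


class TicketClient(Protocol):
    def set_status(self, ticket_id: int, status: str) -> None: ...
    def get_status(self, ticket_id: int) -> str: ...


def set_status_verified(client: TicketClient, ticket_id: int, target: str) -> str:
    """Write the status, re-fetch it, and fail loudly on a mismatch."""
    client.set_status(ticket_id, target)
    actual = client.get_status(ticket_id)
    if actual != target:
        raise StatusVerificationError(
            f"ticket {ticket_id}: requested {target!r} but PSA reports {actual!r}"
        )
    return actual


@dataclass
class InMemoryTicketClient:
    """Test double that can simulate a PSA silently ignoring a write."""

    status: str
    silently_reject: bool = False

    def set_status(self, ticket_id: int, status: str) -> None:
        if not self.silently_reject:
            self.status = status

    def get_status(self, ticket_id: int) -> str:
        return self.status
```

The fake with `silently_reject=True` models the failure this phase guards against: a write call that returns normally while the ticket status stays unchanged.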
## Phase 5: Inline Script Generator Integration
- Add inline Script Generator components:
  - `TemplateMatchPanel`
  - `NoTemplateDialog`
  - `ParameterizationPreview`
- For template matches, clicking the suggested fix opens the existing Script Generator flow with parameters prefilled from facts, ticket context, account/PSA config, and AI-suggested values.
- For no-template matches, show the three-option dialog:
  - Run as one-off
  - Run now, templatize after resolve
  - Build as template now
- Persist the selected path on `session_suggested_fixes.user_decision`.
- Add `TemplateExtractionService` for converting concrete scripts into proposed parameter schemas and templated bodies.
- Link every script generation back to `ai_sessions` via existing `script_generations.ai_session_id`.
- `Cmd+K → script` opens the inline generator from the FlowPilot session; no Resolve keyboard shortcut is added.
- Acceptance: matched templates prefill parameters; no-match flow shows three options; all options produce the correct session/template side effects.
## Phase 6: Post-Resolve Templatize Prompt
- Add `TemplatizePrompt` after successful Resolve only when:
  - the account setting allows prompts
  - the session has pending `draft_templates`
  - the user chose “Run now, templatize after resolve”
- Accept flow creates a real `script_templates` row with:
  - `source_session_id`
  - `source_user_id`
  - `source_ticket_ref`
  - accepted parameter schema/body edits
- Skip flow marks the draft rejected.
- “Don't ask me again for this team” writes `{"templatize_prompt_enabled": false}` to `account_settings.preferences`.
- Script Library shows a pending-drafts badge/count for the account.
- Acceptance: accept creates a visible template with provenance; skip creates no template; disabled prompt is respected on the next resolve.
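
The three display conditions for `TemplatizePrompt` reduce to a single predicate. This sketch assumes the "Run now, templatize after resolve" choice is stored in `session_suggested_fixes.user_decision` as the string `"draft_template"`; that stored value and the function name are assumptions, not confirmed by this plan.

```python
from typing import Optional

# Assumed stored value for "Run now, templatize after resolve".
DRAFT_TEMPLATE_DECISION = "draft_template"


def should_show_templatize_prompt(
    *,
    prompt_enabled: bool,
    pending_draft_count: int,
    user_decision: Optional[str],
) -> bool:
    """Show the prompt only when all three conditions hold."""
    return (
        prompt_enabled
        and pending_draft_count > 0
        and user_decision == DRAFT_TEMPLATE_DECISION
    )
```

Keyword-only arguments make call sites self-describing, which helps when the three inputs come from different tables.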
## Phase 7: Polish
- Match the authoritative mockup HTML for spacing, colors, typography, and component structure; use PNGs for visual target confirmation.
- Add loading states for fact synthesis, preview generation, template extraction, PSA post/verify, and script generation.
- Add empty states for:
  - no facts
  - no questions
  - no checks
  - no suggested fix
  - no pending draft templates
- Add keyboard shortcuts except Resolve:
  - `Cmd+K` command palette
  - `Cmd+Enter` send composer
  - `Cmd+G` script generator
- At widths below 1200px, collapse the task lane into a bottom drawer.
- Use existing design tokens where present; add missing tokens only if needed to match the mockups.
- Acceptance: major screens visually compare within the doc's tolerance, no horizontal scroll at 1280px, mobile task lane works, and shortcuts do not conflict with browser reload.
## Public Interfaces
- New backend routes use `/api/v1/ai-sessions/{id}/...`, not `/api/v1/sessions/{id}/...`.
- Existing chat creation/message APIs remain compatible.
- `session_type` remains queryable and stored, but frontend routing no longer sends chat sessions to `/assistant`.
- New persistent entities:
  - `session_facts`
  - `session_suggested_fixes`
  - `draft_templates`
  - `account_settings`
- New persisted artifact columns on `ai_sessions` store resolution/escalation markdown and PSA post metadata.
## Test Plan
- Migration tests:
  - fresh DB upgrade succeeds
  - downgrade succeeds if the repo expects reversible migrations
  - new tables have RLS enabled/forced
  - tenant policy includes `app.current_account_id`
- Backend tests:
  - fact CRUD and promotion authorization
  - suggested-fix supersession and decision persistence
  - preview generation cache invalidation
  - Resolve/Escalate local-only behavior without PSA
  - PSA status verification failure path
  - draft-template accept/reject behavior
- Frontend tests:
  - route redirects
  - task lane rendering and persistence
  - inline editing and preview refresh
  - script generator option flows
  - templatize prompt settings behavior
  - responsive drawer behavior
- Manual QA:
  - run through one ConnectWise linked Resolve
  - run through one Escalate
  - run one template-match script path
  - run one no-template draft-template path through post-resolve save
## Assumptions
- Phase 0 is included and must be complete before Phase 2 begins.
- No Resolve keyboard shortcut in this migration.
- Templatize prompt defaults to enabled.
- Resolution notes use engineer review plus Confirm & post, not supervisor staging.
- Existing component folders may be renamed opportunistically, but behavior and route migration matter more than directory-name purity.
- No backfill of What We Know for old sessions.
- Team Wiki compilation, SharePoint integration, marketplace sharing, and confidence-tier UI are out of scope.


@@ -0,0 +1,809 @@
# FlowPilot Migration — Design & Implementation Doc
> **Target:** Transform `/assistant` (ResolutionAssist) into the new unified `/pilot` (FlowPilot) surface.
> **Audience:** Claude Code (implementation) reviewed by Michael (owner).
> **Status:** Design locked. Ready for phased implementation.
> **Last updated:** April 17, 2026
---
## 0. Prerequisite reading for Claude Code
Before writing any code, read these in order:
1. This document end-to-end.
2. `mockups/01-session-primary.png` — the target state for the main session UI.
3. `mockups/02-script-template-match.png`, `03-script-three-options.png`, `04-script-templatize-prompt.png` — Script Generator integration states.
4. The source HTML files `mockups/01-session-primary.html` and `mockups/02-04-script-integration.html` — authoritative for spacing, colors, and component structure. When CSS or layout questions arise during implementation, these files are the tiebreaker.
Do not proceed to implementation until you have confirmed you understand the following three architectural claims. If any of them are unclear, stop and ask.
1. **There is one AI troubleshooting surface, not two.** The existing split between FlowPilot (guided) and ResolutionAssist (chat) is collapsed into a single chat-primary product called FlowPilot at `/pilot`. The `ai_sessions.session_type` discriminator column is retained for data, but the product shows one unified UI.
2. **The task lane is the load-bearing structural feature.** It is not a sidebar of metadata. It actively tracks diagnostic state: *What we know*, *Questions*, *Diagnostic checks*, *Suggested fix*. Engineers interact with it; facts flow between sections.
3. **Resolve and Escalate are deterministic artifact generators, not free-text prompts.** When an engineer clicks Resolve, a structured summary is generated from task lane state (not from the chat transcript alone) and posted to CW. The summary structure is fixed: *Problem / What we confirmed / Root cause / Resolution*.
---
## 1. Why this change
### The current state
- `/assistant` is a chat-primary AI session with a `[QUESTIONS]` and `[DIAGNOSTIC_CHECKS]` task lane.
- `/pilot` was specced as a separate guided, confidence-tiered wizard with a different UI and lifecycle.
- The `FLOWPILOT-AND-RESOLUTIONASSIST.md` design document treated them as two products sharing a backend.
### The problems with the current state
- Two sidebar entries, two session histories, two mental models for engineers to learn.
- The PSA integration scope doubles (writebacks for lifecycle events must be built twice, or built for Pilot and bolted onto Assist).
- The Team Wiki moat depends on structured session artifacts with explicit resolutions — a chat-only mode produces weaker artifacts.
- The cockpit positioning (the core ResolutionFlow brand promise) does not map to a blank chat window.
- Branching into two modes forces a decision onto the engineer ("which mode for this ticket?") that has no right answer.
### The resolution
The existing `/assistant` UI already does most of what `/pilot` was supposed to do — structured questions, diagnostic checks, lifecycle actions in the header. It is closer to the right product than the doc anticipated. Rather than building Pilot as a second surface, we extend Assist with the missing structural features (*What we know*, auto-generated summaries, escalation packages) and rename it FlowPilot.
### The strategic move
FlowPilot becomes the single canonical troubleshooting surface. Every PSA writeback, every Wiki compilation path, every Script Generator invocation points here. One session shape, one lifecycle, one integration surface.
---
## 2. Terminology used in this document
| Term | Meaning |
|---|---|
| **Session** | A single `ai_sessions` row representing one troubleshooting conversation. |
| **Task lane** | The right-side panel containing What we know, Questions, Diagnostic checks, Suggested fix. |
| **Fact** | An item in the What we know section. Has `text`, `source_type` (`question` / `diagnostic_check` / `user_note` / `ai_synthesis`), and `source_ref` (FK to the originating question/check, or null for user notes). |
| **Suggested fix** | The AI's current best-guess resolution path. Has a confidence score and, optionally, a reference to a Script Library template. |
| **Promotion** | The act of a question answer or diagnostic check result being converted into a fact in What we know. Triggered by AI, confirmed/editable by engineer. |
| **Resolution note** | The structured document generated when the engineer clicks Resolve. Posted to CW as a ticket note. |
| **Escalation package** | The structured handoff document generated when the engineer clicks Escalate. Posted to CW and attached to the session for the next engineer. |
---
## 3. Target UI — annotated
### 3.1 Primary session view
![Primary session view](mockups/01-session-primary.png)
The session UI is a four-column layout:
1. **Icon rail** (64px wide) — primary app navigation. FlowPilot / Tickets / Trees / Scripts / Wiki. Avatar at bottom.
2. **Session list** (260px wide) — all sessions grouped by state (Active / Recent). Each row shows title, state dot, PSA ticket number, and client name.
3. **Conversation column** (fluid) — the chat thread, composer, and incident header.
4. **Task lane** (380px wide) — *What we know*, *Questions*, *Diagnostic checks*, *Suggested fix*, and the Resolve action at the bottom.
Key visual and behavioral elements numbered against the mockup:
**Incident header (top of conversation column)**
- PSA chip showing `CW #48291` in cyan, monospaced
- Client / contact / priority meta line
- Incident title in Bricolage Grotesque 19px
- Four lifecycle buttons right-aligned: **Pause** (ghost), **Share update** (neutral), **Escalate** (amber), **Resolve** (green)
**Conversation column**
- Standard chat thread with pilot and user avatars
- Pilot uses cyan gradient avatar; user uses purple gradient
- AI messages in `bg-2` bubbles with subtle border; user messages in cyan-tinted bubbles
- Composer at bottom with inline action chips (Attach / Paste logs / Ticket context) and a send button
**Task lane sections, in order:**
1. **What we know** (NEW)
   - Header: `WHAT WE KNOW · 4` (section title + count)
   - Each fact is a card: `bg-2` background, dashed circular green check, fact text, and a provenance line (`from question · rules out tenant/license`)
   - "+ Add a note" button at the bottom for manual facts from the engineer
   - Background has a subtle green-to-transparent gradient to visually distinguish from the rest of the lane
2. **Questions**
   - Header: `QUESTIONS · 2 unanswered`
   - Each unanswered question: title, AI hint text, Answer / Skip buttons
   - Answered questions dim to 55% opacity with a dashed border and show the resolution inline (`Answered · isolated to jsmith (promoted to What we know)`)
3. **Diagnostic checks**
   - Header: `DIAGNOSTIC CHECKS · 1 / 3 run`
   - "Run remaining 2 checks" button at top when applicable
   - Each check: icon + command name (monospaced), description
   - Completed checks dim and show "Complete · findings promoted to What we know" in green
4. **Suggested fix**
   - Header: `SUGGESTED FIX · 94% confidence`
   - Amber-accented card with fix title and description
   - Clicking opens the Script Generator flow (Section 5)
**Resolve action bar (bottom of task lane)**
- Small hint text ("Summary preview is open →")
- Full-width "Resolve & post to CW" button in green
**Resolution note preview (floating, anchored to Resolve button)**
- A persistent popover, NOT a modal
- Shows the draft resolution note with Problem / What we confirmed / Root cause / Resolution sections
- Displays the target ticket (`CW #48291`) and status change (`Resolved`)
- Edit button opens an inline editor; Confirm & post fires the PSA writeback
### 3.2 Script Generator integration — template match
![Template match flow](mockups/02-script-template-match.png)
When the suggested fix references an existing Script Library template, clicking the fix opens the Script Generator panel in place of (or sliding over) the task lane. Key behavior:
- A **Verified template** badge appears above the parameter form
- Parameters pre-filled from session context get a cyan `from session` tag and a cyan-tinted input background
- Each pre-filled parameter has a hint line explaining the source: *"Pulled from CW company config for Acme Corp"*
- The engineer can adjust any pre-filled value before generating
- `⌘K` → "script" invokes the generator mid-conversation from anywhere in the session
### 3.3 Script Generator integration — no template match (three-option dialog)
![No template match](mockups/03-script-three-options.png)
When no template matches the suggested fix, FlowPilot drafts a session-specific script and presents three paths:
1. **Run as one-off** (neutral outline CTA)
   - Script generated and captured in session documentation, discarded after
   - Tradeoffs: fastest, but team won't benefit next time
2. **Run now, templatize after resolve** (RECOMMENDED, cyan primary CTA)
   - Script generated for this ticket; draft template queued
   - Post-resolve prompt offers to templatize (Section 5.3)
   - Tradeoffs: zero cognitive overhead now, only templatize what works, ~30s review later
3. **Build as template now** (purple outline CTA)
   - Full parameterization upfront
   - Tradeoffs: immediate team benefit, but adds time mid-ticket
The drafted script renders as a code preview above the option cards with the AI's proposed parameters highlighted in amber.
### 3.4 Script Generator integration — post-resolve templatization prompt
![Templatize prompt](mockups/04-script-templatize-prompt.png)
If the engineer picked Option 2 in the three-option dialog and Resolve succeeds, this prompt appears after the resolution note is posted to CW:
- Success banner confirms the resolution posted
- Templatize card shows the script with AI-proposed parameters substituted in as `{{ gateway_host }}`, etc.
- Right pane lists extracted parameters with remove buttons (engineer can adjust)
- Provenance note: *"generated from CW #48307 · resolved by M. Davis"*
- Three actions: Skip / Edit parameters / Save as team template
- "Don't ask me again for this team" opt-out in footer
---
## 4. Data model changes
### 4.1 New columns on `ai_sessions`
```sql
ALTER TABLE ai_sessions
ADD COLUMN resolution_note_markdown TEXT NULL,
ADD COLUMN resolution_note_posted_at TIMESTAMPTZ NULL,
ADD COLUMN resolution_note_external_id VARCHAR(128) NULL, -- CW note ID after posting
ADD COLUMN escalation_package_markdown TEXT NULL,
ADD COLUMN escalation_package_posted_at TIMESTAMPTZ NULL;
```
No migration of `session_type` — the column stays. New sessions all default to the unified FlowPilot type.
### 4.2 New `session_facts` table (the What we know backing store)
```sql
CREATE TABLE session_facts (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
session_id UUID NOT NULL REFERENCES ai_sessions(id) ON DELETE CASCADE,
account_id UUID NOT NULL REFERENCES accounts(id), -- for RLS, per multi-tenant architecture
text TEXT NOT NULL,
source_type VARCHAR(32) NOT NULL CHECK (source_type IN ('question', 'diagnostic_check', 'user_note', 'ai_synthesis')),
source_ref UUID NULL, -- FK to session_questions.id or session_diagnostic_checks.id, null for user_note
source_summary TEXT NULL, -- free-text provenance label, e.g. "rules out tenant/license"
created_by UUID NOT NULL REFERENCES users(id),
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
deleted_at TIMESTAMPTZ NULL
);
CREATE INDEX idx_session_facts_session ON session_facts(session_id) WHERE deleted_at IS NULL;
CREATE INDEX idx_session_facts_account ON session_facts(account_id);
```
**Important:** `source_ref` is a polymorphic FK and should NOT have a database-level FK constraint. Enforce integrity at the service layer.
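Because the database cannot enforce this reference, the service layer has to. The following is a minimal sketch of such a check, not the project's implementation. The `exists` callback is a hypothetical lookup, and the rule that `ai_synthesis` facts may omit `source_ref` is an assumption the schema comment does not settle.

```python
from typing import Callable, Optional
from uuid import UUID

# Source types that must point at a persisted question or check.
REF_REQUIRED = {"question", "diagnostic_check"}
# Source types that must not carry a reference (per the column comment).
REF_FORBIDDEN = {"user_note"}
# Assumption: ai_synthesis facts may or may not carry a reference.
REF_OPTIONAL = {"ai_synthesis"}


def validate_source_ref(
    source_type: str,
    source_ref: Optional[UUID],
    exists: Callable[[str, UUID], bool],
) -> None:
    """Raise ValueError if source_ref is inconsistent with source_type.

    `exists(source_type, source_ref)` is a hypothetical lookup returning True
    when the referenced question or check belongs to the same session.
    """
    if source_type in REF_FORBIDDEN:
        if source_ref is not None:
            raise ValueError(f"{source_type} facts must not set source_ref")
    elif source_type in REF_REQUIRED:
        if source_ref is None:
            raise ValueError(f"{source_type} facts require source_ref")
        if not exists(source_type, source_ref):
            raise ValueError(f"source_ref {source_ref} not found for {source_type}")
    elif source_type not in REF_OPTIONAL:
        raise ValueError(f"unknown source_type: {source_type}")
```

Running this check inside `create_fact` before the insert keeps dangling references out of `session_facts` without a database constraint.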
### 4.3 New `session_suggested_fixes` table
```sql
CREATE TABLE session_suggested_fixes (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
session_id UUID NOT NULL REFERENCES ai_sessions(id) ON DELETE CASCADE,
account_id UUID NOT NULL REFERENCES accounts(id),
title VARCHAR(200) NOT NULL,
description TEXT NOT NULL,
confidence_pct INTEGER NOT NULL CHECK (confidence_pct BETWEEN 0 AND 100),
script_template_id UUID NULL REFERENCES script_templates(id), -- null if no template match
ai_drafted_script TEXT NULL, -- populated if no template match
ai_drafted_parameters JSONB NULL, -- AI's proposed parameterization
user_decision VARCHAR(32) NULL CHECK (user_decision IN ('one_off', 'draft_template', 'build_template', 'dismissed')),
superseded_at TIMESTAMPTZ NULL, -- set when a new suggestion replaces this one
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX idx_session_suggested_fixes_session ON session_suggested_fixes(session_id) WHERE superseded_at IS NULL;
```
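A session can have multiple suggested fixes over time as the AI's understanding evolves. Only one is active (`superseded_at IS NULL`) at a time. The partial index above is not unique, so the service layer is responsible for superseding the previous fix whenever it records a new one.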
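The single-active-fix rule can be sketched in memory as follows. The dataclass and function names are hypothetical and stand in for whatever ORM model the codebase uses. A real implementation would presumably run both steps in one database transaction.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class SuggestedFix:
    """Illustrative stand-in for a session_suggested_fixes row."""
    title: str
    superseded_at: Optional[datetime] = None


def add_suggested_fix(
    fixes: List[SuggestedFix],
    new_fix: SuggestedFix,
    now: Optional[datetime] = None,
) -> SuggestedFix:
    """Supersede any currently active fix, then append the new one."""
    now = now or datetime.now(timezone.utc)
    for fix in fixes:
        if fix.superseded_at is None:
            fix.superseded_at = now
    fixes.append(new_fix)
    return new_fix


def active_fix(fixes: List[SuggestedFix]) -> Optional[SuggestedFix]:
    """Return the one active fix, or None. More than one active is a bug."""
    active = [fix for fix in fixes if fix.superseded_at is None]
    if len(active) > 1:
        raise RuntimeError("more than one active suggested fix")
    return active[0] if active else None
```

`active_fix` returning `None` corresponds to the 404 case of the `suggested-fixes/active` endpoint.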
### 4.4 New `draft_templates` table
Backing store for Option 2 in the three-option dialog — scripts generated during sessions that are pending templatization.
```sql
CREATE TABLE draft_templates (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
account_id UUID NOT NULL REFERENCES accounts(id),
source_session_id UUID NOT NULL REFERENCES ai_sessions(id),
source_user_id UUID NOT NULL REFERENCES users(id),
script_body TEXT NOT NULL,
proposed_parameters JSONB NOT NULL, -- {"parameters": [{"key": "...", "label": "...", "type": "..."}]}
proposed_name VARCHAR(200) NULL,
proposed_category_id UUID NULL REFERENCES script_categories(id),
status VARCHAR(32) NOT NULL DEFAULT 'pending' CHECK (status IN ('pending', 'accepted', 'rejected')),
resolved_at TIMESTAMPTZ NULL, -- when the user acted on the draft
promoted_template_id UUID NULL REFERENCES script_templates(id), -- if accepted, the created template
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
```
Accepted draft templates produce a new `script_templates` row and record the source session for provenance display.
### 4.5 Extension to `script_templates`
```sql
ALTER TABLE script_templates
ADD COLUMN source_session_id UUID NULL REFERENCES ai_sessions(id),
ADD COLUMN source_user_id UUID NULL REFERENCES users(id),
ADD COLUMN source_ticket_ref VARCHAR(64) NULL; -- e.g. "CW #48307" for display
```
These fields power the provenance chip in the Script Library: *"generated from CW #48307 · resolved by M. Davis · used 7 times"*.
### 4.6 Per-account settings
```sql
ALTER TABLE account_settings
ADD COLUMN templatize_prompt_enabled BOOLEAN NOT NULL DEFAULT true;
```
Controls whether the post-resolve templatize prompt appears. Toggleable from the prompt's footer ("Don't ask me again for this team") and from admin settings.
---
## 5. API endpoints
All endpoints follow ResolutionFlow conventions: `/api/v1/` prefix, JWT auth, tenant-scoped via RLS.
### 5.1 Session facts
```
GET    /api/v1/sessions/{id}/facts            List facts for a session (ordered by created_at ASC)
POST   /api/v1/sessions/{id}/facts            Create a manual fact (user_note source_type)
PATCH  /api/v1/sessions/{id}/facts/{fact_id}  Edit fact text or summary (only for user_note or AI-synthesized facts)
DELETE /api/v1/sessions/{id}/facts/{fact_id}  Soft-delete
POST   /api/v1/sessions/{id}/facts/promote    Promote a question answer or check result to a fact
                                              Body: { source_type, source_ref, proposed_text, proposed_summary }
                                              Returns the created fact. Used by the AI synthesis flow and
                                              by the engineer's explicit "promote to What we know" action.
```
### 5.2 Suggested fixes
```
GET  /api/v1/sessions/{id}/suggested-fixes/active   Returns the current active fix (superseded_at IS NULL) or 404
POST /api/v1/sessions/{id}/suggested-fixes/{fix_id}/decision
     Body: { decision: "one_off" | "draft_template" | "build_template" | "dismissed" }
     Records the user's path choice. Server-side side effects:
       - one_off: generates script via ScriptTemplateEngine, returns rendered script
       - draft_template: same as one_off, plus creates draft_templates row
       - build_template: redirects to full template creation flow
       - dismissed: marks fix as superseded
```
### 5.3 Draft templates (post-resolve flow)
```
GET  /api/v1/draft-templates              List pending drafts for the current user's account
                                          (used by the Script Library "X scripts ready to review" notification)
GET  /api/v1/draft-templates/{id}         Get a single draft including its proposed parameterization
POST /api/v1/draft-templates/{id}/accept  Body: { name, category_id, parameters_schema, edits }
                                          Creates a new script_templates row with source_session_id set,
                                          sets draft status to 'accepted', returns the new template
POST /api/v1/draft-templates/{id}/reject  Sets status to 'rejected'
```
### 5.4 Resolution notes and escalation packages
```
POST /api/v1/sessions/{id}/resolution-note/preview     Generates the draft resolution note from current session state
                                                       WITHOUT posting. Returns { markdown, target_ticket_ref }.
                                                       Called when the task lane renders and refreshed whenever
                                                       facts/suggested fix change.
POST /api/v1/sessions/{id}/resolution-note/post        Body: { markdown } (engineer-edited version)
                                                       Posts to the linked PSA ticket, updates ticket status if configured,
                                                       marks session resolved.
POST /api/v1/sessions/{id}/escalation-package/preview  Same pattern for escalation
POST /api/v1/sessions/{id}/escalation-package/post     Posts and transitions session to escalated state
```
---
## 6. Services to implement
### 6.1 `FactSynthesisService` (new)
**Location:** `services/fact_synthesis_service.py`
**Purpose:** Converts question answers and diagnostic check results into candidate facts. Called by the AI pipeline when the LLM emits a `[PROMOTE]` marker, and by explicit engineer action.
**Key methods:**
- `synthesize_from_question(question_id: UUID, raw_answer: str) -> dict` — returns `{proposed_text, proposed_summary}` via LLM call. The summary is the short provenance label ("rules out tenant/license").
- `synthesize_from_check(check_id: UUID, check_output: str) -> dict` — same pattern for diagnostic check output.
- `create_fact(session_id, source_type, source_ref, text, summary, user_id) -> SessionFact` — persists the fact.
**Prompt engineering note:** The synthesis prompt should be conservative — short, factual statements. Hallucinated specifics are a trust-killer. The prompt must explicitly instruct: *"Use only information present in the answer/output. If the answer does not contain a substantive fact, return null."*
### 6.2 `ResolutionNoteGeneratorService` (new)
**Location:** `services/resolution_note_generator.py`
**Purpose:** Produces the structured resolution note markdown from session state.
**Input:** session_id
**Output:** `{markdown: str, target_ticket_ref: str | None}`
**Template structure:**
```markdown
## Problem
{ai-synthesized one-paragraph problem statement, pulling from session description + incident header}
## What we confirmed
{bulleted list of session_facts, grouped by source_type}
## Root cause
{ai-synthesized from the active suggested fix + facts}
## Resolution
{description of the fix applied, parameters used if a script ran, outcome}
```
The service pulls from four data sources: `ai_sessions`, `session_facts`, `session_suggested_fixes` (active), and `script_generations` (if scripts ran during the session). Passwords in `script_generations.parameters_used` must be redacted (already a Script Generator pattern per the existing plan).
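The redaction requirement could look roughly like this. It is a sketch under two assumptions: that `parameters_used` is a flat key-to-value mapping, and that the parameter schema uses the `{key, label, type}` shape shown for `draft_templates.proposed_parameters`. The existing Script Generator redaction helper is not shown in this document and may differ.

```python
from typing import Any, Dict, List

REDACTED = "[REDACTED]"  # hypothetical mask string


def redact_passwords(
    parameters_used: Dict[str, Any],
    parameter_schema: List[Dict[str, str]],
) -> Dict[str, Any]:
    """Return a copy of parameters_used with password-typed values masked."""
    password_keys = {
        param["key"] for param in parameter_schema if param.get("type") == "password"
    }
    return {
        key: (REDACTED if key in password_keys else value)
        for key, value in parameters_used.items()
    }
```

Applying the mask before the parameters reach the note template means a secret never enters the LLM prompt or the posted note.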
**Critical:** This service is called on every fact/suggestion change to keep the preview live. Cache aggressively — LLM calls for every keystroke will blow the budget. Invalidate the cache on any write to session_facts or session_suggested_fixes.
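One way to make the preview cache safe is to key it on a fingerprint of the state that feeds the note, so any write to facts or the active fix changes the key. The class below is an in-process sketch with hypothetical names. A production version would more likely sit in a shared cache.

```python
import hashlib
import json
from typing import Callable, Dict, List, Optional, Tuple


def state_fingerprint(facts: List[dict], active_fix: Optional[dict]) -> str:
    """Hash the inputs that determine the note; key order does not matter."""
    payload = json.dumps({"facts": facts, "fix": active_fix}, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class PreviewCache:
    """Caches one generated preview per session, keyed by state fingerprint."""

    def __init__(self) -> None:
        # session_id -> (fingerprint, markdown)
        self._entries: Dict[str, Tuple[str, str]] = {}

    def get_or_generate(
        self,
        session_id: str,
        facts: List[dict],
        active_fix: Optional[dict],
        generate: Callable[[], str],
    ) -> str:
        """Call generate() (the LLM call) only when the session state changed."""
        fingerprint = state_fingerprint(facts, active_fix)
        cached = self._entries.get(session_id)
        if cached is not None and cached[0] == fingerprint:
            return cached[1]
        markdown = generate()
        self._entries[session_id] = (fingerprint, markdown)
        return markdown

    def invalidate(self, session_id: str) -> None:
        """Explicit invalidation hook for writes to facts or suggested fixes."""
        self._entries.pop(session_id, None)
```

With fingerprint keys, the explicit `invalidate` call becomes a safety net, because a stale entry can no longer match changed state.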
### 6.3 `EscalationPackageGeneratorService` (new)
**Location:** `services/escalation_package_generator.py`
Same structure as `ResolutionNoteGeneratorService`, but with a handoff-oriented template:
```markdown
## Problem
...
## What we've confirmed
...
## What we've tried
{list of diagnostic_checks run with their outcomes, scripts generated}
## Current hypothesis
{active suggested fix description}
## Suggested next steps
{ai-synthesized from the gap between facts and a complete resolution}
```
### 6.4 `TemplateExtractionService` (new)
**Location:** `services/template_extraction_service.py`
**Purpose:** Given a concrete rendered script and session context, propose a parameterization.
**Input:** `{script_body: str, session_context: dict, ticket_context: dict}`
**Output:** `{parameters: [{key, label, type, inferred_from}], templated_body: str}`
**Implementation approach:**
- LLM call with a structured prompt: "Given this script that resolved a ticket, identify values that would change for a different invocation. Propose a parameter schema following the Script Generator conventions (text / password / select / boolean / multi_text / number / textarea)."
- Post-process to ensure the proposed template renders back to the original script when given the extracted parameter values.
- Conservative default: prefer fewer parameters. If a value looks environment-agnostic (e.g. a command name), don't parameterize it.
This service is the engine behind Option 2 and Option 3 of the three-option dialog, and behind the post-resolve templatize prompt.
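The round-trip post-processing step described above can be sketched as a pure function. It assumes the `{{ key }}` placeholder syntax shown in the templatize mockup. The real `ScriptTemplateEngine` rendering rules are not given here, so `render_template` is an illustrative stand-in.

```python
import re
from typing import Dict

# Matches placeholders such as {{ gateway_host }} (whitespace optional).
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def render_template(templated_body: str, values: Dict[str, object]) -> str:
    """Substitute each placeholder with its value; KeyError if one is missing."""
    return PLACEHOLDER.sub(lambda match: str(values[match.group(1)]), templated_body)


def round_trips(script_body: str, templated_body: str, values: Dict[str, object]) -> bool:
    """True when the proposed template renders back to the original script."""
    try:
        return render_template(templated_body, values) == script_body
    except KeyError:
        return False
```

A proposal that fails `round_trips` should be discarded or retried instead of being shown to the engineer, since it no longer describes the script that actually resolved the ticket.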
### 6.5 Extend `PSAWritebackService` (existing)
Add methods:
- `post_resolution_note(session_id, markdown) -> {external_id, posted_at}`
- `post_escalation_package(session_id, markdown) -> {external_id, posted_at}`
- `transition_ticket_status(ticket_ref, new_status) -> {success, verified_status}`
The `transition_ticket_status` method must verify the status change took effect (per the existing ConnectWise integration principle: "never told 'success' when CW silently rejected the change").
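A minimal sketch of the verify-after-write pattern follows. The `set_status` and `get_status` callables are hypothetical placeholders for the ConnectWise client calls, which are not shown in this document.

```python
from typing import Callable, Dict


def transition_ticket_status(
    ticket_ref: str,
    new_status: str,
    set_status: Callable[[str, str], None],
    get_status: Callable[[str], str],
) -> Dict[str, object]:
    """Request the status change, then re-fetch and compare before reporting."""
    set_status(ticket_ref, new_status)
    verified_status = get_status(ticket_ref)  # never trust the write response alone
    return {
        "success": verified_status == new_status,
        "verified_status": verified_status,
    }
```

Callers should treat `success: False` as a hard failure and surface `verified_status` to the engineer.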
### 6.6 Model selection per service
Each AI-calling service must use a configurable model string from application settings, not a hardcoded model. Use these defaults:
```python
FACT_SYNTHESIS_MODEL = "claude-haiku-4-5-20251001" # short transformation, latency-sensitive
RESOLUTION_NOTE_MODEL = "claude-sonnet-4-6" # customer-facing artifact, quality matters
ESCALATION_PACKAGE_MODEL = "claude-sonnet-4-6" # same
TEMPLATE_EXTRACTION_MODEL = "claude-sonnet-4-6" # creates persistent library artifact
MAIN_CONVERSATION_MODEL = "claude-sonnet-4-6" # primary FlowPilot chat
```
Do not hardcode model strings at call sites. Every new service must read from settings with a service-specific key.
**Instrumentation requirement:** log a `disputed_fact_rate` metric for fact synthesis — the percentage of AI-synthesized facts that engineers subsequently edit or delete. If this exceeds 10% over a 500-session window, escalate `FACT_SYNTHESIS_MODEL` to `claude-sonnet-4-6`. If under 5%, Haiku is performing correctly.
Do not use Opus 4.7 for any of these services at current scale.
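The `disputed_fact_rate` rule above reduces to a small calculation. The sketch below uses hypothetical function names, and the "monitor" outcome for rates between 5% and 10% is an assumption, since the document only defines the two outer thresholds.

```python
def disputed_fact_rate(total_ai_facts: int, edited_or_deleted: int) -> float:
    """Percentage of AI-synthesized facts that engineers later edited or deleted."""
    if total_ai_facts == 0:
        return 0.0
    return 100.0 * edited_or_deleted / total_ai_facts


def model_recommendation(rate_pct: float) -> str:
    """Map the rate over the 500-session window to an action."""
    if rate_pct > 10.0:
        return "escalate"  # move FACT_SYNTHESIS_MODEL to the larger model
    if rate_pct < 5.0:
        return "keep"      # the smaller model is performing correctly
    return "monitor"       # assumption: no action defined for the 5-10% band
```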
---
## 7. Frontend components
### 7.1 Routes to change
| Current route | New route | Action |
|---|---|---|
| `/assistant` | `/pilot` | **Rename** the route. The existing page moves. `/assistant` permanently redirects to `/pilot` with no sunset date. |
| `/pilot` (if it exists as a separate guided flow) | REMOVED | Collapse into the unified surface. |
| `/pilot/session/:id` | `/pilot/session/:id` | No change (this is where the unified session UI lives) |
Sidebar nav entry renames from "ResolutionAssist" to "FlowPilot" with the cockpit icon.
### 7.2 New React components
Under `src/components/pilot/`:
```
TaskLane.tsx                     -- The right-side panel, owns all four sections
sections/
  WhatWeKnow.tsx                 -- New component for the facts list
  WhatWeKnowItem.tsx             -- Single fact card with provenance line
  AddNoteButton.tsx              -- "+ Add a note" inline composer
  Questions.tsx                  -- Existing questions rendering (moved if already present)
  DiagnosticChecks.tsx           -- Existing checks rendering (moved if already present)
  SuggestedFix.tsx               -- New or refactored component for the suggested fix card
ResolveButton.tsx                -- The Resolve CTA at the bottom of the task lane
ResolutionNotePreview.tsx        -- Floating popover anchored to Resolve button
EscalatePackagePreview.tsx       -- Same pattern for Escalate
ScriptGenInline/                 -- Script Generator embedded in session context
  TemplateMatchPanel.tsx         -- Scene 1 mockup: template pre-filled
  NoTemplateDialog.tsx           -- Scene 2 mockup: three-option dialog
  TemplatizePrompt.tsx           -- Scene 3 mockup: post-resolve prompt
  ParameterizationPreview.tsx    -- Shared component: script with highlighted params
```
### 7.3 Component behavior contracts
**`WhatWeKnowItem`**
- Props: `{fact: SessionFact, onEdit, onDelete}`
- Renders the fact text, a green checkmark, and the provenance line with source-type color coding
- Clicking the fact text opens inline edit (only for `user_note` and `ai_synthesis` sources — question/check facts are read-only, edit the source instead)
**`TaskLane`**
- Subscribes to a session state hook that polls for fact / question / check / suggested-fix updates
- On any state change, calls `POST /api/v1/sessions/{id}/resolution-note/preview` to refresh the ResolutionNotePreview
- Debounce preview refresh to 500ms to avoid LLM spam
**`NoTemplateDialog`** (three-option dialog)
- Props: `{suggestedFix, onDecision}`
- Renders the three cards with the middle (draft_template) marked as recommended
- `onDecision` posts to `/api/v1/sessions/{id}/suggested-fixes/{fix_id}/decision` and either opens the Script Generator (one_off / draft_template) or navigates to full template creation (build_template)
**`TemplatizePrompt`**
- Rendered after successful Resolve when a draft template exists for the session
- Fetches proposed parameters from the draft template record
- Save button posts to `/api/v1/draft-templates/{id}/accept`
---
## 8. AI prompt changes
The existing FlowPilot / ResolutionAssist system prompt needs updates to emit the new markers.
### 8.1 New marker: `[PROMOTE]`
Used to surface facts to What we know. Syntax:
```
[PROMOTE]
source_type: question
source_ref: {question_id}
text: OWA login and send/receive confirmed working for jsmith
summary: rules out tenant/license
[/PROMOTE]
```
The AI should emit `[PROMOTE]` blocks in the same message that answers or processes a question/check, so the fact appears in What we know simultaneously with the chat acknowledgment.
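On the backend, extracting these blocks from a model message might look like the sketch below. The function names are hypothetical, and treating only `source_type` and `text` as mandatory is an assumption. The document does not say which fields the parser must require.

```python
import re
from typing import Dict, List

# Captures everything between [PROMOTE] and [/PROMOTE], across lines.
PROMOTE_BLOCK = re.compile(r"\[PROMOTE\]\s*\n(.*?)\n\s*\[/PROMOTE\]", re.DOTALL)
REQUIRED_FIELDS = ("source_type", "text")  # assumption, see lead-in


def parse_promote_blocks(message: str) -> List[Dict[str, str]]:
    """Return one dict of `key: value` fields per well-formed [PROMOTE] block."""
    facts = []
    for match in PROMOTE_BLOCK.finditer(message):
        fields: Dict[str, str] = {}
        for line in match.group(1).splitlines():
            # Split on the first colon only, so values may contain colons.
            key, separator, value = line.partition(":")
            if separator:
                fields[key.strip()] = value.strip()
        if all(fields.get(name) for name in REQUIRED_FIELDS):
            facts.append(fields)
    return facts


def strip_promote_blocks(message: str) -> str:
    """Remove marker blocks so they never render in the chat transcript."""
    return PROMOTE_BLOCK.sub("", message).strip()
```

Blocks that fail the required-field check are dropped silently in this sketch. Logging them would help spot prompt regressions.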
### 8.2 New marker: `[SUGGEST_FIX]`
```
[SUGGEST_FIX]
title: Clear cached credentials + rebuild Outlook profile
description: Stale cached credential in Credential Manager is holding the pre-reset token...
confidence: 94
script_template_slug: clear-outlook-credentials # or omitted if no template match
ai_drafted_script: |                           # only if no template match
  # Generated by FlowPilot...
  ...
[/SUGGEST_FIX]
```
### 8.3 Removed markers
The old `[FORK]` marker from the ResolutionAssist prompt is removed. Forks were a Guided-mode concept; in the unified model, they're replaced by Questions with mutually exclusive answer options.
---
## 9. Implementation phases
Each phase ends with a git commit and verification step. Do not advance to the next phase until verification passes.
### Phase 0 — Prompt caching infrastructure (prerequisite)
A codebase audit revealed that prompt caching is only implemented in `assistant_chat_service.py` (the file being deprecated). Every other Anthropic API call site — including all of FlowPilot's 7 call sites through `AnthropicProvider` — is uncached. Phase 0 must land before Phase 2 starts because new services built in Phase 2 will inherit caching from `AnthropicProvider` automatically once it's fixed.
**Deliverables:**
- **0.1** Promote `AnthropicProvider.generate_json()` and `generate_text_stream()` in `ai_provider.py` to the cached pattern currently implemented in `assistant_chat_service.py:_call_anthropic_cached()`. Convert the `system` string parameter to a structured system block list with `cache_control: {"type": "ephemeral"}` on the static portion. Add a second breakpoint on the last history message. For the streaming variant, capture the final usage object via `get_final_message()`. Log `cache_read_input_tokens` and `cache_creation_input_tokens` on every response.
- **0.2** Update `integrations.py:557` (`/tickets/ai-parse`) to move the members list and team-stable boards data into a cached system block.
> **Phase 0.2 — pending target endpoint.** The `/tickets/ai-parse` endpoint described in the original migration doc does not exist in the codebase as of this commit. When this endpoint is built, apply the cached-system-block pattern:
>
> ```python
> system_blocks = [
>     {"type": "text", "text": members_json, "cache_control": {"type": "ephemeral"}},  # cacheable: team-stable
>     {"type": "text", "text": boards_json, "cache_control": {"type": "ephemeral"}},   # cacheable: team-stable
>     {"type": "text", "text": engineer_description},                                  # uncached: per-request
> ]
> ```
>
> Remove this note when the endpoint is implemented and the pattern applied.
- **0.3** Add `cache_control` to one-shot generators: `ai_tree_generator`, `kb_conversion`, `ai_fix`, `script_builder`. Same pattern as 0.1.
- **0.4** Extract the caching logic from `assistant_chat_service.py:_call_anthropic_cached()` into `AnthropicProvider` and delete `_call_anthropic_cached`. `assistant_chat_service` should call the provider like every other service. This prevents two canonical implementations of the same pattern.
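The request shape described in 0.1 can be sketched without calling the API at all. The function name and the split into static and dynamic system text are assumptions. The actual `AnthropicProvider` signatures are not shown in this document.

```python
from typing import Any, Dict, List


def build_cached_request(
    static_system: str,
    dynamic_system: str,
    history: List[Dict[str, str]],
) -> Dict[str, Any]:
    """Build `system` and `messages` payloads with two cache breakpoints.

    Breakpoint 1: the static system block. Breakpoint 2: the last history
    message. `history` items are {"role": ..., "content": ...} with plain text.
    """
    system: List[Dict[str, Any]] = [
        {"type": "text", "text": static_system, "cache_control": {"type": "ephemeral"}}
    ]
    if dynamic_system:
        # Per-request text goes after the cached prefix and is left uncached.
        system.append({"type": "text", "text": dynamic_system})

    messages: List[Dict[str, Any]] = []
    for index, turn in enumerate(history):
        block: Dict[str, Any] = {"type": "text", "text": turn["content"]}
        if index == len(history) - 1:
            block["cache_control"] = {"type": "ephemeral"}
        messages.append({"role": turn["role"], "content": [block]})
    return {"system": system, "messages": messages}
```

Keeping the static text first matters because caching works on a prefix: anything that varies per request must come after the last breakpoint it should not invalidate.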
**Verification:**
- Hit any FlowPilot endpoint twice within 5 minutes. First call shows `cache_creation_input_tokens > 0`, second call shows `cache_read_input_tokens > 0`.
- If the second call returns zero cache reads, inspect the prefix for silent invalidators (timestamps, unsorted JSON keys, varying tool list ordering). Fix before proceeding.
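Two small helpers can support this verification step: one that serializes cached JSON deterministically (removing the unsorted-keys invalidator), and one that classifies a response's usage counters. Both names are hypothetical. The usage field names are the ones this document already asks to be logged.

```python
import json
from typing import Any, Dict


def stable_json(data: Any) -> str:
    """Serialize with sorted keys and fixed separators so the prefix is byte-stable."""
    return json.dumps(data, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def cache_status(usage: Dict[str, int]) -> str:
    """Classify a response as a cache hit, a cache write, or uncached."""
    if usage.get("cache_read_input_tokens", 0) > 0:
        return "hit"
    if usage.get("cache_creation_input_tokens", 0) > 0:
        return "write"
    return "uncached"
```

In the verification run, the first call should classify as `write` and the second as `hit`. Two consecutive `write` results point to a prefix that changes between calls.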
```
git commit -m "feat(ai): promote AnthropicProvider to cached pattern, consolidate caching implementation"
```
**Dependencies:**
- Phase 1 (route rename and schema) can run in parallel with Phase 0.
- Phase 2 (What we know) must not start until Phase 0 is complete and verified.
### Phase 1 — Data model and route rename (backend + routing only)
**Deliverables:**
- Alembic migration creating `session_facts`, `session_suggested_fixes`, `draft_templates` tables and the column additions to `ai_sessions`, `script_templates`, `account_settings`
- All tables include `account_id` and have RLS policies following the multi-tenant architecture (per existing project standard)
- `/assistant` → `/pilot` route rename with permanent redirect (stays in place indefinitely; no sunset date)
- Sidebar nav entry rename
- No UI changes yet beyond the nav label
**Verification:**
- Run migration on a fresh dev database
- Confirm RLS policies active via the existing CI grep check for `tenant_filter()`
- Navigate to `/assistant` — should 301 to `/pilot`
- Navigate to `/pilot` — should render the existing ResolutionAssist UI with the sidebar entry now reading "FlowPilot"
```
git commit -m "feat(pilot): rename /assistant to /pilot, add session_facts + suggested_fixes + draft_templates schema"
```
### Phase 2 — What we know (task lane + service + API)
**Deliverables:**
- `FactSynthesisService` and its LLM prompt
- Fact CRUD API endpoints
- `WhatWeKnow`, `WhatWeKnowItem`, `AddNoteButton` components
- Task lane layout adjustment: What we know section renders above Questions
- Counter in task lane header updates to `X / Y answered` format
- AI prompt updated to emit `[PROMOTE]` markers; backend parses them and creates facts
**Verification:**
- Open a session, answer a question; within 2 seconds a fact should appear in What we know with correct provenance
- Click "+ Add a note", type a manual fact, confirm it appears with `source_type: user_note`
- Run a diagnostic check, confirm the check result promotes to a fact
- Facts persist across page reloads
- RLS: a user from a different account cannot read or write facts for this session
```
git commit -m "feat(pilot): add What we know section with fact synthesis"
```
### Phase 3 — Suggested fix + resolution note preview
**Deliverables:**
- `session_suggested_fixes` API endpoints and data flow
- `SuggestedFix` component in the task lane
- AI prompt updated to emit `[SUGGEST_FIX]` markers
- `ResolutionNoteGeneratorService` and preview endpoint
- `ResolutionNotePreview` floating popover anchored to Resolve button
- Preview refreshes on fact / suggested-fix changes (debounced)
**Verification:**
- Session with ≥3 facts and an active suggested fix shows a populated Resolve preview
- Editing a fact updates the preview within 1 second
- Preview markdown renders correctly with all four sections (Problem / What we confirmed / Root cause / Resolution)
- Preview contains no hallucinated information not present in session state (human review of 5 real-ish sessions)
```
git commit -m "feat(pilot): add suggested fix tracking and Resolve note preview"
```
### Phase 4 — Resolve and Escalate PSA writebacks
**Deliverables:**
- `transition_ticket_status` method with CW verification
- `post_resolution_note` endpoint and CW integration
- Resolve button fires: post note → transition status → mark session resolved → show templatize prompt (if applicable)
- `EscalationPackageGeneratorService` and parallel flow for Escalate
- Escalate button fires: post package → transition status → mark session escalated → route via CW rules
**Verification:**
- Complete a session end-to-end with a ConnectWise test instance
- Click Resolve, edit the preview, confirm post — verify the note appears in CW and status changes to Resolved
- Click Escalate on a different session — verify the package is posted and the ticket routes correctly
- Attempt to Resolve without a linked PSA ticket — should mark the session resolved without erroring, note stored in `resolution_note_markdown` only
```
git commit -m "feat(pilot): wire Resolve and Escalate to ConnectWise writeback"
```
### Phase 5 — Script Generator inline integration
**Deliverables:**
- `ScriptGenInline/TemplateMatchPanel` — when suggested fix has `script_template_id`, clicking the fix opens this panel with pre-filled parameters from session context
- Parameter pre-fill logic: pulls from session facts, ticket context (company configs), and AI-suggested values in the `[SUGGEST_FIX]` marker
- `ScriptGenInline/NoTemplateDialog` — three-option dialog when no template match
- User decision persisted on `session_suggested_fixes.user_decision`
- `TemplateExtractionService` for generating parameterization proposals
- Script generation flow produces a `script_generations` record linked to the session (existing Script Generator behavior)
**Verification:**
- Session with a template-matched suggested fix: clicking opens generator with ≥2 pre-filled parameters
- Session with a custom script suggested fix: dialog appears with three options, script preview shows parameters highlighted
- All three paths end correctly: one-off generates and closes, draft creates `draft_templates` row and generates, build_template opens full template creation
- `⌘K` → "script" anywhere in a session opens the generator directly
```
git commit -m "feat(pilot): integrate Script Generator inline with suggested fixes"
```
### Phase 6 — Post-resolve templatize prompt
**Deliverables:**
- `TemplatizePrompt` component
- Logic: after Resolve success, check for pending `draft_templates` rows for this session; if any, show the prompt
- Accept flow creates a new `script_templates` row with `source_session_id`, `source_user_id`, `source_ticket_ref` set
- "Don't ask me again" writes to `account_settings.templatize_prompt_enabled`
- Script Library sidebar shows a small badge when `draft_templates` with `status='pending'` exist for the current user
**Verification:**
- Resolve a session where the engineer picked Option 2 — templatize prompt appears with AI-proposed parameters
- Accept the prompt — new template appears in the Script Library with the provenance chip ("generated from CW #...")
- Skip the prompt — draft marked rejected, Script Library shows no new template
- Toggle "don't ask me again" — next session Resolve skips the prompt even with a pending draft
```
git commit -m "feat(pilot): add post-resolve templatize prompt for draft templates"
```
### Phase 7 — Polish
**Deliverables:**
- Visual polish against the mockup files (spacing, colors, animations)
- Loading states for LLM calls (fact synthesis, preview generation, template extraction)
- Empty states (new session with no facts yet, no active suggested fix, no draft templates pending)
- Keyboard shortcuts: `⌘K` (command menu), `⌘↵` (send composer), `⌘G` (generator), `⌘R` (resolve with confirm)
- Responsive behavior: task lane collapses on <1200px viewports into a bottom drawer
**Verification:**
- Compare each major screen side-by-side with the mockup PNG files: spacing and typography within 5px, colors an exact match
- All flows work on a 1280px viewport without horizontal scroll
- Keyboard shortcuts documented in-app via `?` overlay
```
git commit -m "feat(pilot): visual polish and keyboard shortcuts"
```
---
## 10. Design system reference
All components must use the existing ResolutionFlow design system. Pulling the key tokens from the mockup CSS for quick reference — these should already exist in your tokens file; if they don't, add them:
```css
/* Backgrounds */
--bg-0: #070b12; /* page background */
--bg-1: #0d131c; /* sidebar / chrome */
--bg-2: #121a25; /* card / bubble background */
--bg-3: #1a2332; /* raised element */
/* Borders */
--border: rgba(148, 163, 184, 0.12);
--border-strong: rgba(148, 163, 184, 0.22);
/* Text */
--text-primary: #e2e8f0;
--text-secondary: #94a3b8;
--text-tertiary: #64748b;
/* Brand cyan (FlowPilot accent) */
--cyan-400: #22d3ee;
--cyan-500: #06b6d4;
--cyan-600: #0891b2;
--cyan-bg: rgba(34, 211, 238, 0.10);
--cyan-border: rgba(34, 211, 238, 0.30);
/* Semantic */
--success: #34d399; /* Resolve, facts */
--warning: #fbbf24; /* Escalate, proposed parameters */
--danger: #f87171;
--purple: #a78bfa; /* Script Generator / templates */
```
**Typography:**
- Body: IBM Plex Sans, 14px/1.5
- Headings: Bricolage Grotesque, 500 weight, -0.01em letter-spacing
- Code: JetBrains Mono
**Icons:** Phosphor Icons (Duotone) per the memory-recorded design decision to migrate off Lucide.
---
## 11. Non-goals for this migration
Do not build these as part of this work. They belong to later phases of the roadmap.
- **Confidence tiers (Discovery / Exploring / Guided).** We explicitly removed these. The task lane itself is the progress signal.
- **Mode toggle between Guided and Quick ask.** There is one mode.
- **"Convert to guided" promotion flow.** No longer applicable.
- **Team Wiki compilation from resolved sessions.** Tracked separately; depends on this migration but is not part of it.
- **SharePoint integration.** Sequenced after ConnectWise per roadmap.
- **Template marketplace / sharing across accounts.** Tracked under Client Context System roadmap item.
---
## 12. Risks and mitigations
| Risk | Mitigation |
|---|---|
| LLM fact synthesis hallucinates specifics not in the answer | Conservative prompt; engineer can edit/delete any AI-synthesized fact; provenance line shows the source so the engineer can verify |
| Resolution note preview LLM cost at scale | Cache aggressively, invalidate only on session state write; debounce UI updates to 500ms; consider lower-tier model for preview generation (final post-to-CW version can use the better model) |
| ConnectWise silently rejects status change | `transition_ticket_status` must re-fetch and verify; fail loudly if the change didn't stick |
| Template extraction proposes bad parameterization | Engineer reviews before saving; draft templates never silently become real templates; provenance chip lets team admins audit |
| Users lose muscle memory from `/assistant` → `/pilot` rename | Permanent redirect (no sunset date); inline toast on first `/pilot` visit explaining the rename |
| Existing sessions have no `session_facts` entries, so What we know is empty | Acceptable — Phase 2 deliberately does not backfill; facts only accumulate for new or ongoing sessions after deploy. Document in release notes. |
---
## 13. Questions for Michael before implementation starts
These are the decisions Claude Code cannot make unilaterally. Answer these inline in the doc or in chat before kicking off Phase 1.
1. **Keyboard shortcut for Resolve** — I've proposed `⌘R` (with a confirm). Browsers intercept `⌘R` for page reload. Alternative: `⌘⇧R` or no shortcut. Preference?
2. **Default `templatize_prompt_enabled` value** — I defaulted to `true`. If your beta testers find it annoying we'll learn fast, but it's a tradeoff between "every engineer sees the prompt" and "feature gets discovered only by those who know about it".
3. **Resolution note posts immediately, or stage for review?** — Current design: engineer edits preview inline, clicks Confirm & post. Alternative: stage in CW as draft note for a supervisor to approve before posting. Affects MSPs with strict compliance.
---
## End of document

@@ -0,0 +1,960 @@
# FlowPilot Migration — Design & Implementation Doc
> **Target:** Transform `/assistant` (ResolutionAssist) into the new unified `/pilot` (FlowPilot) surface.
> **Audience:** Claude Code (implementation) and Codex (review), reviewed by Michael (owner).
> **Status:** Phase 0 in progress. Phases 1–7 awaiting Phase 0 completion for the AI-dependent work; Phase 1 can run in parallel.
> **Last updated:** April 17, 2026 (post-Codex plan review, reflects Phase 0 audit findings and in-flight implementation decisions)
---
## 0. Prerequisite reading for Claude Code
Before writing any code for Phase 1 or later, read these in order:
1. This document end-to-end.
2. `mockups/01-session-primary.png` — the target state for the main session UI.
3. `mockups/02-script-template-match.png`, `03-script-three-options.png`, `04-script-templatize-prompt.png` — Script Generator integration states.
4. The source HTML files `mockups/01-session-primary.html` and `mockups/02-04-script-integration.html` — authoritative for spacing, colors, and component structure. When CSS or layout questions arise during implementation, these files are the tiebreaker.
Do not proceed to implementation until you have confirmed you understand the following three architectural claims. If any of them are unclear, stop and ask.
1. **There is one AI troubleshooting surface, not two.** The existing split between FlowPilot (guided) and ResolutionAssist (chat) is collapsed into a single chat-primary product called FlowPilot at `/pilot`. The `ai_sessions.session_type` discriminator column is retained for data compatibility, but the product shows one unified UI and no new code branches on `session_type` for UI routing.
2. **The task lane is the load-bearing structural feature.** It is not a sidebar of metadata. It actively tracks diagnostic state: *What we know*, *Questions*, *Diagnostic checks*, *Suggested fix*. Engineers interact with it; facts flow between sections.
3. **Resolve and Escalate are deterministic artifact generators, not free-text prompts.** When an engineer clicks Resolve, a structured summary is generated from task lane state (not from the chat transcript alone) and posted to CW. The summary structure is fixed: *Problem / What we confirmed / Root cause / Resolution*.
### 0.1 Spec drift note for reviewers
This document was originally written against a set of assumptions about the codebase that turned out to be partially incorrect. Two audits (Claude Code's Phase 0 audit and the Codex plan review) surfaced drift. Key corrections already integrated:
- **API namespace is `/api/v1/ai-sessions/{id}/...`**, not the doc's original `/api/v1/sessions/{id}/...`. All route references below reflect this.
- **`pending_task_lane` items do not have stable IDs today.** Phase 2 must assign stable UUIDs when questions/checks are first persisted. `session_facts.source_ref` points to those JSON item IDs.
- **`account_settings` table did not exist.** Phase 1 creates it with a JSONB `preferences` column; settings live in `preferences` until they need their own column.
- **`/tickets/ai-parse` endpoint does not exist.** Phase 0.2 became a doc-only note; no code change.
Any further drift found during implementation should be flagged by the implementer and reconciled in this doc before writing code that assumes the drifted spec.
---
## 1. Why this change
### The current state
- `/assistant` is a chat-primary AI session with a `[QUESTIONS]` and `[DIAGNOSTIC_CHECKS]` task lane.
- `/pilot` was specced as a separate guided, confidence-tiered wizard with a different UI and lifecycle.
- The `FLOWPILOT-AND-RESOLUTIONASSIST.md` design document treated them as two products sharing a backend.
### The problems with the current state
- Two sidebar entries, two session histories, two mental models for engineers to learn.
- The PSA integration scope doubles (writebacks for lifecycle events must be built twice, or built for Pilot and bolted onto Assist).
- The Team Wiki moat depends on structured session artifacts with explicit resolutions — a chat-only mode produces weaker artifacts.
- The cockpit positioning (the core ResolutionFlow brand promise) does not map to a blank chat window.
- Branching into two modes forces a decision onto the engineer ("which mode for this ticket?") that has no right answer.
### The resolution
The existing `/assistant` UI already does most of what `/pilot` was supposed to do — structured questions, diagnostic checks, lifecycle actions in the header. It is closer to the right product than the original doc anticipated. Rather than building Pilot as a second surface, we extend Assist with the missing structural features (*What we know*, auto-generated summaries, escalation packages) and rename it FlowPilot.
### The strategic move
FlowPilot becomes the single canonical troubleshooting surface. Every PSA writeback, every Wiki compilation path, every Script Generator invocation points here. One session shape, one lifecycle, one integration surface.
---
## 2. Terminology used in this document
| Term | Meaning |
|---|---|
| **Session** | A single `ai_sessions` row representing one troubleshooting conversation. |
| **Task lane** | The right-side panel containing What we know, Questions, Diagnostic checks, Suggested fix. |
| **Task lane item ID** | A stable UUID assigned to each question / action / check inside `ai_sessions.pending_task_lane` when first persisted. `session_facts.source_ref` points to these. |
| **Fact** | An item in the What we know section. Has `text`, `source_type` (`question` / `diagnostic_check` / `user_note` / `ai_synthesis`), and `source_ref` (task lane item ID, or null for `user_note` and `ai_synthesis`). |
| **Suggested fix** | The AI's current best-guess resolution path. Has a confidence score and, optionally, a reference to a Script Library template. |
| **Promotion** | The act of a question answer or diagnostic check result being converted into a fact in What we know. Triggered by AI (via `[PROMOTE]` marker), confirmed/editable by engineer. |
| **Resolution note** | The structured document generated when the engineer clicks Resolve. Posted to CW as a ticket note. |
| **Escalation package** | The structured handoff document generated when the engineer clicks Escalate. Posted to CW and attached to the session for the next engineer. |
| **Draft template** | A script generated during a session where the engineer chose "Run now, templatize after resolve." Lives in `draft_templates` until accepted or rejected. |
---
## 3. Target UI — annotated
### 3.1 Primary session view
![Primary session view](mockups/01-session-primary.png)
The session UI is a four-column layout:
1. **Icon rail** (64px wide) — primary app navigation. FlowPilot / Tickets / Trees / Scripts / Wiki. Avatar at bottom.
2. **Session list** (260px wide) — all sessions grouped by state (Active / Recent). Each row shows title, state dot, PSA ticket number, and client name.
3. **Conversation column** (fluid) — the chat thread, composer, and incident header.
4. **Task lane** (380px wide) — *What we know*, *Questions*, *Diagnostic checks*, *Suggested fix*, and the Resolve action at the bottom.
Key visual and behavioral elements numbered against the mockup:
**Incident header (top of conversation column)**
- PSA chip showing `CW #48291` in cyan, monospaced
- Client / contact / priority meta line
- Incident title in Bricolage Grotesque 19px
- Four lifecycle buttons right-aligned: **Pause** (ghost), **Share update** (neutral), **Escalate** (amber), **Resolve** (green)
**Conversation column**
- Standard chat thread with pilot and user avatars
- Pilot uses cyan gradient avatar; user uses purple gradient
- AI messages in `bg-2` bubbles with subtle border; user messages in cyan-tinted bubbles
- Composer at bottom with inline action chips (Attach / Paste logs / Ticket context) and a send button
**Task lane sections, in order:**
1. **What we know** (NEW)
   - Header: `WHAT WE KNOW · 4` (section title + count)
   - Each fact is a card: `bg-2` background, dashed circular green check, fact text, and a provenance line (`from question · rules out tenant/license`)
   - "+ Add a note" button at the bottom for manual facts from the engineer
   - Background has a subtle green-to-transparent gradient to visually distinguish from the rest of the lane
   - Fact editability: **facts sourced from questions or diagnostic checks are read-only at the fact card level** (edit the source question/check instead); **manual notes and AI-synthesis facts are editable**
2. **Questions**
   - Header: `QUESTIONS · 2 unanswered`
   - Each unanswered question: title, AI hint text, Answer / Skip buttons
   - Answered questions dim to 55% opacity with a dashed border and show the resolution inline (`Answered · isolated to jsmith (promoted to What we know)`)
3. **Diagnostic checks**
   - Header: `DIAGNOSTIC CHECKS · 1 / 3 run`
   - "Run remaining 2 checks" button at top when applicable
   - Each check: icon + command name (monospaced), description
   - Completed checks dim and show "Complete · findings promoted to What we know" in green
4. **Suggested fix**
   - Header: `SUGGESTED FIX · 94% confidence`
   - Amber-accented card with fix title and description
   - Clicking opens the Script Generator flow (Section 5)
**Resolve action bar (bottom of task lane)**
- Small hint text ("Summary preview is open →")
- Full-width "Resolve & post to CW" button in green
**Resolution note preview (floating, anchored to Resolve button)**
- A persistent popover, NOT a modal
- Shows the draft resolution note with Problem / What we confirmed / Root cause / Resolution sections
- Displays the target ticket (`CW #48291`) and status change (`Resolved`)
- Edit button opens an inline editor; Confirm & post fires the PSA writeback
### 3.2 Script Generator integration — template match
![Template match flow](mockups/02-script-template-match.png)
When the suggested fix references an existing Script Library template, clicking the fix opens the Script Generator panel in place of (or sliding over) the task lane. Key behavior:
- A **Verified template** badge appears above the parameter form
- Parameters pre-filled from session context get a cyan `from session` tag and a cyan-tinted input background
- Each pre-filled parameter has a hint line explaining the source: *"Pulled from CW company config for Acme Corp"*
- The engineer can adjust any pre-filled value before generating
- `⌘K` → "script" invokes the generator mid-conversation from anywhere in the session
### 3.3 Script Generator integration — no template match (three-option dialog)
![No template match](mockups/03-script-three-options.png)
When no template matches the suggested fix, FlowPilot drafts a session-specific script and presents three paths:
1. **Run as one-off** (neutral outline CTA)
   - Script generated and captured in session documentation, discarded after
   - Tradeoffs: fastest, but team won't benefit next time
2. **Run now, templatize after resolve** (RECOMMENDED, cyan primary CTA)
   - Script generated for this ticket; draft template queued
   - Post-resolve prompt offers to templatize (Section 3.4)
   - Tradeoffs: zero cognitive overhead now, only templatize what works, ~30s review later
3. **Build as template now** (purple outline CTA)
   - Full parameterization upfront
   - Tradeoffs: immediate team benefit, but adds time mid-ticket
The drafted script renders as a code preview above the option cards with the AI's proposed parameters highlighted in amber.
### 3.4 Script Generator integration — post-resolve templatization prompt
![Templatize prompt](mockups/04-script-templatize-prompt.png)
If the engineer picked Option 2 in the three-option dialog and Resolve succeeds, this prompt appears after the resolution note is posted to CW:
- Success banner confirms the resolution posted
- Templatize card shows the script with AI-proposed parameters substituted in as `{{ gateway_host }}`, etc.
- Right pane lists extracted parameters with remove buttons (engineer can adjust)
- Provenance note: *"generated from CW #48307 · resolved by M. Davis"*
- Three actions: Skip / Edit parameters / Save as team template
- "Don't ask me again for this team" opt-out in footer
---
## 4. Data model changes
### 4.1 New columns on `ai_sessions`
```sql
ALTER TABLE ai_sessions
    ADD COLUMN resolution_note_markdown TEXT NULL,
    ADD COLUMN resolution_note_posted_at TIMESTAMPTZ NULL,
    ADD COLUMN resolution_note_external_id VARCHAR(128) NULL, -- CW note ID after posting
    ADD COLUMN escalation_package_markdown TEXT NULL,
    ADD COLUMN escalation_package_posted_at TIMESTAMPTZ NULL,
    ADD COLUMN escalation_package_external_id VARCHAR(128) NULL,
    ADD COLUMN state_version INTEGER NOT NULL DEFAULT 0; -- incremented on any write to facts/suggested_fixes/script_generations; drives preview cache invalidation
```
No migration of `session_type` — the column stays for data compatibility. New sessions default to the unified FlowPilot type. Phase 1 does not branch frontend routing on `session_type`.
### 4.2 New `session_facts` table (the What we know backing store)
```sql
CREATE TABLE session_facts (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    session_id UUID NOT NULL REFERENCES ai_sessions(id) ON DELETE CASCADE,
    account_id UUID NOT NULL REFERENCES accounts(id), -- for RLS, per multi-tenant architecture
    text TEXT NOT NULL,
    source_type VARCHAR(32) NOT NULL CHECK (source_type IN ('question', 'diagnostic_check', 'user_note', 'ai_synthesis')),
    source_ref UUID NULL, -- task lane item ID (from pending_task_lane JSON), null for user_note and ai_synthesis
    source_summary TEXT NULL, -- free-text provenance label, e.g. "rules out tenant/license"
    created_by UUID NOT NULL REFERENCES users(id),
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    deleted_at TIMESTAMPTZ NULL
);

CREATE INDEX idx_session_facts_session ON session_facts(session_id) WHERE deleted_at IS NULL;
CREATE INDEX idx_session_facts_account ON session_facts(account_id);
```
**Important:** `source_ref` is a pointer to a JSON item inside `ai_sessions.pending_task_lane` (not a FK to any table). It has no database-level FK constraint. Enforce integrity at the service layer. Phase 2 includes the work of assigning stable UUIDs to task lane items so `source_ref` has something reliable to point to.
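Because the database cannot enforce this pointer, the check has to live in code. The sketch below shows one way a service-layer guard could look. It assumes `pending_task_lane` is a JSON object with `questions` and `diagnostic_checks` lists whose items carry an `id` string; that shape, the `LANE_KEY` mapping, and the function name are illustrative assumptions, not the actual schema.

```python
from typing import Optional
from uuid import UUID, uuid4

# Source types that must carry a source_ref (Section 4.2).
REF_REQUIRED = {"question", "diagnostic_check"}

# Hypothetical mapping from source_type to the pending_task_lane key.
LANE_KEY = {"question": "questions", "diagnostic_check": "diagnostic_checks"}


def validate_source_ref(pending_task_lane: dict, source_type: str,
                        source_ref: Optional[UUID]) -> None:
    """Raise ValueError when source_ref is inconsistent with source_type."""
    if source_type not in REF_REQUIRED:
        # user_note and ai_synthesis facts have no task lane source.
        if source_ref is not None:
            raise ValueError(f"{source_type} facts must not carry a source_ref")
        return
    if source_ref is None:
        raise ValueError(f"{source_type} facts require a source_ref")
    items = pending_task_lane.get(LANE_KEY[source_type], [])
    if not any(item.get("id") == str(source_ref) for item in items):
        raise ValueError(f"source_ref {source_ref} not found in task lane")
```

A guard like this would run inside the same transaction that inserts the `session_facts` row, so a stale or mistyped reference is rejected before it is persisted.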
### 4.3 New `session_suggested_fixes` table
```sql
CREATE TABLE session_suggested_fixes (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    session_id UUID NOT NULL REFERENCES ai_sessions(id) ON DELETE CASCADE,
    account_id UUID NOT NULL REFERENCES accounts(id),
    title VARCHAR(200) NOT NULL,
    description TEXT NOT NULL,
    confidence_pct INTEGER NOT NULL CHECK (confidence_pct BETWEEN 0 AND 100),
    script_template_id UUID NULL REFERENCES script_templates(id), -- null if no template match
    ai_drafted_script TEXT NULL, -- populated if no template match
    ai_drafted_parameters JSONB NULL, -- AI's proposed parameterization
    user_decision VARCHAR(32) NULL CHECK (user_decision IN ('one_off', 'draft_template', 'build_template', 'dismissed')),
    superseded_at TIMESTAMPTZ NULL, -- set when a new suggestion replaces this one
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE INDEX idx_session_suggested_fixes_session_active ON session_suggested_fixes(session_id) WHERE superseded_at IS NULL;
```
A session can have multiple suggested fixes over time as the AI's understanding evolves. Only one is active (`superseded_at IS NULL`) at a time.
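The partial index above is not declared `UNIQUE`, so the one-active-fix rule is a service-layer invariant. The following in-memory sketch illustrates the supersede step; the `SuggestedFix` dataclass and both function names are hypothetical stand-ins for the real ORM model and service methods.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class SuggestedFix:
    # Hypothetical stand-in for a session_suggested_fixes row.
    title: str
    confidence_pct: int
    superseded_at: Optional[datetime] = None


def add_suggested_fix(fixes: List[SuggestedFix], new_fix: SuggestedFix) -> None:
    """Supersede every currently active fix, then append the new one."""
    now = datetime.now(timezone.utc)
    for fix in fixes:
        if fix.superseded_at is None:
            fix.superseded_at = now
    fixes.append(new_fix)


def active_fix(fixes: List[SuggestedFix]) -> Optional[SuggestedFix]:
    """Return the single fix with superseded_at unset, or None."""
    active = [f for f in fixes if f.superseded_at is None]
    return active[0] if active else None
```

In the real service, the UPDATE that sets `superseded_at` and the INSERT of the new row would presumably share one transaction, so no reader observes zero or two active fixes.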
### 4.4 New `draft_templates` table
Backing store for Option 2 in the three-option dialog — scripts generated during sessions that are pending templatization.
```sql
CREATE TABLE draft_templates (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    account_id UUID NOT NULL REFERENCES accounts(id),
    source_session_id UUID NOT NULL REFERENCES ai_sessions(id),
    source_user_id UUID NOT NULL REFERENCES users(id),
    script_body TEXT NOT NULL,
    proposed_parameters JSONB NOT NULL, -- {"parameters": [{"key": "...", "label": "...", "type": "..."}]}
    proposed_name VARCHAR(200) NULL,
    proposed_category_id UUID NULL REFERENCES script_categories(id),
    status VARCHAR(32) NOT NULL DEFAULT 'pending' CHECK (status IN ('pending', 'accepted', 'rejected')),
    resolved_at TIMESTAMPTZ NULL, -- when the user acted on the draft
    promoted_template_id UUID NULL REFERENCES script_templates(id), -- if accepted, the created template
    created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

CREATE INDEX idx_draft_templates_account_pending ON draft_templates(account_id) WHERE status = 'pending';
```
Accepted draft templates produce a new `script_templates` row and record the source session for provenance display.
### 4.5 Extension to `script_templates`
```sql
ALTER TABLE script_templates
    ADD COLUMN source_session_id UUID NULL REFERENCES ai_sessions(id),
    ADD COLUMN source_user_id UUID NULL REFERENCES users(id),
    ADD COLUMN source_ticket_ref VARCHAR(64) NULL; -- e.g. "CW #48307" for display
```
These fields power the provenance chip in the Script Library: *"generated from CW #48307 · resolved by M. Davis · used 7 times"*.
### 4.6 New `account_settings` table
The codebase has no existing `account_settings` table. Create it with a JSONB grab-bag column for simple settings, plus room for typed columns as settings graduate to needing their own structure.
```sql
CREATE TABLE account_settings (
    account_id UUID PRIMARY KEY REFERENCES accounts(id) ON DELETE CASCADE,
    preferences JSONB NOT NULL DEFAULT '{}',
    created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
```
**Row lifecycle:** rows are created lazily on first write. A `get_setting(account_id, key, default)` helper returns the default when no row exists — no upfront row creation for every account.
**Promotion rule:** settings live in `preferences` (keyed JSON) until they meet one of these thresholds:
- Accessed in a hot path (frequent reads, latency-sensitive)
- Has validation rules that warrant a CHECK constraint
- Participates in joins or aggregations
When a setting graduates, add a typed column in a future migration and update `get_setting` to prefer the typed column over the JSON key.
**Initial contents:** `templatize_prompt_enabled` lives in `preferences` as `{"templatize_prompt_enabled": true}` (effective default when absent). No column needed.
---
## 5. API endpoints
All endpoints follow ResolutionFlow conventions: `/api/v1/` prefix, JWT auth, tenant-scoped via RLS. **All session-related routes use the `/api/v1/ai-sessions/{id}/...` namespace** to match the existing codebase pattern (not the generic `/sessions/` originally specced).
### 5.1 Session facts
```
GET    /api/v1/ai-sessions/{id}/facts    List facts for a session (ordered by created_at ASC)
POST   /api/v1/ai-sessions/{id}/facts    Create a manual fact (user_note source_type)
PATCH  /api/v1/ai-sessions/{id}/facts/{fact_id}    Edit fact text or summary
        Authorization: only user_note and ai_synthesis facts are editable;
        question and diagnostic_check facts return 403 (edit the source instead)
DELETE /api/v1/ai-sessions/{id}/facts/{fact_id}    Soft-delete
POST   /api/v1/ai-sessions/{id}/facts/promote    Promote a question answer or check result to a fact
        Body: { source_type, source_ref, proposed_text, proposed_summary }
        Returns the created fact. Used by the AI synthesis flow and by
        the engineer's explicit "promote to What we know" action.
```
### 5.2 Suggested fixes
```
GET    /api/v1/ai-sessions/{id}/suggested-fixes/active    Returns the current active fix (superseded_at IS NULL) or 404
POST   /api/v1/ai-sessions/{id}/suggested-fixes/{fix_id}/decision
        Body: { decision: "one_off" | "draft_template" | "build_template" | "dismissed" }
        Records the user's path choice. Server-side side effects:
        - one_off: generates script via ScriptTemplateEngine, returns rendered script
        - draft_template: same as one_off, plus creates draft_templates row
        - build_template: returns redirect payload to full template creation flow
        - dismissed: marks fix as superseded
```
### 5.3 Draft templates (post-resolve flow)
```
GET    /api/v1/draft-templates    List pending drafts for the current user's account
        (used by the Script Library "X scripts ready to review" notification)
GET    /api/v1/draft-templates/{id}    Get a single draft including its proposed parameterization
POST   /api/v1/draft-templates/{id}/accept    Body: { name, category_id, parameters_schema, edits }
        Creates a new script_templates row with source_session_id set,
        sets draft status to 'accepted', returns the new template
POST   /api/v1/draft-templates/{id}/reject    Sets status to 'rejected'
```
### 5.4 Resolution notes and escalation packages
```
POST   /api/v1/ai-sessions/{id}/resolution-note/preview    Generates the draft resolution note from current session state
        WITHOUT posting. Returns { markdown, target_ticket_ref }.
        Called when the task lane renders and refreshed whenever
        facts/suggested fix change. Cached by state_version.
POST   /api/v1/ai-sessions/{id}/resolution-note/post    Body: { markdown } (engineer-edited version)
        Posts to the linked PSA ticket, updates ticket status if configured,
        marks session resolved.
POST   /api/v1/ai-sessions/{id}/escalation-package/preview    Same pattern for escalation
POST   /api/v1/ai-sessions/{id}/escalation-package/post    Posts and transitions session to escalated state
```
### 5.5 Preview caching strategy
The resolution note preview and escalation package preview are LLM-generated and refresh on every fact / suggested-fix / script-generation change. To avoid LLM-per-keystroke cost:
- **Cache key:** `(session_id, state_version)` where `state_version` is the `ai_sessions.state_version` integer column
- **Invalidation:** any write to `session_facts`, `session_suggested_fixes`, or `script_generations` for a session atomically increments `ai_sessions.state_version` (a single SQL UPDATE wrapped into the same transaction)
- **Cache backend:** Redis (planned for Session Sharing work; can be in-memory LRU for Phase 3, swapped to Redis when Redis is available)
- **Client debounce:** 500ms on the UI side to batch rapid edits before hitting the preview endpoint
The choice of `state_version` over content hash is deliberate: cheaper to compute (single-integer comparison), easier to debug (logs show explicit version bumps), and makes invalidation failures visible (stale preview would keep showing an old version number).
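A minimal in-memory sketch of the version-keyed cache described above, suitable as a picture of the Phase 3 stand-in before Redis. The `PreviewCache` class and its `generate` callback are hypothetical names; the callback represents the LLM-backed preview generator.

```python
from typing import Callable, Dict, Tuple


class PreviewCache:
    """Caches one preview per session, valid only for a matching state_version."""

    def __init__(self, generate: Callable[[str], str]) -> None:
        self._generate = generate  # stand-in for the LLM-backed generator
        self._entries: Dict[str, Tuple[int, str]] = {}

    def get(self, session_id: str, state_version: int) -> str:
        cached = self._entries.get(session_id)
        if cached is not None and cached[0] == state_version:
            return cached[1]  # hit: session state has not changed
        markdown = self._generate(session_id)  # miss: regenerate once
        # Overwrite the previous entry so stale versions do not accumulate.
        self._entries[session_id] = (state_version, markdown)
        return markdown
```

Keeping a single entry per session is a simplification made here: since `state_version` only moves forward, older versions are assumed never to be requested again.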
---
## 6. Services to implement
### 6.1 `FactSynthesisService` (new)
**Location:** `backend/app/services/fact_synthesis_service.py`
**Purpose:** Converts question answers and diagnostic check results into candidate facts. Called by `unified_chat_service`'s marker parser when the LLM emits a `[PROMOTE]` marker, and by explicit engineer action.
**Key methods:**
- `synthesize_from_question(question_ref: UUID, raw_answer: str) -> dict` — returns `{proposed_text, proposed_summary}` via LLM call. The summary is the short provenance label ("rules out tenant/license").
- `synthesize_from_check(check_ref: UUID, check_output: str) -> dict` — same pattern for diagnostic check output.
- `create_fact(session_id, source_type, source_ref, text, summary, user_id) -> SessionFact` — persists the fact, increments `ai_sessions.state_version`.
**Prompt engineering note:** The synthesis prompt must be conservative. Hallucinated specifics are a trust-killer and would be particularly damaging because facts feed into the resolution note that gets posted to customer tickets. The prompt must explicitly instruct: *"Use only information present in the answer/output. If the answer does not contain a substantive fact, return null."*
### 6.2 `ResolutionNoteGeneratorService` (new)
**Location:** `backend/app/services/resolution_note_generator.py`
**Purpose:** Produces the structured resolution note markdown from session state.
**Input:** `session_id`
**Output:** `{markdown: str, target_ticket_ref: str | None}`
**Template structure:**
```markdown
## Problem
{ai-synthesized one-paragraph problem statement, pulling from session description + incident header}
## What we confirmed
{bulleted list of session_facts, grouped by source_type}
## Root cause
{ai-synthesized from the active suggested fix + facts}
## Resolution
{description of the fix applied, parameters used if a script ran, outcome}
```
The service pulls from four data sources: `ai_sessions`, `session_facts`, `session_suggested_fixes` (active), and `script_generations` (if scripts ran during the session). Passwords in `script_generations.parameters_used` must be redacted (already an existing Script Generator pattern).
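As an illustration of the redaction step, the sketch below masks every parameter whose schema type is `password`. It assumes `parameters_used` is a flat key/value object and that the schema follows the `{"key", "label", "type"}` shape shown in Section 4.4; the function name and mask string are hypothetical, and the existing Script Generator redaction helper should be preferred where it applies.

```python
from typing import Any, Dict, List

REDACTED = "********"  # hypothetical mask; use the existing pattern's value if it differs


def redact_parameters(parameters_used: Dict[str, Any],
                      schema: List[Dict[str, str]]) -> Dict[str, Any]:
    """Return a copy of parameters_used with password-typed values masked."""
    password_keys = {p["key"] for p in schema if p.get("type") == "password"}
    return {
        key: (REDACTED if key in password_keys else value)
        for key, value in parameters_used.items()
    }
```

Redacting before the values reach the prompt, rather than after generation, keeps secrets out of both the LLM request and the posted note.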
**Caching:** keyed by `(session_id, ai_sessions.state_version)` per Section 5.5. Debounced client-side at 500ms.
### 6.3 `EscalationPackageGeneratorService` (new)
**Location:** `backend/app/services/escalation_package_generator.py`
Same structure as `ResolutionNoteGeneratorService` but with a handoff-oriented template:
```markdown
## Problem
...
## What we've confirmed
...
## What we've tried
{list of diagnostic_checks run with their outcomes, scripts generated}
## Current hypothesis
{active suggested fix description}
## Suggested next steps
{ai-synthesized from the gap between facts and a complete resolution}
```
Same caching and invalidation model.
### 6.4 `TemplateExtractionService` (new)
**Location:** `backend/app/services/template_extraction_service.py`
**Purpose:** Given a concrete rendered script and session context, propose a parameterization.
**Input:** `{script_body: str, session_context: dict, ticket_context: dict}`
**Output:** `{parameters: [{key, label, type, inferred_from}], templated_body: str}`
**Implementation approach:**
- LLM call with a structured prompt: "Given this script that resolved a ticket, identify values that would change for a different invocation. Propose a parameter schema following the Script Generator conventions (text / password / select / boolean / multi_text / number / textarea)."
- Post-process to ensure the proposed template renders back to the original script when given the extracted parameter values.
- Conservative default: prefer fewer parameters. If a value looks environment-agnostic (e.g. a command name), don't parameterize it.
This service is the engine behind Option 2 and Option 3 of the three-option dialog, and behind the post-resolve templatize prompt.
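The round-trip post-processing check can be sketched as follows. It assumes placeholders use the `{{ key }}` form shown in Section 3.4 and uses a tiny regex renderer as a stand-in for the real `ScriptTemplateEngine`; `render` and `round_trips` are hypothetical names.

```python
import re
from typing import Any, Dict

# Matches {{ key }} with optional inner whitespace.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(templated_body: str, values: Dict[str, Any]) -> str:
    """Substitute each placeholder with its value; raises KeyError if one is missing."""
    return PLACEHOLDER.sub(lambda m: str(values[m.group(1)]), templated_body)


def round_trips(original: str, templated_body: str, values: Dict[str, Any]) -> bool:
    """True when the proposed template reproduces the original script exactly."""
    try:
        return render(templated_body, values) == original
    except KeyError:
        return False  # the template references a parameter that was not extracted
```

A proposal that fails this check would be discarded or retried rather than shown to the engineer, since a template that cannot reproduce the script that actually resolved the ticket has no verified provenance.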
### 6.5 Extend `PSAWritebackService` (existing)
Add methods using the existing PSA provider registry and `post_note` seam:
- `post_resolution_note(session_id, markdown) -> {external_id, posted_at}`
- `post_escalation_package(session_id, markdown) -> {external_id, posted_at}`
- `transition_ticket_status(ticket_ref, new_status) -> {success, verified_status}`
The `transition_ticket_status` method must **verify by re-fetching** the status after the transition attempt. Failed verification is surfaced as an error, not silent success (per the existing ConnectWise integration principle).
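The verify-by-re-fetch rule can be sketched against a fake provider. `FakeProvider`, its `set_status` / `get_status` methods, and `StatusVerificationError` are hypothetical; the real PSA provider registry will expose its own interface.

```python
class StatusVerificationError(RuntimeError):
    """Raised when the PSA does not report the requested status after a transition."""


class FakeProvider:
    # Hypothetical stand-in for a PSA provider; apply_updates=False simulates
    # a PSA that accepts the call but silently ignores it.
    def __init__(self, status: str, apply_updates: bool = True) -> None:
        self.status = status
        self.apply_updates = apply_updates

    def set_status(self, ticket_ref: str, new_status: str) -> None:
        if self.apply_updates:
            self.status = new_status

    def get_status(self, ticket_ref: str) -> str:
        return self.status


def transition_ticket_status(provider, ticket_ref: str, new_status: str) -> dict:
    provider.set_status(ticket_ref, new_status)
    verified = provider.get_status(ticket_ref)  # re-fetch instead of trusting the write
    if verified != new_status:
        raise StatusVerificationError(
            f"{ticket_ref}: expected {new_status!r}, PSA reports {verified!r}"
        )
    return {"success": True, "verified_status": verified}
```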
### 6.6 Model and capability selection per service
Each AI-calling service must read its model name and MCP flag from application settings, not from hardcoded values. Use these defaults:
```python
# Model tier per service
FACT_SYNTHESIS_MODEL = "claude-haiku-4-5-20251001" # short transformation, latency-sensitive
RESOLUTION_NOTE_MODEL = "claude-sonnet-4-6" # customer-facing artifact, quality matters
ESCALATION_PACKAGE_MODEL = "claude-sonnet-4-6" # same
TEMPLATE_EXTRACTION_MODEL = "claude-sonnet-4-6" # creates persistent library artifact
MAIN_CONVERSATION_MODEL = "claude-sonnet-4-6" # primary FlowPilot chat
# MCP availability per service (true = this service can use MCP tools when available)
FACT_SYNTHESIS_MCP_ENABLED = False # fast transformation, no external lookup needed
RESOLUTION_NOTE_MCP_ENABLED = False # summarizing existing state, not researching
ESCALATION_PACKAGE_MCP_ENABLED = False # same
TEMPLATE_EXTRACTION_MCP_ENABLED = False # purely transforms an existing script
MAIN_CONVERSATION_MCP_ENABLED = True # interactive troubleshooting, grounding matters
SCRIPT_GENERATOR_MCP_ENABLED = True # Microsoft Learn for documentation grounding
```
Do not hardcode model names or MCP flags at call sites. Every new service reads from settings with a service-specific key.
**Instrumentation:** log a `disputed_fact_rate` metric for fact synthesis — the percentage of AI-synthesized facts that engineers subsequently edit or delete. If this exceeds 10% over a 500-session window, escalate `FACT_SYNTHESIS_MODEL` to `claude-sonnet-4-6`. If under 5%, Haiku is performing correctly.
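The metric and its thresholds reduce to simple arithmetic, sketched below. The function names are hypothetical, and the `"monitor"` outcome for the 5-10% band is an assumption, since the text above does not say what happens in that range.

```python
def disputed_fact_rate(total_ai_facts: int, disputed: int) -> float:
    """Percentage of AI-synthesized facts that engineers later edited or deleted."""
    if total_ai_facts == 0:
        return 0.0
    return 100.0 * disputed / total_ai_facts


def model_recommendation(rate_pct: float) -> str:
    """Map the rate over a 500-session window to an action."""
    if rate_pct > 10:
        return "escalate"  # move FACT_SYNTHESIS_MODEL to the Sonnet tier
    if rate_pct < 5:
        return "keep"      # Haiku is performing correctly
    return "monitor"       # assumption: band not specified in this document
```

For example, 24 disputed facts out of 200 gives a rate of 12%, which crosses the escalation threshold; 8 out of 200 gives 4%, which does not.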
**Do not use Opus 4.7 for any of these services at current scale.**
---
## 7. Frontend components
### 7.1 Routes to change
| Current route | New route | Action |
|---|---|---|
| `/assistant` | `/pilot` | Move existing `AssistantChatPage` to `/pilot`. |
| `/assistant/:sessionId` | `/pilot/:sessionId` | Session-deep-links must redirect with the session ID preserved. |
| `/assistant` (bare) | → `/pilot` | **Permanent** 301 redirect. No sunset date. |
| `/assistant/:sessionId` (deep) | → `/pilot/:sessionId` | **Permanent** 301 redirect. |
Sidebar nav entry renames from "ResolutionAssist" to "FlowPilot" with the cockpit icon. Command palette entries, dashboard cards, and session list links that previously pointed to `/assistant` all update to `/pilot`.
### 7.2 New React components
Under `src/components/pilot/`:
```
TaskLane.tsx                    -- The right-side panel, owns all four sections
sections/
  WhatWeKnow.tsx                -- New component for the facts list
  WhatWeKnowItem.tsx            -- Single fact card with provenance line
  AddNoteButton.tsx             -- "+ Add a note" inline composer
  Questions.tsx                 -- Existing questions rendering (moved/refactored from current location)
  DiagnosticChecks.tsx          -- Existing checks rendering (moved/refactored from current location)
  SuggestedFix.tsx              -- New or refactored component for the suggested fix card
ResolveButton.tsx               -- The Resolve CTA at the bottom of the task lane
ResolutionNotePreview.tsx       -- Floating popover anchored to Resolve button
EscalatePackagePreview.tsx      -- Same pattern for Escalate
ScriptGenInline/                -- Script Generator embedded in session context
  TemplateMatchPanel.tsx        -- Scene 1 mockup: template pre-filled
  NoTemplateDialog.tsx          -- Scene 2 mockup: three-option dialog
  TemplatizePrompt.tsx          -- Scene 3 mockup: post-resolve prompt
  ParameterizationPreview.tsx   -- Shared component: script with highlighted params
```
Existing component folders (e.g., `src/components/assistant/`) may be renamed opportunistically, but behavior and route migration matter more than directory-name purity.
### 7.3 Component behavior contracts
**`WhatWeKnowItem`**
- Props: `{fact: SessionFact, onEdit, onDelete}`
- Renders the fact text, a green checkmark, and the provenance line with source-type color coding
- Edit affordance: only shown when `fact.source_type` is `user_note` or `ai_synthesis`. Question/check facts are read-only at the card level (edit the source question/check instead).
- Delete affordance: shown for all facts (soft-delete via DELETE endpoint)
**`TaskLane`**
- Subscribes to a session state hook that polls for fact / question / check / suggested-fix updates
- On any state change (state_version increment), calls `POST /api/v1/ai-sessions/{id}/resolution-note/preview` to refresh the `ResolutionNotePreview`
- Debounce preview refresh to 500ms to avoid LLM spam
**`NoTemplateDialog`** (three-option dialog)
- Props: `{suggestedFix, onDecision}`
- Renders the three cards with the middle (`draft_template`) marked as recommended
- `onDecision` posts to `/api/v1/ai-sessions/{id}/suggested-fixes/{fix_id}/decision` and either opens the Script Generator (one_off / draft_template) or navigates to full template creation (build_template)
**`TemplatizePrompt`**
- Rendered after successful Resolve when a draft template exists for the session AND `account_settings.preferences.templatize_prompt_enabled` is not `false`
- Fetches proposed parameters from the draft template record
- Save button posts to `/api/v1/draft-templates/{id}/accept`
---
## 8. AI prompt changes
The existing FlowPilot / ResolutionAssist system prompt needs updates to emit the new markers. Parser lives in `unified_chat_service` alongside the existing `[QUESTIONS]` / `[DIAGNOSTIC_CHECKS]` parsing — do not create a separate marker pipeline.
### 8.1 New marker: `[PROMOTE]`
Used to surface facts to What we know. Syntax:
```
[PROMOTE]
source_type: question
source_ref: {task_lane_item_uuid}
text: OWA login and send/receive confirmed working for jsmith
summary: rules out tenant/license
[/PROMOTE]
```
The AI should emit `[PROMOTE]` blocks in the same message that answers or processes a question/check, so the fact appears in What we know simultaneously with the chat acknowledgment. `source_ref` points to the stable UUID of the task lane item being promoted (assigned in Phase 2).
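A minimal parsing sketch for the block syntax above, assuming one `key: value` pair per line and no nested markers. The function names are hypothetical; the real parser belongs in `unified_chat_service` next to the existing `[QUESTIONS]` / `[DIAGNOSTIC_CHECKS]` handling.

```python
import re
from typing import Dict, List

# Captures everything between [PROMOTE] and [/PROMOTE], across lines.
PROMOTE_BLOCK = re.compile(r"\[PROMOTE\]\s*\n(.*?)\n\s*\[/PROMOTE\]", re.DOTALL)


def parse_promote_blocks(message: str) -> List[Dict[str, str]]:
    """Extract each [PROMOTE] block as a dict of its key/value lines."""
    blocks = []
    for match in PROMOTE_BLOCK.finditer(message):
        fields: Dict[str, str] = {}
        for line in match.group(1).splitlines():
            # Split on the first colon only, so values may contain colons.
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        blocks.append(fields)
    return blocks


def strip_promote_blocks(message: str) -> str:
    """Remove marker blocks so only the chat text is shown to the engineer."""
    return PROMOTE_BLOCK.sub("", message).strip()
```

Each parsed dict would then be validated (known `source_type`, resolvable `source_ref`) before being handed to the promote endpoint; malformed blocks are better dropped than guessed at, given that facts flow into customer-facing notes.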
### 8.2 New marker: `[SUGGEST_FIX]`
```
[SUGGEST_FIX]
title: Clear cached credentials + rebuild Outlook profile
description: Stale cached credential in Credential Manager is holding the pre-reset token...
confidence: 94
script_template_slug: clear-outlook-credentials # or omitted if no template match
ai_drafted_script: | # only if no template match
# Generated by FlowPilot...
...
[/SUGGEST_FIX]
```
Emitting a new `[SUGGEST_FIX]` supersedes any existing active fix for the session (sets `superseded_at` on the old row).
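The supersession rule can be sketched in memory as follows. This assumes a fix is "active" exactly when `superseded_at` is unset; the `SuggestedFix` dataclass and both helper names are hypothetical stand-ins for the `session_suggested_fixes` row and its service code.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class SuggestedFix:
    title: str
    confidence: int
    superseded_at: Optional[datetime] = None  # None means the fix is still active


def record_suggested_fix(fixes: list, new_fix: SuggestedFix) -> None:
    """Append new_fix, marking any currently active fix as superseded."""
    now = datetime.now(timezone.utc)
    for fix in fixes:
        if fix.superseded_at is None:
            fix.superseded_at = now
    fixes.append(new_fix)


def active_fix(fixes: list) -> Optional[SuggestedFix]:
    """Return the single active fix for the session, if any."""
    return next((f for f in fixes if f.superseded_at is None), None)
```

Old rows are kept rather than deleted, so the session history still shows which fixes were proposed and replaced.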
### 8.3 Removed markers
The old `[FORK]` marker from the ResolutionAssist prompt is removed. Forks were a Guided-mode concept; in the unified model, they're replaced by Questions with mutually exclusive answer options.
---
## 9. Implementation phases
Each phase ends with a git commit and verification step. Do not advance to the next phase until verification passes (or, for Phase 0, the verification step is explicitly deferred to the new dev environment with a tracking TODO).
### Phase 0 — Prompt caching infrastructure (prerequisite)
A codebase audit revealed that prompt caching was only implemented in `assistant_chat_service.py` (the file being deprecated). Every other Anthropic API call site — including all of FlowPilot's 7 call sites through `AnthropicProvider` — was uncached. Phase 0 must land before Phase 2 starts because new services built in Phase 2 will inherit caching from `AnthropicProvider` automatically once it's fixed.
**Deliverables:**
- **0.1 — Cached system-block support in `AnthropicProvider`.** Convert `AnthropicProvider.generate_json()` and `generate_text_stream()` signatures to accept `system_prompt: str | list[SystemBlock]`. Plain string = uncached (backward compatible). List = cached using **policy α**: if the caller marks `cache_control` on any block, honor those markers; if no block has `cache_control`, cache the first block only by default. For streaming, capture the final usage object via `get_final_message()`. Log `cache_read_input_tokens` and `cache_creation_input_tokens` on every response.
- **0.2 — Pending target endpoint.** The `/tickets/ai-parse` endpoint described in the original migration doc does not exist in the codebase. No code change in Phase 0. When this endpoint is built, apply the cached-system-block pattern:
```python
system_blocks = [
    {"type": "text", "text": members_json, "cache_control": {"type": "ephemeral"}},  # cacheable: team-stable
    {"type": "text", "text": boards_json, "cache_control": {"type": "ephemeral"}},  # cacheable: team-stable
    {"type": "text", "text": engineer_description},  # uncached: per-request
]
```
Remove this note when the endpoint is implemented and the pattern applied.
- **0.3 — Opt-in caching for one-shot generators.** Add `cache_control` to the static system prompt in `ai_tree_generator_service`, `kb_conversion_service`, `ai_fix_service`, and `script_builder_service`. Pattern: single-block list with policy α auto-caching the first (and only) block. Per-block inline comment explaining cacheability. For `script_builder` (multi-turn): cache only the system prompt; conversation history stays uncached in this phase. Retries in `ai_tree_generator.generate_branch_detail` inherit the cache automatically — no special handling.
- **0.4 — Consolidate the MCP-capable chat path.** Rename `_call_anthropic_cached` to `chat_call_cached()` in `assistant_chat_service` (or move to a shared module; implementer's choice based on cleanest structure). Refactor it to delegate cached-system-block plumbing to `AnthropicProvider`. MCP + image + beta-endpoint logic stays inside the chat wrapper — do NOT push MCP into `AnthropicProvider`, which is a provider-agnostic abstraction (Gemini has no MCP). Document that this wrapper is the one MCP-using caller, the exception not the rule. Track MCP unification as a separate future ticket.
- **0.5 — MCP telemetry.** Add counters for: (a) turns where MCP was available, (b) turns where the model actually invoked an MCP tool, (c) turns where the silent-retry-without-MCP fallback was triggered, (d) which MCP tool names got called. Log to whatever telemetry path exists (PostHog if wired up, otherwise structured logs). This gives us real data by the time Phase 2+ decisions about MCP investment are made. **Do 0.5 first or alongside 0.1 — don't save it for last.**
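Policy α from deliverable 0.1 can be sketched as a small normalization step. This is an illustration under stated assumptions: `build_system_blocks` is a hypothetical helper name, blocks are plain dicts, and the `{"type": "ephemeral"}` marker is the one used in the 0.2 example.

```python
from typing import Union


def build_system_blocks(system_prompt: Union[str, list]) -> Union[str, list]:
    """Apply policy α to a system prompt before it is sent to the API."""
    if isinstance(system_prompt, str):
        return system_prompt  # plain string: uncached, backward compatible

    blocks = [dict(block) for block in system_prompt]  # copy; never mutate caller input
    if not blocks or any("cache_control" in block for block in blocks):
        return blocks  # caller-marked breakpoints are honored as-is

    # No block is marked: cache the first block only by default.
    blocks[0]["cache_control"] = {"type": "ephemeral"}
    return blocks
```

Copying the blocks means a module-level constant list can be reused across calls without accumulating markers.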
**Per-call-site comment pattern for multi-block lists:**
When a call site passes more than one block to `system_prompt`, add a one-line comment next to EACH block — including uncached ones — explaining why it is or isn't cached. The absence of a marker deserves documentation as much as the presence of one, because it tells the next dev you made a conscious choice.
**Verification:**
- Hit any FlowPilot endpoint twice within 5 minutes. First call shows `cache_creation_input_tokens > 0`, second call shows `cache_read_input_tokens > 0`.
- If the second call returns zero cache reads, inspect the prefix for silent invalidators (timestamps, unsorted JSON keys, varying tool list ordering). Fix before proceeding.
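Two of the silent invalidators named above (unsorted JSON keys, varying tool list ordering) can be removed by serializing deterministically. A sketch, assuming the cached blocks are built from JSON-serializable data and that each tool definition has a `name` key; both helper names are hypothetical.

```python
import json


def stable_json(value) -> str:
    """Serialize with sorted keys and fixed separators so equal data yields equal text."""
    return json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def stable_tool_list(tools: list) -> list:
    """Order tool definitions by name so the cached prefix does not vary between calls."""
    return sorted(tools, key=lambda tool: tool["name"])
```

Timestamps are the third invalidator; those need to be kept out of cached blocks entirely rather than normalized.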
**Verification is deferred to the new dev environment.** Phase 0 code commits without live verification because no running environment exists at authoring time. A `TODO(phase-0-verification)` inline comment in the caching module names the verification steps. Execute verification when the new env is up; if it fails, that is a debug task then, not a blocker now.
```
git commit -m "feat(ai): promote AnthropicProvider to cached pattern, consolidate caching implementation"
```
**Dependencies:**
- Phase 1 (route rename and schema) can run in parallel with Phase 0.
- Phase 2 (What we know) must not start until Phase 0 is complete and verification has passed (or been explicitly deferred with a tracked issue).
### Phase 1 — Data model and route rename (can run in parallel with Phase 0)
**Deliverables:**
- Alembic migration after current repo head creating: `session_facts`, `session_suggested_fixes`, `draft_templates`, `account_settings`; column additions to `ai_sessions` (including `state_version`), `script_templates`.
- All new tenant-scoped tables have `account_id` and RLS policies using the repo's `app.current_account_id` policy pattern.
- SQLAlchemy models for each new table. `AccountSettings` model includes `get_setting(key, default)` and `set_setting(key, value)` helpers; lazy row creation on first write.
- Route move: `AssistantChatPage` component mounted at `/pilot` and `/pilot/:sessionId`.
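- Permanent 301-style redirect: `/assistant` → `/pilot`, `/assistant/:sessionId` → `/pilot/:sessionId` (preserving session ID). In the SPA this is a client-side `<Navigate replace>`, so the router itself returns no HTTP 301 status.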
- Sidebar nav entry renames from "ResolutionAssist" / "AI Assistant" to "FlowPilot". Command palette entries, dashboard cards, and session list links update to `/pilot`.
- No Phase 2 UI changes yet (no task lane restructuring, no What we know section).
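The `get_setting` / `set_setting` behavior with lazy row creation can be sketched without SQLAlchemy. This in-memory stand-in is an assumption for illustration only: the class name, the explicit `account_id` argument, and `has_row` are hypothetical and the real model will differ.

```python
from typing import Any


class AccountSettingsStore:
    """In-memory stand-in for the account_settings table (one row per account)."""

    def __init__(self) -> None:
        self._rows = {}  # account_id -> preferences dict (the JSONB column)

    def get_setting(self, account_id: str, key: str, default: Any = None) -> Any:
        row = self._rows.get(account_id)
        if row is None:
            return default  # reads never create a row
        return row.get(key, default)

    def set_setting(self, account_id: str, key: str, value: Any) -> None:
        # Lazy creation: the row appears on the first write only.
        self._rows.setdefault(account_id, {})[key] = value

    def has_row(self, account_id: str) -> bool:
        return account_id in self._rows
```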
**Verification:**
- Run migration on a fresh dev database — succeeds.
- Downgrade migration succeeds (reversibility).
- RLS grep/check passes for new tables.
- `/assistant` redirects to `/pilot` (client-side 301-style redirect; the legacy URL is replaced in browser history).
- `/assistant/:sessionId` redirects to `/pilot/:sessionId` with ID preserved.
- `/pilot` renders the existing chat UI with the sidebar now reading "FlowPilot".
- No Phase 2 UI introduced.
```
git commit -m "feat(pilot): rename /assistant to /pilot, add session_facts/suggested_fixes/draft_templates/account_settings schema"
```
### Phase 2 — What we know (task lane + service + API)
**Deliverables:**
- Stable-UUID assignment for `pending_task_lane` items. When questions/checks are persisted (or when a legacy session is loaded), each item receives a UUID written back into the JSON. This is a prerequisite for `session_facts.source_ref` to point anywhere reliable. Handle in-flight sessions gracefully — sessions open during deploy may have unstable IDs until their next save.
- `FactSynthesisService` per Section 6.1, with its LLM prompt.
- Fact CRUD API endpoints per Section 5.1.
- `WhatWeKnow`, `WhatWeKnowItem`, `AddNoteButton`, `TaskLane` components under `src/components/pilot/`.
- Task lane layout adjustment: What we know section renders above Questions.
- Counter in task lane header updates to `X / Y answered` format.
- AI system prompt updated to emit `[PROMOTE]` markers; `unified_chat_service` marker parser extended to handle them.
- Fact editability enforcement: API returns 403 on PATCH of `question` or `diagnostic_check`-sourced facts. UI hides the edit affordance for those facts.
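The stable-UUID deliverable can be sketched as an idempotent pass over the task lane JSON. This assumes each item is a dict and that the UUID lives under an `id` key; the key name and `assign_stable_ids` are hypothetical.

```python
import uuid


def assign_stable_ids(task_lane_items: list) -> bool:
    """Give every item lacking an "id" a UUID; return True if anything changed."""
    changed = False
    for item in task_lane_items:
        if not item.get("id"):
            item["id"] = str(uuid.uuid4())
            changed = True
    return changed  # the caller writes the JSON back only when this is True
```

Because existing IDs are never overwritten, running the pass on load and again on save is safe for legacy and in-flight sessions.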
**Verification:**
- Open a session, answer a question; within 2 seconds a fact appears in What we know with correct provenance.
- Click "+ Add a note", type a manual fact, confirm it persists with `source_type: user_note`.
- Run a diagnostic check, confirm the check result promotes to a fact.
- Facts persist across page reloads.
- RLS: a user from a different account cannot read or write facts for this session.
- Attempt to PATCH a question-sourced fact → 403.
- PATCH a user_note fact → succeeds.
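The editability rule behind the last two checks can be sketched as an allow-list on `source_type`. The function name is hypothetical, and treating unknown source types as read-only is an assumption, not something this plan specifies.

```python
# Source types an engineer may edit; everything else is read-only.
EDITABLE_SOURCE_TYPES = {"user_note", "ai_synthesis"}


def patch_fact_status(source_type: str) -> int:
    """Return the HTTP status a PATCH on a fact of this source type should produce."""
    return 200 if source_type in EDITABLE_SOURCE_TYPES else 403
```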
```
git commit -m "feat(pilot): add What we know section with fact synthesis and stable task-lane item IDs"
```
### Phase 3 — Suggested fix + resolution note preview
**Deliverables:**
- `session_suggested_fixes` API endpoints per Section 5.2 and data flow.
- `SuggestedFix` component in the task lane.
- AI system prompt updated to emit `[SUGGEST_FIX]` markers; parser handles supersession.
- `ResolutionNoteGeneratorService` per Section 6.2 and preview endpoint per Section 5.4.
- `ResolutionNotePreview` floating popover anchored to Resolve button.
- Preview refreshes on fact / suggested-fix / script-generation changes via `state_version` increment. Client-side 500ms debounce.
- Preview cache keyed by `(session_id, state_version)` per Section 5.5.
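The `(session_id, state_version)` cache key can be sketched as follows. This is an in-process illustration only; the `PreviewCache` class is hypothetical, and the real cache described in Section 5.5 may live elsewhere (database column, Redis, etc.).

```python
from typing import Callable


class PreviewCache:
    """Memoize preview markdown per (session_id, state_version)."""

    def __init__(self, generate: Callable[[str, int], str]) -> None:
        self._generate = generate  # the expensive LLM-backed generator
        self._cache = {}

    def get(self, session_id: str, state_version: int) -> str:
        key = (session_id, state_version)
        if key not in self._cache:
            # A new state_version is a new key, so stale previews are never served.
            self._cache[key] = self._generate(session_id, state_version)
        return self._cache[key]
```

Invalidation is implicit: nothing is deleted on a state change, the incremented version simply misses the cache.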
**Verification:**
- Session with ≥3 facts and an active suggested fix shows a populated Resolve preview.
- Editing a fact updates the preview within 1 second.
- Preview markdown renders correctly with all four sections (Problem / What we confirmed / Root cause / Resolution).
- Preview contains no hallucinated information not present in session state (human review of 5 real-ish sessions).
- Incrementing `state_version` invalidates the preview cache; reading the same version returns the cached markdown.
```
git commit -m "feat(pilot): add suggested fix tracking and Resolve note preview with state_version caching"
```
### Phase 4 — Resolve and Escalate PSA writebacks
**Deliverables:**
- `transition_ticket_status` method on `PSAWritebackService` with CW re-fetch verification.
- `post_resolution_note` endpoint and CW integration via existing PSA provider registry + `post_note` seam.
- Resolve button flow: engineer edits preview → Confirm & post → server posts to PSA → stores `{external_id, posted_at}` → transitions status → verifies status → marks session resolved → shows templatize prompt if applicable.
- `EscalationPackageGeneratorService` and parallel flow for Escalate, including CW routing rules.
- Local-only path: resolving or escalating a session with no linked PSA ticket stores markdown locally and marks the session state without external posting.
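The re-fetch verification in `transition_ticket_status` can be sketched as below. The `client` object and its `update_status` / `get_status` methods are hypothetical placeholders for the PSA provider seam; the ConnectWise API itself is not modeled here.

```python
class StatusVerificationError(RuntimeError):
    """Raised when the PSA did not persist the requested status."""


def transition_ticket_status(client, ticket_id: str, target_status: str) -> None:
    """Request a status change, then re-fetch the ticket to confirm it stuck."""
    client.update_status(ticket_id, target_status)
    actual = client.get_status(ticket_id)  # re-fetch; do not trust the update response
    if actual != target_status:
        raise StatusVerificationError(
            f"ticket {ticket_id}: expected {target_status!r}, PSA reports {actual!r}"
        )
```

Raising here lets the Resolve flow stop before marking the session resolved, so a silent rejection surfaces as an error in the UI.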
**Verification:**
- Complete a session end-to-end with a ConnectWise test instance.
- Click Resolve, edit the preview, confirm post — verify the note appears in CW and status changes to Resolved (verified by re-fetch).
- Click Escalate on a different session — verify the package is posted and the ticket routes correctly.
- Simulate CW silently rejecting a status change — verify the app surfaces an error, not silent success.
- Attempt to Resolve without a linked PSA ticket — session marks resolved locally without erroring; markdown stored in `resolution_note_markdown`.
```
git commit -m "feat(pilot): wire Resolve and Escalate to ConnectWise writeback with status verification"
```
### Phase 5 — Script Generator inline integration
**Deliverables:**
- `ScriptGenInline/TemplateMatchPanel` — when suggested fix has `script_template_id`, clicking the fix opens this panel with parameters pre-filled from session facts, ticket context (company configs), and AI-suggested values in the `[SUGGEST_FIX]` marker.
- `ScriptGenInline/NoTemplateDialog` — three-option dialog when no template match.
- User decision persisted on `session_suggested_fixes.user_decision`.
- `TemplateExtractionService` for generating parameterization proposals (Section 6.4).
- Script generation flow produces a `script_generations` record linked to the session via existing `script_generations.ai_session_id`; increments `state_version`.
- `⌘K → "script"` opens the inline generator from the FlowPilot session. **No Resolve keyboard shortcut** is added (browsers intercept `⌘R`; decided against alternatives).
- Script Generator inherits MCP access for Microsoft Learn lookups via the `chat_call_cached` wrapper (Phase 0.4), not via `AnthropicProvider` directly.
**Verification:**
- Session with a template-matched suggested fix: clicking opens generator with ≥2 pre-filled parameters.
- Session with a custom script suggested fix: dialog appears with three options, script preview shows parameters highlighted.
- All three paths end correctly: one-off generates and closes, draft_template creates `draft_templates` row and generates, build_template opens full template creation.
- `⌘K → "script"` anywhere in a session opens the generator directly.
- Edge case: if the suggested fix's `script_template_id` points at a template that has been deleted, show the no-template three-option dialog with the AI-drafted script (do not error).
```
git commit -m "feat(pilot): integrate Script Generator inline with suggested fixes"
```
### Phase 6 — Post-resolve templatize prompt
**Deliverables:**
- `TemplatizePrompt` component.
- Show logic: after successful Resolve, show only when ALL of:
1. `account_settings.preferences.templatize_prompt_enabled` is not `false` (default `true` when absent)
2. Session has pending `draft_templates` rows
3. The user chose `draft_template` on the original three-option dialog
- Accept flow creates a new `script_templates` row with `source_session_id`, `source_user_id`, `source_ticket_ref` set. Updates draft to `status='accepted'`, `promoted_template_id` set.
- Reject flow updates draft to `status='rejected'`.
- "Don't ask me again for this team" writes `{"templatize_prompt_enabled": false}` to `account_settings.preferences`.
- Script Library sidebar shows a pending-drafts badge/count for the account.
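The three show conditions above can be sketched as a single predicate. The function and parameter names are hypothetical; `preferences` stands for the `account_settings.preferences` JSON, which may be absent when no row exists yet.

```python
from typing import Optional


def should_show_templatize_prompt(
    preferences: Optional[dict],
    pending_draft_count: int,
    user_decision: Optional[str],
) -> bool:
    """Return True only when all three show conditions hold."""
    prefs = preferences or {}
    # Absent key or absent row defaults to enabled; only an explicit False opts out.
    enabled = prefs.get("templatize_prompt_enabled", True) is not False
    return enabled and pending_draft_count > 0 and user_decision == "draft_template"
```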
**Verification:**
- Resolve a session where the engineer picked Option 2 (`draft_template`) → templatize prompt appears with AI-proposed parameters.
- Accept the prompt → new template appears in the Script Library with the provenance chip.
- Skip the prompt → draft marked rejected, Script Library shows no new template.
- Toggle "don't ask me again" → next session Resolve skips the prompt even with a pending draft.
```
git commit -m "feat(pilot): add post-resolve templatize prompt for draft templates"
```
### Phase 7 — Polish
**Deliverables:**
- Visual polish against the mockup HTML source files (spacing, colors, typography, component structure). Use PNGs for visual target confirmation.
- Loading states for: fact synthesis, preview generation, template extraction, PSA post/verify, script generation.
- Empty states: no facts yet, no questions, no checks, no active suggested fix, no pending draft templates.
- Keyboard shortcuts (no Resolve shortcut): `⌘K` (command palette), `⌘↵` (send composer), `⌘G` (script generator).
- Responsive: at widths below 1200px, task lane collapses into a bottom drawer.
- Use existing design tokens where present; add missing tokens only if needed to match the mockups.
**Verification:**
- Major screens visually compare within tolerance against the mockup PNG files.
- No horizontal scroll at 1280px viewport.
- Keyboard shortcuts documented in-app via `?` overlay.
- Shortcuts do not conflict with browser reload.
```
git commit -m "feat(pilot): visual polish, empty/loading states, keyboard shortcuts"
```
---
## 10. Design system reference
All components must use the existing ResolutionFlow design system. Tokens from the mockup CSS for quick reference — these should already exist in your tokens file; if they don't, add them:
```css
/* Backgrounds */
--bg-0: #070b12; /* page background */
--bg-1: #0d131c; /* sidebar / chrome */
--bg-2: #121a25; /* card / bubble background */
--bg-3: #1a2332; /* raised element */
/* Borders */
--border: rgba(148, 163, 184, 0.12);
--border-strong: rgba(148, 163, 184, 0.22);
/* Text */
--text-primary: #e2e8f0;
--text-secondary: #94a3b8;
--text-tertiary: #64748b;
/* Brand cyan (FlowPilot accent) */
--cyan-400: #22d3ee;
--cyan-500: #06b6d4;
--cyan-600: #0891b2;
--cyan-bg: rgba(34, 211, 238, 0.10);
--cyan-border: rgba(34, 211, 238, 0.30);
/* Semantic */
--success: #34d399; /* Resolve, facts */
--warning: #fbbf24; /* Escalate, proposed parameters */
--danger: #f87171;
--purple: #a78bfa; /* Script Generator / templates */
```
**Typography:**
- Body: IBM Plex Sans, 14px/1.5
- Headings: Bricolage Grotesque, 500 weight, -0.01em letter-spacing
- Code: JetBrains Mono
**Icons:** Phosphor Icons (Duotone) per the recorded design decision to migrate off Lucide.
---
## 11. Test plan
### Migration tests
- Fresh DB upgrade succeeds.
- Downgrade succeeds (reversibility).
- New tables have RLS enabled/forced.
- Tenant policy includes `app.current_account_id`.
### Backend tests
- Fact CRUD authorization (edit allowed on `user_note` / `ai_synthesis`, 403 on `question` / `diagnostic_check`).
- Fact promotion: `POST /facts/promote` creates fact and increments `state_version`.
- Suggested-fix supersession: emitting a new `[SUGGEST_FIX]` sets `superseded_at` on the prior active one.
- Decision persistence on `session_suggested_fixes.user_decision`.
- Resolution note preview cache invalidation on `state_version` increment.
- Resolve/Escalate local-only behavior without a linked PSA ticket.
- PSA status verification failure path (simulated rejection surfaces error).
- Draft-template accept/reject behavior.
- `AccountSettings.get_setting` returns default when row absent.
### Frontend tests
- Route redirects (`/assistant` → `/pilot`, deep-link ID preservation).
- Task lane rendering and persistence across reloads.
- Inline fact editing refreshes the Resolve preview.
- Script Generator option flows (template match, three-option dialog, post-resolve prompt).
- Templatize prompt respects `templatize_prompt_enabled` setting.
- Responsive drawer behavior at <1200px.
### Manual QA
- Run one ConnectWise-linked Resolve end-to-end.
- Run one Escalate end-to-end.
- Run one template-match script generation path.
- Run one no-template draft-template path through post-resolve save.
---
## 12. Non-goals for this migration
Do not build these as part of this work. They belong to later phases of the roadmap.
- **Confidence tiers (Discovery / Exploring / Guided).** Explicitly removed. The task lane itself is the progress signal.
- **Mode toggle between Guided and Quick ask.** There is one mode.
- **"Convert to guided" promotion flow.** No longer applicable.
- **Team Wiki compilation from resolved sessions.** Tracked separately; depends on this migration but is not part of it.
- **SharePoint integration.** Sequenced after ConnectWise per roadmap.
- **Template marketplace / sharing across accounts.** Tracked under Client Context System roadmap item.
- **Backfill of What we know for pre-Phase-2 sessions.** Sessions resolved before Phase 2 ships will not retroactively gain facts. Document in release notes.
- **MCP unification into `AnthropicProvider`.** Deferred pending telemetry-driven evaluation. Track as a separate ticket.
- **Supervisor staging of resolution notes.** Engineer review + Confirm & post is the committed flow (not compliance-grade draft approval).
---
## 13. Risks and mitigations
| Risk | Mitigation |
|---|---|
| LLM fact synthesis hallucinates specifics not in the answer | Conservative prompt; engineer can edit/delete any AI-synthesized fact; provenance line shows the source so the engineer can verify. Haiku default + `disputed_fact_rate` telemetry triggers escalation to Sonnet if quality drops. |
| Resolution note preview LLM cost at scale | `state_version`-keyed cache prevents re-generation on unchanged state; 500ms client debounce batches rapid edits. |
| ConnectWise silently rejects status change | `transition_ticket_status` re-fetches and verifies; fails loudly if the change didn't stick. |
| Template extraction proposes bad parameterization | Engineer reviews before saving; draft templates never silently become real templates; provenance chip lets team admins audit. |
| Users lose muscle memory from `/assistant` → `/pilot` rename | Permanent 301-style client-side redirect (no sunset date); deep-link session IDs preserved through the redirect. |
| Existing sessions have no facts at Phase 2 deploy | Acceptable per non-goals. Facts accumulate for new or ongoing sessions after deploy. Document in release notes. |
| In-flight sessions during Phase 2 deploy lack stable task-lane item IDs | Sessions open at deploy time may have unstable IDs until the next save cycle re-persists with UUIDs. Facts tied to those sessions may reference IDs that don't resolve. Engineer can manually re-promote if needed. |
| Phase 0 cache verification deferred to new env | Tracked via inline TODO in the caching module. If verification fails when executed, debug as a normal bug — do not retroactively block dependent phases. |
| MCP usage data unknown, may under- or over-invest | Phase 0.5 telemetry answers this within 30 days of new env being live. Schedule "MCP review" checkpoint at that mark. |
---
## 14. Decisions made during migration planning
These questions were raised during the planning conversation and have been resolved. Captured here so the decisions are traceable.
1. **Keyboard shortcut for Resolve** — **Decided: no shortcut.** `⌘R` conflicts with browser reload; alternatives add complexity without clear value. Resolve has a button, a preview, and a confirm step. No shortcut needed.
2. **Default for `templatize_prompt_enabled`** — **Decided: true.** Feature discovery outweighs annoyance at pre-revenue stage. Opt-out is one click and persistent. Tune surfacing logic rather than the default if feedback indicates over-prompting.
3. **Resolution note posting** — **Decided: engineer edits inline, clicks Confirm & post.** Supervisor staging is out of scope for this migration. Revisit if an MSP with strict compliance requirements surfaces the need.
4. **Fact synthesis model tier** — **Decided: Haiku 4.5 behind a `FACT_SYNTHESIS_MODEL` config flag.** All other AI services default to Sonnet 4.6. Opus 4.7 not used at current scale. Per-service MCP capability configured via matching flags (Section 6.6).
5. **MCP architecture in Phase 0** — **Decided: leave MCP in the chat wrapper.** Option C from the Phase 0 audit. Do not push MCP into the provider-agnostic `AnthropicProvider`. Add telemetry in Phase 0.5 to gather data for a future unification decision.
6. **Cache breakpoint policy** — **Decided: policy α.** Caller-marked `cache_control` is honored; if no blocks are marked, the first block is cached by default.
7. **API namespace** — **Decided: `/api/v1/ai-sessions/{id}/...`**, matching the existing codebase.
8. **`account_settings` structure** — **Decided: new table with JSONB `preferences` column, lazy row creation.** Simple settings live in `preferences`; settings graduate to typed columns when they meet the promotion criteria (hot path / validation / joins).
---
## End of document


@@ -48,7 +48,7 @@ export function ActiveFlowPilotSessions({ hideHeader = false }: { hideHeader?: b
{sessions.map((session) => (
<button
key={session.id}
- onClick={() => navigate(session.session_type === 'chat' ? `/assistant/${session.id}` : `/pilot/${session.id}`)}
+ onClick={() => navigate(`/pilot/${session.id}`)}
className="card-interactive p-4 text-left"
>
<div className="flex items-start justify-between gap-2 mb-2">


@@ -52,7 +52,7 @@ export function RecentFlowPilotSessions({ hideHeader = false }: { hideHeader?: b
return (
<button
key={session.id}
- onClick={() => navigate(session.session_type === 'chat' ? `/assistant/${session.id}` : `/pilot/${session.id}`)}
+ onClick={() => navigate(`/pilot/${session.id}`)}
className="flex w-full items-center gap-3 px-5 py-3 text-left hover:bg-[rgba(255,255,255,0.02)] transition-colors"
style={{
borderBottom: i < sessions.length - 1 ? '1px solid var(--color-border-default)' : undefined,


@@ -52,7 +52,7 @@ export function StartSessionInput() {
if (completedUploadIds.length > 0) {
state.uploadIds = completedUploadIds
}
- navigate('/assistant', { state })
+ navigate('/pilot', { state })
}
const handleKeyDown = (e: React.KeyboardEvent) => {
@@ -63,7 +63,7 @@ export function StartSessionInput() {
}
const handleSuggestionClick = (suggestion: string) => {
- navigate('/assistant', { state: { prefill: suggestion } })
+ navigate('/pilot', { state: { prefill: suggestion } })
}
// ── File handling ──────────────────────────────


@@ -18,9 +18,12 @@ const STATUS_CONFIG = {
export function AISessionListItem({ session }: AISessionListItemProps) {
const config = STATUS_CONFIG[session.status as keyof typeof STATUS_CONFIG] ?? STATUS_CONFIG.active
const StatusIcon = config.icon
+ // Both chat and guided sessions now land on the unified /pilot surface.
+ // session_type is preserved on the DB row for data compatibility but is
+ // no longer used for frontend route selection (Phase 1 FlowPilot migration).
const isChat = session.session_type === 'chat'
const TypeIcon = isChat ? MessageCircle : Route
- const linkTo = isChat ? `/assistant/${session.id}` : `/pilot/${session.id}`
+ const linkTo = `/pilot/${session.id}`
const displayTitle = isChat
? (session.title || session.problem_summary || 'Untitled chat')
: (session.problem_summary || 'Untitled session')


@@ -42,7 +42,7 @@ const PAGES: PaletteItem[] = [
{ id: 'page-dashboard', group: 'pages', title: 'Dashboard', path: '/', icon: 'page' },
{ id: 'page-flows', group: 'pages', title: 'All Flows', subtitle: 'Browse your flow library', path: '/trees', icon: 'page' },
{ id: 'page-sessions', group: 'pages', title: 'Sessions', subtitle: 'View session history', path: '/sessions', icon: 'page' },
- { id: 'page-assistant', group: 'pages', title: 'AI Assistant', subtitle: 'FlowPilot chat', path: '/assistant', icon: 'page' },
+ { id: 'page-flowpilot', group: 'pages', title: 'FlowPilot', subtitle: 'AI troubleshooting', path: '/pilot', icon: 'page' },
{ id: 'page-scripts', group: 'pages', title: 'Script Generator', subtitle: 'Generate PowerShell scripts', path: '/scripts', icon: 'page' },
{ id: 'page-analytics', group: 'pages', title: 'Analytics', subtitle: 'Team usage & metrics', path: '/analytics', icon: 'page' },
{ id: 'page-settings', group: 'pages', title: 'Settings', subtitle: 'Account & preferences', path: '/account', icon: 'page' },
@@ -177,7 +177,7 @@ export function CommandPalette({ open, onClose }: CommandPaletteProps) {
group: 'flowpilot',
title: 'Troubleshoot with FlowPilot',
subtitle: trimmed,
- path: '/assistant',
+ path: '/pilot',
icon: 'sparkles',
}


@@ -9,7 +9,7 @@ const PREFETCH_MAP: Record<string, () => Promise<unknown>> = {
'/shares': () => import('@/pages/MySharesPage'),
'/analytics': () => import('@/pages/TeamAnalyticsPage'),
'/analytics/me': () => import('@/pages/MyAnalyticsPage'),
- '/assistant': () => import('@/pages/AssistantChatPage'),
+ '/pilot': () => import('@/pages/AssistantChatPage'),
'/step-library': () => import('@/pages/StepLibraryPage'),
'/guides': () => import('@/pages/GuidesHubPage'),
'/feedback': () => import('@/pages/FeedbackPage'),


@@ -1,4 +1,4 @@
- import { createBrowserRouter } from 'react-router-dom'
+ import { createBrowserRouter, Navigate, useParams } from 'react-router-dom'
import * as Sentry from '@sentry/react'
import { Suspense } from 'react'
import { AppLayout, ProtectedRoute } from '@/components/layout'
@@ -49,7 +49,10 @@ const ScriptLibraryPage = lazyWithRetry(() => import('@/pages/ScriptLibraryPage'
const ScriptManagePage = lazyWithRetry(() => import('@/pages/ScriptManagePage'))
const AssistantChatPage = lazyWithRetry(() => import('@/pages/AssistantChatPage'))
const FlowAssistPage = lazyWithRetry(() => import('@/pages/FlowAssistPage'))
- const FlowPilotSessionPage = lazyWithRetry(() => import('@/pages/FlowPilotSessionPage'))
+ // FlowPilotSessionPage (the old guided-mode surface) was removed from active
+ // routing in Phase 1 of the FlowPilot migration. The file is retained on disk
+ // for reference but not mounted — the unified chat-primary surface now serves
+ // /pilot. Delete the file when nothing in the tree references it anymore.
const EscalationQueuePage = lazyWithRetry(() => import('@/pages/EscalationQueuePage'))
const ReviewQueuePage = lazyWithRetry(() => import('@/pages/ReviewQueuePage'))
const FlowPilotAnalyticsPage = lazyWithRetry(() => import('@/pages/FlowPilotAnalyticsPage'))
@@ -98,6 +101,16 @@ function page(Component: React.LazyExoticComponent<React.ComponentType>) {
)
}
+ /**
+  * Permanent 301-style redirect from /assistant/:sessionId to /pilot/:sessionId.
+  * Used by the Phase 1 route-rename; paired with a bare-path redirect to /pilot.
+  * SPA redirects replace history so the legacy URL does not linger in back-nav.
+  */
+ function AssistantSessionRedirect() {
+   const { sessionId } = useParams<{ sessionId: string }>()
+   return <Navigate to={sessionId ? `/pilot/${sessionId}` : '/pilot'} replace />
+ }
export const router = sentryCreateBrowserRouter([
{
path: '/landing',
@@ -202,11 +215,14 @@ export const router = sentryCreateBrowserRouter([
{ path: 'network-diagrams/new', element: page(DiagramEditorPage) },
{ path: 'network-diagrams/:id', element: page(DiagramEditorPage) },
{ path: 'kb-accelerator', element: page(KBAcceleratorPage) },
- { path: 'assistant', element: page(AssistantChatPage) },
- { path: 'assistant/:sessionId', element: page(AssistantChatPage) },
+ // Phase 1 — FlowPilot migration. The unified chat-primary surface lives at
+ // /pilot; /assistant permanently redirects. FlowPilotSessionPage (old
+ // guided surface) is no longer mounted.
+ { path: 'pilot', element: page(AssistantChatPage) },
+ { path: 'pilot/:sessionId', element: page(AssistantChatPage) },
+ { path: 'assistant', element: <Navigate to="/pilot" replace /> },
+ { path: 'assistant/:sessionId', element: <AssistantSessionRedirect /> },
{ path: 'flow-assist', element: page(FlowAssistPage) },
- { path: 'pilot', element: page(FlowPilotSessionPage) },
- { path: 'pilot/:sessionId', element: page(FlowPilotSessionPage) },
{ path: 'escalations', element: page(EscalationQueuePage) },
{ path: 'queue', element: page(SessionQueuePage) },
{ path: 'review-queue', element: page(ReviewQueuePage) },