resolutionflow/docs/FlowAssist_Migration/FLOWPILOT-MIGRATION.md
Michael Chihlas d386d11af2
docs(pilot): correct Phase 9 migration description
Handoff + migration spec incorrectly claimed Phase 9 added a new
parent_pilot_session_id FK. The implementation reuses the existing
ai_session_id column; the migration only adds the origin discriminator
+ partial unique index. Also: ScriptBuilderTab wraps ScriptBuilderChat
and ScriptBodyEditor (Monaco), not "ScriptBuilderChat in ephemeral
mode" — there is no ephemeral mode on the presentational component.

Applies applied_at call-site specifics: handleScriptDecision stamps
on one_off/draft_template, TemplateMatchPanel stamps on onMarkRun,
Script Builder tab Submit does not stamp.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 06:17:08 -04:00


FlowPilot Migration — Design & Implementation Doc

Target: Transform /assistant (ResolutionAssist) into the new unified /pilot (FlowPilot) surface.

Audience: Claude Code (implementation) and Codex (review), reviewed by Michael (owner).

Status: Phases 0-9 implemented. Phase 9 shipped the tabbed Script Builder integration (chat-region tab strip, ScriptBuilderTab controller with AI + Monaco editor modes, InlineNoTemplateDialog chat-region relocation, PATCH /script endpoint, origin discriminator migration reusing the existing ai_session_id FK, applied_at semantics correction, and EscalateInterceptDialog fourth "partial" choice). tsc -b and npm run build are both clean.

Last updated: April 24, 2026 (Phase 9, Tabbed Script Builder, committed; handoff and migration spec updated)


0. Prerequisite reading for Claude Code

Before writing any code for Phase 1 or later, read these in order:

  1. This document end-to-end.
  2. mockups/01-session-primary.png — the target state for the main session UI.
  3. mockups/02-script-template-match.png, 03-script-three-options.png, 04-script-templatize-prompt.png — Script Generator integration states.
  4. The source HTML files mockups/01-session-primary.html and mockups/02-04-script-integration.html — authoritative for spacing, colors, and component structure. When CSS or layout questions arise during implementation, these files are the tiebreaker.

Do not proceed to implementation until you have confirmed that you understand the following three architectural claims. If any of them is unclear, stop and ask.

  1. There is one AI troubleshooting surface, not two. The existing split between FlowPilot (guided) and ResolutionAssist (chat) is collapsed into a single chat-primary product called FlowPilot at /pilot. The ai_sessions.session_type discriminator column is retained for data compatibility, but the product shows one unified UI and no new code branches on session_type for UI routing.
  2. The task lane is the load-bearing structural feature. It is not a sidebar of metadata. It actively tracks diagnostic state: What we know, Questions, Diagnostic checks, Suggested fix. Engineers interact with it; facts flow between sections.
  3. Resolve and Escalate are deterministic artifact generators, not free-text prompts. When an engineer clicks Resolve, a structured summary is generated from task lane state (not from the chat transcript alone) and posted to CW. The summary structure is fixed: Problem / What we confirmed / Root cause / Resolution.

0.1 Spec drift note for reviewers

This document was originally written against a set of assumptions about the codebase that turned out to be partially incorrect. Two audits (Claude Code's Phase 0 audit and the Codex plan review) surfaced drift. Key corrections already integrated:

  • API namespace is /api/v1/ai-sessions/{id}/..., not the doc's original /api/v1/sessions/{id}/.... All route references below reflect this.
  • pending_task_lane items do not have stable IDs today. Phase 2 must assign stable UUIDs when questions/checks are first persisted. session_facts.source_ref points to those JSON item IDs.
  • account_settings table did not exist. Phase 1 creates it with a JSONB preferences column; settings live in preferences until they need their own column.
  • /tickets/ai-parse endpoint does not exist. Phase 0.2 became a doc-only note; no code change.
  • [PROMOTE] marker uses JSON, not key:value. The doc's original example showed key: value lines; implementation uses a JSON object inside each [PROMOTE]...[/PROMOTE] block (matching the [QUESTIONS] / [ACTIONS] parser pattern). Multi-line text values would have been ambiguous in the key:value form. Section 8.1 below has been updated.

Any further drift found during implementation should be flagged by the implementer and reconciled in this doc before writing code that assumes the drifted spec.
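
As an illustration of the JSON-inside-marker format described in the last correction above, the following is a minimal parsing sketch. It assumes each block holds exactly one JSON object; the function name `parse_promote_blocks` and the keys in the sample payload are hypothetical (the keys are borrowed from the promote endpoint body in Section 5.1), and the real parser in unified_chat_service may differ.

```python
import json
import re

# Assumed marker shape: one JSON object between [PROMOTE] and [/PROMOTE].
PROMOTE_RE = re.compile(r"\[PROMOTE\](.*?)\[/PROMOTE\]", re.DOTALL)


def parse_promote_blocks(llm_text: str) -> list[dict]:
    """Return the JSON objects found in [PROMOTE]...[/PROMOTE] blocks.

    Malformed blocks are skipped rather than raised, so one bad marker
    does not discard the rest of the reply.
    """
    promotions = []
    for match in PROMOTE_RE.finditer(llm_text):
        try:
            payload = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue
        if isinstance(payload, dict):
            promotions.append(payload)
    return promotions


# Hypothetical reply; the multi-line text value is the case the
# key:value form could not represent unambiguously.
reply = (
    "Thanks, that narrows it down.\n"
    "[PROMOTE]\n"
    '{"source_type": "question", "proposed_text": "Issue is isolated\\nto jsmith"}\n'
    "[/PROMOTE]"
)
print(parse_promote_blocks(reply))
```

Because the newline lives inside a JSON string escape, the multi-line value survives parsing without any special handling in the marker grammar.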


1. Why this change

The current state

  • /assistant is a chat-primary AI session with a [QUESTIONS] and [DIAGNOSTIC_CHECKS] task lane.
  • /pilot was specced as a separate guided, confidence-tiered wizard with a different UI and lifecycle.
  • The FLOWPILOT-AND-RESOLUTIONASSIST.md design document treated them as two products sharing a backend.

The problems with the current state

  • Two sidebar entries, two session histories, two mental models for engineers to learn.
  • The PSA integration scope doubles (writebacks for lifecycle events must be built twice, or built for Pilot and bolted onto Assist).
  • The Team Wiki moat depends on structured session artifacts with explicit resolutions — a chat-only mode produces weaker artifacts.
  • The cockpit positioning (the core ResolutionFlow brand promise) does not map to a blank chat window.
  • Branching into two modes forces a decision onto the engineer ("which mode for this ticket?") that has no right answer.

The resolution

The existing /assistant UI already does most of what /pilot was supposed to do — structured questions, diagnostic checks, lifecycle actions in the header. It is closer to the right product than the original doc anticipated. Rather than building Pilot as a second surface, we extend Assist with the missing structural features (What we know, auto-generated summaries, escalation packages) and rename it FlowPilot.

The strategic move

FlowPilot becomes the single canonical troubleshooting surface. Every PSA writeback, every Wiki compilation path, every Script Generator invocation points here. One session shape, one lifecycle, one integration surface.


2. Terminology used in this document

| Term | Meaning |
| --- | --- |
| Session | A single ai_sessions row representing one troubleshooting conversation. |
| Task lane | The right-side panel containing What we know, Questions, Diagnostic checks, Suggested fix. |
| Task lane item ID | A stable UUID assigned to each question / action / check inside ai_sessions.pending_task_lane when first persisted. session_facts.source_ref points to these. |
| Fact | An item in the What we know section. Has text, source_type (question / diagnostic_check / user_note / ai_synthesis), and source_ref (task lane item ID, or null for user_note and ai_synthesis). |
| Suggested fix | The AI's current best-guess resolution path. Has a confidence score and, optionally, a reference to a Script Library template. |
| Promotion | The act of a question answer or diagnostic check result being converted into a fact in What we know. Triggered by AI (via [PROMOTE] marker), confirmed/editable by engineer. |
| Resolution note | The structured document generated when the engineer clicks Resolve. Posted to CW as a ticket note. |
| Escalation package | The structured handoff document generated when the engineer clicks Escalate. Posted to CW and attached to the session for the next engineer. |
| Draft template | A script generated during a session where the engineer chose "Run now, templatize after resolve." Lives in draft_templates until accepted or rejected. |

3. Target UI — annotated

3.1 Primary session view

[Mockup image: Primary session view]

The session UI is a four-column layout:

  1. Icon rail (64px wide) — primary app navigation. FlowPilot / Tickets / Trees / Scripts / Wiki. Avatar at bottom.
  2. Session list (260px wide) — all sessions grouped by state (Active / Recent). Each row shows title, state dot, PSA ticket number, and client name.
  3. Conversation column (fluid) — the chat thread, composer, and incident header.
  4. Task lane (380px wide) — What we know, Questions, Diagnostic checks, Suggested fix, and the Resolve action at the bottom.

Key visual and behavioral elements numbered against the mockup:

Incident header (top of conversation column)

  • PSA chip showing CW #48291 in cyan, monospaced
  • Client / contact / priority meta line
  • Incident title in Bricolage Grotesque 19px
  • Four lifecycle buttons right-aligned: Pause (ghost), Share update (neutral), Escalate (amber), Resolve (green)

Conversation column

  • Standard chat thread with pilot and user avatars
  • Pilot uses cyan gradient avatar; user uses purple gradient
  • AI messages in bg-2 bubbles with subtle border; user messages in cyan-tinted bubbles
  • Composer at bottom with inline action chips (Attach / Paste logs / Ticket context) and a send button

Task lane sections, in order:

  1. What we know (NEW)

    • Header: WHAT WE KNOW · 4 (section title + count)
    • Each fact is a card: bg-2 background, dashed circular green check, fact text, and a provenance line (from question · rules out tenant/license)
    • "+ Add a note" button at the bottom for manual facts from the engineer
    • Background has a subtle green-to-transparent gradient to visually distinguish from the rest of the lane
    • Fact editability: facts sourced from questions or diagnostic checks are read-only at the fact card level (edit the source question/check instead); manual notes and AI-synthesis facts are editable
  2. Questions

    • Header: QUESTIONS · 2 unanswered
    • Each unanswered question: title, AI hint text, Answer / Skip buttons
    • Answered questions dim to 55% opacity with a dashed border and show the resolution inline (Answered · isolated to jsmith (promoted to What we know))
  3. Diagnostic checks

    • Header: DIAGNOSTIC CHECKS · 1 / 3 run
    • "Run remaining 2 checks" button at top when applicable
    • Each check: icon + command name (monospaced), description
    • Completed checks dim and show "Complete · findings promoted to What we know" in green
  4. Suggested fix

    • Header: SUGGESTED FIX · 94% confidence
    • Amber-accented card with fix title and description
    • Clicking opens the Script Generator flow (Sections 3.2 to 3.4)

Resolve action bar (bottom of task lane)

  • Small hint text ("Summary preview is open →")
  • Full-width "Resolve & post to CW" button in green

Resolution note preview (floating, anchored to Resolve button)

  • A persistent popover, NOT a modal
  • Shows the draft resolution note with Problem / What we confirmed / Root cause / Resolution sections
  • Displays the target ticket (CW #48291) and status change (Resolved)
  • Edit button opens an inline editor; Confirm & post fires the PSA writeback

3.2 Script Generator integration — template match

[Mockup image: Template match flow]

When the suggested fix references an existing Script Library template, clicking the fix opens the Script Generator panel in place of (or sliding over) the task lane. Key behavior:

  • A Verified template badge appears above the parameter form
  • Parameters pre-filled from session context get a cyan from session tag and a cyan-tinted input background
  • Each pre-filled parameter has a hint line explaining the source: "Pulled from CW company config for Acme Corp"
  • The engineer can adjust any pre-filled value before generating
  • ⌘K → "script" invokes the generator mid-conversation from anywhere in the session

3.3 Script Generator integration — no template match (three-option dialog)

[Mockup image: No template match]

When no template matches the suggested fix, FlowPilot drafts a session-specific script and presents three paths:

  1. Run as one-off (neutral outline CTA)

    • Script is generated and captured in the session documentation, then discarded
    • Tradeoffs: fastest, but team won't benefit next time
  2. Run now, templatize after resolve (RECOMMENDED, cyan primary CTA)

    • Script generated for this ticket; draft template queued
    • Post-resolve prompt offers to templatize (Section 3.4)
    • Tradeoffs: zero cognitive overhead now, only templatize what works, ~30s review later
  3. Build as template now (purple outline CTA)

    • Full parameterization upfront
    • Tradeoffs: immediate team benefit, but adds time mid-ticket

The drafted script renders as a code preview above the option cards with the AI's proposed parameters highlighted in amber.

3.4 Script Generator integration — post-resolve templatization prompt

[Mockup image: Templatize prompt]

If the engineer picked Option 2 in the three-option dialog and Resolve succeeds, this prompt appears after the resolution note is posted to CW:

  • Success banner confirms the resolution posted
  • Templatize card shows the script with AI-proposed parameters substituted in as {{ gateway_host }}, etc.
  • Right pane lists extracted parameters with remove buttons (engineer can adjust)
  • Provenance note: "generated from CW #48307 · resolved by M. Davis"
  • Three actions: Skip / Edit parameters / Save as team template
  • "Don't ask me again for this team" opt-out in footer

4. Data model changes

4.1 New columns on ai_sessions

ALTER TABLE ai_sessions
  ADD COLUMN resolution_note_markdown TEXT NULL,
  ADD COLUMN resolution_note_posted_at TIMESTAMPTZ NULL,
  ADD COLUMN resolution_note_external_id VARCHAR(128) NULL,  -- CW note ID after posting
  ADD COLUMN escalation_package_markdown TEXT NULL,
  ADD COLUMN escalation_package_posted_at TIMESTAMPTZ NULL,
  ADD COLUMN escalation_package_external_id VARCHAR(128) NULL,
  ADD COLUMN state_version INTEGER NOT NULL DEFAULT 0;  -- incremented on any write to facts/suggested_fixes/script_generations; drives preview cache invalidation

No migration of session_type — the column stays for data compatibility. New sessions default to the unified FlowPilot type. Phase 1 does not branch frontend routing on session_type.

4.2 New session_facts table (the What we know backing store)

CREATE TABLE session_facts (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  session_id UUID NOT NULL REFERENCES ai_sessions(id) ON DELETE CASCADE,
  account_id UUID NOT NULL REFERENCES accounts(id),  -- for RLS, per multi-tenant architecture
  text TEXT NOT NULL,
  source_type VARCHAR(32) NOT NULL CHECK (source_type IN ('question', 'diagnostic_check', 'user_note', 'ai_synthesis')),
  source_ref UUID NULL,  -- task lane item ID (from pending_task_lane JSON), null for user_note and ai_synthesis
  source_summary TEXT NULL,  -- free-text provenance label, e.g. "rules out tenant/license"
  created_by UUID NOT NULL REFERENCES users(id),
  created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
  deleted_at TIMESTAMPTZ NULL
);
CREATE INDEX idx_session_facts_session ON session_facts(session_id) WHERE deleted_at IS NULL;
CREATE INDEX idx_session_facts_account ON session_facts(account_id);

Important: source_ref is a pointer to a JSON item inside ai_sessions.pending_task_lane (not a FK to any table). It has no database-level FK constraint. Enforce integrity at the service layer. Phase 2 includes the work of assigning stable UUIDs to task lane items so source_ref has something reliable to point to.
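
A minimal sketch of the two service-layer responsibilities named above: assigning stable UUIDs on first persist, and checking that a source_ref actually points at a task lane item. The section keys `"questions"` and `"checks"` and both function names are assumptions for illustration; the real shape of pending_task_lane is not specified here.

```python
import uuid


def assign_item_ids(pending_task_lane: dict) -> dict:
    """Give every task lane item a stable "id" the first time it is persisted.

    Items that already carry an id keep it, so repeated saves are idempotent
    and existing session_facts.source_ref values stay valid.
    """
    for section in ("questions", "checks"):  # hypothetical section keys
        for item in pending_task_lane.get(section, []):
            if "id" not in item:
                item["id"] = str(uuid.uuid4())
    return pending_task_lane


def source_ref_exists(pending_task_lane: dict, source_ref: str) -> bool:
    """Service-layer integrity check standing in for the missing FK constraint."""
    return any(
        item.get("id") == source_ref
        for items in pending_task_lane.values()
        if isinstance(items, list)
        for item in items
        if isinstance(item, dict)
    )
```

The idempotence matters more than the UUID generation itself: if a later save regenerated IDs, every existing fact would silently lose its provenance link.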

4.3 New session_suggested_fixes table

CREATE TABLE session_suggested_fixes (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  session_id UUID NOT NULL REFERENCES ai_sessions(id) ON DELETE CASCADE,
  account_id UUID NOT NULL REFERENCES accounts(id),
  title VARCHAR(200) NOT NULL,
  description TEXT NOT NULL,
  confidence_pct INTEGER NOT NULL CHECK (confidence_pct BETWEEN 0 AND 100),
  script_template_id UUID NULL REFERENCES script_templates(id),  -- null if no template match
  ai_drafted_script TEXT NULL,                                    -- populated if no template match
  ai_drafted_parameters JSONB NULL,                               -- AI's proposed parameterization
  user_decision VARCHAR(32) NULL CHECK (user_decision IN ('one_off', 'draft_template', 'build_template', 'dismissed')),
  superseded_at TIMESTAMPTZ NULL,  -- set when a new suggestion replaces this one
  created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX idx_session_suggested_fixes_session_active ON session_suggested_fixes(session_id) WHERE superseded_at IS NULL;

A session can have multiple suggested fixes over time as the AI's understanding evolves. Only one is active (superseded_at IS NULL) at a time.
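
The index above is not declared UNIQUE, so the one-active-fix rule presumably has to be maintained by whatever code inserts new suggestions. The sketch below shows that supersede-then-insert step on in-memory objects; the `SuggestedFix` dataclass and both function names are hypothetical stand-ins for the real ORM model and service.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class SuggestedFix:
    title: str
    confidence_pct: int
    superseded_at: Optional[datetime] = None


def add_suggested_fix(fixes: list[SuggestedFix], new_fix: SuggestedFix) -> None:
    """Append new_fix and supersede whichever fix was active before it."""
    now = datetime.now(timezone.utc)
    for fix in fixes:
        if fix.superseded_at is None:
            fix.superseded_at = now
    fixes.append(new_fix)


def active_fix(fixes: list[SuggestedFix]) -> Optional[SuggestedFix]:
    """Return the single fix with superseded_at unset, or None."""
    active = [fix for fix in fixes if fix.superseded_at is None]
    if len(active) > 1:
        raise RuntimeError("invariant violated: more than one active fix")
    return active[0] if active else None
```

In a database-backed version, both steps would need to run in one transaction so that a reader never observes zero or two active fixes.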

4.4 New draft_templates table

Backing store for Option 2 in the three-option dialog — scripts generated during sessions that are pending templatization.

CREATE TABLE draft_templates (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  account_id UUID NOT NULL REFERENCES accounts(id),
  source_session_id UUID NOT NULL REFERENCES ai_sessions(id),
  source_user_id UUID NOT NULL REFERENCES users(id),
  script_body TEXT NOT NULL,
  proposed_parameters JSONB NOT NULL,  -- {"parameters": [{"key": "...", "label": "...", "type": "..."}]}
  proposed_name VARCHAR(200) NULL,
  proposed_category_id UUID NULL REFERENCES script_categories(id),
  status VARCHAR(32) NOT NULL DEFAULT 'pending' CHECK (status IN ('pending', 'accepted', 'rejected')),
  resolved_at TIMESTAMPTZ NULL,  -- when the user acted on the draft
  promoted_template_id UUID NULL REFERENCES script_templates(id),  -- if accepted, the created template
  created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX idx_draft_templates_account_pending ON draft_templates(account_id) WHERE status = 'pending';

Accepted draft templates produce a new script_templates row and record the source session for provenance display.

4.5 Extension to script_templates

ALTER TABLE script_templates
  ADD COLUMN source_session_id UUID NULL REFERENCES ai_sessions(id),
  ADD COLUMN source_user_id UUID NULL REFERENCES users(id),
  ADD COLUMN source_ticket_ref VARCHAR(64) NULL;  -- e.g. "CW #48307" for display

These fields power the provenance chip in the Script Library: "generated from CW #48307 · resolved by M. Davis · used 7 times".

4.6 New account_settings table

The codebase has no existing account_settings table. Create it with a JSONB grab-bag column for simple settings, plus room for typed columns as settings graduate to needing their own structure.

CREATE TABLE account_settings (
  account_id UUID PRIMARY KEY REFERENCES accounts(id) ON DELETE CASCADE,
  preferences JSONB NOT NULL DEFAULT '{}',
  created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
  updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

Row lifecycle: rows are created lazily on first write. A get_setting(account_id, key, default) helper returns the default when no row exists — no upfront row creation for every account.

Promotion rule: settings live in preferences (keyed JSON) until they meet one of these thresholds:

  • Accessed in a hot path (frequent reads, latency-sensitive)
  • Has validation rules that warrant a CHECK constraint
  • Participates in joins or aggregations

When a setting graduates, add a typed column in a future migration and update get_setting to prefer the typed column over the JSON key.

Initial contents: templatize_prompt_enabled lives in preferences as {"templatize_prompt_enabled": true} (effective default when absent). No column needed.


5. API endpoints

All endpoints follow ResolutionFlow conventions: /api/v1/ prefix, JWT auth, tenant-scoped via RLS. All session-related routes use the /api/v1/ai-sessions/{id}/... namespace to match the existing codebase pattern (not the generic /sessions/ originally specced).

5.1 Session facts

GET    /api/v1/ai-sessions/{id}/facts               List facts for a session (ordered by created_at ASC)
POST   /api/v1/ai-sessions/{id}/facts               Create a manual fact (user_note source_type)
PATCH  /api/v1/ai-sessions/{id}/facts/{fact_id}     Edit fact text or summary
                                                    Authorization: only user_note and ai_synthesis facts are editable;
                                                    question and diagnostic_check facts return 403 (edit the source instead)
DELETE /api/v1/ai-sessions/{id}/facts/{fact_id}     Soft-delete
POST   /api/v1/ai-sessions/{id}/facts/promote       Promote a question answer or check result to a fact
                                                    Body: { source_type, source_ref, proposed_text, proposed_summary }
                                                    Returns the created fact. Used by the AI synthesis flow and by
                                                    the engineer's explicit "promote to What we know" action.
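
The PATCH authorization rule above reduces to a lookup on source_type. A small sketch, with the hypothetical helper name `patch_status_for`; how the real route handler raises the 403 depends on the web framework in use.

```python
EDITABLE_SOURCE_TYPES = frozenset({"user_note", "ai_synthesis"})
READ_ONLY_SOURCE_TYPES = frozenset({"question", "diagnostic_check"})


def patch_status_for(source_type: str) -> int:
    """HTTP status a PATCH on a fact of this source_type should produce."""
    if source_type in EDITABLE_SOURCE_TYPES:
        return 200
    if source_type in READ_ONLY_SOURCE_TYPES:
        return 403  # edit the source question/check instead
    raise ValueError(f"unknown source_type: {source_type!r}")
```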

5.2 Suggested fixes

GET    /api/v1/ai-sessions/{id}/suggested-fixes/active     Returns the current active fix (superseded_at IS NULL) or 404
POST   /api/v1/ai-sessions/{id}/suggested-fixes/{fix_id}/decision
                                                           Body: { decision: "one_off" | "draft_template" | "build_template" | "dismissed" }
                                                           Records the user's path choice. Server-side side effects:
                                                           - one_off: generates script via ScriptTemplateEngine, returns rendered script
                                                           - draft_template: same as one_off, plus creates draft_templates row
                                                           - build_template: returns redirect payload to full template creation flow
                                                           - dismissed: marks fix as superseded

5.3 Draft templates (post-resolve flow)

GET    /api/v1/draft-templates                             List pending drafts for the current user's account
                                                           (used by the Script Library "X scripts ready to review" notification)
GET    /api/v1/draft-templates/{id}                        Get a single draft including its proposed parameterization
POST   /api/v1/draft-templates/{id}/accept                 Body: { name, category_id, parameters_schema, edits }
                                                           Creates a new script_templates row with source_session_id set,
                                                           sets draft status to 'accepted', returns the new template
POST   /api/v1/draft-templates/{id}/reject                 Sets status to 'rejected'

5.4 Resolution notes and escalation packages

POST   /api/v1/ai-sessions/{id}/resolution-note/preview    Generates the draft resolution note from current session state
                                                           WITHOUT posting. Returns { markdown, target_ticket_ref }.
                                                           Called when the task lane renders and refreshed whenever
                                                           facts/suggested fix change. Cached by state_version.
POST   /api/v1/ai-sessions/{id}/resolution-note/post       Body: { markdown }  (engineer-edited version)
                                                           Posts to the linked PSA ticket, updates ticket status if configured,
                                                           marks session resolved.
POST   /api/v1/ai-sessions/{id}/escalation-package/preview Same pattern for escalation
POST   /api/v1/ai-sessions/{id}/escalation-package/post    Posts and transitions session to escalated state

5.5 Preview caching strategy

The resolution note preview and escalation package preview are LLM-generated and refresh on every fact / suggested-fix / script-generation change. To avoid LLM-per-keystroke cost:

  • Cache key: (session_id, state_version) where state_version is the ai_sessions.state_version integer column
  • Invalidation: any write to session_facts, session_suggested_fixes, or script_generations for a session atomically increments ai_sessions.state_version (a single SQL UPDATE executed in the same transaction as the write)
  • Cache backend: Redis (planned for Session Sharing work; can be in-memory LRU for Phase 3, swapped to Redis when Redis is available)
  • Client debounce: 500ms on the UI side to batch rapid edits before hitting the preview endpoint

The choice of state_version over content hash is deliberate: cheaper to compute (single-integer comparison), easier to debug (logs show explicit version bumps), and makes invalidation failures visible (stale preview would keep showing an old version number).
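
A minimal sketch of the in-memory LRU variant, keyed by (session_id, state_version) as described above. The class name `PreviewCache` and the `generate` callback are assumptions for illustration; in the real service the callback would be the LLM-backed generator.

```python
from collections import OrderedDict
from typing import Callable


class PreviewCache:
    """Tiny in-memory LRU keyed by (session_id, state_version)."""

    def __init__(self, max_entries: int = 256) -> None:
        self._entries: OrderedDict = OrderedDict()
        self._max_entries = max_entries

    def get_or_generate(
        self, session_id: str, state_version: int, generate: Callable[[], str]
    ) -> str:
        key = (session_id, state_version)
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        markdown = generate()  # the expensive LLM call in the real service
        self._entries[key] = markdown
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict least recently used
        return markdown
```

No explicit invalidation call is needed: a version bump changes the key, so the stale entry is simply never requested again and eventually ages out.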


6. Services to implement

6.1 FactSynthesisService (new)

Location: backend/app/services/fact_synthesis_service.py

Purpose: Converts question answers and diagnostic check results into candidate facts. Called by unified_chat_service's marker parser when the LLM emits a [PROMOTE] marker, and by explicit engineer action.

Key methods:

  • synthesize_from_question(question_ref: UUID, raw_answer: str) -> dict — returns {proposed_text, proposed_summary} via LLM call. The summary is the short provenance label ("rules out tenant/license").
  • synthesize_from_check(check_ref: UUID, check_output: str) -> dict — same pattern for diagnostic check output.
  • create_fact(session_id, source_type, source_ref, text, summary, user_id) -> SessionFact — persists the fact, increments ai_sessions.state_version.

Prompt engineering note: The synthesis prompt must be conservative. Hallucinated specifics are a trust-killer and would be particularly damaging because facts feed into the resolution note that gets posted to customer tickets. The prompt must explicitly instruct: "Use only information present in the answer/output. If the answer does not contain a substantive fact, return null."

6.2 ResolutionNoteGeneratorService (new)

Location: backend/app/services/resolution_note_generator.py

Purpose: Produces the structured resolution note markdown from session state.

Input: session_id

Output: {markdown: str, target_ticket_ref: str | None}

Template structure:

## Problem
{ai-synthesized one-paragraph problem statement, pulling from session description + incident header}

## What we confirmed
{bulleted list of session_facts, grouped by source_type}

## Root cause
{ai-synthesized from the active suggested fix + facts}

## Resolution
{description of the fix applied, parameters used if a script ran, outcome}

The service pulls from four data sources: ai_sessions, session_facts, session_suggested_fixes (active), and script_generations (if scripts ran during the session). Passwords in script_generations.parameters_used must be redacted (already an existing Script Generator pattern).
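
To make the fixed structure and the redaction requirement concrete, here is a sketch of the final assembly step, after the AI-synthesized pieces already exist as strings. Both function names, the `[REDACTED]` placeholder, and the key-to-type schema shape are assumptions; the existing Script Generator redaction pattern is not reproduced here and may work differently.

```python
REDACTED = "[REDACTED]"


def redact_parameters(parameters: dict[str, str], schema: dict[str, str]) -> dict[str, str]:
    """Replace values whose declared type is 'password' before they reach the note.

    schema maps parameter key -> parameter type (hypothetical shape).
    """
    return {
        key: REDACTED if schema.get(key) == "password" else value
        for key, value in parameters.items()
    }


def render_resolution_note(
    problem: str, facts: list[str], root_cause: str, resolution: str
) -> str:
    """Assemble the fixed four-section markdown structure."""
    confirmed = "\n".join(f"- {fact}" for fact in facts)
    return (
        f"## Problem\n{problem}\n\n"
        f"## What we confirmed\n{confirmed}\n\n"
        f"## Root cause\n{root_cause}\n\n"
        f"## Resolution\n{resolution}\n"
    )
```

Redacting before rendering, rather than scrubbing the finished markdown, keeps secrets out of both the cached preview and the text sent to the PSA.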

Caching: keyed by (session_id, ai_sessions.state_version) per Section 5.5. Debounced client-side at 500ms.

6.3 EscalationPackageGeneratorService (new)

Location: backend/app/services/escalation_package_generator.py

Same structure as ResolutionNoteGenerator but with a handoff-oriented template:

## Problem
...

## What we've confirmed
...

## What we've tried
{list of diagnostic_checks run with their outcomes, scripts generated}

## Current hypothesis
{active suggested fix description}

## Suggested next steps
{ai-synthesized from the gap between facts and a complete resolution}

Same caching and invalidation model.

6.4 TemplateExtractionService (new)

Location: backend/app/services/template_extraction_service.py

Purpose: Given a concrete rendered script and session context, propose a parameterization.

Input: {script_body: str, session_context: dict, ticket_context: dict}

Output: {parameters: [{key, label, type, inferred_from}], templated_body: str}

Implementation approach:

  • LLM call with a structured prompt: "Given this script that resolved a ticket, identify values that would change for a different invocation. Propose a parameter schema following the Script Generator conventions (text / password / select / boolean / multi_text / number / textarea)."
  • Post-process to ensure the proposed template renders back to the original script when given the extracted parameter values.
  • Conservative default: prefer fewer parameters. If a value looks environment-agnostic (e.g. a command name), don't parameterize it.

This service is the engine behind Option 2 and Option 3 of the three-option dialog, and behind the post-resolve templatize prompt.
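
The post-processing step, checking that the proposed template renders back to the original script, can be sketched as a round-trip test. This assumes the `{{ key }}` placeholder syntax shown in Section 3.4 and uses a plain regex substitution; the actual ScriptTemplateEngine renderer may behave differently, and both function names are hypothetical.

```python
import re

# Matches {{ key }} with optional inner whitespace.
PLACEHOLDER_RE = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_template(templated_body: str, values: dict[str, str]) -> str:
    """Substitute {{ key }} placeholders; raises KeyError on a missing value."""
    return PLACEHOLDER_RE.sub(lambda match: values[match.group(1)], templated_body)


def round_trips(original_script: str, templated_body: str, values: dict[str, str]) -> bool:
    """True when the template plus extracted values reproduces the original script."""
    try:
        return render_template(templated_body, values) == original_script
    except KeyError:
        return False
```

A proposal that fails this check should be rejected or re-requested rather than saved, since a template that cannot reproduce the script that actually resolved the ticket carries no evidence that it works.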

6.5 Extend PSAWritebackService (existing)

Add methods using the existing PSA provider registry and post_note seam:

  • post_resolution_note(session_id, markdown) -> {external_id, posted_at}
  • post_escalation_package(session_id, markdown) -> {external_id, posted_at}
  • transition_ticket_status(ticket_ref, new_status) -> {success, verified_status}

The transition_ticket_status method must verify by re-fetching the status after the transition attempt. Failed verification is surfaced as an error, not silent success (per the existing ConnectWise integration principle).
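
A sketch of the verify-by-re-fetch rule. The `PSAProvider` protocol and its `set_status` / `get_status` methods are hypothetical stand-ins for whatever the existing PSA provider registry exposes; only the return shape comes from the method list above.

```python
from typing import Protocol


class PSAProvider(Protocol):  # hypothetical interface, not the real registry API
    def set_status(self, ticket_ref: str, new_status: str) -> None: ...
    def get_status(self, ticket_ref: str) -> str: ...


class StatusVerificationError(RuntimeError):
    """Raised when the PSA does not report the status we just tried to set."""


def transition_ticket_status(provider: PSAProvider, ticket_ref: str, new_status: str) -> dict:
    provider.set_status(ticket_ref, new_status)
    verified = provider.get_status(ticket_ref)  # re-fetch instead of trusting the write
    if verified != new_status:
        raise StatusVerificationError(
            f"{ticket_ref}: expected {new_status!r}, PSA reports {verified!r}"
        )
    return {"success": True, "verified_status": verified}
```

Raising instead of returning `success: False` is one way to make sure a failed verification cannot be ignored by a caller that forgets to inspect the result.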

6.6 Model and capability selection per service

Each AI-calling service must read its model name and MCP flag from application settings, not from hardcoded values. Use these defaults:

# Model tier per service
FACT_SYNTHESIS_MODEL          = "claude-haiku-4-5-20251001"  # short transformation, latency-sensitive
RESOLUTION_NOTE_MODEL         = "claude-sonnet-4-6"          # customer-facing artifact, quality matters
ESCALATION_PACKAGE_MODEL      = "claude-sonnet-4-6"          # same
TEMPLATE_EXTRACTION_MODEL     = "claude-sonnet-4-6"          # creates persistent library artifact
MAIN_CONVERSATION_MODEL       = "claude-sonnet-4-6"          # primary FlowPilot chat

# MCP availability per service (true = this service can use MCP tools when available)
FACT_SYNTHESIS_MCP_ENABLED       = False  # fast transformation, no external lookup needed
RESOLUTION_NOTE_MCP_ENABLED      = False  # summarizing existing state, not researching
ESCALATION_PACKAGE_MCP_ENABLED   = False  # same
TEMPLATE_EXTRACTION_MCP_ENABLED  = False  # purely transforms an existing script
MAIN_CONVERSATION_MCP_ENABLED    = True   # interactive troubleshooting, grounding matters
SCRIPT_GENERATOR_MCP_ENABLED     = True   # Microsoft Learn for documentation grounding

Do not hardcode model or MCP strings at call sites. Every new service reads from settings with a service-specific key.

Instrumentation: log a disputed_fact_rate metric for fact synthesis — the percentage of AI-synthesized facts that engineers subsequently edit or delete. If this exceeds 10% over a 500-session window, escalate FACT_SYNTHESIS_MODEL to claude-sonnet-4-6. If under 5%, Haiku is performing correctly.
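
The metric and its thresholds can be sketched as below. The function names and the `"monitor"` outcome are assumptions: the text above defines what happens above 10% and below 5% but does not say how the band in between should be treated.

```python
def disputed_fact_rate(synthesized: int, edited_or_deleted: int) -> float:
    """Percentage of AI-synthesized facts that engineers later edited or deleted."""
    if synthesized == 0:
        return 0.0
    return 100.0 * edited_or_deleted / synthesized


def model_recommendation(rate_pct: float) -> str:
    """Map the rate over the measurement window to an action."""
    if rate_pct > 10.0:
        return "escalate"  # move FACT_SYNTHESIS_MODEL to the Sonnet tier
    if rate_pct < 5.0:
        return "keep"      # Haiku is performing correctly
    return "monitor"       # assumption: the 5-10% band is unspecified in this doc
```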

Do not use Opus 4.7 for any of these services at current scale.


7. Frontend components

7.1 Routes to change

| Current route | New route | Action |
| --- | --- | --- |
| /assistant | /pilot | Move existing AssistantChatPage to /pilot. |
| /assistant/:sessionId | /pilot/:sessionId | Session deep links must redirect with the session ID preserved. |
| /assistant (bare) | /pilot | Permanent 301 redirect. No sunset date. |
| /assistant/:sessionId (deep) | /pilot/:sessionId | Permanent 301 redirect. |

The sidebar nav entry is renamed from "ResolutionAssist" to "FlowPilot" and uses the cockpit icon. Command palette entries, dashboard cards, and session list links that previously pointed to /assistant all update to /pilot.

7.2 New React components

Under src/components/pilot/:

TaskLane.tsx                  -- The right-side panel, owns all four sections
  sections/
    WhatWeKnow.tsx            -- New component for the facts list
    WhatWeKnowItem.tsx        -- Single fact card with provenance line
    AddNoteButton.tsx         -- "+ Add a note" inline composer
    Questions.tsx             -- Existing questions rendering (moved/refactored from current location)
    DiagnosticChecks.tsx      -- Existing checks rendering (moved/refactored from current location)
    SuggestedFix.tsx          -- New or refactored component for the suggested fix card
ResolveButton.tsx             -- The Resolve CTA at the bottom of the task lane
ResolutionNotePreview.tsx     -- Floating popover anchored to Resolve button
EscalatePackagePreview.tsx    -- Same pattern for Escalate

ScriptGenInline/              -- Script Generator embedded in session context
  TemplateMatchPanel.tsx      -- Scene 1 mockup: template pre-filled
  NoTemplateDialog.tsx        -- Scene 2 mockup: three-option dialog
  TemplatizePrompt.tsx        -- Scene 3 mockup: post-resolve prompt
  ParameterizationPreview.tsx -- Shared component: script with highlighted params

Existing component folders (e.g., src/components/assistant/) may be renamed opportunistically, but behavior and route migration matter more than directory-name purity.

7.3 Component behavior contracts

WhatWeKnowItem

  • Props: {fact: SessionFact, onEdit, onDelete}
  • Renders the fact text, a green checkmark, and the provenance line with source-type color coding
  • Edit affordance: only shown when fact.source_type is user_note or ai_synthesis. Question/check facts are read-only at the card level (edit the source question/check instead).
  • Delete affordance: shown for all facts (soft-delete via DELETE endpoint)

TaskLane

  • Subscribes to a session state hook that polls for fact / question / check / suggested-fix updates
  • On any state change (state_version increment), calls POST /api/v1/ai-sessions/{id}/resolution-note/preview to refresh the ResolutionNotePreview
  • Debounce preview refresh to 500ms to avoid LLM spam

NoTemplateDialog (three-option dialog)

  • Props: {suggestedFix, onDecision}
  • Renders the three cards with the middle (draft_template) marked as recommended
  • onDecision posts to /api/v1/ai-sessions/{id}/suggested-fixes/{fix_id}/decision and either opens the Script Generator (one_off / draft_template) or navigates to full template creation (build_template)

TemplatizePrompt

  • Rendered after successful Resolve when a draft template exists for the session AND account_settings.preferences.templatize_prompt_enabled is not false
  • Fetches proposed parameters from the draft template record
  • Save button posts to /api/v1/draft-templates/{id}/accept

8. AI prompt changes

The existing FlowPilot / ResolutionAssist system prompt needs updates to emit the new markers. Parser lives in unified_chat_service alongside the existing [QUESTIONS] / [DIAGNOSTIC_CHECKS] parsing — do not create a separate marker pipeline.

8.1 New marker: [PROMOTE]

Used to surface facts to What we know. Syntax — each block contains a single JSON object; multiple blocks may appear in one response:

[PROMOTE]
{"source_type": "question", "source_ref": "{task_lane_item_uuid}", "text": "OWA login and send/receive confirmed working for jsmith", "summary": "rules out tenant/license"}
[/PROMOTE]

The AI should emit [PROMOTE] blocks in the same message that answers or processes a question/check, so the fact appears in What we know simultaneously with the chat acknowledgment. source_ref points to the stable UUID of the task lane item being promoted (assigned in Phase 2). For source_type: "ai_synthesis", omit source_ref (or set it to null) — the parser drops it defensively even if the model includes one.
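A minimal sketch of how the [PROMOTE] blocks described above might be extracted from a model response. The function name and the choice to skip malformed JSON silently are assumptions; the real parser lives in unified_chat_service and may differ.

```python
import json
import re

# One JSON object per [PROMOTE] ... [/PROMOTE] block; several blocks per message.
PROMOTE_RE = re.compile(r"\[PROMOTE\]\s*(\{.*?\})\s*\[/PROMOTE\]", re.DOTALL)


def parse_promote_blocks(message: str) -> list[dict]:
    """Return the fact payloads found in a single AI message."""
    facts = []
    for match in PROMOTE_RE.finditer(message):
        try:
            payload = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # assumption: malformed blocks are skipped, not fatal
        if payload.get("source_type") == "ai_synthesis":
            # Drop source_ref defensively even if the model included one.
            payload["source_ref"] = None
        facts.append(payload)
    return facts
```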

8.2 New marker: [SUGGEST_FIX]

[SUGGEST_FIX]
title: Clear cached credentials + rebuild Outlook profile
description: Stale cached credential in Credential Manager is holding the pre-reset token...
confidence: 94
script_template_slug: clear-outlook-credentials   # or omitted if no template match
ai_drafted_script: |                              # only if no template match
  # Generated by FlowPilot...
  ...
[/SUGGEST_FIX]

Emitting a new [SUGGEST_FIX] supersedes any existing active fix for the session (sets superseded_at on the old row).
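The supersession rule can be illustrated with an in-memory stand-in for session_suggested_fixes. The `SuggestedFix` dataclass and both helper names are hypothetical; only the superseded_at behavior comes from the rule above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class SuggestedFix:
    """Hypothetical in-memory stand-in for a session_suggested_fixes row."""
    title: str
    superseded_at: Optional[datetime] = None


def add_suggested_fix(fixes: list[SuggestedFix], new_fix: SuggestedFix) -> None:
    """Append new_fix and mark any previously active fix as superseded."""
    now = datetime.now(timezone.utc)
    for fix in fixes:
        if fix.superseded_at is None:
            fix.superseded_at = now
    fixes.append(new_fix)


def active_fix(fixes: list[SuggestedFix]) -> Optional[SuggestedFix]:
    """Return the single fix that has not been superseded, if any."""
    return next((f for f in fixes if f.superseded_at is None), None)
```

With this shape, at most one fix per session is active at any time, which is what the /suggested-fixes/active endpoint relies on.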

8.3 Removed markers

The old [FORK] marker from the ResolutionAssist prompt is removed. Forks were a Guided-mode concept; in the unified model, they're replaced by Questions with mutually exclusive answer options.


9. Implementation phases

Each phase ends with a git commit and verification step. Do not advance to the next phase until verification passes (or, for Phase 0, the verification step is explicitly deferred to the new dev environment with a tracking TODO).

Phase 0 — Prompt caching infrastructure (prerequisite)

A codebase audit revealed that prompt caching was only implemented in assistant_chat_service.py (the file being deprecated). Every other Anthropic API call site — including all of FlowPilot's 7 call sites through AnthropicProvider — was uncached. Phase 0 must land before Phase 2 starts because new services built in Phase 2 will inherit caching from AnthropicProvider automatically once it's fixed.

Deliverables:

  • 0.1 — Cached system-block support in AnthropicProvider. Convert AnthropicProvider.generate_json() and generate_text_stream() signatures to accept system_prompt: str | list[SystemBlock]. Plain string = uncached (backward compatible). List = cached using policy α: if the caller marks cache_control on any block, honor those markers; if no block has cache_control, cache the first block only by default. For streaming, capture the final usage object via get_final_message(). Log cache_read_input_tokens and cache_creation_input_tokens on every response.

  • 0.2 — Pending target endpoint. The /tickets/ai-parse endpoint described in the original migration doc does not exist in the codebase. No code change in Phase 0. When this endpoint is built, apply the cached-system-block pattern:

    system_blocks = [
        {"type": "text", "text": members_json, "cache_control": {"type": "ephemeral"}},
        # cacheable: team-stable
        {"type": "text", "text": boards_json, "cache_control": {"type": "ephemeral"}},
        # cacheable: team-stable
        {"type": "text", "text": engineer_description},
        # uncached: per-request
    ]
    

    Remove this note when the endpoint is implemented and the pattern applied.

  • 0.3 — Opt-in caching for one-shot generators. Add cache_control to the static system prompt in ai_tree_generator_service, kb_conversion_service, ai_fix_service, and script_builder_service. Pattern: single-block list with policy α auto-caching the first (and only) block. Per-block inline comment explaining cacheability. For script_builder (multi-turn): cache only the system prompt; conversation history stays uncached in this phase. Retries in ai_tree_generator.generate_branch_detail inherit the cache automatically — no special handling.

  • 0.4 — Consolidate the MCP-capable chat path. Rename _call_anthropic_cached to chat_call_cached() in assistant_chat_service (or move to a shared module; implementer's choice based on cleanest structure). Refactor it to delegate cached-system-block plumbing to AnthropicProvider. MCP + image + beta-endpoint logic stays inside the chat wrapper — do NOT push MCP into AnthropicProvider, which is a provider-agnostic abstraction (Gemini has no MCP). Document that this wrapper is the one MCP-using caller, the exception not the rule. Track MCP unification as a separate future ticket.

  • 0.5 — MCP telemetry. Add counters for: (a) turns where MCP was available, (b) turns where the model actually invoked an MCP tool, (c) turns where the silent-retry-without-MCP fallback was triggered, (d) which MCP tool names got called. Log to whatever telemetry path exists (PostHog if wired up, otherwise structured logs). This gives us real data by the time Phase 2+ decisions about MCP investment are made. Do 0.5 first or alongside 0.1 — don't save it for last.
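Policy α from deliverable 0.1 can be sketched as a normalization step that runs before the request is built. The `build_system_blocks` name and the `SystemBlock` alias are assumptions for illustration; the block shape follows the cache_control pattern already shown in 0.2.

```python
from typing import Union

SystemBlock = dict  # hypothetical alias for {"type": "text", "text": ..., ...}


def build_system_blocks(system_prompt: Union[str, list[SystemBlock]]) -> list[SystemBlock]:
    """Normalize system_prompt into blocks, applying policy α."""
    if isinstance(system_prompt, str):
        # Plain string: uncached, backward compatible.
        return [{"type": "text", "text": system_prompt}]
    blocks = [dict(b) for b in system_prompt]  # copy so the caller's list is untouched
    if blocks and not any("cache_control" in b for b in blocks):
        # No explicit markers: cache the first block only by default.
        blocks[0]["cache_control"] = {"type": "ephemeral"}
    return blocks
```

If any block carries cache_control, the list is returned as given, so call sites with deliberate uncached blocks keep full control.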

Per-call-site comment pattern for multi-block lists:

When a call site passes more than one block to system_prompt, add a one-line comment next to EACH block — including uncached ones — explaining why it is or isn't cached. The absence of a marker deserves documentation as much as the presence of one, because it tells the next dev you made a conscious choice.

Verification:

  • Hit any FlowPilot endpoint twice within 5 minutes. First call shows cache_creation_input_tokens > 0, second call shows cache_read_input_tokens > 0.
  • If the second call returns zero cache reads, inspect the prefix for silent invalidators (timestamps, unsorted JSON keys, varying tool list ordering). Fix before proceeding.
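Two of the silent invalidators named above (unsorted JSON keys and varying tool ordering) can be removed by serializing the cached prefix deterministically. A minimal sketch, assuming tool definitions are dicts with a "name" key; both helper names are hypothetical.

```python
import json


def stable_json(value) -> str:
    """Serialize with sorted keys and fixed separators so equal data yields equal bytes."""
    return json.dumps(value, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def stable_tools(tools: list[dict]) -> list[dict]:
    """Return tool definitions in a fixed order (by name) so the prefix does not shift."""
    return sorted(tools, key=lambda tool: tool["name"])
```

Timestamps are the third invalidator and have no serialization fix: they need to stay out of cached blocks entirely.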

Verification is deferred to the new dev environment. Phase 0 code commits without live verification because no running environment exists at authoring time. A TODO(phase-0-verification) inline comment in the caching module names the verification steps. Execute verification when the new env is up; if it fails, that is a debug task then, not a blocker now.

git commit -m "feat(ai): promote AnthropicProvider to cached pattern, consolidate caching implementation"

Dependencies:

  • Phase 1 (route rename and schema) can run in parallel with Phase 0.
  • Phase 2 (What we know) must not start until Phase 0 is complete and verification has passed (or been explicitly deferred with a tracked issue).

Phase 1 — Data model and route rename (can run in parallel with Phase 0)

Deliverables:

  • Alembic migration after current repo head creating: session_facts, session_suggested_fixes, draft_templates, account_settings; column additions to ai_sessions (including state_version), script_templates.
  • All new tenant-scoped tables have account_id and RLS policies using the repo's app.current_account_id policy pattern.
  • SQLAlchemy models for each new table. AccountSettings model includes get_setting(key, default) and set_setting(key, value) helpers; lazy row creation on first write.
  • Route move: AssistantChatPage component mounted at /pilot and /pilot/:sessionId.
  • Permanent 301 redirect: /assistant → /pilot, /assistant/:sessionId → /pilot/:sessionId (preserving session ID).
  • Sidebar nav entry renames from "ResolutionAssist" / "AI Assistant" to "FlowPilot". Command palette entries, dashboard cards, and session list links update to /pilot.
  • No Phase 2 UI changes yet (no task lane restructuring, no What we know section).
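The AccountSettings helper contract in the deliverables above (default on read, lazy row creation on first write) can be illustrated without SQLAlchemy. The `AccountSettingsStore` class is a hypothetical in-memory stand-in, not the model itself, and it takes account_id explicitly where the real helpers would be model methods.

```python
from typing import Any


class AccountSettingsStore:
    """In-memory stand-in for the account_settings table (hypothetical)."""

    def __init__(self) -> None:
        self._rows: dict[str, dict[str, Any]] = {}  # account_id -> preferences

    def get_setting(self, account_id: str, key: str, default: Any = None) -> Any:
        row = self._rows.get(account_id)
        if row is None:
            return default  # reads never create a row
        return row.get(key, default)

    def set_setting(self, account_id: str, key: str, value: Any) -> None:
        row = self._rows.setdefault(account_id, {})  # lazy row creation on first write
        row[key] = value

    def has_row(self, account_id: str) -> bool:
        return account_id in self._rows
```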

Verification:

  • Run migration on a fresh dev database — succeeds.
  • Downgrade migration succeeds (reversibility).
  • RLS grep/check passes for new tables.
  • /assistant redirects to /pilot (301).
  • /assistant/:sessionId redirects to /pilot/:sessionId with ID preserved.
  • /pilot renders the existing chat UI with the sidebar now reading "FlowPilot".
  • No Phase 2 UI introduced.

git commit -m "feat(pilot): rename /assistant to /pilot, add session_facts/suggested_fixes/draft_templates/account_settings schema"

Phase 2 — What we know (task lane + service + API)

Deliverables:

  • Stable-UUID assignment for pending_task_lane items. When questions/checks are persisted (or when a legacy session is loaded), each item receives a UUID written back into the JSON. This is a prerequisite for session_facts.source_ref to point anywhere reliable. Handle in-flight sessions gracefully — sessions open during deploy may have unstable IDs until their next save.
  • FactSynthesisService per Section 6.1, with its LLM prompt.
  • Fact CRUD API endpoints per Section 5.1.
  • WhatWeKnow, WhatWeKnowItem, AddNoteButton, TaskLane components under src/components/pilot/.
  • Task lane layout adjustment: What we know section renders above Questions.
  • Counter in task lane header updates to X / Y answered format.
  • AI system prompt updated to emit [PROMOTE] markers; unified_chat_service marker parser extended to handle them.
  • Fact editability enforcement: API returns 403 on PATCH of question or diagnostic_check-sourced facts. UI hides the edit affordance for those facts.
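Stable-UUID assignment for pending_task_lane items can be sketched as an idempotent pass over the JSON. The "questions", "checks", and "id" key names are assumptions about the JSON shape, and the function name is hypothetical.

```python
import uuid


def ensure_item_ids(task_lane: dict) -> bool:
    """Assign a UUID to every question/check that lacks one.

    Returns True when something changed, so the caller knows to write
    the JSON back. Items that already have an id are left untouched.
    """
    changed = False
    for section in ("questions", "checks"):
        for item in task_lane.get(section, []):
            if not item.get("id"):
                item["id"] = str(uuid.uuid4())
                changed = True
    return changed
```

Running the same pass on load and on save covers legacy sessions, since the second call is a no-op once IDs exist.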

Verification:

  • Open a session, answer a question; within 2 seconds a fact appears in What we know with correct provenance.
  • Click "+ Add a note", type a manual fact, confirm it persists with source_type: user_note.
  • Run a diagnostic check, confirm the check result promotes to a fact.
  • Facts persist across page reloads.
  • RLS: a user from a different account cannot read or write facts for this session.
  • Attempt to PATCH a question-sourced fact → 403.
  • PATCH a user_note fact → succeeds.

Verification deferred — same constraint as Phase 0: no live dev environment was available at authoring time. Backend pytest suite (tests/test_session_facts_api.py) and the manual scenarios above must run when the dev env is up. Failures should be treated as normal bugs, not blockers for Phase 3.

git commit -m "feat(pilot): add What we know section with fact synthesis and stable task-lane item IDs"

Phase 3 — Suggested fix + resolution note preview

Deliverables:

  • session_suggested_fixes API endpoints per Section 5.2 and data flow.
  • SuggestedFix component in the task lane.
  • AI system prompt updated to emit [SUGGEST_FIX] markers; parser handles supersession.
  • ResolutionNoteGeneratorService per Section 6.2 and preview endpoint per Section 5.4.
  • ResolutionNotePreview floating popover anchored to Resolve button.
  • Preview refreshes on fact / suggested-fix / script-generation changes via state_version increment. Client-side 500ms debounce.
  • Preview cache keyed by (session_id, state_version) per Section 5.5.
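The (session_id, state_version) cache key can be illustrated with a small in-memory cache. This is a sketch under stated assumptions: the `PreviewCache` class and its `get_or_generate` method are hypothetical, and the real cache storage is defined in Section 5.5.

```python
from typing import Callable


class PreviewCache:
    """In-memory preview cache keyed by (session_id, state_version)."""

    def __init__(self) -> None:
        self._store: dict[tuple[str, int], str] = {}

    def get_or_generate(
        self, session_id: str, state_version: int, generate: Callable[[], str]
    ) -> tuple[str, bool]:
        """Return (markdown, from_cache); call generate only on a miss."""
        key = (session_id, state_version)
        if key in self._store:
            return self._store[key], True
        markdown = generate()
        self._store[key] = markdown
        return markdown, False
```

Because the version is part of the key, incrementing state_version invalidates the old entry implicitly; no explicit delete is needed.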

Verification:

  • Session with ≥3 facts and an active suggested fix shows a populated Resolve preview.
  • Editing a fact updates the preview within 1 second.
  • Preview markdown renders correctly with all four sections (Problem / What we confirmed / Root cause / Resolution).
  • Preview contains no hallucinated information not present in session state (human review of 5 real-ish sessions).
  • Incrementing state_version invalidates the preview cache; reading the same version returns the cached markdown.

Verified end-to-end against the dev stack on 2026-04-22:

  • /suggested-fixes/active → 404 when no fix; 200 with payload when one exists.
  • Fact write bumps state_version; preview cache invalidates as expected.
  • Sonnet generates well-formed four-section markdown grounded only in provided facts (single-fact session correctly says "Root cause not definitively isolated").
  • Second consecutive preview call with no state change returns from_cache=true and emits no LLM call.

git commit -m "feat(pilot): add suggested fix tracking and Resolve note preview with state_version caching"

Phase 4 — Resolve and Escalate PSA writebacks

Deliverables:

  • transition_ticket_status method on PSAWritebackService with CW re-fetch verification.
  • post_resolution_note endpoint and CW integration via existing PSA provider registry + post_note seam.
  • Resolve button flow: engineer edits preview → Confirm & post → server posts to PSA → stores {external_id, posted_at} → transitions status → verifies status → marks session resolved → shows templatize prompt if applicable.
  • EscalationPackageGeneratorService and parallel flow for Escalate, including CW routing rules.
  • Local-only path: resolving or escalating a session with no linked PSA ticket stores markdown locally and marks the session state without external posting.
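The re-fetch verification in transition_ticket_status can be sketched against a fake provider. `FakePSAProvider`, its two methods, and `StatusVerificationError` are hypothetical names for illustration; the real method lives on PSAWritebackService and talks to ConnectWise.

```python
class StatusVerificationError(Exception):
    """Raised when the PSA does not report the status we just set."""


class FakePSAProvider:
    """Test double; accept_updates=False simulates a silent rejection."""

    def __init__(self, accept_updates: bool = True) -> None:
        self.accept_updates = accept_updates
        self._status: dict[str, int] = {}

    def update_status(self, ticket_id: str, status_id: int) -> None:
        if self.accept_updates:
            self._status[ticket_id] = status_id
        # A silent rejection returns normally without changing anything.

    def fetch_status(self, ticket_id: str):
        return self._status.get(ticket_id)


def transition_ticket_status(provider, ticket_id: str, status_id: int) -> None:
    """Update the status, then re-fetch and fail loudly if it did not stick."""
    provider.update_status(ticket_id, status_id)
    actual = provider.fetch_status(ticket_id)
    if actual != status_id:
        raise StatusVerificationError(
            f"ticket {ticket_id}: expected status {status_id}, found {actual}"
        )
```

The point of the second read is that a successful update call alone is not treated as proof the change was applied.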

Verification:

  • Complete a session end-to-end with a ConnectWise test instance.
  • Click Resolve, edit the preview, confirm post — verify the note appears in CW and status changes to Resolved (verified by re-fetch).
  • Click Escalate on a different session — verify the package is posted and the ticket routes correctly.
  • Simulate CW silently rejecting a status change — verify the app surfaces an error, not silent success.
  • Attempt to Resolve without a linked PSA ticket — session marks resolved locally without erroring; markdown stored in resolution_note_markdown.

Verified on 2026-04-22:

  • Local-only Resolve + Escalate confirmed end-to-end against the dev stack (no PSA instance wired): markdown stored, session.status flips, 409 on re-post.
  • Escalation-package preview generates well-formed five-section markdown from a single fact (real Sonnet); second preview call with no state change returns from_cache=true, confirming the cache-kind separation from resolution-note previews.
  • PSA post + status-verification paths covered by mocked-provider pytest cases: happy path, silent-rejection → 502 with clear detail, skipped transition when cw_resolved_status_id unset, internal-analysis note type used for escalation handoffs. Live CW round-trip still TODO once a test instance is wired.

git commit -m "feat(pilot): wire Resolve and Escalate to ConnectWise writeback with status verification"

Phase 5 — Script Generator inline integration

Deliverables:

  • ScriptGenInline/TemplateMatchPanel — when suggested fix has script_template_id, clicking the fix opens this panel with parameters pre-filled from session facts, ticket context (company configs), and AI-suggested values in the [SUGGEST_FIX] marker.
  • ScriptGenInline/NoTemplateDialog — three-option dialog when no template match.
  • User decision persisted on session_suggested_fixes.user_decision.
  • TemplateExtractionService for generating parameterization proposals (Section 6.4).
  • Script generation flow produces a script_generations record linked to the session via existing script_generations.ai_session_id; increments state_version.
  • ⌘K → "script" opens the inline generator from the FlowPilot session. No Resolve keyboard shortcut is added (browsers intercept ⌘R; decided against alternatives).
  • Script Generator inherits MCP access for Microsoft Learn lookups via the chat_call_cached wrapper (Phase 0.4), not via AnthropicProvider directly.

Verification:

  • Session with a template-matched suggested fix: clicking opens generator with ≥2 pre-filled parameters.
  • Session with a custom script suggested fix: dialog appears with three options, script preview shows parameters highlighted.
  • All three paths end correctly: one-off generates and closes, draft_template creates draft_templates row and generates, build_template opens full template creation.
  • ⌘K → "script" anywhere in a session opens the generator directly.
  • Edge case: if the suggested fix's script_template_id points at a template that has been deleted, show the no-template three-option dialog with the AI-drafted script (do not error).

Verified on 2026-04-22:

  • one_off returns rendered_script, no draft persisted.
  • draft_template returns rendered_script + draft_template_id; real Sonnet-driven TemplateExtractionService persists a draft_templates row with the fix's title pre-filled and status=pending.
  • build_template returns redirect_path=/scripts/builder?from_session=…&fix=….
  • Conservative extraction default works: a script with environment-agnostic cmdlets (cmdkey, Restart-Process) yielded zero proposed parameters as intended by the "prefer fewer parameters" rule.
  • TemplateMatchPanel falls back gracefully on 404 (deleted template) by surfacing a panel-level message; the engineer can dismiss the fix and re-trigger the AI for a fresh suggestion.
  • Cmd+K → "Open inline Script Generator" surfaces only when on a /pilot/:id route; fires a window event the chat page subscribes to. No Resolve shortcut added (per Section 14 decision).

git commit -m "feat(pilot): integrate Script Generator inline with suggested fixes"

Phase 6 — Post-resolve templatize prompt

Deliverables:

  • TemplatizePrompt component.
  • Show logic: after successful Resolve, show only when ALL of:
    1. account_settings.preferences.templatize_prompt_enabled is not false (default true when absent)
    2. Session has pending draft_templates rows
    3. The user chose draft_template on the original three-option dialog
  • Accept flow creates a new script_templates row with source_session_id, source_user_id, source_ticket_ref set. Updates draft to status='accepted', promoted_template_id set.
  • Reject flow updates draft to status='rejected'.
  • "Don't ask me again for this team" writes {"templatize_prompt_enabled": false} to account_settings.preferences.
  • Script Library sidebar shows a pending-drafts badge/count for the account.
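The three-part show logic for the templatize prompt reduces to one predicate. A minimal sketch, assuming the caller has already loaded the preferences dict, the pending draft count, and the stored user_decision; the function name is hypothetical.

```python
from typing import Optional


def should_show_templatize_prompt(
    preferences: dict, pending_draft_count: int, user_decision: Optional[str]
) -> bool:
    """True only when all three show conditions hold."""
    # Condition 1: enabled unless explicitly false (default true when absent).
    enabled = preferences.get("templatize_prompt_enabled", True) is not False
    # Condition 2: the session has at least one pending draft_templates row.
    has_pending_draft = pending_draft_count > 0
    # Condition 3: the engineer picked draft_template on the three-option dialog.
    chose_draft = user_decision == "draft_template"
    return enabled and has_pending_draft and chose_draft
```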

Verification:

  • Resolve a session where the engineer picked Option 2 → templatize prompt appears with AI-proposed parameters.
  • Accept the prompt → new template appears in the Script Library with the provenance chip.
  • Skip the prompt → draft marked rejected, Script Library shows no new template.
  • Toggle "don't ask me again" → next session Resolve skips the prompt even with a pending draft.

Verified on 2026-04-22:

  • GET /draft-templates?pending_only=true returns pending rows; filter flips the set to include accepted/rejected for audit views.
  • POST /{id}/accept → creates script_templates row; source_session_id, source_user_id, source_ticket_ref (e.g. "CW #99123") copied from the source session so the Script Library provenance chip has its data. Draft flips to status='accepted', promoted_template_id populated, resolved_at set. 409 on a re-accept.
  • POST /{id}/reject → flips to status='rejected', resolved_at set.
  • GET /accounts/me/preferences → empty dict when no row; PATCH merges keys into preferences JSONB (verified round-trip persistence of templatize_prompt_enabled: false).
  • Sidebar Scripts nav gains a badge reflecting the pending draft count (fetched independently of the main sidebar stats endpoint so a draft-endpoint failure doesn't break the rest of the sidebar).

git commit -m "feat(pilot): add post-resolve templatize prompt for draft templates"

Phase 7 — Polish

Deliverables:

  • Visual polish against the mockup HTML source files (spacing, colors, typography, component structure). Use PNGs for visual target confirmation.
  • Loading states for: fact synthesis, preview generation, template extraction, PSA post/verify, script generation.
  • Empty states: no facts yet, no questions, no checks, no active suggested fix, no pending draft templates.
  • Keyboard shortcuts (no Resolve shortcut): ⌘K (command palette), ⌘↵ (send composer), ⌘G (script generator).
  • Responsive: at widths below 1200px, task lane collapses into a bottom drawer.
  • Use existing design tokens where present; add missing tokens only if needed to match the mockups.

Verification:

  • Major screens visually compare within tolerance against the mockup PNG files.
  • No horizontal scroll at 1280px viewport.
  • Keyboard shortcuts documented in-app via ? overlay.
  • Shortcuts do not conflict with browser reload.

git commit -m "feat(pilot): visual polish, empty/loading states, keyboard shortcuts"

Phase 8 — Fix Outcome Banner

Plan and rationale: phase-8-fix-outcome-banner.md

Mockups: mockups/06-slide-up-banner.html, mockups/07-verify-states.html

What this phase does: Removes the SuggestedFix card as the primary interaction point for fix application. Replaces it with a chat-composer-anchored slide-up banner (ProposalBanner) that stays visible at the bottom of the conversation column regardless of task-lane scroll depth. Addresses the user-reported discoverability problem: "the task lane fills up pretty quick … the suggested fix … is easily missed."

Key backend additions:

  • Six new columns on session_suggested_fixes: status, applied_at, verified_at, partial_notes, failure_reason, ai_outcome_proposal
  • PATCH /api/v1/ai-sessions/{session_id}/suggested-fixes/{fix_id}/outcome endpoint to record the engineer's decision
  • [FIX_OUTCOME] marker in the FlowPilot system prompt, parsed by unified_chat_service.py to trigger the banner

Key frontend additions:

  • ProposalBanner component (frontend/src/components/pilot/ProposalBanner.tsx) — slide-up banner anchored above the chat composer; shows fix title, confidence, and Accept / Dismiss / Escalate actions; auto-collapses after session resolves
  • EscalateInterceptDialog — intercepts the Escalate action when a fix proposal is active, asking whether the engineer wants to note that the fix was attempted before escalating

Commit range: cdd8bb0 (Phase 8 Task 1 start) through 8582d24

git commit -m "feat(pilot): Phase 8 — fix outcome banner replaces task-lane SuggestedFix CTA"

Phase 9 — Tabbed Script Builder

Spec: phase-9-script-builder-tab.md

Implementation plan: phase-9-implementation-plan.md

What this phase does: Resolves open items #1 (NoTemplateDialog narrow-lane bug) and #3 (Tabbed Script Builder) from the Phase 6/7 backlog. The chat region gains a [Chat] [Script Builder ●] tab strip (ChatTabStrip + a new ScriptBuilderTab controller) that hosts two modes: an AI path reusing the existing (untouched) ScriptBuilderChat, and a "Write it myself" path using ScriptBodyEditor (Monaco). Engineer submit writes the drafted script back to session_suggested_fixes.ai_drafted_script via a new PATCH endpoint — applied_at is NOT stamped (a draft is not an application). Tabs use display: none toggling so chat scroll position, draft message, AI history, and Monaco buffer are all preserved across switches. InlineNoTemplateDialog is relocated from the task-lane bottomSlot into a dedicated chat-region placement wrapper, eliminating the narrow-lane viewport-breakpoint collision that made the three-option grid unusable.

Key backend additions:

  • PATCH /api/v1/ai-sessions/{session_id}/suggested-fixes/{fix_id}/script — writes ai_drafted_script + ai_drafted_parameters without stamping applied_at; bumps state_version so Resolve/Escalate preview bundles regenerate; 409 on terminal fix status
  • Alembic migration adds origin VARCHAR(20) NOT NULL DEFAULT 'standalone' to script_builder_sessions (CHECK enum 'standalone'|'pilot_inline' + invariant origin='pilot_inline' ⇒ ai_session_id IS NOT NULL); reuses the pre-existing ai_session_id FK rather than adding a new parent column; partial unique index ux_script_builder_sessions_pilot_inline on (user_id, ai_session_id) WHERE origin='pilot_inline' backs get-or-create idempotency
  • POST /api/v1/scripts/builder/sessions extended: accepts origin + ai_session_id with auth (pilot session must belong to caller); returns existing row on duplicate; race-safe via IntegrityError + re-read fallback; list_sessions and count_user_sessions default-scope to origin='standalone' so inline sessions don't pollute the dashboard or count against the 5-session cap
  • applied_at semantics corrected: stamps only on run-declaring actions — TemplateMatchPanel "I ran this" click via new onMarkRun prop, and NoTemplateDialog decisions one_off/draft_template (both labelled "Run now, …"). build_template does NOT stamp. Script Builder tab Submit does NOT stamp. Banner Apply click no longer stamps directly
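The get-or-create idempotency backed by the partial unique index can be demonstrated with the standard-library sqlite3 module. This is a sketch only: SQLite stands in for the production database, the table is trimmed to the columns named above, and the `open_db` and `get_or_create_inline` helpers are hypothetical.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE script_builder_sessions (
            id INTEGER PRIMARY KEY,
            user_id TEXT NOT NULL,
            ai_session_id TEXT,
            origin TEXT NOT NULL DEFAULT 'standalone'
                CHECK (origin IN ('standalone', 'pilot_inline')),
            CHECK (origin != 'pilot_inline' OR ai_session_id IS NOT NULL)
        )""")
    # Uniqueness applies to inline rows only; standalone rows are unconstrained.
    conn.execute("""
        CREATE UNIQUE INDEX ux_script_builder_sessions_pilot_inline
        ON script_builder_sessions (user_id, ai_session_id)
        WHERE origin = 'pilot_inline'""")
    return conn


def get_or_create_inline(conn: sqlite3.Connection, user_id: str, ai_session_id: str) -> int:
    """Insert an inline session, or return the existing row's id on conflict."""
    try:
        with conn:  # commits on success, rolls back on error
            cur = conn.execute(
                "INSERT INTO script_builder_sessions (user_id, ai_session_id, origin) "
                "VALUES (?, ?, 'pilot_inline')",
                (user_id, ai_session_id),
            )
        return cur.lastrowid
    except sqlite3.IntegrityError:
        # Another request won the race (or the row already existed): re-read it.
        row = conn.execute(
            "SELECT id FROM script_builder_sessions "
            "WHERE user_id = ? AND ai_session_id = ? AND origin = 'pilot_inline'",
            (user_id, ai_session_id),
        ).fetchone()
        if row is None:
            raise  # a different constraint failed; surface it
        return row[0]
```

Letting the index arbitrate, rather than checking first and then inserting, is what makes the endpoint safe under concurrent requests.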

Key frontend additions:

  • ChatTabStrip: [Chat] [Script Builder ●] header strip in the chat region when the active fix needs a drafted script (status proposed/applied_partial, no template, no drafted script)
  • ScriptBuilderTab — new controller wrapping ScriptBuilderChat (AI mode) + ScriptBodyEditor (Monaco, "Write it myself" mode); get-or-create on mount; Submit calls sessionSuggestedFixesApi.patchScript
  • InlineNoTemplateDialog — chat-region slide-up wrapper around the existing NoTemplateDialog; replaces the previous task-lane bottomSlot rendering of the drafted-script three-card decision
  • TemplateMatchPanel gains onMarkRun optional prop + "✓ I ran this" primary button
  • EscalateInterceptDialog gains a fourth "I applied some of it — partial" choice (dispatches applied_partial via the existing FixOutcome pass-through)

Commit range: 5bcb7aa (Phase 9 Task 1 start) through faf1d8d

git commit -m "feat(pilot): Phase 9 — tabbed Script Builder + InlineNoTemplateDialog relocation"

10. Design system reference

All components must use the existing ResolutionFlow design system. Tokens from the mockup CSS for quick reference — these should already exist in your tokens file; if they don't, add them:

/* Backgrounds */
--bg-0: #070b12;          /* page background */
--bg-1: #0d131c;          /* sidebar / chrome */
--bg-2: #121a25;          /* card / bubble background */
--bg-3: #1a2332;          /* raised element */

/* Borders */
--border: rgba(148, 163, 184, 0.12);
--border-strong: rgba(148, 163, 184, 0.22);

/* Text */
--text-primary: #e2e8f0;
--text-secondary: #94a3b8;
--text-tertiary: #64748b;

/* Brand cyan (FlowPilot accent) */
--cyan-400: #22d3ee;
--cyan-500: #06b6d4;
--cyan-600: #0891b2;
--cyan-bg: rgba(34, 211, 238, 0.10);
--cyan-border: rgba(34, 211, 238, 0.30);

/* Semantic */
--success: #34d399;   /* Resolve, facts */
--warning: #fbbf24;   /* Escalate, proposed parameters */
--danger:  #f87171;
--purple:  #a78bfa;   /* Script Generator / templates */

Typography:

  • Body: IBM Plex Sans, 14px/1.5
  • Headings: Bricolage Grotesque, 500 weight, -0.01em letter-spacing
  • Code: JetBrains Mono

Icons: Phosphor Icons (Duotone) per the recorded design decision to migrate off Lucide.


11. Test plan

Migration tests

  • Fresh DB upgrade succeeds.
  • Downgrade succeeds (reversibility).
  • New tables have RLS enabled/forced.
  • Tenant policy includes app.current_account_id.

Backend tests

  • Fact CRUD authorization (edit allowed on user_note / ai_synthesis, 403 on question / diagnostic_check).
  • Fact promotion: POST /facts/promote creates fact and increments state_version.
  • Suggested-fix supersession: emitting a new [SUGGEST_FIX] sets superseded_at on the prior active one.
  • Decision persistence on session_suggested_fixes.user_decision.
  • Resolution note preview cache invalidation on state_version increment.
  • Resolve/Escalate local-only behavior without a linked PSA ticket.
  • PSA status verification failure path (simulated rejection surfaces error).
  • Draft-template accept/reject behavior.
  • AccountSettings.get_setting returns default when row absent.

Frontend tests

  • Route redirects (/assistant → /pilot, deep-link ID preservation).
  • Task lane rendering and persistence across reloads.
  • Inline fact editing refreshes the Resolve preview.
  • Script Generator option flows (template match, three-option dialog, post-resolve prompt).
  • Templatize prompt respects templatize_prompt_enabled setting.
  • Responsive drawer behavior at <1200px.

Manual QA

  • Run one ConnectWise-linked Resolve end-to-end.
  • Run one Escalate end-to-end.
  • Run one template-match script generation path.
  • Run one no-template draft-template path through post-resolve save.

12. Non-goals for this migration

Do not build these as part of this work. They belong to later phases of the roadmap.

  • Confidence tiers (Discovery / Exploring / Guided). Explicitly removed. The task lane itself is the progress signal.
  • Mode toggle between Guided and Quick ask. There is one mode.
  • "Convert to guided" promotion flow. No longer applicable.
  • Team Wiki compilation from resolved sessions. Tracked separately; depends on this migration but is not part of it.
  • SharePoint integration. Sequenced after ConnectWise per roadmap.
  • Template marketplace / sharing across accounts. Tracked under Client Context System roadmap item.
  • Backfill of What we know for pre-Phase-2 sessions. Sessions resolved before Phase 2 ships will not retroactively gain facts. Document in release notes.
  • MCP unification into AnthropicProvider. Deferred pending telemetry-driven evaluation. Track as a separate ticket.
  • Supervisor staging of resolution notes. Engineer review + Confirm & post is the committed flow (not compliance-grade draft approval).

13. Risks and mitigations

| Risk | Mitigation |
| --- | --- |
| LLM fact synthesis hallucinates specifics not in the answer | Conservative prompt; engineer can edit/delete any AI-synthesized fact; provenance line shows the source so the engineer can verify. Haiku default + disputed_fact_rate telemetry triggers escalation to Sonnet if quality drops. |
| Resolution note preview LLM cost at scale | state_version-keyed cache prevents re-generation on unchanged state; 500ms client debounce batches rapid edits. |
| ConnectWise silently rejects status change | transition_ticket_status re-fetches and verifies; fails loudly if the change didn't stick. |
| Template extraction proposes bad parameterization | Engineer reviews before saving; draft templates never silently become real templates; provenance chip lets team admins audit. |
| Users lose muscle memory from the /assistant to /pilot rename | Permanent 301 redirect (no sunset date); deep-link session IDs preserved through the redirect. |
| Existing sessions have no facts at Phase 2 deploy | Acceptable per non-goals. Facts accumulate for new or ongoing sessions after deploy. Document in release notes. |
| In-flight sessions during Phase 2 deploy lack stable task-lane item IDs | Sessions open at deploy time may have unstable IDs until the next save cycle re-persists with UUIDs. Facts tied to those sessions may reference IDs that don't resolve. Engineer can manually re-promote if needed. |
| Phase 0 cache verification deferred to new env | Tracked via inline TODO in the caching module. If verification fails when executed, debug as a normal bug; do not retroactively block dependent phases. |
| MCP usage data unknown, may under- or over-invest | Phase 0.5 telemetry answers this within 30 days of new env being live. Schedule "MCP review" checkpoint at that mark. |
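
The ConnectWise mitigation above (write, re-fetch, verify, fail loudly) can be sketched roughly as follows. This is a minimal illustration, not the shipped implementation: the TicketClient interface, its set_status / get_status methods, and StatusTransitionError are hypothetical names introduced here, and only the transition_ticket_status name comes from this document.

```python
from typing import Protocol


class TicketClient(Protocol):
    """Hypothetical minimal PSA client interface (not the real ConnectWise SDK)."""

    def set_status(self, ticket_id: int, status: str) -> None: ...

    def get_status(self, ticket_id: int) -> str: ...


class StatusTransitionError(RuntimeError):
    """Raised when the re-fetched status differs from the requested one."""


def transition_ticket_status(client: TicketClient, ticket_id: int, target: str) -> str:
    """Request a status change, then re-fetch to confirm that it stuck."""
    client.set_status(ticket_id, target)

    # The write call returning without an exception is not treated as success:
    # the ticket is read back and compared against the requested status.
    actual = client.get_status(ticket_id)
    if actual != target:
        raise StatusTransitionError(
            f"ticket {ticket_id}: requested {target!r}, still {actual!r}"
        )
    return actual
```

Raising instead of returning a boolean keeps the failure from being ignored by a caller that forgets to check the result, which is the "fails loudly" property the mitigation relies on.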

14. Decisions made during migration planning

These questions were raised during the planning conversation and have been resolved. Captured here so the decisions are traceable.

  1. Keyboard shortcut for Resolve. Decided: no shortcut. ⌘R conflicts with browser reload; alternatives add complexity without clear value. Resolve has a button, a preview, and a confirm step. No shortcut needed.
  2. Default for templatize_prompt_enabled. Decided: true. Feature discovery outweighs annoyance at pre-revenue stage. Opt-out is one click and persistent. Tune surfacing logic rather than the default if feedback indicates over-prompting.
  3. Resolution note posting. Decided: engineer edits inline, clicks Confirm & post. Supervisor staging is out of scope for this migration. Revisit if an MSP with strict compliance requirements surfaces the need.
  4. Fact synthesis model tier. Decided: Haiku 4.5 behind a FACT_SYNTHESIS_MODEL config flag. All other AI services default to Sonnet 4.6. Opus 4.7 not used at current scale. Per-service MCP capability configured via matching flags (Section 6.6).
  5. MCP architecture in Phase 0. Decided: leave MCP in the chat wrapper. Option C from the Phase 0 audit. Do not push MCP into the provider-agnostic AnthropicProvider. Add telemetry in Phase 0.5 to gather data for a future unification decision.
  6. Cache breakpoint policy. Decided: policy α. Caller-marked cache_control is honored; if no blocks are marked, the first block is cached by default.
  7. API namespace. Decided: /api/v1/ai-sessions/{id}/..., matching the existing codebase.
  8. account_settings structure. Decided: new table with JSONB preferences column, lazy row creation. Simple settings live in preferences; settings graduate to typed columns when they meet the promotion criteria (hot path / validation / joins).
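
As an illustration of decision 6 (cache breakpoint policy α), the rule can be sketched as a small pure function over a list of prompt blocks. This is a sketch under stated assumptions: apply_cache_breakpoints and the plain-dict block shape are hypothetical and not taken from the caching module, and the default marker assumes the ephemeral cache_control type.

```python
import copy
from typing import Any

Block = dict[str, Any]

# Assumed default marker; the real module may configure this differently.
DEFAULT_CACHE_CONTROL = {"type": "ephemeral"}


def apply_cache_breakpoints(blocks: list[Block]) -> list[Block]:
    """Policy alpha: honor caller-marked blocks, else cache the first block."""
    result = copy.deepcopy(blocks)  # never mutate the caller's blocks

    # Nothing to mark, or the caller already chose breakpoints: leave as-is.
    if not result or any("cache_control" in block for block in result):
        return result

    # No block was marked, so the first block is cached by default.
    result[0]["cache_control"] = dict(DEFAULT_CACHE_CONTROL)
    return result
```

Returning a copy keeps the function free of side effects, so a caller can reuse the same block list across requests without accumulating markers.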

End of document