Files
resolutionflow/docs/plans/2026-03-18-flowpilot-first-pivot-phase1.md
chihlasm 9bad49d568 feat(knowledge-flywheel): add Phase 3 Knowledge Flywheel — AI analysis, review queue, analytics
Phase 3 implementation:
- AI session analysis service that generates flow proposals from resolved sessions
- APScheduler job for batch processing pending analyses (max_instances=1)
- Knowledge gap detection (weak options, high escalation signals)
- Flow proposals CRUD with team admin review workflow (approve/edit/dismiss/reject)
- FlowPilot analytics dashboard with confidence tiers, PSA metrics, knowledge gaps
- In-session script generator component
- Review queue page with filtering and proposal detail panel

Bug fixes from review (12 total):
- Fix "Edit & Publish" navigating to non-existent /editor/new route
- Hide Approve button for enhancement proposals (require Edit & Publish)
- Add max_instances=1 to scheduler to prevent TOCTOU race
- Fix eventual_success case() double-counting failed retries
- Add tree_structure validation before creating tree from proposal
- Simplify script generator rendering condition
- Add severity style fallback, toFixed on rates, Link instead of <a href>
- Add toast.warning on dismiss failure, fix dedup for domain-less sessions
- Cast Decimal to int in knowledge gap evidence dicts

Also updates CLAUDE.md with lessons 67-71 and Phase 3 project structure.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 05:12:10 +00:00

46 KiB

FlowPilot-First Pivot — Phase 1: AI Session Core

For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

Goal: Build the core AI-powered troubleshooting session — the new flagship product experience. An engineer starts a session by describing a problem (free text, screenshot, log paste), FlowPilot guides them through structured diagnosis with selectable options + free-text escape, and the session closes with auto-generated documentation. Flows are built as a byproduct of real resolutions.

Architecture: New AISession and AISessionStep models. New FlowPilotEngine service orchestrating LLM calls with structured JSON output contracts. New FlowMatchingEngine service for semantic matching against existing flows. New frontend session UX with conversational layout. Reuses existing ai_provider.py, embedding_service.py, rag_service.py, and copilot_service.py patterns.

Tech Stack: FastAPI, SQLAlchemy 2.0 (async), Anthropic Claude Sonnet 4 (structured JSON output), pgvector (flow matching), React, TypeScript, Tailwind CSS v4, shadcn/ui

Pivot architecture doc: docs/ResolutionFlow_Pivot_Architecture.docx (in project root — the full strategic context)

Existing patterns to follow:

  • Models: app/models/copilot_conversation.py, app/models/script_template.py
  • Services: app/services/copilot_service.py, app/core/ai_provider.py
  • API: app/api/endpoints/copilot.py (auth, quota, error handling patterns)
  • Frontend: src/pages/FlowAssistPage.tsx, src/components/copilot/

Context: What This Pivot Changes

ResolutionFlow currently requires engineers to build flows manually before getting value. This pivot makes FlowPilot the primary interface — engineers bring a problem, FlowPilot guides diagnosis, and flows get built organically from real resolutions.

What stays: Session runner UX (promoted to core), flow editor (repurposed for curation), script generator (elevated to in-session action), PSA integration (elevated to primary intake channel), all existing models and services.

What's new in Phase 1: AI Session models, FlowPilot Engine service, Flow Matching Engine v1, new session intake + conversational diagnosis frontend, resolve/escalate endpoints with auto-documentation.


Slice 1: Database Models & Migration

Task 1: Create AISession model

Files:

  • Create: backend/app/models/ai_session.py
"""AI-powered troubleshooting session model.

Represents a complete FlowPilot interaction from intake to resolution/escalation.
This is the central entity of the FlowPilot-First pivot.
"""
import uuid
from datetime import datetime, timezone
from typing import Optional, Any, TYPE_CHECKING

from sqlalchemy import String, Text, DateTime, ForeignKey, Boolean, Integer, Float, CheckConstraint
import sqlalchemy as sa
from sqlalchemy.orm import Mapped, mapped_column, relationship
from sqlalchemy.dialects.postgresql import UUID, JSONB

from app.core.database import Base

if TYPE_CHECKING:
    from app.models.user import User
    from app.models.team import Team
    from app.models.account import Account
    from app.models.tree import Tree
    from app.models.psa_connection import PsaConnection


class AISession(Base):
    """A FlowPilot-guided troubleshooting session.

    Lifecycle: active → resolved | escalated | abandoned
    Sessions may be paused and resumed (e.g., escalation handoff).
    """
    __tablename__ = "ai_sessions"
    __table_args__ = (
        CheckConstraint(
            "intake_type IN ('free_text', 'psa_ticket', 'screenshot', 'log_paste', 'combined')",
            name="ck_ai_sessions_intake_type",
        ),
        CheckConstraint(
            "status IN ('active', 'paused', 'resolved', 'escalated', 'abandoned')",
            name="ck_ai_sessions_status",
        ),
        CheckConstraint(
            "confidence_tier IN ('guided', 'exploring', 'discovery')",
            name="ck_ai_sessions_confidence_tier",
        ),
    )

    id: Mapped[uuid.UUID] = mapped_column(
        UUID(as_uuid=True), primary_key=True, default=uuid.uuid4
    )
    user_id: Mapped[uuid.UUID] = mapped_column(
        UUID(as_uuid=True),
        ForeignKey("users.id", ondelete="CASCADE"),
        nullable=False,
        index=True,
    )
    account_id: Mapped[uuid.UUID] = mapped_column(
        UUID(as_uuid=True),
        ForeignKey("accounts.id", ondelete="CASCADE"),
        nullable=False,
        index=True,
    )
    team_id: Mapped[Optional[uuid.UUID]] = mapped_column(
        UUID(as_uuid=True),
        ForeignKey("teams.id", ondelete="SET NULL"),
        nullable=True,
        index=True,
    )

    # ── Intake ──
    intake_type: Mapped[str] = mapped_column(
        String(20), nullable=False, default="free_text"
    )
    intake_content: Mapped[dict[str, Any]] = mapped_column(
        JSONB, nullable=False, default=dict,
        comment="Original intake data: {text, image_urls, log_content, ticket_data}",
    )
    problem_summary: Mapped[Optional[str]] = mapped_column(
        Text, nullable=True,
        comment="AI-generated one-line problem summary from intake",
    )
    problem_domain: Mapped[Optional[str]] = mapped_column(
        String(100), nullable=True,
        comment="Classified domain: active_directory, networking, m365, hardware, etc.",
    )

    # ── Session state ──
    status: Mapped[str] = mapped_column(
        String(20), nullable=False, default="active", index=True,
    )
    confidence_tier: Mapped[str] = mapped_column(
        String(20), nullable=False, default="discovery",
        comment="Current AI confidence: guided (>80%), exploring (40-80%), discovery (<40%)",
    )
    confidence_score: Mapped[float] = mapped_column(
        Float, nullable=False, default=0.0,
        comment="Numeric confidence 0.0-1.0 for internal tracking",
    )

    # ── Flow matching ──
    matched_flow_id: Mapped[Optional[uuid.UUID]] = mapped_column(
        UUID(as_uuid=True),
        ForeignKey("trees.id", ondelete="SET NULL"),
        nullable=True,
        comment="If following an existing flow, which one",
    )
    match_score: Mapped[Optional[float]] = mapped_column(
        Float, nullable=True,
        comment="Similarity score of the matched flow (0.0-1.0)",
    )

    # ── PSA link ──
    psa_ticket_id: Mapped[Optional[str]] = mapped_column(
        String(100), nullable=True,
        comment="External PSA ticket ID if session was started from a ticket",
    )
    psa_connection_id: Mapped[Optional[uuid.UUID]] = mapped_column(
        UUID(as_uuid=True),
        ForeignKey("psa_connections.id", ondelete="SET NULL"),
        nullable=True,
    )
    ticket_data: Mapped[Optional[dict[str, Any]]] = mapped_column(
        JSONB, nullable=True,
        comment="Snapshot of PSA ticket data at session start",
    )

    # ── Resolution / Escalation ──
    resolution_summary: Mapped[Optional[str]] = mapped_column(
        Text, nullable=True,
        comment="What fixed the issue (set on resolution)",
    )
    resolution_action: Mapped[Optional[str]] = mapped_column(
        Text, nullable=True,
        comment="The specific action/step that resolved the issue",
    )
    escalation_reason: Mapped[Optional[str]] = mapped_column(
        Text, nullable=True,
        comment="Why escalated (set on escalation)",
    )
    escalation_package: Mapped[Optional[dict[str, Any]]] = mapped_column(
        JSONB, nullable=True,
        comment="Context package for receiving engineer: steps_tried, hypotheses, suggestions",
    )
    escalated_to_id: Mapped[Optional[uuid.UUID]] = mapped_column(
        UUID(as_uuid=True),
        ForeignKey("users.id", ondelete="SET NULL"),
        nullable=True,
    )

    # ── Feedback ──
    session_rating: Mapped[Optional[int]] = mapped_column(
        Integer, nullable=True,
        comment="1-5 engineer feedback rating",
    )
    session_feedback: Mapped[Optional[str]] = mapped_column(
        Text, nullable=True,
        comment="Optional feedback text from engineer",
    )

    # ── AI tracking ──
    total_input_tokens: Mapped[int] = mapped_column(
        Integer, nullable=False, default=0,
    )
    total_output_tokens: Mapped[int] = mapped_column(
        Integer, nullable=False, default=0,
    )
    step_count: Mapped[int] = mapped_column(
        Integer, nullable=False, default=0,
    )

    # ── Timestamps ──
    created_at: Mapped[datetime] = mapped_column(
        DateTime(timezone=True), default=lambda: datetime.now(timezone.utc)
    )
    updated_at: Mapped[datetime] = mapped_column(
        DateTime(timezone=True),
        default=lambda: datetime.now(timezone.utc),
        onupdate=lambda: datetime.now(timezone.utc),
    )
    resolved_at: Mapped[Optional[datetime]] = mapped_column(
        DateTime(timezone=True), nullable=True,
    )

    # ── LLM conversation context ──
    system_prompt_snapshot: Mapped[Optional[str]] = mapped_column(
        Text, nullable=True,
        comment="Snapshot of the system prompt used (for debugging/training)",
    )
    conversation_messages: Mapped[list[dict[str, Any]]] = mapped_column(
        JSONB, nullable=False, default=list,
        comment="Full LLM message history for context continuity",
    )

    # ── Relationships ──
    user: Mapped["User"] = relationship("User", foreign_keys=[user_id])
    account: Mapped["Account"] = relationship("Account")
    team: Mapped[Optional["Team"]] = relationship("Team")
    matched_flow: Mapped[Optional["Tree"]] = relationship("Tree", foreign_keys=[matched_flow_id])
    escalated_to: Mapped[Optional["User"]] = relationship("User", foreign_keys=[escalated_to_id])
    psa_connection: Mapped[Optional["PsaConnection"]] = relationship("PsaConnection")
    steps: Mapped[list["AISessionStep"]] = relationship(
        "AISessionStep", back_populates="session",
        cascade="all, delete-orphan",
        order_by="AISessionStep.step_order",
    )

Task 2: Create AISessionStep model

Files:

  • Create: backend/app/models/ai_session_step.py
"""AI session step model.

Every interaction within an AI session is captured as a step.
Steps are the raw material that becomes flow nodes in the Knowledge Flywheel.
"""
import uuid
from datetime import datetime, timezone
from typing import Optional, Any, TYPE_CHECKING

from sqlalchemy import String, Text, DateTime, ForeignKey, Integer, Float, CheckConstraint
from sqlalchemy.orm import Mapped, mapped_column, relationship
from sqlalchemy.dialects.postgresql import UUID, JSONB

from app.core.database import Base

if TYPE_CHECKING:
    from app.models.ai_session import AISession
    from app.models.script_template import ScriptGeneration


class AISessionStep(Base):
    """A single interaction step within a FlowPilot session.

    Step types:
    - question: FlowPilot asks a diagnostic question with options
    - action: FlowPilot suggests an action for the engineer to perform
    - script_generation: FlowPilot invokes the Script Generator
    - verification: FlowPilot asks engineer to verify a condition
    - info_request: FlowPilot asks engineer to gather specific data
    - note: Engineer or FlowPilot adds a contextual note
    - intake_analysis: Initial analysis of the intake content
    """
    __tablename__ = "ai_session_steps"
    __table_args__ = (
        CheckConstraint(
            "step_type IN ('question', 'action', 'script_generation', 'verification', "
            "'info_request', 'note', 'intake_analysis')",
            name="ck_ai_session_steps_step_type",
        ),
    )

    id: Mapped[uuid.UUID] = mapped_column(
        UUID(as_uuid=True), primary_key=True, default=uuid.uuid4
    )
    session_id: Mapped[uuid.UUID] = mapped_column(
        UUID(as_uuid=True),
        ForeignKey("ai_sessions.id", ondelete="CASCADE"),
        nullable=False,
        index=True,
    )
    step_order: Mapped[int] = mapped_column(
        Integer, nullable=False,
        comment="Sequential position in the session (0-indexed)",
    )
    step_type: Mapped[str] = mapped_column(
        String(30), nullable=False,
    )

    # ── Content presented to engineer ──
    content: Mapped[dict[str, Any]] = mapped_column(
        JSONB, nullable=False, default=dict,
        comment="The question/action content rendered in the session UI",
    )
    context_message: Mapped[Optional[str]] = mapped_column(
        Text, nullable=True,
        comment="Why FlowPilot is asking this (shown above the question)",
    )

    # ── Options (for question steps) ──
    options_presented: Mapped[Optional[list[dict[str, Any]]]] = mapped_column(
        JSONB, nullable=True,
        comment="Array of {label, value, followup_hint} options shown to engineer",
    )

    # ── Engineer response ──
    selected_option: Mapped[Optional[str]] = mapped_column(
        String(500), nullable=True,
        comment="Which option the engineer selected (value field)",
    )
    free_text_input: Mapped[Optional[str]] = mapped_column(
        Text, nullable=True,
        comment="If engineer typed a custom response instead of selecting an option",
    )
    was_free_text: Mapped[bool] = mapped_column(
        default=False,
        comment="True if the engineer used the free-text escape hatch",
    )
    was_skipped: Mapped[bool] = mapped_column(
        default=False,
        comment="True if engineer selected 'I don't know / Can't check'",
    )

    # ── Action results ──
    action_result: Mapped[Optional[dict[str, Any]]] = mapped_column(
        JSONB, nullable=True,
        comment="Outcome of action step: {success: bool, details: str, next_hint: str}",
    )

    # ── Script generation link ──
    script_generation_id: Mapped[Optional[uuid.UUID]] = mapped_column(
        UUID(as_uuid=True),
        ForeignKey("script_generations.id", ondelete="SET NULL"),
        nullable=True,
    )

    # ── AI internals ──
    confidence_at_step: Mapped[float] = mapped_column(
        Float, nullable=False, default=0.0,
        comment="FlowPilot confidence level at this point (0.0-1.0)",
    )
    ai_reasoning: Mapped[Optional[str]] = mapped_column(
        Text, nullable=True,
        comment="Why FlowPilot chose this step (internal, for debugging/training)",
    )
    input_tokens: Mapped[int] = mapped_column(
        Integer, nullable=False, default=0,
    )
    output_tokens: Mapped[int] = mapped_column(
        Integer, nullable=False, default=0,
    )

    # ── Timestamps ──
    created_at: Mapped[datetime] = mapped_column(
        DateTime(timezone=True), default=lambda: datetime.now(timezone.utc)
    )
    responded_at: Mapped[Optional[datetime]] = mapped_column(
        DateTime(timezone=True), nullable=True,
        comment="When the engineer responded to this step",
    )

    # ── Relationships ──
    session: Mapped["AISession"] = relationship("AISession", back_populates="steps")
    script_generation: Mapped[Optional["ScriptGeneration"]] = relationship("ScriptGeneration")

Task 3: Register models in __init__.py

Files:

  • Edit: backend/app/models/__init__.py

Add these imports after the ScriptGeneration line:

from .ai_session import AISession
from .ai_session_step import AISessionStep

Add to __all__:

    "AISession",
    "AISessionStep",

Task 4: Create Alembic migration

Generate with:

cd backend && alembic revision --autogenerate -m "add ai_sessions and ai_session_steps tables"

The migration should create both tables with all columns, constraints, and indexes.

Also add columns to the existing trees table for flow matching support:

op.add_column('trees', sa.Column('origin', sa.String(20), nullable=True, comment='manual | ai_generated | ai_enhanced'))
op.add_column('trees', sa.Column('source_session_id', sa.dialects.postgresql.UUID(as_uuid=True), nullable=True))
op.add_column('trees', sa.Column('match_keywords', sa.dialects.postgresql.JSONB(), nullable=True, server_default='[]'))
op.add_column('trees', sa.Column('success_rate', sa.Float(), nullable=True))
op.add_column('trees', sa.Column('last_matched_at', sa.DateTime(timezone=True), nullable=True))

Verification: Run alembic upgrade head. Verify both tables exist in psql. Verify trees table has new columns.

git commit -m "feat(ai-session): add AISession and AISessionStep models + migration"

Slice 2: Pydantic Schemas

Task 5: Create AI session schemas

Files:

  • Create: backend/app/schemas/ai_session.py
"""Pydantic schemas for FlowPilot AI sessions."""
from __future__ import annotations

from typing import Optional, Any
from uuid import UUID
from datetime import datetime

from pydantic import BaseModel, Field


# ── Intake ──

class AISessionCreateRequest(BaseModel):
    """Start a new FlowPilot session."""
    intake_type: str = Field(
        "free_text",
        pattern="^(free_text|psa_ticket|screenshot|log_paste|combined)$",
    )
    intake_content: dict[str, Any] = Field(
        ...,
        description=(
            "Intake payload. Shape depends on intake_type: "
            "{text: str} for free_text, "
            "{text?: str, image_urls?: list[str]} for screenshot, "
            "{text?: str, log_content?: str} for log_paste, "
            "{ticket_id: str, psa_connection_id: str} for psa_ticket, "
            "any combination for combined."
        ),
    )
    psa_ticket_id: Optional[str] = None
    psa_connection_id: Optional[UUID] = None


class AISessionCreateResponse(BaseModel):
    """Response after starting a session — includes the first FlowPilot step."""
    session_id: UUID
    status: str
    confidence_tier: str
    problem_summary: str | None = None
    problem_domain: str | None = None
    matched_flow_id: UUID | None = None
    matched_flow_name: str | None = None
    match_score: float | None = None
    first_step: AISessionStepResponse


# ── Step interaction ──

class StepOptionSchema(BaseModel):
    """A selectable option presented to the engineer."""
    label: str
    value: str
    followup_hint: str | None = None


class AISessionStepResponse(BaseModel):
    """A FlowPilot step rendered in the session UI."""
    step_id: UUID
    step_order: int
    step_type: str
    content: dict[str, Any]
    context_message: str | None = None
    options: list[StepOptionSchema] = []
    allow_free_text: bool = True
    allow_skip: bool = True
    confidence_tier: str
    confidence_score: float

    model_config = {"from_attributes": True}


class StepResponseRequest(BaseModel):
    """Engineer's response to a FlowPilot step."""
    selected_option: str | None = None
    free_text_input: str | None = None
    was_skipped: bool = False
    action_result: dict[str, Any] | None = None


class StepResponseResponse(BaseModel):
    """FlowPilot's next step after processing the engineer's response."""
    session_id: UUID
    status: str
    confidence_tier: str
    confidence_score: float
    next_step: AISessionStepResponse | None = None
    resolution_suggested: bool = False
    resolution_summary: str | None = None


# ── Resolution / Escalation ──

class ResolveSessionRequest(BaseModel):
    """Close a session as resolved."""
    resolution_summary: str = Field(..., min_length=5, max_length=2000)
    resolution_action: str | None = None
    session_rating: int | None = Field(None, ge=1, le=5)
    session_feedback: str | None = None


class EscalateSessionRequest(BaseModel):
    """Escalate a session to another engineer."""
    escalation_reason: str = Field(..., min_length=5, max_length=2000)
    escalated_to_id: UUID | None = None


class SessionDocumentation(BaseModel):
    """Auto-generated session documentation."""
    problem_summary: str
    problem_domain: str | None = None
    intake_summary: str
    diagnostic_steps: list[DocumentationStep]
    resolution_summary: str | None = None
    escalation_reason: str | None = None
    total_steps: int
    duration_display: str | None = None
    generated_at: datetime


class DocumentationStep(BaseModel):
    """A step in the documentation trail."""
    step_number: int
    step_type: str
    description: str
    engineer_response: str | None = None
    outcome: str | None = None


class SessionCloseResponse(BaseModel):
    """Response after resolving or escalating."""
    session_id: UUID
    status: str
    documentation: SessionDocumentation


class RateSessionRequest(BaseModel):
    """Submit post-session rating."""
    rating: int = Field(..., ge=1, le=5)
    feedback: str | None = None


# ── List / Detail ──

class AISessionSummary(BaseModel):
    """Compact session for list views."""
    id: UUID
    status: str
    intake_type: str
    problem_summary: str | None = None
    problem_domain: str | None = None
    confidence_tier: str
    step_count: int
    session_rating: int | None = None
    created_at: datetime
    resolved_at: datetime | None = None

    model_config = {"from_attributes": True}


class AISessionDetail(AISessionSummary):
    """Full session detail with steps."""
    intake_content: dict[str, Any]
    matched_flow_id: UUID | None = None
    match_score: float | None = None
    resolution_summary: str | None = None
    resolution_action: str | None = None
    escalation_reason: str | None = None
    session_feedback: str | None = None
    steps: list[AISessionStepResponse] = []

    model_config = {"from_attributes": True}

Verification: Import the schemas in a Python shell and construct sample instances to verify validation.

git commit -m "feat(ai-session): add Pydantic schemas for AI sessions"

Slice 3: FlowPilot Engine Service

Task 6: Create FlowPilot Engine service

This is the brain of the product. It orchestrates the LLM, manages structured output, and drives the diagnostic conversation.

Files:

  • Create: backend/app/services/flowpilot_engine.py

Architecture:

┌─────────────────────────────────────────────┐
│              FlowPilotEngine                │
│                                             │
│  start_session(intake) → first_step         │
│  process_response(step_response) → next_step│
│  resolve_session(summary) → documentation   │
│  escalate_session(reason) → package         │
│                                             │
│  Internal:                                  │
│  _build_system_prompt(session)              │
│  _build_messages(session, new_input)        │
│  _parse_structured_output(llm_response)     │
│  _update_confidence(session, step)          │
│  _generate_documentation(session)           │
│  _classify_intake(intake_content)           │
└─────────────────────────────────────────────┘
         │
         ▼
┌─────────────────┐    ┌──────────────────┐
│  ai_provider.py │    │ FlowMatchingEngine│
│  (Anthropic)    │    │ (Slice 4)        │
└─────────────────┘    └──────────────────┘

System Prompt Structure:

The system prompt is built dynamically per session and includes:

  1. Role definition (Senior MSP engineer + structured output contract)
  2. Response format (STRICT JSON schema — never free-form prose)
  3. Team context (if available — client configs, naming conventions)
  4. Matched flow context (if a flow matched — full flow definition)
  5. Session history (all prior steps for context continuity)
  6. Available actions (script templates, verification checks)

Structured Output Contract:

FlowPilot's LLM responses must ALWAYS be valid JSON matching one of these shapes:

// Diagnostic question
{
  "type": "question",
  "content": "Brief description of what we're checking",
  "reasoning": "Internal: why this question matters (stored in ai_reasoning)",
  "context_message": "Shown to engineer: why we're asking this",
  "options": [
    {"label": "Human-readable option", "value": "machine_value", "followup_hint": "what this implies"},
    {"label": "Another option", "value": "another_value", "followup_hint": null}
  ],
  "allow_free_text": true,
  "allow_skip": true,
  "confidence": 0.65
}

// Suggested action
{
  "type": "action",
  "content": "What the engineer should do",
  "reasoning": "Internal: why this action",
  "context_message": "Here's what I'd like you to try",
  "action_type": "instruction | script_generation | verification | info_request",
  "template_id": "uuid | null",
  "pre_filled_params": {},
  "expected_outcome": "What success looks like",
  "confidence": 0.78
}

// Resolution suggestion
{
  "type": "resolution_suggestion",
  "content": "Summary of what we did and why it should be resolved",
  "reasoning": "Internal: why I think this is resolved",
  "resolution_summary": "The issue was caused by X and fixed by Y",
  "confidence": 0.92,
  "follow_up_recommendations": ["Monitor for 24 hours", "Check event log tomorrow"]
}

Key implementation details:

  • Use ai_provider.generate_json() for all LLM calls to enforce JSON output
  • Parse the JSON response and validate against the expected shapes
  • If parsing fails, retry once with a "please respond in valid JSON" nudge
  • Store reasoning in ai_reasoning column (never shown to engineer)
  • Map confidence to confidence_tier: >0.8 = guided, 0.4-0.8 = exploring, <0.4 = discovery
  • Conversation messages accumulate in session.conversation_messages JSONB field
  • Each step creates both an AISessionStep record AND appends to conversation history
  • Token counts tracked per step AND accumulated on the session

System prompt template (key excerpt — full prompt will be ~800 tokens):

FLOWPILOT_SYSTEM_PROMPT = """You are FlowPilot, an expert MSP troubleshooting assistant embedded in ResolutionFlow. You guide engineers through structured diagnosis of IT issues.

## YOUR ROLE
- Conduct systematic troubleshooting through targeted questions and actions
- Start broad, narrow down based on responses
- Never guess — ask clarifying questions when uncertain
- Suggest specific, actionable steps the engineer can verify
- When confidence is high, suggest resolution; when low, keep investigating

## RESPONSE FORMAT
You MUST respond with ONLY a valid JSON object. No markdown, no prose, no code fences.
Every response must have a "type" field: "question", "action", or "resolution_suggestion".

{structured_output_schema}

## RULES
- Maximum 5 options per question. Options should be the most likely scenarios.
- Always include relevant context in context_message — explain WHY you're asking
- confidence is a float 0.0-1.0 reflecting how certain you are about the diagnosis path
- When multiple symptoms point to one root cause with >90% confidence, suggest resolution
- If you detect the engineer needs a PowerShell script, suggest a script_generation action
- Never suggest restarting or rebooting as a first step — diagnose first
- Be specific: "Check Event Viewer > System > source NTFS" not "check the logs"

{team_context}

{matched_flow_context}
"""

start_session flow:

  1. Receive intake content
  2. Call _classify_intake() — quick LLM call (haiku-tier) to extract: problem_summary, problem_domain, key symptoms, urgency
  3. Call FlowMatchingEngine.find_matches() with extracted symptoms
  4. Build system prompt with any matched flow context
  5. Call LLM with intake as first user message → get first diagnostic step
  6. Create AISession and first AISessionStep records
  7. Return session + first step

process_response flow:

  1. Load session + all steps
  2. Append engineer's response to conversation_messages
  3. Call LLM with full conversation → get next step
  4. Parse structured output
  5. Create new AISessionStep record
  6. Update session confidence, step_count, token counts
  7. If type is resolution_suggestion, flag it but don't auto-close
  8. Return next step

resolve_session flow:

  1. Set status = resolved, resolved_at = now
  2. Call _generate_documentation() — constructs a clean doc from all steps
  3. Return documentation

_generate_documentation flow:

  1. Walk all steps in order
  2. For each question step: format as "Checked: {context_message} → Response: {selected_option or free_text}"
  3. For each action step: format as "Action: {content} → Result: {action_result}"
  4. Compile into SessionDocumentation schema
  5. Calculate duration from created_at to resolved_at

Verification: Write unit tests for _parse_structured_output and _update_confidence. Test start_session with a mock AI provider returning sample JSON. Verify session and step records are created correctly.

git commit -m "feat(ai-session): add FlowPilot Engine service with structured output"

Slice 4: Flow Matching Engine

Task 7: Create Flow Matching Engine v1

Files:

  • Create: backend/app/services/flow_matching_engine.py

Architecture: v1 uses keyword matching + existing tree embeddings for semantic search. Deliberately simple — v2 (Phase 3) will add deeper semantic matching.

Matching strategy:

  1. Keyword match: Extract key terms from intake, match against trees.match_keywords JSONB array (new column from migration)
  2. Semantic search: Embed the intake text via embedding_service.get_embedding(), cosine similarity search against tree_embeddings table
  3. Category match: Match problem_domain against trees.category
  4. Recency boost: Trees that were recently matched successfully get a score boost

Return type: List of FlowMatch(tree_id, tree_name, score, match_reason) sorted by score descending.

Threshold: Only return matches with score > 0.5. Top match with score > 0.8 triggers "guided" confidence tier.

Key implementation:

  • Reuse rag_service.search() for vector similarity (it already handles pgvector queries)
  • Combine keyword score (0.0-1.0) + semantic score (0.0-1.0) + recency score (0.0-0.2) into a weighted composite
  • Weights: semantic 0.5, keyword 0.3, recency 0.2
  • Only match against published trees (status = 'published') in the same account

Verification: Seed a few test trees with keywords. Call find_matches() with a sample intake and verify reasonable matches are returned.

git commit -m "feat(ai-session): add Flow Matching Engine v1"

Slice 5: API Endpoints

Task 8: Create AI session endpoints

Files:

  • Create: backend/app/api/endpoints/ai_sessions.py

Follow the patterns in copilot.py exactly: rate limiting, AI quota checks, error handling with AI provider errors, token usage recording.

Endpoints:

POST   /api/v1/ai-sessions                        — Start a new FlowPilot session
POST   /api/v1/ai-sessions/{id}/respond            — Submit step response, get next step
POST   /api/v1/ai-sessions/{id}/resolve             — Resolve the session
POST   /api/v1/ai-sessions/{id}/escalate            — Escalate the session
GET    /api/v1/ai-sessions                          — List user's sessions (paginated)
GET    /api/v1/ai-sessions/{id}                     — Get session detail with all steps
GET    /api/v1/ai-sessions/{id}/documentation       — Get auto-generated documentation
POST   /api/v1/ai-sessions/{id}/rate                — Submit post-session rating

Auth: All endpoints require get_current_active_user + require_engineer_or_admin.

Rate limits:

  • POST /ai-sessions: 5/minute (starting sessions is expensive)
  • POST /{id}/respond: 15/minute (normal conversation pace)
  • GET endpoints: 30/minute

AI quota: Check quota on POST /ai-sessions and POST /{id}/respond (both make LLM calls). Record usage after each call using record_ai_usage() from app.core.ai_quota_service.

Key details:

  • POST /ai-sessions calls flowpilot_engine.start_session(), returns AISessionCreateResponse
  • POST /{id}/respond calls flowpilot_engine.process_response(), returns StepResponseResponse
  • POST /{id}/resolve calls flowpilot_engine.resolve_session(), returns SessionCloseResponse
  • GET /ai-sessions supports ?status=active filter and pagination (?skip=0&limit=20)
  • Ensure the session belongs to the current user (or their team for escalation handoffs)

Task 9: Register router

Files:

  • Edit: backend/app/api/router.py

Add import and include:

from app.api.endpoints import ai_sessions
api_router.include_router(ai_sessions.router)

Verification: Start the backend. Hit POST /api/v1/ai-sessions with a sample intake via curl. Verify session is created, first step is returned. Submit a response, verify next step. Resolve, verify documentation.

git commit -m "feat(ai-session): add AI session API endpoints"

Slice 6: Frontend — Types & API Client

Task 10: Create TypeScript types

Files:

  • Create: frontend/src/types/ai-session.ts

Mirror all Pydantic schemas as TypeScript interfaces. Follow the patterns in types/copilot.ts and types/session.ts.

Key types: AISessionCreateRequest, AISessionCreateResponse, AISessionStepResponse, StepOptionSchema, StepResponseRequest, StepResponseResponse, ResolveSessionRequest, EscalateSessionRequest, SessionCloseResponse, SessionDocumentation, AISessionSummary, AISessionDetail.

Task 11: Create API client functions

Files:

  • Create: frontend/src/api/aiSessions.ts

Follow patterns in api/copilot.ts — use the existing client.ts axios instance.

export const createAISession = (data: AISessionCreateRequest) =>
  client.post<AISessionCreateResponse>('/ai-sessions', data)

export const respondToStep = (sessionId: string, data: StepResponseRequest) =>
  client.post<StepResponseResponse>(`/ai-sessions/${sessionId}/respond`, data)

export const resolveSession = (sessionId: string, data: ResolveSessionRequest) =>
  client.post<SessionCloseResponse>(`/ai-sessions/${sessionId}/resolve`, data)

export const escalateSession = (sessionId: string, data: EscalateSessionRequest) =>
  client.post<SessionCloseResponse>(`/ai-sessions/${sessionId}/escalate`, data)

export const getAISessions = (params?: { status?: string; skip?: number; limit?: number }) =>
  client.get<AISessionSummary[]>('/ai-sessions', { params })

export const getAISession = (sessionId: string) =>
  client.get<AISessionDetail>(`/ai-sessions/${sessionId}`)

export const getSessionDocumentation = (sessionId: string) =>
  client.get<SessionDocumentation>(`/ai-sessions/${sessionId}/documentation`)

export const rateSession = (sessionId: string, data: { rating: number; feedback?: string }) =>
  client.post(`/ai-sessions/${sessionId}/rate`, data)

Verification: Import in a component, verify TypeScript compiles with no errors.

git commit -m "feat(ai-session): add frontend types and API client"

Slice 7: Frontend — FlowPilot Session Page

Task 12: Create the FlowPilot session page and components

This is the most important UX in the product. It needs to feel like a smart conversation, not a form.

Files:

  • Create: frontend/src/pages/FlowPilotSessionPage.tsx — The main page
  • Create: frontend/src/components/flowpilot/FlowPilotIntake.tsx — Intake screen
  • Create: frontend/src/components/flowpilot/FlowPilotSession.tsx — Active session view
  • Create: frontend/src/components/flowpilot/FlowPilotStepCard.tsx — Individual step card
  • Create: frontend/src/components/flowpilot/FlowPilotOptions.tsx — Selectable options grid
  • Create: frontend/src/components/flowpilot/FlowPilotActionBar.tsx — Resolve/Escalate bar
  • Create: frontend/src/components/flowpilot/ConfidenceIndicator.tsx — Confidence tier badge
  • Create: frontend/src/components/flowpilot/SessionDocView.tsx — Documentation view
  • Create: frontend/src/components/flowpilot/index.ts — Barrel export
  • Create: frontend/src/hooks/useFlowPilotSession.ts — Session state management hook

Design requirements (MUST follow existing design system):

  • Dark theme: slate-900/950 backgrounds, glass morphism cards
  • Brand cyan (#06b6d4 → #22d3ee) for primary actions and active states
  • IBM Plex Sans for body, Bricolage Grotesque for headings, JetBrains Mono for code
  • shadcn/ui components as base (Button, Card, Badge, Textarea, etc.)
  • Sonner for toast notifications
  • Lucide icons throughout

Intake Screen (FlowPilotIntake.tsx)

Layout: Centered card on the page. Clean, focused, inviting.

Elements:

  • Heading: "What are you troubleshooting?" (Bricolage Grotesque)
  • Large textarea with placeholder: "Describe the issue, paste an error message, or paste log output..."
  • Below textarea: Row of input type pills/badges — "Add Screenshot" (image upload), "Paste Logs" (toggles a monospace textarea), "Pull from Ticket" (opens ticket picker — disabled until PSA integration in Phase 2)
  • "Start Session" primary button (cyan, disabled until text has content)
  • Below the CTA: subtle text "FlowPilot will analyze your input and guide you through diagnosis"

Behavior:

  • On submit: call createAISession(), transition to active session view
  • Show loading state: "Analyzing your issue..." with a subtle pulse animation
  • Screenshot upload stores the file and passes URL in intake_content.image_urls

Active Session View (FlowPilotSession.tsx)

Layout: Conversational scroll view. Steps appear sequentially like a chat, scrolling down as the conversation progresses.

Left column (main, ~70%): The conversation. Each step is a FlowPilotStepCard. History steps are collapsed/completed. The current step is expanded and interactive.

Right column (sidebar, ~30%): Session metadata — problem summary, domain badge, confidence indicator, matched flow (if any), step counter, elapsed time. Sticky positioned.

Bottom bar: FlowPilotActionBar — always visible. "Resolve" (green) and "Escalate" (amber) buttons. Only enabled after at least one diagnostic step.

Step Card (FlowPilotStepCard.tsx)

For question steps:

  • Context message (if present) shown in a subtle info banner above the question
  • Question content as the card heading
  • Options rendered as FlowPilotOptions — clickable cards in a grid (1-2 columns)
  • "None of these — let me describe" link at the bottom → expands a textarea
  • "I can't check this right now" skip link
  • After responding: card collapses to show the question + selected answer inline

For action steps:

  • Action description with checklist-style items if applicable
  • "I've completed this action" / "This didn't work" buttons
  • If script_generation: embedded script generator (reuse from existing ScriptGeneratorPanel)

For resolution_suggestion steps:

  • Summary card with green accent border
  • "Yes, this is resolved" button → opens resolve modal
  • "No, keep investigating" button → sends response back to FlowPilot

Options Grid (FlowPilotOptions.tsx)

Design: Cards in a responsive grid (2 columns on desktop, 1 on mobile). Each option card:

  • Label text (primary)
  • Followup hint (secondary, smaller, muted)
  • Hover: cyan border glow + slight scale
  • Selected: cyan background with check icon
  • Keyboard: Tab through options, Enter to select

Confidence Indicator (ConfidenceIndicator.tsx)

Subtle badge in the sidebar:

  • Guided (>0.8): Green dot + "Proven path" text
  • Exploring (0.4-0.8): Amber dot + "Investigating" text
  • Discovery (<0.4): Purple dot + "New territory" text

Tooltip on hover explains what the tier means.

useFlowPilotSession Hook

Manages all session state:

interface UseFlowPilotSession {
  // State
  session: AISessionDetail | null
  currentStep: AISessionStepResponse | null
  allSteps: AISessionStepResponse[]
  isLoading: boolean
  isProcessing: boolean  // waiting for next step from LLM
  error: string | null

  // Actions
  startSession: (intake: AISessionCreateRequest) => Promise<void>
  respondToStep: (response: StepResponseRequest) => Promise<void>
  resolveSession: (data: ResolveSessionRequest) => Promise<SessionDocumentation>
  escalateSession: (data: EscalateSessionRequest) => Promise<SessionDocumentation>
  rateSession: (rating: number, feedback?: string) => Promise<void>

  // Derived
  isActive: boolean
  canResolve: boolean
  canEscalate: boolean
}

Task 13: Add route and navigation

Files:

  • Edit: frontend/src/router.tsx

Add lazy import:

const FlowPilotSessionPage = lazy(() => import('@/pages/FlowPilotSessionPage'))

Add routes inside the AppLayout children:

{ path: 'pilot', element: page(FlowPilotSessionPage) },
{ path: 'pilot/:sessionId', element: page(FlowPilotSessionPage) },

Files:

  • Edit sidebar component (in frontend/src/components/sidebar/)

Add a prominent "New Session" entry at the top of the sidebar with the Sparkles Lucide icon and cyan highlight. This should be the most visually prominent nav item. Link to /pilot.

Verification: Navigate to /pilot. See the intake screen. Type a problem, submit. See FlowPilot's first question with selectable options. Select an option, see the next step. Continue for 3-4 steps. Resolve. See documentation. Verify all data is persisted in the database.

git commit -m "feat(ai-session): add FlowPilot session page, components, and routing"

Slice 8: Session History Integration

Task 14: AI sessions in session history

Files:

  • Edit: frontend/src/pages/SessionHistoryPage.tsx
  • Create: frontend/src/components/flowpilot/AISessionListItem.tsx

Add a tab or toggle to switch between "Flow Sessions" (existing) and "AI Sessions" (new). AI sessions show: problem summary, domain badge, status, confidence tier, step count, created_at, resolved_at.

Click on an AI session navigates to /pilot/{sessionId} which loads the session in read-only mode showing the full conversation trail.

Verification: Complete an AI session. Navigate to session history. See the AI session listed. Click it, see the full conversation in read-only mode.

git commit -m "feat(ai-session): integrate AI sessions into session history"

Summary of All New Files

Backend

app/models/ai_session.py                    # AISession model
app/models/ai_session_step.py               # AISessionStep model
app/schemas/ai_session.py                   # All Pydantic schemas
app/services/flowpilot_engine.py            # FlowPilot Engine (LLM orchestration)
app/services/flow_matching_engine.py        # Flow Matching Engine v1
app/api/endpoints/ai_sessions.py            # API endpoints
alembic/versions/xxx_add_ai_sessions.py     # Migration

Backend (edited)

app/models/__init__.py                       # Register new models
app/api/router.py                            # Register new router

Frontend

src/types/ai-session.ts                      # TypeScript types
src/api/aiSessions.ts                        # API client
src/pages/FlowPilotSessionPage.tsx           # Main page
src/hooks/useFlowPilotSession.ts             # Session state hook
src/components/flowpilot/
  FlowPilotIntake.tsx                        # Intake screen
  FlowPilotSession.tsx                       # Active session view
  FlowPilotStepCard.tsx                      # Step card
  FlowPilotOptions.tsx                       # Options grid
  FlowPilotActionBar.tsx                     # Resolve/Escalate bar
  ConfidenceIndicator.tsx                    # Confidence badge
  SessionDocView.tsx                         # Documentation view
  AISessionListItem.tsx                      # History list item
  index.ts                                   # Barrel export

Frontend (edited)

src/router.tsx                               # New routes
src/components/sidebar/                      # New nav entry
src/pages/SessionHistoryPage.tsx             # AI session tab

Config Requirements

No new environment variables needed for Phase 1 — the FlowPilot Engine reuses existing AI_PROVIDER and AI_MODEL_ANTHROPIC settings (currently claude-sonnet-4-6). The get_ai_provider() function and ai_quota_service work as-is.

Future phases may add:

  • FLOWPILOT_MODEL_TIER — route simple questions to haiku, complex diagnosis to sonnet via existing AI_MODEL_TIERS config
  • FLOWPILOT_MAX_STEPS — safety limit on steps per session (default: 30)

Testing Strategy

Backend Unit Tests

Files: backend/tests/test_flowpilot_engine.py

  • Test _parse_structured_output with valid and invalid JSON
  • Test _update_confidence tier mapping
  • Test _classify_intake with various intake types
  • Test _generate_documentation with mock session steps
  • Test Flow Matching Engine scoring with mock embeddings

Backend Integration Tests

Files: backend/tests/test_ai_sessions_api.py

  • Test full session lifecycle: create → respond (3 steps) → resolve
  • Test escalation flow: create → respond → escalate
  • Test auth enforcement (wrong user can't access another's session)
  • Test rate limiting
  • Test AI quota enforcement

Frontend

Manual testing of the full flow:

  1. Open /pilot, type a problem description
  2. Verify first step renders with selectable options
  3. Select an option, verify next step appears
  4. Use the free-text "none of these" escape hatch, verify it works
  5. Skip a step with "I can't check this", verify it works
  6. Resolve the session, verify documentation generates
  7. Check session history page, verify the AI session appears
  8. Re-open the completed session, verify read-only conversation view

What Comes Next (Phase 2 — NOT in scope here)

For context only — do NOT implement these in Phase 1:

  • PSA ticket intake: Pull ticket from ConnectWise as session input
  • PSA ticket update: Push documentation back to ticket on resolution
  • Escalation handoff: Another engineer picks up a paused session
  • Knowledge Flywheel: Post-session flow proposal generation
  • Review Queue: UI for approving AI-generated flow proposals
  • In-session Script Generator: FlowPilot invokes script generation contextually