resolutionflow/docs/cockpit/2026-04-01-msp-assistant-harness-design.md
chihlasm 5dd43b2226 docs: add MSP assistant harness cockpit design spec
Design spec for evolving /assistant into a live triage cockpit.
Covers layout decisions (stacked zones, drag-resizable split),
incident header (labelled fields, AI-inferred + editable),
work zone (steps checklist + FlowPilot Asks + What We Know),
conclude modal redesign, and all required backend changes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-01 20:59:15 +00:00

MSP Assistant Harness — Design Spec

Date: 2026-04-01
Status: Draft (pending user review)
Source: MSP_Assistant_Harness_Implementation_Plan.docx (v2.0, March 2026) + brainstorming session


Context

The /assistant page currently works as a generic AI chat surface with a task lane side panel and a chat sidebar for session history. It functions well but reads as "AI chat with extras" rather than an MSP engineer's operational tool.

The goal is to reframe the page as a live triage cockpit — the place where an engineer opens a ticket, works through it from intake to resolution, and closes with a structured handoff artifact. The underlying session, branching, and chat architecture is preserved. What changes is layout hierarchy, information density, field labelling, and the conclude output.

The scope is broader than the original docx: it also covers every backend change needed to support the frontend properly.


Design Decisions

1. Overall Layout — Stacked Zones

┌─────────────────────────────────────────────┐
│  Incident Header (labelled fields, 1 row)   │
├────────────────────────┬────────────────────┤
│                        │  FlowPilot Asks    │
│  Steps Checklist       │  (quick replies)   │
│  (left, ~55%)          ├────────────────────┤
│                        │  What We Know      │
│                        │  (evidence list)   │
├────────────────────────┴────────────────────┤  ← drag handle
│  Conversation Log (muted, darker bg)        │
├─────────────────────────────────────────────┤
│  Compose area                               │
└─────────────────────────────────────────────┘
  • Work zone (top) and conversation log (bottom) are drag-resizable via a handle
  • Default split: ~55% work zone, ~45% chat
  • Existing left sidebar (session history) unchanged
  • Compose area is always pinned to bottom, spans full width
  • workZoneHeight persisted to localStorage so split survives refresh

2. Incident Header

Single row with explicit micro-labels above each field:

CLIENT        DEVICE            CATEGORY          HYPOTHESIS
Contoso Ltd ✏  jsmith-desktop ✏  DNS / Network ✏   Corrupted DNS cache on NIC ✏
                                                              [CW #48291] [Resolve ▾] [⋯]
  • Each field has its own ✏ icon (visible on hover) that opens an inline edit popover
  • Fields populate from: (a) intake form on session create, (b) AI-inferred updates mid-session via triage_update, (c) manual engineer edits via PATCH /ai-sessions/{id}/triage
  • PSA ticket number shown if linked; action buttons (Resolve, overflow menu) on the right
  • Empty fields show muted placeholder text — never blank

3. Work Zone — Steps + FlowPilot Asks + What We Know

Left panel (~55%): ordered step checklist

  • Steps displayed as a vertical list: completed, active (blue border, white text), pending
  • Active step is visually distinct
  • "Generate Script" CTA appears at the bottom when a script-generation step is active

Right panel (~45%): two stacked mini-panels

  • FlowPilot Asks (top, amber label): current question from AI. When options are provided, renders as quick-reply buttons — clicking a button submits that answer as a chat message. When no options, renders a compact free-text input. Panel is empty/hidden when no pending question.
  • What We Know (bottom, muted label): running evidence list. Each entry: ✓ confirmed / ✗ ruled out / ? pending. AI appends via triage_update.evidence_items; engineer can manually add or edit entries.

4. Conversation Log Zone

  • Lives below the work zone, separated by a drag handle
  • Background: #13151c (one step darker than page) — visually recedes
  • Label: "CONVERSATION LOG" in muted colour (text-muted)
  • Messages are compact: you: / fp: prefixes instead of full name/avatar bubbles
  • Scrolls independently
  • Not collapsible — drag handle gives control

5. Conclude / Handoff Modal (redesigned)

On opening "Close Case":

  1. Header: "Close Case — [Client Name]" + outcome selector (Resolved / Escalated / Parked)
  2. Structured fields — pre-filled by streaming /handoff-draft, all editable:
    • Root Cause (short text input)
    • Resolution (what fixed it)
    • Steps Taken (list, auto-populated from step checklist)
    • Recommendations (next steps / preventive actions)
  3. Output destinations (checkboxes): Post to CW ticket note / Save to Knowledge Base / Send client summary
  4. Confirm button — triggers resolve/escalate/pause and passes structured fields into ResolutionOutputGenerator

The existing SessionResolutionOutput model and ResolutionOutputGenerator service are reused. The /handoff-draft stream starts immediately on modal open — the engineer can begin editing while fields fill in.


Backend Changes Required

1. New AISession columns (Alembic migration)

Add to ai_sessions table:

| Column | Type | Purpose |
|---|---|---|
| client_name | VARCHAR(255) | MSP client name for incident header |
| asset_name | VARCHAR(255) | Device / asset / user being worked on |
| issue_category | VARCHAR(100) | Human-readable category (e.g. "DNS / Networking") |
| triage_hypothesis | TEXT | Current working hypothesis (AI-updated and engineer-editable) |
| evidence_items | JSONB | "What We Know" list, persisted for session resume |

evidence_items format: [{ "text": str, "status": "confirmed" | "ruled_out" | "pending" }]
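
Because evidence_items is stored as free-form JSONB, it may help to check the shape before writing it. The following is a minimal sketch of such a check using only the standard library; the names EvidenceItem and validate_evidence_items are hypothetical and not part of the existing codebase, and trimming whitespace from text is an assumption rather than a stated requirement.

```python
from __future__ import annotations

from typing import Literal, TypedDict, get_args

EvidenceStatus = Literal["confirmed", "ruled_out", "pending"]


class EvidenceItem(TypedDict):
    text: str
    status: EvidenceStatus


# Tuple of the three allowed status strings, derived from the Literal above.
ALLOWED_STATUSES = get_args(EvidenceStatus)


def validate_evidence_items(raw: object) -> list[EvidenceItem]:
    """Return a cleaned copy of raw, or raise ValueError on a malformed entry."""
    if not isinstance(raw, list):
        raise ValueError("evidence_items must be a list")
    cleaned: list[EvidenceItem] = []
    for i, entry in enumerate(raw):
        if not isinstance(entry, dict):
            raise ValueError(f"item {i} is not an object")
        text = entry.get("text")
        status = entry.get("status")
        if not isinstance(text, str) or not text.strip():
            raise ValueError(f"item {i} has no text")
        if status not in ALLOWED_STATUSES:
            raise ValueError(f"item {i} has invalid status {status!r}")
        # Assumption: surrounding whitespace is not meaningful and is trimmed.
        cleaned.append({"text": text.strip(), "status": status})
    return cleaned
```

The same check could equally be expressed as a Pydantic model in backend/app/schemas/ai_session.py; the plain-function form is used here only to keep the sketch dependency-free.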

Note: problem_domain (existing) is an internal classifier slug. issue_category is the human-readable display label for the header. Both coexist.

2. New PATCH endpoint — triage metadata

PATCH /ai-sessions/{session_id}/triage
Auth: require_engineer_or_admin
Body: { client_name?, asset_name?, issue_category?, triage_hypothesis?, evidence_items? }
Response: { id, client_name, asset_name, issue_category, triage_hypothesis, evidence_items }

Used when the engineer edits any header field or evidence list manually.
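
The partial-update behaviour implied by the optional body fields could look roughly like the sketch below. It works on plain dicts rather than the real AISession model, and apply_triage_patch is a hypothetical helper name. One assumption is made explicit in the code: a key that is omitted leaves the stored value untouched, while a key sent as null clears it. The spec does not say which way null should behave, so this needs confirming during implementation.

```python
from __future__ import annotations

TRIAGE_FIELDS = (
    "client_name",
    "asset_name",
    "issue_category",
    "triage_hypothesis",
    "evidence_items",
)


def apply_triage_patch(session: dict, body: dict) -> dict:
    """Apply a partial triage update in place and return the response payload."""
    unknown = set(body) - set(TRIAGE_FIELDS)
    if unknown:
        raise ValueError(f"unknown triage fields: {sorted(unknown)}")
    for field in TRIAGE_FIELDS:
        # Assumption: omitted keys are left alone; an explicit None clears the field.
        if field in body:
            session[field] = body[field]
    # Response shape follows the spec: id plus all five triage fields.
    response = {"id": session["id"]}
    response.update({field: session.get(field) for field in TRIAGE_FIELDS})
    return response
```

For example, patching only triage_hypothesis on a session that already has client_name set returns the full triage payload with client_name unchanged.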

3. Updated schemas — TriageUpdate and QuestionItem.options

New TriageUpdate model (returned in chat response when AI infers session context):

from pydantic import BaseModel


class TriageUpdate(BaseModel):
    # All fields are optional: the AI sends only the signals it has newly inferred.
    client_name: str | None = None
    asset_name: str | None = None
    issue_category: str | None = None
    triage_hypothesis: str | None = None
    evidence_items: list[dict] | None = None  # appends to existing list

Updated ChatMessageResponse:

class ChatMessageResponse(BaseModel):
    # existing fields unchanged...
    triage_update: TriageUpdate | None = None

Updated QuestionItem — add options for quick-reply buttons:

class QuestionItem(BaseModel):
    text: str
    context: str = ""
    options: list[str] | None = None  # quick-reply labels; null = free-text fallback

4. unified_chat_service.py — triage extraction

After generating each AI response, run a lightweight extraction to populate triage_update. Implementation options (pick one during implementation):

  • Option A (recommended): Embed structured extraction in the system prompt using a [TRIAGE_UPDATE] marker, similar to the existing [QUESTIONS] / [ACTIONS] markers. The AI emits the block only when it has new triage signals; the service parses it.
  • Option B: Post-response extraction pass using a fast model (claude-haiku-4-5) with the last 3 messages as context.
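
For Option A, the parsing step might be sketched as follows. The exact marker syntax is not defined in this spec, so the sketch assumes a JSON payload wrapped in [TRIAGE_UPDATE] ... [/TRIAGE_UPDATE]; the closing tag, the JSON payload, and the function name extract_triage_update are all assumptions and should be aligned with how [QUESTIONS] / [ACTIONS] are actually parsed.

```python
from __future__ import annotations

import json
import re

# Assumed block syntax: [TRIAGE_UPDATE]{...json...}[/TRIAGE_UPDATE]
_TRIAGE_BLOCK = re.compile(r"\[TRIAGE_UPDATE\](.*?)\[/TRIAGE_UPDATE\]", re.DOTALL)

TRIAGE_FIELDS = {
    "client_name",
    "asset_name",
    "issue_category",
    "triage_hypothesis",
    "evidence_items",
}


def extract_triage_update(ai_text: str) -> tuple[str, dict | None]:
    """Split an AI response into (visible text, triage update or None)."""
    match = _TRIAGE_BLOCK.search(ai_text)
    if match is None:
        return ai_text, None
    # Remove the block so the engineer never sees the raw marker in the chat log.
    visible = (ai_text[: match.start()] + ai_text[match.end():]).strip()
    try:
        payload = json.loads(match.group(1))
    except json.JSONDecodeError:
        return visible, None  # malformed block: drop it rather than fail the chat turn
    if not isinstance(payload, dict):
        return visible, None
    # Keep only known, non-null fields so an empty update collapses to None.
    update = {k: v for k, v in payload.items() if k in TRIAGE_FIELDS and v is not None}
    return visible, update or None
```

Treating a malformed block as "no update" is a deliberate choice in this sketch: a triage extraction failure should probably not break the chat response itself.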

When triage_update contains non-null fields, the service auto-PATCHes the session record (so fields are persisted) AND returns triage_update in the response for the frontend to update the header immediately.
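
The merge rule (non-null scalar fields overwrite, evidence_items appends) can be illustrated with a small pure function. This is a sketch over plain dicts, and merge_triage_update is a hypothetical name. One behaviour goes beyond what the spec states and is an assumption: when an incoming evidence item has the same text as an existing one, its status is updated in place instead of adding a duplicate row.

```python
from __future__ import annotations

SCALAR_FIELDS = ("client_name", "asset_name", "issue_category", "triage_hypothesis")


def merge_triage_update(current: dict, update: dict) -> dict:
    """Return a new triage dict with update merged into current."""
    merged = dict(current)
    for field in SCALAR_FIELDS:
        value = update.get(field)
        if value is not None:  # null means "no new signal", so keep the old value
            merged[field] = value
    new_items = update.get("evidence_items")
    if new_items:
        # Copy existing items so the caller's data is not mutated.
        items = [dict(item) for item in (current.get("evidence_items") or [])]
        by_text = {item["text"]: item for item in items}
        for incoming in new_items:
            existing = by_text.get(incoming["text"])
            if existing is not None:
                # Assumption: same text means same finding, so only the status changes.
                existing["status"] = incoming["status"]
            else:
                copy = dict(incoming)
                items.append(copy)
                by_text[copy["text"]] = copy
        merged["evidence_items"] = items
    return merged
```

The same rule applies on the frontend when merging response.triage_update into triageMeta, so keeping the two implementations behaviourally identical avoids the header and the persisted record drifting apart.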

5. New streaming endpoint — handoff draft

POST /ai-sessions/{session_id}/handoff-draft
Auth: require_engineer_or_admin
Response: StreamingResponse (text/event-stream)

Streams a structured handoff JSON object:

{ "root_cause": "...", "resolution": "...", "steps_taken": ["..."], "recommendations": "..." }

Built from session context: problem_summary, triage_hypothesis, evidence_items, conversation_messages (last 20), step checklist from saved task lane state.
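
As a rough illustration of the two halves of this endpoint, the sketch below assembles the prompt context (trimming the conversation to the last 20 messages) and frames a draft as server-sent events. The per-field event payload ({"field", "value"}) and the [DONE] sentinel are assumptions, since the spec fixes only the overall JSON shape and the text/event-stream content type; both function names are hypothetical.

```python
from __future__ import annotations

import json
from typing import Iterator

HANDOFF_FIELDS = ("root_cause", "resolution", "steps_taken", "recommendations")


def build_handoff_context(session: dict, steps: list[str], max_messages: int = 20) -> dict:
    """Collect the session data the draft is generated from."""
    messages = session.get("conversation_messages") or []
    return {
        "problem_summary": session.get("problem_summary"),
        "triage_hypothesis": session.get("triage_hypothesis"),
        "evidence_items": session.get("evidence_items") or [],
        "conversation_messages": messages[-max_messages:],  # last 20 by default
        "steps": steps,
    }


def sse_events(draft: dict) -> Iterator[str]:
    """Yield one SSE frame per populated handoff field, then a terminator frame."""
    for field in HANDOFF_FIELDS:
        if field in draft:
            payload = json.dumps({"field": field, "value": draft[field]})
            # An SSE frame is a "data:" line followed by a blank line.
            yield f"data: {payload}\n\n"
    # Assumption: a sentinel frame tells the modal the draft is complete.
    yield "data: [DONE]\n\n"
```

A generator like sse_events could be handed to a streaming response with media type text/event-stream; emitting one frame per field lets the modal fill each input as soon as its value is ready instead of waiting for the whole object.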

6. Updated conclude schemas

Add optional structured fields to ResolveSessionRequest and EscalateSessionRequest:

root_cause: str | None = None
steps_taken: list[str] | None = None
recommendations: str | None = None

Pass these into ResolutionOutputGenerator._build_session_context() to enrich psa_ticket_notes and client_summary outputs.

7. Session read endpoint — include new triage fields

Ensure the session detail response (GET /ai-sessions/{id}) returns the new fields so the frontend can restore header state on session resume.


Frontend Changes Required

1. AssistantChatPage layout refactor

Replace current layout (sidebar + chat column + TaskLane side panel) with the stacked cockpit layout described above.

New state:

  • triageMeta: TriageMeta, shaped as { client_name, asset_name, issue_category, triage_hypothesis, evidence_items }
  • workZoneHeight: number — persisted to localStorage('rf-assistant-work-zone-height')

On session load / resume: populate triageMeta from the session response (new fields).

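On AI response: if response.triage_update is non-null, merge it into triageMeta as a partial update. Fields the AI returns overwrite the current values; fields it leaves null keep their existing values; evidence_items entries are appended to the existing list.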

2. New component: IncidentHeader

frontend/src/components/assistant/IncidentHeader.tsx

Props: triageMeta, psaTicketId, sessionId, onFieldSave(field, value), onResolve, onOverflow

  • Renders labelled single-row header
  • Each field: micro-label + value + ✏ icon (visible on hover)
  • Clicking the ✏ icon opens an EditPopover (small popover with text input + Save/Cancel)
  • On Save: calls aiSessionsApi.updateTriage(sessionId, { [field]: value })
  • Empty field shows muted placeholder (e.g. "Unknown client")

3. Refactored component: StepsPanel (from TaskLane)

frontend/src/components/assistant/StepsPanel.tsx

Same activeActions data source. Renders as ordered checklist:

  • Completed: status icon + strikethrough label, muted
  • Active: status icon + blue left border, white text, full opacity
  • Pending: status icon + muted text

Script generation CTA: shown at the bottom when the active step's command contains "script", or when the AI has flagged the step for script generation.

4. New component: FlowPilotAsks

frontend/src/components/assistant/FlowPilotAsks.tsx

Props: questions: QuestionItem[], onAnswer(answer: string)

  • Shows first unanswered question (or empty/hidden state if none)
  • When question.options is non-null: renders as button row, clicking calls onAnswer(option)
  • When question.options is null: renders compact text input with Send button
  • onAnswer calls handleSend in the parent page with the answer text

5. New component: WhatWeKnow

frontend/src/components/assistant/WhatWeKnow.tsx

Props: items: EvidenceItem[], onAdd(text, status), onEdit(index, text, status)

  • Renders evidence list with status icons: ✓ (confirmed, green), ✗ (ruled out, red), ? (pending, muted)
  • "+ Add finding" link at bottom opens an inline input row
  • Items are editable inline (click to edit)
  • State lives in AssistantChatPage as part of triageMeta.evidence_items, synced to backend via PATCH /triage

6. Drag handle — resizable split

Implement as a thin handle bar between work zone and conversation log. On drag:

  • Update workZoneHeight in state
  • Persist to localStorage

On mount: restore from localStorage, default to 55% of available height.

7. Compact conversation log

Replace current <ChatMessage> bubble rendering in the log zone with a compact list:

you:  Can't resolve external DNS, internal fine
fp:   Ping passed — layer 3 OK. Run nslookup google.com.
you:  Timed out on 1.1.1.1 too.

ChatMessage component still used for rich rendering (suggested flows, forks) but in a more compact variant. Full bubble rendering available on hover/expand if needed.

8. Redesigned ConcludeSessionModal

Replaces current simple textarea with the structured handoff form. On open:

  1. Call aiSessionsApi.getHandoffDraft(sessionId) — streaming — populate fields as stream arrives
  2. Render outcome selector + 4 structured fields (all <textarea> with labels)
  3. Render output destination checkboxes
  4. On Confirm: call resolve/escalate/pause with enriched request body

9. MSP-native language pass

| Old | New |
|---|---|
| "AI Assistant" | "FlowPilot" |
| "New Chat" | "New Case" |
| "Messages" | "Conversation Log" |
| "Task Lane" (panel header) | "Steps" |
| "Conclude" | "Close Case" |
| "Chat history" (sidebar label) | "Case History" |

What This Is NOT

  • Not a redesign of the FlowPilot session page (/pilot) — separate page, untouched
  • Not a rebuild of session, branching, or PSA architecture
  • Not a new data model for conversations — conversation_messages JSONB is unchanged
  • Not a mobile-first redesign — mobile degrades cleanly but desktop is primary

Verification

Harness Feel Test (primary — subjective)

  • Open /assistant, start a new case: does the page read as an MSP triage cockpit within 3 seconds without reading labels?
  • Is the current active step obvious without scrolling through chat?
  • Do FlowPilot Asks quick-reply buttons submit answers and update the steps list?
  • Does the incident header update mid-session as the AI infers context?
  • Drag the handle, refresh: does the split restore correctly?

Functional Regression

  • Free-text chat, image paste, suggested flows, forks, branching: all work
  • Session pause, resume, and handoff end-to-end: works
  • ConcludeSessionModal resolves / escalates / parks correctly
  • Handoff draft streams and pre-fills the modal fields
  • Manual header edit saves and persists across reload

MSP Scenario Coverage (from docx)

Run end-to-end: single-user endpoint issue · M365/tenant-wide issue · network/VPN outage · escalation and resume after handoff.

Backend Checks

# Migration
alembic upgrade head

# Verify new columns
psql -U postgres -d resolutionflow -c "\d ai_sessions" | grep -E "client_name|asset_name|issue_category|triage_hypothesis|evidence_items"

# Smoke test endpoints (with valid token)
curl -X PATCH .../ai-sessions/{id}/triage -H "Content-Type: application/json" -d '{"client_name":"Test"}'
curl -N -X POST .../ai-sessions/{id}/handoff-draft  # -N disables output buffering; should stream JSON

Critical Files

| File | Change |
|---|---|
| backend/app/models/ai_session.py | Add 5 new columns |
| backend/app/schemas/ai_session.py | Add TriageUpdate, extend QuestionItem, update request/response schemas |
| backend/app/api/endpoints/ai_sessions.py | Add PATCH /{id}/triage, POST /{id}/handoff-draft |
| backend/app/services/unified_chat_service.py | Extract and return triage_update per AI response |
| backend/app/services/resolution_output_generator.py | Accept structured handoff fields in context builder |
| backend/alembic/versions/NNN_add_triage_fields_to_ai_sessions.py | Sequential migration (see note below for NNN) |
| frontend/src/pages/AssistantChatPage.tsx | Full layout refactor into the cockpit structure |
| frontend/src/components/assistant/IncidentHeader.tsx | New component |
| frontend/src/components/assistant/StepsPanel.tsx | Refactored from TaskLane |
| frontend/src/components/assistant/FlowPilotAsks.tsx | New component |
| frontend/src/components/assistant/WhatWeKnow.tsx | New component |
| frontend/src/components/assistant/ConcludeSessionModal.tsx | Redesigned |
| frontend/src/api/aiSessions.ts | Add updateTriage(), getHandoffDraft() |
| frontend/src/types/ai-session.ts | Add TriageUpdate, TriageMeta, EvidenceItem; extend QuestionItem |

Note: to find NNN for the migration filename, check ls backend/alembic/versions/ | sort | tail -1