docs: add MSP assistant harness cockpit design spec
Design spec for evolving /assistant into a live triage cockpit. Covers layout decisions (stacked zones, drag-resizable split), incident header (labelled fields, AI-inferred + editable), work zone (steps checklist + FlowPilot Asks + What We Know), conclude modal redesign, and all required backend changes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
363
docs/cockpit/2026-04-01-msp-assistant-harness-design.md
Normal file
@@ -0,0 +1,363 @@
# MSP Assistant Harness — Design Spec
**Date:** 2026-04-01
**Status:** Draft — pending user review
**Source:** MSP_Assistant_Harness_Implementation_Plan.docx (v2.0, March 2026) + brainstorming session

---

## Context

The `/assistant` page currently works as a generic AI chat surface with a task lane side panel and a chat sidebar for session history. It functions well but reads as "AI chat with extras" rather than an MSP engineer's operational tool.

The goal is to reframe the page as a **live triage cockpit** — the place where an engineer opens a ticket, works through it from intake to resolution, and closes with a structured handoff artifact. The underlying session, branching, and chat architecture is preserved. What changes is layout hierarchy, information density, field labelling, and the conclude output.

Scope is broader than the original docx: it also includes every backend change the frontend needs in order to work properly.

---

## Design Decisions

### 1. Overall Layout — Stacked Zones

```
┌─────────────────────────────────────────────┐
│  Incident Header (labelled fields, 1 row)  │
├────────────────────────┬────────────────────┤
│                        │  FlowPilot Asks    │
│  Steps Checklist       │  (quick replies)   │
│  (left, ~55%)          ├────────────────────┤
│                        │  What We Know      │
│                        │  (evidence list)   │
├────────────────────────┴────────────────────┤ ← drag handle
│  Conversation Log (muted, darker bg)        │
├─────────────────────────────────────────────┤
│  Compose area                               │
└─────────────────────────────────────────────┘
```

- Work zone (top) and conversation log (bottom) are **drag-resizable** via a handle
- Default split: ~55% work zone, ~45% chat
- Existing left sidebar (session history) unchanged
- Compose area is always pinned to bottom, spans full width
- `workZoneHeight` persisted to `localStorage` so split survives refresh

### 2. Incident Header

Single row with explicit micro-labels above each field:

```
CLIENT            DEVICE              CATEGORY          HYPOTHESIS
Contoso Ltd ✏     jsmith-desktop ✏    DNS / Network ✏   Corrupted DNS cache on NIC ✏
                                                [CW #48291]  [Resolve ▾]  [⋯]
```

- Each field has its own `✏` icon (visible on hover) that opens an inline edit popover
- Fields populate from: (a) intake form on session create, (b) AI-inferred updates mid-session via `triage_update`, (c) manual engineer edits via `PATCH /ai-sessions/{id}/triage`
- PSA ticket number shown if linked; action buttons (Resolve, overflow menu) on the right
- Empty fields show muted placeholder text — never blank

### 3. Work Zone — Steps + FlowPilot Asks + What We Know

**Left panel (~55%): ordered step checklist**
- Steps displayed as a vertical list: `✓` completed, `→` active (blue border, white text), `○` pending
- Active step is visually distinct
- "Generate Script" CTA appears at the bottom when a script-generation step is active

**Right panel (~45%): two stacked mini-panels**
- **FlowPilot Asks** (top, amber label): current question from AI. When `options` are provided, renders as quick-reply buttons — clicking a button submits that answer as a chat message. When no `options`, renders a compact free-text input. Panel is empty/hidden when no pending question.
- **What We Know** (bottom, muted label): running evidence list. Each entry: `✓ confirmed` / `✗ ruled out` / `? pending`. AI appends via `triage_update.evidence_items`; engineer can manually add or edit entries.

### 4. Conversation Log Zone

- Lives below the work zone, separated by a **drag handle**
- Background: `#13151c` (one step darker than page) — visually recedes
- Label: "CONVERSATION LOG" in muted colour (`text-muted`)
- Messages are compact: `you:` / `fp:` prefixes instead of full name/avatar bubbles
- Scrolls independently
- Not collapsible — drag handle gives control

### 5. Conclude / Handoff Modal (redesigned)

On opening "Close Case":

1. **Header**: "Close Case — [Client Name]" + outcome selector (Resolved / Escalated / Parked)
2. **Structured fields** — pre-filled by streaming `/handoff-draft`, all editable:
   - **Root Cause** (short text input)
   - **Resolution** (what fixed it)
   - **Steps Taken** (list, auto-populated from step checklist)
   - **Recommendations** (next steps / preventive actions)
3. **Output destinations** (checkboxes): Post to CW ticket note / Save to Knowledge Base / Send client summary
4. **Confirm** button — triggers resolve/escalate/pause and passes structured fields into `ResolutionOutputGenerator`

The existing `SessionResolutionOutput` model and `ResolutionOutputGenerator` service are reused. The `/handoff-draft` stream starts immediately on modal open — the engineer can begin editing while fields fill in.

---

## Backend Changes Required

### 1. New AISession columns (Alembic migration)

Add to `ai_sessions` table:

| Column | Type | Purpose |
|--------|------|---------|
| `client_name` | `VARCHAR(255)` | MSP client name for incident header |
| `asset_name` | `VARCHAR(255)` | Device / asset / user being worked on |
| `issue_category` | `VARCHAR(100)` | Human-readable category (e.g. "DNS / Networking") |
| `triage_hypothesis` | `TEXT` | Current working hypothesis — AI-updated + engineer-editable |
| `evidence_items` | `JSONB` | "What We Know" list — persisted for session resume |

`evidence_items` format: `[{ "text": str, "status": "confirmed" | "ruled_out" | "pending" }]`

Note: `problem_domain` (existing) is an internal classifier slug. `issue_category` is the human-readable display label for the header. Both coexist.
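
To make the `evidence_items` shape concrete, here is a minimal sketch that checks a list before it is written to the JSONB column. The helper name `validate_evidence_items` and the choice to raise `ValueError` are assumptions for illustration, not existing code; in the real service a Pydantic model would probably do this job.

```python
from typing import Any

# Allowed status values, taken from the evidence_items format above.
ALLOWED_STATUSES = {"confirmed", "ruled_out", "pending"}


def validate_evidence_items(items: Any) -> list:
    """Return a cleaned copy of `items`, or raise ValueError if malformed.

    Hypothetical helper: the name and error behaviour are assumptions.
    """
    if not isinstance(items, list):
        raise ValueError("evidence_items must be a list")
    cleaned = []
    for index, item in enumerate(items):
        if not isinstance(item, dict):
            raise ValueError(f"item {index} must be an object")
        text = item.get("text")
        status = item.get("status")
        if not isinstance(text, str) or not text.strip():
            raise ValueError(f"item {index} needs non-empty text")
        if not isinstance(status, str) or status not in ALLOWED_STATUSES:
            raise ValueError(f"item {index} has invalid status {status!r}")
        # Keep only the two documented keys so stray fields are not persisted.
        cleaned.append({"text": text.strip(), "status": status})
    return cleaned
```

Rejecting unknown statuses at write time keeps the "What We Know" panel from having to handle values it has no icon for.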

### 2. New PATCH endpoint — triage metadata

```
PATCH /ai-sessions/{session_id}/triage
Auth: require_engineer_or_admin
Body: { client_name?, asset_name?, issue_category?, triage_hypothesis?, evidence_items? }
Response: { id, client_name, asset_name, issue_category, triage_hypothesis, evidence_items }
```

Used when the engineer edits any header field or evidence list manually.
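
Because every body field is optional, the handler has to tell "field omitted" apart from "field sent". The sketch below shows one way to do that on a plain dict standing in for the session row. `apply_triage_patch` is a hypothetical helper, and treating an explicit `None` as "clear this field" is an assumption the spec does not settle.

```python
# Fields the PATCH body may carry, per the endpoint contract above.
TRIAGE_FIELDS = (
    "client_name",
    "asset_name",
    "issue_category",
    "triage_hypothesis",
    "evidence_items",
)


def apply_triage_patch(session: dict, body: dict) -> dict:
    """Apply a partial triage PATCH to a session record held as a dict.

    Hypothetical helper. Only keys present in `body` are written, so an
    omitted field is left untouched while an explicit None clears it
    (assumed behaviour). Unknown keys are rejected, not silently ignored.
    """
    unknown = set(body) - set(TRIAGE_FIELDS)
    if unknown:
        raise ValueError(f"unknown triage fields: {sorted(unknown)}")
    for field in TRIAGE_FIELDS:
        if field in body:
            session[field] = body[field]
    # Mirror the documented response shape: id plus all five triage fields.
    response = {"id": session["id"]}
    response.update({field: session.get(field) for field in TRIAGE_FIELDS})
    return response
```

With a Pydantic request model, the "only keys present" dict is what `model_dump(exclude_unset=True)` returns in Pydantic v2 (`.dict(exclude_unset=True)` in v1). Note that a manual edit replaces `evidence_items` wholesale, unlike the AI path, which appends.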

### 3. Updated schemas — TriageUpdate and QuestionItem.options

**New `TriageUpdate` model** (returned in chat response when AI infers session context):

```python
class TriageUpdate(BaseModel):
    client_name: str | None = None
    asset_name: str | None = None
    issue_category: str | None = None
    triage_hypothesis: str | None = None
    evidence_items: list[dict] | None = None  # appends to existing list
```

**Updated `ChatMessageResponse`:**
```python
class ChatMessageResponse(BaseModel):
    # existing fields unchanged...
    triage_update: TriageUpdate | None = None
```

**Updated `QuestionItem`** — add `options` for quick-reply buttons:
```python
class QuestionItem(BaseModel):
    text: str
    context: str = ""
    options: list[str] | None = None  # quick-reply labels; null = free-text fallback
```

### 4. unified_chat_service.py — triage extraction

After generating each AI response, run a lightweight extraction to populate `triage_update`. Implementation options (pick one during implementation):

- **Option A (recommended):** Embed structured extraction in the system prompt using an `[TRIAGE_UPDATE]` marker, similar to existing `[QUESTIONS]` / `[ACTIONS]` markers. AI emits the block if it has new triage signals; service parses it.
- **Option B:** Post-response extraction pass using a fast model (`claude-haiku-4-5`) with the last 3 messages as context.
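
For Option A, the parsing step could look roughly like the sketch below. The wire format is an assumption: it supposes the model wraps a JSON object in `[TRIAGE_UPDATE]...[/TRIAGE_UPDATE]`, which may not match how the existing `[QUESTIONS]` / `[ACTIONS]` markers are delimited. `extract_triage_update` is a hypothetical name.

```python
import json
import re
from typing import Optional, Tuple

# Assumed wire format: [TRIAGE_UPDATE]{...json...}[/TRIAGE_UPDATE]
_TRIAGE_BLOCK = re.compile(
    r"\[TRIAGE_UPDATE\](.*?)\[/TRIAGE_UPDATE\]", re.DOTALL
)


def extract_triage_update(ai_text: str) -> Tuple[str, Optional[dict]]:
    """Split an AI reply into (visible_text, triage_update_or_None).

    Hypothetical helper. A missing or malformed block yields None so a bad
    extraction never breaks the chat response itself.
    """
    match = _TRIAGE_BLOCK.search(ai_text)
    if match is None:
        return ai_text, None
    # Remove the block so the engineer never sees the raw marker.
    visible = (ai_text[: match.start()] + ai_text[match.end():]).strip()
    try:
        payload = json.loads(match.group(1))
    except json.JSONDecodeError:
        return visible, None
    if not isinstance(payload, dict):
        return visible, None
    return visible, payload
```

Failing soft matters here: the triage block is a side channel, so a malformed one should cost a header update, not the whole reply.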

When `triage_update` contains non-null fields, the service auto-PATCHes the session record (so fields are persisted) AND returns `triage_update` in the response for the frontend to update the header immediately.
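
The merge applied when persisting an AI update can be sketched as follows. It follows the rules stated in this spec (non-null scalars overwrite, `evidence_items` append); skipping exact duplicate evidence entries is an added assumption, and `merge_triage_update` is a hypothetical name.

```python
SCALAR_FIELDS = (
    "client_name",
    "asset_name",
    "issue_category",
    "triage_hypothesis",
)


def merge_triage_update(current: dict, update: dict) -> dict:
    """Return a new triage dict with an AI `triage_update` folded in.

    Hypothetical helper. Non-null scalars overwrite; null or absent
    scalars keep the current value; evidence_items are appended, skipping
    exact duplicates so repeated inferences do not pile up (assumption).
    """
    merged = dict(current)
    for field in SCALAR_FIELDS:
        value = update.get(field)
        if value is not None:
            merged[field] = value
    # Copy the list so the caller's `current` dict is never mutated.
    existing = list(current.get("evidence_items") or [])
    for item in update.get("evidence_items") or []:
        if item not in existing:
            existing.append(item)
    merged["evidence_items"] = existing
    return merged
```

Returning a new dict rather than mutating in place makes it easy to compare before and after, for example to skip the database write when nothing changed.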

### 5. New streaming endpoint — handoff draft

```
POST /ai-sessions/{session_id}/handoff-draft
Auth: require_engineer_or_admin
Response: StreamingResponse (text/event-stream)
```

Streams a structured handoff JSON object:
```json
{ "root_cause": "...", "resolution": "...", "steps_taken": ["..."], "recommendations": "..." }
```

Built from session context: `problem_summary`, `triage_hypothesis`, `evidence_items`, `conversation_messages` (last 20), step checklist from saved task lane state.
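
A minimal sketch of the two mechanical pieces, assuming the draft is sent as one Server-Sent Events message per partial field. The per-field chunking, the `[DONE]` sentinel, and both helper names are assumptions; the spec only fixes the media type and the final JSON shape.

```python
import json
from typing import Iterable, Iterator


def sse_events(chunks: Iterable[dict]) -> Iterator[str]:
    """Frame each partial handoff dict as one Server-Sent Events message.

    Hypothetical helper. Each message is `data: <json>` followed by a
    blank line, which is the SSE message terminator. A final `[DONE]`
    sentinel (an assumption) tells the client the draft is complete.
    """
    for chunk in chunks:
        # json.dumps escapes newlines, so the payload stays on one data line.
        yield f"data: {json.dumps(chunk)}\n\n"
    yield "data: [DONE]\n\n"


def build_context_window(messages: list, limit: int = 20) -> list:
    """Keep only the most recent `limit` messages, as the spec describes."""
    return messages[-limit:] if limit > 0 else []
```

In FastAPI, a generator like `sse_events(...)` would typically be passed to `StreamingResponse(..., media_type="text/event-stream")`.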

### 6. Updated conclude schemas

Add optional structured fields to `ResolveSessionRequest` and `EscalateSessionRequest`:

```python
root_cause: str | None = None
steps_taken: list[str] | None = None
recommendations: str | None = None
```

Pass these into `ResolutionOutputGenerator._build_session_context()` to enrich `psa_ticket_notes` and `client_summary` outputs.
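
How the context builder uses these fields is left open. One possibility, sketched under assumptions, is to render them as a plain-text block and append it to the existing context. `format_structured_notes` and its section titles are hypothetical and say nothing about how `_build_session_context()` is currently written.

```python
from typing import List, Optional


def format_structured_notes(
    root_cause: Optional[str] = None,
    steps_taken: Optional[List[str]] = None,
    recommendations: Optional[str] = None,
) -> str:
    """Render the optional handoff fields as a plain-text block.

    Hypothetical helper; section titles are placeholders. Fields that are
    None or empty are skipped so clients that omit them still work.
    """
    sections = []
    if root_cause:
        sections.append(f"Root cause: {root_cause}")
    if steps_taken:
        numbered = "\n".join(
            f"  {number}. {step}" for number, step in enumerate(steps_taken, 1)
        )
        sections.append(f"Steps taken:\n{numbered}")
    if recommendations:
        sections.append(f"Recommendations: {recommendations}")
    return "\n".join(sections)
```

Because all three fields default to `None`, existing callers that send only the current request body would produce an empty string and leave the generated outputs unchanged.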

### 7. Session read endpoint — include new triage fields

Ensure the session detail response (`GET /ai-sessions/{id}`) returns the new fields so the frontend can restore header state on session resume.

---

## Frontend Changes Required

### 1. AssistantChatPage layout refactor

Replace current layout (sidebar + chat column + TaskLane side panel) with the stacked cockpit layout described above.

**New state:**
- `triageMeta: TriageMeta` — `{ client_name, asset_name, issue_category, triage_hypothesis, evidence_items }`
- `workZoneHeight: number` — persisted to `localStorage('rf-assistant-work-zone-height')`

**On session load / resume:** populate `triageMeta` from the session response (new fields).

**On AI response:** if `response.triage_update` is non-null, merge it into `triageMeta` as a partial update: non-null fields from the AI overwrite the current value, null fields leave it untouched, and `evidence_items` entries are appended to the existing list.

### 2. New component: `IncidentHeader`

```
frontend/src/components/assistant/IncidentHeader.tsx
```

Props: `triageMeta`, `psaTicketId`, `sessionId`, `onFieldSave(field, value)`, `onResolve`, `onOverflow`

- Renders labelled single-row header
- Each field: micro-label + value + `✏` icon (visible on hover)
- `✏` opens an `EditPopover` (small popover with text input + Save/Cancel)
- On Save: calls `aiSessionsApi.updateTriage(sessionId, { [field]: value })`
- Empty field shows muted placeholder (e.g. "Unknown client")

### 3. Refactored component: `StepsPanel` (from TaskLane)

```
frontend/src/components/assistant/StepsPanel.tsx
```

Same `activeActions` data source. Renders as ordered checklist:
- Completed: `✓` + strikethrough label, muted
- Active: `→` + blue left border, white text, full opacity
- Pending: `○` + muted text

Script generation CTA: shown at bottom when the active step has `command` containing "script" or when AI has flagged it.

### 4. New component: `FlowPilotAsks`

```
frontend/src/components/assistant/FlowPilotAsks.tsx
```

Props: `questions: QuestionItem[]`, `onAnswer(answer: string)`

- Shows first unanswered question (or empty/hidden state if none)
- When `question.options` is non-null: renders as button row, clicking calls `onAnswer(option)`
- When `question.options` is null: renders compact text input with Send button
- `onAnswer` calls `handleSend` in the parent page with the answer text

### 5. New component: `WhatWeKnow`

```
frontend/src/components/assistant/WhatWeKnow.tsx
```

Props: `items: EvidenceItem[]`, `onAdd(text, status)`, `onEdit(index, text, status)`

- Renders evidence list with status icons: `✓` (confirmed, green), `✗` (ruled out, red), `?` (pending, muted)
- "+ Add finding" link at bottom opens an inline input row
- Items are editable inline (click to edit)
- State lives in `AssistantChatPage` as part of `triageMeta.evidence_items`, synced to backend via `PATCH /triage`

### 6. Drag handle — resizable split

Implement as a thin handle bar between work zone and conversation log. On drag:
- Update `workZoneHeight` in state
- Persist to `localStorage`

On mount: restore from `localStorage`, default to `55%` of available height.
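
The restore rule is language-independent, so it is sketched here in Python to match the other examples; the real logic would live in the TypeScript page component. The minimum zone height of 120 px and the name `restore_work_zone_height` are assumptions. The point is that a stored value should be validated and clamped, since the window may be a different size than when the value was saved.

```python
from typing import Optional

DEFAULT_RATIO = 0.55  # default work-zone share from the spec


def restore_work_zone_height(
    stored: Optional[str],
    available: float,
    min_zone: float = 120.0,  # assumed minimum height for either zone, in px
) -> float:
    """Return the work-zone height in px from a stored string value.

    Hypothetical helper. Falls back to 55% of the available height when
    nothing usable is stored, then clamps so neither zone collapses.
    """
    try:
        height = float(stored) if stored is not None else None
    except ValueError:
        height = None
    # `height != height` is only true for NaN, which float("nan") can produce.
    if height is None or height != height or height <= 0:
        height = available * DEFAULT_RATIO
    upper = max(min_zone, available - min_zone)
    return min(max(height, min_zone), upper)
```

The same clamp can be reused while dragging so the handle cannot push either zone below the minimum.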

### 7. Compact conversation log

Replace current `<ChatMessage>` bubble rendering in the log zone with a compact list:

```
you: Can't resolve external DNS, internal fine
fp: Ping passed — layer 3 OK. Run nslookup google.com.
you: Timed out on 1.1.1.1 too.
```

`ChatMessage` component still used for rich rendering (suggested flows, forks) but in a more compact variant. Full bubble rendering available on hover/expand if needed.

### 8. Redesigned `ConcludeSessionModal`

Replaces current simple textarea with the structured handoff form. On open:
1. Call `aiSessionsApi.getHandoffDraft(sessionId)` — streaming — populate fields as stream arrives
2. Render outcome selector + 4 structured fields (all `<textarea>` with labels)
3. Render output destination checkboxes
4. On Confirm: call resolve/escalate/pause with enriched request body

### 9. MSP-native language pass

| Old | New |
|-----|-----|
| "AI Assistant" | "FlowPilot" |
| "New Chat" | "New Case" |
| "Messages" | "Conversation Log" |
| "Task Lane" (panel header) | "Steps" |
| "Conclude" | "Close Case" |
| "Chat history" (sidebar label) | "Case History" |

---

## What This Is NOT

- Not a redesign of the FlowPilot session page (`/pilot`) — separate page, untouched
- Not a rebuild of session, branching, or PSA architecture
- Not a new data model for conversations — `conversation_messages` JSONB is unchanged
- Not a mobile-first redesign — mobile degrades cleanly but desktop is primary

---

## Verification

### Harness Feel Test (primary — subjective)
- Open `/assistant`, start a new case: does the page read as an MSP triage cockpit within 3 seconds without reading labels?
- Is the current active step obvious without scrolling through chat?
- Do FlowPilot Asks quick-reply buttons submit answers and update the steps list?
- Does the incident header update mid-session as the AI infers context?
- Drag the handle, refresh: does the split restore correctly?

### Functional Regression
- Free-text chat, image paste, suggested flows, forks, branching: all work
- Session pause, resume, and handoff end-to-end: works
- ConcludeSessionModal resolves / escalates / parks correctly
- Handoff draft streams and pre-fills the modal fields
- Manual header edit saves and persists across reload

### MSP Scenario Coverage (from docx)
Run end-to-end: single-user endpoint issue · M365/tenant-wide issue · network/VPN outage · escalation and resume after handoff.

### Backend Checks
```bash
# Migration
alembic upgrade head

# Verify new columns
psql -U postgres -d resolutionflow -c "\d ai_sessions" | grep -E "client_name|asset_name|issue_category|triage_hypothesis|evidence_items"

# Smoke test endpoints (with valid token)
# A JSON body needs an explicit Content-Type; curl -d defaults to form encoding
curl -X PATCH .../ai-sessions/{id}/triage -H "Content-Type: application/json" -d '{"client_name":"Test"}'
# -N turns off curl's output buffering so streamed chunks print as they arrive
curl -N -X POST .../ai-sessions/{id}/handoff-draft  # should stream JSON
```

---

## Critical Files

| File | Change |
|------|--------|
| `backend/app/models/ai_session.py` | Add 5 new columns |
| `backend/app/schemas/ai_session.py` | Add `TriageUpdate`, extend `QuestionItem`, update request/response schemas |
| `backend/app/api/endpoints/ai_sessions.py` | Add `PATCH /{id}/triage`, `POST /{id}/handoff-draft` |
| `backend/app/services/unified_chat_service.py` | Extract and return `triage_update` per AI response |
| `backend/app/services/resolution_output_generator.py` | Accept structured handoff fields in context builder |
| `backend/alembic/versions/NNN_add_triage_fields_to_ai_sessions.py` | Sequential migration (check `ls backend/alembic/versions/ \| sort \| tail -1` for NNN) |
| `frontend/src/pages/AssistantChatPage.tsx` | Full layout refactor — cockpit structure |
| `frontend/src/components/assistant/IncidentHeader.tsx` | New component |
| `frontend/src/components/assistant/StepsPanel.tsx` | Refactored from `TaskLane` |
| `frontend/src/components/assistant/FlowPilotAsks.tsx` | New component |
| `frontend/src/components/assistant/WhatWeKnow.tsx` | New component |
| `frontend/src/components/assistant/ConcludeSessionModal.tsx` | Redesigned |
| `frontend/src/api/aiSessions.ts` | Add `updateTriage()`, `getHandoffDraft()` |
| `frontend/src/types/ai-session.ts` | Add `TriageUpdate`, `TriageMeta`, `EvidenceItem`; extend `QuestionItem` |