MSP Assistant Harness — Super Plan
Date: 2026-04-01
Status: Approved — ready to execute
Sources: MSP_Assistant_Harness_Implementation_Plan.docx (v2.0) + 2026-04-01-msp-assistant-harness-design.md (brainstorming session)
Goal
Reframe /assistant from a generic AI chat surface into a live MSP triage cockpit. An engineer arrives with an open ticket; the page immediately reads as their operational tool — not an AI chatbot that's been adapted for IT work.
The change is a UI and data layer reframe. The existing session, branching, PSA, and conclude architecture is preserved and extended, not rebuilt.
Key Architectural Choices
This plan explicitly chooses:
- FlowPilot as the primary page/product label (not "Assistant Harness")
- Backend triage + handoff contracts required in v1, not deferred to a later phase
- Desktop-first cockpit layout with clean mobile degradation
- Explicit persisted triage fields on the session model, not purely derived/computed header state
- Prompt-embedded structured extraction ([TRIAGE_UPDATE] marker) as the primary AI triage path, with a post-response model pass only as fallback
- Sidebar visual demotion: the existing sidebar stays but is visually de-emphasized so the cockpit reads as an operations surface, not a chat app
What Phase 0 Resolved
The brainstorming session (2026-04-01) locked these decisions. They are not open questions.
| Question | Decision |
|---|---|
| Layout structure | Stacked zones: incident header → work zone → (drag handle) → conversation log → compose |
| Incident header style | Single row, explicit micro-labels above each field, per-field ✏ edit |
| Work zone left panel | Ordered step checklist (✓ / → / ○) |
| Work zone right panel | Two stacked mini-panels: FlowPilot Asks (top) + What We Know (bottom) |
| Chat zone treatment | Drag-resizable split, compact you: / fp: prefix style, darker background |
| Chat collapsibility | Not collapsible — drag handle gives control |
| Scope | Includes all required backend changes, not UI-layer only |
| Conclude modal | Fully redesigned as structured handoff artifact |
| Page label | "FlowPilot" (not "AI Assistant") |
| "New Chat" label | "New Case" |
| "Conclude" label | "Close Case" |
| Hypothesis language | "Hypothesis" (direct, not softened to "working theory") |
| What We Know editability | Engineer-editable + AI-appended |
| Header field population | Intake form + AI-inferred mid-session + manual engineer override |
Cockpit Layout
┌─────────────────────────────────────────────────────────────┐
│ [Left sidebar — Case History, unchanged] │
│ ┌───────────────────────────────────────────────────────┐ │
│ │ INCIDENT HEADER (single row, labelled fields) │ │
│ │ CLIENT DEVICE CATEGORY HYPOTHESIS │ │
│ │ Contoso ✏ jsmith-04 ✏ DNS/Net ✏ Cache fail ✏ │ │
│ │ [CW #48291][Resolve⋯]│ │
│ ├───────────────────────┬───────────────────────────────┤ │
│ │ │ ▸ FLOWPILOT ASKS (amber) │ │
│ │ STEPS (~55%) │ Did nslookup time out? │ │
│ │ ✓ Ping 8.8.8.8 │ [Time out] [Wrong IP] [Both] │ │
│ │ → nslookup ←active ├───────────────────────────────┤ │
│ │ ○ Flush DNS │ WHAT WE KNOW │ │
│ │ ○ Check NIC │ ✓ Gateway reachable │ │
│ │ │ ✗ DNS 1.1.1.1 — timeout │ │
│ │ [⚡ Generate Script] │ ? DNS 8.8.8.8 — pending │ │
│ ├───────────────────────┴───── ≡ drag handle ───────────┤ │
│ │ CONVERSATION LOG (compact, darker bg) │ │
│ │ you: Can't resolve external DNS, internal fine │ │
│ │ fp: Ping test passed. Run nslookup google.com. │ │
│ │ you: Timed out on 1.1.1.1 too. │ │
│ ├───────────────────────────────────────────────────────┤ │
│ │ Describe next finding or ask FlowPilot... [Send] │ │
│ └───────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
Contract Decisions (Codex Readiness Review)
The following decisions were flagged as ambiguous by the Codex readiness review. Each is now resolved.
RED — Canonical handoff artifact
Decision: ResolutionOutputGenerator is the single canonical generator. Everything else is transport or UI.
- POST /handoff-draft is a preview endpoint: it streams a draft for the conclude modal UI. It does not persist and does not generate final artifacts.
- On confirm (resolve/escalate), the page calls the existing resolve/escalate endpoints, which trigger ResolutionOutputGenerator.generate_all() as today. The structured fields from the modal (root_cause, steps_taken, recommendations) are passed into _build_session_context() to enrich the final outputs.
- /documentation/stream and /status-update remain untouched; they are separate transport channels for the same canonical outputs.
- /handoff-draft is assistant-only in v1 (not shared with guided FlowPilot sessions on /pilot).
RED — AI vs manual field authority
Decision: Manual edits win. AI does not overwrite manual edits.
| Rule | Behavior |
|---|---|
| AI auto-fill | Only fills fields that are currently null or empty. Never overwrites a non-null value. |
| Manual edit | Persists immediately via PATCH /triage. Sets the field as "manually set." |
| AI after manual edit | AI may suggest an update (shown as a subtle inline prompt: "FlowPilot suggests: Contoso Corp → Contoso Ltd"), but does not auto-write. |
| Evidence items — AI | Appends new items only. Does not modify or remove existing items. |
| Evidence items — engineer | Full authority: add, edit status, edit text, remove. |
Implementation: add a triage_manual_fields set (stored in frontend localStorage per session) tracking which fields the engineer has manually edited. AI triage_update skips those fields unless the engineer explicitly accepts the suggestion.
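The authority rules in the table above can be sketched as a single pure function. This is a minimal illustration, not the planned implementation: it is written in Python only to match the backend examples in this plan (the real merge may live in the frontend), and the function name apply_ai_update and the HEADER_FIELDS constant are hypothetical.

```python
from typing import Optional

# Scalar header fields named in this plan (evidence_items is handled separately).
HEADER_FIELDS = ("client_name", "asset_name", "issue_category", "triage_hypothesis")


def apply_ai_update(
    current: dict[str, Optional[str]],
    ai_update: dict[str, Optional[str]],
    manual_fields: set[str],
) -> tuple[dict[str, Optional[str]], dict[str, str]]:
    """Return (new_header, suggestions) without mutating the inputs.

    An AI value fills a field only when that field is currently null or
    empty and the engineer has not manually set it. Any other differing
    AI value is returned as a suggestion for the inline prompt.
    """
    merged = dict(current)
    suggestions: dict[str, str] = {}
    for field in HEADER_FIELDS:
        proposed = ai_update.get(field)
        if not proposed:
            continue  # AI had no new signal for this field
        existing = merged.get(field)
        if not existing and field not in manual_fields:
            merged[field] = proposed  # auto-fill an empty, untouched field
        elif proposed != existing:
            suggestions[field] = proposed  # never auto-written
    return merged, suggestions
```

With client_name manually set to "Contoso Corp", an AI proposal of "Contoso Ltd" comes back only as a suggestion, while an empty asset_name is filled directly.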
RED — evidence_items write model
Decision: Full-list replacement for all writes. Keep it simple.
- PATCH /triage sends the complete evidence_items array. Backend replaces the stored array.
- AI appends: frontend receives triage_update.evidence_items, appends to the current local list, then PATCHes the full merged list.
- Engineer edits: frontend modifies the local list, PATCHes the full list.
- No partial-update or append-only semantics on the backend. The frontend is the merge authority.
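The merge step described above might look like the following sketch. It is shown in Python for consistency with the other examples, although the plan places this logic in the frontend. The name merge_ai_evidence is hypothetical, and skipping AI items whose text already exists locally is an assumption of this sketch, not a rule stated in the plan.

```python
def merge_ai_evidence(local_items: list[dict], ai_items: list[dict]) -> list[dict]:
    """Return the full list to send in PATCH /triage.

    Existing items are copied through unchanged and AI items are appended.
    An AI item whose text already exists locally is skipped (assumption),
    so the AI can never modify an engineer's entry.
    """
    merged = [dict(item) for item in local_items]
    seen = {item["text"] for item in merged}
    for item in ai_items:
        if item["text"] in seen:
            continue  # AI may not modify or duplicate an existing item
        merged.append({"text": item["text"], "status": item["status"]})
        seen.add(item["text"])
    return merged
```

The result is always the complete list, which matches the full-list replacement contract: the backend never needs to know which items were new.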
YELLOW — TaskLane persistence in StepsPanel
Decision: StepsPanel is presentation only. All persistence behavior stays in AssistantChatPage.
TaskLane currently owns sessionStorage drafts, debounced backend saves, and restoration. In the cockpit refactor:
- AssistantChatPage lifts all persistence logic out of TaskLane into the page (or a custom hook like useTaskPersistence)
- StepsPanel receives activeActions as a prop and renders them, with no persistence responsibility
- TaskLane.tsx remains in the codebase untouched (other pages may still use it)
YELLOW — Quick-reply submission semantics
Decision: Quick replies are immediate-send controls.
- Clicking a quick-reply button calls handleSend(option); the answer goes directly to the AI as a chat message
- No local-only "select then send" workflow
- The answer appears in the conversation log as a regular you: message
- This is a full-stack change: prompt instructions must tell the AI to include options on constrained questions, parser must extract them, schema must carry them, frontend must render and submit them
YELLOW — issue_category format
Decision: Free text in v1. No controlled taxonomy.
- AI infers a human-readable category string (e.g., "DNS / Networking", "Microsoft 365", "Active Directory")
- Engineer can edit to any value via the header ✏ popover
- Future: may introduce a taxonomy dropdown populated from session history, but not in v1
YELLOW — asset_name when user and device differ
Decision: Free text. The engineer enters whatever is most operationally relevant.
- Could be a device name ("jsmith-desktop-04"), a user ("John Smith"), or both ("jsmith-desktop-04 / John Smith")
- AI infers from conversation context — typically the entity being troubleshot
- No enforced format in v1
YELLOW — Structured conclude fields persistence
Decision: Structured conclude fields (root_cause, steps_taken, recommendations) are passed through to ResolutionOutputGenerator but are NOT stored as separate session columns.
- They arrive in the resolve/escalate request body
- _build_session_context() uses them to generate richer PSA notes and client summaries
- The generated outputs (stored in session_resolution_outputs) are the persisted artifacts
- If we later need the raw structured fields, add columns then, not speculatively now
Fallback — [TRIAGE_UPDATE] unreliability
Decision: If prompt-embedded extraction proves unreliable after testing against 5 real sessions:
- First fallback: Post-response extraction using claude-haiku-4-5 with the last 3 messages as context. Cheap, fast, decoupled from the main prompt.
- Second fallback: Fully manual header. The engineer fills in fields and the AI never auto-updates. The cockpit still works; it just requires more manual input.
Gate: Phase 2 step 15 ("verify extraction in a live session") must pass before wiring triage_update into the visible header.
Implementation Guardrails
These are hard rules during implementation, not suggestions.
- Do not let AI write speculative values into the header. Every AI-inferred field must trace to ticket data or explicit conversation evidence. If the AI can't ground it, the field stays empty.
- Do not redesign conclude UX until the canonical handoff source-of-truth is wired. Phase 6 (conclude modal) depends on Phase 1 (backend) being stable.
- Do not treat TaskLane as presentation-only until its persistence behavior has been lifted. Extract persistence into a hook or the page before building StepsPanel.
- Do not wire header auto-updates from [TRIAGE_UPDATE] until real-session reliability is tested. Phase 2 step 15 is a gate.
- Run npx tsc -b after every phase. Do not batch TypeScript error fixes (lesson #92).
Non-Goals
- No redesign of /pilot (FlowPilot session page): separate page, untouched
- No rebuild of session, branching, or PSA architecture
- No new data model for conversations: conversation_messages JSONB unchanged
- No mobile-first redesign: mobile degrades cleanly, desktop is primary
- No generic "assistant polish" that does not tighten the harness
Backend Changes
B1 — Alembic migration 071
File: backend/alembic/versions/071_add_triage_fields_to_ai_sessions.py
Add to ai_sessions:
| Column | Type | Notes |
|---|---|---|
| client_name | VARCHAR(255) | MSP client for incident header |
| asset_name | VARCHAR(255) | Device / user being worked on |
| issue_category | VARCHAR(100) | Human-readable category ("DNS / Networking") |
| triage_hypothesis | TEXT | Working hypothesis; AI-updated + editable |
| evidence_items | JSONB | What We Know list; persisted for resume |
evidence_items schema: [{ "text": str, "status": "confirmed" | "ruled_out" | "pending" }]
Note: existing problem_domain is an internal classifier slug and is unchanged. issue_category is the human-readable display label. Both coexist.
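Because evidence_items is stored as free-form JSONB, it may be worth validating the shape before writing. The following is a minimal stdlib sketch of the schema given above; the names EvidenceItem, VALID_STATUSES, and parse_evidence_items are illustrative, and rejecting empty text is an assumption not stated in the plan.

```python
from dataclasses import dataclass

# The three statuses listed in the evidence_items schema.
VALID_STATUSES = {"confirmed", "ruled_out", "pending"}


@dataclass(frozen=True)
class EvidenceItem:
    text: str
    status: str

    def __post_init__(self) -> None:
        # Assumption: an item with blank text is never meaningful.
        if not self.text.strip():
            raise ValueError("evidence text must not be empty")
        if self.status not in VALID_STATUSES:
            raise ValueError(f"unknown status: {self.status!r}")


def parse_evidence_items(raw: list[dict]) -> list[EvidenceItem]:
    """Validate a raw JSON list against the evidence_items schema."""
    return [EvidenceItem(text=r["text"], status=r["status"]) for r in raw]
```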
B2 — Updated schemas (backend/app/schemas/ai_session.py)
New TriageUpdate:
from pydantic import BaseModel

class TriageUpdate(BaseModel):
    client_name: str | None = None
    asset_name: str | None = None
    issue_category: str | None = None
    triage_hypothesis: str | None = None
    # New AI-proposed items only; the frontend appends them and PATCHes the full list
    evidence_items: list[dict] | None = None
Updated ChatMessageResponse:
class ChatMessageResponse(BaseModel):
    # ... existing fields unchanged ...
    triage_update: TriageUpdate | None = None
Updated QuestionItem — add quick-reply options:
class QuestionItem(BaseModel):
    text: str
    context: str = ""
    options: list[str] | None = None  # quick-reply labels; null → free-text input
Updated ResolveSessionRequest / EscalateSessionRequest:
root_cause: str | None = None
steps_taken: list[str] | None = None
recommendations: str | None = None
B3 — New PATCH /ai-sessions/{id}/triage endpoint
PATCH /ai-sessions/{session_id}/triage
Auth: require_engineer_or_admin
Body: { client_name?, asset_name?, issue_category?, triage_hypothesis?, evidence_items? }
Response: { id, client_name, asset_name, issue_category, triage_hypothesis, evidence_items }
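Called on every manual header field edit. Partial update: only supplied fields are written. When evidence_items is supplied, the stored array is replaced in full (see the evidence_items write model decision).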
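The partial-update semantics can be illustrated with plain dictionaries. This is a sketch under stated assumptions, not the endpoint itself: apply_triage_patch and TRIAGE_FIELDS are hypothetical names, and treating an explicit null as "clear the field" is an assumption. In a Pydantic-based endpoint the supplied-only view of the body is commonly obtained by dumping the request model with exclude_unset=True.

```python
# The five triage fields accepted by the PATCH body in this plan.
TRIAGE_FIELDS = {
    "client_name",
    "asset_name",
    "issue_category",
    "triage_hypothesis",
    "evidence_items",
}


def apply_triage_patch(session: dict, body: dict) -> dict:
    """Return a copy of the session with only the supplied fields written.

    Keys absent from the body are left untouched. evidence_items, when
    present, replaces the stored list wholesale (no append semantics).
    """
    unknown = set(body) - TRIAGE_FIELDS
    if unknown:
        raise ValueError(f"unsupported fields: {sorted(unknown)}")
    updated = dict(session)
    for field, value in body.items():
        updated[field] = value  # explicit null clears the field (assumption)
    return updated
```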
B4 — New POST /ai-sessions/{id}/handoff-draft endpoint
POST /ai-sessions/{session_id}/handoff-draft
Auth: require_engineer_or_admin
Response: StreamingResponse (text/event-stream)
Streams structured handoff JSON built from session context:
{ "root_cause": "...", "resolution": "...", "steps_taken": ["..."], "recommendations": "..." }
Uses: problem_summary, triage_hypothesis, evidence_items, last 20 conversation_messages, saved task lane state.
Called immediately on conclude modal open — engineer can edit while stream fills in.
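One way to frame the stream is one server-sent event per handoff field, so the modal can fill fields as they arrive. The sketch below shows only the text/event-stream framing (a data: line followed by a blank line). The chunk shape {"field", "value"}, the [DONE] sentinel, and the name handoff_draft_events are all assumptions; the plan does not define HandoffDraftChunk.

```python
import json
from typing import Iterator

# Field order mirrors the handoff JSON shown above.
HANDOFF_FIELDS = ("root_cause", "resolution", "steps_taken", "recommendations")


def handoff_draft_events(draft: dict) -> Iterator[str]:
    """Yield one SSE frame per available field, then a terminator frame."""
    for field in HANDOFF_FIELDS:
        if field in draft:
            # json.dumps emits a single line, so the frame stays one data: line.
            payload = json.dumps({"field": field, "value": draft[field]})
            yield f"data: {payload}\n\n"
    yield "data: [DONE]\n\n"  # hypothetical end-of-stream sentinel
```

A generator like this could be handed to a streaming response object; how the draft itself is produced from session context is outside this sketch.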
B5 — unified_chat_service.py — triage extraction
After each AI response, extract triage signals and return as triage_update.
Recommended approach: Add a [TRIAGE_UPDATE] structured marker to the system prompt, following the existing [QUESTIONS] / [ACTIONS] / [FORK] marker pattern. The AI emits the block only when it has new signal:
[TRIAGE_UPDATE]
client_name: Contoso Ltd
issue_category: DNS / Networking
triage_hypothesis: Corrupted DNS cache on NIC
evidence_items:
- confirmed: Gateway 192.168.1.1 reachable
- ruled_out: DNS 1.1.1.1 — timeout
[/TRIAGE_UPDATE]
Service parses this, strips it from display_content, auto-PATCHes the session record, and returns triage_update in the response.
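The parse-and-strip step could be sketched as follows. This is an illustration of the marker format shown above, not the service's actual parser: parse_triage_update is a hypothetical name, and silently ignoring unknown keys or unknown statuses is an assumption of the sketch.

```python
import re
from typing import Optional

_BLOCK = re.compile(r"\[TRIAGE_UPDATE\](.*?)\[/TRIAGE_UPDATE\]", re.DOTALL)
SCALAR_FIELDS = {"client_name", "asset_name", "issue_category", "triage_hypothesis"}
STATUSES = {"confirmed", "ruled_out", "pending"}


def parse_triage_update(content: str) -> tuple[str, Optional[dict]]:
    """Return (display_content, triage_update); the update is None if no signal."""
    match = _BLOCK.search(content)
    if match is None:
        return content, None
    update: dict = {}
    in_evidence = False
    for raw in match.group(1).splitlines():
        line = raw.strip()
        if not line:
            continue
        if in_evidence and line.startswith("- "):
            # Evidence lines look like "- <status>: <text>".
            status, _, text = line[2:].partition(":")
            if status.strip() in STATUSES and text.strip():
                update.setdefault("evidence_items", []).append(
                    {"text": text.strip(), "status": status.strip()}
                )
            continue
        key, _, value = line.partition(":")
        key = key.strip()
        in_evidence = key == "evidence_items"
        if key in SCALAR_FIELDS and value.strip():
            update[key] = value.strip()
    # Remove the marker block from what the engineer sees.
    display = (content[: match.start()] + content[match.end():]).strip()
    return display, (update or None)
```

A response without the marker passes through unchanged with a None update, which keeps the "emit only when there is new signal" behavior cheap to handle.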
B6 — resolution_output_generator.py — accept structured fields
Update _build_session_context() to incorporate root_cause, steps_taken, and recommendations when supplied, producing richer psa_ticket_notes and client_summary outputs.
B7 — Session detail response — expose new triage fields
GET /ai-sessions/{id} (and the session list item) must return the 5 new fields so the frontend can restore header state on session load and resume.
Frontend Changes
F1 — AssistantChatPage.tsx — cockpit layout refactor
Replace current layout (sidebar + chat column + TaskLane right rail) with the stacked cockpit structure.
New state:
- triageMeta: TriageMeta, holding { client_name, asset_name, issue_category, triage_hypothesis, evidence_items }
- workZoneHeight: number, persisted to localStorage('rf-assistant-work-zone-height')
On session load / resume: populate triageMeta from session response new fields.
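On AI response: if response.triage_update is non-null, merge into triageMeta under the field-authority rules: fill only fields that are currently null or empty, skip fields listed in triage_manual_fields (surface those as suggestions instead), and append new evidence_items without modifying existing ones.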
Work zone layout: left StepsPanel + right column with FlowPilotAsks stacked above WhatWeKnow.
Chat zone layout: compact ConversationLog below drag handle, independent scroll.
F2 — New IncidentHeader.tsx
frontend/src/components/assistant/IncidentHeader.tsx
Props: triageMeta: TriageMeta, psaTicketId: string | null, sessionId: string, onFieldSave(field, value), onResolve(), onOverflow()
- Single-row bar with micro-labels (CLIENT / DEVICE / CATEGORY / HYPOTHESIS)
- Each field: ✏ icon visible on hover → opens inline EditPopover (text input + Save/Cancel)
- On Save: calls aiSessionsApi.updateTriage(sessionId, { [field]: value })
- Empty fields: muted placeholder ("Unknown client", "No device specified", etc.)
- Right side: PSA ticket badge (if linked) + Resolve button + ⋯ overflow menu
F3 — Refactored StepsPanel.tsx (from TaskLane)
frontend/src/components/assistant/StepsPanel.tsx
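Presentation only: renders the activeActions it receives as a prop. Persistence (sessionStorage drafts, debounced saves, restoration) lives in AssistantChatPage or the useTaskPersistence hook, per the TaskLane persistence decision. Rendering by state: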
| State | Icon | Style |
|---|---|---|
| Completed | ✓ | Strikethrough, muted, green icon |
| Active | → | Blue left border, white text, full opacity |
| Pending | ○ | Muted text |
Script generation CTA: shown at bottom when active step command references "script" or AI has flagged it.
TaskLane.tsx can remain for now (no renames required in this phase) — StepsPanel is a new component that consumes the same activeActions prop.
F4 — New FlowPilotAsks.tsx
frontend/src/components/assistant/FlowPilotAsks.tsx
Props: questions: QuestionItem[], onAnswer(answer: string)
- Renders first unanswered question
- question.options non-null → button row; clicking calls onAnswer(option)
- question.options null → compact text input + Send
- onAnswer calls parent's handleSend with the answer string
- Hidden entirely when questions is empty
F5 — New WhatWeKnow.tsx
frontend/src/components/assistant/WhatWeKnow.tsx
Props: items: EvidenceItem[], onAdd(text, status), onEdit(index, text, status)
- Evidence list: ✓ confirmed (green) / ✗ ruled out (red) / ? pending (muted)
- "+ Add finding" inline entry at bottom
- Click any item to edit inline
- State lives in AssistantChatPage (triageMeta.evidence_items), synced to backend via PATCH /triage
F6 — Drag-resizable split
Thin handle bar between work zone and conversation log. On drag: update workZoneHeight in state, persist to localStorage. On mount: restore, default 55%.
F7 — Compact ConversationLog rendering
Replace current full <ChatMessage> bubbles in the log zone with a compact list: you: ... / fp: ... prefix style, tighter line height, no avatars. ChatMessage can still be used for rich content (forks, suggested flows) in a compact variant. Individual messages should support click-to-expand for full rendering when the engineer needs to re-read a longer response or review a suggested flow.
F8 — Redesigned ConcludeSessionModal.tsx
On open:
- Call aiSessionsApi.getHandoffDraft(sessionId) (streaming); fields fill in as the stream arrives
- Render: outcome selector (Resolved / Escalated / Parked)
- Render 4 structured editable fields: Root Cause, Resolution, Steps Taken, Recommendations
- Render output destination checkboxes: Post to CW note / Save to KB / Send client summary
- Confirm → call resolve/escalate/pause with enriched request body including structured fields
F9 — Sidebar visual demotion
The existing ChatSidebar stays functionally unchanged but should be visually softened so the cockpit — not the session list — reads as the primary surface. Specific changes:
- Reduce sidebar background contrast (use bg-sidebar or one step darker)
- Reduce sidebar header prominence (smaller label, no bold "Chat History" heading)
- Rename "Chat History" → "Case History" (part of language pass)
- Default sidebar to collapsed state on first cockpit load (existing collapse toggle + localStorage)
F10 — MSP-native language pass
| Old | New |
|---|---|
| "AI Assistant" (page title, meta) | "FlowPilot" |
| "New Chat" | "New Case" |
| "Messages" | "Conversation Log" |
| "Task Lane" (panel label) | "Steps" |
| "Conclude" | "Close Case" |
| "Chat history" (sidebar label) | "Case History" |
| Compose placeholder | "Describe finding, paste log output, or ask FlowPilot..." |
F11 — New API methods (aiSessions.ts)
updateTriage(sessionId: string, fields: Partial<TriageMeta>): Promise<TriageMeta>
getHandoffDraft(sessionId: string): AsyncGenerator<HandoffDraftChunk>
F12 — New types (types/ai-session.ts)
interface TriageMeta {
  client_name: string | null
  asset_name: string | null
  issue_category: string | null
  triage_hypothesis: string | null
  evidence_items: EvidenceItem[]
}

interface EvidenceItem {
  text: string
  status: 'confirmed' | 'ruled_out' | 'pending'
}

interface TriageUpdate extends Partial<TriageMeta> {}

// Extend existing:
interface QuestionItem {
  text: string
  context: string
  options?: string[] | null // new; backend sends null for free-text questions
}
Phased Execution Order
Phase 1 — Backend Foundation
Lock backend schema and API changes first so the cockpit can be built against stable session contracts.
1. Write migration 071: add 5 columns to ai_sessions
2. Run alembic upgrade head, verify columns
3. Update AISession model with new mapped columns
4. Add TriageUpdate schema, extend QuestionItem, extend ChatMessageResponse
5. Extend ResolveSessionRequest / EscalateSessionRequest with structured fields
6. Add PATCH /{id}/triage endpoint
7. Add POST /{id}/handoff-draft streaming endpoint
8. Update GET /ai-sessions/{id} response to include new triage fields
9. Update resolution_output_generator._build_session_context() to use structured fields
10. Run backend tests: pytest --override-ini="addopts="
Phase 2 — Triage Extraction (AI layer)
11. Add [TRIAGE_UPDATE] marker to unified_chat_service.py system prompt
12. Implement _parse_triage_update_marker() in the service (follow existing _parse_questions_marker / _parse_actions_marker pattern)
13. Auto-PATCH session on non-null triage_update (respect manual-edit authority: skip fields in triage_manual_fields)
14. Add options generation instructions to [QUESTIONS] system prompt section
15. GATE: Verify extraction in 5 real sessions. If [TRIAGE_UPDATE] is emitted reliably (≥4/5), proceed. Otherwise switch to Haiku post-response fallback before wiring into the header.
Phase 3 — New Frontend Types + API
16. Add TriageMeta, EvidenceItem, TriageUpdate to types/ai-session.ts
17. Extend QuestionItem type
18. Add updateTriage() and getHandoffDraft() to aiSessions.ts
Phase 4 — New Work Zone Components
19. Extract TaskLane persistence logic into useTaskPersistence hook (sessionStorage drafts, debounced saves, restoration); prerequisite for StepsPanel
20. Build IncidentHeader.tsx with EditPopover
21. Build StepsPanel.tsx (presentation only; receives props from hook)
22. Build FlowPilotAsks.tsx
23. Build WhatWeKnow.tsx
Phase 5 — Page Layout Refactor
24. Refactor AssistantChatPage.tsx: implement stacked cockpit layout
25. Wire triageMeta state, session load population, triage_update merge (with triage_manual_fields guard)
26. Implement drag-resizable split with localStorage persistence
27. Compact ConversationLog rendering (with click-to-expand for long messages)
Phase 6 — Handoff Modal + Language Pass + Sidebar
28. Redesign ConcludeSessionModal.tsx: structured handoff form (calls /handoff-draft for preview, confirms via existing resolve/escalate endpoints which trigger ResolutionOutputGenerator)
29. Sidebar visual demotion: background, label prominence, default-collapsed
30. MSP-native language pass across all assistant components
31. Update <PageMeta> title
Phase 7 — QA + Hardening
32. npx tsc -b: fix any TypeScript errors
33. npm run build: production build clean
34. Functional regression: all chat flows, session switching, conclude/resume
35. Harness feel test: does the page read as an MSP triage cockpit within 3 seconds of first load?
36. Mobile viewport check
37. Stress test: 50+ messages, 10+ steps, long outputs
Risks and Mitigations
| Risk | Mitigation |
|---|---|
| [TRIAGE_UPDATE] marker extraction is unreliable: AI doesn't emit it consistently | Gate Phase 2 on a pass/fail test with 5 real sessions before wiring it to the header. Fall back to the first fallback (post-response Haiku pass) if needed. |
| Header fields feel fabricated — AI guesses wrong client or hypothesis | Show confidence-aware placeholder copy ("FlowPilot is building context…") until a field has real data. Never invent. |
| Task lane visual promotion breaks established chat patterns | Keep all send/respond behavior intact. Change hierarchy only. Verify every task-lane state transition manually. |
| Handoff modal exposes weak underlying summaries | Reuse existing ResolutionOutputGenerator output where possible. Add guardrail copy for empty fields. |
| Mobile loses compose or step access | Test responsive layout as a first-class deliverable in Phase 7, not a final sweep. Enforce scroll isolation between all zones. |
| tsc -b errors after component refactor | Run npx tsc -b after every phase. Trace unused imports/props immediately; don't batch (lesson #92). |
Test Plan
Harness Feel (primary, subjective)
- Does the page read as an MSP triage cockpit within 3 seconds on first load?
- Is the active step obvious without reading chat?
- Do FlowPilot Asks quick-reply buttons work and update the step list?
- Does the incident header update mid-session as AI learns context?
- After dragging the handle and refreshing the page, does the split restore to the saved height?
- Does the conclude modal look like a case handoff or a chat closure?
Functional Regression
- New session (no PSA) — header degrades gracefully
- New session (with CW ticket) — header populates from ticket data
- Send message → triage_update updates header
- Click quick-reply button → answer submitted, step advances
- Add finding to What We Know → persisted via PATCH
- Edit header field via ✏ → saved and survives refresh
- Conclude as Resolved → handoff draft fills modal → post to CW note
- Conclude as Escalated → same
- Pause and resume → triage header restores from saved session fields
- Session switching (currentChatRef guard) — no stale state
- Image paste, forks, suggested flows — all still work
MSP Scenarios (from docx)
- Single-user endpoint issue (basic triage flow, script generation)
- M365 / tenant-wide issue (multi-user context, issue category)
- Network / VPN outage (asset targeting, hypothesis tracking)
- Escalation and resume (session persistence, structured handoff)
Edge Cases
- 50+ messages — layout hierarchy stays intact
- 10+ steps — step panel scrolls, compose remains accessible
- Long issue titles / hypothesis text — header truncates gracefully
- Missing PSA context — placeholder copy, not blank fields
- Narrow mobile viewport — all zones reachable
Backend Checks
# Migration
alembic upgrade head
psql -U postgres -d resolutionflow -c "\d ai_sessions" | grep -E "client_name|asset_name|issue_category|triage_hypothesis|evidence_items"

# Triage PATCH (SESSION_ID is a placeholder for an existing session id)
curl -X PATCH "http://localhost:8000/ai-sessions/$SESSION_ID/triage" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"client_name":"Test Client","triage_hypothesis":"Cache corruption"}'

# Handoff draft stream (-N disables curl output buffering so chunks show as they arrive)
curl -N -X POST "http://localhost:8000/ai-sessions/$SESSION_ID/handoff-draft" \
  -H "Authorization: Bearer $TOKEN"
Assumptions
- Desktop is the primary target; mobile must remain usable but does not drive the layout.
- /assistant remains the chat-session cockpit; /pilot is out of scope.
- New triage fields are additive: they do not replace problem_summary, problem_domain, ticket_data, or conversation_messages.
- issue_category is the operator-facing display field; problem_domain remains the internal classifier. Both coexist.
- evidence_items is editable by both AI and engineer; engineer edits persist through the triage PATCH endpoint.
- PSA context is optional: every triage header field must degrade gracefully when PSA is absent or the session is free-text-only.
- The existing TaskLane.tsx component remains in the codebase; StepsPanel is a new component that consumes the same props with different rendering. No risky renames during this work.
Critical Files
| File | Change |
|---|---|
| backend/alembic/versions/071_add_triage_fields_to_ai_sessions.py | New migration |
| backend/app/models/ai_session.py | Add 5 new mapped columns |
| backend/app/schemas/ai_session.py | TriageUpdate, QuestionItem.options, extended request/response schemas |
| backend/app/api/endpoints/ai_sessions.py | PATCH /triage, POST /handoff-draft |
| backend/app/services/unified_chat_service.py | [TRIAGE_UPDATE] marker extraction, auto-PATCH |
| backend/app/services/resolution_output_generator.py | Structured fields in context builder |
| frontend/src/types/ai-session.ts | TriageMeta, EvidenceItem, TriageUpdate; extend QuestionItem |
| frontend/src/api/aiSessions.ts | updateTriage(), getHandoffDraft() |
| frontend/src/pages/AssistantChatPage.tsx | Full cockpit layout refactor |
| frontend/src/components/assistant/IncidentHeader.tsx | New |
| frontend/src/components/assistant/StepsPanel.tsx | New (from TaskLane logic) |
| frontend/src/components/assistant/FlowPilotAsks.tsx | New |
| frontend/src/components/assistant/WhatWeKnow.tsx | New |
| frontend/src/components/assistant/ConcludeSessionModal.tsx | Redesigned |