docs: resolve all contract decisions from codex readiness review
Addresses every Red and Yellow item from the codex review: - Canonical handoff: ResolutionOutputGenerator is the source of truth - AI vs manual authority: manual edits win, AI never overwrites - evidence_items: full-list replacement, frontend is merge authority - TaskLane persistence: lifted into hook, StepsPanel is presentation-only - Quick replies: immediate-send, full-stack contract change - issue_category + asset_name: free text in v1 - Adds 5 implementation guardrails and Phase 2 gate for triage extraction - Execution order updated to 37 steps with persistence extraction step Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
@@ -79,6 +79,108 @@ The brainstorming session (2026-04-01) locked these decisions. They are not open
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
## Contract Decisions (Codex Readiness Review)
|
||||||
|
|
||||||
|
The following decisions were flagged as ambiguous by the Codex readiness review. Each is now resolved.
|
||||||
|
|
||||||
|
### RED — Canonical handoff artifact
|
||||||
|
|
||||||
|
**Decision:** `ResolutionOutputGenerator` is the single canonical generator. Everything else is transport or UI.
|
||||||
|
|
||||||
|
- `POST /handoff-draft` is a **preview** endpoint — streams a draft for the conclude modal UI. Does not persist. Does not generate final artifacts.
|
||||||
|
- On confirm (resolve/escalate), the page calls the existing resolve/escalate endpoints, which trigger `ResolutionOutputGenerator.generate_all()` as today. The structured fields from the modal (`root_cause`, `steps_taken`, `recommendations`) are passed into `_build_session_context()` to enrich the final outputs.
|
||||||
|
- `/documentation/stream` and `/status-update` remain untouched — they are separate transport channels for the same canonical outputs.
|
||||||
|
- `/handoff-draft` is **assistant-only** in v1 (not shared with guided FlowPilot sessions on `/pilot`).
|
||||||
|
|
||||||
|
### RED — AI vs manual field authority
|
||||||
|
|
||||||
|
**Decision:** Manual edits win. AI does not overwrite manual edits.
|
||||||
|
|
||||||
|
| Rule | Behavior |
|
||||||
|
|------|----------|
|
||||||
|
| AI auto-fill | Only fills fields that are currently `null` or empty. Never overwrites a non-null value. |
|
||||||
|
| Manual edit | Persists immediately via `PATCH /triage`. Sets the field as "manually set." |
|
||||||
|
| AI after manual edit | AI may **suggest** an update (shown as a subtle inline prompt: "FlowPilot suggests: Contoso Corp → Contoso Ltd"), but does not auto-write. |
|
||||||
|
| Evidence items — AI | Appends new items only. Does not modify or remove existing items. |
|
||||||
|
| Evidence items — engineer | Full authority: add, edit status, edit text, remove. |
|
||||||
|
|
||||||
|
Implementation: add a `triage_manual_fields` set (stored in frontend `localStorage` per session) tracking which fields the engineer has manually edited. AI `triage_update` skips those fields unless the engineer explicitly accepts the suggestion.
|
||||||
|
|
||||||
|
### RED — `evidence_items` write model
|
||||||
|
|
||||||
|
**Decision:** Full-list replacement for all writes. Keep it simple.
|
||||||
|
|
||||||
|
- `PATCH /triage` sends the complete `evidence_items` array. Backend replaces the stored array.
|
||||||
|
- AI appends: frontend receives `triage_update.evidence_items`, appends to the current local list, then PATCHes the full merged list.
|
||||||
|
- Engineer edits: frontend modifies the local list, PATCHes the full list.
|
||||||
|
- No partial-update or append-only semantics on the backend. The frontend is the merge authority.
|
||||||
|
|
||||||
|
### YELLOW — TaskLane persistence in StepsPanel
|
||||||
|
|
||||||
|
**Decision:** `StepsPanel` is presentation only. All persistence behavior stays in `AssistantChatPage`.
|
||||||
|
|
||||||
|
`TaskLane` currently owns sessionStorage drafts, debounced backend saves, and restoration. In the cockpit refactor:
|
||||||
|
- `AssistantChatPage` lifts all persistence logic out of `TaskLane` into the page (or a custom hook like `useTaskPersistence`)
|
||||||
|
- `StepsPanel` receives `activeActions` as a prop and renders them — no persistence responsibility
|
||||||
|
- `TaskLane.tsx` remains in the codebase untouched (other pages may still use it)
|
||||||
|
|
||||||
|
### YELLOW — Quick-reply submission semantics
|
||||||
|
|
||||||
|
**Decision:** Quick replies are **immediate-send** controls.
|
||||||
|
|
||||||
|
- Clicking a quick-reply button calls `handleSend(option)` — the answer goes directly to the AI as a chat message
|
||||||
|
- No local-only "select then send" workflow
|
||||||
|
- The answer appears in the conversation log as a regular `you:` message
|
||||||
|
- This is a full-stack change: prompt instructions must tell the AI to include `options` on constrained questions, parser must extract them, schema must carry them, frontend must render and submit them
|
||||||
|
|
||||||
|
### YELLOW — `issue_category` format
|
||||||
|
|
||||||
|
**Decision:** Free text in v1. No controlled taxonomy.
|
||||||
|
|
||||||
|
- AI infers a human-readable category string (e.g., "DNS / Networking", "Microsoft 365", "Active Directory")
|
||||||
|
- Engineer can edit to any value via the header `✏` popover
|
||||||
|
- Future: may introduce a taxonomy dropdown populated from session history — but not in v1
|
||||||
|
|
||||||
|
### YELLOW — `asset_name` when user and device differ
|
||||||
|
|
||||||
|
**Decision:** Free text. The engineer enters whatever is most operationally relevant.
|
||||||
|
|
||||||
|
- Could be a device name ("jsmith-desktop-04"), a user ("John Smith"), or both ("jsmith-desktop-04 / John Smith")
|
||||||
|
- AI infers from conversation context — typically the entity being troubleshot
|
||||||
|
- No enforced format in v1
|
||||||
|
|
||||||
|
### YELLOW — Structured conclude fields persistence
|
||||||
|
|
||||||
|
**Decision:** Structured conclude fields (`root_cause`, `steps_taken`, `recommendations`) are **passed through to `ResolutionOutputGenerator`** but are NOT stored as separate session columns.
|
||||||
|
|
||||||
|
- They arrive in the resolve/escalate request body
|
||||||
|
- `_build_session_context()` uses them to generate richer PSA notes and client summaries
|
||||||
|
- The generated outputs (stored in `session_resolution_outputs`) are the persisted artifacts
|
||||||
|
- If we later need the raw structured fields, add columns then — not speculatively now
|
||||||
|
|
||||||
|
### Fallback — `[TRIAGE_UPDATE]` unreliability
|
||||||
|
|
||||||
|
**Decision:** If prompt-embedded extraction proves unreliable after testing against 5 real sessions:
|
||||||
|
|
||||||
|
1. **First fallback:** Post-response extraction using `claude-haiku-4-5` with last 3 messages as context. Cheap, fast, decoupled from the main prompt.
|
||||||
|
2. **Second fallback:** Fully manual header — engineer fills in fields, AI never auto-updates. Cockpit still works; it just requires more manual input.
|
||||||
|
|
||||||
|
Gate: Phase 2 step 15 ("verify extraction in a live session") must pass before wiring `triage_update` into the visible header.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Implementation Guardrails
|
||||||
|
|
||||||
|
These are hard rules during implementation, not suggestions.
|
||||||
|
|
||||||
|
1. **Do not let AI write speculative values into the header.** Every AI-inferred field must trace to ticket data or explicit conversation evidence. If the AI can't ground it, the field stays empty.
|
||||||
|
2. **Do not redesign conclude UX until the canonical handoff source-of-truth is wired.** Phase 6 (conclude modal) depends on Phase 1 (backend) being stable.
|
||||||
|
3. **Do not treat `TaskLane` as presentation-only until its persistence behavior has been lifted.** Extract persistence into a hook or the page before building `StepsPanel`.
|
||||||
|
4. **Do not wire header auto-updates from `[TRIAGE_UPDATE]` until real-session reliability is tested.** Phase 2 step 15 is a gate.
|
||||||
|
5. **Run `npx tsc -b` after every phase.** Do not batch TypeScript error fixes (lesson #92).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
## Non-Goals
|
## Non-Goals
|
||||||
|
|
||||||
- No redesign of `/pilot` (FlowPilot session page) — separate page, untouched
|
- No redesign of `/pilot` (FlowPilot session page) — separate page, untouched
|
||||||
@@ -369,10 +471,10 @@ interface QuestionItem {
|
|||||||
|
|
||||||
### Phase 2 — Triage Extraction (AI layer)
|
### Phase 2 — Triage Extraction (AI layer)
|
||||||
11. Add `[TRIAGE_UPDATE]` marker to `unified_chat_service.py` system prompt
|
11. Add `[TRIAGE_UPDATE]` marker to `unified_chat_service.py` system prompt
|
||||||
12. Implement `_parse_triage_update_marker()` in the service
|
12. Implement `_parse_triage_update_marker()` in the service (follow existing `_parse_questions_marker` / `_parse_actions_marker` pattern)
|
||||||
13. Auto-PATCH session on non-null `triage_update`
|
13. Auto-PATCH session on non-null `triage_update` (respect manual-edit authority: skip fields in `triage_manual_fields`)
|
||||||
14. Add `options` generation instructions to `[QUESTIONS]` system prompt section
|
14. Add `options` generation instructions to `[QUESTIONS]` system prompt section
|
||||||
15. Verify extraction in a live session
|
15. **GATE:** Verify extraction in 5 real sessions. If `[TRIAGE_UPDATE]` is emitted reliably (≥4/5), proceed. Otherwise switch to Haiku post-response fallback before wiring into the header.
|
||||||
|
|
||||||
### Phase 3 — New Frontend Types + API
|
### Phase 3 — New Frontend Types + API
|
||||||
16. Add `TriageMeta`, `EvidenceItem`, `TriageUpdate` to `types/ai-session.ts`
|
16. Add `TriageMeta`, `EvidenceItem`, `TriageUpdate` to `types/ai-session.ts`
|
||||||
@@ -380,30 +482,31 @@ interface QuestionItem {
|
|||||||
18. Add `updateTriage()` and `getHandoffDraft()` to `aiSessions.ts`
|
18. Add `updateTriage()` and `getHandoffDraft()` to `aiSessions.ts`
|
||||||
|
|
||||||
### Phase 4 — New Work Zone Components
|
### Phase 4 — New Work Zone Components
|
||||||
19. Build `IncidentHeader.tsx` with `EditPopover`
|
19. Extract `TaskLane` persistence logic into `useTaskPersistence` hook (sessionStorage drafts, debounced saves, restoration) — prerequisite for StepsPanel
|
||||||
20. Build `StepsPanel.tsx`
|
20. Build `IncidentHeader.tsx` with `EditPopover`
|
||||||
21. Build `FlowPilotAsks.tsx`
|
21. Build `StepsPanel.tsx` (presentation only — receives props from hook)
|
||||||
22. Build `WhatWeKnow.tsx`
|
22. Build `FlowPilotAsks.tsx`
|
||||||
|
23. Build `WhatWeKnow.tsx`
|
||||||
|
|
||||||
### Phase 5 — Page Layout Refactor
|
### Phase 5 — Page Layout Refactor
|
||||||
23. Refactor `AssistantChatPage.tsx` — implement stacked cockpit layout
|
24. Refactor `AssistantChatPage.tsx` — implement stacked cockpit layout
|
||||||
24. Wire `triageMeta` state, session load population, `triage_update` merge
|
25. Wire `triageMeta` state, session load population, `triage_update` merge (with `triage_manual_fields` guard)
|
||||||
25. Implement drag-resizable split with `localStorage` persistence
|
26. Implement drag-resizable split with `localStorage` persistence
|
||||||
26. Compact `ConversationLog` rendering
|
27. Compact `ConversationLog` rendering (with click-to-expand for long messages)
|
||||||
|
|
||||||
### Phase 6 — Handoff Modal + Language Pass + Sidebar
|
### Phase 6 — Handoff Modal + Language Pass + Sidebar
|
||||||
27. Redesign `ConcludeSessionModal.tsx` — structured handoff form
|
28. Redesign `ConcludeSessionModal.tsx` — structured handoff form (calls `/handoff-draft` for preview, confirms via existing resolve/escalate endpoints which trigger `ResolutionOutputGenerator`)
|
||||||
28. Sidebar visual demotion — background, label prominence, default-collapsed
|
29. Sidebar visual demotion — background, label prominence, default-collapsed
|
||||||
29. MSP-native language pass across all assistant components
|
30. MSP-native language pass across all assistant components
|
||||||
30. Update `<PageMeta>` title
|
31. Update `<PageMeta>` title
|
||||||
|
|
||||||
### Phase 7 — QA + Hardening
|
### Phase 7 — QA + Hardening
|
||||||
31. `npx tsc -b` — fix any TypeScript errors
|
32. `npx tsc -b` — fix any TypeScript errors
|
||||||
32. `npm run build` — production build clean
|
33. `npm run build` — production build clean
|
||||||
33. Functional regression: all chat flows, session switching, conclude/resume
|
34. Functional regression: all chat flows, session switching, conclude/resume
|
||||||
34. Harness feel test: cockpit within 3 seconds?
|
35. Harness feel test: cockpit within 3 seconds?
|
||||||
35. Mobile viewport check
|
36. Mobile viewport check
|
||||||
36. Stress test: 50+ messages, 10+ steps, long outputs
|
37. Stress test: 50+ messages, 10+ steps, long outputs
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
|||||||
Reference in New Issue
Block a user