Files
resolutionflow/docs/FlowAssist_Migration/phase-9-script-builder-tab.md
Michael Chihlas fcd224429c docs(pilot): revise Phase 9 spec per review findings
Four findings addressed:

- High: drop proposed parent_pilot_session_id column; reuse the
  existing ai_session_id FK on script_builder_sessions. Add an
  origin + ai_session_id coherence invariant.
- High: don't add a 'mode' prop to ScriptBuilderChat (it's
  presentational). Introduce a ScriptBuilderTab controller that owns
  session lifecycle + submit, renders ScriptBuilderChat unchanged.
- Medium: filter list_sessions / count_user_sessions to origin='standalone'
  so pilot_inline scratch sessions don't pollute the /script-builder
  dashboard or count against the 5-session cap.
- Medium: applied_at is stamped only when the engineer commits to a
  run-action (one_off, TemplateMatchPanel Run), not on banner Apply
  click. Corrects a Phase 8 over-eager stamp that would otherwise
  multiply across three surfaces.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 23:28:53 -04:00

355 lines
24 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# FlowPilot Phase 9 — Tabbed Script Builder + NoTemplateDialog relocation
**Date:** 2026-04-23
**Branch target:** `feat/flowpilot-migration` (continuation of Phases 08)
**Depends on:** Phase 8 (ProposalBanner in chat region)
---
## Goal
Close the two remaining open items from the FlowPilot migration handoff:
1. **NoTemplateDialog narrow-lane bug** — today the dialog renders in the task lane (~340px) and its `grid-cols-1 sm:grid-cols-3` layout crushes the three option cards. When the AI proposes a fix with no drafted script, all three cards render disabled, producing a dead end.
2. **Tabbed Script Builder inside the chat** — give the engineer a way to draft the missing script without leaving the session (either by chatting with the AI or hand-writing in a code editor), then feed the draft back into the existing fix lifecycle.
Plus two Phase 8 cleanup items flagged during code review:
3. **`EscalateInterceptDialog` missing the Partial choice** — if a fix is in `applied_partial` when the engineer escalates, the intercept dialog's current three choices (worked / didn't work / never applied) don't match. Add a fourth choice for partial.
4. **`applied_at` semantics correction** — today Phase 8's `handleApplyFix` stamps `applied_at` on every banner Apply click, starting the Verifying timer even when the engineer is only opening a drafting/evaluation surface. Move the stamp to the actual run-action handlers (see §5).
This phase depends on Phase 8's `ProposalBanner` already being in the chat region — it reuses the same "chat-region-owns-Apply-flow" philosophy.
---
## Architectural decisions (settled during brainstorming)
| # | Decision | Rationale |
|---|---|---|
| 1 | When a fix has no `ai_drafted_script`, the banner's Apply button routes **directly** to the Script Builder tab (bypassing `NoTemplateDialog` entirely). | Banner is the single entry point for Apply. `NoTemplateDialog` stays narrowly scoped to evaluating a draft that actually exists. |
| 2 | Inside the Script Builder tab, the default experience is AI-driven — a new `ScriptBuilderTab` controller owns session lifecycle + submit, and *renders* `ScriptBuilderChat` (which stays purely presentational). A "✎ Write it myself" button in the tab's header toolbar swaps the controller's render into a Monaco editor. | AI is the common path. Persistence semantics belong on the controller, not the chat display component (`ScriptBuilderChat` already exposes `onSaveScript` as its seam — the controller wires that callback). |
| 3 | The manual editor uses **Monaco**, reusing the pattern from `frontend/src/components/tree-editor/code-mode/CodeModeEditor.tsx`. | Monaco is already a dependency (`@monaco-editor/react` + `monaco-editor`). No bundle cost, proven pattern. |
| 4 | The Script Builder tab is **always present while the fix is non-terminal** (no close affordance). An **indicator dot** on the tab signals in-progress draft state. | Matches Phase 8's `display: none` philosophy — engineers move freely between chat and draft without tracking a separate open/close state. |
| 5 | `NoTemplateDialog` (draft-exists case) moves from `TaskLane.bottomSlot` to the **chat region** (sibling of `ProposalBanner`, slides up above composer). | Script evaluation is an action surface, not a context surface — belongs with the other action surfaces. Chat region is wide enough for the three cards to actually fit side-by-side. |
| 6 | `EscalateInterceptDialog` gains a **fourth "Partial" choice** that writes `applied_partial` with a notes prompt. | Closes the gap flagged in Phase 8 final review. Minimal incremental cost since the dialog is already getting touched. |
| 7 | `applied_at` is stamped only when the engineer commits to an action that **runs or triggers** a script — not on banner Apply click. Opening a drafting/evaluation surface no longer starts the Verifying timer. | Prevents false "applied" state when the engineer is still authoring. Corrects a Phase 8 over-eager stamp that this phase would otherwise multiply across three surfaces. |
---
## Architecture
### 1. Chat region gets a tab strip
A two-tab strip at the top of the chat region:
```
┌──────────────────────────────────────┐
│ [Chat] [Script Builder ●] │
├──────────────────────────────────────┤
│ │
│ (content per active tab, │
│ via display:none toggling) │
│ │
└──────────────────────────────────────┘
```
- **When the strip renders:** only when an `activeFix` exists AND the fix is non-terminal AND (`fix.ai_drafted_script` is null AND `fix.script_template_id` is null) — i.e., the fix genuinely needs a script drafted. Otherwise the chat region shows without tabs.
- **Tab switching uses `display: none`**, not unmount. Chat scroll position, draft message, and Script Builder state all persist across switches.
- **Indicator dot** on the Script Builder tab fires when there's in-progress draft state: at least one AI message sent in the `ScriptBuilderChat`, or non-empty Monaco buffer. Clears when the draft is submitted.
- **Session switch** clears tab state via the existing `resetSessionDerivedState` helper.
### 2. Script Builder tab content
A new controller component `ScriptBuilderTab` owns the inline lifecycle:
- Creates / resumes a `script_builder_sessions` row with `origin='pilot_inline'` + `ai_session_id = <pilot session id>`.
- Manages AI-chat message state (via the existing script-builder message endpoints) and the Monaco editor buffer.
- On submit, fires `PATCH /ai-sessions/{sid}/suggested-fixes/{fid}/script`.
`ScriptBuilderChat` itself is **unchanged** — it stays a pure display component taking `messages`, `language`, `onViewScript`, `onSaveScript`, `isLoading`. The controller wires `onSaveScript` to its submit path instead of the template-creation path the standalone `/script-builder` page uses.
A header toolbar above the controller's render area hosts the mode toggle:
```
┌──────────────────────────────────────┐
│ Script Builder · Outlook fix │
│ [✎ Write myself]│
├──────────────────────────────────────┤
│ (mode-specific content) │
│ │
└──────────────────────────────────────┘
```
- Clicking **✎ Write myself** flips `scriptBuilderMode` to `'editor'` — the controller renders Monaco in place of `ScriptBuilderChat`, pre-loaded with a scaffold (fix description as a language-appropriate comment header + an empty body).
- A reciprocal **✨ Back to AI** button in editor mode returns to the chat.
- Switching modes **does not discard** work. The Monaco buffer and the script-builder session both persist across toggles. This matters when an engineer drafts with AI, switches to editor to tweak a line, then considers going back.
- Both modes share a single terminal action: the controller's **Submit → `PATCH /ai-sessions/{sid}/suggested-fixes/{fid}/script`**. On success the fix gains `ai_drafted_script`; the tab strip disappears (since the fix no longer needs a script) and the banner's Apply button now routes to `NoTemplateDialog` in the chat region.
- **Submit does NOT stamp `applied_at`.** A draft is not an application — see §5 Apply lifecycle below.
### 3. NoTemplateDialog relocation to chat region
- Removed from `TaskLane.bottomSlot`. Renders in the chat region, slide-up-above-composer (same mechanical placement as `ProposalBanner`).
- The three-card layout (`grid-cols-3` at the chat region's natural width) actually fits — no `grid-cols-1` regression needed.
- Opens when the engineer clicks Apply on the banner AND `fix.ai_drafted_script` is non-empty.
- Existing `handleScriptDecision` logic unchanged; only the render location moves.
### 4. Banner Apply routing (updated)
Three mutually-exclusive outcomes based on the fix's shape:
```
handleApplyFix():
if fix.script_template_id:
open TemplateMatchPanel (unchanged — still renders in task lane for now)
elif fix.ai_drafted_script:
open NoTemplateDialog in chat region (new location, Chat tab)
else:
open Script Builder tab in chat region (new tab)
```
The NoTemplateDialog-in-chat-region path lives on the **Chat tab** (slides up above composer; the tab strip only renders for the no-draft case, so when NoTemplateDialog shows, the tab strip is not on screen). The Script Builder tab path is the opposite — tab strip renders, engineer is on the Script Builder tab.
`TemplateMatchPanel` stays in the task lane for this phase — it's a different surface with different interactions and it's not broken. Moving it is possible future work.
### 5. Apply lifecycle — `applied_at` semantics correction
**Problem.** Today (Phase 8) `handleApplyFix` calls `POST /apply` the moment the banner's Apply button is clicked, stamping `applied_at` regardless of what happens next. This starts the Verifying timer (nudge countdown, Resolve auto-success, Escalate intercept) even if the engineer is only opening a drafting surface and hasn't actually run anything yet. For the no-draft path introduced in this phase, that's clearly wrong — opening the Script Builder tab is the start of *authoring*, not the start of *verifying*.
**Rule.** `applied_at` is stamped **only when the engineer commits to an action that produces or triggers a run**, not when they open a surface:
| Banner Apply click → routes to... | Stamps `applied_at`? |
|---|---|
| `TemplateMatchPanel` (existing flow) | Only when the engineer clicks the "Run" action inside the panel |
| `NoTemplateDialog` (existing, now in chat region) → `one_off` card | Yes — one_off runs the script |
| `NoTemplateDialog``draft_template` card | No — this is template authoring, not fix application |
| `NoTemplateDialog``build_template` card | No — navigates to the builder; no run |
| Script Builder tab → Submit | No — just produces a draft. Engineer then clicks Apply again, gets `NoTemplateDialog`, picks `one_off` to actually run |
**Implementation.**
- Remove `sessionSuggestedFixesApi.applyFix(...)` call from `handleApplyFix`. Move the call to the run-path handlers: `TemplateMatchPanel`'s run button, `handleScriptDecision`'s `one_off` branch only. The `applyFix` endpoint itself (from Phase 8 Issue #2) stays unchanged — only its call sites move.
- Until `applied_at` is stamped, the fix remains in `proposed`. `bannerMode` computation already returns `'proposed'` when `applied_at` is null, so the banner naturally stays on Proposed state through the entire drafting phase.
- **Phase 8 consequence.** This is a semantic revision of Phase 8, not just Phase 9 behavior. Add a test asserting that opening `TemplateMatchPanel` does NOT stamp `applied_at`; only the Run action does.
### 6. EscalateInterceptDialog partial choice
Adds a fourth button to the existing popover:
| Existing choices | New choice |
|---|---|
| The fix didn't work | (existing) |
| It worked — escalating for another reason | (existing) |
| Never actually applied it | (existing) |
| **I applied some of it — partial** | **NEW** |
- When clicked: prompts for partial notes (same pattern as the banner's Partial path — `window.prompt` for now, matching Phase 8's interim), then calls `patchOutcome('applied_partial', notes)`.
- `handleInterceptChoice` gains an `applied_partial` branch. The `InterceptChoice` type already includes `'applied_partial'` via `FixOutcome | 'never_applied'`, so no type changes needed.
- When a fix enters the dialog already in `applied_partial` state, the fourth button is hidden (can't transition partial → partial with different semantics). The "didn't work" button remains available to progress to `applied_failed`.
---
## Data model
### New migration
`script_builder_sessions` **already has** `ai_session_id` (FK → `ai_sessions.id`, nullable, `ON DELETE SET NULL`) with the comment "Link to FlowPilot session if launched from there." The existing column is the link we need — no new FK is added. The migration introduces only the `origin` discriminator:
```sql
ALTER TABLE script_builder_sessions
ADD COLUMN origin VARCHAR(20) NOT NULL DEFAULT 'standalone';
ALTER TABLE script_builder_sessions
ADD CONSTRAINT ck_script_builder_sessions_origin
CHECK (origin IN ('standalone', 'pilot_inline'));
-- Invariant: pilot_inline rows must be linked to a pilot session.
-- Standalone rows may or may not be linked (legacy back-channel).
ALTER TABLE script_builder_sessions
ADD CONSTRAINT ck_script_builder_sessions_origin_ai_session
CHECK (origin <> 'pilot_inline' OR ai_session_id IS NOT NULL);
```
`origin = 'standalone'` → existing `/script-builder` page usage (existing rows backfill to this default). `origin = 'pilot_inline'` → new Script Builder tab; `ai_session_id` is populated at row creation.
`origin` earns its keep as an explicit discriminator for:
- Filtering (`list_sessions` / `count_user_sessions` exclude `pilot_inline` by default — see §Data model filter changes below).
- Future split-quota billing (decided to count as one billable session for now, but tagged for analytics).
### Data-model filter changes — `script_builder_sessions` list + count
Inline sessions would otherwise pollute the standalone `/script-builder` dashboard and count against the per-user 5-session cap enforced by the `POST /script-builder/sessions` endpoint. Required changes:
- `script_builder_service.list_sessions(user_id)` → default scope `origin = 'standalone'`. Callers that genuinely want all rows (e.g., an admin dashboard in a future phase) can pass an explicit `include_inline=True` flag, but no current caller needs it.
- `script_builder_service.count_user_sessions(user_id)` → same scope.
- Both changes covered by tests:
- 5 `pilot_inline` sessions should still leave the engineer free to create 5 standalone sessions (no cap interaction).
- `list_sessions` returns only `standalone` rows.
### New backend endpoint
```
PATCH /api/v1/ai-sessions/{session_id}/suggested-fixes/{fix_id}/script
```
Request:
```json
{
"ai_drafted_script": "string (required, 1..50_000 chars)",
"ai_drafted_parameters": { /* optional JSONB */ }
}
```
Behavior:
- Auth: `require_engineer_or_admin` + `_load_session_or_404`.
- 404 if fix not found on that session.
- 409 if fix is in a terminal status (`applied_success`, `applied_failed`, `dismissed`) — a drafted script can't be attached after the fix is done.
- Sets `fix.ai_drafted_script` + `fix.ai_drafted_parameters`.
- **Does NOT stamp `fix.applied_at`.** A draft is not an application — see §5 above.
- **Bumps `ai_sessions.state_version`** — the fix just transitioned from "needs drafting" to "has draft", which affects Resolve/Escalate preview regeneration.
- Returns `SessionSuggestedFixResponse`.
### ScriptBuilderTab controller (frontend) — no changes to `ScriptBuilderChat`
`ScriptBuilderChat` (`frontend/src/components/script-builder/ScriptBuilderChat.tsx`) is a presentational component taking `messages`, `language`, `onViewScript`, `onSaveScript`, `isLoading`. **It does not need a `mode` prop** — adding persistence semantics to a display component would be wrong.
Instead, introduce a new controller component `frontend/src/components/pilot/ScriptBuilderTab.tsx` that owns the inline lifecycle:
- On mount: locate-or-create a `script_builder_sessions` row via the existing `POST /script-builder/sessions` endpoint, passing a new body field `origin: 'pilot_inline'` and the current pilot session id for `ai_session_id`. (The endpoint gains the `origin` parameter; legacy callers continue defaulting to `'standalone'`.)
- Holds local state for the AI message list, the Monaco buffer, and `scriptBuilderMode`.
- Renders `ScriptBuilderChat` in AI mode with `onSaveScript` wired to the inline submit path (PATCH /script), NOT the standalone template-creation path.
- Renders Monaco (via existing `CodeModeEditor` pattern) in `'editor'` mode with its own Save button that triggers the same submit.
- Emits an `onScriptDrafted` event to `AssistantChatPage` on success so the page can `setActiveFix(updated)`, hide the tab strip, and return the engineer to Chat tab.
The standalone `/script-builder` page retains its current behavior unchanged — it continues to create `script_templates` rows on submit. The split happens cleanly at the controller layer, not inside `ScriptBuilderChat`.
---
## State
### Frontend state (AssistantChatPage)
New local state:
- `chatTab: 'chat' | 'script_builder'` — which tab is visible. Defaults to `'chat'`.
- `scriptBuilderMode: 'ai' | 'editor'` — which sub-view inside the Script Builder tab. Defaults to `'ai'`.
- `scriptBuilderHasProgress: boolean` — drives the indicator dot.
Reset in `resetSessionDerivedState`: all three back to defaults.
Banner's Apply handler (`handleApplyFix`) updated:
- If no script + no template → set `chatTab = 'script_builder'` (and show tab strip).
- If drafted script → open NoTemplateDialog in the chat region (new state or existing `scriptPanelOpen` reused).
- If template → existing TemplateMatchPanel flow (unchanged).
### Tab strip visibility
The tab strip is derived, not state:
```ts
const showTabStrip =
activeFix != null &&
activeFix.status !== 'dismissed' &&
activeFix.status !== 'applied_success' &&
activeFix.status !== 'applied_failed' &&
!activeFix.script_template_id &&
!activeFix.ai_drafted_script
```
When the strip hides (e.g., after script is drafted), `chatTab` resets to `'chat'` to avoid stuck state.
### Tab switching guard
The existing `currentChatRef` pattern (Async-select-load-apply guard) applies: when the engineer switches chats, any in-flight tab-derived state is discarded.
---
## Out of scope
- **NoTemplateDialog grid fix.** Moved to the chat region (wide enough), so the `grid-cols-1 sm:grid-cols-3` layout now works as intended. No grid edit required.
- **`window.prompt` replacement** for partial-notes / failure-reason capture. Still the Phase 8 interim pattern; replacement is deferred to a later design debt pass.
- **TemplateMatchPanel relocation** to the chat region. Different surface, different interactions, not broken today. Possible future work.
- **Dedicated "clear AI outcome proposal" button in the UI.** Already covered by Phase 8 Issue #3 fix (DELETE endpoint + clear-on-outcome-write).
- **Task lane bottom-slot audit.** With NoTemplateDialog removed from the slot, it may be empty on most sessions. Keep the slot API stable; any cleanup is out of scope.
---
## Tests
### Backend
- **Migration:** forward + downgrade reversibility; existing rows backfill to `origin='standalone'`; the `origin='pilot_inline' ⇒ ai_session_id IS NOT NULL` invariant is enforced by the check constraint.
- **PATCH /script endpoint** (new test file `test_fix_script_endpoint.py`):
- happy path — 200, `ai_drafted_script` set, `state_version` bumped, `applied_at` untouched.
- 404 on wrong session.
- 409 on terminal status.
- 400 on empty body.
- **list/count filter changes** (extend `test_script_builder.py` or nearby):
- 5 `pilot_inline` sessions + subsequent `standalone` session creation succeeds (does not hit the 5-cap).
- `list_sessions` returns only `standalone` rows by default.
- **Apply lifecycle correction** (extend `test_fix_outcome_endpoint.py`):
- Banner Apply click that routes to a drafting/evaluation surface does NOT stamp `applied_at`.
- `one_off` decision from `NoTemplateDialog` DOES stamp `applied_at`.
- `TemplateMatchPanel` Run action DOES stamp `applied_at`.
### Frontend
Manual verification (no component test harness in this codebase per CLAUDE.md):
- No-draft fix → Apply click opens Script Builder tab.
- AI path: chat with AI, submit, tab disappears, NoTemplateDialog becomes eligible.
- Manual path: ✎ Write myself → Monaco loads with scaffold → edit → submit → tab disappears.
- Drafted fix → Apply click opens NoTemplateDialog in chat region (three cards side-by-side).
- Tab indicator dot appears on first AI message / non-empty Monaco buffer; clears on submit.
- Session switch with open Script Builder tab → tab/mode state resets.
- EscalateInterceptDialog partial choice → applied_partial written with notes.
### Build discipline
- `tsc -b` clean
- `npm run build` clean
- `docker exec resolutionflow_backend pytest` — all pre-existing suites still pass, no regression from the new endpoint.
---
## Files to touch (rough inventory)
**Backend — new:**
- `backend/alembic/versions/<hash>_script_builder_origin.py`
- `backend/tests/test_fix_script_endpoint.py`
**Backend — modified:**
- `backend/app/models/script_builder_session.py` — add `origin` column only (`ai_session_id` already exists).
- `backend/app/schemas/session_suggested_fix.py` — add `SessionSuggestedFixScriptRequest`.
- `backend/app/schemas/script_builder.py` (or equivalent) — add optional `origin` field to the `POST /script-builder/sessions` request schema; default `'standalone'`.
- `backend/app/api/endpoints/session_suggested_fixes.py` — add PATCH /script endpoint. Move the existing `applied_at` stamp out of the apply path and into `handleScriptDecision('one_off')` only (server side: no change to `/apply`; callers shift instead).
- `backend/app/api/endpoints/script_builder.py` — accept `origin` on session creation; enforce the `pilot_inline ⇒ ai_session_id` invariant at the handler level.
- `backend/app/services/script_builder_service.py` — persist `origin`; `list_sessions` + `count_user_sessions` filter to `origin='standalone'` by default.
- `backend/app/models/session_suggested_fix.py` — unchanged (schema already has `ai_drafted_script`).
**Frontend — new:**
- `frontend/src/components/pilot/ChatTabStrip.tsx` — renders the `[Chat] [Script Builder ●]` strip.
- `frontend/src/components/pilot/ScriptBuilderTab.tsx` — controller that owns session lifecycle, AI message state, Monaco buffer, mode toggle, and submit. Renders `ScriptBuilderChat` in AI mode and Monaco in editor mode.
- `frontend/src/components/pilot/NoTemplateDialogInline.tsx` (or reuse existing `NoTemplateDialog` with a new wrapper for chat-region styling).
**Frontend — modified:**
- `frontend/src/api/sessionSuggestedFixes.ts` — add `patchScript(sessionId, fixId, body, parameters)` method.
- `frontend/src/api/scriptBuilder.ts` (or equivalent) — `createSession` accepts an optional `origin` argument.
- `frontend/src/components/script-builder/ScriptBuilderChat.tsx`**unchanged**. Stays a pure display component.
- `frontend/src/pages/ScriptBuilderPage.tsx`**unchanged on the session-creation path** (defaults to `origin='standalone'`).
- `frontend/src/pages/AssistantChatPage.tsx` — wire tab strip, mount `ScriptBuilderTab`, banner Apply routing (no `applied_at` stamp on click), NoTemplateDialog chat-region render. Move the `sessionSuggestedFixesApi.applyFix(...)` call from `handleApplyFix` to `handleScriptDecision('one_off')` and the TemplateMatchPanel run-path handler.
- `frontend/src/components/pilot/EscalateInterceptDialog.tsx` — add fourth choice.
- `frontend/src/components/pilot/TaskLane.tsx` — remove `bottomSlot` usage of NoTemplateDialog (leave prop API stable).
**Frontend — deleted:**
- None (existing components get refactored, not deleted).
---
## Rollout
- Single branch, merged as part of the in-flight `feat/flowpilot-migration` PR (same as Phase 8).
- No feature flag — the new surface is strictly additive to the banner's Apply flow; old behavior for drafted-script fixes is preserved (just renders in a different location).
---
## Open deferrals (acknowledged, not in this phase)
- `window.prompt` → inline input migration for partial notes / failure reasons.
- Anti-parrot compliance of any new AI system prompt used by `ScriptBuilderChat` for ephemeral mode — verify it's the same safe prompt the standalone mode already uses (no new content to guard).
- Telemetry events for tab opens / AI→editor toggles / script submissions from tab — add in the Phase 9 implementation plan if we want them.