Files

Michael Chihlas fcd224429c docs(pilot): revise Phase 9 spec per review findings

Four findings addressed:

- High: drop proposed parent_pilot_session_id column; reuse the
  existing ai_session_id FK on script_builder_sessions. Add an
  origin + ai_session_id coherence invariant.
- High: don't add a 'mode' prop to ScriptBuilderChat (it's
  presentational). Introduce a ScriptBuilderTab controller that owns
  session lifecycle + submit, renders ScriptBuilderChat unchanged.
- Medium: filter list_sessions / count_user_sessions to origin='standalone'
  so pilot_inline scratch sessions don't pollute the /script-builder
  dashboard or count against the 5-session cap.
- Medium: applied_at is stamped only when the engineer commits to a
  run-action (one_off, TemplateMatchPanel Run), not on banner Apply
  click. Corrects a Phase 8 over-eager stamp that would otherwise
  multiply across three surfaces.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-23 23:28:53 -04:00

24 KiB

Raw Blame History

FlowPilot Phase 9 — Tabbed Script Builder + NoTemplateDialog relocation

Date: 2026-04-23 Branch target: feat/flowpilot-migration (continuation of Phases 0–8) Depends on: Phase 8 (ProposalBanner in chat region)

Goal

Close the two remaining open items from the FlowPilot migration handoff:

NoTemplateDialog narrow-lane bug — today the dialog renders in the task lane (~340px) and its grid-cols-1 sm:grid-cols-3 layout crushes the three option cards. When the AI proposes a fix with no drafted script, all three cards render disabled, producing a dead end.
Tabbed Script Builder inside the chat — give the engineer a way to draft the missing script without leaving the session (either by chatting with the AI or hand-writing in a code editor), then feed the draft back into the existing fix lifecycle.

Plus two Phase 8 cleanup items flagged during code review:

EscalateInterceptDialog missing the Partial choice — if a fix is in applied_partial when the engineer escalates, the intercept dialog's current three choices (worked / didn't work / never applied) don't match. Add a fourth choice for partial.
applied_at semantics correction — today Phase 8's handleApplyFix stamps applied_at on every banner Apply click, starting the Verifying timer even when the engineer is only opening a drafting/evaluation surface. Move the stamp to the actual run-action handlers (see §5).

This phase depends on Phase 8's ProposalBanner already being in the chat region — it reuses the same "chat-region-owns-Apply-flow" philosophy.

Architectural decisions (settled during brainstorming)

#	Decision	Rationale
1	When a fix has no `ai_drafted_script`, the banner's Apply button routes directly to the Script Builder tab (bypassing `NoTemplateDialog` entirely).	Banner is the single entry point for Apply. `NoTemplateDialog` stays narrowly scoped to evaluating a draft that actually exists.
2	Inside the Script Builder tab, the default experience is AI-driven — a new `ScriptBuilderTab` controller owns session lifecycle + submit, and renders `ScriptBuilderChat` (which stays purely presentational). A "✎ Write it myself" button in the tab's header toolbar swaps the controller's render into a Monaco editor.	AI is the common path. Persistence semantics belong on the controller, not the chat display component (`ScriptBuilderChat` already exposes `onSaveScript` as its seam — the controller wires that callback).
3	The manual editor uses Monaco, reusing the pattern from `frontend/src/components/tree-editor/code-mode/CodeModeEditor.tsx`.	Monaco is already a dependency (`@monaco-editor/react` + `monaco-editor`). No bundle cost, proven pattern.
4	The Script Builder tab is always present while the fix is non-terminal (no close affordance). An indicator dot on the tab signals in-progress draft state.	Matches Phase 8's `display: none` philosophy — engineers move freely between chat and draft without tracking a separate open/close state.
5	`NoTemplateDialog` (draft-exists case) moves from `TaskLane.bottomSlot` to the chat region (sibling of `ProposalBanner`, slides up above composer).	Script evaluation is an action surface, not a context surface — belongs with the other action surfaces. Chat region is wide enough for the three cards to actually fit side-by-side.
6	`EscalateInterceptDialog` gains a fourth "Partial" choice that writes `applied_partial` with a notes prompt.	Closes the gap flagged in Phase 8 final review. Minimal incremental cost since the dialog is already getting touched.
7	`applied_at` is stamped only when the engineer commits to an action that runs or triggers a script — not on banner Apply click. Opening a drafting/evaluation surface no longer starts the Verifying timer.	Prevents false "applied" state when the engineer is still authoring. Corrects a Phase 8 over-eager stamp that this phase would otherwise multiply across three surfaces.

Architecture

1. Chat region gets a tab strip

A two-tab strip at the top of the chat region:

┌──────────────────────────────────────┐
│ [Chat]  [Script Builder ●]           │
├──────────────────────────────────────┤
│                                      │
│   (content per active tab,           │
│    via display:none toggling)        │
│                                      │
└──────────────────────────────────────┘

When the strip renders: only when an activeFix exists AND the fix is non-terminal AND (fix.ai_drafted_script is null AND fix.script_template_id is null) — i.e., the fix genuinely needs a script drafted. Otherwise the chat region shows without tabs.
Tab switching uses display: none, not unmount. Chat scroll position, draft message, and Script Builder state all persist across switches.
Indicator dot on the Script Builder tab fires when there's in-progress draft state: at least one AI message sent in the ScriptBuilderChat, or non-empty Monaco buffer. Clears when the draft is submitted.
Session switch clears tab state via the existing resetSessionDerivedState helper.

2. Script Builder tab content

A new controller component ScriptBuilderTab owns the inline lifecycle:

Creates / resumes a script_builder_sessions row with origin='pilot_inline' + ai_session_id = <pilot session id>.
Manages AI-chat message state (via the existing script-builder message endpoints) and the Monaco editor buffer.
On submit, fires PATCH /ai-sessions/{sid}/suggested-fixes/{fid}/script.

ScriptBuilderChat itself is unchanged — it stays a pure display component taking messages, language, onViewScript, onSaveScript, isLoading. The controller wires onSaveScript to its submit path instead of the template-creation path the standalone /script-builder page uses.

A header toolbar above the controller's render area hosts the mode toggle:

┌──────────────────────────────────────┐
│ Script Builder · Outlook fix         │
│                      [✎ Write myself]│
├──────────────────────────────────────┤
│  (mode-specific content)             │
│                                      │
└──────────────────────────────────────┘

Clicking ✎ Write myself flips scriptBuilderMode to 'editor' — the controller renders Monaco in place of ScriptBuilderChat, pre-loaded with a scaffold (fix description as a language-appropriate comment header + an empty body).
A reciprocal ✨ Back to AI button in editor mode returns to the chat.
Switching modes does not discard work. The Monaco buffer and the script-builder session both persist across toggles. This matters when an engineer drafts with AI, switches to editor to tweak a line, then considers going back.
Both modes share a single terminal action: the controller's Submit → PATCH /ai-sessions/{sid}/suggested-fixes/{fid}/script. On success the fix gains ai_drafted_script; the tab strip disappears (since the fix no longer needs a script) and the banner's Apply button now routes to NoTemplateDialog in the chat region.
Submit does NOT stamp applied_at. A draft is not an application — see §5 Apply lifecycle below.

3. NoTemplateDialog relocation to chat region

Removed from TaskLane.bottomSlot. Renders in the chat region, slide-up-above-composer (same mechanical placement as ProposalBanner).
The three-card layout (grid-cols-3 at the chat region's natural width) actually fits — no grid-cols-1 regression needed.
Opens when the engineer clicks Apply on the banner AND fix.ai_drafted_script is non-empty.
Existing handleScriptDecision logic unchanged; only the render location moves.

4. Banner Apply routing (updated)

Three mutually-exclusive outcomes based on the fix's shape:

handleApplyFix():
  if fix.script_template_id:
    open TemplateMatchPanel (unchanged — still renders in task lane for now)
  elif fix.ai_drafted_script:
    open NoTemplateDialog in chat region (new location, Chat tab)
  else:
    open Script Builder tab in chat region (new tab)

The NoTemplateDialog-in-chat-region path lives on the Chat tab (slides up above composer; the tab strip only renders for the no-draft case, so when NoTemplateDialog shows, the tab strip is not on screen). The Script Builder tab path is the opposite — tab strip renders, engineer is on the Script Builder tab.

TemplateMatchPanel stays in the task lane for this phase — it's a different surface with different interactions and it's not broken. Moving it is possible future work.

5. Apply lifecycle — `applied_at` semantics correction

Problem. Today (Phase 8) handleApplyFix calls POST /apply the moment the banner's Apply button is clicked, stamping applied_at regardless of what happens next. This starts the Verifying timer (nudge countdown, Resolve auto-success, Escalate intercept) even if the engineer is only opening a drafting surface and hasn't actually run anything yet. For the no-draft path introduced in this phase, that's clearly wrong — opening the Script Builder tab is the start of authoring, not the start of verifying.

Rule. applied_at is stamped only when the engineer commits to an action that produces or triggers a run, not when they open a surface:

Banner Apply click → routes to...	Stamps `applied_at`?
`TemplateMatchPanel` (existing flow)	Only when the engineer clicks the "Run" action inside the panel
`NoTemplateDialog` (existing, now in chat region) → `one_off` card	Yes — one_off runs the script
`NoTemplateDialog` → `draft_template` card	No — this is template authoring, not fix application
`NoTemplateDialog` → `build_template` card	No — navigates to the builder; no run
Script Builder tab → Submit	No — just produces a draft. Engineer then clicks Apply again, gets `NoTemplateDialog`, picks `one_off` to actually run

Implementation.

Remove sessionSuggestedFixesApi.applyFix(...) call from handleApplyFix. Move the call to the run-path handlers: TemplateMatchPanel's run button, handleScriptDecision's one_off branch only. The applyFix endpoint itself (from Phase 8 Issue #2) stays unchanged — only its call sites move.
Until applied_at is stamped, the fix remains in proposed. bannerMode computation already returns 'proposed' when applied_at is null, so the banner naturally stays on Proposed state through the entire drafting phase.
Phase 8 consequence. This is a semantic revision of Phase 8, not just Phase 9 behavior. Add a test asserting that opening TemplateMatchPanel does NOT stamp applied_at; only the Run action does.

6. EscalateInterceptDialog partial choice

Adds a fourth button to the existing popover:

Existing choices	New choice
The fix didn't work	(existing)
It worked — escalating for another reason	(existing)
Never actually applied it	(existing)
I applied some of it — partial	NEW

When clicked: prompts for partial notes (same pattern as the banner's Partial path — window.prompt for now, matching Phase 8's interim), then calls patchOutcome('applied_partial', notes).
handleInterceptChoice gains an applied_partial branch. The InterceptChoice type already includes 'applied_partial' via FixOutcome | 'never_applied', so no type changes needed.
When a fix enters the dialog already in applied_partial state, the fourth button is hidden (can't transition partial → partial with different semantics). The "didn't work" button remains available to progress to applied_failed.

Data model

New migration

script_builder_sessions already has ai_session_id (FK → ai_sessions.id, nullable, ON DELETE SET NULL) with the comment "Link to FlowPilot session if launched from there." The existing column is the link we need — no new FK is added. The migration introduces only the origin discriminator:

ALTER TABLE script_builder_sessions
  ADD COLUMN origin VARCHAR(20) NOT NULL DEFAULT 'standalone';

ALTER TABLE script_builder_sessions
  ADD CONSTRAINT ck_script_builder_sessions_origin
    CHECK (origin IN ('standalone', 'pilot_inline'));

-- Invariant: pilot_inline rows must be linked to a pilot session.
-- Standalone rows may or may not be linked (legacy back-channel).
ALTER TABLE script_builder_sessions
  ADD CONSTRAINT ck_script_builder_sessions_origin_ai_session
    CHECK (origin <> 'pilot_inline' OR ai_session_id IS NOT NULL);

origin = 'standalone' → existing /script-builder page usage (existing rows backfill to this default). origin = 'pilot_inline' → new Script Builder tab; ai_session_id is populated at row creation.

origin earns its keep as an explicit discriminator for:

Filtering (list_sessions / count_user_sessions exclude pilot_inline by default — see §Data model filter changes below).
Future split-quota billing (decided to count as one billable session for now, but tagged for analytics).

Data-model filter changes — `script_builder_sessions` list + count

Inline sessions would otherwise pollute the standalone /script-builder dashboard and count against the per-user 5-session cap enforced by the POST /script-builder/sessions endpoint. Required changes:

script_builder_service.list_sessions(user_id) → default scope origin = 'standalone'. Callers that genuinely want all rows (e.g., an admin dashboard in a future phase) can pass an explicit include_inline=True flag, but no current caller needs it.
script_builder_service.count_user_sessions(user_id) → same scope.
Both changes covered by tests:
- 5 pilot_inline sessions should still leave the engineer free to create 5 standalone sessions (no cap interaction).
- list_sessions returns only standalone rows.

New backend endpoint

PATCH /api/v1/ai-sessions/{session_id}/suggested-fixes/{fix_id}/script

Request:

{
  "ai_drafted_script": "string (required, 1..50_000 chars)",
  "ai_drafted_parameters": { /* optional JSONB */ }
}

Behavior:

Auth: require_engineer_or_admin + _load_session_or_404.
404 if fix not found on that session.
409 if fix is in a terminal status (applied_success, applied_failed, dismissed) — a drafted script can't be attached after the fix is done.
Sets fix.ai_drafted_script + fix.ai_drafted_parameters.
Does NOT stamp fix.applied_at. A draft is not an application — see §5 above.
Bumps ai_sessions.state_version — the fix just transitioned from "needs drafting" to "has draft", which affects Resolve/Escalate preview regeneration.
Returns SessionSuggestedFixResponse.

ScriptBuilderTab controller (frontend) — no changes to `ScriptBuilderChat`

ScriptBuilderChat (frontend/src/components/script-builder/ScriptBuilderChat.tsx) is a presentational component taking messages, language, onViewScript, onSaveScript, isLoading. It does not need a mode prop — adding persistence semantics to a display component would be wrong.

Instead, introduce a new controller component frontend/src/components/pilot/ScriptBuilderTab.tsx that owns the inline lifecycle:

On mount: locate-or-create a script_builder_sessions row via the existing POST /script-builder/sessions endpoint, passing a new body field origin: 'pilot_inline' and the current pilot session id for ai_session_id. (The endpoint gains the origin parameter; legacy callers continue defaulting to 'standalone'.)
Holds local state for the AI message list, the Monaco buffer, and scriptBuilderMode.
Renders ScriptBuilderChat in AI mode with onSaveScript wired to the inline submit path (PATCH /script), NOT the standalone template-creation path.
Renders Monaco (via existing CodeModeEditor pattern) in 'editor' mode with its own Save button that triggers the same submit.
Emits an onScriptDrafted event to AssistantChatPage on success so the page can setActiveFix(updated), hide the tab strip, and return the engineer to Chat tab.

The standalone /script-builder page retains its current behavior unchanged — it continues to create script_templates rows on submit. The split happens cleanly at the controller layer, not inside ScriptBuilderChat.

State

Frontend state (AssistantChatPage)

New local state:

chatTab: 'chat' | 'script_builder' — which tab is visible. Defaults to 'chat'.
scriptBuilderMode: 'ai' | 'editor' — which sub-view inside the Script Builder tab. Defaults to 'ai'.
scriptBuilderHasProgress: boolean — drives the indicator dot.

Reset in resetSessionDerivedState: all three back to defaults.

Banner's Apply handler (handleApplyFix) updated:

If no script + no template → set chatTab = 'script_builder' (and show tab strip).
If drafted script → open NoTemplateDialog in the chat region (new state or existing scriptPanelOpen reused).
If template → existing TemplateMatchPanel flow (unchanged).

Tab strip visibility

The tab strip is derived, not state:

const showTabStrip =
  activeFix != null &&
  activeFix.status !== 'dismissed' &&
  activeFix.status !== 'applied_success' &&
  activeFix.status !== 'applied_failed' &&
  !activeFix.script_template_id &&
  !activeFix.ai_drafted_script

When the strip hides (e.g., after script is drafted), chatTab resets to 'chat' to avoid stuck state.

Tab switching guard

The existing currentChatRef pattern (Async-select-load-apply guard) applies: when the engineer switches chats, any in-flight tab-derived state is discarded.

Out of scope

NoTemplateDialog grid fix. Moved to the chat region (wide enough), so the grid-cols-1 sm:grid-cols-3 layout now works as intended. No grid edit required.
window.prompt replacement for partial-notes / failure-reason capture. Still the Phase 8 interim pattern; replacement is deferred to a later design debt pass.
TemplateMatchPanel relocation to the chat region. Different surface, different interactions, not broken today. Possible future work.
Dedicated "clear AI outcome proposal" button in the UI. Already covered by Phase 8 Issue #3 fix (DELETE endpoint + clear-on-outcome-write).
Task lane bottom-slot audit. With NoTemplateDialog removed from the slot, it may be empty on most sessions. Keep the slot API stable; any cleanup is out of scope.

Tests

Backend

Migration: forward + downgrade reversibility; existing rows backfill to origin='standalone'; the origin='pilot_inline' ⇒ ai_session_id IS NOT NULL invariant is enforced by the check constraint.
PATCH /script endpoint (new test file test_fix_script_endpoint.py):
- happy path — 200, ai_drafted_script set, state_version bumped, applied_at untouched.
- 404 on wrong session.
- 409 on terminal status.
- 400 on empty body.
list/count filter changes (extend test_script_builder.py or nearby):
- 5 pilot_inline sessions + subsequent standalone session creation succeeds (does not hit the 5-cap).
- list_sessions returns only standalone rows by default.
Apply lifecycle correction (extend test_fix_outcome_endpoint.py):
- Banner Apply click that routes to a drafting/evaluation surface does NOT stamp applied_at.
- one_off decision from NoTemplateDialog DOES stamp applied_at.
- TemplateMatchPanel Run action DOES stamp applied_at.

Frontend

Manual verification (no component test harness in this codebase per CLAUDE.md):

No-draft fix → Apply click opens Script Builder tab.
AI path: chat with AI, submit, tab disappears, NoTemplateDialog becomes eligible.
Manual path: ✎ Write myself → Monaco loads with scaffold → edit → submit → tab disappears.
Drafted fix → Apply click opens NoTemplateDialog in chat region (three cards side-by-side).
Tab indicator dot appears on first AI message / non-empty Monaco buffer; clears on submit.
Session switch with open Script Builder tab → tab/mode state resets.
EscalateInterceptDialog partial choice → applied_partial written with notes.

Build discipline

tsc -b clean
npm run build clean
docker exec resolutionflow_backend pytest — all pre-existing suites still pass, no regression from the new endpoint.

Files to touch (rough inventory)

Backend — new:

backend/alembic/versions/<hash>_script_builder_origin.py
backend/tests/test_fix_script_endpoint.py

Backend — modified:

backend/app/models/script_builder_session.py — add origin column only (ai_session_id already exists).
backend/app/schemas/session_suggested_fix.py — add SessionSuggestedFixScriptRequest.
backend/app/schemas/script_builder.py (or equivalent) — add optional origin field to the POST /script-builder/sessions request schema; default 'standalone'.
backend/app/api/endpoints/session_suggested_fixes.py — add PATCH /script endpoint. Move the existing applied_at stamp out of the apply path and into handleScriptDecision('one_off') only (server side: no change to /apply; callers shift instead).
backend/app/api/endpoints/script_builder.py — accept origin on session creation; enforce the pilot_inline ⇒ ai_session_id invariant at the handler level.
backend/app/services/script_builder_service.py — persist origin; list_sessions + count_user_sessions filter to origin='standalone' by default.
backend/app/models/session_suggested_fix.py — unchanged (schema already has ai_drafted_script).

Frontend — new:

frontend/src/components/pilot/ChatTabStrip.tsx — renders the [Chat] [Script Builder ●] strip.
frontend/src/components/pilot/ScriptBuilderTab.tsx — controller that owns session lifecycle, AI message state, Monaco buffer, mode toggle, and submit. Renders ScriptBuilderChat in AI mode and Monaco in editor mode.
frontend/src/components/pilot/NoTemplateDialogInline.tsx (or reuse existing NoTemplateDialog with a new wrapper for chat-region styling).

Frontend — modified:

frontend/src/api/sessionSuggestedFixes.ts — add patchScript(sessionId, fixId, body, parameters) method.
frontend/src/api/scriptBuilder.ts (or equivalent) — createSession accepts an optional origin argument.
frontend/src/components/script-builder/ScriptBuilderChat.tsx — unchanged. Stays a pure display component.
frontend/src/pages/ScriptBuilderPage.tsx — unchanged on the session-creation path (defaults to origin='standalone').
frontend/src/pages/AssistantChatPage.tsx — wire tab strip, mount ScriptBuilderTab, banner Apply routing (no applied_at stamp on click), NoTemplateDialog chat-region render. Move the sessionSuggestedFixesApi.applyFix(...) call from handleApplyFix to handleScriptDecision('one_off') and the TemplateMatchPanel run-path handler.
frontend/src/components/pilot/EscalateInterceptDialog.tsx — add fourth choice.
frontend/src/components/pilot/TaskLane.tsx — remove bottomSlot usage of NoTemplateDialog (leave prop API stable).

Frontend — deleted:

None (existing components get refactored, not deleted).

Rollout

Single branch, merged as part of the in-flight feat/flowpilot-migration PR (same as Phase 8).
No feature flag — the new surface is strictly additive to the banner's Apply flow; old behavior for drafted-script fixes is preserved (just renders in a different location).

Open deferrals (acknowledged, not in this phase)

window.prompt → inline input migration for partial notes / failure reasons.
Anti-parrot compliance of any new AI system prompt used by ScriptBuilderChat for ephemeral mode — verify it's the same safe prompt the standalone mode already uses (no new content to guard).
Telemetry events for tab opens / AI→editor toggles / script submissions from tab — add in the Phase 9 implementation plan if we want them.

24 KiB Raw Blame History Unescape Escape