From 75b59123e6623ddc7e58207274dd5868f31a1868 Mon Sep 17 00:00:00 2001 From: Michael Chihlas Date: Thu, 23 Apr 2026 23:34:53 -0400 Subject: [PATCH] =?UTF-8?q?docs(pilot):=20Phase=209=20spec=20=E2=80=94=20f?= =?UTF-8?q?ix=20Apply=20semantics=20+=20session=20idempotency?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Four review findings addressed: - High: draft_template 'Run now, templatize after' DOES run the script; applied_at table now stamps for both one_off and draft_template. Only build_template (no run) skips the stamp. - Medium: TemplateMatchPanel needs an explicit '✓ I ran this' button. Generate/Copy don't commit to running. The new button is the stamp moment for template-match fixes. - Medium: get-or-create for inline script_builder_sessions — POST /script-builder/sessions is now idempotent for origin='pilot_inline' (returns the existing row for a (user, ai_session_id) pair). Backed by a partial unique index: UNIQUE (user_id, ai_session_id) WHERE origin = 'pilot_inline' so remount doesn't create duplicates and draft continuity is preserved. - Medium: authorization — the create endpoint validates that any provided ai_session_id is owned by the current user (same guard other pilot endpoints use). Prevents cross-user attachment of scratch sessions to arbitrary pilot sessions. Co-Authored-By: Claude Opus 4.7 (1M context) --- .../phase-9-script-builder-tab.md | 49 +++++++++++++++---- 1 file changed, 39 insertions(+), 10 deletions(-) diff --git a/docs/FlowAssist_Migration/phase-9-script-builder-tab.md b/docs/FlowAssist_Migration/phase-9-script-builder-tab.md index b09e24a6..afdabe75 100644 --- a/docs/FlowAssist_Migration/phase-9-script-builder-tab.md +++ b/docs/FlowAssist_Migration/phase-9-script-builder-tab.md @@ -118,16 +118,18 @@ The NoTemplateDialog-in-chat-region path lives on the **Chat tab** (slides up ab | Banner Apply click → routes to... | Stamps `applied_at`? | |---|---| -| `TemplateMatchPanel` (existing flow) | Only when the engineer clicks the "Run" action inside the panel | -| `NoTemplateDialog` (existing, now in chat region) → `one_off` card | Yes — one_off runs the script | -| `NoTemplateDialog` → `draft_template` card | No — this is template authoring, not fix application | -| `NoTemplateDialog` → `build_template` card | No — navigates to the builder; no run | -| Script Builder tab → Submit | No — just produces a draft. Engineer then clicks Apply again, gets `NoTemplateDialog`, picks `one_off` to actually run | +| `TemplateMatchPanel` (existing flow) | Only when the engineer clicks a new explicit **"✓ I ran this"** action inside the panel (see below) | +| `NoTemplateDialog` → `one_off` card ("Run now, no template") | **Yes** — the card click declares "I'm running this now" | +| `NoTemplateDialog` → `draft_template` card ("Run now, templatize after") | **Yes** — same declaration, the template proposal is a side effect | +| `NoTemplateDialog` → `build_template` card ("Just open the builder") | No — no run is declared; the engineer is going off to author a proper template | +| Script Builder tab → Submit | No — just produces a draft. Engineer then clicks Apply again, gets `NoTemplateDialog`, picks `one_off` or `draft_template` to declare the run | + +**New explicit "I ran this" action in `TemplateMatchPanel`.** Today the panel has Generate, Copy, and Edit Parameters — none of which commit to running. Copying doesn't imply running; the engineer can walk away. This phase adds a distinct primary button (accent-colored, below Copy) labeled **"✓ I ran this"** or **"Mark as applied"**. Click → calls `applyFix` → fix transitions to Verifying. Until clicked, the fix stays in `proposed`. **Implementation.** -- Remove `sessionSuggestedFixesApi.applyFix(...)` call from `handleApplyFix`. Move the call to the run-path handlers: `TemplateMatchPanel`'s run button, `handleScriptDecision`'s `one_off` branch only. The `applyFix` endpoint itself (from Phase 8 Issue #2) stays unchanged — only its call sites move. +- Remove `sessionSuggestedFixesApi.applyFix(...)` call from `handleApplyFix`. Move it to the three run-declaring call sites: `NoTemplateDialog`'s `handleScriptDecision('one_off' | 'draft_template')` paths AND the new `TemplateMatchPanel` "I ran this" button. The `applyFix` endpoint itself (from Phase 8 Issue #2) stays unchanged — only its call sites move. - Until `applied_at` is stamped, the fix remains in `proposed`. `bannerMode` computation already returns `'proposed'` when `applied_at` is null, so the banner naturally stays on Proposed state through the entire drafting phase. -- **Phase 8 consequence.** This is a semantic revision of Phase 8, not just Phase 9 behavior. Add a test asserting that opening `TemplateMatchPanel` does NOT stamp `applied_at`; only the Run action does. +- **Phase 8 consequence.** This is a semantic revision of Phase 8, not just Phase 9 behavior. Tests must assert: opening `TemplateMatchPanel` does NOT stamp `applied_at`; clicking "I ran this" DOES; `NoTemplateDialog` `one_off` AND `draft_template` both DO; `build_template` does NOT. ### 6. EscalateInterceptDialog partial choice @@ -150,7 +152,7 @@ Adds a fourth button to the existing popover: ### New migration -`script_builder_sessions` **already has** `ai_session_id` (FK → `ai_sessions.id`, nullable, `ON DELETE SET NULL`) with the comment "Link to FlowPilot session if launched from there." The existing column is the link we need — no new FK is added. The migration introduces only the `origin` discriminator: +`script_builder_sessions` **already has** `ai_session_id` (FK → `ai_sessions.id`, nullable, `ON DELETE SET NULL`) with the comment "Link to FlowPilot session if launched from there." The existing column is the link we need — no new FK is added. The migration introduces only the `origin` discriminator plus a uniqueness guard for inline sessions: ```sql ALTER TABLE script_builder_sessions @@ -165,6 +167,14 @@ ALTER TABLE script_builder_sessions ALTER TABLE script_builder_sessions ADD CONSTRAINT ck_script_builder_sessions_origin_ai_session CHECK (origin <> 'pilot_inline' OR ai_session_id IS NOT NULL); + +-- Uniqueness: at most one pilot_inline session per (user, pilot session). +-- Required to back the get-or-create semantics on the endpoint and prevent +-- duplicate scratch rows on remount. Partial index scoped to pilot_inline +-- so standalone rows are unaffected. +CREATE UNIQUE INDEX ux_script_builder_sessions_pilot_inline + ON script_builder_sessions (user_id, ai_session_id) + WHERE origin = 'pilot_inline'; ``` `origin = 'standalone'` → existing `/script-builder` page usage (existing rows backfill to this default). `origin = 'pilot_inline'` → new Script Builder tab; `ai_session_id` is populated at row creation. @@ -212,7 +222,7 @@ Behavior: Instead, introduce a new controller component `frontend/src/components/pilot/ScriptBuilderTab.tsx` that owns the inline lifecycle: -- On mount: locate-or-create a `script_builder_sessions` row via the existing `POST /script-builder/sessions` endpoint, passing a new body field `origin: 'pilot_inline'` and the current pilot session id for `ai_session_id`. (The endpoint gains the `origin` parameter; legacy callers continue defaulting to `'standalone'`.) +- On mount: **get-or-create** the single inline `script_builder_sessions` row for `(current user, current pilot session)` via the existing `POST /script-builder/sessions` endpoint, passing `origin: 'pilot_inline'` and the current pilot session id for `ai_session_id`. The endpoint becomes idempotent for `origin='pilot_inline'` — if a row exists for that `(user_id, ai_session_id)` pair, it's returned; otherwise created. The partial unique index on the DB backs the invariant independent of endpoint code. Remounting (tab hide/show, page refresh) resumes the same session — no duplicates, no lost draft continuity. - Holds local state for the AI message list, the Monaco buffer, and `scriptBuilderMode`. - Renders `ScriptBuilderChat` in AI mode with `onSaveScript` wired to the inline submit path (PATCH /script), NOT the standalone template-creation path. - Renders Monaco (via existing `CodeModeEditor` pattern) in `'editor'` mode with its own Save button that triggers the same submit. @@ -220,6 +230,18 @@ Instead, introduce a new controller component `frontend/src/components/pilot/Scr The standalone `/script-builder` page retains its current behavior unchanged — it continues to create `script_templates` rows on submit. The split happens cleanly at the controller layer, not inside `ScriptBuilderChat`. +### `POST /script-builder/sessions` — changes for inline origin + +The existing endpoint is extended in three ways: + +1. **Accepts `origin`** in the request body (`Literal['standalone', 'pilot_inline']`, default `'standalone'`). Legacy callers unchanged. +2. **Authorization on `ai_session_id`.** When `origin='pilot_inline'` is passed AND `ai_session_id` is provided, the handler MUST verify the referenced `ai_sessions` row is owned by the current user (or within their account — whichever guard `_load_session_or_404(db, ai_session_id, current_user)` already enforces for the pilot endpoints). Without this check, a caller could attach an inline scratch session to an arbitrary pilot session. The check fires before any row lookup or creation. +3. **Idempotent for `origin='pilot_inline'`.** If a row with `(user_id = current, ai_session_id = provided, origin = 'pilot_inline')` already exists, the handler returns that row (200) instead of creating a new one (201). The unique partial index enforces at-most-one at the DB layer; a race between two concurrent POSTs surfaces as an integrity error that the handler catches and re-reads. + +For `origin='standalone'`, behavior is unchanged — always creates, still subject to the 5-session cap. + +The 5-session cap applies only to `standalone` rows (see §Data-model filter changes). Inline sessions are out of that accounting entirely. + --- ## State @@ -285,7 +307,14 @@ The existing `currentChatRef` pattern (Async-select-load-apply guard) applies: w - **Apply lifecycle correction** (extend `test_fix_outcome_endpoint.py`): - Banner Apply click that routes to a drafting/evaluation surface does NOT stamp `applied_at`. - `one_off` decision from `NoTemplateDialog` DOES stamp `applied_at`. - - `TemplateMatchPanel` Run action DOES stamp `applied_at`. + - `draft_template` decision from `NoTemplateDialog` DOES stamp `applied_at` (it still runs the script). + - `build_template` decision from `NoTemplateDialog` does NOT stamp (no run). + - `TemplateMatchPanel` "I ran this" action DOES stamp `applied_at`; Generate / Copy alone do NOT. +- **Script Builder session create — inline semantics** (extend `test_script_builder.py` or equivalent): + - First `POST /script-builder/sessions` with `origin='pilot_inline', ai_session_id=X` creates and returns a row. + - Second `POST` with the same `(ai_session_id, user)` returns the SAME row (no duplicate created); DB row count confirms. + - `POST` with `origin='pilot_inline'` and `ai_session_id` pointing at another user's pilot session is rejected (403/404). + - Race: two concurrent `POST`s for the same `(user, ai_session_id)` resolve to the same row id (one winner, one returns the existing). ### Frontend