Commit Graph

944 Commits

Author SHA1 Message Date
d386d11af2 docs(pilot): correct Phase 9 migration description
All checks were successful
Mirror to GitHub / mirror (push) Successful in 4s
Handoff + migration spec incorrectly claimed Phase 9 added a new
parent_pilot_session_id FK. The implementation reuses the existing
ai_session_id column; the migration only adds the origin discriminator
+ partial unique index. Also: ScriptBuilderTab wraps ScriptBuilderChat
and ScriptBodyEditor (Monaco), not "ScriptBuilderChat in ephemeral
mode" — there is no ephemeral mode on the presentational component.

Applies applied_at call-site specifics: handleScriptDecision stamps
on one_off/draft_template, TemplateMatchPanel stamps on onMarkRun,
Script Builder tab Submit does not stamp.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 06:17:08 -04:00
65a831bf9a docs(pilot): Phase 9 handoff + migration spec update
Marks open items #1 (NoTemplateDialog narrow-lane) and #3 (Tabbed
Script Builder) as resolved. Records the applied_at semantics
correction as shipped. Final Phase 9 row added to the 'What shipped'
table.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 06:14:41 -04:00
faf1d8dd12 fix(pilot): applied_at stamps on run-declaring actions, not Apply click
Per Phase 9 §5. Before: banner Apply click stamped applied_at
regardless of whether the engineer had committed to running anything,
starting the Verifying timer prematurely. After:

- handleApplyFix no longer calls applyFix(). It just routes to the
  right surface (TemplateMatchPanel / InlineNoTemplateDialog / Script
  Builder tab).
- handleScriptDecision stamps applied_at for one_off + draft_template
  (both labels are 'Run now, …' — the click is the declaration).
  build_template does not stamp.
- TemplateMatchPanel's new 'I ran this' button calls applyFix via a
  new onMarkRun prop.
- Script Builder tab Submit does not stamp (a draft is not a run).

No backend change — the /apply endpoint is unchanged. Only call sites
move.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 04:11:56 -04:00
0386fa1fd5 feat(pilot): mount ChatTabStrip + ScriptBuilderTab + InlineNoTemplateDialog
Wires the three new components into AssistantChatPage:
- ChatTabStrip renders when the active fix needs a script drafted.
- ScriptBuilderTab sits alongside chat via display:none toggling so
  chat scroll position + builder state both persist.
- InlineNoTemplateDialog replaces the task-lane bottomSlot render for
  the drafted-script evaluation case; three cards finally fit.
- Banner Apply routing updated: no-draft/no-template → Script Builder
  tab; drafted → InlineNoTemplateDialog; template → unchanged path.

applyFix() call site moves land in the next task.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 04:02:20 -04:00
82db1c78e4 feat(pilot): EscalateInterceptDialog — fourth 'partial' choice
Closes the gap Phase 8 final review flagged. When a fix is in
applied_partial state and the engineer escalates, the intercept no
longer forces them to approximate with didn't-work/worked/never-applied.

AssistantChatPage's handleInterceptChoice (Task 13) already dispatches
to patchOutcome for any FixOutcome value, so no handler change is
needed — the type already supports applied_partial.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 03:04:05 -04:00
f930787200 feat(pilot): TemplateMatchPanel — explicit 'I ran this' action
Generate and Copy alone don't declare a run — the engineer can walk
away after copying. Phase 9 §5 defines an explicit run-declaration
affordance so applied_at only stamps on the engineer's positive
commitment. Wiring from AssistantChatPage lands in Task 13.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 03:02:17 -04:00
5bcb7aa7c3 feat(pilot): InlineNoTemplateDialog — chat-region placement wrapper
Slide-up wrapper around the existing NoTemplateDialog for rendering
in the chat region above the composer (parallel to ProposalBanner).
The chat region's width lets grid-cols-3 finally work as intended.

No change to NoTemplateDialog itself; decision callbacks and card
copy stay identical.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 02:56:35 -04:00
04fbfe3b8f feat(pilot): ScriptBuilderTab controller
Owns the inline Script Builder session lifecycle:
- Get-or-create (origin='pilot_inline', ai_session_id) on mount.
- Renders ScriptBuilderChat in AI mode and CodeModeEditor (Monaco) in
  'Write it myself' mode. Mode toggles via display:none so buffer and
  messages persist across switches.
- Submit → sessionSuggestedFixesApi.patchScript; emits onScriptDrafted
  to parent, which refreshes the fix and hides the tab strip.
- Relays in-progress state to the parent via onProgressChange for the
  ChatTabStrip's indicator dot.

ScriptBuilderChat is untouched (stays presentational). Persistence
semantics live on the controller, not the display component.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 02:55:12 -04:00
f92cbefed9 feat(pilot): ChatTabStrip component — [Chat] [Script Builder ●]
Two-tab strip for the chat region. Parent controls mounting (strip only
appears when the fix needs a script drafted). Indicator dot signals
in-progress draft state. Tab switching via onChange callback; parent
handles display:none toggling so tab contents preserve state.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 02:45:16 -04:00
c9306e40c9 feat(pilot): frontend API client — patchScript + inline createSession
sessionSuggestedFixesApi.patchScript(sessionId, fixId, script, params?)
hits the new PATCH /script endpoint.

scriptBuilder.createSession accepts an optional options bag with
origin + aiSessionId, defaulting to standalone when omitted so legacy
callers stay behavior-preserving.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 02:38:07 -04:00
1c855563ee feat(pilot): PATCH /suggested-fixes/:id/script endpoint
Called by the inline Script Builder tab on Submit. Writes
ai_drafted_script + ai_drafted_parameters to the fix without stamping
applied_at (a draft is not an application — that's §5 of the Phase 9
spec). Bumps state_version so Resolve/Escalate preview bundles
regenerate.

409 on terminal fix status. 404 on wrong session. 422 on empty script.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 02:34:06 -04:00
d4fae87236 feat(pilot): inline Script Builder session — idempotent create + auth + filtered list
POST /script-builder/sessions now supports origin='pilot_inline':
- Requires ai_session_id; validates it against current user ownership.
- Get-or-create: returns existing row for (user, ai_session_id) pair.
- Partial unique index on the DB backs the invariant; races resolve to
  the single winner row.

list_sessions + count_user_sessions default-scope to origin='standalone'
so inline scratch sessions don't pollute the /script-builder dashboard
or count against the 5-session cap.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 02:24:57 -04:00
f2fce27f0d feat(pilot): pydantic schemas for inline origin + script PATCH
- ScriptBuilderCreateRequest gains origin ('standalone' | 'pilot_inline')
  and optional ai_session_id. Handler-side validation (next task) enforces
  pilot_inline ⇒ ai_session_id required + owned by caller.
- SessionSuggestedFixScriptRequest added for the new PATCH /script
  endpoint (Phase 9 Task 6).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 01:53:28 -04:00
93c974466a feat(pilot): script_builder_sessions.origin on SQLAlchemy model
Mirrors the DB column added in the prior migration. App-level default
is 'standalone' so existing callers of ScriptBuilderSession(...) work
without code changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 01:48:22 -04:00
8012668975 feat(pilot): add origin + inline idempotency to script_builder_sessions
Phase 9 prep. Adds:
- origin VARCHAR(20) NOT NULL with CHECK ('standalone' | 'pilot_inline')
- invariant: pilot_inline rows must have ai_session_id
- partial unique index on (user_id, ai_session_id) WHERE origin='pilot_inline'
  — backs get-or-create idempotency for the inline Script Builder tab.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 00:22:53 -04:00
563bb1aa6f docs(pilot): Phase 9 implementation plan
14-task plan covering:
- DB migration for origin + partial unique index on script_builder_sessions
- Pydantic schemas for inline origin + PATCH /script
- POST /script-builder/sessions idempotent for pilot_inline + auth
- list_sessions + count_user_sessions filtered to standalone
- PATCH /suggested-fixes/:id/script (bumps state_version, no applied_at)
- Frontend API client additions
- ChatTabStrip, ScriptBuilderTab (controller), InlineNoTemplateDialog
- TemplateMatchPanel 'I ran this' action
- EscalateInterceptDialog fourth 'partial' choice
- AssistantChatPage integration + applyFix call-site relocation
- Docs + handoff updates

Paired with the spec at phase-9-script-builder-tab.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 00:03:57 -04:00
1d2d548fc8 docs(pilot): Phase 9 spec — final consistency polish
- Frontend scriptBuilder API client inventory now matches the backend
  schema: createSession accepts BOTH origin and ai_session_id (both
  required together for inline callers, both omitted for standalone).
- 'If template -> unchanged' sharpened: render location is unchanged,
  but run stamping moves into the panel's new 'I ran this' action per
  the §5 apply lifecycle correction.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 23:54:04 -04:00
3ee0101c6d docs(pilot): Phase 9 spec — ownership + schema corrections
- scriptBuilderMode ownership: pinned to ScriptBuilderTab, removed from
  AssistantChatPage's state list. Parent never drives the AI/editor
  toggle; controller owns it and resets naturally on session switch via
  unmount/remount. scriptBuilderHasProgress stays on the page (needed
  for the tab strip indicator dot) and is driven by the controller via
  an onProgressChange callback.
- ScriptBuilderCreateRequest schema: explicitly calls for TWO new
  optional fields (origin + ai_session_id), not just origin. Handler
  enforces: when origin='pilot_inline', ai_session_id is required and
  must pass the current-user ownership check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 23:49:08 -04:00
861d082ff7 docs(pilot): Phase 9 spec — consistency pass on Apply stamp call sites
Three consistency fixes:
- File inventory (backend + frontend) now names all three apply-stamp
  call sites: handleScriptDecision('one_off' | 'draft_template') plus
  TemplateMatchPanel's 'I ran this' handler. Previously listed only
  'one_off' in two places, contradicting the §5 lifecycle table.
- NoTemplateDialog relocation section no longer claims the decision
  handler is 'unchanged' — it is unchanged EXCEPT for the moved
  apply stamp, which is the point of §5.
- Open deferrals entry on ScriptBuilderChat 'ephemeral mode' removed;
  replaced with the actual new surface (ScriptBuilderTab controller),
  which reuses the existing script-builder prompt unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 23:41:17 -04:00
75b59123e6 docs(pilot): Phase 9 spec — fix Apply semantics + session idempotency
Four review findings addressed:

- High: draft_template 'Run now, templatize after' DOES run the
  script; applied_at table now stamps for both one_off and
  draft_template. Only build_template (no run) skips the stamp.
- Medium: TemplateMatchPanel needs an explicit '✓ I ran this' button.
  Generate/Copy don't commit to running. The new button is the stamp
  moment for template-match fixes.
- Medium: get-or-create for inline script_builder_sessions —
  POST /script-builder/sessions is now idempotent for
  origin='pilot_inline' (returns the existing row for a
  (user, ai_session_id) pair). Backed by a partial unique index:
    UNIQUE (user_id, ai_session_id) WHERE origin = 'pilot_inline'
  so remount doesn't create duplicates and draft continuity is
  preserved.
- Medium: authorization — the create endpoint validates that any
  provided ai_session_id is owned by the current user (same guard
  other pilot endpoints use). Prevents cross-user attachment of
  scratch sessions to arbitrary pilot sessions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 23:34:53 -04:00
fcd224429c docs(pilot): revise Phase 9 spec per review findings
Four findings addressed:

- High: drop proposed parent_pilot_session_id column; reuse the
  existing ai_session_id FK on script_builder_sessions. Add an
  origin + ai_session_id coherence invariant.
- High: don't add a 'mode' prop to ScriptBuilderChat (it's
  presentational). Introduce a ScriptBuilderTab controller that owns
  session lifecycle + submit, renders ScriptBuilderChat unchanged.
- Medium: filter list_sessions / count_user_sessions to origin='standalone'
  so pilot_inline scratch sessions don't pollute the /script-builder
  dashboard or count against the 5-session cap.
- Medium: applied_at is stamped only when the engineer commits to a
  run-action (one_off, TemplateMatchPanel Run), not on banner Apply
  click. Corrects a Phase 8 over-eager stamp that would otherwise
  multiply across three surfaces.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 23:28:53 -04:00
196c003876 docs(pilot): Phase 9 spec — tabbed Script Builder + NoTemplateDialog relocation
Design doc for the FlowPilot migration's remaining open items:
- NoTemplateDialog narrow-lane bug (resolved by moving the dialog to
  the chat region alongside ProposalBanner — three cards fit naturally
  at that width; grid-cols fix no longer needed)
- Tabbed Script Builder inside the chat (new [Chat] [Script Builder ●]
  tab strip; AI chat default with 'Write it myself' Monaco escape hatch)

Plus a Phase 8 cleanup:
- EscalateInterceptDialog fourth 'I applied some of it — partial' choice

All six architecture decisions settled via brainstorming before writing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 23:03:57 -04:00
f2b9476edb docs(pilot): log Issues #1-4 findings for Phase 8 review
Tracks the three code-review issues that were fixed on this branch
(#1 outcome-aware previews, #2 persist Apply, #3 persist proposal
rejection) plus a newly-documented pre-existing test failure (#4 —
decision-endpoint test written in Phase 3 never updated when Phase 5
added the drafted-script validation guard).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 22:18:13 -04:00
70c5da0c75 fix(pilot): persist AI-proposal rejection + clear on outcome write
Issue #3 from phase-8-review-issues.md. 'Not yet' on the AI-confirming
banner was a local-state hide; the proposal re-surfaced on the next
refreshSessionDerived call.

Two-part fix:
- PATCH /outcome now clears ai_outcome_proposal on any terminal action
  (engineer has taken a decision; stale AI proposal is moot).
- New DELETE /ai-sessions/:sid/suggested-fixes/:fid/ai-outcome-proposal
  endpoint for explicit 'Not yet' rejection. Does not touch status
  or state_version — pure UI state.

Frontend handleRejectAIProposal now calls the DELETE and setActiveFix
with the server response.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 22:15:48 -04:00
de2bef3175 fix(pilot): persist Apply — stamp applied_at on click
Issue #2 from phase-8-review-issues.md. Apply was client-side-only via
a bannerApplied flag. Refresh / chat reselect / multi-tab would drop
Verifying state back to Proposed.

- New POST /ai-sessions/{sid}/suggested-fixes/{fid}/apply stamps
  applied_at without changing status (still 'proposed'). Idempotent
  if already stamped; 409 if fix is past proposed (a terminal outcome
  was already recorded).
- Bumps state_version so resolve/escalate preview bundles reflect that
  the fix has entered verifying.
- Frontend handleApplyFix calls the endpoint and uses the returned
  applied_at directly. bannerApplied client flag is removed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 22:10:52 -04:00
362c7b1d79 fix(pilot): outcome-aware Resolve/Escalate previews
Issue #1 from phase-8-review-issues.md. Cache invalidation alone isn't
enough — previews were also omitting outcome fields from the LLM bundle,
so a fresh regenerate still couldn't distinguish proposed / failed /
partial / success.

- PATCH /outcome now bumps ai_sessions.state_version (matches
  record_decision's existing pattern).
- Resolution-note + escalation-package bundles now include status,
  applied_at, verified_at, partial_notes, failure_reason on the active fix.
- Generator prompts prescribe outcome-aware phrasing (closure language
  for success; what-we've-tried + next-steps for failed/partial).
- New end-to-end test asserts the regenerated preview reflects the
  recorded outcome, not just that the cache key changed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 22:04:56 -04:00
ec104dc8de docs(pilot): sync Phase 8 handoff with actual implementation
Correct the stale ai_sessions.fix_outcome reference (no such column) —
the real schema adds six columns to session_suggested_fixes. Update
last_commit to reflect the docs-correction tip.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 19:48:54 -04:00
a47ce07326 docs(pilot): fix Phase 8 column + commit-SHA references
Correct the FLOWPILOT-MIGRATION.md stale references to a non-existent
ai_sessions.fix_outcome column — the actual implementation added six
columns to session_suggested_fixes. Also fix a stale first-commit SHA
(6721b84 → cdd8bb0, the former was amended away).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 17:42:51 -04:00
2a54127a54 docs(pilot): Phase 8 fix outcome banner — handoff + migration spec
Marks open item #2 (task-lane crowding / Suggested Fix discoverability)
as resolved by Phase 8. Open items #1 (NoTemplateDialog narrow-lane)
and #3 (Tabbed Script Builder inside chat) remain deferred.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 16:52:07 -04:00
8582d24236 chore(pilot): remove deprecated SuggestedFix task-lane card
Superseded by ProposalBanner (Phase 8). The import was already removed
from AssistantChatPage in the previous commit; this deletes the orphaned
file itself and strips the now-unused suggestedFixSlot prop from
TaskLane's interface and both call sites.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 16:48:42 -04:00
bdb238a274 feat(pilot): mount ProposalBanner + wire implicit signals
Replaces the task-lane SuggestedFix card with the ProposalBanner docked
above the chat composer. Wires:
- Resolve-while-verifying auto-marks applied_success (one-click resolve).
- Escalate-while-verifying opens EscalateInterceptDialog to capture the
  real outcome (default: didn't work) before handoff.
- 3+ post-apply engineer messages trigger the passive Nudge banner.
- AI [FIX_OUTCOME] proposals surface in the AIConfirming state; one-click
  confirm applies the outcome.

Banner state resets on session switch via resetSessionDerivedState.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 16:42:01 -04:00
075b0fc1d8 feat(pilot): EscalateInterceptDialog popover
Anchored above the Escalate button, captures fix outcome before the
engineer hands off the ticket. Defaults to 'didn't work' on Enter
(the common case). Alternatives: 'worked, escalating for another
reason' (preserves success) and 'never actually applied' (dismiss).

Task 11 will wire this to AssistantChatPage's Escalate handler.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 15:48:33 -04:00
217747f46e feat(pilot): banner AI-confirming, Nudge, Collapsed states
Completes ProposalBanner's state machine. AIConfirming (accent-blue)
surfaces the AI's [FIX_OUTCOME] proposal with one-click accept; Nudge
is the compact passive-prompt variant for post-apply chats; Collapsed
is the 28px expand-hint strip.

Adds onSilenceNudge prop so the parent can silence the nudge without
collapsing it (Task 11 wires this). Removes the last three stale
eslint-disable-next-line comments — all sub-components now use props.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 15:39:08 -04:00
7fa1d6a32f feat(pilot): banner Verifying + Partial states
Verifying: amber pulse animation, confidence pill becomes 'Applied Xm ago',
three actions (overflow for Mark partial, Didn't work, It worked). window.prompt
used for the partial notes + failure reason inputs — good-enough v1 pending
an inline composer.

Partial: cyan-toned to signal 'parked, outcome unknown', shows saved notes
inline, Finish it / Didn't work / It worked actions.

Adds pulse-amber to @theme animations alongside slide-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 15:32:02 -04:00
ac67e48500 feat(pilot): ProposalBanner scaffold + Proposed state
New component that will replace the task-lane SuggestedFix card. Docks
above the chat composer with a 320ms slide-up animation. This commit
implements only the Proposed state (Tasks 8 & 9 fill Verifying, Partial,
AI-confirming, Nudge, Collapsed).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 15:25:41 -04:00
cdd29b460e feat(pilot): frontend fix-outcome types + patchOutcome API
Extends SessionSuggestedFix with outcome fields (status, applied_at,
verified_at, partial_notes, failure_reason, ai_outcome_proposal) and
adds a patchOutcome method hitting the new backend endpoint.

FixStatus (5 values) + FixOutcome (4 writable values) mirror the
backend Pydantic types and the DB check constraint.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 15:20:16 -04:00
2cde6673b0 feat(pilot): [FIX_OUTCOME] system prompt instructions
Tells the AI when + how to emit the [FIX_OUTCOME] marker that Task 4's
parser consumes. Placeholder-only per the anti-parrot pattern — no
literal UUIDs, outcomes, or reasons that could leak into unrelated
sessions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 15:17:21 -04:00
c0112f8bee feat(pilot): [FIX_OUTCOME] marker parser + AI outcome proposal
The AI emits [FIX_OUTCOME] when the engineer indicates in chat that a
prior suggested fix worked, didn't work, or was partially applied. The
marker writes to session_suggested_fixes.ai_outcome_proposal (JSONB),
which the frontend surfaces as a "confirm outcome?" banner. The status
column is only updated when the engineer clicks confirm (via PATCH
/outcome endpoint from Task 3).

Placeholder-only system prompt wiring comes in Task 5.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 15:08:43 -04:00
8988dbc885 feat(pilot): PATCH /suggested-fixes/:id/outcome endpoint + tests
Records engineer-reported outcome (applied_success|applied_failed|
applied_partial|dismissed). Enforces transition rules (partial → success/
failed allowed; terminal outcomes return 409) and notes requirements
(applied_partial requires notes).

Sets verified_at on success/failure, stamps applied_at if not already
set (handles the case where the AI [FIX_OUTCOME] marker fires before
the engineer clicks Apply).

Also fixes pre-existing test-infrastructure bug: network_diagram.py used
bare string server_default="'[]'" for JSONB columns, which asyncpg
rejects during test schema creation. Changed to text("'[]'::jsonb") to
match the pattern used by script_template.py.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 14:59:34 -04:00
4a8e3ae954 feat(pilot): pydantic schemas for fix outcome patch
Adds FixStatus literal (5 values matching the DB check constraint),
extends SessionSuggestedFixResponse with outcome fields, and introduces
SessionSuggestedFixOutcomeRequest for the PATCH /outcome endpoint coming
in Task 3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 14:44:39 -04:00
cdd8bb05cc feat(pilot): add outcome tracking columns to session_suggested_fixes
Phase 8 prep for the fix outcome banner. Adds:
- status (proposed|applied_success|applied_failed|applied_partial|dismissed)
- applied_at, verified_at (timestamps)
- partial_notes, failure_reason (engineer-provided context)
- ai_outcome_proposal (JSONB for AI [FIX_OUTCOME] marker payloads)

Backfills status='dismissed' from user_decision='dismissed'. status is
orthogonal to user_decision — outcome (did the fix work?) vs script-path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 14:40:17 -04:00
8879f96fbf fix(pilot): drop sticky section headers in task lane
All checks were successful
Mirror to GitHub / mirror (push) Successful in 4s
Each lane section (What we know, Questions, Diagnostic Checks, Suggested
fix) had its own `position: sticky; top: 0` header. As the engineer
scrolled past a section, that section's header would pin until the
section's bottom edge cleared the viewport, producing an "orphaned"
label floating over unrelated content below. Headers now scroll with
their content — in a 340px-wide lane the affordance was negative value.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 16:01:14 -04:00
8a242f5db9 feat(pilot): Phase 7 — polish (loading/empty states, shortcuts, responsive drawer)
All checks were successful
Mirror to GitHub / mirror (push) Successful in 4s
- WhatWeKnow shows a "synthesizing" indicator + skeleton pulse while the
  chat cycle is in-flight; task-lane header mirrors the signal with a
  "thinking" pip so engineers know the AI is still working.
- Quiet-state hint when the lane is open (facts exist) but no open
  questions, checks, or active fix — keeps the surface from looking
  "finished" when the AI is about to follow up.
- Keyboard shortcuts: ⌘↵/Ctrl+↵ send in the composer (plain Enter still
  sends), ⌘G toggles the Script Generator panel for the active fix,
  `?` opens a new ShortcutsHelpOverlay listing all bindings. ⌘K palette
  was already wired in TopBar.
- Responsive: below 1200px the task lane collapses to a bottom drawer
  with a backdrop + a floating "Tasks ●" toggle button. TaskLane now
  takes a `variant: 'side' | 'drawer'` prop; drawer variant drops the
  resize handle and uses the shared slide-in-bottom animation.
- Build hygiene: fixed a pre-existing TS error in confirm-post error
  handling (duplicate `response` type keys) and an unused-import warning
  in TemplatizePrompt.

Verified: `npx tsc -b` and `npm run build` both clean against the dev
stack; Vite HMR applied each change without errors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 14:19:44 -04:00
4aaf57adb5 feat(pilot): Phase 6 — post-resolve templatize prompt + draft accept/reject
All checks were successful
Mirror to GitHub / mirror (push) Successful in 11s
Closes the loop on the Phase 5 "Run now, templatize after resolve" path.
After a session resolves, drafts queued by the three-option dialog surface
as a modal that lets the engineer review the AI-proposed parameterization
and either save as a reusable team template or skip. A "don't ask again"
toggle writes to account_settings.preferences so the next resolve won't
pop the modal.

Backend:
- /api/v1/draft-templates:
  * GET — list account drafts (pending_only default true; pass false for
    audit view including accepted/rejected)
  * GET /{id} — single draft
  * POST /{id}/accept — promotes to a new script_templates row with
    source_session_id / source_user_id / source_ticket_ref populated
    (drives the Script Library "generated from CW #X · resolved by Y"
    provenance chip). Draft flips to status=accepted,
    promoted_template_id set, resolved_at stamped. 409 on re-accept /
    already-rejected. 400 on unknown category_id.
  * POST /{id}/reject — flips to status=rejected. 409 on re-reject.
- /api/v1/accounts/me/preferences (GET/PATCH) — thin wrapper over
  AccountSettings.get_setting/set_setting. PATCH merges keys into the
  JSONB column, preserving existing keys the client didn't touch.
  Used by the "Don't ask again for this team" checkbox
  (templatize_prompt_enabled=false) and, forward-looking, by
  cw_resolved_status_id / cw_escalated_status_id from Phase 4.
- 13 tests: list filter, accept with/without edited_body, provenance
  copy-through, reject, 409 on re-accept / re-reject, 400 on unknown
  category, prefs round-trip with merge semantics.

Frontend:
- src/components/pilot/script/TemplatizePrompt.tsx — modal showing the
  drafted script with proposed parameters in the Phase 5
  ParameterizationPreview, editable name/category/description, an
  individual-parameter remove button, and the "don't ask again" opt-out.
  Accept posts to /draft-templates/{id}/accept + optionally PATCHes
  preferences. Skip posts /reject.
- src/api/draftTemplates.ts — typed client plus accountPreferencesApi.
- AssistantChatPage: after a successful Resolve (external OR local),
  fetches preferences + pending drafts for the session and queues the
  modal one draft at a time. Escalate does not trigger this flow.
- Sidebar: Scripts nav shows the pending-draft count as a badge. Fetched
  independently of the main sidebar stats so endpoint flakes don't
  break the rest of the sidebar.

Verified live 2026-04-22: seed two drafts → GET sees both pending →
accept draft A (template created, provenance CW #99123 populated) →
reject draft B → pending count drops → PATCH opt-out → GET confirms
persistence.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 02:37:49 -04:00
ddae171a37 fix(pilot): clear messages in resetSessionDerivedState — was leaking across chats
All checks were successful
Mirror to GitHub / mirror (push) Successful in 10s
Symptom: sidebar showed "User mjones got locked out … 0 messages" but the
conversation pane was rendering 2 messages from a different chat. The
task lane content matched what was displayed (so the AI was fine post-
prompt-sweep) — the leak was purely UI: messages from the previous chat
stayed on screen until the new chat's getSession returned.

selectChat resetSessionDerivedState() then awaits getSession before
calling setMessages(detail.conversation_messages). Between the reset
and that await, the prior chat's messages remain visible. handleNewChat
already had an explicit setMessages([]) call so it was unaffected;
selectChat did not.

Folded setMessages([]) into resetSessionDerivedState so any new chat-
switch entry point gets the wipe for free.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 02:15:39 -04:00
d0ebdef9e8 fix(ai): full-sweep audit — placeholders only in system prompts + CI guardrail
All checks were successful
Mirror to GitHub / mirror (push) Successful in 10s
The "AI parrots example content from system prompt" bug bit us twice in
one day across two different prompt sites. Patching individual prompts
is treating the symptom; this commit makes the rule structural.

Audit + sanitize:
- assistant_chat_service.ASSISTANT_SYSTEM_PROMPT — already cleaned in
  prior commits, but the [FORK] schema still had literal "Brief reason"
  / "Short name" / "One sentence" placeholders. Replaced with
  <angle-bracket> placeholders. Anti-parrot rule itself rewritten to
  describe the failure mode abstractly instead of naming "jsmith" so
  the rule no longer trips the guardrail (and so the model doesn't
  see "jsmith" as a token at all).
- ai_chat_service.py — removed three concrete-example offenders:
  "Get-Service ADSync" command literal, the "DC01 server_name" intake
  form payload (in two places), and the inline interview demos using
  "Azure AD Sync failures" / "Exchange Online mailbox migration".
  Replaced with technology-neutral schema descriptions.
- ai_tree_generator_service.BRANCH_DETAIL_SYSTEM_PROMPT — replaced the
  fully-fleshed DNS troubleshooting tree (with literal Dnscache /
  ipconfig / google.com / Start-Service) with a placeholder schema
  showing only ID-linkage shape.
- kb_conversion_service.PROCEDURAL_SYSTEM_PROMPT — replaced the worked
  Server Manager + DC01 example payload with a placeholder schema.

Guardrail (tests/test_prompt_anti_parrot.py):
- Imports every module under app/services/ and app/core/ and walks
  every uppercase string constant ending in _PROMPT, _SCHEMA,
  _PROTOCOL, _FORMAT, or _CONTEXT.
- test 1: known-leaked-token list (jsmith, DC01, ADSync, Dnscache,
  google.com, "Outlook keeps", "Teams drops") must not appear in any
  prompt constant. Add to the list when a new leak shows up in prod —
  the list IS the audit trail.
- test 2: marker blocks ([QUESTIONS], [ACTIONS], [SUGGEST_FIX], etc.)
  must contain placeholders only. Distinguishes JSON keys (followed
  by ':', allowed) from JSON values (followed by ',' / ']' / '}',
  must be <placeholder>); allows pipe-separated enum types
  (text|password|select) and a small set of fixed enum values
  (question, diagnostic_check, decision, action, ...). Verified by
  feeding the test a known-bad block — caught it correctly.

Documented the rule in CLAUDE.md → AI / FlowPilot lessons, naming
the test as the enforcement point so future contributors know how to
extend it (add to the known-leaked list when a new leak surfaces).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 02:09:30 -04:00
50215b9110 fix(pilot): strip literal example content from system prompt — model was parroting
All checks were successful
Mirror to GitHub / mirror (push) Successful in 10s
The system prompt had a "Complete example of a correct first response"
section with a specific Outlook/WiFi/jsmith scenario plus literal JSON
payloads in [QUESTIONS], [ACTIONS], [SUGGEST_FIX], and [PROMOTE]
markers. The model was emitting those literal strings (the same
WiFi/laptop questions, the same "Clear cached credentials" suggested
fix, the same "OWA login confirmed for jsmith" promote) on EVERY
unrelated chat — making the task lane look like it was leaking previous-
session data when in fact the AI was just reciting the prompt examples.

Replaced literal example content with `<placeholder>` schemas. Added an
explicit ANTI-PARROT RULE in the FINAL REMINDER section calling out
that the angle-bracket placeholders show SHAPE, not CONTENT, with
concrete examples of the failure mode (printer ticket → don't ask
about Outlook; user not named jsmith → don't name jsmith).

Same scrub applied to the FORK section's "Outlook AND Teams dropping"
and the worked fork-flow example.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 01:36:29 -04:00
ce7c8ac3d5 fix(pilot): wipe full task-lane state on chat switch + extract palette event
All checks were successful
Mirror to GitHub / mirror (push) Successful in 10s
Two fixes from the Phase 5 shakedown:

1. Stale lane data leaking across chats. handleNewChat, sendPrefill, and
   handleResumeNew were each missed when Phase 3/5 added activeFix,
   previewKind, previewData, and scriptPanelOpen — only selectChat reset
   the full set. Result: starting a new chat while a Suggested Fix card
   was active showed the previous session's fix card (and any open
   preview/script panel) until the next backend refresh swept it.
   Consolidated all four entry points behind a single
   resetSessionDerivedState() helper so adding new lane state in future
   phases only requires touching one place.

2. CommandPalette TDZ on cold load. SCRIPTS_INLINE_QUICK_ACTION (line 66)
   referenced PILOT_INLINE_SCRIPT_PATH declared at line 94 — module-level
   evaluation hit the use before the declaration. Browser blanked with
   "Cannot access 'PILOT_INLINE_SCRIPT_PATH' before initialization".
   Moved the path const above its first use; also extracted
   PILOT_INLINE_SCRIPT_EVENT into a tiny @/lib/pilotEvents module so
   AssistantChatPage doesn't import the palette component just to read a
   string — that mixed-export pattern broke Fast Refresh ("consistent
   components exports") and added an unnecessary import edge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 01:30:18 -04:00
fa61376303 feat(pilot): Phase 5 — inline Script Generator integration
All checks were successful
Mirror to GitHub / mirror (push) Successful in 10s
Wires the SuggestedFix card to an inline panel that handles both cases:
template-matched fixes open the Script Library generator with parameters
pre-filled from session context; un-matched fixes open the three-option
dialog (one_off / draft_template / build_template). The decision endpoint
records the path choice with side effects: draft_template persists a
draft_templates row via a Sonnet-driven TemplateExtractionService;
build_template returns a redirect to the Script Builder; one_off just
records the choice.

Backend:
- TemplateExtractionService: drafts a parameter schema from a concrete
  rendered script. Conservative by default ("prefer fewer parameters").
  Round-trip-validates that templated_body only references declared
  parameters; missing-key mismatch falls back to the original script
  with no params. LLM/parse failures fall back identically — the
  engineer can still create a draft and refine in the post-resolve
  prompt (Phase 6).
- /suggested-fixes/{fix_id}/decision side effects:
  * one_off → returns rendered_script (engineer's edited version or the
    fix's ai_drafted_script verbatim)
  * draft_template → same + creates draft_templates row with extracted
    params, returns draft_template_id
  * build_template → returns redirect_path=/scripts/builder?from_session=
    &fix= so the frontend can navigate to the builder pre-loaded
- 400 when a non-template fix has no ai_drafted_script (template-matched
  fixes take the dedicated /scripts/generate path, not this endpoint).
- 12 tests: TemplateExtractionService parse + fallback paths, all four
  decision branches, edited_script override, missing-script 400.

Frontend:
- src/components/pilot/script/{TemplateMatchPanel, NoTemplateDialog,
  ParameterizationPreview}.tsx — inline panels rendered in the task
  lane's bottom slot when the engineer clicks a SuggestedFix card.
- TemplateMatchPanel: loads template via /scripts/templates/{id},
  pre-fills params from fix.ai_drafted_parameters with cyan "from
  session" tags, generates via existing /scripts/generate (already
  bumps state_version on ai_session_id from Phase 3). 404 falls back
  with a clear message instead of erroring.
- NoTemplateDialog: shows the AI-drafted script with proposed parameter
  values highlighted in amber via ParameterizationPreview; three option
  cards with the middle (draft_template) flagged Recommended; inline
  edit on the script body before deciding.
- SuggestedFix card now clickable: onActivate toggles the inline panel.
- AssistantChatPage: scriptPanelOpen state + handleScriptDecision that
  navigates on build_template and toasts on the other paths. Active fix
  changes auto-close the panel so engineers don't act on stale state.
- Cmd+K → "Open inline Script Generator" palette entry surfaces only on
  /pilot/:id routes; fires a window event the chat page subscribes to.
  No Resolve shortcut added per Section 14 decision (browser ⌘R conflict).

Verified 2026-04-22 against the dev stack:
- one_off / draft_template / build_template all return the right shape
  with real Sonnet TemplateExtractionService for the draft path.
- Conservative extraction confirmed: cmdkey + Restart-Process script
  yielded zero proposed parameters as intended.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 00:15:29 -04:00
8fd2c1bac6 feat(pilot): Phase 4 — Resolve + Escalate PSA writebacks with status verification
All checks were successful
Mirror to GitHub / mirror (push) Successful in 11s
Wires the preview popover's Confirm & post action to ConnectWise (and,
via the provider pattern, any future PSA). Adds the parallel Escalate
flow with the handoff-oriented five-section markdown. Sessions without a
linked PSA ticket resolve/escalate locally — markdown stored, status
flipped, nothing posted externally.

Backend:
- EscalationPackageGeneratorService: Sonnet, five sections (Problem /
  What we've confirmed / What we've tried / Current hypothesis /
  Suggested next steps). Shares the preview_cache with a separate KIND
  so Resolve and Escalate previews for the same state coexist.
- PSAWritebackService: post_resolution_note (RESOLUTION note type,
  customer-visible), post_escalation_package (INTERNAL_ANALYSIS,
  handoff for the next engineer only), transition_ticket_status with
  mandatory re-fetch verification. PSAStatusVerificationError surfaces
  loudly when CW silently rejects a status change — the
  ConnectWise anti-pattern CLAUDE.md flags.
- Endpoints:
  * POST /ai-sessions/{id}/escalation-package/preview
  * POST /ai-sessions/{id}/resolution-note/post
  * POST /ai-sessions/{id}/escalation-package/post
  Outcomes: "resolved" / "escalated" with external_id + verified status,
  "resolved_local" / "escalated_local" when no PSA linked.
- Target CW status IDs live in account_settings.preferences
  (cw_resolved_status_id, cw_escalated_status_id). When unset, the post
  proceeds without a status transition — response includes a
  status_transition_skipped_reason rather than silently erroring.
- 7 tests: local-only path, PSA happy path with verified transition,
  status verification failure → 502, skipped transition when
  unconfigured, 409 on already-resolved re-post, escalate parallel path,
  internal-analysis note type enforced.

Frontend:
- ResolutionNotePreview now kind-parameterized ('resolve' | 'escalate')
  with inline edit + Confirm & post. Preview loads from the matching
  backend endpoint; posting calls the matching endpoint; outcome toast
  surfaces the verified CW status or the local-only result.
- AssistantChatPage: previewKind state replaces previewOpen; two toggle
  buttons (Preview Resolve note / Escalate instead) in the lane's bottom
  slot. handleConfirmPost dispatches by kind.

Verified 2026-04-22:
- Local-only Resolve + Escalate round-trip against the dev stack.
- Live Sonnet escalation-package preview; cache hit on repeat call
  with no state change (separate cache kind from resolution-note).
- PSA post + status-verification paths covered by mocked-provider pytest
  cases. Live CW round-trip pending a test CW instance.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 23:54:54 -04:00