feat: FlowPilot migration — Phase 1-9 + Phase 9 bug fixes + QA fixture harness #147

Merged

chihlasm merged 85 commits from feat/flowpilot-migration into main

2026-04-25 06:02:14 +00:00

Author	SHA1	Message	Date
Michael Chihlas	a45915fbbc	Merge main into feat/flowpilot-migration (PR #148 backports) Brings PR #148 — two pre-existing CI fixes (network_diagrams JSONB server_default, removed deprecated session-scoped event_loop fixture). The conftest.py event_loop fix on main is already incorporated in FlowPilot's `b14a16a` (RLS-gating commit, which dropped the same fixture as part of its larger refactor). Kept HEAD's version of the RLS-gating collection hook; the event_loop fixture removal is identical. The network_diagram.py fix lands cleanly via auto-merge. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 02:01:46 -04:00
Michael Chihlas	1c904373f8	Merge main into feat/flowpilot-migration Brings in PR #141 (PSA ticket management) so FlowPilot can ship on top of a unified main. Two manual conflict resolutions: 1. CLAUDE.md — kept the FlowPilot ai-handoff rewrite (`.ai/`-driven protocol). The pre-rewrite reference content (CW integration notes, lessons archive, env vars table) lives in `docs/connectwise/`, `docs/LESSONS-ARCHIVE.md`, and DEV-ENV.md by design. 2. frontend/src/pages/AssistantChatPage.tsx — both conflict regions were purely additive. Concatenated FlowPilot's Phase 2-9 state hooks (facts, activeFix, preview*, scriptPanelOpen, templatizeQueue) with PSA's spin-off ticket state (linkedTicket, showNewTicket, spinOffHint). Both modal mounts (TemplatizePrompt, ShortcutsHelpOverlay, NewTicketModal) kept. All setters wired by either branch are intact. Verification: - `tsc -b` clean across the merged tree. - Browser smoke-test (Session B fixture): Phase 9 ProposalBanner ("Run AI-drafted PowerShell to recover SSL VPN") renders alongside PSA's new Tickets sidebar icon. Console clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 01:03:33 -04:00
Michael Chihlas	9330ce4782	fix(pilot): two Phase 9 layout/state bugs surfaced by QA fixtures 1. EscalateInterceptDialog clipped off-screen. The dialog was positioned with `absolute bottom-full mb-2 left-0` under the assumption the Escalate button would have room above it. In practice the button lives in the chat-page action bar near y≈105, so the 302 px dialog overflows the top of the viewport and only the last option is visible. Switch to `top-full mt-2 right-0` — anchors the dialog below the button and aligns its right edge with the button (avoids overflow off the right when the button is in the right-side action cluster). 2. TemplateMatchPanel never renders on a fresh session. `handleApplyFix` for the script_template_id branch only sets `scriptPanelOpen=true`, but TemplateMatchPanel is mounted inside `TaskLane.bottomSlot`. On sessions with no questions/facts the lane defaults closed, so the panel exists in the React tree but inside an unrendered TaskLane — the user clicks Apply fix and nothing visibly changes. Fix: also `setShowTaskLane(true)` in that branch so the lane opens alongside the panel. The ai_drafted_script branch is fine (InlineNoTemplateDialog renders in the chat region, not in the lane), so it's left alone. Both bugs were latent — they only surface on sessions that haven't accumulated TaskLane state yet (questions/facts). Fresh sessions created from the StartSessionInput hide them because the AI's first turn populates questions and the lane auto-opens. Caught using the new seed_phase9_qa_fixtures.py harness. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 00:08:50 -04:00
Michael Chihlas	d68131a865	feat(seed): Phase 9 QA fixture seeder Adds backend/scripts/seed_phase9_qa_fixtures.py — creates 4 ai_sessions plus matching session_suggested_fixes that pre-bake the four backend states the AI orchestrator must produce to mount the five conditional Phase 9 components: A. no template, no draft → ChatTabStrip + ScriptBuilderTab B. ai_drafted_script set → InlineNoTemplateDialog C. script_template_id set → TemplateMatchPanel D. applied_at + status=proposed → EscalateInterceptDialog (verify state) Background: a Phase 9 QA pass against a regular session left these five components unreached because the AI didn't emit SUGGEST_FIX in time/at all. Seeding directly bypasses the AI and lets QA exercise each surface deterministically. UUIDs are deterministic (uuid5 over a fixed namespace) so re-runs upsert. Pass --reset to wipe and recreate. Each session gets two synthetic conversation messages so the chat header's canAct gate (messages.length >= 2) opens up Resolve/Escalate. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 00:08:38 -04:00
Michael Chihlas	875bd924a9	fix(pilot): auto-scroll Resolve preview into view when opened The ResolutionNotePreview popover renders inside TaskLane's overflow-y-auto region at the bottom of the lane. On a 720px viewport with the default question/check list expanded, the popover lands below the visible scroll position — the engineer clicks "Preview Resolve note", sees the button label flip to "Showing", but no preview appears on screen. Add a useEffect that calls scrollIntoView({block: 'nearest'}) on the popover's outer div whenever `open` flips to true. block: 'nearest' scrolls just enough to make it visible without yanking the lane to the top. Discovered during Phase 9 QA. Reproduced at 1280x720; fix verified visually in the same QA run (screenshots in .gstack/qa-reports/phase9-*/). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-24 23:45:52 -04:00
Michael Chihlas	49c6c8fd00	fix(seed): include cancel_at_period_end in test-user subscription INSERT Discovered during Phase 9 QA: seed_test_users.py was missing the cancel_at_period_end column in its subscriptions INSERT, but the column is NOT NULL (added in 016_add_subscription_tables.py). Result: seed crashed with NotNullViolationError before any users were created, blocking auth in fresh dev environments. Pre-existing on main; not introduced by the FlowPilot migration branch. Default value: false. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-24 23:36:04 -04:00
Michael Chihlas	a77e8ea578	chore: bootstrap gstack team mode Per gstack team-mode install: adds a PreToolUse hook that blocks skill usage when gstack isn't installed globally, so contributors are prompted to install it. Un-ignores the two required files (.claude/settings.json, .claude/hooks/check-gstack.sh) while keeping settings.local.json and other Claude state ignored. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-24 23:17:06 -04:00
Michael Chihlas	90252bc98f	docs(claude-md): expand gstack section with full grouped command list Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-24 23:17:01 -04:00
Michael Chihlas	036431aef8	chore(ai): update HANDOFF.md and SESSION_LOG.md for session end Reflect current state: dual-agent migration + Codex review round + branch cleanup (RLS test gating, Phase 9 docs, .remember/ gitignore, landing-handoff deletion). Working tree clean, no active task, 3 cleanup commits queued to push. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-24 16:16:55 -04:00
Michael Chihlas	b3be1e0749	chore: ignore .remember/ skill runtime state Runtime hook logs and PIDs from the remember skill — local-only, not repo content. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-24 16:09:23 -04:00
Michael Chihlas	b3506b5e73	docs(pilot): phase 9 review issues Review findings companion to docs/FlowAssist_Migration/Issues/phase-8-review-issues.md. Documents the issues addressed by commit `24972e8` (partial-outcome notes + per-fix script-builder remount). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-24 16:09:23 -04:00
Michael Chihlas	b14a16a1ab	chore(tests): gate RLS tests behind RUN_RLS_TESTS flag Continues the test-isolation work from `dab740d`. RLS migration tests run against a policy-installed database and fail in the default create_all suite, so they need to be opt-in: - pytest.ini: register `rls` marker. - conftest.py: auto-deselect test_rls_isolation.py unless RUN_RLS_TESTS=1. Drops the deprecated session-scoped event_loop fixture (not needed since pytest-asyncio 0.23+). - test_rls_isolation.py: tag module with `rls` marker. Replace hardcoded `patherly_test` DB reference with parsed DATABASE_TEST_URL (matches conftest.py default `resolutionflow_test`). Updated docstring command to show RUN_RLS_TESTS=1. - requirements-dev.txt: bump pytest-asyncio 0.23.0 → 0.24.0 (loop-scope marker behavior required by the RLS module fixture). Run the RLS suite with: RUN_RLS_TESTS=1 DB_APP_ROLE_PASSWORD=... pytest tests/test_rls_isolation.py Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-24 16:09:13 -04:00
Michael Chihlas	9c8ba296a8	fix(ai): correct stale role-hierarchy and file-listing claims Codex review of the dual-agent handoff migration flagged factual errors carried over verbatim from the pre-migration CLAUDE.md. All claims verified against the live code before correction. PROJECT_CONTEXT.md — SaaS shape: - Role hierarchy was `super_admin > team_admin > engineer > viewer`, but `backend/app/core/permissions.py:4` and `frontend/src/hooks/usePermissions.ts:4` both define it as `super_admin > owner > engineer > viewer`. The `team_admin` concept exists separately as an orthogonal team-scoped gate (`require_team_admin`, `is_team_admin=True` + valid `team_id`), not a level in the primary hierarchy. - Dep list was missing `require_account_owner` and `require_team_admin`, both present in `backend/app/api/deps.py`. PROJECT_CONTEXT.md — directory tree: - `api/endpoints/` comment listed 11 routers; `api/router.py` actually registers 50+. Replaced with a summary that points at `router.py` as the source of truth instead of trying to maintain a freezing list. - `services/psa/` comment omitted `exceptions.py` and `ticket_context.py`, both present in the directory. CURRENT_TASK.md + TODO.md: - Replaced `<!-- EXAMPLE -->` placeholders with clearer empty-state sentinels so a resume agent sees "no real task yet" at a glance rather than placeholder acceptance criteria that look unresolved. SESSION_LOG.md updated with a follow-up bullet documenting this pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-24 15:09:22 -04:00
Michael Chihlas	bee8690056	chore(ai): migrate to dual-agent handoff system Split the monolithic CLAUDE.md into a durable handoff system: - .ai/PROJECT_CONTEXT.md — stable architectural truth (stack, structure, SaaS shape, ConnectWise, coding standards, frontend patterns, critical lessons). Ported verbatim from the previous CLAUDE.md. - .ai/CURRENT_TASK.md — single active task with DoD + out-of-scope. - .ai/HANDOFF.md — resume point, kept under ~2K tokens. - .ai/TODO.md — backlog, read only when CURRENT_TASK complete. - .ai/DECISIONS.md — append-only architectural decision log. - .ai/SESSION_LOG.md — append-only chronological history. - .ai/README.md — human-facing explanation of the system. Root agent files share a byte-identical protocol block (verified via diff): - CLAUDE.md — primary agent, with GitNexus + gstack tooling and the Claude Opus 4.7 co-author trailer. - AGENTS.md — OpenAI Codex resume agent, with grep/rg fallbacks and the Codex co-author trailer. Steps in when Claude hits session/weekly limits. Legacy root-level SESSION-HANDOFF.md deleted — superseded by .ai/HANDOFF.md. It was a self-describing one-off from the Design System v4 migration and had no external references. Supersedes previous CLAUDE.md. Old version recoverable via `git show pre-ai-handoff:CLAUDE.md` (tag points at commit `e110fed`). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-24 14:50:41 -04:00
Michael Chihlas	e110fedfe4	chore: snapshot CLAUDE.md before ai-handoff migration	2026-04-24 14:21:21 -04:00
Michael Chihlas	dab740ddf7	fix(tests): isolate test DB from dev DB and plug admin-db override gap Root cause of the 06:32 AM outage: running 'pytest tests/' inside the resolutionflow_backend container silently dropped the public schema on the DEV database. Two layered bugs made this possible; both are fixed. Bug 1 — env-var lookup in conftest.TEST_DATABASE_URL put DATABASE_URL (which normally points at the dev/prod DB) ahead of DATABASE_TEST_URL. When DATABASE_URL is set, pytest used the dev DB as the 'test' DB and the test_db fixture's DROP SCHEMA public CASCADE wiped it. Fixed: - Honor only DATABASE_TEST_URL (or the localhost fallback). - Assert at module load that the DB name contains 'test' — refuses to run otherwise. Makes future misconfiguration impossible. Bug 2 — conftest overrode app.dependency_overrides[get_db] but not get_admin_db. Endpoints using get_admin_db (register, admin routes) bypassed the test session and hit the real admin DB. Before Bug 1 was fixed this was hidden because both engines pointed at the same dev DB. With isolation in place, register started failing 'Email already registered' because of stale users in the dev DB. Fixed: - Also override get_admin_db to yield the same test session. RLS is not enabled in the create_all-managed test schema, so sharing is safe. Also adds DATABASE_TEST_URL=resolutionflow_test to docker-compose.dev.yml so pytest in the container works out of the box. Verified: 49/50 Phase 8 + 9 tests pass against resolutionflow_test; the 1 failure is the pre-existing Phase 8 Issue #4 (test_record_decision_persists_and_bumps_state_version). Refs gitea #145 (will update that issue with this as the primary fix). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 13:14:08 -04:00
Michael Chihlas	24972e8444	fix(pilot): Phase 9 review — partial-outcome notes + per-fix script-builder remount Addresses docs/FlowAssist_Migration/Issues/phase-9-review-issues.md. Issue #1 (High): "Applied partially" from the escalation intercept silently dropped because the backend requires notes on applied_partial and the dialog sent none. The catch was silent and the UI advanced into the conclude flow as if the outcome were recorded. - EscalateInterceptDialog now has a two-step flow: clicking the partial choice reveals a notes textarea (autofocused, required non-empty) plus Back / "Record partial & escalate" buttons. - onChoose signature extended to (choice, notes?). - handleInterceptChoice passes notes to patchOutcome; on failure it surfaces a toast and does NOT advance to the conclude modal, so the intercept stays open for retry. Issue #2 (Medium/High): ScriptBuilderTab kept local state across active-fix changes within the same pilot session, so a stale draft could PATCH against a newer fix.id. Added key={activeFix.id} on the mount — forces a clean remount per fix; backend get-or-create (keyed on user+ai_session_id) still returns the same session row, which is the intended resume-on-refresh semantic; but messages/editorBuffer/latestScript local state resets. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 11:08:00 -04:00
Michael Chihlas	d386d11af2	docs(pilot): correct Phase 9 migration description Handoff + migration spec incorrectly claimed Phase 9 added a new parent_pilot_session_id FK. The implementation reuses the existing ai_session_id column; the migration only adds the origin discriminator + partial unique index. Also: ScriptBuilderTab wraps ScriptBuilderChat and ScriptBodyEditor (Monaco), not "ScriptBuilderChat in ephemeral mode" — there is no ephemeral mode on the presentational component. Applies applied_at call-site specifics: handleScriptDecision stamps on one_off/draft_template, TemplateMatchPanel stamps on onMarkRun, Script Builder tab Submit does not stamp. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 06:17:08 -04:00
Michael Chihlas	65a831bf9a	docs(pilot): Phase 9 handoff + migration spec update Marks open items #1 (NoTemplateDialog narrow-lane) and #3 (Tabbed Script Builder) as resolved. Records the applied_at semantics correction as shipped. Final Phase 9 row added to the 'What shipped' table. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 06:14:41 -04:00
Michael Chihlas	faf1d8dd12	fix(pilot): applied_at stamps on run-declaring actions, not Apply click Per Phase 9 §5. Before: banner Apply click stamped applied_at regardless of whether the engineer had committed to running anything, starting the Verifying timer prematurely. After: - handleApplyFix no longer calls applyFix(). It just routes to the right surface (TemplateMatchPanel / InlineNoTemplateDialog / Script Builder tab). - handleScriptDecision stamps applied_at for one_off + draft_template (both labels are 'Run now, …' — the click is the declaration). build_template does not stamp. - TemplateMatchPanel's new 'I ran this' button calls applyFix via a new onMarkRun prop. - Script Builder tab Submit does not stamp (a draft is not a run). No backend change — the /apply endpoint is unchanged. Only call sites move. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 04:11:56 -04:00
Michael Chihlas	0386fa1fd5	feat(pilot): mount ChatTabStrip + ScriptBuilderTab + InlineNoTemplateDialog Wires the three new components into AssistantChatPage: - ChatTabStrip renders when the active fix needs a script drafted. - ScriptBuilderTab sits alongside chat via display:none toggling so chat scroll position + builder state both persist. - InlineNoTemplateDialog replaces the task-lane bottomSlot render for the drafted-script evaluation case; three cards finally fit. - Banner Apply routing updated: no-draft/no-template → Script Builder tab; drafted → InlineNoTemplateDialog; template → unchanged path. applyFix() call site moves land in the next task. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 04:02:20 -04:00
Michael Chihlas	82db1c78e4	feat(pilot): EscalateInterceptDialog — fourth 'partial' choice Closes the gap Phase 8 final review flagged. When a fix is in applied_partial state and the engineer escalates, the intercept no longer forces them to approximate with didn't-work/worked/never-applied. AssistantChatPage's handleInterceptChoice (Task 13) already dispatches to patchOutcome for any FixOutcome value, so no handler change is needed — the type already supports applied_partial. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 03:04:05 -04:00
Michael Chihlas	f930787200	feat(pilot): TemplateMatchPanel — explicit 'I ran this' action Generate and Copy alone don't declare a run — the engineer can walk away after copying. Phase 9 §5 defines an explicit run-declaration affordance so applied_at only stamps on the engineer's positive commitment. Wiring from AssistantChatPage lands in Task 13. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 03:02:17 -04:00
Michael Chihlas	5bcb7aa7c3	feat(pilot): InlineNoTemplateDialog — chat-region placement wrapper Slide-up wrapper around the existing NoTemplateDialog for rendering in the chat region above the composer (parallel to ProposalBanner). The chat region's width lets grid-cols-3 finally work as intended. No change to NoTemplateDialog itself; decision callbacks and card copy stay identical. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 02:56:35 -04:00
Michael Chihlas	04fbfe3b8f	feat(pilot): ScriptBuilderTab controller Owns the inline Script Builder session lifecycle: - Get-or-create (origin='pilot_inline', ai_session_id) on mount. - Renders ScriptBuilderChat in AI mode and CodeModeEditor (Monaco) in 'Write it myself' mode. Mode toggles via display:none so buffer and messages persist across switches. - Submit → sessionSuggestedFixesApi.patchScript; emits onScriptDrafted to parent, which refreshes the fix and hides the tab strip. - Relays in-progress state to the parent via onProgressChange for the ChatTabStrip's indicator dot. ScriptBuilderChat is untouched (stays presentational). Persistence semantics live on the controller, not the display component. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 02:55:12 -04:00
Michael Chihlas	f92cbefed9	feat(pilot): ChatTabStrip component — [Chat] [Script Builder ●] Two-tab strip for the chat region. Parent controls mounting (strip only appears when the fix needs a script drafted). Indicator dot signals in-progress draft state. Tab switching via onChange callback; parent handles display:none toggling so tab contents preserve state. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 02:45:16 -04:00
Michael Chihlas	c9306e40c9	feat(pilot): frontend API client — patchScript + inline createSession sessionSuggestedFixesApi.patchScript(sessionId, fixId, script, params?) hits the new PATCH /script endpoint. scriptBuilder.createSession accepts an optional options bag with origin + aiSessionId, defaulting to standalone when omitted so legacy callers stay behavior-preserving. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 02:38:07 -04:00
Michael Chihlas	1c855563ee	feat(pilot): PATCH /suggested-fixes/:id/script endpoint Called by the inline Script Builder tab on Submit. Writes ai_drafted_script + ai_drafted_parameters to the fix without stamping applied_at (a draft is not an application — that's §5 of the Phase 9 spec). Bumps state_version so Resolve/Escalate preview bundles regenerate. 409 on terminal fix status. 404 on wrong session. 422 on empty script. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 02:34:06 -04:00
Michael Chihlas	d4fae87236	feat(pilot): inline Script Builder session — idempotent create + auth + filtered list POST /script-builder/sessions now supports origin='pilot_inline': - Requires ai_session_id; validates it against current user ownership. - Get-or-create: returns existing row for (user, ai_session_id) pair. - Partial unique index on the DB backs the invariant; races resolve to the single winner row. list_sessions + count_user_sessions default-scope to origin='standalone' so inline scratch sessions don't pollute the /script-builder dashboard or count against the 5-session cap. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 02:24:57 -04:00
Michael Chihlas	f2fce27f0d	feat(pilot): pydantic schemas for inline origin + script PATCH - ScriptBuilderCreateRequest gains origin ('standalone' | 'pilot_inline') and optional ai_session_id. Handler-side validation (next task) enforces pilot_inline ⇒ ai_session_id required + owned by caller. - SessionSuggestedFixScriptRequest added for the new PATCH /script endpoint (Phase 9 Task 6). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 01:53:28 -04:00
Michael Chihlas	93c974466a	feat(pilot): script_builder_sessions.origin on SQLAlchemy model Mirrors the DB column added in the prior migration. App-level default is 'standalone' so existing callers of ScriptBuilderSession(...) work without code changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 01:48:22 -04:00
Michael Chihlas	8012668975	feat(pilot): add origin + inline idempotency to script_builder_sessions Phase 9 prep. Adds: - origin VARCHAR(20) NOT NULL with CHECK ('standalone' | 'pilot_inline') - invariant: pilot_inline rows must have ai_session_id - partial unique index on (user_id, ai_session_id) WHERE origin='pilot_inline' — backs get-or-create idempotency for the inline Script Builder tab. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 00:22:53 -04:00
Michael Chihlas	563bb1aa6f	docs(pilot): Phase 9 implementation plan 14-task plan covering: - DB migration for origin + partial unique index on script_builder_sessions - Pydantic schemas for inline origin + PATCH /script - POST /script-builder/sessions idempotent for pilot_inline + auth - list_sessions + count_user_sessions filtered to standalone - PATCH /suggested-fixes/:id/script (bumps state_version, no applied_at) - Frontend API client additions - ChatTabStrip, ScriptBuilderTab (controller), InlineNoTemplateDialog - TemplateMatchPanel 'I ran this' action - EscalateInterceptDialog fourth 'partial' choice - AssistantChatPage integration + applyFix call-site relocation - Docs + handoff updates Paired with the spec at phase-9-script-builder-tab.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 00:03:57 -04:00
Michael Chihlas	1d2d548fc8	docs(pilot): Phase 9 spec — final consistency polish - Frontend scriptBuilder API client inventory now matches the backend schema: createSession accepts BOTH origin and ai_session_id (both required together for inline callers, both omitted for standalone). - 'If template -> unchanged' sharpened: render location is unchanged, but run stamping moves into the panel's new 'I ran this' action per the §5 apply lifecycle correction. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 23:54:04 -04:00
Michael Chihlas	3ee0101c6d	docs(pilot): Phase 9 spec — ownership + schema corrections - scriptBuilderMode ownership: pinned to ScriptBuilderTab, removed from AssistantChatPage's state list. Parent never drives the AI/editor toggle; controller owns it and resets naturally on session switch via unmount/remount. scriptBuilderHasProgress stays on the page (needed for the tab strip indicator dot) and is driven by the controller via an onProgressChange callback. - ScriptBuilderCreateRequest schema: explicitly calls for TWO new optional fields (origin + ai_session_id), not just origin. Handler enforces: when origin='pilot_inline', ai_session_id is required and must pass the current-user ownership check. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 23:49:08 -04:00
Michael Chihlas	861d082ff7	docs(pilot): Phase 9 spec — consistency pass on Apply stamp call sites Three consistency fixes: - File inventory (backend + frontend) now names all three apply-stamp call sites: handleScriptDecision('one_off' | 'draft_template') plus TemplateMatchPanel's 'I ran this' handler. Previously listed only 'one_off' in two places, contradicting the §5 lifecycle table. - NoTemplateDialog relocation section no longer claims the decision handler is 'unchanged' — it is unchanged EXCEPT for the moved apply stamp, which is the point of §5. - Open deferrals entry on ScriptBuilderChat 'ephemeral mode' removed; replaced with the actual new surface (ScriptBuilderTab controller), which reuses the existing script-builder prompt unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 23:41:17 -04:00
Michael Chihlas	75b59123e6	docs(pilot): Phase 9 spec — fix Apply semantics + session idempotency Four review findings addressed: - High: draft_template 'Run now, templatize after' DOES run the script; applied_at table now stamps for both one_off and draft_template. Only build_template (no run) skips the stamp. - Medium: TemplateMatchPanel needs an explicit '✓ I ran this' button. Generate/Copy don't commit to running. The new button is the stamp moment for template-match fixes. - Medium: get-or-create for inline script_builder_sessions — POST /script-builder/sessions is now idempotent for origin='pilot_inline' (returns the existing row for a (user, ai_session_id) pair). Backed by a partial unique index: UNIQUE (user_id, ai_session_id) WHERE origin = 'pilot_inline' so remount doesn't create duplicates and draft continuity is preserved. - Medium: authorization — the create endpoint validates that any provided ai_session_id is owned by the current user (same guard other pilot endpoints use). Prevents cross-user attachment of scratch sessions to arbitrary pilot sessions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 23:34:53 -04:00
Michael Chihlas	fcd224429c	docs(pilot): revise Phase 9 spec per review findings Four findings addressed: - High: drop proposed parent_pilot_session_id column; reuse the existing ai_session_id FK on script_builder_sessions. Add an origin + ai_session_id coherence invariant. - High: don't add a 'mode' prop to ScriptBuilderChat (it's presentational). Introduce a ScriptBuilderTab controller that owns session lifecycle + submit, renders ScriptBuilderChat unchanged. - Medium: filter list_sessions / count_user_sessions to origin='standalone' so pilot_inline scratch sessions don't pollute the /script-builder dashboard or count against the 5-session cap. - Medium: applied_at is stamped only when the engineer commits to a run-action (one_off, TemplateMatchPanel Run), not on banner Apply click. Corrects a Phase 8 over-eager stamp that would otherwise multiply across three surfaces. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 23:28:53 -04:00
Michael Chihlas	196c003876	docs(pilot): Phase 9 spec — tabbed Script Builder + NoTemplateDialog relocation Design doc for the FlowPilot migration's remaining open items: - NoTemplateDialog narrow-lane bug (resolved by moving the dialog to the chat region alongside ProposalBanner — three cards fit naturally at that width; grid-cols fix no longer needed) - Tabbed Script Builder inside the chat (new [Chat] [Script Builder ●] tab strip; AI chat default with 'Write it myself' Monaco escape hatch) Plus a Phase 8 cleanup: - EscalateInterceptDialog fourth 'I applied some of it — partial' choice All six architecture decisions settled via brainstorming before writing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 23:03:57 -04:00
Michael Chihlas	f2b9476edb	docs(pilot): log Issues #1-4 findings for Phase 8 review Tracks the three code-review issues that were fixed on this branch (#1 outcome-aware previews, #2 persist Apply, #3 persist proposal rejection) plus a newly-documented pre-existing test failure (#4 — decision-endpoint test written in Phase 3 never updated when Phase 5 added the drafted-script validation guard). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 22:18:13 -04:00
Michael Chihlas	70c5da0c75	fix(pilot): persist AI-proposal rejection + clear on outcome write Issue #3 from phase-8-review-issues.md. 'Not yet' on the AI-confirming banner was a local-state hide; the proposal re-surfaced on the next refreshSessionDerived call. Two-part fix: - PATCH /outcome now clears ai_outcome_proposal on any terminal action (engineer has taken a decision; stale AI proposal is moot). - New DELETE /ai-sessions/:sid/suggested-fixes/:fid/ai-outcome-proposal endpoint for explicit 'Not yet' rejection. Does not touch status or state_version — pure UI state. Frontend handleRejectAIProposal now calls the DELETE and setActiveFix with the server response. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 22:15:48 -04:00
Michael Chihlas	de2bef3175	fix(pilot): persist Apply — stamp applied_at on click Issue #2 from phase-8-review-issues.md. Apply was client-side-only via a bannerApplied flag. Refresh / chat reselect / multi-tab would drop Verifying state back to Proposed. - New POST /ai-sessions/{sid}/suggested-fixes/{fid}/apply stamps applied_at without changing status (still 'proposed'). Idempotent if already stamped; 409 if fix is past proposed (a terminal outcome was already recorded). - Bumps state_version so resolve/escalate preview bundles reflect that the fix has entered verifying. - Frontend handleApplyFix calls the endpoint and uses the returned applied_at directly. bannerApplied client flag is removed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 22:10:52 -04:00
Michael Chihlas	362c7b1d79	fix(pilot): outcome-aware Resolve/Escalate previews Issue #1 from phase-8-review-issues.md. Cache invalidation alone isn't enough — previews were also omitting outcome fields from the LLM bundle, so a fresh regenerate still couldn't distinguish proposed / failed / partial / success. - PATCH /outcome now bumps ai_sessions.state_version (matches record_decision's existing pattern). - Resolution-note + escalation-package bundles now include status, applied_at, verified_at, partial_notes, failure_reason on the active fix. - Generator prompts prescribe outcome-aware phrasing (closure language for success; what-we've-tried + next-steps for failed/partial). - New end-to-end test asserts the regenerated preview reflects the recorded outcome, not just that the cache key changed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 22:04:56 -04:00
Michael Chihlas	ec104dc8de	docs(pilot): sync Phase 8 handoff with actual implementation Correct the stale ai_sessions.fix_outcome reference (no such column) — the real schema adds six columns to session_suggested_fixes. Update last_commit to reflect the docs-correction tip. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 19:48:54 -04:00
Michael Chihlas	a47ce07326	docs(pilot): fix Phase 8 column + commit-SHA references Correct the FLOWPILOT-MIGRATION.md stale references to a non-existent ai_sessions.fix_outcome column — the actual implementation added six columns to session_suggested_fixes. Also fix a stale first-commit SHA (6721b84 → `cdd8bb0`, the former was amended away). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 17:42:51 -04:00
Michael Chihlas	2a54127a54	docs(pilot): Phase 8 fix outcome banner — handoff + migration spec Marks open item #2 (task-lane crowding / Suggested Fix discoverability) as resolved by Phase 8. Open items #1 (NoTemplateDialog narrow-lane) and #3 (Tabbed Script Builder inside chat) remain deferred. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 16:52:07 -04:00
Michael Chihlas	8582d24236	chore(pilot): remove deprecated SuggestedFix task-lane card Superseded by ProposalBanner (Phase 8). The import was already removed from AssistantChatPage in the previous commit; this deletes the orphaned file itself and strips the now-unused suggestedFixSlot prop from TaskLane's interface and both call sites. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 16:48:42 -04:00
Michael Chihlas	bdb238a274	feat(pilot): mount ProposalBanner + wire implicit signals Replaces the task-lane SuggestedFix card with the ProposalBanner docked above the chat composer. Wires: - Resolve-while-verifying auto-marks applied_success (one-click resolve). - Escalate-while-verifying opens EscalateInterceptDialog to capture the real outcome (default: didn't work) before handoff. - 3+ post-apply engineer messages trigger the passive Nudge banner. - AI [FIX_OUTCOME] proposals surface in the AIConfirming state; one-click confirm applies the outcome. Banner state resets on session switch via resetSessionDerivedState. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 16:42:01 -04:00
Michael Chihlas	075b0fc1d8	feat(pilot): EscalateInterceptDialog popover Anchored above the Escalate button, captures fix outcome before the engineer hands off the ticket. Defaults to 'didn't work' on Enter (the common case). Alternatives: 'worked, escalating for another reason' (preserves success) and 'never actually applied' (dismiss). Task 11 will wire this to AssistantChatPage's Escalate handler. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 15:48:33 -04:00
Michael Chihlas	217747f46e	feat(pilot): banner AI-confirming, Nudge, Collapsed states Completes ProposalBanner's state machine. AIConfirming (accent-blue) surfaces the AI's [FIX_OUTCOME] proposal with one-click accept; Nudge is the compact passive-prompt variant for post-apply chats; Collapsed is the 28px expand-hint strip. Adds onSilenceNudge prop so the parent can silence the nudge without collapsing it (Task 11 wires this). Removes the last three stale eslint-disable-next-line comments — all sub-components now use props. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 15:39:08 -04:00
Michael Chihlas	7fa1d6a32f	feat(pilot): banner Verifying + Partial states Verifying: amber pulse animation, confidence pill becomes 'Applied Xm ago', three actions (overflow for Mark partial, Didn't work, It worked). window.prompt used for the partial notes + failure reason inputs — good-enough v1 pending an inline composer. Partial: cyan-toned to signal 'parked, outcome unknown', shows saved notes inline, Finish it / Didn't work / It worked actions. Adds pulse-amber to @theme animations alongside slide-up. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 15:32:02 -04:00
Michael Chihlas	ac67e48500	feat(pilot): ProposalBanner scaffold + Proposed state New component that will replace the task-lane SuggestedFix card. Docks above the chat composer with a 320ms slide-up animation. This commit implements only the Proposed state (Tasks 8 & 9 fill Verifying, Partial, AI-confirming, Nudge, Collapsed). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 15:25:41 -04:00
Michael Chihlas	cdd29b460e	feat(pilot): frontend fix-outcome types + patchOutcome API Extends SessionSuggestedFix with outcome fields (status, applied_at, verified_at, partial_notes, failure_reason, ai_outcome_proposal) and adds a patchOutcome method hitting the new backend endpoint. FixStatus (5 values) + FixOutcome (4 writable values) mirror the backend Pydantic types and the DB check constraint. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 15:20:16 -04:00
Michael Chihlas	2cde6673b0	feat(pilot): [FIX_OUTCOME] system prompt instructions Tells the AI when + how to emit the [FIX_OUTCOME] marker that Task 4's parser consumes. Placeholder-only per the anti-parrot pattern — no literal UUIDs, outcomes, or reasons that could leak into unrelated sessions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 15:17:21 -04:00
Michael Chihlas	c0112f8bee	feat(pilot): [FIX_OUTCOME] marker parser + AI outcome proposal The AI emits [FIX_OUTCOME] when the engineer indicates in chat that a prior suggested fix worked, didn't work, or was partially applied. The marker writes to session_suggested_fixes.ai_outcome_proposal (JSONB), which the frontend surfaces as a "confirm outcome?" banner. The status column is only updated when the engineer clicks confirm (via PATCH /outcome endpoint from Task 3). Placeholder-only system prompt wiring comes in Task 5. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 15:08:43 -04:00
Michael Chihlas	8988dbc885	feat(pilot): PATCH /suggested-fixes/:id/outcome endpoint + tests Records engineer-reported outcome (applied_success|applied_failed|applied_partial|dismissed). Enforces transition rules (partial → success/failed allowed; terminal outcomes return 409) and notes requirements (applied_partial requires notes). Sets verified_at on success/failure, stamps applied_at if not already set (handles the case where the AI [FIX_OUTCOME] marker fires before the engineer clicks Apply). Also fixes pre-existing test-infrastructure bug: network_diagram.py used bare string server_default="'[]'" for JSONB columns, which asyncpg rejects during test schema creation. Changed to text("'[]'::jsonb") to match the pattern used by script_template.py. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 14:59:34 -04:00
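The transition rules described in 8988dbc885 can be sketched as a small pure function. This is a minimal illustration, not the project's code: `validate_outcome_transition` and `OutcomeError` are hypothetical names, the set of "terminal" statuses is an assumption, and the 422 used for the missing-notes case is an assumed status code (the commit message only confirms the 409).

```python
from typing import Optional

# Writable outcomes, as listed in the commit message.
WRITABLE_OUTCOMES = {"applied_success", "applied_failed", "applied_partial", "dismissed"}
# Assumption: these three are the "terminal" outcomes; applied_partial is not.
TERMINAL_STATUSES = {"applied_success", "applied_failed", "dismissed"}


class OutcomeError(Exception):
    """Hypothetical error carrying the HTTP status an endpoint might return."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code


def validate_outcome_transition(
    current: str, requested: str, notes: Optional[str] = None
) -> str:
    """Return the new status, or raise OutcomeError if the move is not allowed."""
    if requested not in WRITABLE_OUTCOMES:
        raise OutcomeError(422, f"unknown outcome: {requested}")
    # A terminal outcome has already been recorded: reject with 409.
    if current in TERMINAL_STATUSES:
        raise OutcomeError(409, f"fix already has terminal status {current}")
    # applied_partial must come with engineer notes (status code assumed).
    if requested == "applied_partial" and not (notes and notes.strip()):
        raise OutcomeError(422, "applied_partial requires notes")
    return requested
```

Keeping the rule check free of database access would make the partial → success/failed path and the 409 path easy to unit-test in isolation.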
Michael Chihlas	4a8e3ae954	feat(pilot): pydantic schemas for fix outcome patch Adds FixStatus literal (5 values matching the DB check constraint), extends SessionSuggestedFixResponse with outcome fields, and introduces SessionSuggestedFixOutcomeRequest for the PATCH /outcome endpoint coming in Task 3. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 14:44:39 -04:00
Michael Chihlas	cdd8bb05cc	feat(pilot): add outcome tracking columns to session_suggested_fixes Phase 8 prep for the fix outcome banner. Adds: - status (proposed|applied_success|applied_failed|applied_partial|dismissed) - applied_at, verified_at (timestamps) - partial_notes, failure_reason (engineer-provided context) - ai_outcome_proposal (JSONB for AI [FIX_OUTCOME] marker payloads) Backfills status='dismissed' from user_decision='dismissed'. status is orthogonal to user_decision — outcome (did the fix work?) vs script-path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 14:40:17 -04:00
Michael Chihlas	8879f96fbf	fix(pilot): drop sticky section headers in task lane Each lane section (What we know, Questions, Diagnostic Checks, Suggested fix) had its own `position: sticky; top: 0` header. As the engineer scrolled past a section, that section's header would pin until the section's bottom edge cleared the viewport, producing an "orphaned" label floating over unrelated content below. Headers now scroll with their content — in a 340px-wide lane the affordance was negative value. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 16:01:14 -04:00
Michael Chihlas	8a242f5db9	feat(pilot): Phase 7 — polish (loading/empty states, shortcuts, responsive drawer) - WhatWeKnow shows a "synthesizing" indicator + skeleton pulse while the chat cycle is in-flight; task-lane header mirrors the signal with a "thinking" pip so engineers know the AI is still working. - Quiet-state hint when the lane is open (facts exist) but no open questions, checks, or active fix — keeps the surface from looking "finished" when the AI is about to follow up. - Keyboard shortcuts: ⌘↵/Ctrl+↵ send in the composer (plain Enter still sends), ⌘G toggles the Script Generator panel for the active fix, `?` opens a new ShortcutsHelpOverlay listing all bindings. ⌘K palette was already wired in TopBar. - Responsive: below 1200px the task lane collapses to a bottom drawer with a backdrop + a floating "Tasks ●" toggle button. TaskLane now takes a `variant: 'side' | 'drawer'` prop; drawer variant drops the resize handle and uses the shared slide-in-bottom animation. - Build hygiene: fixed a pre-existing TS error in confirm-post error handling (duplicate `response` type keys) and an unused-import warning in TemplatizePrompt. Verified: `npx tsc -b` and `npm run build` both clean against the dev stack; Vite HMR applied each change without errors. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 14:19:44 -04:00
Michael Chihlas	4aaf57adb5	feat(pilot): Phase 6 — post-resolve templatize prompt + draft accept/reject Closes the loop on the Phase 5 "Run now, templatize after resolve" path. After a session resolves, drafts queued by the three-option dialog surface as a modal that lets the engineer review the AI-proposed parameterization and either save as a reusable team template or skip. A "don't ask again" toggle writes to account_settings.preferences so the next resolve won't pop the modal. Backend: - /api/v1/draft-templates: * GET — list account drafts (pending_only default true; pass false for audit view including accepted/rejected) * GET /{id} — single draft * POST /{id}/accept — promotes to a new script_templates row with source_session_id / source_user_id / source_ticket_ref populated (drives the Script Library "generated from CW #X · resolved by Y" provenance chip). Draft flips to status=accepted, promoted_template_id set, resolved_at stamped. 409 on re-accept / already-rejected. 400 on unknown category_id. * POST /{id}/reject — flips to status=rejected. 409 on re-reject. - /api/v1/accounts/me/preferences (GET/PATCH) — thin wrapper over AccountSettings.get_setting/set_setting. PATCH merges keys into the JSONB column, preserving existing keys the client didn't touch. Used by the "Don't ask again for this team" checkbox (templatize_prompt_enabled=false) and, forward-looking, by cw_resolved_status_id / cw_escalated_status_id from Phase 4. - 13 tests: list filter, accept with/without edited_body, provenance copy-through, reject, 409 on re-accept / re-reject, 400 on unknown category, prefs round-trip with merge semantics. Frontend: - src/components/pilot/script/TemplatizePrompt.tsx — modal showing the drafted script with proposed parameters in the Phase 5 ParameterizationPreview, editable name/category/description, an individual-parameter remove button, and the "don't ask again" opt-out. 
Accept posts to /draft-templates/{id}/accept + optionally PATCHes preferences. Skip posts /reject. - src/api/draftTemplates.ts — typed client plus accountPreferencesApi. - AssistantChatPage: after a successful Resolve (external OR local), fetches preferences + pending drafts for the session and queues the modal one draft at a time. Escalate does not trigger this flow. - Sidebar: Scripts nav shows the pending-draft count as a badge. Fetched independently of the main sidebar stats so endpoint flakes don't break the rest of the sidebar. Verified live 2026-04-22: seed two drafts → GET sees both pending → accept draft A (template created, provenance CW #99123 populated) → reject draft B → pending count drops → PATCH opt-out → GET confirms persistence. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 02:37:49 -04:00
Michael Chihlas	ddae171a37	fix(pilot): clear messages in resetSessionDerivedState — was leaking across chats Symptom: sidebar showed "User mjones got locked out … 0 messages" but the conversation pane was rendering 2 messages from a different chat. The task lane content matched what was displayed (so the AI was fine post-prompt-sweep) — the leak was purely UI: messages from the previous chat stayed on screen until the new chat's getSession returned. selectChat calls resetSessionDerivedState(), then awaits getSession before calling setMessages(detail.conversation_messages). Between the reset and that await, the prior chat's messages remain visible. handleNewChat already had an explicit setMessages([]) call so it was unaffected; selectChat did not. Folded setMessages([]) into resetSessionDerivedState so any new chat-switch entry point gets the wipe for free. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 02:15:39 -04:00
Michael Chihlas	d0ebdef9e8	fix(ai): full-sweep audit — placeholders only in system prompts + CI guardrail The "AI parrots example content from system prompt" bug bit us twice in one day across two different prompt sites. Patching individual prompts is treating the symptom; this commit makes the rule structural. Audit + sanitize: - assistant_chat_service.ASSISTANT_SYSTEM_PROMPT — already cleaned in prior commits, but the [FORK] schema still had literal "Brief reason" / "Short name" / "One sentence" placeholders. Replaced with <angle-bracket> placeholders. Anti-parrot rule itself rewritten to describe the failure mode abstractly instead of naming "jsmith" so the rule no longer trips the guardrail (and so the model doesn't see "jsmith" as a token at all). - ai_chat_service.py — removed three concrete-example offenders: "Get-Service ADSync" command literal, the "DC01 server_name" intake form payload (in two places), and the inline interview demos using "Azure AD Sync failures" / "Exchange Online mailbox migration". Replaced with technology-neutral schema descriptions. - ai_tree_generator_service.BRANCH_DETAIL_SYSTEM_PROMPT — replaced the fully-fleshed DNS troubleshooting tree (with literal Dnscache / ipconfig / google.com / Start-Service) with a placeholder schema showing only ID-linkage shape. - kb_conversion_service.PROCEDURAL_SYSTEM_PROMPT — replaced the worked Server Manager + DC01 example payload with a placeholder schema. Guardrail (tests/test_prompt_anti_parrot.py): - Imports every module under app/services/ and app/core/ and walks every uppercase string constant ending in _PROMPT, _SCHEMA, _PROTOCOL, _FORMAT, or _CONTEXT. - test 1: known-leaked-token list (jsmith, DC01, ADSync, Dnscache, google.com, "Outlook keeps", "Teams drops") must not appear in any prompt constant. Add to the list when a new leak shows up in prod — the list IS the audit trail. 
- test 2: marker blocks ([QUESTIONS], [ACTIONS], [SUGGEST_FIX], etc.) must contain placeholders only. Distinguishes JSON keys (followed by ':', allowed) from JSON values (followed by ',' / ']' / '}', must be <placeholder>); allows pipe-separated enum types (text|password|select) and a small set of fixed enum values (question, diagnostic_check, decision, action, ...). Verified by feeding the test a known-bad block — caught it correctly. Documented the rule in CLAUDE.md → AI / FlowPilot lessons, naming the test as the enforcement point so future contributors know how to extend it (add to the known-leaked list when a new leak surfaces). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 02:09:30 -04:00
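The first guardrail check in d0ebdef9e8 (walk prompt-like constants, reject known-leaked tokens) can be sketched roughly as follows. This is an assumption-laden illustration, not the contents of tests/test_prompt_anti_parrot.py: `iter_prompt_constants` and `find_leaks` are hypothetical helper names, matching is assumed to be case-sensitive, and the module is built in memory instead of being imported from app/services/.

```python
import types
from typing import Iterator, List, Tuple

# Suffixes and tokens are taken from the commit message.
PROMPT_SUFFIXES = ("_PROMPT", "_SCHEMA", "_PROTOCOL", "_FORMAT", "_CONTEXT")
KNOWN_LEAKED_TOKENS = [
    "jsmith", "DC01", "ADSync", "Dnscache", "google.com",
    "Outlook keeps", "Teams drops",
]


def iter_prompt_constants(module: types.ModuleType) -> Iterator[Tuple[str, str]]:
    """Yield (name, text) for uppercase string constants with a prompt-like suffix."""
    for name, value in vars(module).items():
        if name.isupper() and name.endswith(PROMPT_SUFFIXES) and isinstance(value, str):
            yield name, value


def find_leaks(module: types.ModuleType) -> List[Tuple[str, str]]:
    """Return (constant_name, leaked_token) pairs found in the module."""
    return [
        (name, token)
        for name, text in iter_prompt_constants(module)
        for token in KNOWN_LEAKED_TOKENS
        if token in text
    ]


# Hypothetical stand-in for a real service module.
demo = types.ModuleType("demo_prompts")
demo.GOOD_SYSTEM_PROMPT = "Ask about <symptom> on <hostname>."
demo.BAD_SYSTEM_PROMPT = "Run the check on DC01 for jsmith."
demo.helper_text = "DC01"  # lowercase name, so the walker ignores it
```

In a pytest suite the same helper could be applied to each imported module, with a failing assertion that names the offending constant and token.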
Michael Chihlas	50215b9110	fix(pilot): strip literal example content from system prompt — model was parroting The system prompt had a "Complete example of a correct first response" section with a specific Outlook/WiFi/jsmith scenario plus literal JSON payloads in [QUESTIONS], [ACTIONS], [SUGGEST_FIX], and [PROMOTE] markers. The model was emitting those literal strings (the same WiFi/laptop questions, the same "Clear cached credentials" suggested fix, the same "OWA login confirmed for jsmith" promote) on EVERY unrelated chat — making the task lane look like it was leaking previous-session data when in fact the AI was just reciting the prompt examples. Replaced literal example content with `<placeholder>` schemas. Added an explicit ANTI-PARROT RULE in the FINAL REMINDER section calling out that the angle-bracket placeholders show SHAPE, not CONTENT, with concrete examples of the failure mode (printer ticket → don't ask about Outlook; user not named jsmith → don't name jsmith). Same scrub applied to the FORK section's "Outlook AND Teams dropping" and the worked fork-flow example. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 01:36:29 -04:00
Michael Chihlas	ce7c8ac3d5	fix(pilot): wipe full task-lane state on chat switch + extract palette event Two fixes from the Phase 5 shakedown: 1. Stale lane data leaking across chats. handleNewChat, sendPrefill, and handleResumeNew were each missed when Phase 3/5 added activeFix, previewKind, previewData, and scriptPanelOpen — only selectChat reset the full set. Result: starting a new chat while a Suggested Fix card was active showed the previous session's fix card (and any open preview/script panel) until the next backend refresh swept it. Consolidated all four entry points behind a single resetSessionDerivedState() helper so adding new lane state in future phases only requires touching one place. 2. CommandPalette TDZ on cold load. SCRIPTS_INLINE_QUICK_ACTION (line 66) referenced PILOT_INLINE_SCRIPT_PATH declared at line 94 — module-level evaluation hit the use before the declaration. Browser blanked with "Cannot access 'PILOT_INLINE_SCRIPT_PATH' before initialization". Moved the path const above its first use; also extracted PILOT_INLINE_SCRIPT_EVENT into a tiny @/lib/pilotEvents module so AssistantChatPage doesn't import the palette component just to read a string — that mixed-export pattern broke Fast Refresh ("consistent components exports") and added an unnecessary import edge. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 01:30:18 -04:00
Michael Chihlas	fa61376303	feat(pilot): Phase 5 — inline Script Generator integration Wires the SuggestedFix card to an inline panel that handles both cases: template-matched fixes open the Script Library generator with parameters pre-filled from session context; un-matched fixes open the three-option dialog (one_off / draft_template / build_template). The decision endpoint records the path choice with side effects: draft_template persists a draft_templates row via a Sonnet-driven TemplateExtractionService; build_template returns a redirect to the Script Builder; one_off just records the choice. Backend: - TemplateExtractionService: drafts a parameter schema from a concrete rendered script. Conservative by default ("prefer fewer parameters"). Round-trip-validates that templated_body only references declared parameters; missing-key mismatch falls back to the original script with no params. LLM/parse failures fall back identically — the engineer can still create a draft and refine in the post-resolve prompt (Phase 6). - /suggested-fixes/{fix_id}/decision side effects: * one_off → returns rendered_script (engineer's edited version or the fix's ai_drafted_script verbatim) * draft_template → same + creates draft_templates row with extracted params, returns draft_template_id * build_template → returns redirect_path=/scripts/builder?from_session= &fix= so the frontend can navigate to the builder pre-loaded - 400 when a non-template fix has no ai_drafted_script (template-matched fixes take the dedicated /scripts/generate path, not this endpoint). - 12 tests: TemplateExtractionService parse + fallback paths, all four decision branches, edited_script override, missing-script 400. Frontend: - src/components/pilot/script/{TemplateMatchPanel, NoTemplateDialog, ParameterizationPreview}.tsx — inline panels rendered in the task lane's bottom slot when the engineer clicks a SuggestedFix card. 
- TemplateMatchPanel: loads template via /scripts/templates/{id}, pre-fills params from fix.ai_drafted_parameters with cyan "from session" tags, generates via existing /scripts/generate (already bumps state_version on ai_session_id from Phase 3). 404 falls back with a clear message instead of erroring. - NoTemplateDialog: shows the AI-drafted script with proposed parameter values highlighted in amber via ParameterizationPreview; three option cards with the middle (draft_template) flagged Recommended; inline edit on the script body before deciding. - SuggestedFix card now clickable: onActivate toggles the inline panel. - AssistantChatPage: scriptPanelOpen state + handleScriptDecision that navigates on build_template and toasts on the other paths. Active fix changes auto-close the panel so engineers don't act on stale state. - Cmd+K → "Open inline Script Generator" palette entry surfaces only on /pilot/:id routes; fires a window event the chat page subscribes to. No Resolve shortcut added per Section 14 decision (browser ⌘R conflict). Verified 2026-04-22 against the dev stack: - one_off / draft_template / build_template all return the right shape with real Sonnet TemplateExtractionService for the draft path. - Conservative extraction confirmed: cmdkey + Restart-Process script yielded zero proposed parameters as intended. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 00:15:29 -04:00
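The round-trip validation that fa61376303 attributes to TemplateExtractionService might look something like the sketch below. Everything here is illustrative: the `{{name}}` placeholder syntax is an assumption (the commit does not state the template syntax), and `validate_extraction` is a hypothetical function name.

```python
import re
from typing import List, Tuple

# Assumed placeholder syntax: {{parameter_name}}. The real engine may differ.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def validate_extraction(
    original_script: str, templated_body: str, declared: List[str]
) -> Tuple[str, List[str]]:
    """Accept the templated body only if every placeholder is declared.

    On any undeclared reference, fall back to the original script with
    no parameters, mirroring the fallback the commit message describes.
    """
    referenced = set(PLACEHOLDER.findall(templated_body))
    if not referenced.issubset(declared):
        return original_script, []
    return templated_body, list(declared)
```

A fallback that returns the untouched script keeps the draft usable even when the extraction output is inconsistent, which matches the "conservative by default" stance in the message.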
Michael Chihlas	8fd2c1bac6	feat(pilot): Phase 4 — Resolve + Escalate PSA writebacks with status verification Wires the preview popover's Confirm & post action to ConnectWise (and, via the provider pattern, any future PSA). Adds the parallel Escalate flow with the handoff-oriented five-section markdown. Sessions without a linked PSA ticket resolve/escalate locally — markdown stored, status flipped, nothing posted externally. Backend: - EscalationPackageGeneratorService: Sonnet, five sections (Problem / What we've confirmed / What we've tried / Current hypothesis / Suggested next steps). Shares the preview_cache with a separate KIND so Resolve and Escalate previews for the same state coexist. - PSAWritebackService: post_resolution_note (RESOLUTION note type, customer-visible), post_escalation_package (INTERNAL_ANALYSIS, handoff for the next engineer only), transition_ticket_status with mandatory re-fetch verification. PSAStatusVerificationError surfaces loudly when CW silently rejects a status change — the ConnectWise anti-pattern CLAUDE.md flags. - Endpoints: * POST /ai-sessions/{id}/escalation-package/preview * POST /ai-sessions/{id}/resolution-note/post * POST /ai-sessions/{id}/escalation-package/post Outcomes: "resolved" / "escalated" with external_id + verified status, "resolved_local" / "escalated_local" when no PSA linked. - Target CW status IDs live in account_settings.preferences (cw_resolved_status_id, cw_escalated_status_id). When unset, the post proceeds without a status transition — response includes a status_transition_skipped_reason rather than silently erroring. - 7 tests: local-only path, PSA happy path with verified transition, status verification failure → 502, skipped transition when unconfigured, 409 on already-resolved re-post, escalate parallel path, internal-analysis note type enforced. 
Frontend: - ResolutionNotePreview now kind-parameterized ('resolve' | 'escalate') with inline edit + Confirm & post. Preview loads from the matching backend endpoint; posting calls the matching endpoint; outcome toast surfaces the verified CW status or the local-only result. - AssistantChatPage: previewKind state replaces previewOpen; two toggle buttons (Preview Resolve note / Escalate instead) in the lane's bottom slot. handleConfirmPost dispatches by kind. Verified 2026-04-22: - Local-only Resolve + Escalate round-trip against the dev stack. - Live Sonnet escalation-package preview; cache hit on repeat call with no state change (separate cache kind from resolution-note). - PSA post + status-verification paths covered by mocked-provider pytest cases. Live CW round-trip pending a test CW instance. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 23:54:54 -04:00
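The "mandatory re-fetch verification" in 8fd2c1bac6 can be illustrated with a short sketch. Only the names `transition_ticket_status` and `PSAStatusVerificationError` come from the commit message; the provider methods (`update_ticket_status`, `get_ticket_status_id`) and the in-memory provider are hypothetical stand-ins, not the real ConnectWise client.

```python
class PSAStatusVerificationError(Exception):
    """Raised when the re-fetched status does not match the requested one."""


class InMemoryProvider:
    """Hypothetical PSA provider; accept_updates=False mimics a silent rejection."""

    def __init__(self, status_id: int, accept_updates: bool = True) -> None:
        self.status_id = status_id
        self.accept_updates = accept_updates

    def update_ticket_status(self, ticket_id: str, status_id: int) -> None:
        # A silently rejecting PSA returns success but leaves the status unchanged.
        if self.accept_updates:
            self.status_id = status_id

    def get_ticket_status_id(self, ticket_id: str) -> int:
        return self.status_id


def transition_ticket_status(provider, ticket_id: str, target_status_id: int) -> int:
    """Request the status change, then re-fetch and verify it actually applied."""
    provider.update_ticket_status(ticket_id, target_status_id)
    actual = provider.get_ticket_status_id(ticket_id)
    if actual != target_status_id:
        raise PSAStatusVerificationError(
            f"ticket {ticket_id}: expected status {target_status_id}, got {actual}"
        )
    return actual
```

An endpoint built on this could translate the exception into the 502 that the commit's test list mentions.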
Michael Chihlas	7ccf4c602b	fix(pilot): reorder Phase 3 useCallbacks to avoid TDZ on render refreshSessionDerived's dep array referenced refreshActiveFix and schedulePreviewRefresh before they were declared. React evaluates useCallback deps synchronously during render, so the page blew up with "Cannot access 'refreshActiveFix' before initialization" before a single render completed. Moved the three leaf helpers above the aggregator. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 23:44:19 -04:00
Michael Chihlas	66e592096c	feat(pilot): Phase 3 — Suggested fix tracking + Resolve preview with state_version cache Adds the AI-proposed resolution path and the inline preview of the markdown that will be posted to the customer ticket on Resolve. The preview is keyed on (session_id, ai_sessions.state_version) so back-to-back fetches against unchanged state hit an in-process cache instead of paying for a Sonnet call. Backend: - preview_cache: in-process LRU keyed on (kind, session_id, state_version). No TTL — state_version is the source of truth. Soft-cap 5000 entries. - unified_chat_service: [SUGGEST_FIX] parser (last-block-wins, JSON payload, confidence clamped 0-100), supersession persistence (sets superseded_at on prior active row), atomic state_version bump. - ResolutionNoteGeneratorService: pulls session, facts, active fix, and redacted script_generations into a structured input bundle for Sonnet; produces the four-section markdown (Problem / What we confirmed / Root cause / Resolution). Sensitive script parameters redacted via ScriptTemplateEngine.redact_sensitive driven by the template's parameters_schema. - /api/v1/ai-sessions/{id}/suggested-fixes/active — 200 with the active fix or 404. - /api/v1/ai-sessions/{id}/suggested-fixes/{fix_id}/decision — records one_off / draft_template / build_template / dismissed; dismiss supersedes; bumps state_version. 409 on dismissing an already-superseded fix. - /api/v1/ai-sessions/{id}/resolution-note/preview — generates or returns cached markdown; from_cache flag in payload signals cache hit. - scripts.py POST /generate now bumps state_version on the linked ai_session_id when present (third source of preview-cache invalidation per Section 5.5). - ASSISTANT_SYSTEM_PROMPT documents [SUGGEST_FIX] (when to/not to emit, format, supersession semantics). 
- 12 tests covering the parser (well-formed, last-wins, malformed, confidence clamping), supersession + state_version invariant, all decision branches, preview cache hit-on-no-change + miss-after-write. Frontend: - src/components/pilot/sections/SuggestedFix.tsx — amber-accented card with confidence badge; dismiss action wired to the decision endpoint. - src/components/pilot/ResolutionNotePreview.tsx — popover with refresh, loading state, cached/fresh indicator, ticket-ref display. - src/api/sessionSuggestedFixes.ts — typed client; getActive normalizes 404 to null so callers don't have to special-case. - TaskLane gains suggestedFixSlot + bottomSlot props (rendered after Diagnostic Checks; bottomSlot anchors the Resolve action). - AssistantChatPage: refreshSessionDerived helper batches fact + fix refresh; fact mutations and chat sends both schedule a 500ms-debounced preview refresh per the Section 5.5 spec. Verified end-to-end against the dev stack with a real Sonnet call: - /active 404 → fact create → preview generates four-section markdown grounded only in provided facts → second preview call hits cache (from_cache=true, no LLM call) → fact write 2 → cache miss, regenerates. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 21:45:52 -04:00
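The preview cache described in 66e592096c (in-process LRU, keyed on kind, session_id and state_version, no TTL, soft cap on entries) can be sketched with the standard library. This is a minimal illustration under stated assumptions: the class name `PreviewCache`, its method names, and the kind strings used in the tests are hypothetical, and the real module may be structured differently.

```python
from collections import OrderedDict
from typing import Optional, Tuple

CacheKey = Tuple[str, str, int]  # (kind, session_id, state_version)


class PreviewCache:
    """Tiny LRU cache with no TTL; a state_version bump simply misses."""

    def __init__(self, max_entries: int = 5000) -> None:
        self._entries: "OrderedDict[CacheKey, str]" = OrderedDict()
        self._max_entries = max_entries

    def get(self, kind: str, session_id: str, state_version: int) -> Optional[str]:
        key = (kind, session_id, state_version)
        if key not in self._entries:
            return None
        # Mark as most recently used.
        self._entries.move_to_end(key)
        return self._entries[key]

    def put(self, kind: str, session_id: str, state_version: int, markdown: str) -> None:
        key = (kind, session_id, state_version)
        self._entries[key] = markdown
        self._entries.move_to_end(key)
        # Evict least recently used entries once the cap is exceeded.
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)
```

Because state_version is part of the key, any write that bumps it makes the next lookup miss without explicit invalidation; stale entries age out through LRU eviction.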
Michael Chihlas	625dba7548	feat(pilot): Phase 2 — What we know (facts) with stable task-lane IDs Adds the load-bearing structural feature of the FlowPilot migration: a "What we know" panel that holds confirmed facts for a session, fed by AI [PROMOTE] markers and engineer-added notes. Facts feed the resolution note preview (Phase 3) and survive across turns via stable UUIDs assigned to pending_task_lane items. Backend: - FactSynthesisService: create/update/soft-delete facts with atomic state_version bumps; LLM-backed synthesize_from_question/check on the fact_synthesis (Haiku) action tier per Section 6.6. - /api/v1/ai-sessions/{id}/facts CRUD + /facts/promote (proposed_text or via synthesis). PATCH returns 403 for question/diagnostic_check facts (edit the source item instead, Section 7.3). - unified_chat_service: [PROMOTE] marker parser (JSON-block per Section 8.1 spec drift note), stable-UUID assignment for pending_task_lane questions/actions preserved by exact text/label match across turns. - ASSISTANT_SYSTEM_PROMPT: documents [PROMOTE] format, when to/not to emit, hallucination guardrails, source_ref handling. - 17 tests covering parser, stable IDs, service validation, CRUD, editability rule, both promote modes, 422 null-synthesis path, state_version invariant. Frontend: - src/components/pilot/sections/{WhatWeKnow,WhatWeKnowItem,AddNoteButton} — green-gradient section above Questions, dashed-circle check, inline edit/delete gated by the server's editable flag. - TaskLane gains a whatWeKnowSlot prop (existing assistant/ folder kept per the doc's "rename is opportunistic" guidance). - AssistantChatPage fetches facts on selectChat and refetches after each chat send (so [PROMOTE]-synthesized facts appear immediately); auto-opens the lane when facts exist. Verification: end-to-end smoke against the local docker stack confirms all five endpoints (list/create/patch/delete/promote) plus the 403 editability rule. pytest suite verifies the same with mocked LLM. 
Live [PROMOTE] flow remains untested until used in the UI — the marker shape is covered by parser tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 21:13:44 -04:00
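The stable-UUID assignment described in this commit (IDs preserved by exact text match across turns) can be sketched roughly as follows. This is an assumption-laden illustration, not the code in `unified_chat_service`: the function name `assign_stable_ids` and the item shape (`{"id": ..., "text": ...}`) are hypothetical.

```python
import uuid
from typing import Dict, List


def assign_stable_ids(
    previous: List[Dict[str, str]],
    incoming: List[Dict[str, str]],
) -> List[Dict[str, str]]:
    """Carry IDs over from the previous turn when an item's text matches exactly.

    Items whose text was not present in the previous turn get a fresh UUID,
    so anything pointing at an existing ID keeps a valid target.
    """
    # Hypothetical item shape: {"id": "<uuid>", "text": "<question or label>"}
    id_by_text = {item["text"]: item["id"] for item in previous}
    result = []
    for item in incoming:
        item_id = id_by_text.get(item["text"]) or str(uuid.uuid4())
        result.append({**item, "id": item_id})
    return result
```

Matching on exact text is deliberately strict: a reworded question is treated as a new item and receives a new ID.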
Michael Chihlas	19cfd71995	chore(flowpilot-migration): remove migration handoff note after verification Gate 1 complete on Proxmox dev host (docker-01): - Alembic at f07010f17b01 (single head); downgrade/upgrade roundtrip clean. - Phase 0 prompt-cache verified: direct provider probe shows cache_create=5398 → cache_read=5398 across two calls; chat path emitted two anthropic.cache events 55s apart on a real FlowPilot session. - Frontend npm run build clean (57.63s, no TS errors, no stale FlowPilotSessionPage imports). - /assistant/:id → /pilot/:id redirect fires correctly and session detail loads (GET /api/v1/ai-sessions/<id> 200); a blank-until-click UX polish will be tracked separately. - Dashboard session-tile dispatcher routes to /pilot/:id. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 01:21:08 -04:00
Michael Chihlas	3b55697c77	dev-env(proxmox): switch compose to direct-port exposure; document homelab topology - docker-compose.dev.yml: drop Traefik/dev.resolutionflow.com labels, expose backend:8000 and frontend:5173 directly; swap relative bind mounts for ${REPO_ROOT}/... so compose works when driven from inside a code-server container with the host Docker socket mounted; default POSTGRES_PORT to 5433 host-side; add explicit uvicorn/npm run dev commands; add ENABLE_MCP_MICROSOFT_LEARN and docker-01/Tailscale CORS origins. - frontend/vite.config.ts: replace dev.resolutionflow.com with allowedHosts=['docker-01', '.ts.net', 'localhost'] for direct-port access over the private network. - DEV-ENV.md: add Section 11 reference topology for the homelab Proxmox + code-server Option B setup, plus troubleshooting entries for the REPO_ROOT-empty-mount trap and the Vite allowedHosts rejection. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 00:18:31 -04:00
Michael Chihlas	851966966d	docs(claude-md): compact CLAUDE.md for 2026-04-19 baseline Trim from 570 → 264 lines. Archived lessons and fixes-in-code remain in docs/LESSONS-ARCHIVE.md; CLAUDE.md now only carries what a fresh session can't derive from the repo state. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 00:18:15 -04:00
Michael Chihlas	66968e4c59	docs(flowpilot-migration): add ephemeral migration handoff note Self-contained status snapshot for picking up Phase 0 + Phase 1 work after the Proxmox dev-environment move. Lists what is done, what is owed (the Gate 1 verification checklist), known drift, and the recommended order of operations after the move. Explicitly ephemeral — the doc instructs the reader to delete it once Gate 1 verification has passed. Durable dev-env setup lives in DEV-ENV.md; this file covers only the "where is the work right now" handoff for this specific migration. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-19 22:31:16 +00:00
Michael Chihlas	b0622f5511	docs(dev-env): rewrite DEV-ENV.md for host-agnostic setup The previous version was tightly coupled to the Hostinger VPS at 46.202.92.250 — hardcoded IP, Traefik/Let's-Encrypt assumption, specific Docker-volume paths. Rewriting ahead of the Proxmox migration so a fresh clone on any Linux host (LXC, VM, bare metal, VPS) can stand up a working dev environment without pre-baked assumptions about topology. Structural changes: - Introduces Option A (all-in-one host) / Option B (Docker Compose) / Option C (split services) topology choice up front, so readers commit to one shape before touching commands. - Adds a "per-host configuration" template the reader fills in once (DEV_HOST, POSTGRES_PORT, SECRET_KEY, API keys), referenced by name throughout the rest of the doc. No more hardcoded IPs. - Adds an explicit verification section (Section 6) with concrete expected outcomes: alembic head, reversibility, prompt-cache hit, frontend build, /assistant→/pilot redirect, dispatcher routing, CORS. - References the Phase 0 TODO(phase0-verify) in ai_provider.py and the expected alembic head (f07010f17b01) as of the current branch. - Adds a troubleshooting section pulling in CLAUDE.md lessons that bite people repeatedly: stale Vite env vars, RLS policy violations, EACCES on dist/, multi-head alembic state, invisible cache misses. - Documents the structured log events the backend emits (anthropic.cache, mcp.turn, mcp.fallback) so readers know what to grep for during verification. Deliberately excluded: - Production deployment (lives in CLAUDE.md Deployment section). - Reverse-proxy configuration (whatever the reader prefers). - code-server install specifics (Docker vs LXC vs native is reader's choice; once running, this doc applies). - Proxmox-specific instructions — the doc is host-agnostic so it survives the next migration as well. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-19 22:31:03 +00:00
Michael Chihlas	f3c3ee5b57	feat(pilot): unify AI troubleshooting surface at /pilot, redirect /assistant (Phase 1) Collapses the pre-existing dual-surface setup (AssistantChatPage at /assistant, FlowPilotSessionPage at /pilot) into a single chat-primary surface per architectural claim #1 of FLOWPILOT-MIGRATION.md. Router changes (frontend/src/router.tsx): - /pilot and /pilot/:sessionId now render AssistantChatPage. - /assistant redirects permanently to /pilot via <Navigate replace>. - /assistant/:sessionId redirects to /pilot/:sessionId preserving the ID via an AssistantSessionRedirect helper that reads the param. - FlowPilotSessionPage is no longer imported or mounted. Per the beta-history-disposable decision, the file stays on disk for reference but is unreachable; delete once nothing else in the tree imports it. Dispatcher de-branching — previously these sites routed by session_type (chat -> /assistant, otherwise -> /pilot). All now unconditionally go to /pilot/:id since session_type is no longer used for frontend routing: - components/dashboard/ActiveFlowPilotSessions.tsx - components/dashboard/RecentFlowPilotSessions.tsx - components/flowpilot/AISessionListItem.tsx (keeps isChat for icon selection, but linkTo is unconditional) User-facing label + navigation updates: - components/layout/CommandPalette.tsx: "AI Assistant" palette entry becomes "FlowPilot" pointing to /pilot; the sparkles quick-action also routes to /pilot. - components/dashboard/StartSessionInput.tsx: both navigate() call sites now go to /pilot instead of /assistant. - lib/routePrefetch.ts: prefetch entry for AssistantChatPage keyed to /pilot (the real surface) rather than /assistant (now redirect-only). Preserved intentionally (not user-facing routes): - Backend /assistant/retention API path and the assistantChatApi module name — those are internal API and module identifiers, not SPA routes. 
- src/components/assistant/* and src/types/assistant-chat — TypeScript module paths, not routes. - Sidebar.tsx — no top-level AI entry existed to rename; /pilot is already in the History group's matchPaths. Whether FlowPilot deserves its own rail entry is a future UX decision, not Phase 1 scope. - FlowPilotAnalyticsPage at /analytics/flowpilot — analytics for the unified product, not guided-only, per the agreed Q16 interpretation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 18:48:00 +00:00
Michael Chihlas	b49772f1a1	feat(models): Phase 1 SQLAlchemy models — SessionFact, SessionSuggestedFix, DraftTemplate, AccountSettings Backs the schema added in `210d310` with SQLAlchemy 2.0 models. - SessionFact: "What we know" facts with polymorphic source_ref pointing at task-lane item UUIDs inside ai_sessions.pending_task_lane (not a FK per Section 4.2). - SessionSuggestedFix: AI-proposed resolutions with supersession tracking and the full user_decision state machine. - DraftTemplate: post-resolve templatization queue with promotion to script_templates. - AccountSettings: per-account JSONB preferences grab-bag with async classmethod helpers — get_setting(db, account_id, key, default) reads without creating, set_setting(db, account_id, key, value) upserts via Postgres ON CONFLICT + jsonb `\|\|` merge so existing keys are preserved. Lazy row creation matches the Phase 1 design. Column additions on existing models to mirror the migration: - AISession: resolution_note_* / escalation_package_* / state_version (the preview-cache-invalidation counter consumed by Phase 3). - ScriptTemplate: source_session_id / source_user_id / source_ticket_ref (provenance for templates promoted from DraftTemplate). All four new models registered in app.models.__init__ and __all__. TYPE_CHECKING-guarded relationship imports throughout, matching the repo's existing model style. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 18:35:00 +00:00
Michael Chihlas	210d310fb2	feat(db): Phase 1 schema — session_facts, suggested_fixes, draft_templates, account_settings Adds the backing store for the FlowPilot unified session surface, per the FLOWPILOT-MIGRATION.md Phase 1 deliverable. Descends from production head 074 (add_network_diagrams_table). New tables (all tenant-scoped, all RLS-enabled + forced): - session_facts — "What we know" facts. source_ref is a polymorphic pointer to a task-lane item inside ai_sessions.pending_task_lane (no DB-level FK; integrity enforced at service layer per Section 4.2 of the design doc). Soft-delete via deleted_at; active-facts partial index excludes deleted rows. - session_suggested_fixes — AI-proposed resolutions. One active per session at a time (supersession tracked via superseded_at; partial index on (session_id) WHERE superseded_at IS NULL powers the "find active fix" query). - draft_templates — scripts pending post-resolve templatization. Partial index on (account_id) WHERE status='pending' supports the "N scripts ready to review" Script Library badge. - account_settings — new per-account table with JSONB preferences grab-bag. Rows created lazily on first write; get_setting returns default when no row exists. Column additions on ai_sessions: - resolution_note_markdown / posted_at / external_id - escalation_package_markdown / posted_at / external_id - state_version (INTEGER NOT NULL DEFAULT 0) — incremented atomically by any write that invalidates the resolution note preview cache per Section 5.5. Phase 3 consumes this. Column additions on script_templates: - source_session_id, source_user_id, source_ticket_ref — powers the "generated from CW #X · resolved by Y · used N times" provenance chip in the Script Library. RLS pattern matches the repo convention (074 / network_diagrams is the nearest template): ENABLE + FORCE, USING + WITH CHECK on `account_id = app.current_account_id`. Downgrade is reversible — drops in the inverse order of creation so FK dependencies unwind. 
No runtime verification from code-server; migration apply + downgrade will be verified on the new dev environment per the standing deferral. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 18:14:26 +00:00
Michael Chihlas	92fadfb90a	docs(flowpilot-migration): integrate Codex plan review + Phase 0 audit findings Significant rewrite of FLOWPILOT-MIGRATION.md after post-Codex plan review and the Phase 0 in-flight audit. Archives the pre-rewrite version as FLOWPILOT-MIGRATION-v1.md and keeps the Codex review under CODEX-FlowAssist-Migration-PLAN.md for traceability. Substantive changes that affect implementation: - Section 0.1 adds a spec-drift note listing corrections integrated into this revision (API namespace, task-lane item UUIDs, account_settings creation, missing /tickets/ai-parse endpoint). - Section 2 adds "Task lane item ID" terminology — stable UUID assigned to items inside ai_sessions.pending_task_lane so session_facts.source_ref has something reliable to point to. - Section 4.1 adds ai_sessions.state_version (INTEGER NOT NULL DEFAULT 0) and escalation_package_external_id. state_version drives preview cache invalidation; incremented atomically on writes to facts / suggested fixes / script_generations. - Section 4.6 creates account_settings as a new table with JSONB preferences column, lazy row creation, and a promotion rule for when a setting should graduate to a typed column. - Section 5 namespaces all session-scoped routes under /api/v1/ai-sessions/{id}/... to match the existing codebase pattern. - Section 5.5 documents the preview caching strategy (state_version keyed, 500ms client debounce, Redis planned). - Section 6.6 adds per-service MCP capability flags alongside the model tier flags. - Section 7.1 makes the /assistant -> /pilot redirect include the session-deep-link path and preserve the session ID. - Section 8.2 adds supersession semantics for [SUGGEST_FIX] markers. - Section 9 Phase 1 now explicitly includes account_settings and state_version; Phase 3 uses state_version-keyed caching; Phase 5 mentions MCP inheritance via chat_call_cached wrapper. - Section 11 adds a dedicated test plan (migrations, backend, frontend, manual QA). 
- Section 14 captures the eight planning decisions made during the Phase 0 conversation so they are traceable. No code changes in this commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 17:05:04 +00:00
Michael Chihlas	3f0a132058	refactor(ai): rename _call_anthropic_cached → chat_call_cached; extract cache plumbing (Phase 0.4) Renames the chat caller to a name that signals its actual purpose, and factors the reusable cached-system-block + cached-history + cache-usage-log primitives out to app.core.ai_provider so they can be shared with the provider-generic path without pulling MCP/beta/images into the abstract interface. Helpers added to ai_provider.py: - `build_anthropic_chat_messages(history, new_message, images, format_reminder)` — owns: copy history, apply cache_control to last history message, append format reminder to new message, render images as multimodal blocks. Anthropic-shaped by design; do not call from Gemini paths. chat_call_cached keeps exactly the concerns that are unique to the one MCP/beta/multimodal chat caller: - Anthropic beta endpoint invocation - Microsoft Learn MCP server wiring (ENABLE_MCP_MICROSOFT_LEARN) - Retry-without-MCP fallback - Format-reminder content string (declared as module constant) - Phase 0.5 telemetry (mcp.turn, mcp.fallback) Documents in the module docstring AND at the function site that this is the ONE MCP/beta chat caller and should not become the general provider path. MCP/beta/images are features of exactly one optional Anthropic beta endpoint; routing them through AnthropicProvider would leak a provider-specific concern into the abstract interface that also serves Gemini. Behavior change: chat_call_cached now reuses the singleton AnthropicProvider HTTP client via `_get_anthropic_client(...)` instead of instantiating a new `anthropic.AsyncAnthropic(...)` per call. Matches the provider's own pattern and avoids burning connections per-turn. No user-visible difference. No runtime verification from code-server. TODO(phase0-verify) in ai_provider.py tracks the cache-hit verification owed on the new dev env. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 17:03:09 +00:00
Michael Chihlas	da93ae55c3	feat(ai): opt-in structured-system-block caching for one-shot generators (Phase 0.3) Wraps each static system prompt in a single-block list so Phase 0.1's AnthropicProvider applies cache_control: ephemeral automatically (policy α, first block gets marked when no caller-authored cache_control is present). Call sites: - ai_tree_generator.scaffold_branches: SCAFFOLD_SYSTEM_PROMPT (~1k tokens) - ai_tree_generator.generate_branch_detail: BRANCH_DETAIL_SYSTEM_PROMPT (~2.5k tokens with few-shot example); retries inside the same function re-read the cached block instead of paying full input cost on each attempt - kb_conversion.convert_document: TROUBLESHOOTING or PROCEDURAL prompt (each caches independently by text content) - ai_fix.generate_fixes: FIX_SYSTEM_PROMPT on first attempt + corrective retry - script_builder.send_message: SYSTEM_PROMPT_TEMPLATE (per-session language substitution — same-language sessions share cache entries) Each edit includes an inline comment explaining why the block is cacheable (stable-constant, retry-reuse, per-language variant) so a future dev can see the intent at the cache_control marker site. script_builder history caching deliberately deferred — per Phase 0.1 decision (option i), AnthropicProvider does not automatically cache the message list. If script_builder's growing 20-message history turns out to be a visible cost driver via the anthropic.cache telemetry, route that caller through the 0.4 chat wrapper which handles history caching. No runtime verification from code-server; cache-hit behavior will be confirmed against the new dev environment when it's up, per the inline TODO(phase0-verify) in ai_provider.py. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 16:29:45 +00:00
Michael Chihlas	56fd440b16	docs(flowpilot-migration): flag Phase 0.2 as pending-endpoint; target not yet built The /tickets/ai-parse endpoint named in Phase 0.2 does not exist in the codebase (verified: zero matches for ai-parse/ai_parse across endpoints, services, models, and all branches/commit messages). integrations.py:557 is get_ticket_statuses — a CW passthrough with no AI call. Adding a block-quoted note under the 0.2 deliverable that flags the drift, records the cached-system-block pattern to apply when the endpoint is built, and instructs the next editor to remove the note once applied. No implementation change this commit — guidance only. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 16:24:33 +00:00
Michael Chihlas	b3be66652e	feat(ai): structured-system-block caching in AnthropicProvider (Phase 0.1) Widens AIProvider.generate_json / generate_text / generate_text_stream signatures to accept `system_prompt: str \| list[SystemBlock]`: - `str` (the existing call shape): passes through uncached, unchanged behavior. Every existing caller stays on the uncached path — no silent behavior change. - `list[SystemBlock]`: enables Anthropic prompt caching via structured system blocks. Caller-authored `cache_control` is honored verbatim (policy α); if no block carries it, the provider applies `cache_control: {"type": "ephemeral"}` to the first block only. Gemini ignores cache_control and concatenates list entries into one system string — the widened signature is strictly additive on that path. Adds `anthropic.cache` structured-log telemetry: on every Anthropic response (streaming included, via `stream.get_final_message()`), logs `cache_read_input_tokens` and `cache_creation_input_tokens`. Telemetry failure in streaming is swallowed so the user-facing stream never breaks. Verification deferred: cannot run from code-server (no Python, no DB, no dev env). TODO(phase0-verify) left inline in the module docstring. First verification task on the new dev environment is to hit any FlowPilot endpoint twice within 5 minutes and confirm the second call shows cache_read_input_tokens > 0 in the `anthropic.cache` log event. If verification fails, that's a debug task on the new env — not a blocker for continuing Phase 0.2/0.3/0.4. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 16:17:12 +00:00
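The "policy α" rule in this commit (honor caller-authored `cache_control` verbatim; otherwise mark only the first block as ephemeral; pass plain strings through uncached) can be sketched as below. The function names `apply_cache_policy` and `flatten_for_gemini` are hypothetical, the `SystemBlock` alias is assumed to be a plain dict, and the blank-line separator used when flattening is an assumption not stated in the commit.

```python
import copy
from typing import Any, Dict, List, Union

SystemBlock = Dict[str, Any]  # assumed shape: {"type": "text", "text": "..."}
SystemPrompt = Union[str, List[SystemBlock]]


def apply_cache_policy(system_prompt: SystemPrompt) -> SystemPrompt:
    """Sketch of policy alpha for the Anthropic path."""
    if isinstance(system_prompt, str):
        return system_prompt  # existing call shape: stays uncached
    blocks = copy.deepcopy(system_prompt)  # never mutate the caller's list
    if blocks and not any("cache_control" in block for block in blocks):
        # No caller-authored marker: cache the first block only.
        blocks[0]["cache_control"] = {"type": "ephemeral"}
    return blocks


def flatten_for_gemini(system_prompt: SystemPrompt) -> str:
    """Sketch of the Gemini path: ignore cache_control, join into one string.

    The "\n\n" separator is an assumption made for this example.
    """
    if isinstance(system_prompt, str):
        return system_prompt
    return "\n\n".join(block["text"] for block in system_prompt)
```

Working on a deep copy keeps module-level prompt constants unchanged when the same list is passed on every call.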
Michael Chihlas	0fbc1e0a57	feat(telemetry): add MCP per-turn structured-log telemetry (Phase 0.5) Emits structured `mcp.turn` log events on every Anthropic-path chat turn, capturing whether MCP was wired in (mcp_available), whether the model actually invoked an MCP tool (mcp_invoked), which tool names fired, and whether the silent retry-without-MCP fallback was triggered. Adds a separate `mcp.fallback` event with error type/message for fallback occurrences. Establishes baseline data for deciding whether MCP investment is earning its keep before Phase 2+ expands the product footprint. Scope: the one MCP-using code path (`_call_anthropic_cached`) — not a general instrumentation layer. No new dependencies, no schema changes, no behavior change. Standard library `logging` is the sink; PostHog is not wired on the backend. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 15:57:13 +00:00
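A per-turn `mcp.turn` event of the kind described in this commit could be emitted with the standard library `logging` module roughly as follows. The field names `mcp_available` and `mcp_invoked` come from the commit message; the function name, the logger name, and the remaining field names (`mcp_tools`, `fallback_triggered`) are hypothetical.

```python
import logging
from typing import Any, Dict, Iterable

logger = logging.getLogger("mcp_telemetry")  # hypothetical logger name


def build_mcp_turn_event(
    *,
    mcp_available: bool,
    tool_names: Iterable[str],
    fallback_triggered: bool,
) -> Dict[str, Any]:
    """Build and log one structured mcp.turn event; returns the payload."""
    tools = sorted(set(tool_names))
    payload = {
        "event": "mcp.turn",
        "mcp_available": mcp_available,   # was MCP wired into this turn?
        "mcp_invoked": bool(tools),       # did the model call any MCP tool?
        "mcp_tools": tools,               # which tool names fired
        "fallback_triggered": fallback_triggered,
    }
    # Structured fields travel in `extra` so a JSON formatter can pick them up.
    logger.info("mcp.turn", extra={"mcp": payload})
    return payload
```

Returning the payload makes the event easy to assert on in tests without attaching a log handler.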
Michael Chihlas	46291f30b9	docs: add FlowPilot migration design doc and mockups Brings the locked FlowPilot migration design onto the branch that will implement it. Includes the annotated target UI mockups (primary session view + three Script Generator integration states) and the superseded FLOWPILOT-AND-RESOLUTIONASSIST.md for historical reference. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 15:22:39 +00:00
