resolutionflow

Author	SHA1	Message	Date
Michael Chihlas	2a2329ad19	docs(ai): handoff state after bell-icon fix; record draft PR #155 Updates the handoff trio after the legacy notification flow fix and the branch push. PR #155 is open against main as draft. Resume point is now visual QA via /qa, then deferred follow-ups (chat-input suggested-step chips, snapshot expansion). Logs the open question about whether EscalateModal should switch to /handoff. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 21:33:44 -04:00
Michael Chihlas	641853a002	fix(escalations): bell-icon notification opens the pickup flow Two backend changes that unbreak the senior-pickup path from the notification panel: 1. notification_service: session.escalated link template now ends with ?pickup=true so the senior lands in the handoff/pickup flow on click. Without it, navigation hit /pilot/:id directly, which then 404'd on the GET because the senior isn't yet escalated_to_id — the user perceives this as the bell-icon "just clearing the notification". 2. ai_sessions GET access: any account member can now read an escalated session's detail when status is requesting_escalation or escalated. The owner-only guard was overly restrictive for explicitly-shared in-transit states. Tenant boundary is enforced by RLS on the underlying query, so account-scope is the right ceiling here. After pickup, the existing handler/escalated_to_id checks still apply. Verified live: re-login as the senior engineer and GET the active escalated session — now returns 200 with full detail. Focused test subset plus tests/test_sessions.py and tests/test_session_sharing.py → 94 passed in 43.26s, no regressions. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 21:29:47 -04:00
Michael Chihlas	c194ba4a43	docs(ai): handoff state after magic-moment screen lands Marks the magic-moment handoff-context screen as shipped, points the next session at visual QA + push + draft PR, and captures the deferred follow-ups (suggested-step chips, snapshot expansion, toolbar button on revisits, owner analytics, Playwright e2e). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 21:08:07 -04:00
Michael Chihlas	8e9d22e0e0	feat(escalations): magic-moment handoff-context screen on pickup Adds the dedicated 4-section handoff-context view that renders BEFORE the FlowPilot session for senior techs picking up an escalated session, then dissolves on "Start here". This is the wedge's demonstrable magic moment — what the GTM Loom records. - HandoffContextScreen.tsx: pure presentational, takes a HandoffResponse plus onStartHere / onDismiss callbacks. Sections: header (problem summary, domain, step count, escalated-time, priority badge), "What's been tried" (engineer notes + step-count affordance), "AI assessment" (likely_cause / suggested_steps / confidence badge), Start here CTA. Confidence badge accepts both numeric (0..1) and string ("low"/"medium"/"high") shapes — backend currently emits the latter. Renders an explicit "assessment unavailable" branch when ai_assessment_data is null (the 5s timeout from `9bdd995` fired). Honors prefers-reduced-motion (animate-fade-in vs animate-slide-up). ARIA dialog + focus on the primary CTA. Esc dismisses when used as a re-openable overlay; pre-claim, Start here is the only exit. - FlowPilotSessionPage.tsx: on /pilot/:id?pickup=true, fetch the handoff list via handoffsApi.listHandoffs (account-scoped via RLS, no claim required) and find the latest unclaimed escalate handoff. If found, render the magic-moment screen and skip the regular loadSession (the senior isn't yet escalated_to_id, so GET would 404). Start here calls claimHandoff, drops the pickup query param, dismisses the screen — the existing loadSession effect then fires because the senior is now escalated_to_id. A "Context" toolbar button on active sessions re-opens the screen as a dismissible overlay (visible only when the senior arrived via the magic-moment flow this session — handoff lookup on demand). 
Verified end-to-end against the running dev stack: listHandoffs returns the unclaimed handoff with full payload; claim flips session status from escalated → active; subsequent GET succeeds. tsc -b clean. Defers (TODO followups): suggested-step chips below the chat input that prefill on click (requires threading through to FlowPilotMessageBar); snapshot expansion to include the recent diagnostic steps pre-claim; toolbar Context button on sessions where the senior didn't arrive via magic-moment. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 21:06:14 -04:00
Michael Chihlas	f65b65790c	docs(ai): handoff state after frontend SSE slice lands Marks the SSE subscription as shipped, points the next-session resume target at the magic-moment handoff-context screen, and logs the live end-to-end verification. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 20:57:20 -04:00
Michael Chihlas	b8627f4180	feat(escalations): subscribe EscalationQueue to live SSE arrivals Adds the frontend live-arrival slice on top of the test-stabilized SSE backend. Senior techs now see a junior's escalation slide into the queue without refresh. - streamEscalations(handlers, signal) in aiSessions.ts: fetch-based ReadableStream parser (native EventSource cannot send auth headers). Handles SSE frames, partial frames across chunks, : keepalive heartbeats. Dispatches ready and handoff_created. - HandoffCreatedEvent + EscalationStreamHandlers types mirror the bus payload published by HandoffManager.dispatch_escalation_notifications. - EscalationQueue.tsx: AbortController-managed subscription with exponential-backoff reconnect (1s → 30s cap, attempt counter resets on ready). On handoff_created, refetch and diff against previous IDs via sessionsRef; new arrivals prepended (newest-first) above established cards (oldest-first preserved). Slide-in tag held for 800ms so the locked 200ms animation completes. Tab-title flash prefixes (N) while document.hidden, restores on focus / unmount. prefers-reduced-motion swaps slide-in for fade-in. ARIA region + aria-live=polite + aria-label on heading. Pick Up bumped to py-2.5 to clear the 44px touch floor. Verified end-to-end against the running dev stack: subscriber received the ready frame on connect; after posting a handoff via the API, the subscriber received the handoff_created frame with the expected payload — wire format matches the parser. Backend regression: focused subset still 32 passed in 18.91s. Frontend tsc -b clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 20:57:15 -04:00
Michael Chihlas	02d5c6c08c	docs(ai): refresh handoff state for next-session pickup under 200k context Default Claude Code model is being switched from Opus 4.7 1M-context to Opus 4.7 (200k). Tighten the per-session pickup docs so they're self-sufficient under the smaller window: - CURRENT_TASK now reflects the post-Codex state: 8 commits on the branch (5 feat + WIP SSE + 2 Codex test/latency fixes + 1 doc refresh), 32/32 backend tests with -n auto, frontend tsc -b clean. Remaining work re-scoped: the SSE backend half is feature-complete and tested, so what's left is the FRONTEND SSE subscription in EscalationQueue.tsx, then the magic-moment handoff-context screen, then push + draft PR. - Session log gets a Claude Code entry covering today's planning → build → pause-for-Codex arc, the design decisions locked into the doc and code, the two TODOs added (peer-tech escalation, mobile responsive), and the model-switch context for the next session. - HANDOFF.md needs no change — Codex's update in `9bdd995` already describes the resume point and watch-outs cleanly. No code change. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 20:13:40 -04:00
Michael Chihlas	9bdd9959a8	fix(handoff): bound escalation assessment latency Co-Authored-By: Codex <noreply@openai.com>	2026-04-27 20:03:14 -04:00
Michael Chihlas	fff8338bf2	docs(ai): track escalation assessment latency follow-up Co-Authored-By: Codex <noreply@openai.com>	2026-04-27 19:55:31 -04:00
Michael Chihlas	bc15952857	fix(tests): stabilize escalation SSE backend tests Co-Authored-By: Codex <noreply@openai.com>	2026-04-27 19:47:43 -04:00
Michael Chihlas	ba46fc5644	docs(ai): pause Escalation Mode build mid-SSE for Codex review Update HANDOFF to reflect: - Build paused after the WIP SSE commit (`87bd0b7`) - What Codex should look at on the SSE bus + endpoint + dispatch wiring - Resume point post-review: re-run tests with -n auto, then frontend SSE subscription, then magic-moment screen - Test-suite watch-out: per-test DROP SCHEMA fixture means concurrent pytest runs on the same DB collide; always one-suite-at-a-time or -n auto with conftest's per-worker DB isolation No code change. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 19:29:16 -04:00
Michael Chihlas	87bd0b7c56	WIP: SSE pub/sub for live escalation arrivals (paused for Codex review) First half of the WebSocket/SSE push slice. Paused mid-flight to hand the branch to Codex for outside-voice review before stacking more commits on top. See .ai/HANDOFF.md for the full pause context + what to look at. What's here: - backend/app/core/escalation_bus.py — module-level singleton in-memory pub/sub keyed by account_id. asyncio.Queue per subscriber with 64-event maxsize and drop-on-full semantics. Designed to be swappable for Redis pub/sub when Railway scales past single-replica. - backend/app/api/endpoints/session_handoffs.py — GET /api/v1/ai-sessions/escalations/stream SSE endpoint. Auth via require_engineer_or_admin. 25s heartbeat. Account-scoped subscribe bound to current_user.account_id. - backend/app/services/handoff_manager.py — dispatch_escalation_notifications now publishes a `handoff_created` event to the bus BEFORE the email fan-out, in a try/except so a bus failure can't block email delivery. - backend/tests/test_escalation_bus.py — 7 unit tests, all green standalone (0.14s). Cross-tenant isolation, drop-on-full, no-subscribers. - backend/tests/test_handoff_manager.py — +1 dispatcher integration test (publishes to bus, payload shape). - backend/tests/test_session_handoffs_api.py — +2 endpoint tests (viewer blocked, ready event handshake). 
[gstack-context] Decisions: - SSE over WebSocket (one-way, browser EventSource semantics, fewer moving parts behind Railway proxy) - In-memory bus over Redis for v1 pilot (3 MSPs, single replica) - Drop-on-full subscriber queue rather than back-pressure publishers - Bus publish ahead of email send, both wrapped in try/except so neither can break handoff creation - Frontend will be a fetch-based ReadableStream reader matching the existing streamDocumentation pattern, not native EventSource (custom-header auth) Remaining (post-Codex): - Frontend SSE subscription in EscalationQueue.tsx (slide-in, reconnect, tab-title flash, prefers-reduced-motion) - Magic-moment handoff-context screen - Re-run the full backend test suite to verify the SSE + dispatcher integration tests (bus units already green standalone) Tried: - Running the full test suite repeatedly without xdist; the per-test DROP SCHEMA + recreate fixture made wall-clock prohibitive when multiple stale runs collided on the same Postgres test schema. Resolution: -n auto next time. [/gstack-context] Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 19:29:07 -04:00
Michael Chihlas	a283d0d3fd	docs(ai): refresh handoff state mid-flight on Escalation Mode build Capture the in-flight state of the Escalation Mode wedge build so the next session (or Codex resume) picks up cleanly without re-deriving context: - CURRENT_TASK now describes the wedge, what's done across the 5 commits on this branch, what remains (WebSocket push, magic-moment screen, analytics page, e2e), and the two-metric framing readers MUST internalize before quoting numbers - HANDOFF resume point is the WebSocket/SSE push (live-arrival half of the notification dual-path); includes suggested first slice + watch-outs (no user_id on ai_session_step, denormalized account_id, peer-escalation still gated to session owner) - Both files reference the design doc and the kill-switch criterion No code change. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 16:38:14 -04:00
Michael Chihlas	9f0bfd44f9	feat(escalations): mount time-to-first-action stat-card on /escalations Surfaces the new GET /analytics/flowpilot/escalations endpoint as a card above the EscalationQueue list. Closes the loop from yesterday's metric endpoint commit — seniors and owners see the wedge stat the moment they open the queue, which is the daily-reps version of the GTM ROI story. Pieces: - EscalationMetrics TS interface mirroring the backend Pydantic model (incl. metric_definition disclaimer field) - flowpilotAnalyticsApi.getEscalationMetrics(period) client method - EscalationMetricCard component: * loading skeleton, error state, zero-data empty state * avg + median + n_with_action/n_claimed conversion rate * humanized seconds → "Ns" / "N.N min" formatting * inline disclaimer reminding callers this is in-product time-to- first-action only, NOT the savings claim — pair with manual baseline (per /codex review's two-metric correction) - Wired into EscalationQueuePage above EscalationQueue DS-aligned: card-flat, accent-dim usage held to interactive elements, text-muted-foreground for secondary copy, font-heading on the headline number, explicit transition properties (no `transition: all`). Respects prefers-reduced-motion implicitly (only animation is the loading pulse, which Tailwind's animate-pulse already gates). tsc -b clean. No new tests in this commit — component is a thin state-machine over an axios call; integration coverage comes from the existing backend tests + the e2e Playwright work in the plan. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 16:00:34 -04:00
Michael Chihlas	07d0db9579	feat(handoff): email engineer-or-admin teammates on escalation First half of the Escalation Mode notification dual-path. WebSocket/SSE push is the second half (next commit) — email handles offline seniors, push handles online ones for the magic-moment demo. HandoffManager.dispatch_escalation_notifications: - Pulls active engineer/admin/owner-role users in the same account_id (excludes the escalator + viewers + soft-deleted) - Sends via existing EmailService.send_notification_email, concurrent via asyncio.gather; per-message failures don't block the rest - Wrapped in try/except: any exception is logged + swallowed. Handoff creation is authoritative; notification is advisory. This is the graceful-degradation regression both eng + codex reviews flagged as critical (handoff must succeed even if SMTP is down). Endpoint wiring (POST /ai-sessions/{id}/handoff): - Dispatch fires AFTER db.commit() — never email about a rolled-back handoff. Trust-erosion bug if we got that wrong. - Only fires for intent=escalate. Park is private to the escalator. Tests (4 new): - emails-engineer-recipients-in-account: viewer excluded, escalator excluded, only the engineer/admin teammates get the message - skipped-for-park-intent: park doesn't fan out - graceful-degradation-when-email-raises: RuntimeError from the email service does NOT bubble out of dispatch - endpoint-dispatches-on-escalate: end-to-end wiring through POST Per-channel delivery records (replacing the dead `notification_sent` boolean per Codex correction) is a v1.x story — for now application logs are the audit trail. See docs/plans/2026-04-27-escalation-mode-wedge-design.md. 20 tests green across handoff_manager + session_handoffs_api + flowpilot_analytics_escalations. No regressions. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 15:58:05 -04:00
Michael Chihlas	7a5b853b3b	feat(api): role-gate handoff claim to engineer-or-admin POST /ai-sessions/{id}/handoffs/{hid}/claim previously required only an authenticated user, so a viewer-role account user could claim escalations. Codex review flagged this as wedge-relevant: the Escalation Mode race- condition story (two seniors clicking Pick Up simultaneously) depends on auth gating for audit integrity. Originally captured as a deferred TODO during /plan-eng-review, then moved in-scope by /codex review. Swap the dep to require_engineer_or_admin. One-line change. Two new tests: - viewer_role gets 403 with "Engineer or admin access required" - engineer/owner role still succeeds and claimed_at + claimed_by populate Existing handoff create + queue tests unaffected. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 15:46:59 -04:00
Michael Chihlas	52f6d0308f	feat(analytics): add escalation time-to-first-action metric endpoint GET /api/v1/analytics/flowpilot/escalations?period={7d,30d,90d} Computes the in-product wedge metric for Escalation Mode: average / median / p95 seconds between SessionHandoff.claimed_at and the first ai_session_step created on the same session after that timestamp. Account-scoped, role-gated to engineer-or-admin. The metric is intentionally NOT called "minutes recovered" — that's the two-metric framing locked by /codex review: this in-product number must be paired with manual baseline (the verbal-handoff stopwatch from The Assignment) to produce the savings claim. Schema's `metric_definition` field surfaces the disclaimer in every response so callers don't oversell it. Implementation notes: - Uses correlated scalar subquery for first-step-after-claim per handoff, aggregates avg/median/p95 in Python (~1k rows/account/month is well within budget; cleaner than percentile_cont gymnastics in SQL) - Excludes unclaimed handoffs (claimed_at IS NULL) - Counts claimed-but-no-action handoffs in n_handoffs_claimed but not in n_handoffs_with_action — surfaces the conversion-rate signal - Floors negative deltas at 0 to handle clock-drift edge cases Tests cover happy path, zero-data, claimed-but-no-action accounting, period window filtering, multi-handoff aggregation, multi-tenant isolation (Phase 4 RLS landmine pattern), viewer-role 403 gate, and period validation. 9 tests, all green. No regressions in existing handoff_manager / session_handoffs suites. First piece of the Approach A wedge build per docs/plans/2026-04-27-escalation-mode-wedge-design.md. Unblocks the queue stat-card and the analytics page. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 15:25:46 -04:00
Michael Chihlas	d51e95cdfa	docs(plans): add escalation-mode wedge design + test plan Captures the GTM thesis, premises, reduced-scope engineering plan, locked UI specs, and embedded review report for the Escalation Mode wedge — output of /office-hours, /plan-eng-review, /plan-design-review, and /codex review. Codex review surfaced two corrections we applied: - two-metric framing (manual baseline vs in-product time-to-first-action) - claim role gate moved in-scope (was deferred TODO) TODO updates: peer-tech escalation + claim role gate captured (the latter then moved in-scope by the codex pass). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 15:18:46 -04:00
chihlasm	c0ed6d9840	Merge pull request 'docs(ai): refresh handoff state after PR #153 merge' (#154) from chore/post-153-handoff into main Reviewed-on: #154	2026-04-26 05:33:31 +00:00
Michael Chihlas	8f818a7c71	docs(ai): refresh handoff state after PR #153 merge - CURRENT_TASK rolls forward — PR #153 closed out, no active task, with recommended next moves (promote e2e gate to required, pick from TODO). - HANDOFF rewritten — new home position is `main`; documents the e2e job's stub ANTHROPIC_API_KEY convention so future AI-touching e2e tests know what to expect. - SESSION_LOG entry extended with the CI env-var diagnosis, the fix, the merge, and pointers to the natural next pickups. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-26 01:14:49 -04:00
chihlasm	68fcdc6122	Merge PR #153: fix(chat): sync currentChatRef when prefill creates a new chat session Fixes a silent-drop bug where the dashboard prefill flow created a new chat session but didn't update the in-flight guard ref, so subsequent task-lane submissions had their AI follow-up responses discarded. Includes a Playwright regression test that drives the prefill flow and stubs /ai-sessions/*/chat to verify the second AI turn renders. Also adds a stub ANTHROPIC_API_KEY to the e2e CI job so AI-gated endpoints clear their _require_ai_enabled() check (the chat call itself is intercepted in the browser, so no real Anthropic traffic).	2026-04-26 05:05:54 +00:00
Michael Chihlas	11fe32f4c6	fix(ci): set stub ANTHROPIC_API_KEY for e2e job so AI-gated endpoints respond POST /api/v1/ai-sessions and friends call _require_ai_enabled(), which returns 503 when no provider key is set. The new prefill-handoff regression test (e2e/assistant-chat-prefill.spec.ts) drives the dashboard prefill flow, which has to create a chat session before its page.route stub on /chat can fire — so without a key, session creation 503s and the test never sees the task lane. The Playwright stub intercepts /chat in the browser, so the backend never actually contacts Anthropic — but the AI-enabled gate still needs to pass. A stub value is enough. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-26 00:51:39 -04:00
Michael Chihlas	43eed720d9	docs(ai): close out PR #150, set PR #153 as active task - CURRENT_TASK.md rolled forward — the CI-recovery task is complete (PR #150 merged as 87bb20b; backend gate is in required checks). Active task is now landing PR #153. - HANDOFF.md rewritten — new resume point is watching CI on the rebased SHA `1559feb` and merging when all three checks are green. - SESSION_LOG.md gains a 2026-04-26 entry covering the prefill bug diagnosis, fix, regression test, and the rebase off post-#150 main. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-26 00:30:50 -04:00
Michael Chihlas	1559feb759	docs(ai): track currentChatRef silent-swallow follow-up in TODO The guard pattern that masked the prefill-ref bug fixed in PR #153 is applied across handleSend, handleTaskSubmit, selectChat, refreshFacts, refreshActiveFix, and refreshPreview. Worth either logging the mismatch path or distinguishing expected-stale from unexpected-stale so the next instance of this class of bug surfaces instead of hiding. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-26 00:24:25 -04:00
Michael Chihlas	b56da2facd	fix(chat): sync currentChatRef when prefill creates a new chat session The dashboard prefill flow in AssistantChatPage set activeChatId after creating a new session but never updated currentChatRef.current. Every later handleSend / handleTaskSubmit then tripped the `currentChatRef.current !== sentForChatId` guard that was supposed to discard responses for stale chats — and silently dropped the AI's follow-up. The user saw their submitted message but no assistant reply, no toast, no task-lane update. Mirrors what handleNewChat and handleResumeNew already do. Adds an e2e regression test that drives the dashboard prefill, submits a partial task-lane response, and asserts the second AI turn renders. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-26 00:24:02 -04:00
chihlasm	87bb20b8f0	Merge PR #150: fix(ci): consolidated CI recovery — backend green, xdist parallelization, e2e selector + decoupling	2026-04-25 21:57:26 +00:00
Michael Chihlas	1e3a6cfa01	fix(e2e): harden card selectors for session resume Co-Authored-By: Codex <noreply@openai.com>	2026-04-25 16:42:33 -04:00
Michael Chihlas	ede6eebf9a	docs(ai): note e2e decoupling commit (`261814a`) in HANDOFF Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 16:12:19 -04:00
Michael Chihlas	261814ae65	perf(ci): decouple e2e from frontend — build frontend inline in e2e job Before: e2e `needs: [frontend]` waited for the frontend job to upload a build artifact, then downloaded it. With multiple runners this means the third runner sat idle for ~6 min while frontend ran, then started e2e — total wall-clock max(backend, frontend+e2e) ≈ 11 min. After: e2e builds its own frontend (npm ci + npm run build are already in the job; just dropped the artifact download step and added the build). e2e starts immediately on a free runner. Adds ~1-2 min to the e2e job duration but removes ~5 min of waiting and eliminates the cross-job artifact mechanism entirely. Side benefit: no more `actions/upload-artifact` v3/v4 GHES headaches on the cross-job handoff. The `if: always()` upload of the playwright-report at the end of e2e is kept (failure report retrieval is still useful), but it's a leaf-output, not a dependency. Net wall-clock: max(backend=9m, frontend=6m, e2e=7m) ≈ 9 min on the 3-runner setup, down from ~11 min. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 15:59:00 -04:00
Michael Chihlas	6656ebdead	docs(ai): reflect PR consolidation — #151/#152 merged into #150 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 15:55:08 -04:00
Michael Chihlas	69f2a37591	fix(e2e): update 5 selectors that drifted with FlowPilot/PSA UI changes Mechanical drift between the e2e selectors and the current UI surfaced on the first CI run after PR #149 unblocked the artifact upload step. Five tests, three categories of drift: 1. Page heading renames (navigation.spec.ts) - `Sessions` → `Session History` on /sessions - `Account Settings` → `Account Management` on /account 2. Route rename (command-palette.spec.ts:74) - The "Troubleshoot with FlowPilot" command palette option now lands on /pilot (Phase 1 of the FlowPilot migration renamed /assistant). /assistant still 301-redirects, so the assertion accepts either. 3. Feature moved to /sessions (history.spec.ts, resume.spec.ts) - Default tab on /sessions is "AI Sessions"; flow-session filtering and the Resume button moved behind the "Flow Sessions" tab. Both tests now click that tab before asserting. - resume.spec.ts no longer starts at /trees (Resume buttons aren't rendered there anymore — the flow lives on /sessions). Destination URL (/trees/:id/navigate) is unchanged. No product-code changes — these are pure test updates against the shipped UI. Run the suite locally with `cd frontend && npm run test:e2e` once a fresh build is available. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 15:53:57 -04:00
Michael Chihlas	7f714363dd	perf(ci): pytest-xdist with per-worker DBs — 22m → ~4m Backend suite is the slow gate (1076 passed locally in 22m27s on fix/ci-workflow-config). Adding pytest-xdist with per-worker DB isolation drops it to ~4m20s on the 8-core homelab runner. Verified locally: `pytest -n auto --no-cov` finished in 4m28s real time (15m19s user — confirms ~5× parallelism). How it works: - conftest.py reads `PYTEST_XDIST_WORKER` (set per worker by xdist — 'gw0', 'gw1', …). When set, derives a per-worker DB URL like `…/resolutionflow_test_gw0`. The base DB stays for serial / master runs. - `_ensure_worker_db_exists` runs synchronously at conftest import, connects to the postgres maintenance DB, and `CREATE DATABASE`s the worker-suffixed DB if it doesn't exist. Idempotent across runs. - The "test" safety guard still applies — every worker DB name contains "test" so the assertion holds. - The per-test `DROP SCHEMA public CASCADE` now operates on the worker's isolated DB, no cross-worker race. CI workflow: backend job switches to `pytest -n auto`. Coverage still collected (pytest-cov has built-in xdist support). Adds `pytest-xdist==3.6.1` to requirements-dev.txt. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 15:53:47 -04:00
Michael Chihlas	1bd43abb8f	fix(ci): drop postgres host port mapping (multi-runner port collision) With 3 Gitea Actions runners on the same homelab box, two simultaneous backend (or backend + e2e) jobs both try to bind 0.0.0.0:5432 for their postgres service containers. The second fails with: failed to set up container networking: ... Bind for 0.0.0.0:5432 failed: port is already allocated The host-port mapping isn't actually needed — the workflow uses `DATABASE_URL: postgresql+asyncpg://...@postgres:5432/...` (hostname `postgres` is the service container's docker-network DNS name). The tests run inside the act container which is on the same docker network, so they reach postgres without going through the host. Removing `ports: 5432:5432` from both backend and e2e job service definitions lets multiple postgres services run in parallel on different docker networks without colliding on the host. Surfaced when PR #150 ran in parallel with another job after the multi-runner setup. Backend instant-failed in 2s on the docker run. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 15:28:17 -04:00
Michael Chihlas	c203b70ef9	docs(ai): queue data-testid hardening + reflect PR #152 + 3-runner setup TODO.md: Promote pytest-xdist to ✅ (PR #151 carries it). Adds three new backlog items: - data-testid hardening for e2e-critical interactive elements (sparked by PR #152's selector drift work) - per-test transactional rollback (next big speedup if needed) - pytest-testmon for PR-time test selection HANDOFF.md: Three open PRs now (#150, #151, #152), all independent. Three Gitea runner agents now registered, so jobs run in parallel. Combined with #151's xdist, the prior 1h 14m wall-clock should drop to ~6-10 min. Updated merge order: #152 first (smallest), #150 next, #151 last. After all three land, enable CI / backend then CI / e2e as required status checks. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 15:26:21 -04:00
Michael Chihlas	f27e3b44b0	docs(ai): SESSION_LOG entry for the parallelization session (Was meant to land in fe632c9; the multi-line edit failed silently because Codex's earlier entry shifted the surrounding context.) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 12:15:41 -04:00
Michael Chihlas	fe632c9194	docs(ai): handoff after CI parallelization + final test fix Updates HANDOFF.md, CURRENT_TASK.md, and SESSION_LOG.md to reflect: - PR #150 now contains the AI-provider test mock + caching + maxfail. Backend CI should be fully green for the first time in months. - PR #151 stacked on #150: pytest-xdist with per-worker DBs. Local verification: 22m 27s → 4m 28s (5× speedup), 1076 passed both runs. - DoD is now: merge #150, then #151, then add CI / backend (pull_request) to required status checks on main. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 12:15:07 -04:00
Michael Chihlas	e976fb4e87	fix(ci): mock AI provider in record_decision test + cache pip/npm + drop term-missing Three changes that get PR #150 to a green CI gate: 1. test_record_decision_persists_and_bumps_state_version — the `decision: draft_template` path calls `_extract_template_parameters` (TemplateExtractionService → AI provider). CI doesn't set ANTHROPIC_API_KEY/GOOGLE_AI_API_KEY, so the endpoint raised `RuntimeError: No AI provider configured` and returned 500. The test isn't exercising the AI integration — patched the extractor with an AsyncMock returning a minimal valid `{templated_body, parameters}` dict. Verified locally: the test now passes. 2. pip + npm caches in backend, frontend, and e2e jobs. Keyed on the hash of requirements.txt / package-lock.json with a runner-os restore-key fallback. Saves ~30-60s per run on cache hit. 3. Pytest invocation tightened: - Dropped `--cov-report=term-missing` — the custom "Display coverage summary" step below parses coverage.json and prints the same module list more concisely. Term-missing dumps every uncovered line which adds ~5-10s of stdout. - Added `--maxfail=10` so a structural breakage (fixture explosion, DB unreachable) bails after 10 errors instead of running the full 25-min suite. Tunable. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 12:01:05 -04:00
Michael Chihlas	0aefaa78eb	docs(ai): queue pytest-xdist parallelization in TODO.md Capture the backend pytest parallelization work so it survives session end. Backend suite is currently ~22 min wall-clock for 1076 tests; xdist with one-DB-per-worker should land in the 3-6 min range on the homelab Gitea Actions runner. Also queues two backlog items: - Frontend lint warnings (23 react-hooks/exhaustive-deps after PR #149) - Periodic audit of the ResourceWarning filterwarnings added by Codex Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 11:35:38 -04:00
Michael Chihlas	49f88569da	wip(handoff): restore backend suite to green Co-Authored-By: Codex <noreply@openai.com>	2026-04-25 06:13:23 -04:00
Michael Chihlas	208ec996d5	docs(ai): handoff for Codex — CI recovery + 54 real backend failures Updates HANDOFF.md, CURRENT_TASK.md, and SESSION_LOG.md so the next session has accurate resume state. Summary of where things are: - PR #141 (PSA tickets), PR #147 (FlowPilot Phase 1-9), PR #148 (CI fixes part 1), PR #149 (CI fixes part 2) all merged to main in this session. - Branch protection enabled on main: PR-only, CI / frontend required. - PR #150 (this branch) is the last CI-config PR — adds DATABASE_TEST_URL to the workflow and pins upload-artifact to v3. - Next session: watch #150's CI, merge if green, add CI / backend to required checks, then start on the 54 real backend test failures. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 03:36:54 -04:00
Michael Chihlas	8f7df2c0ef	fix(ci): set DATABASE_TEST_URL + downgrade upload-artifact to v3 (Gitea Actions) Two CI-config issues blocking the gate from going green: 1. Backend tests connect to localhost instead of postgres service. conftest.py reads DATABASE_TEST_URL only — DATABASE_URL is intentionally not consulted (per dab740d's test-DB-isolation hardening — running pytest with DATABASE_URL set previously dropped the dev DB schema). The CI workflow only sets DATABASE_URL, so conftest falls back to its localhost default and every fixture-setup fails with `OSError: Connect call failed ('127.0.0.1', 5432)` — observed as 638 errors on the latest main run. Add DATABASE_TEST_URL pointing at the postgres service container. Same connection string as DATABASE_URL — the test DB and the app DB are the same physical postgres in CI; conftest's safety assertion is satisfied by the URL containing "test". 2. Frontend artifact upload fails on Gitea Actions runner. actions/upload-artifact@v4 (and v5) are not supported on Gitea Actions / GHES — the runner returns `GHESNotSupportedError: ... not currently supported on GHES`. Lint itself is now passing (0 errors after PR #149); the job exits 1 only because the upload step then fails. Pin upload-artifact + download-artifact to v3, the latest version compatible with Gitea Actions until they ship v4 support. After this lands, both backend and frontend CI gates should turn green — at which point we can also add backend to the required status checks on main. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 03:28:54 -04:00
chihlasm	f27f671fe6	Merge PR #149: fix(ci): frontend lint to zero errors + test-DB schema fix + dev-deps installable	2026-04-25 07:12:15 +00:00
Michael Chihlas	d6218f2e07	fix(tests): import all models in conftest so create_all sees the full schema The test_db fixture calls Base.metadata.create_all on a fresh test DB. That only creates tables for models that have been imported (and thus registered with Base.metadata) by the time the fixture runs. app.main imports app.core.database (which gives us Base) but does NOT eagerly import the model modules — most are pulled in lazily inside scheduler functions (archive_stale_ai_sessions etc.) and route modules. At fixture-setup time, only the handful of models touched by those eager imports are on the metadata, so any test that exercises PSA, network diagrams, ratings, escalations, etc. fails with `UndefinedTableError: relation "X" does not exist` and a cascade of 500s on every endpoint that queries the missing table. Adding `from app import models as _models` (rather than the bare `import app.models` which would shadow the `app` FastAPI instance imported just above) pulls in app/models/__init__.py, which itself imports every model module — registering all ~60 tables with Base.metadata before create_all runs. Verified locally: tests/test_psa_writeback_phase4.py went from 1 failed / 6 errors → 4 failed / 3 passed (the cascading errors were masking the actual passes). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 02:49:06 -04:00
Michael Chihlas	920a246d77	fix(react): remove four setState-in-effect cascades flagged by react-hooks v5 The new react-hooks lint rule "Calling setState synchronously within an effect can trigger cascading renders" flagged real anti-patterns in four spots. Refactored each per the rule's intent (derive during render, or use useSyncExternalStore for external subscriptions). 1. hooks/useMediaQuery.ts — replaced the useState + useEffect pair with useSyncExternalStore. That's the canonical React hook for subscribing to external stores (matchMedia in this case) without mirroring into local state via an effect. Snapshot/getServerSnapshot pair preserves the SSR-safe behaviour. 2. components/network/nodes/DeviceNode.tsx — the prop-sync useEffect that copied nodeData.label into labelValue was redundant. labelValue is the EDIT BUFFER; while not editing, the displayed span now reads nodeData.label directly. The buffer is initialized only when an edit session starts (onDoubleClick). 3. components/network/nodes/GroupNode.tsx — same pattern, same fix. 4. components/dashboard/TicketQueue.tsx — the setTickets([]) + setLoading(true) + fetchTickets() chain in the effect was the cascade. Pushed those writes inside fetchTickets (after the function boundary, so they batch with the eventual setTickets(result)). Added a request-id ref so a slow first response can't overwrite a fast second one. Frontend lint: 20 errors → 0 errors. tsc -b clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 02:33:13 -04:00
Michael Chihlas	b7f8e70be2	fix(lint): replace explicit-any types + unused-expressions ternaries Five files, all stylistic: - useFlowPilotSession.ts: typed the axios error shape with a narrow inline type instead of `as any`. - FlowPilotSessionPage.tsx: same — typed location.state once, then destructured. - ScriptBuilderTab.tsx: handleViewScript was a placeholder no-op; declared the args properly with `void script; void filename` so the signature matches ScriptBuilderChatProps without no-unused-vars firing. - TicketsPage.tsx: replaced 8 ternaries-as-statements (`x ? f() : g()`) with proper if/else blocks. Same control flow, satisfies no-unused-expressions, and reads better in the URL-param update paths. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 02:32:57 -04:00
Michael Chihlas	857d73e3d0	fix(lint): move AssistantSessionRedirect out of router.tsx (react-refresh gate) react-refresh/only-export-components fires when a file with the `router` const export also defines a component (the redirect helper). Moves the small helper to its own file under components/routing/ so HMR can keep the route-component module hot-reload-eligible. No behavior change. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 02:32:50 -04:00
Michael Chihlas	406ee0ef97	fix(deps): bump pytest 7.4 → 8.4, pytest-cov 4.1 → 5.0 to satisfy pytest-asyncio 0.24 pytest-asyncio==0.24.0 (added on the FlowPilot branch as part of the RLS test infra refactor) declares pytest>=8.2, but requirements-dev.txt still pinned pytest==7.4.3, so a clean pip install fails with ResolutionImpossible. CI runners that started from a fresh image would have refused to install dev deps; the FlowPilot tests passed locally only because the dev container already had a pytest 8.x installed. pytest-cov also has to move from 4.1.0 to >= 5.0 to play nicely with pytest 8. No code changes: pytest 8 is API-compatible with the existing test suite once the install resolves. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 02:32:43 -04:00
chihlasm	32fae2c693	Merge PR #147: feat: FlowPilot migration — Phase 1-9 + Phase 9 bug fixes + QA fixture harness	2026-04-25 06:02:14 +00:00
Michael Chihlas	a45915fbbc	Merge main into feat/flowpilot-migration (PR #148 backports) Brings PR #148 — two pre-existing CI fixes (network_diagrams JSONB server_default, removed deprecated session-scoped event_loop fixture). The conftest.py event_loop fix on main is already incorporated in FlowPilot's `b14a16a` (RLS-gating commit, which dropped the same fixture as part of its larger refactor). Kept HEAD's version of the RLS-gating collection hook; the event_loop fixture removal is identical. The network_diagram.py fix lands cleanly via auto-merge. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 02:01:46 -04:00
chihlasm	06593a40d9	Merge PR #148: fix(tests): repair two pre-existing bugs blocking backend CI	2026-04-25 06:01:08 +00:00
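
The per-worker database scheme described in commit 7f714363dd could look roughly like the sketch below. This is not the repository's conftest.py: the function name `worker_db_url` is hypothetical, the handling of a `"master"` worker id is an assumption, and only the `PYTEST_XDIST_WORKER` variable, the `_gw0`-style suffix, and the "name must contain test" guard come from the commit message.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def worker_db_url(base_url: str, worker_id: Optional[str]) -> str:
    """Return a per-worker database URL, or the base URL for serial runs.

    Hypothetical helper: worker_id is expected to be the value of the
    PYTEST_XDIST_WORKER environment variable ('gw0', 'gw1', ...), or None
    when pytest runs without xdist.
    """
    # Assumption: serial and controller ("master") runs keep the base DB.
    if not worker_id or worker_id == "master":
        return base_url

    parts = urlsplit(base_url)
    # The database name is the URL path; suffix it with the worker id.
    db_name = parts.path.lstrip("/")
    worker_db = f"{db_name}_{worker_id}"

    # Mirror the safety guard from the commit: the DB name must contain "test".
    if "test" not in worker_db:
        raise ValueError(f"refusing to use non-test database {worker_db!r}")

    return urlunsplit(parts._replace(path=f"/{worker_db}"))
```

In a conftest this would typically be called once at import time with `os.environ.get("PYTEST_XDIST_WORKER")`, before any engine is created. Creating the worker database itself (the `CREATE DATABASE` step the commit mentions) is left out of the sketch.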


1248 Commits