resolutionflow

Author	SHA1	Message	Date
Michael Chihlas	fff8338bf2	docs(ai): track escalation assessment latency follow-up Co-Authored-By: Codex <noreply@openai.com>	2026-04-27 19:55:31 -04:00
Michael Chihlas	bc15952857	fix(tests): stabilize escalation SSE backend tests Co-Authored-By: Codex <noreply@openai.com>	2026-04-27 19:47:43 -04:00
Michael Chihlas	ba46fc5644	docs(ai): pause Escalation Mode build mid-SSE for Codex review Update HANDOFF to reflect: - Build paused after the WIP SSE commit (`87bd0b7`) - What Codex should look at on the SSE bus + endpoint + dispatch wiring - Resume point post-review: re-run tests with -n auto, then frontend SSE subscription, then magic-moment screen - Test-suite watch-out: per-test DROP SCHEMA fixture means concurrent pytest runs on the same DB collide; always one-suite-at-a-time or -n auto with conftest's per-worker DB isolation No code change. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 19:29:16 -04:00
Michael Chihlas	87bd0b7c56	WIP: SSE pub/sub for live escalation arrivals (paused for Codex review) First half of the WebSocket/SSE push slice. Paused mid-flight to hand the branch to Codex for outside-voice review before stacking more commits on top. See .ai/HANDOFF.md for the full pause context + what to look at. What's here: - backend/app/core/escalation_bus.py — module-level singleton in-memory pub/sub keyed by account_id. asyncio.Queue per subscriber with 64-event maxsize and drop-on-full semantics. Designed to be swappable for Redis pub/sub when Railway scales past single-replica. - backend/app/api/endpoints/session_handoffs.py — GET /api/v1/ai-sessions/escalations/stream SSE endpoint. Auth via require_engineer_or_admin. 25s heartbeat. Account-scoped subscribe bound to current_user.account_id. - backend/app/services/handoff_manager.py — dispatch_escalation_notifications now publishes a `handoff_created` event to the bus BEFORE the email fan-out, in a try/except so a bus failure can't block email delivery. - backend/tests/test_escalation_bus.py — 7 unit tests, all green standalone (0.14s). Cross-tenant isolation, drop-on-full, no-subscribers. - backend/tests/test_handoff_manager.py — +1 dispatcher integration test (publishes to bus, payload shape). - backend/tests/test_session_handoffs_api.py — +2 endpoint tests (viewer blocked, ready event handshake). 
[gstack-context] Decisions: - SSE over WebSocket (one-way, browser EventSource semantics, fewer moving parts behind Railway proxy) - In-memory bus over Redis for v1 pilot (3 MSPs, single replica) - Drop-on-full subscriber queue rather than back-pressure publishers - Bus publish ahead of email send, both wrapped in try/except so neither can break handoff creation - Frontend will be a fetch-based ReadableStream reader matching the existing streamDocumentation pattern, not native EventSource (custom-header auth) Remaining (post-Codex): - Frontend SSE subscription in EscalationQueue.tsx (slide-in, reconnect, tab-title flash, prefers-reduced-motion) - Magic-moment handoff-context screen - Re-run the full backend test suite to verify the SSE + dispatcher integration tests (bus units already green standalone) Tried: - Running the full test suite repeatedly without xdist; the per-test DROP SCHEMA + recreate fixture made wall-clock prohibitive when multiple stale runs collided on the same Postgres test schema. Resolution: -n auto next time. [/gstack-context] Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 19:29:07 -04:00
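The bus described in commit 87bd0b7c56 (per-account subscribers, a bounded asyncio.Queue per subscriber, drop-on-full) could be sketched roughly as follows. This is a minimal illustration, not the contents of backend/app/core/escalation_bus.py: the class name `EscalationBus` and its method names are assumptions, and only the 64-event queue size, the account_id keying, and the drop-on-full behaviour come from the commit message.

```python
import asyncio
from collections import defaultdict
from typing import Any, Dict, Set


class EscalationBus:
    """Hypothetical in-memory pub/sub keyed by account_id."""

    def __init__(self, maxsize: int = 64) -> None:
        self._maxsize = maxsize
        self._subscribers: Dict[str, Set[asyncio.Queue]] = defaultdict(set)

    def subscribe(self, account_id: str) -> asyncio.Queue:
        # Each subscriber gets its own bounded queue.
        queue: asyncio.Queue = asyncio.Queue(maxsize=self._maxsize)
        self._subscribers[account_id].add(queue)
        return queue

    def unsubscribe(self, account_id: str, queue: asyncio.Queue) -> None:
        self._subscribers[account_id].discard(queue)

    def publish(self, account_id: str, event: Dict[str, Any]) -> int:
        """Deliver to subscribers of one account; return how many accepted it."""
        delivered = 0
        for queue in list(self._subscribers.get(account_id, ())):
            try:
                queue.put_nowait(event)
                delivered += 1
            except asyncio.QueueFull:
                # Drop-on-full: a slow subscriber misses the event,
                # and the publisher is never blocked.
                pass
        return delivered
```

Because `publish` never awaits, a stalled SSE client cannot apply back-pressure to handoff creation, which matches the trade-off listed under Decisions.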
Michael Chihlas	a283d0d3fd	docs(ai): refresh handoff state mid-flight on Escalation Mode build Capture the in-flight state of the Escalation Mode wedge build so the next session (or Codex resume) picks up cleanly without re-deriving context: - CURRENT_TASK now describes the wedge, what's done across the 5 commits on this branch, what remains (WebSocket push, magic-moment screen, analytics page, e2e), and the two-metric framing readers MUST internalize before quoting numbers - HANDOFF resume point is the WebSocket/SSE push (live-arrival half of the notification dual-path); includes suggested first slice + watch-outs (no user_id on ai_session_step, denormalized account_id, peer-escalation still gated to session owner) - Both files reference the design doc and the kill-switch criterion No code change. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 16:38:14 -04:00
Michael Chihlas	9f0bfd44f9	feat(escalations): mount time-to-first-action stat-card on /escalations Surfaces the new GET /analytics/flowpilot/escalations endpoint as a card above the EscalationQueue list. Closes the loop from yesterday's metric endpoint commit — seniors and owners see the wedge stat the moment they open the queue, which is the daily-reps version of the GTM ROI story. Pieces: - EscalationMetrics TS interface mirroring the backend Pydantic model (incl. metric_definition disclaimer field) - flowpilotAnalyticsApi.getEscalationMetrics(period) client method - EscalationMetricCard component: * loading skeleton, error state, zero-data empty state * avg + median + n_with_action/n_claimed conversion rate * humanized seconds → "Ns" / "N.N min" formatting * inline disclaimer reminding callers this is in-product time-to- first-action only, NOT the savings claim — pair with manual baseline (per /codex review's two-metric correction) - Wired into EscalationQueuePage above EscalationQueue DS-aligned: card-flat, accent-dim usage held to interactive elements, text-muted-foreground for secondary copy, font-heading on the headline number, explicit transition properties (no `transition: all`). Respects prefers-reduced-motion implicitly (only animation is the loading pulse, which Tailwind's animate-pulse already gates). tsc -b clean. No new tests in this commit — component is a thin state-machine over an axios call; integration coverage comes from the existing backend tests + the e2e Playwright work in the plan. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 16:00:34 -04:00
Michael Chihlas	07d0db9579	feat(handoff): email engineer-or-admin teammates on escalation First half of the Escalation Mode notification dual-path. WebSocket/SSE push is the second half (next commit) — email handles offline seniors, push handles online ones for the magic-moment demo. HandoffManager.dispatch_escalation_notifications: - Pulls active engineer/admin/owner-role users in the same account_id (excludes the escalator + viewers + soft-deleted) - Sends via existing EmailService.send_notification_email, concurrent via asyncio.gather; per-message failures don't block the rest - Wrapped in try/except: any exception is logged + swallowed. Handoff creation is authoritative; notification is advisory. This is the graceful-degradation regression both eng + codex reviews flagged as critical (handoff must succeed even if SMTP is down). Endpoint wiring (POST /ai-sessions/{id}/handoff): - Dispatch fires AFTER db.commit() — never email about a rolled-back handoff. Trust-erosion bug if we got that wrong. - Only fires for intent=escalate. Park is private to the escalator. Tests (4 new): - emails-engineer-recipients-in-account: viewer excluded, escalator excluded, only the engineer/admin teammates get the message - skipped-for-park-intent: park doesn't fan out - graceful-degradation-when-email-raises: RuntimeError from the email service does NOT bubble out of dispatch - endpoint-dispatches-on-escalate: end-to-end wiring through POST Per-channel delivery records (replacing the dead `notification_sent` boolean per Codex correction) is a v1.x story — for now application logs are the audit trail. See docs/plans/2026-04-27-escalation-mode-wedge-design.md. 20 tests green across handoff_manager + session_handoffs_api + flowpilot_analytics_escalations. No regressions. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 15:58:05 -04:00
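The "notification is advisory" behaviour in commit 07d0db9579 (concurrent sends via asyncio.gather, per-message failures isolated, nothing allowed to bubble out) can be illustrated with a small sketch. The function name `dispatch_notifications` and the `send` callable are hypothetical stand-ins; the real method lives on HandoffManager and calls EmailService.send_notification_email, whose signature is not shown in this log.

```python
import asyncio
import logging
from typing import Awaitable, Callable, List

logger = logging.getLogger(__name__)

# Hypothetical send function: takes a recipient address, may raise.
SendFn = Callable[[str], Awaitable[None]]


async def dispatch_notifications(recipients: List[str], send: SendFn) -> int:
    """Send to all recipients concurrently; never raise. Returns success count."""
    try:
        # return_exceptions=True keeps one failed send from cancelling the rest.
        results = await asyncio.gather(
            *(send(address) for address in recipients),
            return_exceptions=True,
        )
    except Exception:
        # Anything unexpected is logged and swallowed: the handoff itself
        # has already been committed and must not be affected.
        logger.exception("escalation notification dispatch failed")
        return 0

    failures = [r for r in results if isinstance(r, BaseException)]
    for error in failures:
        logger.warning("escalation notification failed: %r", error)
    return len(results) - len(failures)
```

Calling this only after the database commit, as the commit message describes, keeps a rolled-back handoff from ever generating an email.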
Michael Chihlas	7a5b853b3b	feat(api): role-gate handoff claim to engineer-or-admin POST /ai-sessions/{id}/handoffs/{hid}/claim previously required only an authenticated user, so a viewer-role account user could claim escalations. Codex review flagged this as wedge-relevant: the Escalation Mode race- condition story (two seniors clicking Pick Up simultaneously) depends on auth gating for audit integrity. Originally captured as a deferred TODO during /plan-eng-review, then moved in-scope by /codex review. Swap the dep to require_engineer_or_admin. One-line change. Two new tests: - viewer_role gets 403 with "Engineer or admin access required" - engineer/owner role still succeeds and claimed_at + claimed_by populate Existing handoff create + queue tests unaffected. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 15:46:59 -04:00
Michael Chihlas	52f6d0308f	feat(analytics): add escalation time-to-first-action metric endpoint GET /api/v1/analytics/flowpilot/escalations?period={7d,30d,90d} Computes the in-product wedge metric for Escalation Mode: average / median / p95 seconds between SessionHandoff.claimed_at and the first ai_session_step created on the same session after that timestamp. Account-scoped, role-gated to engineer-or-admin. The metric is intentionally NOT called "minutes recovered" — that's the two-metric framing locked by /codex review: this in-product number must be paired with manual baseline (the verbal-handoff stopwatch from The Assignment) to produce the savings claim. Schema's `metric_definition` field surfaces the disclaimer in every response so callers don't oversell it. Implementation notes: - Uses correlated scalar subquery for first-step-after-claim per handoff, aggregates avg/median/p95 in Python (~1k rows/account/month is well within budget; cleaner than percentile_cont gymnastics in SQL) - Excludes unclaimed handoffs (claimed_at IS NULL) - Counts claimed-but-no-action handoffs in n_handoffs_claimed but not in n_handoffs_with_action — surfaces the conversion-rate signal - Floors negative deltas at 0 to handle clock-drift edge cases Tests cover happy path, zero-data, claimed-but-no-action accounting, period window filtering, multi-handoff aggregation, multi-tenant isolation (Phase 4 RLS landmine pattern), viewer-role 403 gate, and period validation. 9 tests, all green. No regressions in existing handoff_manager / session_handoffs suites. First piece of the Approach A wedge build per docs/plans/2026-04-27-escalation-mode-wedge-design.md. Unblocks the queue stat-card and the analytics page. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 15:25:46 -04:00
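The in-Python aggregation described in commit 52f6d0308f might look something like the sketch below. It assumes the query has already produced one (claimed_at, first_step_at) pair per claimed handoff, with first_step_at set to None when no step followed the claim. The function name, the `*_seconds` keys, and the nearest-rank p95 method are assumptions; n_handoffs_claimed and n_handoffs_with_action are the field names given in the commit message.

```python
import math
import statistics
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Tuple

Pair = Tuple[datetime, Optional[datetime]]


def summarize_time_to_first_action(pairs: List[Pair]) -> Dict[str, Optional[float]]:
    # Seconds from claim to first step; negative deltas (clock drift) floor at 0.
    deltas = sorted(
        max(0.0, (first_step_at - claimed_at).total_seconds())
        for claimed_at, first_step_at in pairs
        if first_step_at is not None
    )
    summary: Dict[str, Optional[float]] = {
        "n_handoffs_claimed": len(pairs),        # includes claimed-but-no-action
        "n_handoffs_with_action": len(deltas),
        "avg_seconds": None,
        "median_seconds": None,
        "p95_seconds": None,
    }
    if deltas:
        rank = math.ceil(0.95 * len(deltas))     # nearest-rank percentile (assumed)
        summary["avg_seconds"] = statistics.fmean(deltas)
        summary["median_seconds"] = statistics.median(deltas)
        summary["p95_seconds"] = deltas[rank - 1]
    return summary


# Example input: three handoffs with a follow-up step, one without.
t0 = datetime(2026, 4, 27, 12, 0, 0)
example = [
    (t0, t0 + timedelta(seconds=10)),
    (t0, t0 + timedelta(seconds=30)),
    (t0, t0 - timedelta(seconds=5)),   # drifted clock, counted as 0 seconds
    (t0, None),                        # claimed, no action yet
]
```

The gap between the two counts is the conversion-rate signal the commit mentions: here 3 of 4 claimed handoffs led to an action.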
Michael Chihlas	d51e95cdfa	docs(plans): add escalation-mode wedge design + test plan Captures the GTM thesis, premises, reduced-scope engineering plan, locked UI specs, and embedded review report for the Escalation Mode wedge — output of /office-hours, /plan-eng-review, /plan-design-review, and /codex review. Codex review surfaced two corrections we applied: - two-metric framing (manual baseline vs in-product time-to-first-action) - claim role gate moved in-scope (was deferred TODO) TODO updates: peer-tech escalation + claim role gate captured (the latter then moved in-scope by the codex pass). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 15:18:46 -04:00
chihlasm	c0ed6d9840	Merge pull request 'docs(ai): refresh handoff state after PR #153 merge' (#154) from chore/post-153-handoff into main Reviewed-on: #154	2026-04-26 05:33:31 +00:00
Michael Chihlas	8f818a7c71	docs(ai): refresh handoff state after PR #153 merge - CURRENT_TASK rolls forward — PR #153 closed out, no active task, with recommended next moves (promote e2e gate to required, pick from TODO). - HANDOFF rewritten — new home position is `main`; documents the e2e job's stub ANTHROPIC_API_KEY convention so future AI-touching e2e tests know what to expect. - SESSION_LOG entry extended with the CI env-var diagnosis, the fix, the merge, and pointers to the natural next pickups. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-26 01:14:49 -04:00
chihlasm	68fcdc6122	Merge PR #153: fix(chat): sync currentChatRef when prefill creates a new chat session Fixes a silent-drop bug where the dashboard prefill flow created a new chat session but didn't update the in-flight guard ref, so subsequent task-lane submissions had their AI follow-up responses discarded. Includes a Playwright regression test that drives the prefill flow and stubs /ai-sessions/*/chat to verify the second AI turn renders. Also adds a stub ANTHROPIC_API_KEY to the e2e CI job so AI-gated endpoints clear their _require_ai_enabled() check (the chat call itself is intercepted in the browser, so no real Anthropic traffic).	2026-04-26 05:05:54 +00:00
Michael Chihlas	11fe32f4c6	fix(ci): set stub ANTHROPIC_API_KEY for e2e job so AI-gated endpoints respond POST /api/v1/ai-sessions and friends call _require_ai_enabled(), which returns 503 when no provider key is set. The new prefill-handoff regression test (e2e/assistant-chat-prefill.spec.ts) drives the dashboard prefill flow, which has to create a chat session before its page.route stub on /chat can fire — so without a key, session creation 503s and the test never sees the task lane. The Playwright stub intercepts /chat in the browser, so the backend never actually contacts Anthropic — but the AI-enabled gate still needs to pass. A stub value is enough. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-26 00:51:39 -04:00
Michael Chihlas	43eed720d9	docs(ai): close out PR #150, set PR #153 as active task - CURRENT_TASK.md rolled forward — the CI-recovery task is complete (PR #150 merged as 87bb20b; backend gate is in required checks). Active task is now landing PR #153. - HANDOFF.md rewritten — new resume point is watching CI on the rebased SHA `1559feb` and merging when all three checks are green. - SESSION_LOG.md gains a 2026-04-26 entry covering the prefill bug diagnosis, fix, regression test, and the rebase off post-#150 main. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-26 00:30:50 -04:00
Michael Chihlas	1559feb759	docs(ai): track currentChatRef silent-swallow follow-up in TODO The guard pattern that masked the prefill-ref bug fixed in PR #153 is applied across handleSend, handleTaskSubmit, selectChat, refreshFacts, refreshActiveFix, and refreshPreview. Worth either logging the mismatch path or distinguishing expected-stale from unexpected-stale so the next instance of this class of bug surfaces instead of hiding. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-26 00:24:25 -04:00
Michael Chihlas	b56da2facd	fix(chat): sync currentChatRef when prefill creates a new chat session The dashboard prefill flow in AssistantChatPage set activeChatId after creating a new session but never updated currentChatRef.current. Every later handleSend / handleTaskSubmit then tripped the `currentChatRef.current !== sentForChatId` guard that was supposed to discard responses for stale chats — and silently dropped the AI's follow-up. The user saw their submitted message but no assistant reply, no toast, no task-lane update. Mirrors what handleNewChat and handleResumeNew already do. Adds an e2e regression test that drives the dashboard prefill, submits a partial task-lane response, and asserts the second AI turn renders. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-26 00:24:02 -04:00
chihlasm	87bb20b8f0	Merge PR #150: fix(ci): consolidated CI recovery — backend green, xdist parallelization, e2e selector + decoupling	2026-04-25 21:57:26 +00:00
Michael Chihlas	1e3a6cfa01	fix(e2e): harden card selectors for session resume Co-Authored-By: Codex <noreply@openai.com>	2026-04-25 16:42:33 -04:00
Michael Chihlas	ede6eebf9a	docs(ai): note e2e decoupling commit (`261814a`) in HANDOFF Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 16:12:19 -04:00
Michael Chihlas	261814ae65	perf(ci): decouple e2e from frontend — build frontend inline in e2e job Before: e2e `needs: [frontend]` waited for the frontend job to upload a build artifact, then downloaded it. With multiple runners this means the third runner sat idle for ~6 min while frontend ran, then started e2e — total wall-clock max(backend, frontend+e2e) ≈ 11 min. After: e2e builds its own frontend (npm ci + npm run build are already in the job; just dropped the artifact download step and added the build). e2e starts immediately on a free runner. Adds ~1-2 min to the e2e job duration but removes ~5 min of waiting and eliminates the cross-job artifact mechanism entirely. Side benefit: no more `actions/upload-artifact` v3/v4 GHES headaches on the cross-job handoff. The `if: always()` upload of the playwright-report at the end of e2e is kept (failure report retrieval is still useful), but it's a leaf-output, not a dependency. Net wall-clock: max(backend=9m, frontend=6m, e2e=7m) ≈ 9 min on the 3-runner setup, down from ~11 min. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 15:59:00 -04:00
Michael Chihlas	6656ebdead	docs(ai): reflect PR consolidation — #151/#152 merged into #150 Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 15:55:08 -04:00
Michael Chihlas	69f2a37591	fix(e2e): update 5 selectors that drifted with FlowPilot/PSA UI changes Mechanical drift between the e2e selectors and the current UI surfaced on the first CI run after PR #149 unblocked the artifact upload step. Five tests, three categories of drift: 1. Page heading renames (navigation.spec.ts) - `Sessions` → `Session History` on /sessions - `Account Settings` → `Account Management` on /account 2. Route rename (command-palette.spec.ts:74) - The "Troubleshoot with FlowPilot" command palette option now lands on /pilot (Phase 1 of the FlowPilot migration renamed /assistant). /assistant still 301-redirects, so the assertion accepts either. 3. Feature moved to /sessions (history.spec.ts, resume.spec.ts) - Default tab on /sessions is "AI Sessions"; flow-session filtering and the Resume button moved behind the "Flow Sessions" tab. Both tests now click that tab before asserting. - resume.spec.ts no longer starts at /trees (Resume buttons aren't rendered there anymore — the flow lives on /sessions). Destination URL (/trees/:id/navigate) is unchanged. No product-code changes — these are pure test updates against the shipped UI. Run the suite locally with `cd frontend && npm run test:e2e` once a fresh build is available. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 15:53:57 -04:00
Michael Chihlas	7f714363dd	perf(ci): pytest-xdist with per-worker DBs — 22m → ~4m Backend suite is the slow gate (1076 passed locally in 22m27s on fix/ci-workflow-config). Adding pytest-xdist with per-worker DB isolation drops it to ~4m20s on the 8-core homelab runner. Verified locally: `pytest -n auto --no-cov` finished in 4m28s real time (15m19s user — confirms ~5× parallelism). How it works: - conftest.py reads `PYTEST_XDIST_WORKER` (set per worker by xdist — 'gw0', 'gw1', …). When set, derives a per-worker DB URL like `…/resolutionflow_test_gw0`. The base DB stays for serial / master runs. - `_ensure_worker_db_exists` runs synchronously at conftest import, connects to the postgres maintenance DB, and `CREATE DATABASE`s the worker-suffixed DB if it doesn't exist. Idempotent across runs. - The "test" safety guard still applies — every worker DB name contains "test" so the assertion holds. - The per-test `DROP SCHEMA public CASCADE` now operates on the worker's isolated DB, no cross-worker race. CI workflow: backend job switches to `pytest -n auto`. Coverage still collected (pytest-cov has built-in xdist support). Adds `pytest-xdist==3.6.1` to requirements-dev.txt. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 15:53:47 -04:00
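The per-worker database derivation in commit 7f714363dd can be sketched as below. This is an illustration under stated assumptions, not the project's conftest.py: the helper name `worker_db_url` is hypothetical, and the sketch covers only the URL derivation and the "test" safety guard, not the `CREATE DATABASE` step performed by `_ensure_worker_db_exists`. `PYTEST_XDIST_WORKER` is the environment variable pytest-xdist sets in each worker process.

```python
import os
from typing import Optional
from urllib.parse import urlsplit


def worker_db_url(base_url: str, worker_id: Optional[str]) -> str:
    """Append the xdist worker id (e.g. 'gw0') to the database name."""
    parts = urlsplit(base_url)
    path = parts.path
    if worker_id:
        # .../resolutionflow_test -> .../resolutionflow_test_gw0
        path = f"{path}_{worker_id}"
    db_name = path.rsplit("/", 1)[-1]
    # Safety guard: refuse anything that does not look like a test database,
    # since the fixtures drop and recreate the schema on every test.
    if "test" not in db_name:
        raise ValueError(f"refusing to use non-test database {db_name!r}")
    return parts._replace(path=path).geturl()


def current_worker_db_url(base_url: str) -> str:
    # Unset on serial runs and on the xdist controller, so the base DB is used.
    return worker_db_url(base_url, os.environ.get("PYTEST_XDIST_WORKER"))
```

With one database per worker, the per-test `DROP SCHEMA public CASCADE` only ever touches that worker's own data, which is what removes the cross-worker race.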
Michael Chihlas	1bd43abb8f	fix(ci): drop postgres host port mapping (multi-runner port collision) With 3 Gitea Actions runners on the same homelab box, two simultaneous backend (or backend + e2e) jobs both try to bind 0.0.0.0:5432 for their postgres service containers. The second fails with: failed to set up container networking: ... Bind for 0.0.0.0:5432 failed: port is already allocated The host-port mapping isn't actually needed — the workflow uses `DATABASE_URL: postgresql+asyncpg://...@postgres:5432/...` (hostname `postgres` is the service container's docker-network DNS name). The tests run inside the act container which is on the same docker network, so they reach postgres without going through the host. Removing `ports: 5432:5432` from both backend and e2e job service definitions lets multiple postgres services run in parallel on different docker networks without colliding on the host. Surfaced when PR #150 ran in parallel with another job after the multi-runner setup. Backend instant-failed in 2s on the docker run. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 15:28:17 -04:00
Michael Chihlas	c203b70ef9	docs(ai): queue data-testid hardening + reflect PR #152 + 3-runner setup TODO.md: Promote pytest-xdist to ✅ (PR #151 carries it). Adds three new backlog items: - data-testid hardening for e2e-critical interactive elements (sparked by PR #152's selector drift work) - per-test transactional rollback (next big speedup if needed) - pytest-testmon for PR-time test selection HANDOFF.md: Three open PRs now (#150, #151, #152), all independent. Three Gitea runner agents now registered, so jobs run in parallel. Combined with #151's xdist, the prior 1h 14m wall-clock should drop to ~6-10 min. Updated merge order: #152 first (smallest), #150 next, #151 last. After all three land, enable CI / backend then CI / e2e as required status checks. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 15:26:21 -04:00
Michael Chihlas	f27e3b44b0	docs(ai): SESSION_LOG entry for the parallelization session (Was meant to land in fe632c9; the multi-line edit failed silently because Codex's earlier entry shifted the surrounding context.) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 12:15:41 -04:00
Michael Chihlas	fe632c9194	docs(ai): handoff after CI parallelization + final test fix Updates HANDOFF.md, CURRENT_TASK.md, and SESSION_LOG.md to reflect: - PR #150 now contains the AI-provider test mock + caching + maxfail. Backend CI should be fully green for the first time in months. - PR #151 stacked on #150: pytest-xdist with per-worker DBs. Local verification: 22m 27s → 4m 28s (5× speedup), 1076 passed both runs. - DoD is now: merge #150, then #151, then add CI / backend (pull_request) to required status checks on main. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 12:15:07 -04:00
Michael Chihlas	e976fb4e87	fix(ci): mock AI provider in record_decision test + cache pip/npm + drop term-missing Three changes that get PR #150 to a green CI gate: 1. test_record_decision_persists_and_bumps_state_version — the `decision: draft_template` path calls `_extract_template_parameters` (TemplateExtractionService → AI provider). CI doesn't set ANTHROPIC_API_KEY/GOOGLE_AI_API_KEY, so the endpoint raised `RuntimeError: No AI provider configured` and returned 500. The test isn't exercising the AI integration — patched the extractor with an AsyncMock returning a minimal valid `{templated_body, parameters}` dict. Verified locally: the test now passes. 2. pip + npm caches in backend, frontend, and e2e jobs. Keyed on the hash of requirements.txt / package-lock.json with a runner-os restore-key fallback. Saves ~30-60s per run on cache hit. 3. Pytest invocation tightened: - Dropped `--cov-report=term-missing` — the custom "Display coverage summary" step below parses coverage.json and prints the same module list more concisely. Term-missing dumps every uncovered line which adds ~5-10s of stdout. - Added `--maxfail=10` so a structural breakage (fixture explosion, DB unreachable) bails after 10 errors instead of running the full 25-min suite. Tunable. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 12:01:05 -04:00
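The AsyncMock approach from item 1 of commit e976fb4e87 can be illustrated in isolation. Everything named here is a hypothetical stand-in: `TemplateExtractor`, `record_decision`, and the payload values are invented for the sketch, and only the idea (patch the AI-backed extractor with an AsyncMock that returns a minimal `{templated_body, parameters}` dict) comes from the commit message.

```python
import asyncio
from unittest.mock import AsyncMock, patch


class TemplateExtractor:
    """Stand-in for an AI-backed service that fails without a provider key."""

    async def extract(self, body: str) -> dict:
        raise RuntimeError("No AI provider configured")


async def record_decision(extractor: TemplateExtractor, body: str) -> dict:
    # Stand-in for the endpoint logic under test; it only needs the dict shape.
    extracted = await extractor.extract(body)
    return {"status": "ok", "parameters": extracted["parameters"]}


def run_with_mocked_extractor():
    fake_extract = AsyncMock(
        return_value={
            "templated_body": "Restart {{service}}",
            "parameters": ["service"],
        }
    )
    # Patch the coroutine method so no AI provider is contacted.
    with patch.object(TemplateExtractor, "extract", fake_extract):
        result = asyncio.run(record_decision(TemplateExtractor(), "Restart nginx"))
    return result, fake_extract
```

The test then asserts on the endpoint's own behaviour and, if useful, on how the mock was awaited, without needing ANTHROPIC_API_KEY or GOOGLE_AI_API_KEY in CI.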
Michael Chihlas	0aefaa78eb	docs(ai): queue pytest-xdist parallelization in TODO.md Some checks failed Mirror to GitHub / mirror (push) Successful in 11s Details CI / frontend (pull_request) Has been cancelled Details CI / e2e (pull_request) Has been cancelled Details CI / backend (pull_request) Has been cancelled Details Capture the backend pytest parallelization work so it survives session end. Backend suite is currently ~22 min wall-clock for 1076 tests; xdist with one-DB-per-worker should land in the 3-6 min range on the homelab Gitea Actions runner. Also queues two backlog items: - Frontend lint warnings (23 react-hooks/exhaustive-deps after PR #149) - Periodic audit of the ResourceWarning filterwarnings added by Codex Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 11:35:38 -04:00
Michael Chihlas	49f88569da	wip(handoff): restore backend suite to green Some checks failed Mirror to GitHub / mirror (push) Successful in 12s Details CI / backend (pull_request) Failing after 27m35s Details CI / frontend (pull_request) Successful in 2m46s Details CI / e2e (pull_request) Failing after 4m9s Details Co-Authored-By: Codex <noreply@openai.com>	2026-04-25 06:13:23 -04:00
Michael Chihlas	208ec996d5	docs(ai): handoff for Codex — CI recovery + 54 real backend failures Some checks failed Mirror to GitHub / mirror (push) Successful in 11s Details CI / backend (pull_request) Failing after 28m15s Details CI / frontend (pull_request) Successful in 2m55s Details CI / e2e (pull_request) Failing after 4m23s Details Updates HANDOFF.md, CURRENT_TASK.md, and SESSION_LOG.md so the next session has accurate resume state. Summary of where things are: - PR #141 (PSA tickets), PR #147 (FlowPilot Phase 1-9), PR #148 (CI fixes part 1), PR #149 (CI fixes part 2) all merged to main in this session. - Branch protection enabled on main: PR-only, CI / frontend required. - PR #150 (this branch) is the last CI-config PR — adds DATABASE_TEST_URL to the workflow and pins upload-artifact to v3. - Next session: watch #150's CI, merge if green, add CI / backend to required checks, then start on the 54 real backend test failures. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 03:36:54 -04:00
Michael Chihlas	8f7df2c0ef	fix(ci): set DATABASE_TEST_URL + downgrade upload-artifact to v3 (Gitea Actions) Some checks failed Mirror to GitHub / mirror (push) Successful in 11s Details CI / backend (pull_request) Failing after 28m29s Details CI / frontend (pull_request) Successful in 3m11s Details CI / e2e (pull_request) Failing after 4m56s Details Two CI-config issues blocking the gate from going green: 1. Backend tests connect to localhost instead of postgres service. conftest.py reads DATABASE_TEST_URL only — DATABASE_URL is intentionally not consulted (per dab740d's test-DB-isolation hardening — running pytest with DATABASE_URL set previously dropped the dev DB schema). The CI workflow only sets DATABASE_URL, so conftest falls back to its localhost default and every fixture-setup fails with `OSError: Connect call failed ('127.0.0.1', 5432)` — observed as 638 errors on the latest main run. Add DATABASE_TEST_URL pointing at the postgres service container. Same connection string as DATABASE_URL — the test DB and the app DB are the same physical postgres in CI; conftest's safety assertion is satisfied by the URL containing "test". 2. Frontend artifact upload fails on Gitea Actions runner. actions/upload-artifact@v4 (and v5) are not supported on Gitea Actions / GHES — the runner returns `GHESNotSupportedError: ... not currently supported on GHES`. Lint itself is now passing (0 errors after PR #149); the job exits 1 only because the upload step then fails. Pin upload-artifact + download-artifact to v3, the latest version compatible with Gitea Actions until they ship v4 support. After this lands, both backend and frontend CI gates should turn green — at which point we can also add backend to the required status checks on main. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 03:28:54 -04:00
chihlasm	f27f671fe6	Merge PR #149 : fix(ci): frontend lint to zero errors + test-DB schema fix + dev-deps installable	2026-04-25 07:12:15 +00:00
Michael Chihlas	d6218f2e07	fix(tests): import all models in conftest so create_all sees the full schema The test_db fixture calls Base.metadata.create_all on a fresh test DB. That only creates tables for models that have been imported (and thus registered with Base.metadata) by the time the fixture runs. app.main imports app.core.database (which gives us Base) but does NOT eagerly import the model modules — most are pulled in lazily inside scheduler functions (archive_stale_ai_sessions etc.) and route modules. At fixture-setup time, only the handful of models touched by those eager imports are on the metadata, so any test that exercises PSA, network diagrams, ratings, escalations, etc. fails with `UndefinedTableError: relation "X" does not exist` and a cascade of 500s on every endpoint that queries the missing table. Adding `from app import models as _models` (rather than the bare `import app.models` which would shadow the `app` FastAPI instance imported just above) pulls in app/models/__init__.py, which itself imports every model module — registering all ~60 tables with Base.metadata before create_all runs. Verified locally: tests/test_psa_writeback_phase4.py went from 1 failed / 6 errors → 4 failed / 3 passed (the cascading errors were masking the actual passes). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 02:49:06 -04:00
Michael Chihlas	920a246d77	fix(react): remove four setState-in-effect cascades flagged by react-hooks v5 The new react-hooks lint rule "Calling setState synchronously within an effect can trigger cascading renders" flagged real anti-patterns in four spots. Refactored each per the rule's intent (derive during render, or use useSyncExternalStore for external subscriptions). 1. hooks/useMediaQuery.ts — replaced the useState + useEffect pair with useSyncExternalStore. That's the canonical React hook for subscribing to external stores (matchMedia in this case) without mirroring into local state via an effect. Snapshot/getServerSnapshot pair preserves the SSR-safe behaviour. 2. components/network/nodes/DeviceNode.tsx — the prop-sync useEffect that copied nodeData.label into labelValue was redundant. labelValue is the EDIT BUFFER; while not editing, the displayed span now reads nodeData.label directly. The buffer is initialized only when an edit session starts (onDoubleClick). 3. components/network/nodes/GroupNode.tsx — same pattern, same fix. 4. components/dashboard/TicketQueue.tsx — the setTickets([]) + setLoading(true) + fetchTickets() chain in the effect was the cascade. Pushed those writes inside fetchTickets (after the function boundary, so they batch with the eventual setTickets(result)). Added a request-id ref so a slow first response can't overwrite a fast second one. Frontend lint: 20 errors → 0 errors. tsc -b clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 02:33:13 -04:00
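The request-id guard mentioned for TicketQueue.tsx in 920a246d77 is a general "latest request wins" pattern. The sketch below is a Python asyncio analog, not the React code from the commit: the class `LatestOnlyStore`, its fields, and the delays are all hypothetical and only illustrate how a counter lets a stale response be discarded.

```python
import asyncio


class LatestOnlyStore:
    """Keeps only the result of the most recently started request."""

    def __init__(self):
        self._request_id = 0  # plays the role of the request-id ref
        self.tickets = None

    async def fetch(self, delay, result):
        # Tag this request with a fresh id before doing any async work.
        self._request_id += 1
        my_id = self._request_id
        await asyncio.sleep(delay)  # stands in for the network call
        # A newer request has started since: drop this stale response.
        if my_id != self._request_id:
            return False
        self.tickets = result
        return True


async def demo():
    store = LatestOnlyStore()
    slow = asyncio.create_task(store.fetch(0.05, ["stale"]))
    await asyncio.sleep(0)  # let the slow request start and take id 1
    fast = asyncio.create_task(store.fetch(0.0, ["fresh"]))
    applied = await asyncio.gather(slow, fast)
    return store.tickets, applied


final_tickets, applied_flags = asyncio.run(demo())
```

The slow first request finishes last, sees that its id is no longer current, and leaves the newer result in place.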
Michael Chihlas	b7f8e70be2	fix(lint): replace explicit-any types + unused-expressions ternaries Five files, all stylistic: - useFlowPilotSession.ts: typed the axios error shape with a narrow inline type instead of `as any`. - FlowPilotSessionPage.tsx: same — typed location.state once, then destructured. - ScriptBuilderTab.tsx: handleViewScript was a placeholder no-op; declared the args properly with `void script; void filename` so the signature matches ScriptBuilderChatProps without no-unused-vars firing. - TicketsPage.tsx: replaced 8 ternaries-as-statements (`x ? f() : g()`) with proper if/else blocks. Same control flow, satisfies no-unused-expressions, and reads better in the URL-param update paths. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 02:32:57 -04:00
Michael Chihlas	857d73e3d0	fix(lint): move AssistantSessionRedirect out of router.tsx (react-refresh gate) react-refresh/only-export-components fires when a file with the `router` const export also defines a component (the redirect helper). Moves the small helper to its own file under components/routing/ so HMR can keep the route-component module hot-reload-eligible. No behavior change. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 02:32:50 -04:00
Michael Chihlas	406ee0ef97	fix(deps): bump pytest 7.4 → 8.4, pytest-cov 4.1 → 5.0 to satisfy pytest-asyncio 0.24 pytest-asyncio==0.24.0 (added on the FlowPilot branch as part of the RLS test infra refactor) declares pytest>=8.2 — but requirements-dev.txt still pinned pytest==7.4.3, so a clean pip install fails with ResolutionImpossible. CI runners that started from a fresh image would have refused to install dev deps; the FlowPilot tests passed locally only because the dev container had a pre-installed pytest 8.x lying around. pytest-cov 4.1.0 also needs >= 5.0 to play nicely with pytest 8. No code changes — pytest 8 is API-compatible with the existing test suite once the install resolves. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 02:32:43 -04:00
chihlasm	32fae2c693	Merge PR #147 : feat: FlowPilot migration — Phase 1-9 + Phase 9 bug fixes + QA fixture harness	2026-04-25 06:02:14 +00:00
Michael Chihlas	a45915fbbc	Merge main into feat/flowpilot-migration (PR #148 backports) Brings PR #148 — two pre-existing CI fixes (network_diagrams JSONB server_default, removed deprecated session-scoped event_loop fixture). The conftest.py event_loop fix on main is already incorporated in FlowPilot's `b14a16a` (RLS-gating commit, which dropped the same fixture as part of its larger refactor). Kept HEAD's version of the RLS-gating collection hook; the event_loop fixture removal is identical. The network_diagram.py fix lands cleanly via auto-merge. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 02:01:46 -04:00
chihlasm	06593a40d9	Merge PR #148 : fix(tests): repair two pre-existing bugs blocking backend CI	2026-04-25 06:01:08 +00:00
Michael Chihlas	9737d90f1b	fix(tests): repair two pre-existing bugs blocking the backend CI gate 1. backend/app/models/network_diagram.py — `nodes` and `edges` columns used `server_default="'[]'"` (a Python string), which SQLAlchemy wraps in single quotes when generating DDL, producing `JSONB DEFAULT '''[]'''` — invalid JSON. Switch to `server_default=text("'[]'::jsonb")` so the literal is passed through and the table can actually be created. Surfaced on every CI run as `asyncpg.exceptions.InvalidTextRepresentationError: invalid input syntax for type json` at fixture setup time, cascading hundreds of test errors. 2. backend/tests/conftest.py — drop the deprecated session-scoped `event_loop` fixture. Since pytest-asyncio 0.23+, the plugin manages the loop itself; redefining it with a session scope but never `set_event_loop()`-ing it left the loop dangling, so any test that called `asyncio.run()` (e.g. `test_tasks_are_isolated`) closed the process loop and broke the next async test in the module — `test_require_tenant_context_raises_403_when_no_account` was the visible casualty in the CI logs. Verified locally: - `pytest tests/test_uploads.py::test_upload_success` — was setup-error on `network_diagrams` DDL; now passes. - `pytest tests/test_tenant_context.py` — was 1 fail / 3 pass; now 4/4. Both are real bugs, not test infrastructure churn. Pre-existing on main; not introduced here. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 01:49:50 -04:00
Michael Chihlas	1c904373f8	Merge main into feat/flowpilot-migration Brings in PR #141 (PSA ticket management) so FlowPilot can ship on top of a unified main. Two manual conflict resolutions: 1. CLAUDE.md — kept the FlowPilot ai-handoff rewrite (`.ai/`-driven protocol). The pre-rewrite reference content (CW integration notes, lessons archive, env vars table) lives in `docs/connectwise/`, `docs/LESSONS-ARCHIVE.md`, and DEV-ENV.md by design. 2. frontend/src/pages/AssistantChatPage.tsx — both conflict regions were purely additive. Concatenated FlowPilot's Phase 2-9 state hooks (facts, activeFix, preview*, scriptPanelOpen, templatizeQueue) with PSA's spin-off ticket state (linkedTicket, showNewTicket, spinOffHint). Both modal mounts (TemplatizePrompt, ShortcutsHelpOverlay, NewTicketModal) kept. All setters wired by either branch are intact. Verification: - `tsc -b` clean across the merged tree. - Browser smoke-test (Session B fixture): Phase 9 ProposalBanner ("Run AI-drafted PowerShell to recover SSL VPN") renders alongside PSA's new Tickets sidebar icon. Console clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 01:03:33 -04:00
chihlasm	16060d2235	Merge PR #141 : feat: PSA ticket management — /tickets page, detail panel, AI ticket creation	2026-04-25 04:59:02 +00:00
Michael Chihlas	9330ce4782	fix(pilot): two Phase 9 layout/state bugs surfaced by QA fixtures 1. EscalateInterceptDialog clipped off-screen. The dialog was positioned with `absolute bottom-full mb-2 left-0` under the assumption the Escalate button would have room above it. In practice the button lives in the chat-page action bar near y≈105, so the 302 px dialog overflows the top of the viewport and only the last option is visible. Switch to `top-full mt-2 right-0` — anchors the dialog below the button and aligns its right edge with the button (avoids overflow off the right when the button is in the right-side action cluster). 2. TemplateMatchPanel never renders on a fresh session. `handleApplyFix` for the script_template_id branch only sets `scriptPanelOpen=true`, but TemplateMatchPanel is mounted inside `TaskLane.bottomSlot`. On sessions with no questions/facts the lane defaults closed, so the panel exists in the React tree but inside an unrendered TaskLane — the user clicks Apply fix and nothing visibly changes. Fix: also `setShowTaskLane(true)` in that branch so the lane opens alongside the panel. The ai_drafted_script branch is fine (InlineNoTemplateDialog renders in the chat region, not in the lane), so it's left alone. Both bugs were latent — they only surface on sessions that haven't accumulated TaskLane state yet (questions/facts). Fresh sessions created from the StartSessionInput hide them because the AI's first turn populates questions and the lane auto-opens. Caught using the new seed_phase9_qa_fixtures.py harness. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 00:08:50 -04:00
Michael Chihlas	d68131a865	feat(seed): Phase 9 QA fixture seeder Adds backend/scripts/seed_phase9_qa_fixtures.py — creates 4 ai_sessions plus matching session_suggested_fixes that pre-bake the four backend states the AI orchestrator must produce to mount the five conditional Phase 9 components: A. no template, no draft → ChatTabStrip + ScriptBuilderTab B. ai_drafted_script set → InlineNoTemplateDialog C. script_template_id set → TemplateMatchPanel D. applied_at + status=proposed → EscalateInterceptDialog (verify state) Background: a Phase 9 QA pass against a regular session left these five components unreached because the AI didn't emit SUGGEST_FIX in time/at all. Seeding directly bypasses the AI and lets QA exercise each surface deterministically. UUIDs are deterministic (uuid5 over a fixed namespace) so re-runs upsert. Pass --reset to wipe and recreate. Each session gets two synthetic conversation messages so the chat header's canAct gate (messages.length >= 2) opens up Resolve/Escalate. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 00:08:38 -04:00
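The deterministic-UUID approach mentioned in d68131a865 (uuid5 over a fixed namespace so that re-runs upsert) can be illustrated with the standard library. This is a sketch under assumptions: the namespace value, the label strings, and the helper name `fixture_id` are hypothetical, since the seeder's real values are not shown in the log.

```python
import uuid

# Hypothetical fixed namespace; the seeder's actual namespace is not given in the commit.
QA_FIXTURE_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "phase9-qa-fixtures.example")


def fixture_id(label: str) -> uuid.UUID:
    """Derive a stable UUID from a human-readable fixture label.

    uuid5 hashes (namespace, name), so the same label always maps to the
    same UUID, and a re-run can upsert rows instead of creating duplicates.
    """
    return uuid.uuid5(QA_FIXTURE_NAMESPACE, label)


# One id per seeded session state (labels are illustrative only).
session_ids = {name: fixture_id(f"session-{name}") for name in "ABCD"}
```

Because the ids depend only on the namespace and label, running the sketch twice produces identical `session_ids`, which is what makes an upsert-on-rerun seeder possible.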
Michael Chihlas	875bd924a9	fix(pilot): auto-scroll Resolve preview into view when opened The ResolutionNotePreview popover renders inside TaskLane's overflow-y-auto region at the bottom of the lane. On a 720px viewport with the default question/check list expanded, the popover lands below the visible scroll position — the engineer clicks "Preview Resolve note", sees the button label flip to "Showing", but no preview appears on screen. Add a useEffect that calls scrollIntoView({block: 'nearest'}) on the popover's outer div whenever `open` flips to true. block: 'nearest' scrolls just enough to make it visible without yanking the lane to the top. Discovered during Phase 9 QA. Reproduced at 1280x720; fix verified visually in the same QA run (screenshots in .gstack/qa-reports/phase9-*/). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-24 23:45:52 -04:00
Michael Chihlas	49c6c8fd00	fix(seed): include cancel_at_period_end in test-user subscription INSERT Discovered during Phase 9 QA: seed_test_users.py was missing the cancel_at_period_end column in its subscriptions INSERT, but the column is NOT NULL (added in 016_add_subscription_tables.py). Result: seed crashed with NotNullViolationError before any users were created, blocking auth in fresh dev environments. Pre-existing on main; not introduced by the FlowPilot migration branch. Default value: false. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-24 23:36:04 -04:00
Michael Chihlas	a77e8ea578	chore: bootstrap gstack team mode Per gstack team-mode install: adds a PreToolUse hook that blocks skill usage when gstack isn't installed globally, so contributors are prompted to install it. Un-ignores the two required files (.claude/settings.json, .claude/hooks/check-gstack.sh) while keeping settings.local.json and other Claude state ignored. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-24 23:17:06 -04:00


1040 Commits