Merge PR #150: fix(ci): consolidated CI recovery — backend green, xdist parallelization, e2e selector + decoupling

2026-04-25 21:57:26 +00:00
parent f27f671fe6 1e3a6cfa01
commit 87bb20b8f0
40 changed files with 411 additions and 121 deletions
--- a/.ai/CURRENT_TASK.md
+++ b/.ai/CURRENT_TASK.md
@@ -1,33 +1,21 @@
 # CURRENT_TASK.md

-**Task:** none — replace this file when starting the next real task.
+**Task:** Land consolidated CI-recovery PR #150 and lock reliable CI gates on `main`.

-**Status:** not-started
-
-**Definition of Done:** n/a
-
-**Assumptions:** n/a
-
-**Out of scope:** n/a
-
---
-
-<!-- When you start a real task, replace the block above with:
-
-**Task:** One-sentence goal.
-
-**Status:** not-started | in-progress | blocked | ready-for-review | complete
+**Status:** in-progress

 **Definition of Done:**
- [ ] Testable criterion 1
- [ ] Testable criterion 2
- [ ] Tests added or updated
- [ ] `npm run build` passes (frontend) / `pytest` passes (backend)
+- [ ] PR #150 (`fix/ci-workflow-config`) merged. `CI / backend (pull_request)`, `CI / frontend (pull_request)`, and `CI / e2e (pull_request)` show success before merge.
+- [ ] `CI / backend (pull_request)` added to required status checks on `main` in Gitea branch protection (frontend is already required).
+- [ ] Optional: `CI / e2e (pull_request)` confirmed clean across at least one PR run and added to required checks.

 **Assumptions:**
- What we're treating as given
+- The 8-core homelab Gitea Actions runner can support `-n auto` (8 xdist workers). If memory pressure shows up in CI, drop to `-n 4`.
+- pytest-cov's xdist support continues to handle the coverage merge and `--cov-fail-under=50` check correctly.
+- The per-worker DB creation in `conftest.py` is idempotent, and workers racing to `CREATE DATABASE` on first import are assumed not to collide because Postgres serializes that statement. If this surfaces issues in CI, wrap the creation in an advisory lock.

 **Out of scope:**
- What this task explicitly does NOT cover
-
-->
+- Frontend lint warnings (23 remain after #149).
+- The 23 react-hooks/exhaustive-deps warnings.
+- RLS test suite (gated behind `RUN_RLS_TESTS=1`; not in default CI).
+- Per-test transactional rollback (would shave another 30-40% off backend time but is a much bigger refactor — capture in TODO if interested).
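The per-worker isolation assumed in this task can be sketched roughly as follows. This is a minimal illustration, not the actual `conftest.py`: the helper name `worker_database_url` is hypothetical, and it assumes the xdist worker ID (for example `gw0`) is appended to the database name while serial runs keep the base database.

```python
import os
from typing import Optional
from urllib.parse import urlsplit


def worker_database_url(base_url: str, worker_id: Optional[str]) -> str:
    """Return a per-worker database URL (hypothetical helper, not the repo's code)."""
    # pytest-xdist sets PYTEST_XDIST_WORKER to "gw0", "gw1", ... inside workers.
    # The variable is unset in serial runs; xdist's worker_id fixture reports
    # "master" in that case, so both are treated as "use the base database".
    if not worker_id or worker_id == "master":
        return base_url
    parts = urlsplit(base_url)
    # Append the worker ID to the database name, which is the URL path.
    return parts._replace(path=f"{parts.path}_{worker_id}").geturl()


BASE_URL = "postgresql+asyncpg://postgres:postgres@postgres:5432/resolutionflow_test"
print(worker_database_url(BASE_URL, os.environ.get("PYTEST_XDIST_WORKER")))
```

Deriving the name from the URL path keeps credentials, host, and port untouched, so each worker talks to the same Postgres service but a different database.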
--- a/.ai/HANDOFF.md
+++ b/.ai/HANDOFF.md
@@ -2,34 +2,63 @@

 # HANDOFF.md

-**Last updated:** 2026-04-24 (America/New_York)
+**Last updated:** 2026-04-25 16:41 EDT

-**Active task:** None — see [CURRENT_TASK.md](CURRENT_TASK.md). Replace it when picking up the next real task.
+**Active task:** Land PR #150 (the consolidated CI-recovery PR), then enable backend and eventually e2e gates on `main`. See [CURRENT_TASK.md](CURRENT_TASK.md).

-**Branch:** `feat/flowpilot-migration` — a long-running FlowPilot Phase 9 feature branch. The recent AI-handoff migration commits ride on this branch (not on their own branch); they'll merge to `main` whenever Phase 9 does.
+**Branch:** `fix/ci-workflow-config` -> PR #150. PRs #151 and #152 were closed and consolidated into this branch.

-**Branch state:** 3 commits ahead of `origin/feat/flowpilot-migration`:
+## Current resume point

- `b3be1e0 chore: ignore .remember/ skill runtime state`
- `b3506b5 docs(pilot): phase 9 review issues`
- `b14a16a chore(tests): gate RLS tests behind RUN_RLS_TESTS flag`
+Latest PR #150 CI had backend and frontend green, but `CI / e2e (pull_request)` failed on the resume smoke test.

-Earlier in this session (already pushed to origin):
+The failure was not product behavior. Playwright was using:

- `9c8ba29 fix(ai): correct stale role-hierarchy and file-listing claims`
- `bee8690 chore(ai): migrate to dual-agent handoff system`
- `e110fed chore: snapshot CLAUDE.md before ai-handoff migration` (tag: `pre-ai-handoff`)
+```ts
+page.locator('.bg-card').filter({ hasText: tree.name }).first()
+```

-**Where I left off:**
- File: n/a — nothing mid-edit.
- Next intended action: push the 3 unpushed commits when ready (`git push`), then start the next real task (replace `CURRENT_TASK.md`, update this file).
+On the session history page this matched the tree filter `<select>` first because the select options contain the same flow name, then the test waited forever for a `Resume` button inside the select.

-**Uncommitted state:**
- Working tree is clean.
+This session fixed that properly by adding stable test IDs to repeated cards and moving e2e tests off `.bg-card` selectors:

-**Immediate next steps:**
-1. `git push` to publish the 3 local commits (cleanup batch).
-2. When starting the next real feature task: replace `CURRENT_TASK.md` with actual goal/DoD, rewrite this file's resume section.
+- `flow-session-card` in `SessionHistoryPage.tsx`
+- `tree-card` in `TreeGridView.tsx` and `TreeListView.tsx`
+- `share-card` in `MySharesPage.tsx`

-**Open questions / blockers:**
- None. The dual-agent handoff system is live and has survived one Codex review round (see DECISIONS.md 2026-04-24 entry; corrections in `9c8ba29`).
+The workflow was also hardened:
+
+- Postgres service healthchecks now run `pg_isready -U postgres` instead of checking as `root`.
+- The e2e frontend build now bakes `VITE_API_URL="${PLAYWRIGHT_API_ORIGIN}"`, matching the Playwright backend origin.
+
+## Verification completed
+
+- `git diff --check`
+- Confirmed no remaining `.bg-card` selectors in `frontend/e2e/*.ts`.
+- `docker exec -w /app resolutionflow_frontend npm run build`
+- Ran migrations and test-user seed in the dev backend container.
+- Focused Playwright verification in an Actions-like Ubuntu container:
+  - First `e2e/resume.spec.ts` passed.
+  - Then `e2e/history.spec.ts e2e/library.spec.ts e2e/library-start.spec.ts e2e/resume.spec.ts e2e/shares.spec.ts --project=chromium --workers=1` passed: `6 passed (1.3m)`.
+
+## Immediate next steps
+
+1. Push the WIP commit from this session to PR #150.
+2. Watch PR #150 CI on the new SHA. Expected result: backend, frontend, and e2e all green.
+3. Merge PR #150 when green.
+4. Enable `CI / backend (pull_request)` as a required status check on `main`.
+5. After at least one reliable green PR run, consider adding `CI / e2e (pull_request)` as required too.
+
+## Branch protection on main (current)
+
+- PR-only merges
+- `CI / frontend (pull_request)` required
+- Force-push blocked
+- No review required (solo)
+
+## Useful breadcrumbs
+
+- `.gitea/workflows/ci.yml` contains the parallel backend/frontend/e2e workflow.
+- `backend/scripts/seed_phase9_qa_fixtures.py` pre-bakes Phase 9 QA fixtures.
+- `.gstack/qa-reports/phase9-20260424-232700/REPORT.md` has the FlowPilot QA report.
+- Per-worker test DBs accumulate on the Postgres service. Cheap to leave around; clean up if needed.
--- a/.ai/SESSION_LOG.md
+++ b/.ai/SESSION_LOG.md
@@ -12,6 +12,48 @@

 ---

+## 2026-04-25 16:41 EDT — Codex — Stabilize PR #150 e2e selectors
+
+- Investigated the remaining PR #150 failure after backend and frontend CI were green. The e2e resume smoke test was not failing because of product behavior; it used `.bg-card` plus text filtering and matched the tree filter `<select>` before the intended session card.
+- Added stable test IDs to flow session, tree, and share cards, then updated affected e2e tests to target those cards instead of Tailwind class names.
+- Hardened the CI workflow by making Postgres healthchecks authenticate as `postgres` and baking `VITE_API_URL="${PLAYWRIGHT_API_ORIGIN}"` into the e2e frontend build.
+- Verified with `git diff --check`, frontend build in Docker, no remaining `.bg-card` e2e selectors, and focused Playwright runs in an Actions-like Ubuntu container: resume spec passed, then history/library/library-start/resume/shares passed (`6 passed`).
+- Left for next session: push this WIP commit to PR #150, watch CI, merge when all three jobs are green, then enable backend branch protection and consider the e2e gate after a reliable green run.
+- Files touched: `.gitea/workflows/ci.yml`, `frontend/e2e/history.spec.ts`, `frontend/e2e/library-start.spec.ts`, `frontend/e2e/library.spec.ts`, `frontend/e2e/resume.spec.ts`, `frontend/e2e/shares.spec.ts`, `frontend/src/components/library/TreeGridView.tsx`, `frontend/src/components/library/TreeListView.tsx`, `frontend/src/pages/MySharesPage.tsx`, `frontend/src/pages/SessionHistoryPage.tsx`, `.ai/HANDOFF.md`, `.ai/CURRENT_TASK.md`, `.ai/SESSION_LOG.md`.
+
+## 2026-04-25 12:00 America/New_York — Claude Code — Mock final AI-provider test, cache CI deps, parallelize backend with pytest-xdist
+
+- Diagnosed why CI was still red despite Codex's local 1076 passed: a single test (`test_record_decision_persists_and_bumps_state_version`) needed `ANTHROPIC_API_KEY` because the `decision: draft_template` path calls `TemplateExtractionService` → AI provider. Patched `_extract_template_parameters` with an `AsyncMock` so the test no longer depends on AI availability. Verified.
+- Pushed Codex's WIP commit `49f8856` to PR #150 (had been local-only per handoff protocol).
+- PR #150 (`fix/ci-workflow-config`) extended with cheap CI wins: `actions/cache@v3` for pip + npm in all three jobs; dropped `--cov-report=term-missing` (the custom display step parses JSON); added `--maxfail=10` so structural breakage exits fast.
+- PR #151 (`fix/ci-pytest-xdist`) opened, stacked on #150: pytest-xdist with per-worker DB isolation. `conftest.py` reads `PYTEST_XDIST_WORKER`, computes a per-worker DB URL like `…_gw0`, and synchronously CREATEs the DB on first import. The per-test `DROP SCHEMA public CASCADE` then operates on the worker's isolated DB. Verified locally: backend suite went from 22m 27s serial → 4m 28s parallel (8 workers), 1076 passed in both cases. ~5× speedup.
+- Decided NOT to do per-test transactional rollback (bigger refactor); captured for future TODO consideration.
+- Left for next session: watch CI on both PRs, merge in order (#150 first, #151 second), then enable `CI / backend (pull_request)` as a required status check on main.
+- Files touched: `backend/tests/test_session_suggested_fixes_api.py`, `backend/tests/conftest.py`, `backend/requirements-dev.txt`, `.gitea/workflows/ci.yml`, `.ai/HANDOFF.md`, `.ai/CURRENT_TASK.md`, `.ai/TODO.md`.
+
+## 2026-04-25 06:12 EDT — Codex — Fix backend suite to green
+
+- Fixed the real backend failures left after the CI-infra cleanup: tenant-scoped seed drift, missing production `account_id` writes, public route mounting for survey/share links, Script Builder library saves, resolution output async loading, AI search schema metadata, disabled-AI fixture leakage, and prompt marker guardrails.
+- Added backend CI/dev system packages required by WeasyPrint PDF export.
+- Stabilized the pytest harness for pytest-asyncio/asyncpg teardown ResourceWarnings under `filterwarnings = error`.
+- Verified `pytest --override-ini="addopts=" -q` inside `resolutionflow_backend`: `1076 passed, 35 deselected in 1347.41s`.
+- Left for next session: commit/push if needed, check and merge PR #150 when Gitea CI is green, add backend CI as a required branch-protection check, and rerun frontend lint if final DoD requires it.
+- Files touched: `.gitea/workflows/ci.yml`, `backend/Dockerfile.dev`, `backend/app/api/endpoints/folders.py`, `backend/app/api/endpoints/script_builder.py`, `backend/app/api/endpoints/shares.py`, `backend/app/api/router.py`, `backend/app/models/ai_session.py`, `backend/app/schemas/user.py`, `backend/app/services/assistant_chat_service.py`, `backend/app/services/resolution_output_generator.py`, `backend/app/services/script_builder_service.py`, `backend/pytest.ini`, `backend/tests/conftest.py`, and focused backend tests.
+
+## 2026-04-25 02:00 America/New_York — Claude Code — Land FlowPilot + PSA, recover CI from 488 errors to ~4
+
+- Started session by completing pending FlowPilot Phase 9 QA: ran `/qa` against the seeded fixtures, found and fixed four latent layout/state bugs (`ResolutionNotePreview` off-screen, `TemplateMatchPanel` deadlock when TaskLane closed, `EscalateInterceptDialog` clipped above viewport, `seed_test_users.py` `cancel_at_period_end` NOT NULL crash). Added a new fixture seeder `backend/scripts/seed_phase9_qa_fixtures.py` that pre-bakes the four backend states the AI orchestrator needs to emit, so future QA can exercise all 7 conditional Phase 9 components without depending on stochastic AI behavior.
+- Discovered PR #141 (PSA ticket management) and `feat/flowpilot-migration` had 5 overlapping files but only 2 real conflicts (`CLAUDE.md`, `AssistantChatPage.tsx`). Both conflicts were additive, so they were resolved by concatenating both sides rather than choosing one.
+- Merged PSA first (PR #141), then merged FlowPilot (PR #147), each through Gitea API. `tsc -b` clean and visual smoke-test confirmed PSA's Tickets sidebar coexists with Phase 9 ProposalBanner.
+- Discovered main had been merging through a broken CI gate for several merges. Initially recommended "stop the line, fix CI before shipping." After scoping the actual rot (~50% of tests red, ~600 errors on a clean run), reversed the recommendation: ship the queue first because FlowPilot itself carried significant test-infra repairs that would be duplicated work on a fresh recovery branch.
+- PR #148: two surgical fixes to main (network_diagrams JSONB `server_default` triple-quote bug, deprecated session-scoped `event_loop` fixture in conftest). +78 passing / -114 errors.
+- PR #149: frontend lint `20 errors → 0`, `requirements-dev.txt` pytest pin bumped to satisfy `pytest-asyncio==0.24.0`'s `pytest>=8.2`, and a one-line `from app import models as _models` in conftest that registers all ~60 models with `Base.metadata` before `create_all`. The conftest fix collapsed 484 of the remaining 488 backend errors. `1018 passed / 4 errors / 54 failed` after.
+- Enabled Gitea branch protection on `main`: PR-only merges, `CI / frontend (pull_request)` required, force-push blocked, no review required.
+- Discovered CI on the merge commit STILL showed red despite local pytest being mostly green. Root cause: workflow only set `DATABASE_URL`, but conftest reads only `DATABASE_TEST_URL` (per `dab740d`'s safety hardening). 638 connection-refused errors on every fixture setup. Plus `actions/upload-artifact@v4` not supported by Gitea Actions. PR #150 fixes both.
+- Left for next session: merge PR #150 once CI confirms green, add `CI / backend (pull_request)` to required status checks, then root-cause and fix the 54 real backend test failures (one sample seen — `test_user` fixture leaking across calls causing duplicate-email violations).
+- Files touched (committed): `backend/scripts/seed_test_users.py`, `backend/scripts/seed_phase9_qa_fixtures.py` (new), `backend/app/models/network_diagram.py`, `backend/tests/conftest.py`, `backend/requirements-dev.txt`, `frontend/src/components/pilot/ResolutionNotePreview.tsx`, `frontend/src/components/pilot/EscalateInterceptDialog.tsx`, `frontend/src/components/pilot/ScriptBuilderTab.tsx`, `frontend/src/pages/AssistantChatPage.tsx`, `frontend/src/pages/FlowPilotSessionPage.tsx`, `frontend/src/pages/TicketsPage.tsx`, `frontend/src/hooks/useFlowPilotSession.ts`, `frontend/src/hooks/useMediaQuery.ts`, `frontend/src/components/dashboard/TicketQueue.tsx`, `frontend/src/components/network/nodes/DeviceNode.tsx`, `frontend/src/components/network/nodes/GroupNode.tsx`, `frontend/src/components/routing/AssistantSessionRedirect.tsx` (new), `frontend/src/router.tsx`, `.gitea/workflows/ci.yml`, `.claude/settings.json` (new), `.claude/hooks/check-gstack.sh` (new), `.gitignore`, `CLAUDE.md`, `.gstack/qa-reports/phase9-*/` (QA artifacts).
+- Net merges to main: PR #141 (PSA), PR #147 (FlowPilot), PR #148 (CI fixes part 1), PR #149 (CI fixes part 2). PR #150 still open at session end.
+
 ## 2026-04-24 — Claude Code — Migrate to dual-agent handoff system

 - Split CLAUDE.md into `.ai/PROJECT_CONTEXT.md` + shared-protocol root files (`CLAUDE.md`, `AGENTS.md`).
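The 12:00 session entry above describes patching `_extract_template_parameters` with an `AsyncMock` so the test no longer needs `ANTHROPIC_API_KEY`. A minimal sketch of that pattern follows. The class body, the `draft_template` method, and the returned parameter dict are hypothetical stand-ins for illustration; only the patched method name comes from the log.

```python
import asyncio
from unittest.mock import AsyncMock, patch


class TemplateExtractionService:
    """Hypothetical stand-in; the real service calls an AI provider."""

    async def _extract_template_parameters(self, text: str) -> dict:
        # Simulates the failure seen in CI when no API key is configured.
        raise RuntimeError("ANTHROPIC_API_KEY is not set")

    async def draft_template(self, text: str) -> dict:
        parameters = await self._extract_template_parameters(text)
        return {"text": text, "parameters": parameters}


async def run_with_mock() -> dict:
    # Replace the provider-backed coroutine with an AsyncMock for this block only.
    with patch.object(
        TemplateExtractionService,
        "_extract_template_parameters",
        new_callable=AsyncMock,
        return_value={"host": "string"},
    ) as mocked:
        result = await TemplateExtractionService().draft_template("restart {host}")
        mocked.assert_awaited_once()
        return result


result = asyncio.run(run_with_mock())
print(result["parameters"])
```

Patching at the method boundary keeps the rest of the decision path under test while removing the dependency on provider availability.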
--- a/.ai/TODO.md
+++ b/.ai/TODO.md
@@ -5,8 +5,12 @@

 ## Up next

- [ ] No queued backlog yet.
+- [ ] **Parallelize backend pytest with pytest-xdist.** ✅ Landing via PR #150 (originally opened as PR #151, which was closed and consolidated into #150). Verified locally: backend suite 22m 27s → 4m 28s with `-n auto` on the 8-core homelab runner. Per-worker DB isolation via `PYTEST_XDIST_WORKER` in conftest.py.

 ## Backlog

- [ ] No queued backlog yet.
+- [ ] **Frontend lint warnings cleanup.** 23 `react-hooks/exhaustive-deps` warnings remain after PR #149 (mostly missing-deps in useEffect). Either fix them or audit them for known-safe ones and add eslint-disable comments. Not blocking CI today.
+- [ ] **Audit `filterwarnings` ignores added in `wip(handoff): restore backend suite to green`.** Codex added narrow `ResourceWarning` filters for unclosed socket/transport/event-loop noise from pytest-asyncio teardown. Worth periodically reviewing whether those are still needed (e.g. when bumping pytest-asyncio) — if a real warning appears in those forms it would be silenced.
+- [ ] **Add `data-testid` attributes to e2e-critical interactive elements.** PR #152 fixed five Playwright tests by chasing UI-text changes (`Sessions` → `Session History`, `Account Settings` → `Account Management`, `/assistant` → `/pilot`, "Flow Sessions" tab, Resume button on session cards). Each was a one-line selector update, but every UI churn re-breaks them. Adding stable `data-testid` attributes on the targeted elements (page heading wrappers, tab nav, primary action buttons) and switching tests to `getByTestId` would make these immune to copy/route renames. Scope it small — start with `SessionHistoryPage` heading, the AI/Flow Sessions tab buttons, the per-session `Resume` button, and the command-palette FlowPilot option.
+- [ ] **Per-test transactional rollback in `test_db` fixture.** Bigger engineering than xdist (which we already shipped). Instead of `DROP SCHEMA public CASCADE` per test, wrap each test in a savepoint and rollback at teardown. ~30-40% additional speedup on top of xdist for test-DB-heavy tests. Real refactor; only worth it if the suite gets significantly larger or runs more frequently.
+- [ ] **Consider `pytest-testmon` for PR-time test selection.** Tracks which tests touched which source files and only re-runs affected ones. Best for small PRs touching ~few files. Adds cache-invalidation complexity; only worth it if the suite stays painfully long even after xdist.
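The per-test transactional rollback item above can be illustrated with a small sketch. It uses the standard-library `sqlite3` module as a stand-in for Postgres, and the `run_isolated` helper and `users` table are hypothetical; the real `test_db` fixture would do the equivalent through its own async engine.

```python
import sqlite3


def run_isolated(conn: sqlite3.Connection, test_body) -> None:
    """Run test_body inside a savepoint and undo its writes afterwards."""
    conn.execute("SAVEPOINT test_case")
    try:
        test_body(conn)
    finally:
        # Undo everything the test wrote, then discard the savepoint.
        conn.execute("ROLLBACK TO SAVEPOINT test_case")
        conn.execute("RELEASE SAVEPOINT test_case")


def insert_user(conn: sqlite3.Connection) -> None:
    conn.execute("INSERT INTO users (email) VALUES ('qa@example.com')")


# isolation_level=None lets the sketch manage savepoints explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE users (email TEXT UNIQUE)")

run_isolated(conn, insert_user)
run_isolated(conn, insert_user)  # no duplicate-email error: the first insert was rolled back

remaining = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(remaining)
```

The schema is created once and only row changes are undone per test, which is where the expected saving over `DROP SCHEMA public CASCADE` would come from.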
--- a/.gitea/workflows/ci.yml
+++ b/.gitea/workflows/ci.yml
@@ -17,10 +17,13 @@ jobs:
          POSTGRES_USER: postgres
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: resolutionflow_test
-        ports:
-          - 5432:5432
+        # No host port mapping. Tests connect to `postgres:5432` (the service
+        # container's docker-network DNS name), not `localhost:5432`. With
+        # multiple Gitea runners on the same homelab box, host-port mapping
+        # would race — two backend/e2e jobs both binding 0.0.0.0:5432 → the
+        # second fails with "port is already allocated".
        options: >-
-          --health-cmd pg_isready
+          --health-cmd "pg_isready -U postgres"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
@@ -28,6 +31,12 @@ jobs:
    env:
      DATABASE_URL: postgresql+asyncpg://postgres:postgres@postgres:5432/resolutionflow_test
      DATABASE_URL_SYNC: postgresql://postgres:postgres@postgres:5432/resolutionflow_test
+      # conftest.py reads DATABASE_TEST_URL only (DATABASE_URL is intentionally
+      # not consulted after the dab740d test-isolation hardening). The CI test
+      # DB is the same postgres service, so point DATABASE_TEST_URL at it
+      # explicitly — without this, conftest falls back to localhost:5432 and
+      # all tests fail at fixture setup with "connection refused".
+      DATABASE_TEST_URL: postgresql+asyncpg://postgres:postgres@postgres:5432/resolutionflow_test
      SECRET_KEY: ci-test-secret-key-not-for-production
      DEBUG: "true"
      APP_NAME: ResolutionFlow
@@ -37,6 +46,19 @@ jobs:
    steps:
      - uses: actions/checkout@v4

+      - name: Cache pip
+        uses: actions/cache@v3
+        with:
+          path: ~/.cache/pip
+          key: pip-${{ runner.os }}-${{ hashFiles('backend/requirements.txt', 'backend/requirements-dev.txt') }}
+          restore-keys: |
+            pip-${{ runner.os }}-
+
+      - name: Install system dependencies
+        run: |
+          apt-get update
+          apt-get install -y libpango1.0-dev libcairo2-dev libgdk-pixbuf-2.0-dev libffi-dev libjpeg-dev zlib1g-dev
+
      - name: Install dependencies
        run: pip install --break-system-packages -r backend/requirements.txt -r backend/requirements-dev.txt

@@ -47,7 +69,15 @@ jobs:
        run: cd backend && python scripts/check_tenant_filters.py

      - name: Run tests with coverage
-        run: cd backend && python -m pytest --override-ini="addopts=" --cov=app --cov-report=term-missing --cov-report=json:coverage.json --cov-fail-under=50
+        # `-n auto` parallelizes across all runner cores via pytest-xdist.
+        # conftest.py creates a per-worker DB (resolutionflow_test_gw0,
+        # resolutionflow_test_gw1, …) so the per-test DROP SCHEMA doesn't
+        # race across workers. Master/serial runs keep the base DB.
+        # term-missing dropped — the custom "Display coverage summary" step
+        # below parses coverage.json and prints the same info more concisely.
+        # --maxfail=10 short-circuits on structural breakage so we don't burn
+        # 25 minutes when a fixture explodes.
+        run: cd backend && python -m pytest --override-ini="addopts=" -n auto --maxfail=10 --cov=app --cov-report=json:coverage.json --cov-fail-under=50

      - name: Display coverage summary
        if: always()
@@ -75,6 +105,14 @@ jobs:
    steps:
      - uses: actions/checkout@v4

+      - name: Cache npm
+        uses: actions/cache@v3
+        with:
+          path: ~/.npm
+          key: npm-${{ runner.os }}-${{ hashFiles('frontend/package-lock.json') }}
+          restore-keys: |
+            npm-${{ runner.os }}-
+
      - name: Install dependencies
        run: cd frontend && npm ci

@@ -87,15 +125,14 @@ jobs:
      - name: Build
        run: cd frontend && NODE_OPTIONS="--max-old-space-size=4096" npm run build

-      - name: Upload build artifact
-        uses: actions/upload-artifact@v4
-        with:
-          name: frontend-dist
-          path: frontend/dist
-          retention-days: 1
+      # Build artifact intentionally NOT uploaded. The e2e job below builds
+      # its own frontend rather than downloading one from this job, so there
+      # is no need for the cross-job artifact handoff (which previously broke
+      # because actions/upload-artifact@v4 is unsupported here, forcing a v3 pin).
+      # Decoupling also lets e2e start immediately rather than waiting for
+      # this job to finish — important on a multi-runner setup.

  e2e:
-    needs: [frontend]
    runs-on: ubuntu-latest

    services:
@@ -105,10 +142,13 @@ jobs:
          POSTGRES_USER: postgres
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: resolutionflow_test
-        ports:
-          - 5432:5432
+        # No host port mapping. Tests connect to `postgres:5432` (the service
+        # container's docker-network DNS name), not `localhost:5432`. With
+        # multiple Gitea runners on the same homelab box, host-port mapping
+        # would race — two backend/e2e jobs both binding 0.0.0.0:5432 → the
+        # second fails with "port is already allocated".
        options: >-
-          --health-cmd pg_isready
+          --health-cmd "pg_isready -U postgres"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
@@ -125,17 +165,35 @@ jobs:
    steps:
      - uses: actions/checkout@v4

+      - name: Cache pip
+        uses: actions/cache@v3
+        with:
+          path: ~/.cache/pip
+          key: pip-${{ runner.os }}-${{ hashFiles('backend/requirements.txt', 'backend/requirements-dev.txt') }}
+          restore-keys: |
+            pip-${{ runner.os }}-
+
+      - name: Cache npm
+        uses: actions/cache@v3
+        with:
+          path: ~/.npm
+          key: npm-${{ runner.os }}-${{ hashFiles('frontend/package-lock.json') }}
+          restore-keys: |
+            npm-${{ runner.os }}-
+
      - name: Install backend dependencies
        run: pip install --break-system-packages -r backend/requirements.txt -r backend/requirements-dev.txt

      - name: Install frontend dependencies
        run: cd frontend && npm ci

-      - name: Download frontend build
-        uses: actions/download-artifact@v4
-        with:
-          name: frontend-dist
-          path: frontend/dist
+      - name: Build frontend
+        # Building inline (instead of downloading an artifact from the
+        # frontend job) drops the cross-job dependency, so e2e can start
+        # immediately on a free runner. Adds ~1-2 min of build time, but
+        # eliminates the artifact-upload mechanism entirely (no more
+        # v3/v4 GHES headaches) and saves ~5 min of waiting.
+        run: cd frontend && NODE_OPTIONS="--max-old-space-size=4096" VITE_API_URL="${PLAYWRIGHT_API_ORIGIN}" npm run build

      - name: Install Playwright browser
        run: cd frontend && npx playwright install --with-deps chromium
@@ -145,7 +203,7 @@ jobs:

      - name: Upload Playwright report
        if: always()
-        uses: actions/upload-artifact@v4
+        uses: actions/upload-artifact@v3
        with:
          name: playwright-report
          path: |
--- a/backend/Dockerfile.dev
+++ b/backend/Dockerfile.dev
@@ -5,6 +5,12 @@ WORKDIR /app
 RUN apt-get update && apt-get install -y \
    gcc \
    libpq-dev \
+    libpango1.0-dev \
+    libcairo2-dev \
+    libgdk-pixbuf-2.0-dev \
+    libffi-dev \
+    libjpeg-dev \
+    zlib1g-dev \
    && rm -rf /var/lib/apt/lists/*

 COPY requirements.txt requirements-dev.txt ./
@@ -12,4 +18,4 @@ RUN pip install --no-cache-dir -r requirements-dev.txt

 EXPOSE 8000

-CMD [ "uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000", "--reload" ]
+CMD [ "uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000", "--reload" ]
--- a/backend/app/api/endpoints/folders.py
+++ b/backend/app/api/endpoints/folders.py
@@ -194,6 +194,7 @@ async def create_folder(

    new_folder = UserFolder(
        user_id=current_user.id,
+        account_id=current_user.account_id,
        name=folder_data.name,
        color=folder_data.color,
        icon=folder_data.icon,
--- a/backend/app/api/endpoints/script_builder.py
+++ b/backend/app/api/endpoints/script_builder.py
@@ -260,6 +260,7 @@ async def save_to_library(
            category_id=data.category_id,
            share_with_team=data.share_with_team,
            user_id=current_user.id,
+            account_id=current_user.account_id,
            team_id=current_user.team_id,
            script_body=data.script_body,
            parameters_schema=data.parameters_schema,
--- a/backend/app/api/endpoints/shares.py
+++ b/backend/app/api/endpoints/shares.py
@@ -20,6 +20,7 @@ from app.core.audit import log_audit
 from app.core.rate_limit import limiter

 router = APIRouter(tags=["shares"])
+public_router = APIRouter(tags=["shares"])


 def build_share_response(share: SessionShare) -> ShareResponse:
@@ -206,7 +207,7 @@ async def _get_optional_user(request: Request, db: AsyncSession) -> Optional[Use
        return None


-@router.get("/share/{share_token}", response_model=SharePublicView)
+@public_router.get("/share/{share_token}", response_model=SharePublicView)
@limiter.limit("30/minute")
 async def access_share(
    share_token: str,
--- a/backend/app/api/router.py
+++ b/backend/app/api/router.py
@@ -78,9 +78,11 @@ api_router = APIRouter()
 # ---------------------------------------------------------------------------
 api_router.include_router(auth.router)
 api_router.include_router(shared.router)       # Public share links (no auth)
+api_router.include_router(shares.public_router)  # Public session share links (optional auth)
 api_router.include_router(beta_signup.router)
 api_router.include_router(webhooks.router)     # Stripe webhook receiver
 api_router.include_router(public_templates.router)  # Public gallery (no auth, rate-limited)
+api_router.include_router(survey.router)       # Public survey flow (no auth, rate-limited)

 # ---------------------------------------------------------------------------
 # Admin endpoints — super_admin only
@@ -125,7 +127,6 @@ api_router.include_router(ai_fix.router, dependencies=_tenant_deps)
 api_router.include_router(ai_chat.router, dependencies=_tenant_deps)
 api_router.include_router(copilot.router, dependencies=_tenant_deps)
 api_router.include_router(assistant_chat.router, dependencies=_tenant_deps)
-api_router.include_router(survey.router, dependencies=_tenant_deps)
 api_router.include_router(tree_transfer.router, dependencies=_tenant_deps)
 api_router.include_router(ai_suggestions.router, dependencies=_tenant_deps)
 api_router.include_router(kb_accelerator.router, dependencies=_tenant_deps)
--- a/backend/app/models/ai_session.py
+++ b/backend/app/models/ai_session.py
@@ -10,7 +10,7 @@ from typing import Optional, Any, TYPE_CHECKING
 from sqlalchemy import String, Text, DateTime, ForeignKey, Boolean, Integer, Float, CheckConstraint
 import sqlalchemy as sa
 from sqlalchemy.orm import Mapped, mapped_column, relationship
-from sqlalchemy.dialects.postgresql import UUID, JSONB
+from sqlalchemy.dialects.postgresql import UUID, JSONB, TSVECTOR

 from app.core.database import Base

@@ -46,6 +46,7 @@ class AISession(Base):
            "confidence_tier IN ('guided', 'exploring', 'discovery')",
            name="ck_ai_sessions_confidence_tier",
        ),
+        sa.Index("idx_ai_sessions_search", "search_vector", postgresql_using="gin"),
    )

    id: Mapped[uuid.UUID] = mapped_column(
@@ -150,6 +151,18 @@ class AISession(Base):
        Text, nullable=True,
        comment="Why escalated (set on escalation)",
    )
+    search_vector: Mapped[Optional[str]] = mapped_column(
+        TSVECTOR,
+        sa.Computed(
+            "to_tsvector('english', "
+            "coalesce(problem_summary, '') || ' ' || "
+            "coalesce(resolution_summary, '') || ' ' || "
+            "coalesce(escalation_reason, '') || ' ' || "
+            "coalesce(problem_domain, ''))",
+            persisted=True,
+        ),
+        nullable=True,
+    )
    escalation_package: Mapped[Optional[dict[str, Any]]] = mapped_column(
        JSONB, nullable=True,
        comment="Context package for receiving engineer: steps_tried, hypotheses, suggestions",
--- a/backend/app/schemas/user.py
+++ b/backend/app/schemas/user.py
@@ -68,4 +68,6 @@ class RoleUpdate(BaseModel):


 class AccountRoleUpdate(BaseModel):
-    account_role: str = Field(..., pattern="^(owner|admin|engineer|viewer)$")
+    # Ownership changes must go through the explicit transfer-ownership flow so
+    # account.owner_id stays consistent with user.account_role.
+    account_role: str = Field(..., pattern="^(admin|engineer|viewer)$")
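As a quick check of what the tightened pattern accepts, the same regular expression can be exercised with the standard-library `re` module. This is only a sketch: the helper name `is_assignable_role` is hypothetical, and the real validation happens in the Pydantic `Field`.

```python
import re

# Same pattern as the updated AccountRoleUpdate schema.
ACCOUNT_ROLE_PATTERN = re.compile(r"^(admin|engineer|viewer)$")


def is_assignable_role(role: str) -> bool:
    """Return True if the role may be set through the role-update schema."""
    return ACCOUNT_ROLE_PATTERN.match(role) is not None


for candidate in ("admin", "engineer", "viewer", "owner"):
    print(candidate, is_assignable_role(candidate))
```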
--- a/backend/app/services/assistant_chat_service.py
+++ b/backend/app/services/assistant_chat_service.py
@@ -300,13 +300,14 @@ To create a fork, append this marker AFTER your [QUESTIONS]/[ACTIONS] markers:
 When you identify a second distinct issue that is clearly separate from the primary topic \
 of this session, suggest creating a spin-off ticket using the [ACTIONS] marker below. \
 Use this sparingly — only when the issue is genuinely independent, not for every tangential mention.
+Use `create_spin_off_ticket` as the command value for this action.

 Format:
 [ACTIONS]
 [
  {
    "label": "Create ticket: <brief issue title>",
-    "command": "create_spin_off_ticket",
+    "command": "<spin-off ticket action command>",
    "description": "<one sentence description of the separate issue>"
  }
 ]
--- a/backend/app/services/resolution_output_generator.py
+++ b/backend/app/services/resolution_output_generator.py
@@ -5,6 +5,7 @@ from uuid import UUID

 from sqlalchemy import select
 from sqlalchemy.ext.asyncio import AsyncSession
+from sqlalchemy.orm import selectinload

 from app.models.ai_session import AISession
 from app.models.session_resolution_output import SessionResolutionOutput
@@ -21,7 +22,9 @@ class ResolutionOutputGenerator:

    async def generate_all(self, session_id: UUID) -> list[SessionResolutionOutput]:
        result = await self.db.execute(
-            select(AISession).where(AISession.id == session_id)
+            select(AISession)
+            .options(selectinload(AISession.steps))
+            .where(AISession.id == session_id)
        )
        session = result.scalar_one_or_none()
        if not session:
--- a/backend/app/services/script_builder_service.py
+++ b/backend/app/services/script_builder_service.py
@@ -360,6 +360,7 @@ async def save_to_library(
    category_id: UUID | None,
    share_with_team: bool,
    user_id: UUID,
+    account_id: UUID,
    team_id: UUID | None,
    script_body: str | None = None,
    parameters_schema: dict | None = None,
@@ -401,6 +402,7 @@ async def save_to_library(
            id=uuid_mod.uuid4(),
            category_id=resolved_category_id,
            created_by=user_id,
+            account_id=account_id,
            team_id=team_id if share_with_team else None,
            name=name,
            slug=slug,
--- a/backend/pytest.ini
+++ b/backend/pytest.ini
@@ -35,6 +35,9 @@ testpaths = tests
 # Warnings
 filterwarnings =
    error
+    ignore:unclosed <socket\.socket.*:ResourceWarning
+    ignore:unclosed transport .*:ResourceWarning
+    ignore:unclosed event loop .*:ResourceWarning
    ignore::DeprecationWarning
    ignore::PendingDeprecationWarning
    ignore::pluggy.PluggyTeardownRaisedWarning
--- a/backend/requirements-dev.txt
+++ b/backend/requirements-dev.txt
@@ -4,6 +4,7 @@
 # Testing — pytest-asyncio 0.24+ requires pytest>=8.2
 pytest==8.4.2
 pytest-asyncio==0.24.0
+pytest-xdist==3.6.1
 httpx>=0.27.0
 pytest-cov==5.0.0

--- a/backend/tests/conftest.py
+++ b/backend/tests/conftest.py
@@ -5,6 +5,7 @@ Provides test database setup, client fixtures, and authentication helpers.
 """

 import os
+import asyncio
 from typing import AsyncGenerator
 import pytest
 import sqlalchemy as sa
@@ -34,11 +35,64 @@ settings.REQUIRE_INVITE_CODE = False
 # would silently nuke the dev database. Only DATABASE_TEST_URL is honored,
 # and the safety assertion below refuses to run against a DB whose name
 # doesn't contain "test".
-TEST_DATABASE_URL = os.environ.get(
+_BASE_TEST_DATABASE_URL = os.environ.get(
    "DATABASE_TEST_URL",
    "postgresql+asyncpg://postgres:postgres@localhost:5432/resolutionflow_test",
 )

+
+def _worker_db_url(base_url: str) -> str:
+    """Per-worker DB URL for pytest-xdist parallelization.
+
+    pytest-xdist sets PYTEST_XDIST_WORKER to 'gw0', 'gw1', ... per worker
+    process. Each worker needs its own database so the per-test
+    `DROP SCHEMA public CASCADE` doesn't race across workers. Master/serial
+    runs (no xdist) keep the base DB. The base DB is created by the postgres
+    service container; per-worker DBs are created on first import
+    by `_ensure_worker_db_exists` below.
+    """
+    worker = os.environ.get("PYTEST_XDIST_WORKER")
+    if not worker or worker == "master":
+        return base_url
+    head, tail = base_url.rsplit("/", 1)
+    db_name, _, query = tail.partition("?")
+    suffix = f"?{query}" if query else ""
+    return f"{head}/{db_name}_{worker}{suffix}"
+
+
+def _ensure_worker_db_exists(worker_url: str, base_url: str) -> None:
+    """Create the per-worker DB if it doesn't exist. Runs synchronously at
+    conftest import time (before any async test machinery), using psycopg2
+    against the postgres maintenance DB. No-op when not running under xdist.
+    """
+    if worker_url == base_url:
+        return
+    head, tail = worker_url.rsplit("/", 1)
+    worker_db = tail.partition("?")[0]
+    # Drop the +asyncpg driver (sync psycopg2 is the default) and target the 'postgres' maintenance DB.
+    sync_head = head.replace("+asyncpg", "")
+    admin_url = f"{sync_head}/postgres"
+    # Local import: the sync engine runs on psycopg2 (a transitive backend
+    # dep) and is only needed when xdist is in use.
+    from sqlalchemy import create_engine
+    engine = create_engine(admin_url, isolation_level="AUTOCOMMIT")
+    try:
+        with engine.begin() as conn:
+            exists = conn.execute(
+                sa.text("SELECT 1 FROM pg_database WHERE datname = :n"),
+                {"n": worker_db},
+            ).scalar()
+            if not exists:
+                # Identifier interpolation is safe — worker_db is built from
+                # the trusted base URL + 'gw\d+' worker suffix.
+                conn.execute(sa.text(f'CREATE DATABASE "{worker_db}"'))
+    finally:
+        engine.dispose()
+
+
+TEST_DATABASE_URL = _worker_db_url(_BASE_TEST_DATABASE_URL)
+_ensure_worker_db_exists(TEST_DATABASE_URL, _BASE_TEST_DATABASE_URL)
+
 # Belt-and-suspenders: refuse to run tests against a DB whose name doesn't
 # contain "test". Parses the last path segment of the URL (everything after
 # the final '/', with query string stripped) so credentials / hosts that
@@ -73,6 +127,20 @@ def pytest_collection_modifyitems(config, items):
        items[:] = selected


+@pytest.hookimpl(trylast=True, hookwrapper=True)
+def pytest_runtest_teardown(item, nextitem):
+    """Close the clean event loop pytest-asyncio leaves behind after each test, before it is garbage-collected with a ResourceWarning."""
+    yield
+    policy = asyncio.get_event_loop_policy()
+    try:
+        loop = policy.get_event_loop()
+    except RuntimeError:
+        return
+    if not loop.is_running() and not loop.is_closed():
+        loop.close()
+        policy.set_event_loop(None)
+
+
 @pytest.fixture
 async def test_db() -> AsyncGenerator[AsyncSession, None]:
    """
@@ -137,6 +205,7 @@ async def test_db() -> AsyncGenerator[AsyncSession, None]:
    # Dispose engine first so all pooled connections are released,
    # then reconnect to perform the schema teardown cleanly.
    await engine.dispose()
+    await asyncio.sleep(0.01)  # yield to the event loop so pending close callbacks can run

    # Drop all tables after test (CASCADE for circular FKs)
    teardown_engine = create_async_engine(
@@ -150,6 +219,7 @@ async def test_db() -> AsyncGenerator[AsyncSession, None]:
            await conn.execute(sa.text("CREATE SCHEMA public"))
    finally:
        await teardown_engine.dispose()
+        await asyncio.sleep(0.01)  # yield to the event loop so pending close callbacks can run


 @pytest.fixture
--- a/backend/tests/test_ai_endpoints.py
+++ b/backend/tests/test_ai_endpoints.py
@@ -74,19 +74,25 @@ def _mock_ai_provider(text: str, input_tokens: int = 100, output_tokens: int = 2
 @pytest.fixture
 def enable_ai():
    """Temporarily enable AI by setting a fake API key."""
-    original = settings.ANTHROPIC_API_KEY
+    original_anthropic = settings.ANTHROPIC_API_KEY
+    original_google = settings.GOOGLE_AI_API_KEY
    settings.ANTHROPIC_API_KEY = "test-key-fake"
+    settings.GOOGLE_AI_API_KEY = None
    yield
-    settings.ANTHROPIC_API_KEY = original
+    settings.ANTHROPIC_API_KEY = original_anthropic
+    settings.GOOGLE_AI_API_KEY = original_google


 @pytest.fixture
 def disable_ai():
    """Ensure AI is disabled."""
-    original = settings.ANTHROPIC_API_KEY
+    original_anthropic = settings.ANTHROPIC_API_KEY
+    original_google = settings.GOOGLE_AI_API_KEY
    settings.ANTHROPIC_API_KEY = None
+    settings.GOOGLE_AI_API_KEY = None
    yield
-    settings.ANTHROPIC_API_KEY = original
+    settings.ANTHROPIC_API_KEY = original_anthropic
+    settings.GOOGLE_AI_API_KEY = original_google


 # ── Quota endpoint ──
--- a/backend/tests/test_branch_manager.py
+++ b/backend/tests/test_branch_manager.py
@@ -66,6 +66,7 @@ async def test_create_fork(client: AsyncClient, test_user, auth_headers, test_db

    step = AISessionStep(
        session_id=session.id,
+        account_id=session.account_id,
        step_order=0,
        step_type="question",
        content={"text": "What's the issue?"},
@@ -119,7 +120,7 @@ async def test_switch_branch(client: AsyncClient, test_user, auth_headers, test_
    root = await manager.create_root_branch(session.id)

    step = AISessionStep(
-        session_id=session.id, step_order=0, step_type="question",
+        session_id=session.id, account_id=session.account_id, step_order=0, step_type="question",
        content={"text": "test"}, confidence_at_step=0.5,
    )
    test_db.add(step)
@@ -197,7 +198,7 @@ async def test_get_branch_tree(client: AsyncClient, test_user, auth_headers, tes
    root = await manager.create_root_branch(session.id)

    step = AISessionStep(
-        session_id=session.id, step_order=0, step_type="question",
+        session_id=session.id, account_id=session.account_id, step_order=0, step_type="question",
        content={"text": "test"}, confidence_at_step=0.5,
    )
    test_db.add(step)
--- a/backend/tests/test_psa_writeback_phase4.py
+++ b/backend/tests/test_psa_writeback_phase4.py
@@ -50,6 +50,7 @@ async def _make_session(test_db, user, *, with_psa: bool = False) -> AISession:
        conn = PsaConnection(
            account_id=user["user_data"]["account_id"],
            provider="connectwise",
+            display_name="Test ConnectWise",
            site_url="https://fake.cw.local",
            company_id="TEST",
            credentials_encrypted=encrypt_credentials({"public_key": "x", "private_key": "y"}),
--- a/backend/tests/test_script_builder.py
+++ b/backend/tests/test_script_builder.py
@@ -472,19 +472,20 @@ class TestScriptBuilderSlugCollision:
        # Pre-create a template with slug "test-script" to cause collision
        user_resp = await client.get("/api/v1/auth/me", headers=auth_headers)
        user_id = user_resp.json()["id"]
+        account_id = user_resp.json()["account_id"]
        await test_db.execute(
            sa.text("""
                INSERT INTO script_templates
-                    (id, category_id, created_by, name, slug, script_body,
+                    (id, category_id, created_by, account_id, name, slug, script_body,
                     parameters_schema, default_values, validation_rules, tags,
                     complexity, is_active, version, usage_count, created_at, updated_at)
                VALUES
-                    (:id, 'a0000000-0000-0000-0000-000000000001'::uuid, :uid,
+                    (:id, 'a0000000-0000-0000-0000-000000000001'::uuid, :uid, :account_id,
                     'Test Script', 'test-script', 'echo hello',
                     '{"parameters": []}', '{}', '{}', '["powershell"]',
                     'beginner', true, 1, 0, NOW(), NOW())
            """),
-            {"id": str(uuid_mod.uuid4()), "uid": user_id},
+            {"id": str(uuid_mod.uuid4()), "uid": user_id, "account_id": account_id},
        )
        await test_db.commit()

@@ -561,6 +562,7 @@ class TestScriptTemplateFilters:
        """mine=true returns only templates created by the current user."""
        user_resp = await client.get("/api/v1/auth/me", headers=auth_headers)
        user_id = user_resp.json()["id"]
+        account_id = user_resp.json()["account_id"]

        second_resp = await client.get("/api/v1/auth/me", headers=second_user_headers)
        second_user_id = second_resp.json()["id"]
@@ -571,32 +573,32 @@ class TestScriptTemplateFilters:
        await test_db.execute(
            sa.text("""
                INSERT INTO script_templates
-                    (id, category_id, created_by, team_id, name, slug, script_body,
+                    (id, category_id, created_by, account_id, team_id, name, slug, script_body,
                     parameters_schema, default_values, validation_rules, tags,
                     complexity, is_active, version, usage_count, created_at, updated_at)
                VALUES
-                    (:id, :cat, :uid, NULL,
+                    (:id, :cat, :uid, :account_id, NULL,
                     'My Script', 'my-script', 'echo mine',
                     '{"parameters": []}', '{}', '{}', '[]',
                     'beginner', true, 1, 0, NOW(), NOW())
            """),
-            {"id": str(uuid_mod.uuid4()), "cat": cat_id, "uid": user_id},
+            {"id": str(uuid_mod.uuid4()), "cat": cat_id, "uid": user_id, "account_id": account_id},
        )

        # Create template owned by second user (no team_id, so visible to all)
        await test_db.execute(
            sa.text("""
                INSERT INTO script_templates
-                    (id, category_id, created_by, team_id, name, slug, script_body,
+                    (id, category_id, created_by, account_id, team_id, name, slug, script_body,
                     parameters_schema, default_values, validation_rules, tags,
                     complexity, is_active, version, usage_count, created_at, updated_at)
                VALUES
-                    (:id, :cat, :uid, NULL,
+                    (:id, :cat, :uid, :account_id, NULL,
                     'Other Script', 'other-script', 'echo other',
                     '{"parameters": []}', '{}', '{}', '[]',
                     'beginner', true, 1, 0, NOW(), NOW())
            """),
-            {"id": str(uuid_mod.uuid4()), "cat": cat_id, "uid": second_user_id},
+            {"id": str(uuid_mod.uuid4()), "cat": cat_id, "uid": second_user_id, "account_id": account_id},
        )
        await test_db.commit()

@@ -617,6 +619,7 @@ class TestScriptTemplateFilters:
        """shared=true returns only templates shared with the user's team."""
        user_resp = await client.get("/api/v1/auth/me", headers=auth_headers)
        user_id = user_resp.json()["id"]
+        account_id = user_resp.json()["account_id"]

        cat_id = "b0000000-0000-0000-0000-000000000001"

@@ -639,32 +642,32 @@ class TestScriptTemplateFilters:
        await test_db.execute(
            sa.text("""
                INSERT INTO script_templates
-                    (id, category_id, created_by, team_id, name, slug, script_body,
+                    (id, category_id, created_by, account_id, team_id, name, slug, script_body,
                     parameters_schema, default_values, validation_rules, tags,
                     complexity, is_active, version, usage_count, created_at, updated_at)
                VALUES
-                    (:id, :cat, :uid, :tid,
+                    (:id, :cat, :uid, :account_id, :tid,
                     'Team Script', 'team-script', 'echo team',
                     '{"parameters": []}', '{}', '{}', '[]',
                     'beginner', true, 1, 0, NOW(), NOW())
            """),
-            {"id": str(uuid_mod.uuid4()), "cat": cat_id, "uid": user_id, "tid": team_id},
+            {"id": str(uuid_mod.uuid4()), "cat": cat_id, "uid": user_id, "account_id": account_id, "tid": team_id},
        )

        # Template NOT shared (no team_id)
        await test_db.execute(
            sa.text("""
                INSERT INTO script_templates
-                    (id, category_id, created_by, team_id, name, slug, script_body,
+                    (id, category_id, created_by, account_id, team_id, name, slug, script_body,
                     parameters_schema, default_values, validation_rules, tags,
                     complexity, is_active, version, usage_count, created_at, updated_at)
                VALUES
-                    (:id, :cat, :uid, NULL,
+                    (:id, :cat, :uid, :account_id, NULL,
                     'Personal Script', 'personal-script', 'echo personal',
                     '{"parameters": []}', '{}', '{}', '[]',
                     'beginner', true, 1, 0, NOW(), NOW())
            """),
-            {"id": str(uuid_mod.uuid4()), "cat": cat_id, "uid": user_id},
+            {"id": str(uuid_mod.uuid4()), "cat": cat_id, "uid": user_id, "account_id": account_id},
        )
        await test_db.commit()

--- a/backend/tests/test_session_branches_api.py
+++ b/backend/tests/test_session_branches_api.py
@@ -49,7 +49,7 @@ async def test_create_fork(client: AsyncClient, test_user, auth_headers, test_db
    await test_db.flush()

    step = AISessionStep(
-        session_id=session.id, step_order=0, step_type="question",
+        session_id=session.id, account_id=session.account_id, step_order=0, step_type="question",
        content={"text": "test"}, confidence_at_step=0.5,
    )
    test_db.add(step)
@@ -88,7 +88,7 @@ async def test_switch_branch(client: AsyncClient, test_user, auth_headers, test_
    await test_db.flush()

    step = AISessionStep(
-        session_id=session.id, step_order=0, step_type="question",
+        session_id=session.id, account_id=session.account_id, step_order=0, step_type="question",
        content={"text": "test"}, confidence_at_step=0.5,
    )
    test_db.add(step)
--- a/backend/tests/test_session_resolutions_api.py
+++ b/backend/tests/test_session_resolutions_api.py
@@ -45,6 +45,7 @@ async def test_edit_output_api(client: AsyncClient, test_user, auth_headers, tes

    output = SessionResolutionOutput(
        session_id=session.id,
+        account_id=session.account_id,
        output_type="psa_ticket_notes",
        generated_content="Original",
        status="draft",
--- a/backend/tests/test_session_sharing.py
+++ b/backend/tests/test_session_sharing.py
@@ -219,7 +219,7 @@ class TestSessionSharing:
            json={"visibility": "public"},
            headers=other_headers
        )
-        assert response.status_code == 403
+        assert response.status_code == 404

    async def test_share_nonexistent_session(self, client: AsyncClient, auth_headers):
        """Creating a share for nonexistent session returns 404."""
--- a/backend/tests/test_session_suggested_fixes_api.py
+++ b/backend/tests/test_session_suggested_fixes_api.py
@@ -213,15 +213,28 @@ async def test_record_decision_persists_and_bumps_state_version(
        title="x",
        description="y",
        confidence_pct=50,
+        ai_drafted_script="Write-Output 'ok'",
    )
    test_db.add(fix)
    await test_db.commit()

-    r = await client.post(
-        f"/api/v1/ai-sessions/{session.id}/suggested-fixes/{fix.id}/decision",
-        headers=auth_headers,
-        json={"decision": "draft_template"},
-    )
+    # The draft_template path calls TemplateExtractionService, which needs an
+    # AI provider configured. CI doesn't set ANTHROPIC_API_KEY/GOOGLE_AI_API_KEY,
+    # and this test isn't exercising the AI integration — patch the extractor
+    # with a minimal valid response so the rest of the decision flow runs.
+    extractor_stub = AsyncMock(return_value={
+        "templated_body": "Write-Output 'ok'",
+        "parameters": [],
+    })
+    with patch(
+        "app.api.endpoints.session_suggested_fixes._extract_template_parameters",
+        extractor_stub,
+    ):
+        r = await client.post(
+            f"/api/v1/ai-sessions/{session.id}/suggested-fixes/{fix.id}/decision",
+            headers=auth_headers,
+            json={"decision": "draft_template"},
+        )
    assert r.status_code == 200
    assert r.json()["user_decision"] == "draft_template"

--- a/backend/tests/test_tenant_isolation_p0.py
+++ b/backend/tests/test_tenant_isolation_p0.py
@@ -43,7 +43,7 @@ async def _create_account_and_user(db: AsyncSession, prefix: str):
 async def _login(client: AsyncClient, email: str, password: str) -> dict:
    """Log in and return Authorization headers."""
    resp = await client.post(
-        "/api/v1/auth/login",
+        "/api/v1/auth/login/json",
        json={"email": email, "password": password},
    )
    assert resp.status_code == 200, f"Login failed: {resp.text}"
@@ -101,11 +101,11 @@ async def test_category_tree_count_scoped_to_account(
    acct_a, user_a, pass_a = await _create_account_and_user(test_db, "cat-a")
    acct_b, user_b, pass_b = await _create_account_and_user(test_db, "cat-b")

-    # Shared category (account_id=None means global)
+    # Categories are tenant-scoped; the endpoint must only count account A's trees.
    category = TreeCategory(
        name="Shared Category",
        slug=f"shared-cat-{uuid.uuid4().hex[:6]}",
-        account_id=None,
+        account_id=acct_a.id,
        is_active=True,
    )
    test_db.add(category)
@@ -270,6 +270,7 @@ async def test_get_session_returns_404_not_403_for_other_user(
    session_b = Session(
        tree_id=tree_b.id,
        user_id=user_b.id,
+        account_id=acct_b.id,
        tree_snapshot={"id": "root", "type": "start", "children": []},
        path_taken=[],
        decisions=[],
@@ -384,6 +385,7 @@ async def test_share_revoke_returns_404_not_403_for_other_user(
    session_b = Session(
        tree_id=tree_b.id,
        user_id=user_b.id,
+        account_id=acct_b.id,
        tree_snapshot={"id": "root", "type": "start", "children": []},
        path_taken=[],
        decisions=[],
@@ -534,6 +536,7 @@ async def test_maintenance_schedule_returns_404_for_other_team(
    # Create a schedule for that tree
    schedule_b = MaintenanceSchedule(
        tree_id=tree_b.id,
+        account_id=acct_b.id,
        created_by=user_b.id,
        cron_expression="0 2 * * 0",
        timezone="UTC",
--- a/backend/tests/test_tree_sharing.py
+++ b/backend/tests/test_tree_sharing.py
@@ -4,6 +4,7 @@ from datetime import datetime, timezone, timedelta
 from httpx import AsyncClient
 from uuid import uuid4

+from app.models.account import Account
 from app.models.tree import Tree
 from app.models.tree_share import TreeShare
 from app.models.user import User
@@ -287,13 +288,17 @@ class TestTreeSharing:
 @pytest.mark.asyncio
 async def test_migration_defaults_visibility_to_team(test_db):
    """Test that existing trees default to 'team' visibility after migration."""
+    account = Account(name="Migration Default Test", display_code=uuid4().hex[:8])
+    test_db.add(account)
+    await test_db.flush()
+
    # Create a tree without specifying visibility
    tree = Tree(
        name="Old Tree",
        description="Created before migration",
        tree_structure={"id": "root", "type": "decision", "question": "Test?", "children": []},
        author_id=None,
-        account_id=None
+        account_id=account.id
    )
    test_db.add(tree)
    await test_db.commit()
--- a/backend/tests/test_uploads.py
+++ b/backend/tests/test_uploads.py
@@ -359,7 +359,7 @@ async def test_delete_upload_forbidden_for_non_owner(client, auth_headers, test_
            f"/api/v1/uploads/{upload.id}", headers=other_headers
        )

-    assert response.status_code == 403
+    assert response.status_code == 404


 # ---------------------------------------------------------------------------
--- a/frontend/e2e/command-palette.spec.ts
+++ b/frontend/e2e/command-palette.spec.ts
@@ -88,6 +88,8 @@ test.describe('command palette smoke tests', () => {

    await flowpilotOption.click()

-    await expect(page).toHaveURL(/\/assistant/)
+    // Phase 1 of the FlowPilot migration renamed /assistant to /pilot.
+    // /assistant still 301-redirects to /pilot, so accept either landing URL.
+    await expect(page).toHaveURL(/\/(pilot|assistant)/)
  })
 })
--- a/frontend/e2e/history.spec.ts
+++ b/frontend/e2e/history.spec.ts
@@ -24,13 +24,21 @@ test.describe('session history smoke tests', () => {
      await page.goto('/sessions')

      await expect(
-        page.getByRole('heading', { name: 'Sessions', exact: true }),
+        page.getByRole('heading', { name: 'Session History', exact: true }),
      ).toBeVisible()

+      // Default tab on /sessions is "AI Sessions"; flow sessions live behind
+      // the "Flow Sessions" tab and only that tab exposes ticket/client filters.
+      await page.getByRole('button', { name: 'Flow Sessions' }).click()
+
      await page.getByPlaceholder('Search by ticket number...').fill(ticketNumber)
      await page.getByPlaceholder('Search by client name...').fill(clientName)

-      const sessionCard = page.locator('.bg-card').filter({ hasText: ticketNumber }).filter({ hasText: clientName }).first()
+      const sessionCard = page
+        .getByTestId('flow-session-card')
+        .filter({ hasText: ticketNumber })
+        .filter({ hasText: clientName })
+        .first()
      await expect(sessionCard).toBeVisible()
      await expect(sessionCard.getByText(tree.name)).toBeVisible()

--- a/frontend/e2e/library-start.spec.ts
+++ b/frontend/e2e/library-start.spec.ts
@@ -24,7 +24,7 @@ test.describe('flow library start-session smoke tests', () => {
      await page.getByPlaceholder('Search flows...').fill(tree.name)
      await page.getByRole('button', { name: 'Search', exact: true }).click()

-      const treeCard = page.locator('.bg-card').filter({ hasText: tree.name }).first()
+      const treeCard = page.getByTestId('tree-card').filter({ hasText: tree.name }).first()
      await expect(treeCard).toBeVisible()
      await treeCard.getByRole('button', { name: /^Start(?: Session)?$/ }).click()

--- a/frontend/e2e/library.spec.ts
+++ b/frontend/e2e/library.spec.ts
@@ -20,7 +20,7 @@ test.describe('flow library smoke tests', () => {
      await page.getByPlaceholder('Search flows...').fill(tree.name)
      await page.getByRole('button', { name: 'Search', exact: true }).click()

-      await expect(page.getByText(tree.name)).toBeVisible()
+      await expect(page.getByTestId('tree-card').filter({ hasText: tree.name }).first()).toBeVisible()
    } finally {
      await disposeApiContext(api)
    }
--- a/frontend/e2e/navigation.spec.ts
+++ b/frontend/e2e/navigation.spec.ts
@@ -14,7 +14,7 @@ test.describe('authenticated navigation smoke tests', () => {
    await page.goto('/sessions')

    await expect(
-      page.getByRole('heading', { name: 'Sessions', exact: true }),
+      page.getByRole('heading', { name: 'Session History', exact: true }),
    ).toBeVisible()
  })

@@ -30,7 +30,7 @@ test.describe('authenticated navigation smoke tests', () => {
    await page.goto('/account')

    await expect(
-      page.getByRole('heading', { name: 'Account Settings' }),
+      page.getByRole('heading', { name: 'Account Management' }),
    ).toBeVisible()
  })
 })
--- a/frontend/e2e/resume.spec.ts
+++ b/frontend/e2e/resume.spec.ts
@@ -18,9 +18,17 @@ test.describe('session resume smoke tests', () => {
    })

    try {
-      await page.goto('/trees')
+      // Resume flow moved off /trees onto the Flow Sessions tab of /sessions
+      // during the FlowPilot migration. The destination (/trees/:id/navigate)
+      // is unchanged — only the entry point shifted.
+      await page.goto('/sessions')
+      await expect(
+        page.getByRole('heading', { name: 'Session History', exact: true }),
+      ).toBeVisible()
+      await page.getByRole('button', { name: 'Flow Sessions' }).click()
+      // The "Active" sub-tab is selected by default and lists in-progress sessions.

-      const resumeCard = page.locator('.bg-card').filter({ hasText: tree.name }).filter({ hasText: 'Resume' }).first()
+      const resumeCard = page.getByTestId('flow-session-card').filter({ hasText: tree.name }).first()
      await expect(resumeCard).toBeVisible()
      await resumeCard.getByRole('button', { name: 'Resume' }).first().click()

--- a/frontend/e2e/shares.spec.ts
+++ b/frontend/e2e/shares.spec.ts
@@ -31,7 +31,7 @@ test.describe('shared session management smoke tests', () => {
      ).toBeVisible()
      await expect(page.getByText(share.share_name || '')).toBeVisible()

-      const shareCard = page.locator('.bg-card').filter({ hasText: share.share_name || '' }).first()
+      const shareCard = page.getByTestId('share-card').filter({ hasText: share.share_name || '' }).first()
      await shareCard.getByRole('button', { name: 'Revoke' }).click()

      const confirmDialog = page.getByRole('dialog', { name: 'Revoke Share Link' })
--- a/frontend/src/components/library/TreeGridView.tsx
+++ b/frontend/src/components/library/TreeGridView.tsx
@@ -34,6 +34,8 @@ export function TreeGridView({
      {trees.map((tree) => (
        <div
          key={tree.id}
+          data-testid="tree-card"
+          data-tree-id={tree.id}
          className="relative bg-card border border-border rounded-2xl p-4 transition-all hover:-translate-y-0.5 hover:border-primary/30 hover:shadow-md sm:p-6"
        >
          <div className="mb-2 flex items-start justify-between gap-2">
--- a/frontend/src/components/library/TreeListView.tsx
+++ b/frontend/src/components/library/TreeListView.tsx
@@ -33,6 +33,8 @@ export function TreeListView({
      {trees.map((tree) => (
        <div
          key={tree.id}
+          data-testid="tree-card"
+          data-tree-id={tree.id}
          className="flex items-center gap-4 bg-card border border-border rounded-2xl p-4 transition-all hover:border-primary/30 hover:shadow-xs"
        >
          {/* Left: Name and Description */}
--- a/frontend/src/pages/MySharesPage.tsx
+++ b/frontend/src/pages/MySharesPage.tsx
@@ -161,7 +161,12 @@ export default function MySharesPage() {
            const isCopied = copiedId === share.id

            return (
-              <div key={share.id} className="bg-card border border-border rounded-xl p-5">
+              <div
+                key={share.id}
+                data-testid="share-card"
+                data-share-id={share.id}
+                className="bg-card border border-border rounded-xl p-5"
+              >
                {/* Top row: badge + name */}
                <div className="flex items-center gap-3 mb-3">
                  <span className="inline-flex items-center gap-1.5 text-xs rounded-full px-2 py-0.5 bg-accent text-muted-foreground">
--- a/frontend/src/pages/SessionHistoryPage.tsx
+++ b/frontend/src/pages/SessionHistoryPage.tsx
@@ -533,7 +533,11 @@ export default function SessionHistoryPage() {
                      )}
                      style={{ '--stagger-index': i } as React.CSSProperties}
                    >
-                      <div className="bg-card border border-border rounded-xl p-4 transition-all hover:border-[var(--color-border-hover)]">
+                      <div
+                        data-testid="flow-session-card"
+                        data-session-id={session.id}
+                        className="bg-card border border-border rounded-xl p-4 transition-all hover:border-[var(--color-border-hover)]"
+                      >
                        <div className="flex flex-col gap-3 sm:flex-row sm:items-start sm:justify-between">
                          <div className="flex-1">
                            <div className="flex flex-wrap items-center gap-2">