Compare commits
13 Commits
fix/ci-pyt
...
1559feb759
| Author | SHA1 | Date | |
|---|---|---|---|
| 1559feb759 | |||
| b56da2facd | |||
| 87bb20b8f0 | |||
| 1e3a6cfa01 | |||
| ede6eebf9a | |||
| 261814ae65 | |||
| 6656ebdead | |||
| 69f2a37591 | |||
| 7f714363dd | |||
| 1bd43abb8f | |||
| c203b70ef9 | |||
| f27e3b44b0 | |||
| fe632c9194 |
@@ -1,22 +1,21 @@
|
|||||||
# CURRENT_TASK.md
|
# CURRENT_TASK.md
|
||||||
|
|
||||||
**Task:** Restore a fully green CI gate on `main` and lock it via branch protection so future merges can't introduce silent rot.
|
**Task:** Land consolidated CI-recovery PR #150 and lock reliable CI gates on `main`.
|
||||||
|
|
||||||
**Status:** in-progress
|
**Status:** in-progress
|
||||||
|
|
||||||
**Definition of Done:**
|
**Definition of Done:**
|
||||||
- [ ] PR #150 (`fix/ci-workflow-config`) merged. Both `CI / backend (pull_request)` and `CI / frontend (pull_request)` show success on the merge commit.
|
- [ ] PR #150 (`fix/ci-workflow-config`) merged. `CI / backend (pull_request)`, `CI / frontend (pull_request)`, and `CI / e2e (pull_request)` show success before merge.
|
||||||
- [ ] `CI / backend (pull_request)` added to required status checks on `main` in Gitea branch protection (frontend is already required).
|
- [ ] `CI / backend (pull_request)` added to required status checks on `main` in Gitea branch protection (frontend is already required).
|
||||||
- [ ] The 54 real backend test failures (left after #149's infra cleanup) categorized and fixed in a follow-up PR. Target: 0 failures, 0 errors on a `pytest` run inside `resolutionflow_backend`.
|
- [ ] Optional: `CI / e2e (pull_request)` confirmed clean across at least one PR run and added to required checks.
|
||||||
- [ ] `npm run lint` stays at 0 errors after the cleanup PR (already at 0 on main).
|
|
||||||
- [ ] Append a SESSION_LOG.md entry summarizing what shipped.
|
|
||||||
|
|
||||||
**Assumptions:**
|
**Assumptions:**
|
||||||
- The 54 failures fall into a small number of root-cause categories (likely 3–5: fixture-scoping leaks, DB cleanup ordering, account_id propagation in test seed paths). Verify before assuming.
|
- The 8-core homelab Gitea Actions runner can support `-n auto` (8 xdist workers). If memory pressure shows up in CI, drop to `-n 4`.
|
||||||
- The pytest-asyncio 0.24 + pytest 8.4 toolchain bumped in #149 is the right baseline; do not revert.
|
- pytest-cov's xdist support continues to handle the coverage merge and `--cov-fail-under=50` check correctly.
|
||||||
- `DATABASE_TEST_URL` is the only DB URL conftest will honor; do not weaken the safety guard added in `dab740d`.
|
- The per-worker DB creation in `conftest.py` is idempotent and racing workers on first import won't all try to CREATE DATABASE simultaneously — postgres serializes that, but if it surfaces issues, wrap with an advisory lock.
|
||||||
|
|
||||||
**Out of scope:**
|
**Out of scope:**
|
||||||
- New feature work on FlowPilot (Phase 10+) or PSA — keep this branch focused on CI debt.
|
- Frontend lint warnings (23 remain after #149).
|
||||||
- Frontend lint warnings (23 remain after #149; they're missing-deps in useEffect, opt-in cleanup later).
|
- The 23 react-hooks/exhaustive-deps warnings.
|
||||||
- RLS test suite (`test_rls_isolation.py`) — gated behind `RUN_RLS_TESTS=1` and not in the default CI run.
|
- RLS test suite (gated behind `RUN_RLS_TESTS=1`; not in default CI).
|
||||||
|
- Per-test transactional rollback (would shave another 30-40% off backend time but is a much bigger refactor — capture in TODO if interested).
|
||||||
|
|||||||
@@ -2,62 +2,63 @@
|
|||||||
|
|
||||||
# HANDOFF.md
|
# HANDOFF.md
|
||||||
|
|
||||||
**Last updated:** 2026-04-25 06:12 EDT
|
**Last updated:** 2026-04-25 16:41 EDT
|
||||||
|
|
||||||
**Active task:** Restore green CI gate on `main` and lock it via branch protection. See [CURRENT_TASK.md](CURRENT_TASK.md).
|
**Active task:** Land PR #150 (the consolidated CI-recovery PR), then enable backend and eventually e2e gates on `main`. See [CURRENT_TASK.md](CURRENT_TASK.md).
|
||||||
|
|
||||||
**Branch:** `fix/ci-workflow-config`
|
**Branch:** `fix/ci-workflow-config` -> PR #150. PRs #151 and #152 were closed and consolidated into this branch.
|
||||||
|
|
||||||
## Current state
|
## Current resume point
|
||||||
|
|
||||||
Previous session fixed the 54 real backend failures left after #149. The default backend suite is now green locally:
|
Latest PR #150 CI had backend and frontend green, but `CI / e2e (pull_request)` failed on the resume smoke test.
|
||||||
|
|
||||||
```bash
|
The failure was not product behavior. Playwright was using:
|
||||||
docker exec resolutionflow_backend bash -lc 'pytest --override-ini="addopts=" -q > /tmp/full-backend.log 2>&1; code=$?; tail -n 160 /tmp/full-backend.log; exit $code'
|
|
||||||
# 1076 passed, 35 deselected in 1347.41s (0:22:27)
|
```ts
|
||||||
|
page.locator('.bg-card').filter({ hasText: tree.name }).first()
|
||||||
```
|
```
|
||||||
|
|
||||||
Targeted validation also passed:
|
On the session history page this matched the tree filter `<select>` first because the select options contain the same flow name, then the test waited forever for a `Resume` button inside the select.
|
||||||
|
|
||||||
- `tests/test_session_resolutions_api.py tests/test_session_sharing.py tests/test_session_suggested_fixes_api.py tests/test_survey.py tests/test_tenant_isolation_p0.py tests/test_tree_sharing.py tests/test_trees.py::TestTrees::test_delete_tree_cleans_up_folder_and_tag_assignments tests/test_uploads.py::test_delete_upload_forbidden_for_non_owner` → `73 passed`
|
This session fixed that properly by adding stable test IDs to repeated cards and moving e2e tests off `.bg-card` selectors:
|
||||||
- PDF export tests → `3 passed`
|
|
||||||
- Prompt/PSA/resolution/script-builder subset → `14 passed`
|
|
||||||
- Admin/AI/branch subsets → `11 passed`
|
|
||||||
|
|
||||||
## What changed
|
- `flow-session-card` in `SessionHistoryPage.tsx`
|
||||||
|
- `tree-card` in `TreeGridView.tsx` and `TreeListView.tsx`
|
||||||
|
- `share-card` in `MySharesPage.tsx`
|
||||||
|
|
||||||
Production fixes:
|
The workflow was also hardened:
|
||||||
|
|
||||||
- CI/backend dev image now installs WeasyPrint system libraries.
|
- Postgres service healthchecks now run `pg_isready -U postgres` instead of checking as `root`.
|
||||||
- Public share-token and survey routes are mounted outside tenant auth; protected share management remains tenant-protected.
|
- The e2e frontend build now bakes `VITE_API_URL="${PLAYWRIGHT_API_ORIGIN}"`, matching the Playwright backend origin.
|
||||||
- Folder creation now persists `UserFolder.account_id`.
|
|
||||||
- Script Builder save-to-library now persists `ScriptTemplate.account_id`.
|
|
||||||
- Resolution output generation eager-loads `AISession.steps` to avoid async lazy-load `MissingGreenlet`.
|
|
||||||
- AI session model now declares the generated `search_vector` column already present in Alembic, so `create_all` test schemas match runtime migrations.
|
|
||||||
- Direct account-role update now rejects `"owner"`; ownership changes must use the transfer path.
|
|
||||||
- Assistant prompt marker examples no longer include a literal executable `create_spin_off_ticket` payload.
|
|
||||||
|
|
||||||
Test/harness fixes:
|
## Verification completed
|
||||||
|
|
||||||
- Test seeds updated for tenant-scoped `account_id` columns on sessions, branches, resolution outputs, script templates, PSA connections, folders, schedules, and categories.
|
- `git diff --check`
|
||||||
- Tests aligned with 404-not-403 resource-hiding policy.
|
- Confirmed no remaining `.bg-card` selectors in `frontend/e2e/*.ts`.
|
||||||
- Disabled-AI tests now restore both Anthropic and Google key settings.
|
- `docker exec -w /app resolutionflow_frontend npm run build`
|
||||||
- Pytest harness closes pytest-asyncio's leftover clean loop and ignores known unclosed asyncio/asyncpg teardown ResourceWarnings that otherwise appear at arbitrary later setup points under `filterwarnings = error`.
|
- Ran migrations and test-user seed in the dev backend container.
|
||||||
|
- Focused Playwright verification in an Actions-like Ubuntu container:
|
||||||
|
- First `e2e/resume.spec.ts` passed.
|
||||||
|
- Then `e2e/history.spec.ts e2e/library.spec.ts e2e/library-start.spec.ts e2e/resume.spec.ts e2e/shares.spec.ts --project=chromium --workers=1` passed: `6 passed (1.3m)`.
|
||||||
|
|
||||||
## Immediate next steps
|
## Immediate next steps
|
||||||
|
|
||||||
1. Commit current working tree if not already committed with trailer:
|
1. Push the WIP commit from this session to PR #150.
|
||||||
`Co-Authored-By: Codex <noreply@openai.com>`.
|
2. Watch PR #150 CI on the new SHA. Expected result: backend, frontend, and e2e all green.
|
||||||
2. Check PR #150 status on Gitea. If both `CI / backend (pull_request)` and `CI / frontend (pull_request)` are green, merge it.
|
3. Merge PR #150 when green.
|
||||||
3. After #150 merges, add `CI / backend (pull_request)` to required status checks on main:
|
4. Enable `CI / backend (pull_request)` as a required status check on `main`.
|
||||||
```bash
|
5. After at least one reliable green PR run, consider adding `CI / e2e (pull_request)` as required too.
|
||||||
PATCH /repos/chihlasm/resolutionflow/branch_protections/main
|
|
||||||
{ "status_check_contexts": ["CI / frontend (pull_request)", "CI / backend (pull_request)"] }
|
|
||||||
```
|
|
||||||
`$GITEA_TOKEN` is in `.claude/settings.local.json`.
|
|
||||||
4. Run/confirm frontend lint if needed for the final DoD item (`npm run lint` was already green after #149, but this session did not rerun it).
|
|
||||||
|
|
||||||
## Open questions
|
## Branch protection on main (current)
|
||||||
|
|
||||||
- PR #150 was not rechecked or merged in this session.
|
- PR-only merges
|
||||||
- Branch protection was not updated in this session.
|
- `CI / frontend (pull_request)` required
|
||||||
|
- Force-push blocked
|
||||||
|
- No review required (solo)
|
||||||
|
|
||||||
|
## Useful breadcrumbs
|
||||||
|
|
||||||
|
- `.gitea/workflows/ci.yml` contains the parallel backend/frontend/e2e workflow.
|
||||||
|
- `backend/scripts/seed_phase9_qa_fixtures.py` pre-bakes Phase 9 QA fixtures.
|
||||||
|
- `.gstack/qa-reports/phase9-20260424-232700/REPORT.md` has the FlowPilot QA report.
|
||||||
|
- Per-worker test DBs accumulate on the Postgres service. Cheap to leave around; cleanup if needed.
|
||||||
|
|||||||
@@ -12,6 +12,25 @@
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
## 2026-04-25 16:41 EDT — Codex — Stabilize PR #150 e2e selectors
|
||||||
|
|
||||||
|
- Investigated the remaining PR #150 failure after backend and frontend CI were green. The e2e resume smoke test was not failing because of product behavior; it used `.bg-card` plus text filtering and matched the tree filter `<select>` before the intended session card.
|
||||||
|
- Added stable test IDs to flow session, tree, and share cards, then updated affected e2e tests to target those cards instead of Tailwind class names.
|
||||||
|
- Hardened the CI workflow by making Postgres healthchecks authenticate as `postgres` and baking `VITE_API_URL="${PLAYWRIGHT_API_ORIGIN}"` into the e2e frontend build.
|
||||||
|
- Verified with `git diff --check`, frontend build in Docker, no remaining `.bg-card` e2e selectors, and focused Playwright runs in an Actions-like Ubuntu container: resume spec passed, then history/library/library-start/resume/shares passed (`6 passed`).
|
||||||
|
- Left for next session: push this WIP commit to PR #150, watch CI, merge when all three jobs are green, then enable backend branch protection and consider the e2e gate after a reliable green run.
|
||||||
|
- Files touched: `.gitea/workflows/ci.yml`, `frontend/e2e/history.spec.ts`, `frontend/e2e/library-start.spec.ts`, `frontend/e2e/library.spec.ts`, `frontend/e2e/resume.spec.ts`, `frontend/e2e/shares.spec.ts`, `frontend/src/components/library/TreeGridView.tsx`, `frontend/src/components/library/TreeListView.tsx`, `frontend/src/pages/MySharesPage.tsx`, `frontend/src/pages/SessionHistoryPage.tsx`, `.ai/HANDOFF.md`, `.ai/CURRENT_TASK.md`, `.ai/SESSION_LOG.md`.
|
||||||
|
|
||||||
|
## 2026-04-25 12:00 America/New_York — Claude Code — Mock final AI-provider test, cache CI deps, parallelize backend with pytest-xdist
|
||||||
|
|
||||||
|
- Diagnosed why CI was still red despite Codex's local 1076 passed: a single test (`test_record_decision_persists_and_bumps_state_version`) needed `ANTHROPIC_API_KEY` because the `decision: draft_template` path calls `TemplateExtractionService` → AI provider. Patched `_extract_template_parameters` with an `AsyncMock` so the test no longer depends on AI availability. Verified.
|
||||||
|
- Pushed Codex's WIP commit `49f8856` to PR #150 (had been local-only per handoff protocol).
|
||||||
|
- PR #150 (`fix/ci-workflow-config`) extended with cheap CI wins: `actions/cache@v3` for pip + npm in all three jobs; dropped `--cov-report=term-missing` (the custom display step parses JSON); added `--maxfail=10` so structural breakage exits fast.
|
||||||
|
- PR #151 (`fix/ci-pytest-xdist`) opened, stacked on #150: pytest-xdist with per-worker DB isolation. `conftest.py` reads `PYTEST_XDIST_WORKER`, computes a per-worker DB URL like `…_gw0`, and synchronously CREATEs the DB on first import. The per-test `DROP SCHEMA public CASCADE` then operates on the worker's isolated DB. Verified locally: backend suite went from 22m 27s serial → 4m 28s parallel (8 workers), 1076 passed in both cases. ~5× speedup.
|
||||||
|
- Decided NOT to do per-test transactional rollback (bigger refactor); captured for future TODO consideration.
|
||||||
|
- Left for next session: watch CI on both PRs, merge in order (#150 first, #151 second), then enable `CI / backend (pull_request)` as a required status check on main.
|
||||||
|
- Files touched: `backend/tests/test_session_suggested_fixes_api.py`, `backend/tests/conftest.py`, `backend/requirements-dev.txt`, `.gitea/workflows/ci.yml`, `.ai/HANDOFF.md`, `.ai/CURRENT_TASK.md`, `.ai/TODO.md`.
|
||||||
|
|
||||||
## 2026-04-25 06:12 EDT — Codex — Fix backend suite to green
|
## 2026-04-25 06:12 EDT — Codex — Fix backend suite to green
|
||||||
|
|
||||||
- Fixed the real backend failures left after the CI-infra cleanup: tenant-scoped seed drift, missing production `account_id` writes, public route mounting for survey/share links, Script Builder library saves, resolution output async loading, AI search schema metadata, disabled-AI fixture leakage, and prompt marker guardrails.
|
- Fixed the real backend failures left after the CI-infra cleanup: tenant-scoped seed drift, missing production `account_id` writes, public route mounting for survey/share links, Script Builder library saves, resolution output async loading, AI search schema metadata, disabled-AI fixture leakage, and prompt marker guardrails.
|
||||||
|
|||||||
@@ -5,9 +5,13 @@
|
|||||||
|
|
||||||
## Up next
|
## Up next
|
||||||
|
|
||||||
- [ ] **Parallelize backend pytest with pytest-xdist.** Currently the backend suite takes ~22 min wall-clock for `1076 passed, 35 deselected` (verified locally 2026-04-25). With `-n auto` on the homelab Gitea Actions runner, this should land in the 3–6 min range depending on core count. Blocker: `test_db` fixture in `backend/tests/conftest.py` does `DROP SCHEMA public CASCADE` per test, which two workers would race on. Standard fix: one database per worker, derived from `PYTEST_XDIST_WORKER` env var inside conftest. The runner has spare CPU, so prioritize once main is green and the 54-failure cleanup has landed.
|
- [ ] **Parallelize backend pytest with pytest-xdist.** ✅ landing as PR #151. Verified locally: backend suite 22 min → 4m 28s with `-n auto` on the 8-core homelab runner. Per-worker DB isolation via `PYTEST_XDIST_WORKER` in conftest.py.
|
||||||
|
|
||||||
## Backlog
|
## Backlog
|
||||||
|
|
||||||
- [ ] **Frontend lint warnings cleanup.** 23 `react-hooks/exhaustive-deps` warnings remain after PR #149 (mostly missing-deps in useEffect). Either fix them or audit them for known-safe ones and add eslint-disable comments. Not blocking CI today.
|
- [ ] **Frontend lint warnings cleanup.** 23 `react-hooks/exhaustive-deps` warnings remain after PR #149 (mostly missing-deps in useEffect). Either fix them or audit them for known-safe ones and add eslint-disable comments. Not blocking CI today.
|
||||||
- [ ] **Audit `filterwarnings` ignores added in `wip(handoff): restore backend suite to green`.** Codex added narrow `ResourceWarning` filters for unclosed socket/transport/event-loop noise from pytest-asyncio teardown. Worth periodically reviewing whether those are still needed (e.g. when bumping pytest-asyncio) — if a real warning appears in those forms it would be silenced.
|
- [ ] **Audit `filterwarnings` ignores added in `wip(handoff): restore backend suite to green`.** Codex added narrow `ResourceWarning` filters for unclosed socket/transport/event-loop noise from pytest-asyncio teardown. Worth periodically reviewing whether those are still needed (e.g. when bumping pytest-asyncio) — if a real warning appears in those forms it would be silenced.
|
||||||
|
- [ ] **Add `data-testid` attributes to e2e-critical interactive elements.** PR #152 fixed five Playwright tests by chasing UI-text changes (`Sessions` → `Session History`, `Account Settings` → `Account Management`, `/assistant` → `/pilot`, "Flow Sessions" tab, Resume button on session cards). Each was a one-line selector update, but every UI churn re-breaks them. Adding stable `data-testid` attributes on the targeted elements (page heading wrappers, tab nav, primary action buttons) and switching tests to `getByTestId` would make these immune to copy/route renames. Scope it small — start with `SessionHistoryPage` heading, the AI/Flow Sessions tab buttons, the per-session `Resume` button, and the command-palette FlowPilot option.
|
||||||
|
- [ ] **Per-test transactional rollback in `test_db` fixture.** Bigger engineering than xdist (which we already shipped). Instead of `DROP SCHEMA public CASCADE` per test, wrap each test in a savepoint and rollback at teardown. ~30-40% additional speedup on top of xdist for test-DB-heavy tests. Real refactor; only worth it if the suite gets significantly larger or runs more frequently.
|
||||||
|
- [ ] **Consider `pytest-testmon` for PR-time test selection.** Tracks which tests touched which source files and only re-runs affected ones. Best for small PRs touching ~few files. Adds cache-invalidation complexity; only worth it if the suite stays painfully long even after xdist.
|
||||||
|
- [ ] **AssistantChatPage `currentChatRef` guard is a silent return** — `handleSend`, `handleTaskSubmit`, `selectChat`, `refreshFacts`, `refreshActiveFix`, and `refreshPreview` all bail with `if (currentChatRef.current !== sentForChatId) return` when stale. This is by design for chat switching, but it also silently masked the prefill-ref bug fixed in PR #153 — the user just saw "no AI response" with no log, no toast, no Sentry event. Either (a) log a `console.warn`/Sentry breadcrumb on the mismatch path so future drift is visible, or (b) split "expected stale" (chat switch) from "unexpected stale" (ref never updated) so only the latter alerts. Pair with an audit of every `currentChatRef.current = ...` assignment vs every `setActiveChatId(...)` call to make sure they're paired everywhere.
|
||||||
|
|||||||
@@ -17,10 +17,13 @@ jobs:
|
|||||||
POSTGRES_USER: postgres
|
POSTGRES_USER: postgres
|
||||||
POSTGRES_PASSWORD: postgres
|
POSTGRES_PASSWORD: postgres
|
||||||
POSTGRES_DB: resolutionflow_test
|
POSTGRES_DB: resolutionflow_test
|
||||||
ports:
|
# No host port mapping. Tests connect to `postgres:5432` (the service
|
||||||
- 5432:5432
|
# container's docker-network DNS name), not `localhost:5432`. With
|
||||||
|
# multiple Gitea runners on the same homelab box, host-port mapping
|
||||||
|
# would race — two backend/e2e jobs both binding 0.0.0.0:5432 → the
|
||||||
|
# second fails with "port is already allocated".
|
||||||
options: >-
|
options: >-
|
||||||
--health-cmd pg_isready
|
--health-cmd "pg_isready -U postgres"
|
||||||
--health-interval 10s
|
--health-interval 10s
|
||||||
--health-timeout 5s
|
--health-timeout 5s
|
||||||
--health-retries 5
|
--health-retries 5
|
||||||
@@ -122,15 +125,14 @@ jobs:
|
|||||||
- name: Build
|
- name: Build
|
||||||
run: cd frontend && NODE_OPTIONS="--max-old-space-size=4096" npm run build
|
run: cd frontend && NODE_OPTIONS="--max-old-space-size=4096" npm run build
|
||||||
|
|
||||||
- name: Upload build artifact
|
# Build artifact intentionally NOT uploaded. The e2e job below builds
|
||||||
uses: actions/upload-artifact@v3
|
# its own frontend rather than downloading one from this job, so there
|
||||||
with:
|
# is no need for the cross-job artifact handoff (which previously broke
|
||||||
name: frontend-dist
|
# on actions/upload-artifact@v4 GHES support and forced a v3 pin).
|
||||||
path: frontend/dist
|
# Decoupling also lets e2e start immediately rather than waiting for
|
||||||
retention-days: 1
|
# this job to finish — important on a multi-runner setup.
|
||||||
|
|
||||||
e2e:
|
e2e:
|
||||||
needs: [frontend]
|
|
||||||
runs-on: ubuntu-latest
|
runs-on: ubuntu-latest
|
||||||
|
|
||||||
services:
|
services:
|
||||||
@@ -140,10 +142,13 @@ jobs:
|
|||||||
POSTGRES_USER: postgres
|
POSTGRES_USER: postgres
|
||||||
POSTGRES_PASSWORD: postgres
|
POSTGRES_PASSWORD: postgres
|
||||||
POSTGRES_DB: resolutionflow_test
|
POSTGRES_DB: resolutionflow_test
|
||||||
ports:
|
# No host port mapping. Tests connect to `postgres:5432` (the service
|
||||||
- 5432:5432
|
# container's docker-network DNS name), not `localhost:5432`. With
|
||||||
|
# multiple Gitea runners on the same homelab box, host-port mapping
|
||||||
|
# would race — two backend/e2e jobs both binding 0.0.0.0:5432 → the
|
||||||
|
# second fails with "port is already allocated".
|
||||||
options: >-
|
options: >-
|
||||||
--health-cmd pg_isready
|
--health-cmd "pg_isready -U postgres"
|
||||||
--health-interval 10s
|
--health-interval 10s
|
||||||
--health-timeout 5s
|
--health-timeout 5s
|
||||||
--health-retries 5
|
--health-retries 5
|
||||||
@@ -182,11 +187,13 @@ jobs:
|
|||||||
- name: Install frontend dependencies
|
- name: Install frontend dependencies
|
||||||
run: cd frontend && npm ci
|
run: cd frontend && npm ci
|
||||||
|
|
||||||
- name: Download frontend build
|
- name: Build frontend
|
||||||
uses: actions/download-artifact@v3
|
# Building inline (instead of downloading an artifact from the
|
||||||
with:
|
# frontend job) drops the cross-job dependency, so e2e can start
|
||||||
name: frontend-dist
|
# immediately on a free runner. Adds ~1-2 min of build time, but
|
||||||
path: frontend/dist
|
# eliminates the artifact-upload mechanism entirely (no more
|
||||||
|
# v3/v4 GHES headaches) and saves ~5 min of waiting.
|
||||||
|
run: cd frontend && NODE_OPTIONS="--max-old-space-size=4096" VITE_API_URL="${PLAYWRIGHT_API_ORIGIN}" npm run build
|
||||||
|
|
||||||
- name: Install Playwright browser
|
- name: Install Playwright browser
|
||||||
run: cd frontend && npx playwright install --with-deps chromium
|
run: cd frontend && npx playwright install --with-deps chromium
|
||||||
|
|||||||
111
frontend/e2e/assistant-chat-prefill.spec.ts
Normal file
111
frontend/e2e/assistant-chat-prefill.spec.ts
Normal file
@@ -0,0 +1,111 @@
|
|||||||
|
import { expect, test } from '@playwright/test'
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Regression test for the prefill-handoff `currentChatRef` bug.
|
||||||
|
*
|
||||||
|
* Symptom: a chat session created via the dashboard prefill flow
|
||||||
|
* looked fine on the first AI turn, but submitting partial answers
|
||||||
|
* from the task lane silently dropped the AI's follow-up response.
|
||||||
|
* The user saw their answers in the chat, no assistant reply, no
|
||||||
|
* toast.
|
||||||
|
*
|
||||||
|
* Root cause: the prefill effect in `AssistantChatPage` set
|
||||||
|
* `activeChatId` without also updating `currentChatRef.current`, so
|
||||||
|
* the `currentChatRef.current !== sentForChatId` guard in
|
||||||
|
* `handleTaskSubmit` (and `handleSend`) tripped on every subsequent
|
||||||
|
* request and discarded the AI response.
|
||||||
|
*
|
||||||
|
* Strategy: drive the real prefill flow against the real backend, but
|
||||||
|
* intercept the `/chat` endpoint with `page.route` so we get
|
||||||
|
* deterministic question payloads on turn 1 and a deterministic
|
||||||
|
* follow-up on turn 2. The fix is what makes turn 2 visible.
|
||||||
|
*/
|
||||||
|
test.describe('AssistantChatPage — prefill handoff regression', () => {
|
||||||
|
test('AI follow-up renders after submitting partial task lane answers', async ({ page }) => {
|
||||||
|
let chatCallCount = 0
|
||||||
|
|
||||||
|
// Clear any persisted active-chat-id so the page does not auto-resume a
|
||||||
|
// stale session left behind by a sibling spec.
|
||||||
|
await page.addInitScript(() => {
|
||||||
|
try {
|
||||||
|
sessionStorage.removeItem('rf-active-chat-id')
|
||||||
|
sessionStorage.removeItem('rf-tasklane-meta')
|
||||||
|
} catch { /* ignore */ }
|
||||||
|
})
|
||||||
|
|
||||||
|
// Intercept only the chat endpoint. Session creation, listSessions,
|
||||||
|
// facts, suggested-fixes, etc. all hit the real backend so the page
|
||||||
|
// renders normally — only the LLM call is deterministic. The pattern
|
||||||
|
// matches `/ai-sessions/<uuid>/chat` and nothing nested beneath it.
|
||||||
|
await page.route(/\/api\/v1\/ai-sessions\/[^/]+\/chat$/, async (route) => {
|
||||||
|
if (route.request().method() !== 'POST') {
|
||||||
|
await route.fallback()
|
||||||
|
return
|
||||||
|
}
|
||||||
|
chatCallCount += 1
|
||||||
|
if (chatCallCount === 1) {
|
||||||
|
await route.fulfill({
|
||||||
|
status: 200,
|
||||||
|
contentType: 'application/json',
|
||||||
|
body: JSON.stringify({
|
||||||
|
content: 'Initial diagnostic plan. Please answer the questions in the task lane.',
|
||||||
|
suggested_flows: [],
|
||||||
|
fork: null,
|
||||||
|
actions: [],
|
||||||
|
questions: [
|
||||||
|
{ text: 'Has the user recently changed their password?' },
|
||||||
|
{ text: 'Is the lockout happening at a consistent time of day?' },
|
||||||
|
],
|
||||||
|
}),
|
||||||
|
})
|
||||||
|
return
|
||||||
|
}
|
||||||
|
await route.fulfill({
|
||||||
|
status: 200,
|
||||||
|
contentType: 'application/json',
|
||||||
|
body: JSON.stringify({
|
||||||
|
content: 'Got it — based on your answer, here is what to check next.',
|
||||||
|
suggested_flows: [],
|
||||||
|
fork: null,
|
||||||
|
actions: [],
|
||||||
|
questions: [],
|
||||||
|
}),
|
||||||
|
})
|
||||||
|
})
|
||||||
|
|
||||||
|
// Drive the prefill flow exactly the way the dashboard does. The textarea
|
||||||
|
// is keyed by its placeholder copy on QuickStartPage.
|
||||||
|
await page.goto('/')
|
||||||
|
const prefillBox = page.getByPlaceholder(/Describe the issue/i)
|
||||||
|
await expect(prefillBox).toBeVisible({ timeout: 10_000 })
|
||||||
|
await prefillBox.fill('User locked out of AD weekly')
|
||||||
|
await prefillBox.press('Enter')
|
||||||
|
|
||||||
|
// After the prefill submits we land on /pilot and the first stubbed AI
|
||||||
|
// turn surfaces the task-lane question text.
|
||||||
|
await expect(page).toHaveURL(/\/pilot/)
|
||||||
|
await expect(
|
||||||
|
page.getByText('Has the user recently changed their password?'),
|
||||||
|
).toBeVisible({ timeout: 15_000 })
|
||||||
|
|
||||||
|
// Answer the first question. UI flow: click "Answer" to open the
|
||||||
|
// textarea, type, click the inline "Answer" button to mark done.
|
||||||
|
await page.getByRole('button', { name: /^Answer$/ }).first().click()
|
||||||
|
await page.getByPlaceholder('Type your answer...').fill('No, password is months old')
|
||||||
|
await page.getByRole('button', { name: /^Answer$/ }).first().click()
|
||||||
|
|
||||||
|
// Submit the partial response. Pre-fix: the response was silently dropped
|
||||||
|
// here because `currentChatRef.current` still held the mount-time value.
|
||||||
|
await page.getByRole('button', { name: /Send 1 of 2 Responses/ }).click()
|
||||||
|
|
||||||
|
// Bug repro: the assistant message must render. Pre-fix this assertion
|
||||||
|
// fails because `handleTaskSubmit` early-returns at the
|
||||||
|
// `currentChatRef.current !== sentForChatId` guard.
|
||||||
|
await expect(
|
||||||
|
page.getByText('Got it — based on your answer, here is what to check next.'),
|
||||||
|
).toBeVisible({ timeout: 15_000 })
|
||||||
|
|
||||||
|
// Both chat calls must have actually happened.
|
||||||
|
expect(chatCallCount).toBe(2)
|
||||||
|
})
|
||||||
|
})
|
||||||
@@ -88,6 +88,8 @@ test.describe('command palette smoke tests', () => {
|
|||||||
|
|
||||||
await flowpilotOption.click()
|
await flowpilotOption.click()
|
||||||
|
|
||||||
await expect(page).toHaveURL(/\/assistant/)
|
// Phase 1 of the FlowPilot migration renamed /assistant to /pilot.
|
||||||
|
// /assistant still 301-redirects to /pilot, so accept either landing URL.
|
||||||
|
await expect(page).toHaveURL(/\/(pilot|assistant)/)
|
||||||
})
|
})
|
||||||
})
|
})
|
||||||
|
|||||||
@@ -24,13 +24,21 @@ test.describe('session history smoke tests', () => {
|
|||||||
await page.goto('/sessions')
|
await page.goto('/sessions')
|
||||||
|
|
||||||
await expect(
|
await expect(
|
||||||
page.getByRole('heading', { name: 'Sessions', exact: true }),
|
page.getByRole('heading', { name: 'Session History', exact: true }),
|
||||||
).toBeVisible()
|
).toBeVisible()
|
||||||
|
|
||||||
|
// Default tab on /sessions is "AI Sessions"; flow sessions live behind
|
||||||
|
// the "Flow Sessions" tab and only that tab exposes ticket/client filters.
|
||||||
|
await page.getByRole('button', { name: 'Flow Sessions' }).click()
|
||||||
|
|
||||||
await page.getByPlaceholder('Search by ticket number...').fill(ticketNumber)
|
await page.getByPlaceholder('Search by ticket number...').fill(ticketNumber)
|
||||||
await page.getByPlaceholder('Search by client name...').fill(clientName)
|
await page.getByPlaceholder('Search by client name...').fill(clientName)
|
||||||
|
|
||||||
const sessionCard = page.locator('.bg-card').filter({ hasText: ticketNumber }).filter({ hasText: clientName }).first()
|
const sessionCard = page
|
||||||
|
.getByTestId('flow-session-card')
|
||||||
|
.filter({ hasText: ticketNumber })
|
||||||
|
.filter({ hasText: clientName })
|
||||||
|
.first()
|
||||||
await expect(sessionCard).toBeVisible()
|
await expect(sessionCard).toBeVisible()
|
||||||
await expect(sessionCard.getByText(tree.name)).toBeVisible()
|
await expect(sessionCard.getByText(tree.name)).toBeVisible()
|
||||||
|
|
||||||
|
|||||||
@@ -24,7 +24,7 @@ test.describe('flow library start-session smoke tests', () => {
|
|||||||
await page.getByPlaceholder('Search flows...').fill(tree.name)
|
await page.getByPlaceholder('Search flows...').fill(tree.name)
|
||||||
await page.getByRole('button', { name: 'Search', exact: true }).click()
|
await page.getByRole('button', { name: 'Search', exact: true }).click()
|
||||||
|
|
||||||
const treeCard = page.locator('.bg-card').filter({ hasText: tree.name }).first()
|
const treeCard = page.getByTestId('tree-card').filter({ hasText: tree.name }).first()
|
||||||
await expect(treeCard).toBeVisible()
|
await expect(treeCard).toBeVisible()
|
||||||
await treeCard.getByRole('button', { name: /^Start(?: Session)?$/ }).click()
|
await treeCard.getByRole('button', { name: /^Start(?: Session)?$/ }).click()
|
||||||
|
|
||||||
|
|||||||
@@ -20,7 +20,7 @@ test.describe('flow library smoke tests', () => {
|
|||||||
await page.getByPlaceholder('Search flows...').fill(tree.name)
|
await page.getByPlaceholder('Search flows...').fill(tree.name)
|
||||||
await page.getByRole('button', { name: 'Search', exact: true }).click()
|
await page.getByRole('button', { name: 'Search', exact: true }).click()
|
||||||
|
|
||||||
await expect(page.getByText(tree.name)).toBeVisible()
|
await expect(page.getByTestId('tree-card').filter({ hasText: tree.name }).first()).toBeVisible()
|
||||||
} finally {
|
} finally {
|
||||||
await disposeApiContext(api)
|
await disposeApiContext(api)
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -14,7 +14,7 @@ test.describe('authenticated navigation smoke tests', () => {
|
|||||||
await page.goto('/sessions')
|
await page.goto('/sessions')
|
||||||
|
|
||||||
await expect(
|
await expect(
|
||||||
page.getByRole('heading', { name: 'Sessions', exact: true }),
|
page.getByRole('heading', { name: 'Session History', exact: true }),
|
||||||
).toBeVisible()
|
).toBeVisible()
|
||||||
})
|
})
|
||||||
|
|
||||||
@@ -30,7 +30,7 @@ test.describe('authenticated navigation smoke tests', () => {
|
|||||||
await page.goto('/account')
|
await page.goto('/account')
|
||||||
|
|
||||||
await expect(
|
await expect(
|
||||||
page.getByRole('heading', { name: 'Account Settings' }),
|
page.getByRole('heading', { name: 'Account Management' }),
|
||||||
).toBeVisible()
|
).toBeVisible()
|
||||||
})
|
})
|
||||||
})
|
})
|
||||||
|
|||||||
@@ -18,9 +18,17 @@ test.describe('session resume smoke tests', () => {
|
|||||||
})
|
})
|
||||||
|
|
||||||
try {
|
try {
|
||||||
await page.goto('/trees')
|
// Resume flow moved off /trees onto the Flow Sessions tab of /sessions
|
||||||
|
// during the FlowPilot migration. The destination (/trees/:id/navigate)
|
||||||
|
// is unchanged — only the entry point shifted.
|
||||||
|
await page.goto('/sessions')
|
||||||
|
await expect(
|
||||||
|
page.getByRole('heading', { name: 'Session History', exact: true }),
|
||||||
|
).toBeVisible()
|
||||||
|
await page.getByRole('button', { name: 'Flow Sessions' }).click()
|
||||||
|
// Active sub-tab is the default and surfaces in-progress sessions.
|
||||||
|
|
||||||
const resumeCard = page.locator('.bg-card').filter({ hasText: tree.name }).filter({ hasText: 'Resume' }).first()
|
const resumeCard = page.getByTestId('flow-session-card').filter({ hasText: tree.name }).first()
|
||||||
await expect(resumeCard).toBeVisible()
|
await expect(resumeCard).toBeVisible()
|
||||||
await resumeCard.getByRole('button', { name: 'Resume' }).first().click()
|
await resumeCard.getByRole('button', { name: 'Resume' }).first().click()
|
||||||
|
|
||||||
|
|||||||
@@ -31,7 +31,7 @@ test.describe('shared session management smoke tests', () => {
|
|||||||
).toBeVisible()
|
).toBeVisible()
|
||||||
await expect(page.getByText(share.share_name || '')).toBeVisible()
|
await expect(page.getByText(share.share_name || '')).toBeVisible()
|
||||||
|
|
||||||
const shareCard = page.locator('.bg-card').filter({ hasText: share.share_name || '' }).first()
|
const shareCard = page.getByTestId('share-card').filter({ hasText: share.share_name || '' }).first()
|
||||||
await shareCard.getByRole('button', { name: 'Revoke' }).click()
|
await shareCard.getByRole('button', { name: 'Revoke' }).click()
|
||||||
|
|
||||||
const confirmDialog = page.getByRole('dialog', { name: 'Revoke Share Link' })
|
const confirmDialog = page.getByRole('dialog', { name: 'Revoke Share Link' })
|
||||||
|
|||||||
@@ -34,6 +34,8 @@ export function TreeGridView({
|
|||||||
{trees.map((tree) => (
|
{trees.map((tree) => (
|
||||||
<div
|
<div
|
||||||
key={tree.id}
|
key={tree.id}
|
||||||
|
data-testid="tree-card"
|
||||||
|
data-tree-id={tree.id}
|
||||||
className="relative bg-card border border-border rounded-2xl p-4 transition-all hover:-translate-y-0.5 hover:border-primary/30 hover:shadow-md sm:p-6"
|
className="relative bg-card border border-border rounded-2xl p-4 transition-all hover:-translate-y-0.5 hover:border-primary/30 hover:shadow-md sm:p-6"
|
||||||
>
|
>
|
||||||
<div className="mb-2 flex items-start justify-between gap-2">
|
<div className="mb-2 flex items-start justify-between gap-2">
|
||||||
|
|||||||
@@ -33,6 +33,8 @@ export function TreeListView({
|
|||||||
{trees.map((tree) => (
|
{trees.map((tree) => (
|
||||||
<div
|
<div
|
||||||
key={tree.id}
|
key={tree.id}
|
||||||
|
data-testid="tree-card"
|
||||||
|
data-tree-id={tree.id}
|
||||||
className="flex items-center gap-4 bg-card border border-border rounded-2xl p-4 transition-all hover:border-primary/30 hover:shadow-xs"
|
className="flex items-center gap-4 bg-card border border-border rounded-2xl p-4 transition-all hover:border-primary/30 hover:shadow-xs"
|
||||||
>
|
>
|
||||||
{/* Left: Name and Description */}
|
{/* Left: Name and Description */}
|
||||||
|
|||||||
@@ -255,6 +255,12 @@ export default function AssistantChatPage() {
|
|||||||
}
|
}
|
||||||
setChats(prev => [chatItem, ...prev])
|
setChats(prev => [chatItem, ...prev])
|
||||||
setActiveChatId(session.session_id)
|
setActiveChatId(session.session_id)
|
||||||
|
// Keep the in-flight guard ref in sync. Without this, currentChatRef
|
||||||
|
// stays at its mount-time value (often a stale id from sessionStorage
|
||||||
|
// or null), so subsequent handleSend / handleTaskSubmit calls bail at
|
||||||
|
// their `currentChatRef.current !== sentForChatId` check and the AI
|
||||||
|
// response is silently dropped.
|
||||||
|
currentChatRef.current = session.session_id
|
||||||
setMessages([{ role: 'user', content: prefill }])
|
setMessages([{ role: 'user', content: prefill }])
|
||||||
setLoading(true)
|
setLoading(true)
|
||||||
|
|
||||||
|
|||||||
@@ -161,7 +161,12 @@ export default function MySharesPage() {
|
|||||||
const isCopied = copiedId === share.id
|
const isCopied = copiedId === share.id
|
||||||
|
|
||||||
return (
|
return (
|
||||||
<div key={share.id} className="bg-card border border-border rounded-xl p-5">
|
<div
|
||||||
|
key={share.id}
|
||||||
|
data-testid="share-card"
|
||||||
|
data-share-id={share.id}
|
||||||
|
className="bg-card border border-border rounded-xl p-5"
|
||||||
|
>
|
||||||
{/* Top row: badge + name */}
|
{/* Top row: badge + name */}
|
||||||
<div className="flex items-center gap-3 mb-3">
|
<div className="flex items-center gap-3 mb-3">
|
||||||
<span className="inline-flex items-center gap-1.5 text-xs rounded-full px-2 py-0.5 bg-accent text-muted-foreground">
|
<span className="inline-flex items-center gap-1.5 text-xs rounded-full px-2 py-0.5 bg-accent text-muted-foreground">
|
||||||
|
|||||||
@@ -533,7 +533,11 @@ export default function SessionHistoryPage() {
|
|||||||
)}
|
)}
|
||||||
style={{ '--stagger-index': i } as React.CSSProperties}
|
style={{ '--stagger-index': i } as React.CSSProperties}
|
||||||
>
|
>
|
||||||
<div className="bg-card border border-border rounded-xl p-4 transition-all hover:border-[var(--color-border-hover)]">
|
<div
|
||||||
|
data-testid="flow-session-card"
|
||||||
|
data-session-id={session.id}
|
||||||
|
className="bg-card border border-border rounded-xl p-4 transition-all hover:border-[var(--color-border-hover)]"
|
||||||
|
>
|
||||||
<div className="flex flex-col gap-3 sm:flex-row sm:items-start sm:justify-between">
|
<div className="flex flex-col gap-3 sm:flex-row sm:items-start sm:justify-between">
|
||||||
<div className="flex-1">
|
<div className="flex-1">
|
||||||
<div className="flex flex-wrap items-center gap-2">
|
<div className="flex flex-wrap items-center gap-2">
|
||||||
|
|||||||
Reference in New Issue
Block a user