16 Commits

Author SHA1 Message Date
68fcdc6122 Merge PR #153: fix(chat): sync currentChatRef when prefill creates a new chat session
All checks were successful
CI / frontend (push) Successful in 5m57s
Mirror to GitHub / mirror (push) Successful in 13s
CI / backend (push) Successful in 10m28s
CI / e2e (push) Successful in 12m0s
Fixes a silent-drop bug where the dashboard prefill flow created a new chat session but didn't update the in-flight guard ref, so subsequent task-lane submissions had their AI follow-up responses discarded.

Includes a Playwright regression test that drives the prefill flow and stubs /ai-sessions/*/chat to verify the second AI turn renders. Also adds a stub ANTHROPIC_API_KEY to the e2e CI job so AI-gated endpoints clear their _require_ai_enabled() check (the chat call itself is intercepted in the browser, so no real Anthropic traffic).
2026-04-26 05:05:54 +00:00
11fe32f4c6 fix(ci): set stub ANTHROPIC_API_KEY for e2e job so AI-gated endpoints respond
All checks were successful
Mirror to GitHub / mirror (push) Successful in 11s
CI / frontend (pull_request) Successful in 5m39s
CI / backend (pull_request) Successful in 10m24s
CI / e2e (pull_request) Successful in 12m14s
POST /api/v1/ai-sessions and friends call _require_ai_enabled(), which
returns 503 when no provider key is set. The new prefill-handoff
regression test (e2e/assistant-chat-prefill.spec.ts) drives the
dashboard prefill flow, which has to create a chat session before its
page.route stub on /chat can fire — so without a key, session
creation 503s and the test never sees the task lane.

The Playwright stub intercepts /chat in the browser, so the backend
never actually contacts Anthropic — but the AI-enabled gate still
needs to pass. A stub value is enough.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-26 00:51:39 -04:00
43eed720d9 docs(ai): close out PR #150, set PR #153 as active task
Some checks failed
Mirror to GitHub / mirror (push) Successful in 13s
CI / frontend (pull_request) Successful in 5m50s
CI / e2e (pull_request) Failing after 6m50s
CI / backend (pull_request) Successful in 10m40s
- CURRENT_TASK.md rolled forward — the CI-recovery task is complete
  (PR #150 merged as 87bb20b; backend gate is in required checks).
  Active task is now landing PR #153.
- HANDOFF.md rewritten — new resume point is watching CI on the
  rebased SHA 1559feb and merging when all three checks are green.
- SESSION_LOG.md gains a 2026-04-26 entry covering the prefill bug
  diagnosis, fix, regression test, and the rebase off post-#150 main.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-26 00:30:50 -04:00
1559feb759 docs(ai): track currentChatRef silent-swallow follow-up in TODO
Some checks failed
Mirror to GitHub / mirror (push) Successful in 11s
CI / frontend (pull_request) Successful in 5m43s
CI / e2e (pull_request) Failing after 6m40s
CI / backend (pull_request) Has been cancelled
The guard pattern that masked the prefill-ref bug fixed in PR #153 is
applied across handleSend, handleTaskSubmit, selectChat, refreshFacts,
refreshActiveFix, and refreshPreview. Worth either logging the
mismatch path or distinguishing expected-stale from unexpected-stale
so the next instance of this class of bug surfaces instead of hiding.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-26 00:24:25 -04:00
b56da2facd fix(chat): sync currentChatRef when prefill creates a new chat session
The dashboard prefill flow in AssistantChatPage set activeChatId after
creating a new session but never updated currentChatRef.current. Every
later handleSend / handleTaskSubmit then tripped the
`currentChatRef.current !== sentForChatId` guard that was supposed to
discard responses for stale chats — and silently dropped the AI's
follow-up. The user saw their submitted message but no assistant
reply, no toast, no task-lane update.

Mirrors what handleNewChat and handleResumeNew already do. Adds an
e2e regression test that drives the dashboard prefill, submits a
partial task-lane response, and asserts the second AI turn renders.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-26 00:24:02 -04:00
87bb20b8f0 Merge PR #150: fix(ci): consolidated CI recovery — backend green, xdist parallelization, e2e selector + decoupling
All checks were successful
CI / frontend (push) Successful in 5m42s
Mirror to GitHub / mirror (push) Successful in 13s
CI / backend (push) Successful in 10m21s
CI / e2e (push) Successful in 11m5s
2026-04-25 21:57:26 +00:00
1e3a6cfa01 fix(e2e): harden card selectors for session resume
All checks were successful
Mirror to GitHub / mirror (push) Successful in 12s
CI / frontend (pull_request) Successful in 5m43s
CI / backend (pull_request) Successful in 10m21s
CI / e2e (pull_request) Successful in 11m23s
Co-Authored-By: Codex <noreply@openai.com>
2026-04-25 16:42:33 -04:00
ede6eebf9a docs(ai): note e2e decoupling commit (261814a) in HANDOFF
Some checks failed
Mirror to GitHub / mirror (push) Successful in 11s
CI / frontend (pull_request) Successful in 5m43s
CI / e2e (pull_request) Failing after 9m30s
CI / backend (pull_request) Successful in 10m18s
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-25 16:12:19 -04:00
261814ae65 perf(ci): decouple e2e from frontend — build frontend inline in e2e job
Some checks failed
Mirror to GitHub / mirror (push) Successful in 14s
CI / frontend (pull_request) Successful in 5m44s
CI / e2e (pull_request) Failing after 7m42s
CI / backend (pull_request) Successful in 10m28s
Before: e2e \`needs: [frontend]\` waited for the frontend job to upload
a build artifact, then downloaded it. With multiple runners this means
the third runner sat idle for ~6 min while frontend ran, then started
e2e — total wall-clock max(backend, frontend+e2e) ≈ 11 min.

After: e2e builds its own frontend (npm ci + npm run build are already
in the job; just dropped the artifact download step and added the
build). e2e starts immediately on a free runner. Adds ~1-2 min to the
e2e job duration but removes ~5 min of waiting and eliminates the
cross-job artifact mechanism entirely.

Side benefit: no more \`actions/upload-artifact\` v3/v4 GHES headaches
on the cross-job handoff. The \`if: always()\` upload of the
playwright-report at the end of e2e is kept (failure report retrieval
is still useful), but it's a leaf-output, not a dependency.

Net wall-clock: max(backend=9m, frontend=6m, e2e=7m) ≈ 9 min on the
3-runner setup, down from ~11 min.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-25 15:59:00 -04:00
6656ebdead docs(ai): reflect PR consolidation — #151/#152 merged into #150
Some checks failed
Mirror to GitHub / mirror (push) Successful in 12s
CI / e2e (pull_request) Has been cancelled
CI / backend (pull_request) Has been cancelled
CI / frontend (pull_request) Has been cancelled
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-25 15:55:08 -04:00
69f2a37591 fix(e2e): update 5 selectors that drifted with FlowPilot/PSA UI changes
Some checks failed
Mirror to GitHub / mirror (push) Successful in 11s
CI / frontend (pull_request) Successful in 5m52s
CI / e2e (pull_request) Has been cancelled
CI / backend (pull_request) Has been cancelled
Mechanical drift between the e2e selectors and the current UI surfaced
on the first CI run after PR #149 unblocked the artifact upload step.
Five tests, three categories of drift:

1. **Page heading renames** (navigation.spec.ts)
   - `Sessions` → `Session History` on /sessions
   - `Account Settings` → `Account Management` on /account

2. **Route rename** (command-palette.spec.ts:74)
   - The "Troubleshoot with FlowPilot" command palette option now lands
     on /pilot (Phase 1 of the FlowPilot migration renamed /assistant).
     /assistant still 301-redirects, so the assertion accepts either.

3. **Feature moved to /sessions** (history.spec.ts, resume.spec.ts)
   - Default tab on /sessions is "AI Sessions"; flow-session filtering
     and the Resume button moved behind the "Flow Sessions" tab. Both
     tests now click that tab before asserting.
   - resume.spec.ts no longer starts at /trees (Resume buttons aren't
     rendered there anymore — the flow lives on /sessions). Destination
     URL (/trees/:id/navigate) is unchanged.

No product-code changes — these are pure test updates against the
shipped UI. Run the suite locally with
`cd frontend && npm run test:e2e` once a fresh build is available.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-25 15:53:57 -04:00
7f714363dd perf(ci): pytest-xdist with per-worker DBs — 22m → ~4m
Backend suite is the slow gate (1076 passed locally in 22m27s on
fix/ci-workflow-config). Adding pytest-xdist with per-worker DB
isolation drops it to ~4m20s on the 8-core homelab runner. Verified
locally: `pytest -n auto --no-cov` finished in 4m28s real time
(15m19s user — confirms ~5× parallelism).

How it works:
- conftest.py reads `PYTEST_XDIST_WORKER` (set per worker by xdist —
  'gw0', 'gw1', …). When set, derives a per-worker DB URL like
  `…/resolutionflow_test_gw0`. The base DB stays for serial / master
  runs.
- `_ensure_worker_db_exists` runs synchronously at conftest import,
  connects to the postgres maintenance DB, and `CREATE DATABASE`s the
  worker-suffixed DB if it doesn't exist. Idempotent across runs.
- The "test" safety guard still applies — every worker DB name
  contains "test" so the assertion holds.
- The per-test `DROP SCHEMA public CASCADE` now operates on the
  worker's isolated DB, no cross-worker race.

CI workflow: backend job switches to `pytest -n auto`. Coverage still
collected (pytest-cov has built-in xdist support).

Adds `pytest-xdist==3.6.1` to requirements-dev.txt.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-25 15:53:47 -04:00
1bd43abb8f fix(ci): drop postgres host port mapping (multi-runner port collision)
Some checks failed
Mirror to GitHub / mirror (push) Successful in 12s
CI / frontend (pull_request) Successful in 6m44s
CI / e2e (pull_request) Failing after 8m43s
CI / backend (pull_request) Has been cancelled
With 3 Gitea Actions runners on the same homelab box, two simultaneous
backend (or backend + e2e) jobs both try to bind 0.0.0.0:5432 for their
postgres service containers. The second fails with:

  failed to set up container networking: ... Bind for 0.0.0.0:5432
  failed: port is already allocated

The host-port mapping isn't actually needed — the workflow uses
\`DATABASE_URL: postgresql+asyncpg://...@postgres:5432/...\` (hostname
\`postgres\` is the service container's docker-network DNS name).
The tests run inside the act container which is on the same docker
network, so they reach postgres without going through the host.

Removing \`ports: 5432:5432\` from both backend and e2e job service
definitions lets multiple postgres services run in parallel on
different docker networks without colliding on the host.

Surfaced when PR #150 ran in parallel with another job after the
multi-runner setup. Backend instant-failed in 2s on the docker run.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-25 15:28:17 -04:00
c203b70ef9 docs(ai): queue data-testid hardening + reflect PR #152 + 3-runner setup
Some checks failed
CI / backend (pull_request) Failing after 2s
Mirror to GitHub / mirror (push) Successful in 15s
CI / e2e (pull_request) Has been cancelled
CI / frontend (pull_request) Has been cancelled
TODO.md: Promote pytest-xdist to  (PR #151 carries it). Adds three new
backlog items:
- data-testid hardening for e2e-critical interactive elements (sparked
  by PR #152's selector drift work)
- per-test transactional rollback (next big speedup if needed)
- pytest-testmon for PR-time test selection

HANDOFF.md: Three open PRs now (#150, #151, #152), all independent.
Three Gitea runner agents now registered, so jobs run in parallel.
Combined with #151's xdist, the prior 1h 14m wall-clock should drop
to ~6-10 min. Updated merge order: #152 first (smallest), #150 next,
#151 last. After all three land, enable CI / backend then CI / e2e
as required status checks.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-25 15:26:21 -04:00
f27e3b44b0 docs(ai): SESSION_LOG entry for the parallelization session
Some checks failed
Mirror to GitHub / mirror (push) Successful in 11s
CI / backend (pull_request) Successful in 32m33s
CI / frontend (pull_request) Successful in 5m42s
CI / e2e (pull_request) Failing after 4m58s
(Was meant to land in fe632c9; the multi-line edit failed silently
because Codex's earlier entry shifted the surrounding context.)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-25 12:15:41 -04:00
fe632c9194 docs(ai): handoff after CI parallelization + final test fix
Some checks failed
Mirror to GitHub / mirror (push) Has been cancelled
CI / backend (pull_request) Successful in 30m26s
CI / frontend (pull_request) Successful in 5m46s
CI / e2e (pull_request) Failing after 5m3s
Updates HANDOFF.md, CURRENT_TASK.md, and SESSION_LOG.md to reflect:
- PR #150 now contains the AI-provider test mock + caching + maxfail.
  Backend CI should be fully green for the first time in months.
- PR #151 stacked on #150: pytest-xdist with per-worker DBs. Local
  verification: 22m 27s → 4m 28s (5× speedup), 1076 passed both runs.
- DoD is now: merge #150, then #151, then add CI / backend
  (pull_request) to required status checks on main.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-25 12:15:07 -04:00
20 changed files with 341 additions and 89 deletions

View File

@@ -1,22 +1,28 @@
# CURRENT_TASK.md # CURRENT_TASK.md
**Task:** Restore a fully green CI gate on `main` and lock it via branch protection so future merges can't introduce silent rot. **Task:** Land PR #153 — fix the AssistantChatPage prefill `currentChatRef` bug that silently dropped AI follow-up responses in the task lane.
**Status:** in-progress **Status:** in-progress (CI running on rebased branch)
**Definition of Done:** **Definition of Done:**
- [ ] PR #150 (`fix/ci-workflow-config`) merged. Both `CI / backend (pull_request)` and `CI / frontend (pull_request)` show success on the merge commit. - [ ] PR #153 (`fix/tasklane-prefill-ref`) CI green on the rebased SHA `1559feb`. Backend, frontend, and e2e all pass.
- [ ] `CI / backend (pull_request)` added to required status checks on `main` in Gitea branch protection (frontend is already required). - [ ] PR #153 merged into `main`.
- [ ] The 54 real backend test failures (left after #149's infra cleanup) categorized and fixed in a follow-up PR. Target: 0 failures, 0 errors on a `pytest` run inside `resolutionflow_backend`. - [ ] User-visible verification: from the dashboard, type a prefill, answer a subset of task-lane questions, click *Send N of M Responses* — AI follow-up renders.
- [ ] `npm run lint` stays at 0 errors after the cleanup PR (already at 0 on main).
- [ ] Append a SESSION_LOG.md entry summarizing what shipped.
**Assumptions:** **Assumptions:**
- The 54 failures fall into a small number of root-cause categories (likely 35: fixture-scoping leaks, DB cleanup ordering, account_id propagation in test seed paths). Verify before assuming. - Rebasing onto post-#150 main pulls in the workflow fixes (no host-port mapping for postgres, no upload-artifact step), so the prior CI failures on this PR are resolved structurally.
- The pytest-asyncio 0.24 + pytest 8.4 toolchain bumped in #149 is the right baseline; do not revert. - The new e2e regression spec (`frontend/e2e/assistant-chat-prefill.spec.ts`) doesn't depend on Anthropic — it uses `page.route` to stub `/ai-sessions/*/chat` deterministically, so it should be CI-stable.
- `DATABASE_TEST_URL` is the only DB URL conftest will honor; do not weaken the safety guard added in `dab740d`.
**Out of scope:** **Out of scope (deferred to TODO):**
- New feature work on FlowPilot (Phase 10+) or PSA — keep this branch focused on CI debt. - Hardening the `currentChatRef.current !== sentForChatId` silent-return pattern across `handleSend`, `handleTaskSubmit`, `selectChat`, `refreshFacts`, `refreshActiveFix`, `refreshPreview`. PR #153 fixes the specific symptom; the broader audit is its own task in `TODO.md`.
- Frontend lint warnings (23 remain after #149; they're missing-deps in useEffect, opt-in cleanup later). - Promoting `CI / e2e (pull_request)` to required on `main`. Holding off until two consecutive green PR runs (PR #150 was one; PR #153's run will be the second if it passes).
- RLS test suite (`test_rls_isolation.py`) — gated behind `RUN_RLS_TESTS=1` and not in the default CI run.
## Previous task — closed out
**Task:** Land consolidated CI-recovery PR #150 and lock reliable CI gates on `main`.
**Status:** complete (2026-04-26).
- PR #150 merged as commit `87bb20b` on `main`. Backend, frontend, and e2e all green on the merge SHA.
- `CI / backend (pull_request)` added to required status checks on `main` (alongside the pre-existing `CI / frontend (pull_request)` requirement).
- `CI / e2e (pull_request)` left as not-required pending one more green PR run.

View File

@@ -2,62 +2,54 @@
# HANDOFF.md # HANDOFF.md
**Last updated:** 2026-04-25 06:12 EDT **Last updated:** 2026-04-26 03:50 EDT
**Active task:** Restore green CI gate on `main` and lock it via branch protection. See [CURRENT_TASK.md](CURRENT_TASK.md). **Active task:** Ship PR #153 — the AssistantChatPage prefill `currentChatRef` bug fix. See [CURRENT_TASK.md](CURRENT_TASK.md).
**Branch:** `fix/ci-workflow-config` **Branch:** `fix/tasklane-prefill-ref` → PR #153.
## Current state ## Current resume point
Previous session fixed the 54 real backend failures left after #149. The default backend suite is now green locally: PR #150 is merged. Branch `fix/tasklane-prefill-ref` has been rebased onto the new `main` (HEAD `1559feb`) and force-pushed; that pulled in the workflow fixes from #150, so the prior backend port-conflict and frontend upload-artifact failures are gone structurally.
```bash CI was kicked off automatically on the rebased SHA. Watch:
docker exec resolutionflow_backend bash -lc 'pytest --override-ini="addopts=" -q > /tmp/full-backend.log 2>&1; code=$?; tail -n 160 /tmp/full-backend.log; exit $code'
# 1076 passed, 35 deselected in 1347.41s (0:22:27)
```
Targeted validation also passed: - `CI / backend (pull_request)` — should pass now that postgres uses the docker-network DNS name `postgres:5432` (no host port race).
- `CI / frontend (pull_request)` — should pass now that the `actions/upload-artifact@v4` step is removed.
- `CI / e2e (pull_request)` — runs on its own postgres + builds frontend inline. Includes the new `e2e/assistant-chat-prefill.spec.ts` regression test, which stubs `/ai-sessions/*/chat` with `page.route` and is independent of Anthropic.
- `tests/test_session_resolutions_api.py tests/test_session_sharing.py tests/test_session_suggested_fixes_api.py tests/test_survey.py tests/test_tenant_isolation_p0.py tests/test_tree_sharing.py tests/test_trees.py::TestTrees::test_delete_tree_cleans_up_folder_and_tag_assignments tests/test_uploads.py::test_delete_upload_forbidden_for_non_owner``73 passed` If all three go green, merge #153 into `main`.
- PDF export tests → `3 passed`
- Prompt/PSA/resolution/script-builder subset → `14 passed`
- Admin/AI/branch subsets → `11 passed`
## What changed ## What this PR fixes
Production fixes: Reported symptom: in a troubleshooting (chat) session, after answering a subset of the task-lane questions and clicking *Send N of M Responses*, no AI response appeared.
- CI/backend dev image now installs WeasyPrint system libraries. Root cause: the dashboard prefill effect in `AssistantChatPage` set `activeChatId` after creating a new session but never updated `currentChatRef.current`. The `currentChatRef.current !== sentForChatId` guard inside `handleSend` and `handleTaskSubmit` then bailed silently on every later request and discarded the AI's reply.
- Public share-token and survey routes are mounted outside tenant auth; protected share management remains tenant-protected.
- Folder creation now persists `UserFolder.account_id`.
- Script Builder save-to-library now persists `ScriptTemplate.account_id`.
- Resolution output generation eager-loads `AISession.steps` to avoid async lazy-load `MissingGreenlet`.
- AI session model now declares the generated `search_vector` column already present in Alembic, so `create_all` test schemas match runtime migrations.
- Direct account-role update now rejects `"owner"`; ownership changes must use the transfer path.
- Assistant prompt marker examples no longer include a literal executable `create_spin_off_ticket` payload.
Test/harness fixes: Fix is a single line: assign `currentChatRef.current = session.session_id` immediately after `setActiveChatId(session.session_id)` in the prefill effect, mirroring `handleNewChat` and `handleResumeNew`.
- Test seeds updated for tenant-scoped `account_id` columns on sessions, branches, resolution outputs, script templates, PSA connections, folders, schedules, and categories. ## Verification completed
- Tests aligned with 404-not-403 resource-hiding policy.
- Disabled-AI tests now restore both Anthropic and Google key settings. - New regression test `frontend/e2e/assistant-chat-prefill.spec.ts` drives the real dashboard prefill flow against the real backend, stubs `/ai-sessions/*/chat` for deterministic turn-1/turn-2 responses, asserts the second AI message renders. Confirmed it fails on unfixed code at the exact assertion (`Got it — based on your answer…` never appears) and passes once the fix is restored.
- Pytest harness closes pytest-asyncio's leftover clean loop and ignores known unclosed asyncio/asyncpg teardown ResourceWarnings that otherwise appear at arbitrary later setup points under `filterwarnings = error`. - `tsc -b` clean. No new lint errors. Adjacent specs (`flowpilot-chat`) still pass.
## Branch protection on main (current)
- PR-only merges
- `CI / frontend (pull_request)` required
- `CI / backend (pull_request)` required (added during PR #150 close-out)
- Force-push blocked
- No review required (solo)
## Immediate next steps ## Immediate next steps
1. Commit current working tree if not already committed with trailer: 1. Watch PR #153 CI on `1559feb`.
`Co-Authored-By: Codex <noreply@openai.com>`. 2. Merge PR #153 when all three checks are green.
2. Check PR #150 status on Gitea. If both `CI / backend (pull_request)` and `CI / frontend (pull_request)` are green, merge it. 3. After merge, decide whether to promote `CI / e2e (pull_request)` to required (would make this two consecutive green PR e2e runs in a row, the threshold from the prior CURRENT_TASK).
3. After #150 merges, add `CI / backend (pull_request)` to required status checks on main: 4. Pick next item from `.ai/TODO.md` — top "Up next" is the `data-testid` audit; the `currentChatRef` silent-return follow-up is in Backlog and is a natural pairing with this fix.
```bash
PATCH /repos/chihlasm/resolutionflow/branch_protections/main
{ "status_check_contexts": ["CI / frontend (pull_request)", "CI / backend (pull_request)"] }
```
`$GITEA_TOKEN` is in `.claude/settings.local.json`.
4. Run/confirm frontend lint if needed for the final DoD item (`npm run lint` was already green after #149, but this session did not rerun it).
## Open questions ## Useful breadcrumbs
- PR #150 was not rechecked or merged in this session. - Bug fix: [`frontend/src/pages/AssistantChatPage.tsx`](../frontend/src/pages/AssistantChatPage.tsx) around line 258 — the `currentChatRef.current = session.session_id` line in the prefill `sendPrefill` effect.
- Branch protection was not updated in this session. - Regression test: [`frontend/e2e/assistant-chat-prefill.spec.ts`](../frontend/e2e/assistant-chat-prefill.spec.ts).
- TODO entry tracking the broader silent-return audit: [`.ai/TODO.md`](TODO.md).

View File

@@ -12,6 +12,37 @@
--- ---
## 2026-04-26 03:50 EDT — Claude Code — Ship AssistantChatPage prefill `currentChatRef` fix; close out PR #150
- User reported a troubleshooting-session bug: after answering a subset of task-lane questions and clicking *Send N of M Responses*, no AI response appeared. Traced to `AssistantChatPage`: the dashboard prefill effect set `activeChatId` after creating a new chat session but never updated `currentChatRef.current`. The `currentChatRef.current !== sentForChatId` guard in `handleSend` and `handleTaskSubmit` then bailed silently on every later request and discarded the AI's reply. The user message was already pushed to the chat before the await, so the user saw their answers but nothing else.
- Fix: one-line addition mirroring `handleNewChat` and `handleResumeNew` — assign `currentChatRef.current = session.session_id` immediately after `setActiveChatId(session.session_id)` in the prefill effect. Branched off `origin/main` as `fix/tasklane-prefill-ref`; PR #153 opened on Gitea.
- Authored a Playwright regression test `frontend/e2e/assistant-chat-prefill.spec.ts` that drives the real dashboard prefill flow against the real backend, stubs `/ai-sessions/*/chat` with `page.route` for deterministic turn-1/turn-2 responses, and asserts the second AI message renders. Confirmed the test fails on unfixed code at the exact assertion (`Got it — based on your answer…` never appears) and passes once the fix is restored.
- Verified locally inside `mcr.microsoft.com/playwright:v1.58.2-noble` against the running dev stack: new spec passes, adjacent `flowpilot-chat` spec still passes, `tsc -b` clean. `resume.spec` and `history.spec` failures observed are pre-existing real-backend fixture collisions, unrelated to this change.
- First CI run on PR #153 failed on infrastructure issues already addressed by PR #150: backend hit `Bind for 0.0.0.0:5432 failed: port is already allocated`, frontend hit `actions/upload-artifact@v4 not supported on GHES`. PR #150 was already merged (commit `87bb20b` on `main`). Rebased `fix/tasklane-prefill-ref` onto new `main` (force-push `1a8cb06``1559feb`), resolved a `.ai/TODO.md` conflict by keeping both backlog item sets, kicked off CI on the rebased SHA.
- Confirmed `CI / backend (pull_request)` is now in branch protection's required-status-checks list (added during PR #150 close-out). `CI / e2e (pull_request)` left as not-required pending one more clean PR run as the threshold.
- Recorded the broader silent-return concern in TODO backlog: the `currentChatRef.current !== sentForChatId` guard is applied across `handleSend`, `handleTaskSubmit`, `selectChat`, `refreshFacts`, `refreshActiveFix`, and `refreshPreview`. PR #153 fixes one symptom but the same pattern can mask other drift. Either log a Sentry breadcrumb on the mismatch path or distinguish "expected stale" (chat switch) from "unexpected stale" (ref never updated) so the latter alerts.
- Left for next session: watch PR #153 CI on `1559feb`, merge when green, then decide whether to promote `CI / e2e (pull_request)` to required and pick the next item from `TODO.md` — natural pairing is the `currentChatRef` silent-return audit added to backlog this session.
- Files touched: `frontend/src/pages/AssistantChatPage.tsx` (the one-line fix + comment), `frontend/e2e/assistant-chat-prefill.spec.ts` (new regression test), `.ai/TODO.md` (silent-return follow-up entry, plus conflict resolution preserving PR #150's backlog additions), `.ai/CURRENT_TASK.md` (rolled forward to PR #153 active task; previous task closed out), `.ai/HANDOFF.md` (new resume point), `.ai/SESSION_LOG.md` (this entry).
## 2026-04-25 16:41 EDT — Codex — Stabilize PR #150 e2e selectors
- Investigated the remaining PR #150 failure after backend and frontend CI were green. The e2e resume smoke test was not failing because of product behavior; it used `.bg-card` plus text filtering and matched the tree filter `<select>` before the intended session card.
- Added stable test IDs to flow session, tree, and share cards, then updated affected e2e tests to target those cards instead of Tailwind class names.
- Hardened the CI workflow by making Postgres healthchecks authenticate as `postgres` and baking `VITE_API_URL="${PLAYWRIGHT_API_ORIGIN}"` into the e2e frontend build.
- Verified with `git diff --check`, frontend build in Docker, no remaining `.bg-card` e2e selectors, and focused Playwright runs in an Actions-like Ubuntu container: resume spec passed, then history/library/library-start/resume/shares passed (`6 passed`).
- Left for next session: push this WIP commit to PR #150, watch CI, merge when all three jobs are green, then enable backend branch protection and consider the e2e gate after a reliable green run.
- Files touched: `.gitea/workflows/ci.yml`, `frontend/e2e/history.spec.ts`, `frontend/e2e/library-start.spec.ts`, `frontend/e2e/library.spec.ts`, `frontend/e2e/resume.spec.ts`, `frontend/e2e/shares.spec.ts`, `frontend/src/components/library/TreeGridView.tsx`, `frontend/src/components/library/TreeListView.tsx`, `frontend/src/pages/MySharesPage.tsx`, `frontend/src/pages/SessionHistoryPage.tsx`, `.ai/HANDOFF.md`, `.ai/CURRENT_TASK.md`, `.ai/SESSION_LOG.md`.
## 2026-04-25 12:00 America/New_York — Claude Code — Mock final AI-provider test, cache CI deps, parallelize backend with pytest-xdist
- Diagnosed why CI was still red despite Codex's local 1076 passed: a single test (`test_record_decision_persists_and_bumps_state_version`) needed `ANTHROPIC_API_KEY` because the `decision: draft_template` path calls `TemplateExtractionService` → AI provider. Patched `_extract_template_parameters` with an `AsyncMock` so the test no longer depends on AI availability. Verified.
- Pushed Codex's WIP commit `49f8856` to PR #150 (had been local-only per handoff protocol).
- PR #150 (`fix/ci-workflow-config`) extended with cheap CI wins: `actions/cache@v3` for pip + npm in all three jobs; dropped `--cov-report=term-missing` (the custom display step parses JSON); added `--maxfail=10` so structural breakage exits fast.
- PR #151 (`fix/ci-pytest-xdist`) opened, stacked on #150: pytest-xdist with per-worker DB isolation. `conftest.py` reads `PYTEST_XDIST_WORKER`, computes a per-worker DB URL like `…_gw0`, and synchronously CREATEs the DB on first import. The per-test `DROP SCHEMA public CASCADE` then operates on the worker's isolated DB. Verified locally: backend suite went from 22m 27s serial → 4m 28s parallel (8 workers), 1076 passed in both cases. ~5× speedup.
- Decided NOT to do per-test transactional rollback (bigger refactor); captured for future TODO consideration.
- Left for next session: watch CI on both PRs, merge in order (#150 first, #151 second), then enable `CI / backend (pull_request)` as a required status check on main.
- Files touched: `backend/tests/test_session_suggested_fixes_api.py`, `backend/tests/conftest.py`, `backend/requirements-dev.txt`, `.gitea/workflows/ci.yml`, `.ai/HANDOFF.md`, `.ai/CURRENT_TASK.md`, `.ai/TODO.md`.
## 2026-04-25 06:12 EDT — Codex — Fix backend suite to green ## 2026-04-25 06:12 EDT — Codex — Fix backend suite to green
- Fixed the real backend failures left after the CI-infra cleanup: tenant-scoped seed drift, missing production `account_id` writes, public route mounting for survey/share links, Script Builder library saves, resolution output async loading, AI search schema metadata, disabled-AI fixture leakage, and prompt marker guardrails. - Fixed the real backend failures left after the CI-infra cleanup: tenant-scoped seed drift, missing production `account_id` writes, public route mounting for survey/share links, Script Builder library saves, resolution output async loading, AI search schema metadata, disabled-AI fixture leakage, and prompt marker guardrails.

View File

@@ -5,9 +5,13 @@
## Up next ## Up next
- [ ] **Parallelize backend pytest with pytest-xdist.** Currently the backend suite takes ~22 min wall-clock for `1076 passed, 35 deselected` (verified locally 2026-04-25). With `-n auto` on the homelab Gitea Actions runner, this should land in the 36 min range depending on core count. Blocker: `test_db` fixture in `backend/tests/conftest.py` does `DROP SCHEMA public CASCADE` per test, which two workers would race on. Standard fix: one database per worker, derived from `PYTEST_XDIST_WORKER` env var inside conftest. The runner has spare CPU, so prioritize once main is green and the 54-failure cleanup has landed. - [ ] **Parallelize backend pytest with pytest-xdist.** ✅ landing as PR #151. Verified locally: backend suite 22 min → 4m 28s with `-n auto` on the 8-core homelab runner. Per-worker DB isolation via `PYTEST_XDIST_WORKER` in conftest.py.
## Backlog ## Backlog
- [ ] **Frontend lint warnings cleanup.** 23 `react-hooks/exhaustive-deps` warnings remain after PR #149 (mostly missing-deps in useEffect). Either fix them or audit them for known-safe ones and add eslint-disable comments. Not blocking CI today. - [ ] **Frontend lint warnings cleanup.** 23 `react-hooks/exhaustive-deps` warnings remain after PR #149 (mostly missing-deps in useEffect). Either fix them or audit them for known-safe ones and add eslint-disable comments. Not blocking CI today.
- [ ] **Audit `filterwarnings` ignores added in `wip(handoff): restore backend suite to green`.** Codex added narrow `ResourceWarning` filters for unclosed socket/transport/event-loop noise from pytest-asyncio teardown. Worth periodically reviewing whether those are still needed (e.g. when bumping pytest-asyncio) — if a real warning appears in those forms it would be silenced. - [ ] **Audit `filterwarnings` ignores added in `wip(handoff): restore backend suite to green`.** Codex added narrow `ResourceWarning` filters for unclosed socket/transport/event-loop noise from pytest-asyncio teardown. Worth periodically reviewing whether those are still needed (e.g. when bumping pytest-asyncio) — if a real warning appears in those forms it would be silenced.
- [ ] **Add `data-testid` attributes to e2e-critical interactive elements.** PR #152 fixed five Playwright tests by chasing UI-text changes (`Sessions``Session History`, `Account Settings``Account Management`, `/assistant``/pilot`, "Flow Sessions" tab, Resume button on session cards). Each was a one-line selector update, but every UI churn re-breaks them. Adding stable `data-testid` attributes on the targeted elements (page heading wrappers, tab nav, primary action buttons) and switching tests to `getByTestId` would make these immune to copy/route renames. Scope it small — start with `SessionHistoryPage` heading, the AI/Flow Sessions tab buttons, the per-session `Resume` button, and the command-palette FlowPilot option.
- [ ] **Per-test transactional rollback in `test_db` fixture.** Bigger engineering than xdist (which we already shipped). Instead of `DROP SCHEMA public CASCADE` per test, wrap each test in a savepoint and rollback at teardown. ~30-40% additional speedup on top of xdist for test-DB-heavy tests. Real refactor; only worth it if the suite gets significantly larger or runs more frequently.
- [ ] **Consider `pytest-testmon` for PR-time test selection.** Tracks which tests touched which source files and only re-runs affected ones. Best for small PRs touching ~few files. Adds cache-invalidation complexity; only worth it if the suite stays painfully long even after xdist.
- [ ] **AssistantChatPage `currentChatRef` guard is a silent return**`handleSend`, `handleTaskSubmit`, `selectChat`, `refreshFacts`, `refreshActiveFix`, and `refreshPreview` all bail with `if (currentChatRef.current !== sentForChatId) return` when stale. This is by design for chat switching, but it also silently masked the prefill-ref bug fixed in PR #153 — the user just saw "no AI response" with no log, no toast, no Sentry event. Either (a) log a `console.warn`/Sentry breadcrumb on the mismatch path so future drift is visible, or (b) split "expected stale" (chat switch) from "unexpected stale" (ref never updated) so only the latter alerts. Pair with an audit of every `currentChatRef.current = ...` assignment vs every `setActiveChatId(...)` call to make sure they're paired everywhere.

View File

@@ -17,10 +17,13 @@ jobs:
POSTGRES_USER: postgres POSTGRES_USER: postgres
POSTGRES_PASSWORD: postgres POSTGRES_PASSWORD: postgres
POSTGRES_DB: resolutionflow_test POSTGRES_DB: resolutionflow_test
ports: # No host port mapping. Tests connect to `postgres:5432` (the service
- 5432:5432 # container's docker-network DNS name), not `localhost:5432`. With
# multiple Gitea runners on the same homelab box, host-port mapping
# would race — two backend/e2e jobs both binding 0.0.0.0:5432 → the
# second fails with "port is already allocated".
options: >- options: >-
--health-cmd pg_isready --health-cmd "pg_isready -U postgres"
--health-interval 10s --health-interval 10s
--health-timeout 5s --health-timeout 5s
--health-retries 5 --health-retries 5
@@ -66,11 +69,15 @@ jobs:
run: cd backend && python scripts/check_tenant_filters.py run: cd backend && python scripts/check_tenant_filters.py
- name: Run tests with coverage - name: Run tests with coverage
# `-n auto` parallelizes across all runner cores via pytest-xdist.
# conftest.py creates a per-worker DB (resolutionflow_test_gw0,
# resolutionflow_test_gw1, …) so the per-test DROP SCHEMA doesn't
# race across workers. Master/serial runs keep the base DB.
# term-missing dropped — the custom "Display coverage summary" step # term-missing dropped — the custom "Display coverage summary" step
# below parses coverage.json and prints the same info more concisely. # below parses coverage.json and prints the same info more concisely.
# --maxfail=10 short-circuits on structural breakage so we don't burn # --maxfail=10 short-circuits on structural breakage so we don't burn
# 25 minutes when a fixture explodes. # 25 minutes when a fixture explodes.
run: cd backend && python -m pytest --override-ini="addopts=" --maxfail=10 --cov=app --cov-report=json:coverage.json --cov-fail-under=50 run: cd backend && python -m pytest --override-ini="addopts=" -n auto --maxfail=10 --cov=app --cov-report=json:coverage.json --cov-fail-under=50
- name: Display coverage summary - name: Display coverage summary
if: always() if: always()
@@ -118,15 +125,14 @@ jobs:
- name: Build - name: Build
run: cd frontend && NODE_OPTIONS="--max-old-space-size=4096" npm run build run: cd frontend && NODE_OPTIONS="--max-old-space-size=4096" npm run build
- name: Upload build artifact # Build artifact intentionally NOT uploaded. The e2e job below builds
uses: actions/upload-artifact@v3 # its own frontend rather than downloading one from this job, so there
with: # is no need for the cross-job artifact handoff (which previously broke
name: frontend-dist # on actions/upload-artifact@v4 GHES support and forced a v3 pin).
path: frontend/dist # Decoupling also lets e2e start immediately rather than waiting for
retention-days: 1 # this job to finish — important on a multi-runner setup.
e2e: e2e:
needs: [frontend]
runs-on: ubuntu-latest runs-on: ubuntu-latest
services: services:
@@ -136,10 +142,13 @@ jobs:
POSTGRES_USER: postgres POSTGRES_USER: postgres
POSTGRES_PASSWORD: postgres POSTGRES_PASSWORD: postgres
POSTGRES_DB: resolutionflow_test POSTGRES_DB: resolutionflow_test
ports: # No host port mapping. Tests connect to `postgres:5432` (the service
- 5432:5432 # container's docker-network DNS name), not `localhost:5432`. With
# multiple Gitea runners on the same homelab box, host-port mapping
# would race — two backend/e2e jobs both binding 0.0.0.0:5432 → the
# second fails with "port is already allocated".
options: >- options: >-
--health-cmd pg_isready --health-cmd "pg_isready -U postgres"
--health-interval 10s --health-interval 10s
--health-timeout 5s --health-timeout 5s
--health-retries 5 --health-retries 5
@@ -152,6 +161,12 @@ jobs:
PLAYWRIGHT_SECRET_KEY: ci-playwright-secret-key PLAYWRIGHT_SECRET_KEY: ci-playwright-secret-key
PLAYWRIGHT_TEST_EMAIL: teamadmin@resolutionflow.example.com PLAYWRIGHT_TEST_EMAIL: teamadmin@resolutionflow.example.com
PLAYWRIGHT_TEST_PASSWORD: TestPass123! PLAYWRIGHT_TEST_PASSWORD: TestPass123!
# AI-touching endpoints (POST /ai-sessions, /chat, /respond, etc.) are
# gated by `_require_ai_enabled()`, which returns 503 when no provider
# key is set. Tests that exercise those flows stub the AI calls in the
# browser via `page.route`, so the backend never actually contacts
# Anthropic — but the gate still has to pass. A stub value is enough.
ANTHROPIC_API_KEY: ci-stub-key-not-used-by-tests
steps: steps:
- uses: actions/checkout@v4 - uses: actions/checkout@v4
@@ -178,11 +193,13 @@ jobs:
- name: Install frontend dependencies - name: Install frontend dependencies
run: cd frontend && npm ci run: cd frontend && npm ci
- name: Download frontend build - name: Build frontend
uses: actions/download-artifact@v3 # Building inline (instead of downloading an artifact from the
with: # frontend job) drops the cross-job dependency, so e2e can start
name: frontend-dist # immediately on a free runner. Adds ~1-2 min of build time, but
path: frontend/dist # eliminates the artifact-upload mechanism entirely (no more
# v3/v4 GHES headaches) and saves ~5 min of waiting.
run: cd frontend && NODE_OPTIONS="--max-old-space-size=4096" VITE_API_URL="${PLAYWRIGHT_API_ORIGIN}" npm run build
- name: Install Playwright browser - name: Install Playwright browser
run: cd frontend && npx playwright install --with-deps chromium run: cd frontend && npx playwright install --with-deps chromium

View File

@@ -4,6 +4,7 @@
# Testing — pytest-asyncio 0.24+ requires pytest>=8.2 # Testing — pytest-asyncio 0.24+ requires pytest>=8.2
pytest==8.4.2 pytest==8.4.2
pytest-asyncio==0.24.0 pytest-asyncio==0.24.0
pytest-xdist==3.6.1
httpx>=0.27.0 httpx>=0.27.0
pytest-cov==5.0.0 pytest-cov==5.0.0

View File

@@ -35,11 +35,64 @@ settings.REQUIRE_INVITE_CODE = False
# would silently nuke the dev database. Only DATABASE_TEST_URL is honored, # would silently nuke the dev database. Only DATABASE_TEST_URL is honored,
# and the safety assertion below refuses to run against a DB whose name # and the safety assertion below refuses to run against a DB whose name
# doesn't contain "test". # doesn't contain "test".
TEST_DATABASE_URL = os.environ.get( _BASE_TEST_DATABASE_URL = os.environ.get(
"DATABASE_TEST_URL", "DATABASE_TEST_URL",
"postgresql+asyncpg://postgres:postgres@localhost:5432/resolutionflow_test", "postgresql+asyncpg://postgres:postgres@localhost:5432/resolutionflow_test",
) )
def _worker_db_url(base_url: str) -> str:
"""Per-worker DB URL for pytest-xdist parallelization.
pytest-xdist sets PYTEST_XDIST_WORKER to 'gw0', 'gw1', ... per worker
process. Each worker needs its own database so the per-test
`DROP SCHEMA public CASCADE` doesn't race across workers. Master/serial
runs (no xdist) keep the base DB. The base DB is created by the postgres
service container; per-worker DBs are CREATE DATABASE-d on first import
by `_ensure_worker_db_exists` below.
"""
worker = os.environ.get("PYTEST_XDIST_WORKER")
if not worker or worker == "master":
return base_url
head, tail = base_url.rsplit("/", 1)
db_name, _, query = tail.partition("?")
suffix = f"?{query}" if query else ""
return f"{head}/{db_name}_{worker}{suffix}"
def _ensure_worker_db_exists(worker_url: str, base_url: str) -> None:
"""Create the per-worker DB if it doesn't exist. Runs synchronously at
conftest import time (before any async test machinery), using psycopg2
against the postgres maintenance DB. No-op when not running under xdist.
"""
if worker_url == base_url:
return
head, tail = worker_url.rsplit("/", 1)
worker_db = tail.partition("?")[0]
# Strip the +asyncpg dialect for sync psycopg2 + connect to 'postgres'.
sync_head = head.replace("+asyncpg", "")
admin_url = f"{sync_head}/postgres"
# Lazy import — psycopg2 is a transitive backend dep; not imported at
# module top to keep the conftest light when xdist isn't in use.
from sqlalchemy import create_engine
engine = create_engine(admin_url, isolation_level="AUTOCOMMIT")
try:
with engine.begin() as conn:
exists = conn.execute(
sa.text("SELECT 1 FROM pg_database WHERE datname = :n"),
{"n": worker_db},
).scalar()
if not exists:
# Identifier interpolation is safe — worker_db is built from
# the trusted base URL + 'gw\d+' worker suffix.
conn.execute(sa.text(f'CREATE DATABASE "{worker_db}"'))
finally:
engine.dispose()
TEST_DATABASE_URL = _worker_db_url(_BASE_TEST_DATABASE_URL)
_ensure_worker_db_exists(TEST_DATABASE_URL, _BASE_TEST_DATABASE_URL)
# Belt-and-suspenders: refuse to run tests against a DB whose name doesn't # Belt-and-suspenders: refuse to run tests against a DB whose name doesn't
# contain "test". Parses the last path segment of the URL (everything after # contain "test". Parses the last path segment of the URL (everything after
# the final '/', with query string stripped) so credentials / hosts that # the final '/', with query string stripped) so credentials / hosts that

View File

@@ -0,0 +1,111 @@
import { expect, test } from '@playwright/test'
/**
* Regression test for the prefill-handoff `currentChatRef` bug.
*
* Symptom: a chat session created via the dashboard prefill flow
* looked fine on the first AI turn, but submitting partial answers
* from the task lane silently dropped the AI's follow-up response.
* The user saw their answers in the chat, no assistant reply, no
* toast.
*
* Root cause: the prefill effect in `AssistantChatPage` set
* `activeChatId` without also updating `currentChatRef.current`, so
* the `currentChatRef.current !== sentForChatId` guard in
* `handleTaskSubmit` (and `handleSend`) tripped on every subsequent
* request and discarded the AI response.
*
* Strategy: drive the real prefill flow against the real backend, but
* intercept the `/chat` endpoint with `page.route` so we get
* deterministic question payloads on turn 1 and a deterministic
* follow-up on turn 2. The fix is what makes turn 2 visible.
*/
test.describe('AssistantChatPage — prefill handoff regression', () => {
test('AI follow-up renders after submitting partial task lane answers', async ({ page }) => {
let chatCallCount = 0
// Clear any persisted active-chat-id so the page does not auto-resume a
// stale session left behind by a sibling spec.
await page.addInitScript(() => {
try {
sessionStorage.removeItem('rf-active-chat-id')
sessionStorage.removeItem('rf-tasklane-meta')
} catch { /* ignore */ }
})
// Intercept only the chat endpoint. Session creation, listSessions,
// facts, suggested-fixes, etc. all hit the real backend so the page
// renders normally — only the LLM call is deterministic. The pattern
// matches `/ai-sessions/<uuid>/chat` and nothing nested beneath it.
await page.route(/\/api\/v1\/ai-sessions\/[^/]+\/chat$/, async (route) => {
if (route.request().method() !== 'POST') {
await route.fallback()
return
}
chatCallCount += 1
if (chatCallCount === 1) {
await route.fulfill({
status: 200,
contentType: 'application/json',
body: JSON.stringify({
content: 'Initial diagnostic plan. Please answer the questions in the task lane.',
suggested_flows: [],
fork: null,
actions: [],
questions: [
{ text: 'Has the user recently changed their password?' },
{ text: 'Is the lockout happening at a consistent time of day?' },
],
}),
})
return
}
await route.fulfill({
status: 200,
contentType: 'application/json',
body: JSON.stringify({
content: 'Got it — based on your answer, here is what to check next.',
suggested_flows: [],
fork: null,
actions: [],
questions: [],
}),
})
})
// Drive the prefill flow exactly the way the dashboard does. The textarea
// is keyed by its placeholder copy on QuickStartPage.
await page.goto('/')
const prefillBox = page.getByPlaceholder(/Describe the issue/i)
await expect(prefillBox).toBeVisible({ timeout: 10_000 })
await prefillBox.fill('User locked out of AD weekly')
await prefillBox.press('Enter')
// After the prefill submits we land on /pilot and the first stubbed AI
// turn surfaces the task-lane question text.
await expect(page).toHaveURL(/\/pilot/)
await expect(
page.getByText('Has the user recently changed their password?'),
).toBeVisible({ timeout: 15_000 })
// Answer the first question. UI flow: click "Answer" to open the
// textarea, type, click the inline "Answer" button to mark done.
await page.getByRole('button', { name: /^Answer$/ }).first().click()
await page.getByPlaceholder('Type your answer...').fill('No, password is months old')
await page.getByRole('button', { name: /^Answer$/ }).first().click()
// Submit the partial response. Pre-fix: the response was silently dropped
// here because `currentChatRef.current` still held the mount-time value.
await page.getByRole('button', { name: /Send 1 of 2 Responses/ }).click()
// Bug repro: the assistant message must render. Pre-fix this assertion
// fails because `handleTaskSubmit` early-returns at the
// `currentChatRef.current !== sentForChatId` guard.
await expect(
page.getByText('Got it — based on your answer, here is what to check next.'),
).toBeVisible({ timeout: 15_000 })
// Both chat calls must have actually happened.
expect(chatCallCount).toBe(2)
})
})

View File

@@ -88,6 +88,8 @@ test.describe('command palette smoke tests', () => {
await flowpilotOption.click() await flowpilotOption.click()
await expect(page).toHaveURL(/\/assistant/) // Phase 1 of the FlowPilot migration renamed /assistant to /pilot.
// /assistant still 301-redirects to /pilot, so accept either landing URL.
await expect(page).toHaveURL(/\/(pilot|assistant)/)
}) })
}) })

View File

@@ -24,13 +24,21 @@ test.describe('session history smoke tests', () => {
await page.goto('/sessions') await page.goto('/sessions')
await expect( await expect(
page.getByRole('heading', { name: 'Sessions', exact: true }), page.getByRole('heading', { name: 'Session History', exact: true }),
).toBeVisible() ).toBeVisible()
// Default tab on /sessions is "AI Sessions"; flow sessions live behind
// the "Flow Sessions" tab and only that tab exposes ticket/client filters.
await page.getByRole('button', { name: 'Flow Sessions' }).click()
await page.getByPlaceholder('Search by ticket number...').fill(ticketNumber) await page.getByPlaceholder('Search by ticket number...').fill(ticketNumber)
await page.getByPlaceholder('Search by client name...').fill(clientName) await page.getByPlaceholder('Search by client name...').fill(clientName)
const sessionCard = page.locator('.bg-card').filter({ hasText: ticketNumber }).filter({ hasText: clientName }).first() const sessionCard = page
.getByTestId('flow-session-card')
.filter({ hasText: ticketNumber })
.filter({ hasText: clientName })
.first()
await expect(sessionCard).toBeVisible() await expect(sessionCard).toBeVisible()
await expect(sessionCard.getByText(tree.name)).toBeVisible() await expect(sessionCard.getByText(tree.name)).toBeVisible()

View File

@@ -24,7 +24,7 @@ test.describe('flow library start-session smoke tests', () => {
await page.getByPlaceholder('Search flows...').fill(tree.name) await page.getByPlaceholder('Search flows...').fill(tree.name)
await page.getByRole('button', { name: 'Search', exact: true }).click() await page.getByRole('button', { name: 'Search', exact: true }).click()
const treeCard = page.locator('.bg-card').filter({ hasText: tree.name }).first() const treeCard = page.getByTestId('tree-card').filter({ hasText: tree.name }).first()
await expect(treeCard).toBeVisible() await expect(treeCard).toBeVisible()
await treeCard.getByRole('button', { name: /^Start(?: Session)?$/ }).click() await treeCard.getByRole('button', { name: /^Start(?: Session)?$/ }).click()

View File

@@ -20,7 +20,7 @@ test.describe('flow library smoke tests', () => {
await page.getByPlaceholder('Search flows...').fill(tree.name) await page.getByPlaceholder('Search flows...').fill(tree.name)
await page.getByRole('button', { name: 'Search', exact: true }).click() await page.getByRole('button', { name: 'Search', exact: true }).click()
await expect(page.getByText(tree.name)).toBeVisible() await expect(page.getByTestId('tree-card').filter({ hasText: tree.name }).first()).toBeVisible()
} finally { } finally {
await disposeApiContext(api) await disposeApiContext(api)
} }

View File

@@ -14,7 +14,7 @@ test.describe('authenticated navigation smoke tests', () => {
await page.goto('/sessions') await page.goto('/sessions')
await expect( await expect(
page.getByRole('heading', { name: 'Sessions', exact: true }), page.getByRole('heading', { name: 'Session History', exact: true }),
).toBeVisible() ).toBeVisible()
}) })
@@ -30,7 +30,7 @@ test.describe('authenticated navigation smoke tests', () => {
await page.goto('/account') await page.goto('/account')
await expect( await expect(
page.getByRole('heading', { name: 'Account Settings' }), page.getByRole('heading', { name: 'Account Management' }),
).toBeVisible() ).toBeVisible()
}) })
}) })

View File

@@ -18,9 +18,17 @@ test.describe('session resume smoke tests', () => {
}) })
try { try {
await page.goto('/trees') // Resume flow moved off /trees onto the Flow Sessions tab of /sessions
// during the FlowPilot migration. The destination (/trees/:id/navigate)
// is unchanged — only the entry point shifted.
await page.goto('/sessions')
await expect(
page.getByRole('heading', { name: 'Session History', exact: true }),
).toBeVisible()
await page.getByRole('button', { name: 'Flow Sessions' }).click()
// Active sub-tab is the default and surfaces in-progress sessions.
const resumeCard = page.locator('.bg-card').filter({ hasText: tree.name }).filter({ hasText: 'Resume' }).first() const resumeCard = page.getByTestId('flow-session-card').filter({ hasText: tree.name }).first()
await expect(resumeCard).toBeVisible() await expect(resumeCard).toBeVisible()
await resumeCard.getByRole('button', { name: 'Resume' }).first().click() await resumeCard.getByRole('button', { name: 'Resume' }).first().click()

View File

@@ -31,7 +31,7 @@ test.describe('shared session management smoke tests', () => {
).toBeVisible() ).toBeVisible()
await expect(page.getByText(share.share_name || '')).toBeVisible() await expect(page.getByText(share.share_name || '')).toBeVisible()
const shareCard = page.locator('.bg-card').filter({ hasText: share.share_name || '' }).first() const shareCard = page.getByTestId('share-card').filter({ hasText: share.share_name || '' }).first()
await shareCard.getByRole('button', { name: 'Revoke' }).click() await shareCard.getByRole('button', { name: 'Revoke' }).click()
const confirmDialog = page.getByRole('dialog', { name: 'Revoke Share Link' }) const confirmDialog = page.getByRole('dialog', { name: 'Revoke Share Link' })

View File

@@ -34,6 +34,8 @@ export function TreeGridView({
{trees.map((tree) => ( {trees.map((tree) => (
<div <div
key={tree.id} key={tree.id}
data-testid="tree-card"
data-tree-id={tree.id}
className="relative bg-card border border-border rounded-2xl p-4 transition-all hover:-translate-y-0.5 hover:border-primary/30 hover:shadow-md sm:p-6" className="relative bg-card border border-border rounded-2xl p-4 transition-all hover:-translate-y-0.5 hover:border-primary/30 hover:shadow-md sm:p-6"
> >
<div className="mb-2 flex items-start justify-between gap-2"> <div className="mb-2 flex items-start justify-between gap-2">

View File

@@ -33,6 +33,8 @@ export function TreeListView({
{trees.map((tree) => ( {trees.map((tree) => (
<div <div
key={tree.id} key={tree.id}
data-testid="tree-card"
data-tree-id={tree.id}
className="flex items-center gap-4 bg-card border border-border rounded-2xl p-4 transition-all hover:border-primary/30 hover:shadow-xs" className="flex items-center gap-4 bg-card border border-border rounded-2xl p-4 transition-all hover:border-primary/30 hover:shadow-xs"
> >
{/* Left: Name and Description */} {/* Left: Name and Description */}

View File

@@ -255,6 +255,12 @@ export default function AssistantChatPage() {
} }
setChats(prev => [chatItem, ...prev]) setChats(prev => [chatItem, ...prev])
setActiveChatId(session.session_id) setActiveChatId(session.session_id)
// Keep the in-flight guard ref in sync. Without this, currentChatRef
// stays at its mount-time value (often a stale id from sessionStorage
// or null), so subsequent handleSend / handleTaskSubmit calls bail at
// their `currentChatRef.current !== sentForChatId` check and the AI
// response is silently dropped.
currentChatRef.current = session.session_id
setMessages([{ role: 'user', content: prefill }]) setMessages([{ role: 'user', content: prefill }])
setLoading(true) setLoading(true)

View File

@@ -161,7 +161,12 @@ export default function MySharesPage() {
const isCopied = copiedId === share.id const isCopied = copiedId === share.id
return ( return (
<div key={share.id} className="bg-card border border-border rounded-xl p-5"> <div
key={share.id}
data-testid="share-card"
data-share-id={share.id}
className="bg-card border border-border rounded-xl p-5"
>
{/* Top row: badge + name */} {/* Top row: badge + name */}
<div className="flex items-center gap-3 mb-3"> <div className="flex items-center gap-3 mb-3">
<span className="inline-flex items-center gap-1.5 text-xs rounded-full px-2 py-0.5 bg-accent text-muted-foreground"> <span className="inline-flex items-center gap-1.5 text-xs rounded-full px-2 py-0.5 bg-accent text-muted-foreground">

View File

@@ -533,7 +533,11 @@ export default function SessionHistoryPage() {
)} )}
style={{ '--stagger-index': i } as React.CSSProperties} style={{ '--stagger-index': i } as React.CSSProperties}
> >
<div className="bg-card border border-border rounded-xl p-4 transition-all hover:border-[var(--color-border-hover)]"> <div
data-testid="flow-session-card"
data-session-id={session.id}
className="bg-card border border-border rounded-xl p-4 transition-all hover:border-[var(--color-border-hover)]"
>
<div className="flex flex-col gap-3 sm:flex-row sm:items-start sm:justify-between"> <div className="flex flex-col gap-3 sm:flex-row sm:items-start sm:justify-between">
<div className="flex-1"> <div className="flex-1">
<div className="flex flex-wrap items-center gap-2"> <div className="flex flex-wrap items-center gap-2">