Compare commits: 69f2a37591 ... fix/ci-pyt
1 commit: ca45bc9bb3
@@ -1,22 +1,22 @@
|
||||
# CURRENT_TASK.md
|
||||
|
||||
**Task:** Land two stacked CI PRs and lock the backend gate on `main`.
|
||||
**Task:** Restore a fully green CI gate on `main` and lock it via branch protection so future merges can't introduce silent rot.
|
||||
|
||||
**Status:** in-progress
|
||||
|
||||
**Definition of Done:**
|
||||
- [ ] PR #150 (`fix/ci-workflow-config`) merged. Both `CI / backend (pull_request)` and `CI / frontend (pull_request)` show success on the merge commit.
|
||||
- [ ] PR #151 (`fix/ci-pytest-xdist`) merged. Backend CI on the merge commit completes in <6 min (was ~22 min serial).
|
||||
- [ ] `CI / backend (pull_request)` added to required status checks on `main` in Gitea branch protection (frontend is already required).
|
||||
- [ ] Optional: `CI / e2e (pull_request)` confirmed clean and added to required checks.
|
||||
- [ ] The 54 real backend test failures (left after #149's infra cleanup) categorized and fixed in a follow-up PR. Target: 0 failures, 0 errors on a `pytest` run inside `resolutionflow_backend`.
|
||||
- [ ] `npm run lint` stays at 0 errors after the cleanup PR (already at 0 on main).
|
||||
- [ ] Append a SESSION_LOG.md entry summarizing what shipped.
|
||||
|
||||
**Assumptions:**
|
||||
- The 8-core homelab Gitea Actions runner can support `-n auto` (8 xdist workers). If memory pressure shows up in CI, drop to `-n 4`.
|
||||
- pytest-cov's xdist support continues to handle the coverage merge and `--cov-fail-under=50` check correctly.
|
||||
- The per-worker DB creation in `conftest.py` is idempotent and racing workers on first import won't all try to CREATE DATABASE simultaneously — postgres serializes that, but if it surfaces issues, wrap with an advisory lock.
|
||||
- The 54 failures fall into a small number of root-cause categories (likely 3–5: fixture-scoping leaks, DB cleanup ordering, account_id propagation in test seed paths). Verify before assuming.
|
||||
- The pytest-asyncio 0.24 + pytest 8.4 toolchain bumped in #149 is the right baseline; do not revert.
|
||||
- `DATABASE_TEST_URL` is the only DB URL conftest will honor; do not weaken the safety guard added in `dab740d`.
|
||||
|
||||
**Out of scope:**
|
||||
- Frontend lint warnings (23 remain after #149).
|
||||
- The 23 react-hooks/exhaustive-deps warnings.
|
||||
- RLS test suite (gated behind `RUN_RLS_TESTS=1`; not in default CI).
|
||||
- Per-test transactional rollback (would shave another 30-40% off backend time but is a much bigger refactor — capture in TODO if interested).
|
||||
- New feature work on FlowPilot (Phase 10+) or PSA — keep this branch focused on CI debt.
|
||||
- Frontend lint warnings (23 remain after #149; they're missing-deps in useEffect, opt-in cleanup later).
|
||||
- RLS test suite (`test_rls_isolation.py`) — gated behind `RUN_RLS_TESTS=1` and not in the default CI run.
|
||||
|
||||
109
.ai/HANDOFF.md
@@ -2,93 +2,62 @@
|
||||
|
||||
# HANDOFF.md
|
||||
|
||||
**Last updated:** 2026-04-25 (America/New_York)
|
||||
**Last updated:** 2026-04-25 06:12 EDT
|
||||
|
||||
**Active task:** Land three open CI PRs (#150 + #151 + #152), then enable backend + e2e gates on `main`. See [CURRENT_TASK.md](CURRENT_TASK.md).
|
||||
**Active task:** Restore green CI gate on `main` and lock it via branch protection. See [CURRENT_TASK.md](CURRENT_TASK.md).
|
||||
|
||||
**Branches:** Three open PRs, all independent of each other for correctness:
|
||||
- `fix/ci-workflow-config` → PR #150
|
||||
- `fix/ci-pytest-xdist` → PR #151 (stacked on #150 for context but mergeable on its own)
|
||||
- `fix/e2e-test-selectors` → PR #152
|
||||
**Branch:** `fix/ci-workflow-config`
|
||||
|
||||
**Runner setup:** Three Gitea Actions agents are now registered on the homelab box, so `backend` / `frontend` / `e2e` jobs run truly in parallel instead of serializing on a single agent. Combined with PR #151's xdist parallelization, the previous 1h 14m wall-clock should drop to ~6–10 min.
|
||||
## Current state
|
||||
|
||||
## Three open PRs
|
||||
Previous session fixed the 54 real backend failures left after #149. The default backend suite is now green locally:
|
||||
|
||||
### PR #150 — `fix/ci-workflow-config` → main
|
||||
```bash
docker exec resolutionflow_backend bash -lc 'pytest --override-ini="addopts=" -q > /tmp/full-backend.log 2>&1; code=$?; tail -n 160 /tmp/full-backend.log; exit $code'
# 1076 passed, 35 deselected in 1347.41s (0:22:27)
```
|
||||
|
||||
Carries:
|
||||
- The Codex commit (`49f8856 wip(handoff): restore backend suite to green`) — fixes 54 backend test failures.
|
||||
- Workflow fixes: `DATABASE_TEST_URL` env, `actions/upload-artifact` v3 pin.
|
||||
- Most-recent commit (`e976fb4`):
|
||||
- Mocks `_extract_template_parameters` in `test_record_decision_persists_and_bumps_state_version` (last test failing on CI; needed an AI provider key the runner doesn't have). Verified locally — passes.
|
||||
- pip + npm caches in all three jobs.
|
||||
- Drops `--cov-report=term-missing` (the custom "Display coverage summary" step prints the same info from JSON).
|
||||
- Adds `--maxfail=10` so structural breakage fails fast.
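The mocking approach used for `test_record_decision_persists_and_bumps_state_version` can be sketched roughly as below. This is a minimal illustration, not the real test: the `TemplateExtractionService` class body, its `draft_template` method, and the returned parameter dict are hypothetical stand-ins, and only the idea of replacing `_extract_template_parameters` with an `AsyncMock` comes from the notes above.

```python
import asyncio
from unittest.mock import AsyncMock, patch


class TemplateExtractionService:
    """Hypothetical stand-in; the real service lives in the backend package."""

    async def _extract_template_parameters(self, text: str) -> dict:
        # In this sketch the unmocked path fails, like a runner with no AI key.
        raise RuntimeError("would call an AI provider; no API key on the runner")

    async def draft_template(self, text: str) -> dict:
        params = await self._extract_template_parameters(text)
        return {"source": text, "parameters": params}


async def run_with_mock() -> dict:
    # Replace the AI-dependent coroutine with a canned, awaitable result.
    fake = AsyncMock(return_value={"hostname": "string"})
    with patch.object(TemplateExtractionService, "_extract_template_parameters", fake):
        result = await TemplateExtractionService().draft_template("restart {hostname}")
    fake.assert_awaited_once()  # the code path still reached the extraction step
    return result


print(asyncio.run(run_with_mock()))
```

Patching at the method level keeps the rest of the decision path under test while removing the dependency on provider credentials.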
|
||||
Targeted validation also passed:
|
||||
|
||||
**Expected CI on this PR:** all three jobs green for the first time in months.
|
||||
- `tests/test_session_resolutions_api.py tests/test_session_sharing.py tests/test_session_suggested_fixes_api.py tests/test_survey.py tests/test_tenant_isolation_p0.py tests/test_tree_sharing.py tests/test_trees.py::TestTrees::test_delete_tree_cleans_up_folder_and_tag_assignments tests/test_uploads.py::test_delete_upload_forbidden_for_non_owner` → `73 passed`
|
||||
- PDF export tests → `3 passed`
|
||||
- Prompt/PSA/resolution/script-builder subset → `14 passed`
|
||||
- Admin/AI/branch subsets → `11 passed`
|
||||
|
||||
### PR #151 — `fix/ci-pytest-xdist` → main (stacked on #150)
|
||||
## What changed
|
||||
|
||||
Carries (on top of #150):
|
||||
- `pytest-xdist==3.6.1` in `requirements-dev.txt`.
|
||||
- `conftest.py` adds `_worker_db_url` + `_ensure_worker_db_exists`. Each xdist worker gets its own DB (`resolutionflow_test_gw0`, `gw1`, …) so the per-test `DROP SCHEMA public CASCADE` doesn't race across workers.
|
||||
- Workflow's pytest invocation gains `-n auto`.
|
||||
Production fixes:
|
||||
|
||||
**Measured locally:** backend suite goes from `22m 27s` (serial, 1076 passed) → `4m 28s` (8 workers, 1076 passed). Same exit code, same test count.
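The per-worker database naming can be sketched as follows. This is an assumption-laden illustration rather than the actual `conftest.py` code: the function name `worker_db_url` and the `postgresql+asyncpg` base URL are hypothetical, while `PYTEST_XDIST_WORKER` is the environment variable pytest-xdist sets in each worker process (`gw0`, `gw1`, …).

```python
import os
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def worker_db_url(base_url: str, worker_id: Optional[str]) -> str:
    """Append the xdist worker id to the database name in base_url."""
    # No worker id means a serial run: keep the shared test database.
    if not worker_id or worker_id == "master":
        return base_url
    parts = urlsplit(base_url)
    # The database name is the URL path, so suffix it with the worker id.
    return urlunsplit(parts._replace(path=f"{parts.path}_{worker_id}"))


# Hypothetical base URL; credentials and host mirror the CI postgres service.
base = "postgresql+asyncpg://postgres:postgres@postgres:5432/resolutionflow_test"
print(worker_db_url(base, os.environ.get("PYTEST_XDIST_WORKER")))
```

With the suffix applied, each worker's `DROP SCHEMA public CASCADE` touches only its own database, which is what removes the race.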
|
||||
- CI/backend dev image now installs WeasyPrint system libraries.
|
||||
- Public share-token and survey routes are mounted outside tenant auth; protected share management remains tenant-protected.
|
||||
- Folder creation now persists `UserFolder.account_id`.
|
||||
- Script Builder save-to-library now persists `ScriptTemplate.account_id`.
|
||||
- Resolution output generation eager-loads `AISession.steps` to avoid async lazy-load `MissingGreenlet`.
|
||||
- AI session model now declares the generated `search_vector` column already present in Alembic, so `create_all` test schemas match runtime migrations.
|
||||
- Direct account-role update now rejects `"owner"`; ownership changes must use the transfer path.
|
||||
- Assistant prompt marker examples no longer include a literal executable `create_spin_off_ticket` payload.
|
||||
|
||||
### PR #152 — `fix/e2e-test-selectors` → main
|
||||
Test/harness fixes:
|
||||
|
||||
Carries: five Playwright e2e selector updates against the current UI. The drift was inherited from the FlowPilot/PSA migration:
|
||||
|
||||
- `Sessions` → `Session History` (page heading)
|
||||
- `Account Settings` → `Account Management` (page heading)
|
||||
- `/assistant` → `/pilot` (Phase 1 route rename; redirect still works)
|
||||
- Flow-session filtering and the Resume button moved behind the "Flow Sessions" tab on `/sessions` (default tab is "AI Sessions")
|
||||
- `resume.spec.ts` no longer starts at `/trees` — Resume button rendering moved to the session card on `/sessions`
|
||||
|
||||
No product-code changes. Pure test updates.
|
||||
- Test seeds updated for tenant-scoped `account_id` columns on sessions, branches, resolution outputs, script templates, PSA connections, folders, schedules, and categories.
|
||||
- Tests aligned with 404-not-403 resource-hiding policy.
|
||||
- Disabled-AI tests now restore both Anthropic and Google key settings.
|
||||
- Pytest harness closes pytest-asyncio's leftover clean loop and ignores known unclosed asyncio/asyncpg teardown ResourceWarnings that otherwise appear at arbitrary later setup points under `filterwarnings = error`.
|
||||
|
||||
## Immediate next steps
|
||||
|
||||
1. **Merge PR #152 first.** Smallest, lowest risk, no shared file with the other two PRs.
|
||||
2. **Merge PR #150 next.** Backend test suite should be fully green (1076 passed, 0 failed, 0 errors).
|
||||
3. **Merge PR #151 last.** Backend job time drops to ~4–6 min on the runner.
|
||||
4. **Enable backend gate** on `main` branch protection — append `"CI / backend (pull_request)"` to `status_check_contexts`:
|
||||
1. Commit the current working tree, if it is not already committed, with the trailer:
|
||||
`Co-Authored-By: Codex <noreply@openai.com>`.
|
||||
2. Check PR #150 status on Gitea. If both `CI / backend (pull_request)` and `CI / frontend (pull_request)` are green, merge it.
|
||||
3. After #150 merges, add `CI / backend (pull_request)` to required status checks on main:
|
||||
```bash
curl -X PATCH -H "Authorization: token $GITEA_TOKEN" \
  "https://gitea.resolutionflow.com/api/v1/repos/chihlasm/resolutionflow/branch_protections/main" \
  -H "Content-Type: application/json" \
  -d '{"status_check_contexts": ["CI / frontend (pull_request)", "CI / backend (pull_request)"]}'
PATCH /repos/chihlasm/resolutionflow/branch_protections/main
{ "status_check_contexts": ["CI / frontend (pull_request)", "CI / backend (pull_request)"] }
```
|
||||
5. **Then enable `CI / e2e (pull_request)`** — same PATCH, append to the list. Verify e2e is reliably green for at least one PR run before locking it in.
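Appending a context to the required checks could be scripted along these lines. This is a sketch only, assuming the PATCH replaces `status_check_contexts` with whatever list is sent; the helper names `with_context` and `build_patch` are hypothetical, and the request is built but deliberately not sent.

```python
import json
import urllib.request

API = "https://gitea.resolutionflow.com/api/v1/repos/chihlasm/resolutionflow/branch_protections/main"


def with_context(current: list[str], new_context: str) -> list[str]:
    """Return the context list with new_context appended, without duplicates."""
    return current if new_context in current else [*current, new_context]


def build_patch(contexts: list[str], token: str) -> urllib.request.Request:
    """Build (but do not send) the PATCH request carrying the full context list."""
    body = json.dumps({"status_check_contexts": contexts}).encode()
    return urllib.request.Request(
        API,
        data=body,
        method="PATCH",
        headers={"Authorization": f"token {token}", "Content-Type": "application/json"},
    )


contexts = ["CI / frontend (pull_request)", "CI / backend (pull_request)"]
contexts = with_context(contexts, "CI / e2e (pull_request)")
req = build_patch(contexts, token="<GITEA_TOKEN>")  # placeholder, not a real token
print(req.get_method(), json.loads(req.data))
# To apply it for real: urllib.request.urlopen(req)
```

Sending the whole list each time keeps the call idempotent, so re-running it after the e2e gate is enabled changes nothing.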
|
||||
|
||||
## Uncommitted state
|
||||
|
||||
Working tree clean (after this handoff commit).
|
||||
|
||||
## Branch protection on main (current)
|
||||
|
||||
- PR-only merges
|
||||
- `CI / frontend (pull_request)` required
|
||||
- Force-push blocked
|
||||
- No review required (solo)
|
||||
|
||||
## Recently merged on main
|
||||
|
||||
- `f27f671` — PR #149: fix(ci): frontend lint to zero errors + test-DB schema fix + dev-deps installable
|
||||
- `06593a4` — PR #148: fix(tests): repair two pre-existing bugs blocking backend CI
|
||||
- `32fae2c` — PR #147: feat: FlowPilot migration — Phase 1-9 + Phase 9 bug fixes + QA fixture harness
|
||||
- `16060d2` — PR #141: feat: PSA ticket management
|
||||
`$GITEA_TOKEN` is in `.claude/settings.local.json`.
|
||||
4. Run/confirm frontend lint if needed for the final DoD item (`npm run lint` was already green after #149, but this session did not rerun it).
|
||||
|
||||
## Open questions
|
||||
|
||||
- One known concern with `--maxfail=10`: if a single bad commit produces 11+ legitimate failures, CI bails before reporting them all. Acceptable trade-off — the alternative is burning 25 min on a structural break.
|
||||
- pytest-xdist load distribution is the default file-scoped balance. If one worker consistently gets the slow tests, switch to `--dist worksteal` (xdist 3.x). Not worth tuning preemptively.
|
||||
|
||||
## Useful breadcrumbs
|
||||
|
||||
- `backend/scripts/seed_phase9_qa_fixtures.py` pre-bakes Phase 9 QA fixtures.
|
||||
- `.gstack/qa-reports/phase9-20260424-232700/REPORT.md` — full QA report from the FlowPilot session.
|
||||
- gstack is in team mode for this repo. `/browse` Chromium needs `CONTAINER=1` env (see `~/.claude/skills/gstack/browse/src/browser-manager.ts:188`).
|
||||
- Per-worker test DBs accumulate on the postgres service. Cheap to leave around; clean up if it ever bothers anyone.
|
||||
- PR #150 was not rechecked or merged in this session.
|
||||
- Branch protection was not updated in this session.
|
||||
|
||||
@@ -12,16 +12,6 @@
|
||||
|
||||
---
|
||||
|
||||
## 2026-04-25 12:00 America/New_York — Claude Code — Mock final AI-provider test, cache CI deps, parallelize backend with pytest-xdist
|
||||
|
||||
- Diagnosed why CI was still red despite Codex's local 1076 passed: a single test (`test_record_decision_persists_and_bumps_state_version`) needed `ANTHROPIC_API_KEY` because the `decision: draft_template` path calls `TemplateExtractionService` → AI provider. Patched `_extract_template_parameters` with an `AsyncMock` so the test no longer depends on AI availability. Verified.
|
||||
- Pushed Codex's WIP commit `49f8856` to PR #150 (had been local-only per handoff protocol).
|
||||
- PR #150 (`fix/ci-workflow-config`) extended with cheap CI wins: `actions/cache@v3` for pip + npm in all three jobs; dropped `--cov-report=term-missing` (the custom display step parses JSON); added `--maxfail=10` so structural breakage exits fast.
|
||||
- PR #151 (`fix/ci-pytest-xdist`) opened, stacked on #150: pytest-xdist with per-worker DB isolation. `conftest.py` reads `PYTEST_XDIST_WORKER`, computes a per-worker DB URL like `…_gw0`, and synchronously CREATEs the DB on first import. The per-test `DROP SCHEMA public CASCADE` then operates on the worker's isolated DB. Verified locally: backend suite went from 22m 27s serial → 4m 28s parallel (8 workers), 1076 passed in both cases. ~5× speedup.
|
||||
- Decided NOT to do per-test transactional rollback (bigger refactor); captured for future TODO consideration.
|
||||
- Left for next session: watch CI on both PRs, merge in order (#150 first, #151 second), then enable `CI / backend (pull_request)` as a required status check on main.
|
||||
- Files touched: `backend/tests/test_session_suggested_fixes_api.py`, `backend/tests/conftest.py`, `backend/requirements-dev.txt`, `.gitea/workflows/ci.yml`, `.ai/HANDOFF.md`, `.ai/CURRENT_TASK.md`, `.ai/TODO.md`.
|
||||
|
||||
## 2026-04-25 06:12 EDT — Codex — Fix backend suite to green
|
||||
|
||||
- Fixed the real backend failures left after the CI-infra cleanup: tenant-scoped seed drift, missing production `account_id` writes, public route mounting for survey/share links, Script Builder library saves, resolution output async loading, AI search schema metadata, disabled-AI fixture leakage, and prompt marker guardrails.
|
||||
|
||||
@@ -5,12 +5,9 @@
|
||||
|
||||
## Up next
|
||||
|
||||
- [ ] **Parallelize backend pytest with pytest-xdist.** ✅ landing as PR #151. Verified locally: backend suite 22 min → 4m 28s with `-n auto` on the 8-core homelab runner. Per-worker DB isolation via `PYTEST_XDIST_WORKER` in conftest.py.
|
||||
- [ ] **Parallelize backend pytest with pytest-xdist.** Currently the backend suite takes ~22 min wall-clock for `1076 passed, 35 deselected` (verified locally 2026-04-25). With `-n auto` on the homelab Gitea Actions runner, this should land in the 3–6 min range depending on core count. Blocker: `test_db` fixture in `backend/tests/conftest.py` does `DROP SCHEMA public CASCADE` per test, which two workers would race on. Standard fix: one database per worker, derived from `PYTEST_XDIST_WORKER` env var inside conftest. The runner has spare CPU, so prioritize once main is green and the 54-failure cleanup has landed.
|
||||
|
||||
## Backlog
|
||||
|
||||
- [ ] **Frontend lint warnings cleanup.** 23 `react-hooks/exhaustive-deps` warnings remain after PR #149 (mostly missing-deps in useEffect). Either fix them or audit them for known-safe ones and add eslint-disable comments. Not blocking CI today.
|
||||
- [ ] **Audit `filterwarnings` ignores added in `wip(handoff): restore backend suite to green`.** Codex added narrow `ResourceWarning` filters for unclosed socket/transport/event-loop noise from pytest-asyncio teardown. Worth periodically reviewing whether those are still needed (e.g. when bumping pytest-asyncio) — if a real warning appears in those forms it would be silenced.
|
||||
- [ ] **Add `data-testid` attributes to e2e-critical interactive elements.** PR #152 fixed five Playwright tests by chasing UI-text changes (`Sessions` → `Session History`, `Account Settings` → `Account Management`, `/assistant` → `/pilot`, "Flow Sessions" tab, Resume button on session cards). Each was a one-line selector update, but every UI churn re-breaks them. Adding stable `data-testid` attributes on the targeted elements (page heading wrappers, tab nav, primary action buttons) and switching tests to `getByTestId` would make these immune to copy/route renames. Scope it small — start with `SessionHistoryPage` heading, the AI/Flow Sessions tab buttons, the per-session `Resume` button, and the command-palette FlowPilot option.
|
||||
- [ ] **Per-test transactional rollback in `test_db` fixture.** Bigger engineering than xdist (which we already shipped). Instead of `DROP SCHEMA public CASCADE` per test, wrap each test in a savepoint and rollback at teardown. ~30-40% additional speedup on top of xdist for test-DB-heavy tests. Real refactor; only worth it if the suite gets significantly larger or runs more frequently.
|
||||
- [ ] **Consider `pytest-testmon` for PR-time test selection.** Tracks which tests touched which source files and only re-runs affected ones. Best for small PRs touching ~few files. Adds cache-invalidation complexity; only worth it if the suite stays painfully long even after xdist.
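The savepoint idea behind the per-test transactional rollback item can be illustrated with a small, self-contained sketch. It uses the standard-library `sqlite3` module as a stand-in for the real Postgres test database, and the `rolled_back` helper and `sessions` table are hypothetical; a real `test_db` fixture would do the equivalent on the async SQLAlchemy connection.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def rolled_back(conn: sqlite3.Connection):
    """Run a block inside a savepoint and discard its writes afterwards."""
    conn.execute("SAVEPOINT per_test")
    try:
        yield conn
    finally:
        # Undo everything since the savepoint, then drop the savepoint itself.
        conn.execute("ROLLBACK TO SAVEPOINT per_test")
        conn.execute("RELEASE SAVEPOINT per_test")


# isolation_level=None puts sqlite3 in autocommit mode so we control transactions.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE sessions (id INTEGER PRIMARY KEY, title TEXT)")

with rolled_back(conn) as db:
    db.execute("INSERT INTO sessions (title) VALUES ('written by a test')")
    inside = db.execute("SELECT COUNT(*) FROM sessions").fetchone()[0]

after = conn.execute("SELECT COUNT(*) FROM sessions").fetchone()[0]
print(inside, after)  # the row is visible inside the block and gone afterwards
```

The schema is created once and survives across tests; only the rows written inside the savepoint are thrown away, which is where the time saving over `DROP SCHEMA public CASCADE` would come from.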
|
||||
|
||||
@@ -17,11 +17,8 @@ jobs:
|
||||
POSTGRES_USER: postgres
|
||||
POSTGRES_PASSWORD: postgres
|
||||
POSTGRES_DB: resolutionflow_test
|
||||
# No host port mapping. Tests connect to `postgres:5432` (the service
|
||||
# container's docker-network DNS name), not `localhost:5432`. With
|
||||
# multiple Gitea runners on the same homelab box, host-port mapping
|
||||
# would race — two backend/e2e jobs both binding 0.0.0.0:5432 → the
|
||||
# second fails with "port is already allocated".
|
||||
ports:
|
||||
- 5432:5432
|
||||
options: >-
|
||||
--health-cmd pg_isready
|
||||
--health-interval 10s
|
||||
@@ -143,11 +140,8 @@ jobs:
|
||||
POSTGRES_USER: postgres
|
||||
POSTGRES_PASSWORD: postgres
|
||||
POSTGRES_DB: resolutionflow_test
|
||||
# No host port mapping. Tests connect to `postgres:5432` (the service
|
||||
# container's docker-network DNS name), not `localhost:5432`. With
|
||||
# multiple Gitea runners on the same homelab box, host-port mapping
|
||||
# would race — two backend/e2e jobs both binding 0.0.0.0:5432 → the
|
||||
# second fails with "port is already allocated".
|
||||
ports:
|
||||
- 5432:5432
|
||||
options: >-
|
||||
--health-cmd pg_isready
|
||||
--health-interval 10s
|
||||
|
||||
@@ -88,8 +88,6 @@ test.describe('command palette smoke tests', () => {
|
||||
|
||||
await flowpilotOption.click()
|
||||
|
||||
// Phase 1 of the FlowPilot migration renamed /assistant to /pilot.
|
||||
// /assistant still 301-redirects to /pilot, so accept either landing URL.
|
||||
await expect(page).toHaveURL(/\/(pilot|assistant)/)
|
||||
await expect(page).toHaveURL(/\/assistant/)
|
||||
})
|
||||
})
|
||||
|
||||
@@ -24,13 +24,9 @@ test.describe('session history smoke tests', () => {
|
||||
await page.goto('/sessions')
|
||||
|
||||
await expect(
|
||||
page.getByRole('heading', { name: 'Session History', exact: true }),
|
||||
page.getByRole('heading', { name: 'Sessions', exact: true }),
|
||||
).toBeVisible()
|
||||
|
||||
// Default tab on /sessions is "AI Sessions"; flow sessions live behind
|
||||
// the "Flow Sessions" tab and only that tab exposes ticket/client filters.
|
||||
await page.getByRole('button', { name: 'Flow Sessions' }).click()
|
||||
|
||||
await page.getByPlaceholder('Search by ticket number...').fill(ticketNumber)
|
||||
await page.getByPlaceholder('Search by client name...').fill(clientName)
|
||||
|
||||
|
||||
@@ -14,7 +14,7 @@ test.describe('authenticated navigation smoke tests', () => {
|
||||
await page.goto('/sessions')
|
||||
|
||||
await expect(
|
||||
page.getByRole('heading', { name: 'Session History', exact: true }),
|
||||
page.getByRole('heading', { name: 'Sessions', exact: true }),
|
||||
).toBeVisible()
|
||||
})
|
||||
|
||||
@@ -30,7 +30,7 @@ test.describe('authenticated navigation smoke tests', () => {
|
||||
await page.goto('/account')
|
||||
|
||||
await expect(
|
||||
page.getByRole('heading', { name: 'Account Management' }),
|
||||
page.getByRole('heading', { name: 'Account Settings' }),
|
||||
).toBeVisible()
|
||||
})
|
||||
})
|
||||
|
||||
@@ -18,17 +18,9 @@ test.describe('session resume smoke tests', () => {
|
||||
})
|
||||
|
||||
try {
|
||||
// Resume flow moved off /trees onto the Flow Sessions tab of /sessions
|
||||
// during the FlowPilot migration. The destination (/trees/:id/navigate)
|
||||
// is unchanged — only the entry point shifted.
|
||||
await page.goto('/sessions')
|
||||
await expect(
|
||||
page.getByRole('heading', { name: 'Session History', exact: true }),
|
||||
).toBeVisible()
|
||||
await page.getByRole('button', { name: 'Flow Sessions' }).click()
|
||||
// Active sub-tab is the default and surfaces in-progress sessions.
|
||||
await page.goto('/trees')
|
||||
|
||||
const resumeCard = page.locator('.bg-card').filter({ hasText: tree.name }).first()
|
||||
const resumeCard = page.locator('.bg-card').filter({ hasText: tree.name }).filter({ hasText: 'Resume' }).first()
|
||||
await expect(resumeCard).toBeVisible()
|
||||
await resumeCard.getByRole('button', { name: 'Resume' }).first().click()
|
||||
|
||||
|
||||