diff --git a/.ai/CURRENT_TASK.md b/.ai/CURRENT_TASK.md
index 383fff86..0109ea2d 100644
--- a/.ai/CURRENT_TASK.md
+++ b/.ai/CURRENT_TASK.md
@@ -1,22 +1,22 @@
 # CURRENT_TASK.md
 
-**Task:** Restore a fully green CI gate on `main` and lock it via branch protection so future merges can't introduce silent rot.
+**Task:** Land two stacked CI PRs and lock the backend gate on `main`.
 
 **Status:** in-progress
 
 **Definition of Done:**
 - [ ] PR #150 (`fix/ci-workflow-config`) merged. Both `CI / backend (pull_request)` and `CI / frontend (pull_request)` show success on the merge commit.
+- [ ] PR #151 (`fix/ci-pytest-xdist`) merged. Backend CI on the merge commit completes in <6 min (was ~22 min serial).
 - [ ] `CI / backend (pull_request)` added to required status checks on `main` in Gitea branch protection (frontend is already required).
-- [ ] The 54 real backend test failures (left after #149's infra cleanup) categorized and fixed in a follow-up PR. Target: 0 failures, 0 errors on a `pytest` run inside `resolutionflow_backend`.
-- [ ] `npm run lint` stays at 0 errors after the cleanup PR (already at 0 on main).
-- [ ] Append a SESSION_LOG.md entry summarizing what shipped.
+- [ ] Optional: `CI / e2e (pull_request)` confirmed clean and added to required checks.
 
 **Assumptions:**
-- The 54 failures fall into a small number of root-cause categories (likely 3–5: fixture-scoping leaks, DB cleanup ordering, account_id propagation in test seed paths). Verify before assuming.
-- The pytest-asyncio 0.24 + pytest 8.4 toolchain bumped in #149 is the right baseline; do not revert.
-- `DATABASE_TEST_URL` is the only DB URL conftest will honor; do not weaken the safety guard added in `dab740d`.
+- The 8-core homelab Gitea Actions runner can support `-n auto` (8 xdist workers). If memory pressure shows up in CI, drop to `-n 4`.
+- pytest-cov's xdist support continues to handle the coverage merge and `--cov-fail-under=50` check correctly.
+- The per-worker DB creation in `conftest.py` is idempotent and racing workers on first import won't all try to CREATE DATABASE simultaneously — postgres serializes that, but if it surfaces issues, wrap with an advisory lock.
 
 **Out of scope:**
-- New feature work on FlowPilot (Phase 10+) or PSA — keep this branch focused on CI debt.
-- Frontend lint warnings (23 remain after #149; they're missing-deps in useEffect, opt-in cleanup later).
-- RLS test suite (`test_rls_isolation.py`) — gated behind `RUN_RLS_TESTS=1` and not in the default CI run.
+- Frontend lint warnings (23 remain after #149).
+- The 23 react-hooks/exhaustive-deps warnings.
+- RLS test suite (gated behind `RUN_RLS_TESTS=1`; not in default CI).
+- Per-test transactional rollback (would shave another 30-40% off backend time but is a much bigger refactor — capture in TODO if interested).
diff --git a/.ai/HANDOFF.md b/.ai/HANDOFF.md
index 0b896b7e..771c40b2 100644
--- a/.ai/HANDOFF.md
+++ b/.ai/HANDOFF.md
@@ -2,62 +2,75 @@
 
 # HANDOFF.md
 
-**Last updated:** 2026-04-25 06:12 EDT
+**Last updated:** 2026-04-25 (America/New_York)
 
-**Active task:** Restore green CI gate on `main` and lock it via branch protection. See [CURRENT_TASK.md](CURRENT_TASK.md).
+**Active task:** Land two stacked CI PRs (#150 + #151), then enable backend gate on `main`. See [CURRENT_TASK.md](CURRENT_TASK.md).
 
-**Branch:** `fix/ci-workflow-config`
+**Branch:** Currently on `fix/ci-workflow-config` (PR #150). The xdist work lives on `fix/ci-pytest-xdist` (PR #151), branched from #150.
 
-## Current state
+## Two open PRs to land in order
 
-Previous session fixed the 54 real backend failures left after #149. The default backend suite is now green locally:
+### PR #150 — `fix/ci-workflow-config` → main
 
-```bash
-docker exec resolutionflow_backend bash -lc 'pytest --override-ini="addopts=" -q > /tmp/full-backend.log 2>&1; code=$?; tail -n 160 /tmp/full-backend.log; exit $code'
-# 1076 passed, 35 deselected in 1347.41s (0:22:27)
-```
+Carries:
+- The Codex commit (`49f8856 wip(handoff): restore backend suite to green`) — fixes 54 backend test failures.
+- Workflow fixes: `DATABASE_TEST_URL` env, `actions/upload-artifact` v3 pin.
+- Most-recent commit (`e976fb4`):
+  - Mocks `_extract_template_parameters` in `test_record_decision_persists_and_bumps_state_version` (last test failing on CI; needed an AI provider key the runner doesn't have). Verified locally — passes.
+  - pip + npm caches in all three jobs.
+  - Drops `--cov-report=term-missing` (the custom "Display coverage summary" step prints the same info from JSON).
+  - Adds `--maxfail=10` so structural breakage fails fast.
 
-Targeted validation also passed:
+**Expected CI on this PR:** all three jobs green for the first time in months.
 
-- `tests/test_session_resolutions_api.py tests/test_session_sharing.py tests/test_session_suggested_fixes_api.py tests/test_survey.py tests/test_tenant_isolation_p0.py tests/test_tree_sharing.py tests/test_trees.py::TestTrees::test_delete_tree_cleans_up_folder_and_tag_assignments tests/test_uploads.py::test_delete_upload_forbidden_for_non_owner` → `73 passed`
-- PDF export tests → `3 passed`
-- Prompt/PSA/resolution/script-builder subset → `14 passed`
-- Admin/AI/branch subsets → `11 passed`
+### PR #151 — `fix/ci-pytest-xdist` → main (stacked on #150)
 
-## What changed
+Carries (on top of #150):
+- `pytest-xdist==3.6.1` in `requirements-dev.txt`.
+- `conftest.py` adds `_worker_db_url` + `_ensure_worker_db_exists`. Each xdist worker gets its own DB (`resolutionflow_test_gw0`, `gw1`, …) so the per-test `DROP SCHEMA public CASCADE` doesn't race across workers.
+- Workflow's pytest invocation gains `-n auto`.
 
-Production fixes:
-
-- CI/backend dev image now installs WeasyPrint system libraries.
-- Public share-token and survey routes are mounted outside tenant auth; protected share management remains tenant-protected.
-- Folder creation now persists `UserFolder.account_id`.
-- Script Builder save-to-library now persists `ScriptTemplate.account_id`.
-- Resolution output generation eager-loads `AISession.steps` to avoid async lazy-load `MissingGreenlet`.
-- AI session model now declares the generated `search_vector` column already present in Alembic, so `create_all` test schemas match runtime migrations.
-- Direct account-role update now rejects `"owner"`; ownership changes must use the transfer path.
-- Assistant prompt marker examples no longer include a literal executable `create_spin_off_ticket` payload.
-
-Test/harness fixes:
-
-- Test seeds updated for tenant-scoped `account_id` columns on sessions, branches, resolution outputs, script templates, PSA connections, folders, schedules, and categories.
-- Tests aligned with 404-not-403 resource-hiding policy.
-- Disabled-AI tests now restore both Anthropic and Google key settings.
-- Pytest harness closes pytest-asyncio's leftover clean loop and ignores known unclosed asyncio/asyncpg teardown ResourceWarnings that otherwise appear at arbitrary later setup points under `filterwarnings = error`.
+**Measured locally:** backend suite goes from `22m 27s` (serial, 1076 passed) → `4m 28s` (8 workers, 1076 passed). Same exit code, same test count.
 
 ## Immediate next steps
 
-1. Commit current working tree if not already committed with trailer:
-   `Co-Authored-By: Codex `.
-2. Check PR #150 status on Gitea. If both `CI / backend (pull_request)` and `CI / frontend (pull_request)` are green, merge it.
-3. After #150 merges, add `CI / backend (pull_request)` to required status checks on main:
+1. **Watch PR #150 CI** on its latest sha (`e976fb4`). Both `CI / backend (pull_request)` and `CI / frontend (pull_request)` should be green. Merge if so.
+2. **Watch PR #151 CI** after #150 merges. Once #151 is rebased / merged automatically, backend job time on subsequent runs should drop to the 4–6 min range.
+3. **Enable backend gate** on `main` branch protection — append `"CI / backend (pull_request)"` to `status_check_contexts`:
    ```bash
-   PATCH /repos/chihlasm/resolutionflow/branch_protections/main
-   { "status_check_contexts": ["CI / frontend (pull_request)", "CI / backend (pull_request)"] }
+   curl -X PATCH -H "Authorization: token $GITEA_TOKEN" \
+     "https://gitea.resolutionflow.com/api/v1/repos/chihlasm/resolutionflow/branch_protections/main" \
+     -H "Content-Type: application/json" \
+     -d '{"status_check_contexts": ["CI / frontend (pull_request)", "CI / backend (pull_request)"]}'
    ```
-   `$GITEA_TOKEN` is in `.claude/settings.local.json`.
-4. Run/confirm frontend lint if needed for the final DoD item (`npm run lint` was already green after #149, but this session did not rerun it).
+4. **Optional: also gate `CI / e2e (pull_request)`** once that job has run cleanly a few times. The artifact-v3 fix means it can finally run; we haven't verified its actual outcome yet.
+
+## Uncommitted state
+
+Working tree clean (after this handoff commit).
+
+## Branch protection on main (current)
+
+- PR-only merges
+- `CI / frontend (pull_request)` required
+- Force-push blocked
+- No review required (solo)
+
+## Recently merged on main
+
+- `f27f671` — PR #149: fix(ci): frontend lint to zero errors + test-DB schema fix + dev-deps installable
+- `06593a4` — PR #148: fix(tests): repair two pre-existing bugs blocking backend CI
+- `32fae2c` — PR #147: feat: FlowPilot migration — Phase 1-9 + Phase 9 bug fixes + QA fixture harness
+- `16060d2` — PR #141: feat: PSA ticket management
 
 ## Open questions
 
-- PR #150 was not rechecked or merged in this session.
-- Branch protection was not updated in this session.
+- One known concern with `--maxfail=10`: if a single bad commit produces 11+ legitimate failures, CI bails before reporting them all. Acceptable trade-off — the alternative is burning 25 min on a structural break.
+- pytest-xdist load distribution is the default file-scoped balance. If one worker consistently gets the slow tests, switch to `--dist worksteal` (xdist 3.x). Not worth tuning preemptively.
+
+## Useful breadcrumbs
+
+- `backend/scripts/seed_phase9_qa_fixtures.py` pre-bakes Phase 9 QA fixtures.
+- `.gstack/qa-reports/phase9-20260424-232700/REPORT.md` — full QA report from the FlowPilot session.
+- gstack is in team mode for this repo. `/browse` Chromium needs `CONTAINER=1` env (see `~/.claude/skills/gstack/browse/src/browser-manager.ts:188`).
+- Per-worker test DBs accumulate on the postgres service. Cheap to leave around; cleanup if it ever bothers anyone.
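
Reviewer sketch (not part of the patch): the diff names `_worker_db_url` and `_ensure_worker_db_exists` in `conftest.py` but does not show their bodies, and the assumptions list mentions an advisory lock as a fallback. The following is a minimal sketch of how the per-worker URL and a lock-guarded create could look. It assumes an asyncpg-style connection (`execute` / `fetchval`) and relies on the `PYTEST_XDIST_WORKER` environment variable that pytest-xdist sets in each worker. The function names, the lock name, and the fake connection in the usage note are hypothetical and not taken from the repository.

```python
import asyncio
import os
import re
import zlib
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def worker_db_url(base_url: str, worker_id: Optional[str] = None) -> str:
    """Derive a per-worker DB URL, e.g. .../resolutionflow_test_gw0.

    Falls back to the base URL when not running under xdist.
    """
    if worker_id is None:
        # pytest-xdist exports the worker name ("gw0", "gw1", ...) here.
        worker_id = os.environ.get("PYTEST_XDIST_WORKER", "")
    if not worker_id or worker_id == "master":
        return base_url
    parts = urlsplit(base_url)
    db_name = parts.path.lstrip("/")
    return urlunsplit(parts._replace(path=f"/{db_name}_{worker_id}"))


def advisory_lock_key(name: str) -> int:
    """Stable integer key for pg_advisory_lock, derived from a label."""
    return zlib.crc32(name.encode("utf-8"))


async def ensure_worker_db_exists(conn, db_name: str) -> bool:
    """Create db_name if missing; return True when it was created.

    `conn` is assumed to be an asyncpg-style connection to a maintenance
    database, used outside any transaction (CREATE DATABASE requires that).
    """
    if not re.fullmatch(r"[A-Za-z0-9_]+", db_name):
        raise ValueError(f"unsafe database name: {db_name!r}")
    key = advisory_lock_key("create-test-db")  # hypothetical lock label
    # Session-level lock: only one worker at a time checks and creates.
    await conn.execute("SELECT pg_advisory_lock($1::bigint)", key)
    try:
        exists = await conn.fetchval(
            "SELECT 1 FROM pg_database WHERE datname = $1", db_name
        )
        if exists:
            return False
        await conn.execute(f'CREATE DATABASE "{db_name}"')
        return True
    finally:
        await conn.execute("SELECT pg_advisory_unlock($1::bigint)", key)
```

In a real `conftest.py` this would presumably be called once per worker before the engine is built, with `conn` opened against the `postgres` maintenance database. Whether the lock is needed at all depends on whether concurrent `CREATE DATABASE` calls actually fail on the runner, which the handoff leaves as an open assumption.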