Full-suite pytest collision: RLS tests incompatible with function-scoped test_db fixture #144

New Issue

chihlasm · 2026-04-24T14:30:18Z

chihlasm commented

2026-04-24 14:30:18 +00:00

Summary

Running pytest tests/ end-to-end produces ~75 failures + ~96 errors, almost entirely caused by a fixture-scope collision between test_rls_isolation.py and every other test file's function-scoped test_db fixture. Phase 9 did not introduce this — it was present before and was flagged during the Phase 9 Task 14 full-suite run as a blocker for using pytest tests/ as a single regression gate.

Repro

docker exec resolutionflow_backend sh -c "cd /app && pytest tests/ --override-ini='addopts=' -q"

Observed on feat/flowpilot-migration at d386d11: 935 passed, 75 failed, 96 errors. Running individual files (pytest tests/test_foo.py) passes cleanly.

Root cause

Two incompatible test-setup strategies live in the same suite:

backend/tests/conftest.py test_db fixture (function scope, used by most tests): DROP SCHEMA public CASCADE; CREATE SCHEMA public; Base.metadata.create_all(...). Creates tables from SQLAlchemy metadata only — no RLS policies, no DB roles, no migration-managed grants.
backend/tests/test_rls_isolation.py _ensure_rls_schema fixture (module scope): runs alembic upgrade head once to set up the full migration-managed schema including RLS policies and per-tenant DB roles. Then uses an _ADMIN_DSN asyncpg connection plus per-tenant role DSNs for ownership checks.

Collisions that trigger the failures:

_ADMIN_DSN / role DSNs not configured in the test env → every test in test_rls_isolation.py errors at connect time (matches the 8+ errors clustered in that file).
Schema clobbering between modules → when a function-scoped test_db fixture runs before an RLS test picks back up, the Alembic-managed schema has been replaced with the bare-metadata version. Roles that were created by migrations may still exist at the DB level (roles are per-cluster, not per-schema), but policies attached to now-dropped tables are gone, so subsequent queries in RLS tests see inconsistent state.
Cascade into unrelated modules (test_trees.py, test_uploads.py, etc.) → the pre-existing test order can leave the DB in a half-state where a later non-RLS test sees lingering role grants or missing seed data, surfacing as errors rather than assertion failures.

Why this didn't show up earlier

CI likely runs test files in isolated pytest invocations (or in parallel with pytest-xdist worker-per-file), sidestepping cross-file schema state. The collision only appears when a single pytest tests/ invocation walks all files in one process.
Most day-to-day dev runs target a single file or module, never the full tree.

Options for fixing

In rough order of scope:

Cheap: exclude test_rls_isolation.py from the default collection when run alongside others — e.g., pytest tests/ --ignore=tests/test_rls_isolation.py as the default, with RLS tests run in a separate CI step. Unblocks the full-suite regression gate immediately.
Medium: switch conftest.py's test_db fixture to use alembic upgrade head instead of Base.metadata.create_all, so every test gets the migration-managed schema (RLS policies + roles + seeds). ~100 tests depend on this fixture; they should continue to work if the migration set covers everything create_all does (which it should). Risk: slower fixture setup; may surface other latent bugs.
Deep: tier the test suite — unit tests that don't need DB use an in-memory or sqlite fixture; integration tests use a single migration-managed schema shared across the process (session or module scope) with per-test transactions rolled back. This is the textbook approach but a multi-day refactor.

Option 1 is the right default until someone has bandwidth for Option 2.

Impact

pytest tests/ is not currently a valid full-suite gate. Contributors who try to verify "nothing broke" by running the full suite will see a noisy failure report and may incorrectly attribute those failures to their own change.
Phase 9's Task 14 had to triage this to confirm the 75+96 were all pre-existing, adding friction to the handoff.

Not a code bug, but tooling debt worth tracking.

## Summary Running `pytest tests/` end-to-end produces ~75 failures + ~96 errors, almost entirely caused by a fixture-scope collision between `test_rls_isolation.py` and every other test file's function-scoped `test_db` fixture. Phase 9 did not introduce this — it was present before and was flagged during the Phase 9 Task 14 full-suite run as a blocker for using `pytest tests/` as a single regression gate. ## Repro ```bash docker exec resolutionflow_backend sh -c "cd /app && pytest tests/ --override-ini='addopts=' -q" ``` Observed on `feat/flowpilot-migration` at `d386d11`: `935 passed, 75 failed, 96 errors`. Running individual files (`pytest tests/test_foo.py`) passes cleanly. ## Root cause Two incompatible test-setup strategies live in the same suite: 1. **`backend/tests/conftest.py` `test_db` fixture** (function scope, used by most tests): `DROP SCHEMA public CASCADE; CREATE SCHEMA public; Base.metadata.create_all(...)`. Creates tables from SQLAlchemy metadata only — **no RLS policies, no DB roles, no migration-managed grants**. 2. **`backend/tests/test_rls_isolation.py` `_ensure_rls_schema` fixture** (module scope): runs `alembic upgrade head` once to set up the full migration-managed schema including RLS policies and per-tenant DB roles. Then uses an `_ADMIN_DSN` asyncpg connection plus per-tenant role DSNs for ownership checks. **Collisions that trigger the failures:** - **`_ADMIN_DSN` / role DSNs not configured in the test env** → every test in `test_rls_isolation.py` errors at connect time (matches the 8+ errors clustered in that file). - **Schema clobbering between modules** → when a function-scoped `test_db` fixture runs before an RLS test picks back up, the Alembic-managed schema has been replaced with the bare-metadata version. Roles that were created by migrations may still exist at the DB level (roles are per-cluster, not per-schema), but policies attached to now-dropped tables are gone, so subsequent queries in RLS tests see inconsistent state. - **Cascade into unrelated modules** (`test_trees.py`, `test_uploads.py`, etc.) → the pre-existing test order can leave the DB in a half-state where a later non-RLS test sees lingering role grants or missing seed data, surfacing as errors rather than assertion failures. ## Why this didn't show up earlier - CI likely runs test files in isolated pytest invocations (or in parallel with `pytest-xdist` worker-per-file), sidestepping cross-file schema state. The collision only appears when a single `pytest tests/` invocation walks all files in one process. - Most day-to-day dev runs target a single file or module, never the full tree. ## Options for fixing In rough order of scope: 1. **Cheap: exclude `test_rls_isolation.py` from the default collection** when run alongside others — e.g., `pytest tests/ --ignore=tests/test_rls_isolation.py` as the default, with RLS tests run in a separate CI step. Unblocks the full-suite regression gate immediately. 2. **Medium: switch `conftest.py`'s `test_db` fixture to use `alembic upgrade head` instead of `Base.metadata.create_all`**, so every test gets the migration-managed schema (RLS policies + roles + seeds). ~100 tests depend on this fixture; they should continue to work if the migration set covers everything `create_all` does (which it should). Risk: slower fixture setup; may surface other latent bugs. 3. **Deep: tier the test suite** — unit tests that don't need DB use an in-memory or sqlite fixture; integration tests use a single migration-managed schema shared across the process (session or module scope) with per-test transactions rolled back. This is the textbook approach but a multi-day refactor. Option 1 is the right default until someone has bandwidth for Option 2. ## Impact - `pytest tests/` is not currently a valid full-suite gate. Contributors who try to verify "nothing broke" by running the full suite will see a noisy failure report and may incorrectly attribute those failures to their own change. - Phase 9's Task 14 had to triage this to confirm the 75+96 were all pre-existing, adding friction to the handoff. Not a code bug, but tooling debt worth tracking.

chihlasm commented

2026-05-01 04:57:14 +00:00

Closing as duplicate of #145 (which has the post-mortem and resolution detail). Fixed in dab740d + b14a16a.

Closing as duplicate of #145 (which has the post-mortem and resolution detail). Fixed in `dab740d` + `b14a16a`.

chihlasm closed this issue

2026-05-01 04:57:18 +00:00

Sign in to join this conversation.

1 Participants

Notifications

Due Date

No due date set.

Dependencies

No dependencies set.

Reference: chihlasm/resolutionflow#144