Full-suite pytest collision: RLS tests incompatible with function-scoped test_db fixture #144

Closed
opened 2026-04-24 14:30:18 +00:00 by chihlasm · 1 comment
Owner

Summary

Running pytest tests/ end-to-end produces ~75 failures + ~96 errors, almost entirely caused by a fixture-scope collision between test_rls_isolation.py and every other test file's function-scoped test_db fixture. Phase 9 did not introduce this — it was present before and was flagged during the Phase 9 Task 14 full-suite run as a blocker for using pytest tests/ as a single regression gate.

Repro

docker exec resolutionflow_backend sh -c "cd /app && pytest tests/ --override-ini='addopts=' -q"

Observed on feat/flowpilot-migration at d386d11: 935 passed, 75 failed, 96 errors. Running individual files (pytest tests/test_foo.py) passes cleanly.

Root cause

Two incompatible test-setup strategies live in the same suite:

  1. backend/tests/conftest.py test_db fixture (function scope, used by most tests): DROP SCHEMA public CASCADE; CREATE SCHEMA public; Base.metadata.create_all(...). Creates tables from SQLAlchemy metadata only — no RLS policies, no DB roles, no migration-managed grants.
  2. backend/tests/test_rls_isolation.py _ensure_rls_schema fixture (module scope): runs alembic upgrade head once to set up the full migration-managed schema including RLS policies and per-tenant DB roles. Then uses an _ADMIN_DSN asyncpg connection plus per-tenant role DSNs for ownership checks.

Collisions that trigger the failures:

  • _ADMIN_DSN / role DSNs not configured in the test env → every test in test_rls_isolation.py errors at connect time (matches the 8+ errors clustered in that file).
  • Schema clobbering between modules → when a function-scoped test_db fixture runs before an RLS test picks back up, the Alembic-managed schema has been replaced with the bare-metadata version. Roles that were created by migrations may still exist at the DB level (roles are per-cluster, not per-schema), but policies attached to now-dropped tables are gone, so subsequent queries in RLS tests see inconsistent state.
  • Cascade into unrelated modules (test_trees.py, test_uploads.py, etc.) → the pre-existing test order can leave the DB in a half-state where a later non-RLS test sees lingering role grants or missing seed data, surfacing as errors rather than assertion failures.

Why this didn't show up earlier

  • CI likely runs test files in isolated pytest invocations (or in parallel with pytest-xdist worker-per-file), sidestepping cross-file schema state. The collision only appears when a single pytest tests/ invocation walks all files in one process.
  • Most day-to-day dev runs target a single file or module, never the full tree.

Options for fixing

In rough order of scope:

  1. Cheap: exclude test_rls_isolation.py from the default collection when run alongside others — e.g., pytest tests/ --ignore=tests/test_rls_isolation.py as the default, with RLS tests run in a separate CI step. Unblocks the full-suite regression gate immediately.
  2. Medium: switch conftest.py's test_db fixture to use alembic upgrade head instead of Base.metadata.create_all, so every test gets the migration-managed schema (RLS policies + roles + seeds). ~100 tests depend on this fixture; they should continue to work if the migration set covers everything create_all does (which it should). Risk: slower fixture setup; may surface other latent bugs.
  3. Deep: tier the test suite — unit tests that don't need DB use an in-memory or sqlite fixture; integration tests use a single migration-managed schema shared across the process (session or module scope) with per-test transactions rolled back. This is the textbook approach but a multi-day refactor.

Option 1 is the right default until someone has bandwidth for Option 2.

Impact

  • pytest tests/ is not currently a valid full-suite gate. Contributors who try to verify "nothing broke" by running the full suite will see a noisy failure report and may incorrectly attribute those failures to their own change.
  • Phase 9's Task 14 had to triage this to confirm the 75+96 were all pre-existing, adding friction to the handoff.

Not a code bug, but tooling debt worth tracking.

## Summary Running `pytest tests/` end-to-end produces ~75 failures + ~96 errors, almost entirely caused by a fixture-scope collision between `test_rls_isolation.py` and every other test file's function-scoped `test_db` fixture. Phase 9 did not introduce this — it was present before and was flagged during the Phase 9 Task 14 full-suite run as a blocker for using `pytest tests/` as a single regression gate. ## Repro ```bash docker exec resolutionflow_backend sh -c "cd /app && pytest tests/ --override-ini='addopts=' -q" ``` Observed on `feat/flowpilot-migration` at `d386d11`: `935 passed, 75 failed, 96 errors`. Running individual files (`pytest tests/test_foo.py`) passes cleanly. ## Root cause Two incompatible test-setup strategies live in the same suite: 1. **`backend/tests/conftest.py` `test_db` fixture** (function scope, used by most tests): `DROP SCHEMA public CASCADE; CREATE SCHEMA public; Base.metadata.create_all(...)`. Creates tables from SQLAlchemy metadata only — **no RLS policies, no DB roles, no migration-managed grants**. 2. **`backend/tests/test_rls_isolation.py` `_ensure_rls_schema` fixture** (module scope): runs `alembic upgrade head` once to set up the full migration-managed schema including RLS policies and per-tenant DB roles. Then uses an `_ADMIN_DSN` asyncpg connection plus per-tenant role DSNs for ownership checks. **Collisions that trigger the failures:** - **`_ADMIN_DSN` / role DSNs not configured in the test env** → every test in `test_rls_isolation.py` errors at connect time (matches the 8+ errors clustered in that file). - **Schema clobbering between modules** → when a function-scoped `test_db` fixture runs before an RLS test picks back up, the Alembic-managed schema has been replaced with the bare-metadata version. Roles that were created by migrations may still exist at the DB level (roles are per-cluster, not per-schema), but policies attached to now-dropped tables are gone, so subsequent queries in RLS tests see inconsistent state. - **Cascade into unrelated modules** (`test_trees.py`, `test_uploads.py`, etc.) → the pre-existing test order can leave the DB in a half-state where a later non-RLS test sees lingering role grants or missing seed data, surfacing as errors rather than assertion failures. ## Why this didn't show up earlier - CI likely runs test files in isolated pytest invocations (or in parallel with `pytest-xdist` worker-per-file), sidestepping cross-file schema state. The collision only appears when a single `pytest tests/` invocation walks all files in one process. - Most day-to-day dev runs target a single file or module, never the full tree. ## Options for fixing In rough order of scope: 1. **Cheap: exclude `test_rls_isolation.py` from the default collection** when run alongside others — e.g., `pytest tests/ --ignore=tests/test_rls_isolation.py` as the default, with RLS tests run in a separate CI step. Unblocks the full-suite regression gate immediately. 2. **Medium: switch `conftest.py`'s `test_db` fixture to use `alembic upgrade head` instead of `Base.metadata.create_all`**, so every test gets the migration-managed schema (RLS policies + roles + seeds). ~100 tests depend on this fixture; they should continue to work if the migration set covers everything `create_all` does (which it should). Risk: slower fixture setup; may surface other latent bugs. 3. **Deep: tier the test suite** — unit tests that don't need DB use an in-memory or sqlite fixture; integration tests use a single migration-managed schema shared across the process (session or module scope) with per-test transactions rolled back. This is the textbook approach but a multi-day refactor. Option 1 is the right default until someone has bandwidth for Option 2. ## Impact - `pytest tests/` is not currently a valid full-suite gate. Contributors who try to verify "nothing broke" by running the full suite will see a noisy failure report and may incorrectly attribute those failures to their own change. - Phase 9's Task 14 had to triage this to confirm the 75+96 were all pre-existing, adding friction to the handoff. Not a code bug, but tooling debt worth tracking.
Author
Owner

Closing as duplicate of #145 (which has the post-mortem and resolution detail). Fixed in dab740d + b14a16a.

Closing as duplicate of #145 (which has the post-mortem and resolution detail). Fixed in `dab740d` + `b14a16a`.
Sign in to join this conversation.