chore: Task 7+8 — TargetList audit, CI tenant-filter grep check

Task 7: TargetList dead code audit
- Found active code references in 12+ files across backend and frontend
  (full CRUD API + frontend page + MaintenanceScheduleSection + BatchLaunchModal)
- Decision: migrate to account_id in Phase 1 (cannot drop)
- DB row count not available from code-server — must verify from VPS SSH
  before Phase 1 migration
- Teams orphan check query documented; must run from VPS SSH before Phase 1
- Results documented in spec Section 9

Task 8: CI tenant-filter enforcement check (warn mode)
- Create backend/scripts/check_tenant_filters.py
  Scans endpoint and service files for select() on tenant tables without
  tenant_filter/account_id/user_id in surrounding context. Currently
  reports 109 warnings (Phase 1 backlog). Exits 0 (warn mode).
- Add Check tenant filter enforcement step to backend CI job
  Add --fail flag after Phase 1 backlog clears to make it blocking.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
chihlasm
2026-04-09 04:21:58 +00:00
parent 61402f62a2
commit 0a9b16a28f
3 changed files with 98 additions and 1 deletions

View File

@@ -47,6 +47,11 @@ jobs:
- name: Install dependencies
run: pip install -r backend/requirements.txt -r backend/requirements-dev.txt
- name: Check tenant filter enforcement
run: cd backend && python scripts/check_tenant_filters.py
# Warn mode only (exits 0). Switch to --fail after Phase 1 backlog clears.
# See: docs/superpowers/specs/2026-04-09-tenant-data-isolation-design.md Section 3f
- name: Run tests with coverage
run: cd backend && python -m pytest --override-ini="addopts=" --cov=app --cov-report=term-missing --cov-report=json:coverage.json --cov-fail-under=50

View File

@@ -0,0 +1,91 @@
"""
Tenant filter enforcement check.
Scans endpoint and service files for SQLAlchemy select() calls on known
tenant tables and warns when account_id or tenant_filter is not present
in the surrounding 15 lines (the typical extent of a single query).
Usage:
python scripts/check_tenant_filters.py # warn mode (exits 0)
python scripts/check_tenant_filters.py --fail # block mode (exits 1 on findings)
"""
import re
import sys
from pathlib import Path
# Tables that must always be filtered by account_id or tenant_filter.
# Extend this list as new tenant tables are added.
TENANT_MODELS = [
"Tree", "AISession", "Session", "StepLibrary", "FlowProposal",
"CopilotConversation", "AssistantChat", "FileUpload", "KBImport",
"PsaConnection", "PsaPostLog", "PsaMemberMapping", "AIChatSession",
"AIConversation", "AIUsage", "Subscription", "AccountInvite",
"Notification", "NotificationConfig", "SessionShare", "UserFolder",
"UserPinnedTree", "SessionBranch", "SessionHandoff",
"SessionResolutionOutput", "ForkPoint", "AISessionStep",
"AISuggestion", "StepCategory", "TreeCategory", "TreeTag",
"Attachment", "SessionSupportingData", "MaintenanceSchedule",
"AuditLog", "ScriptBuilderSession", "ScriptTemplate",
"StepRating", "StepUsageLog", "TargetList",
]
# Directories to scan
SCAN_DIRS = [
Path("app/api/endpoints"),
Path("app/services"),
]
# Patterns that indicate the query is correctly scoped.
# NOTE: user_id scoping is accepted for user-owned resources (sessions, folders, notifications).
# For account-shared resources (trees, steps, etc.) use tenant_filter or account_id.
SAFE_PATTERNS = [
r"tenant_filter",
r"account_id",
r"user_id", # User-scoped resources (sessions, folders, notifications, etc.)
r"is_super_admin", # Super admin queries intentionally bypass tenant filter
r"# cross-tenant: approved", # Explicit approval comment
]
SKIP_FILES = {
"admin.py", # Super admin endpoints intentionally bypass tenant filter
"admin_gallery.py", # Gallery management — super admin only, no tenant scoping needed
"public_templates.py",# Public template browser — intentionally cross-tenant
"auth.py", # Auth/registration — no account context during login/register
"ratings.py", # Session ratings — user-scoped via session lookup chain
}
findings = []
for scan_dir in SCAN_DIRS:
if not scan_dir.exists():
continue
for path in sorted(scan_dir.glob("*.py")):
if path.name in SKIP_FILES:
continue
lines = path.read_text().splitlines()
for i, line in enumerate(lines):
for model in TENANT_MODELS:
if re.search(rf"\bselect\s*\(\s*{model}\b", line):
# Check surrounding 15 lines for a safe pattern
start = max(0, i - 2)
end = min(len(lines), i + 15)
context = "\n".join(lines[start:end])
if not any(re.search(p, context) for p in SAFE_PATTERNS):
findings.append(
f"{path}:{i + 1}: select({model}) — no tenant_filter or account_id found in context"
)
if findings:
print(f"\n⚠ Tenant filter check — {len(findings)} warning(s):\n")
for f in findings:
print(f" {f}")
print()
if "--fail" in sys.argv:
print("Run with --fail: exiting 1")
sys.exit(1)
else:
print("Run in warn mode — not blocking. Pass --fail to block.")
sys.exit(0)
else:
print("✓ Tenant filter check passed — no unscoped tenant table queries found.")
sys.exit(0)

View File

@@ -589,7 +589,8 @@ The following cross-tenant operations are explicitly approved. All others requir
|---|---|---|
| Is tree sharing intra-tenant only, or can trees be shared across accounts? | Determines `tree_shares` schema, backfill strategy, and RLS policy. Tree_shares table deferred until resolved. | Product |
| What is the exact PostgreSQL exception raised when `current_setting('app.current_account_id', false)::uuid` is evaluated with no value set? | Determines the fail-closed test assertion. Must be tested in Phase 2. | Engineering |
| TargetList: zero references + zero rows? Or does data exist? | Determines whether table is dropped or migrated. | Phase 0 audit |
| TargetList audit complete: active code references found in 12+ files across backend and frontend (full CRUD API, frontend page, used in MaintenanceScheduleSection and BatchLaunchModal). Cannot be dropped. Decision: migrate to account_id in Phase 1 via backfill from team_id → accounts. DB row count not available from code-server (Docker only via VPS SSH) — run `SELECT COUNT(*) FROM target_lists;` before Phase 1 migration to confirm data volume. | ✓ Resolved — migrate in Phase 1. Row count verification required before migration. | Phase 0 audit |
| Teams orphan check: DB unavailable from code-server (Docker requires VPS SSH access). Must run before Phase 1 backfill: `SELECT COUNT(*) AS orphaned_teams FROM teams t LEFT JOIN users u ON u.team_id = t.id AND u.account_id IS NOT NULL WHERE u.id IS NULL;` | Count = 0 → proceed. Count > 0 → resolve orphaned teams before any Phase 1 backfill using team_id → account_id chain. | Run from VPS SSH before Phase 1 |
---