From ba0680ce0601d208dff0155286455da5dbf07006 Mon Sep 17 00:00:00 2001 From: Claude Date: Wed, 8 Apr 2026 10:40:15 +0000 Subject: [PATCH 1/4] docs: update CHANGELOG with image support, header actions, and design token normalization - Added image support in Assistant Chat with S3 upload and vision integration - Moved session lifecycle actions to header bar in AssistantChatPage - Normalized design system tokens across FlowPilot, AssistantChat, ScriptBuilder - Fixed 'sorry something went wrong' errors and image display in chat - Fixed Task Lane stale data and chat ref invalidation race conditions https://claude.ai/code/session_01LGJSDQqPi3sPWjC6vh9Uyj --- CHANGELOG.md | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index e8cffb99..88f40644 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -11,6 +11,7 @@ All notable changes to ResolutionFlow are documented here. - **Script Library default view** — "All Scripts" tab now displays all accessible scripts (team + library) - **Session documentation overhaul** — reformatted PSA resolution/escalation notes with cleaner headers, inline engineer responses, decimal hour display (0.25 hrs), follow-up recommendations, and improved "What We Know" section from evidence items - **Client communication improvements** — new `request_info` audience type for client-facing information requests, improved status update and email draft prompts with per-context guidance +- **Image support in Assistant Chat** — paste/attach images in chat input, uploaded to S3, resized for vision model, displayed in conversation history ### Changed - **Edit Procedure page** — layout overhaul and color system refinements for better visual hierarchy @@ -18,6 +19,8 @@ All notable changes to ResolutionFlow are documented here. - **Account settings page** — audit fixes for improved consistency and usability - **PSA documentation formatting** — removed duplicate timing blocks and AI confidence sections; added client-facing communication context guidance - **Status update generation** — fixed option label lookup to use human-readable labels instead of machine values +- **Assistant Chat session actions** — moved Pause/Resume/Close actions from action bar to page header for consistency with FlowPilot +- **Design system token normalization** — unified FlowPilot, AssistantChat, and ScriptBuilder components to use consistent design tokens ### Fixed - Dark text rendering on blue accent step-number badges across all flow types @@ -26,6 +29,10 @@ All notable changes to ResolutionFlow are documented here. - Stale async results in Assistant Chat (selectChat) no longer clobber new session task lane - Sentry DSN hardcoded fallback removed — now uses environment variable only - Option label resolution in status update context generation +- "Sorry something went wrong" errors in chat when rendering unsupported message types +- Task Lane stale data when creating new chat or resuming from concluded session +- Chat ref invalidation race condition between handleNewChat and async data loads +- Images now properly display in chat message history instead of blank placeholders --- From 29a9573d6eaa6662960681f128937da061ef45f7 Mon Sep 17 00:00:00 2001 From: chihlasm Date: Thu, 9 Apr 2026 00:41:30 -0400 Subject: [PATCH 2/4] =?UTF-8?q?fix:=20CRITICAL=20=E2=80=94=20scope=20copil?= =?UTF-8?q?ot=20tree=20query=20to=20current=20account=20(#131)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * docs: add tenant data isolation design spec Complete architecture plan for multi-tenant data isolation across all layers (PostgreSQL RLS, application-layer filtering, schema migration, testing strategy, and phased rollout checklist). Co-Authored-By: Claude Sonnet 4.6 * docs: add background job isolation policy to tenant isolation spec Documents policy for all 5 existing background jobs: - Knowledge Flywheel and PSA Retry flagged for account_id threading - Chat Retention already follows correct pattern (model for others) - Maintenance Schedule Firing needs account_id in queries + Session creation - AI Conversation Expiry approved as cross-tenant with justification Adds approved cross-tenant query registry and Phase 2 checklist items. Co-Authored-By: Claude Sonnet 4.6 * docs: add tenant isolation Phase 0 implementation plan 8 tasks covering: CRITICAL copilot hotfix, tenant_filter() helper, get_tenant_context dependency, analytics/category/AI session gap fixes, full UUID endpoint audit, TargetList dead code audit, teams orphan check, and CI grep check for missing tenant filters. Co-Authored-By: Claude Sonnet 4.6 * fix: CRITICAL — scope copilot tree query to current account A user who knew another account's tree UUID could start a copilot conversation, causing the tree's full node structure, names, and descriptions to be sent to the AI as part of the system prompt. Fix: add account_id (or is_default / visibility='public') filter to the tree SELECT in copilot_service.start_conversation(). Returns 404 for inaccessible trees. Test added in test_tenant_isolation_p0.py. Co-Authored-By: Claude Sonnet 4.6 --------- Co-authored-by: Claude Sonnet 4.6 --- backend/app/services/copilot_service.py | 18 +- backend/tests/test_tenant_isolation_p0.py | 107 ++ .../2026-04-09-tenant-isolation-phase-0.md | 1136 +++++++++++++++++ ...2026-04-09-tenant-data-isolation-design.md | 654 ++++++++++ 4 files changed, 1911 insertions(+), 4 deletions(-) create mode 100644 backend/tests/test_tenant_isolation_p0.py create mode 100644 docs/superpowers/plans/2026-04-09-tenant-isolation-phase-0.md create mode 100644 docs/superpowers/specs/2026-04-09-tenant-data-isolation-design.md diff --git a/backend/app/services/copilot_service.py b/backend/app/services/copilot_service.py index 175735ee..76327ff4 100644 --- a/backend/app/services/copilot_service.py +++ b/backend/app/services/copilot_service.py @@ -8,7 +8,7 @@ from datetime import datetime, timezone, timedelta from typing import Optional, Any from uuid import UUID -from sqlalchemy import select +from sqlalchemy import select, or_ from sqlalchemy.ext.asyncio import AsyncSession from sqlalchemy.orm import selectinload @@ -103,13 +103,23 @@ async def start_conversation( Returns (conversation, greeting_message). """ - # Load tree + # Load tree — must be accessible to this account. + # Allows own account's trees, default trees, and public trees. + # Raises ValueError (caught by endpoint as 404) if not found or not accessible. result = await db.execute( - select(Tree).options(selectinload(Tree.tags)).where(Tree.id == tree_id) + select(Tree).options(selectinload(Tree.tags)).where( + Tree.id == tree_id, + or_( + Tree.account_id == account_id, + Tree.author_id == user_id, + Tree.is_default == True, + Tree.is_public == True, + ), + ) ) tree = result.scalar_one_or_none() if not tree: - raise ValueError(f"Tree {tree_id} not found") + raise ValueError(f"Tree {tree_id} not found or not accessible") conversation = CopilotConversation( user_id=user_id, diff --git a/backend/tests/test_tenant_isolation_p0.py b/backend/tests/test_tenant_isolation_p0.py new file mode 100644 index 00000000..04018f30 --- /dev/null +++ b/backend/tests/test_tenant_isolation_p0.py @@ -0,0 +1,107 @@ +"""Cross-tenant isolation tests for Phase 0 gap fixes.""" +import pytest +from httpx import AsyncClient +from sqlalchemy.ext.asyncio import AsyncSession +from sqlalchemy import select +import uuid + +from app.models.account import Account +from app.models.user import User +from app.models.tree import Tree +from app.core.security import get_password_hash + + +# ── Helpers ────────────────────────────────────────────────────────────────── + +async def _create_account_and_user(db: AsyncSession, suffix: str) -> tuple[Account, User, str]: + """Create an account + owner user. Returns (account, user, plain_password).""" + account = Account( + name=f"Test Corp {suffix}", + display_code=uuid.uuid4().hex[:8], + ) + db.add(account) + await db.flush() + + password = "TestPass123!" + user = User( + email=f"user-{suffix}-{uuid.uuid4().hex[:6]}@example.com", + name=f"User {suffix}", + password_hash=get_password_hash(password), + is_active=True, + account_id=account.id, + account_role="owner", + ) + db.add(user) + await db.flush() + return account, user, password + + +async def _login(client: AsyncClient, email: str, password: str) -> dict: + """Return auth headers for a user.""" + resp = await client.post( + "/api/v1/auth/login/json", + json={"email": email, "password": password}, + ) + assert resp.status_code == 200, resp.text + return {"Authorization": f"Bearer {resp.json()['access_token']}"} + + +async def _create_private_tree(db: AsyncSession, account: Account, user: User) -> Tree: + """Create a private tree owned by account.""" + tree = Tree( + name=f"Private Tree {uuid.uuid4().hex[:6]}", + account_id=account.id, + author_id=user.id, + visibility="private", + tree_type="troubleshooting", + tree_structure={"id": "root", "type": "start", "children": []}, + is_active=True, + status="published", + ) + db.add(tree) + await db.flush() + return tree + + +# ── Task 1: Copilot bypass ──────────────────────────────────────────────────── + +@pytest.mark.asyncio +async def test_copilot_cannot_start_conversation_with_other_account_tree( + client: AsyncClient, test_db: AsyncSession +): + """Account A cannot start a copilot conversation using Account B's private tree UUID.""" + acct_a, user_a, pass_a = await _create_account_and_user(test_db, "a") + acct_b, user_b, pass_b = await _create_account_and_user(test_db, "b") + tree_b = await _create_private_tree(test_db, acct_b, user_b) + await test_db.commit() + + headers_a = await _login(client, user_a.email, pass_a) + + resp = await client.post( + "/api/v1/copilot/conversations", + json={"tree_id": str(tree_b.id), "session_id": None, "current_node_id": None}, + headers=headers_a, + ) + # Must be 404 (not 200, not 403 — never confirm existence) + assert resp.status_code == 404, f"Expected 404, got {resp.status_code}: {resp.text}" + + +@pytest.mark.asyncio +async def test_copilot_service_rejects_cross_tenant_tree(test_db: AsyncSession): + """copilot_service.start_conversation raises ValueError for cross-tenant tree.""" + from app.services import copilot_service + + acct_a, user_a, pass_a = await _create_account_and_user(test_db, "svc-a") + acct_b, user_b, pass_b = await _create_account_and_user(test_db, "svc-b") + tree_b = await _create_private_tree(test_db, acct_b, user_b) + await test_db.commit() + + with pytest.raises(ValueError, match="not found"): + await copilot_service.start_conversation( + user_id=user_a.id, + account_id=user_a.account_id, + tree_id=tree_b.id, + session_id=None, + current_node_id=None, + db=test_db, + ) diff --git a/docs/superpowers/plans/2026-04-09-tenant-isolation-phase-0.md b/docs/superpowers/plans/2026-04-09-tenant-isolation-phase-0.md new file mode 100644 index 00000000..f11236ba --- /dev/null +++ b/docs/superpowers/plans/2026-04-09-tenant-isolation-phase-0.md @@ -0,0 +1,1136 @@ +# Tenant Isolation — Hotfix + Phase 0 Implementation Plan + +> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. + +**Goal:** Fix all known cross-tenant data leaks and establish the application-layer patterns and CI gating that all future tenant-data work must comply with. + +**Architecture:** Application code is the primary enforcement layer. Every query on a tenant table must include an explicit `account_id` filter via `tenant_filter()`. PostgreSQL RLS (Phase 2) is the safety net — these Phase 0 changes must be complete and correct regardless of RLS. Each task is independently committable. + +**Tech Stack:** Python 3.11 · FastAPI · SQLAlchemy 2.0 async · pytest-asyncio · GitHub Actions (existing CI in `.github/workflows/ci.yml`) + +**Spec:** `docs/superpowers/specs/2026-04-09-tenant-data-isolation-design.md` + +**Phase gate:** All 8 tasks complete + CI grep check active before Phase 1 (schema migration) begins. + +--- + +## File Map + +| File | Action | Why | +|---|---|---| +| `backend/app/services/copilot_service.py` | Modify | CRITICAL: add account scoping to tree query (lines 107–112) | +| `backend/app/core/filters.py` | Modify | Add `tenant_filter()` helper; update existing filters to call it | +| `backend/app/api/deps.py` | Modify | Add `get_tenant_context` dependency | +| `backend/app/api/endpoints/analytics.py` | Modify | Add `tenant_filter` to flow analytics tree fetch (line 294) | +| `backend/app/api/endpoints/categories.py` | Modify | Scope category tree count to current account (line 112) | +| `backend/app/api/endpoints/ai_sessions.py` | Modify | Restrict search to `user_id` only (line 765) | +| `backend/tests/test_tenant_isolation_p0.py` | Create | Cross-tenant tests for every fix in this plan | +| `backend/scripts/check_tenant_filters.py` | Create | Grep-based CI check script | +| `.github/workflows/ci.yml` | Modify | Add tenant-filter check step to backend CI job | + +--- + +## Task 1: HOTFIX — Copilot tree access bypass (CRITICAL) + +**Ship this task immediately as its own PR before starting any other task.** + +**Files:** +- Modify: `backend/app/services/copilot_service.py:107-112` +- Test: `backend/tests/test_tenant_isolation_p0.py` + +**Background:** `start_conversation()` loads a tree by UUID with no account filter. An attacker who knows another account's tree UUID can extract its full node structure, names, and descriptions via the AI system prompt. This is the highest priority fix. + +--- + +- [ ] **Step 1.1: Write the failing test first** + +Create `backend/tests/test_tenant_isolation_p0.py`: + +```python +"""Cross-tenant isolation tests for Phase 0 gap fixes.""" +import pytest +from httpx import AsyncClient +from sqlalchemy.ext.asyncio import AsyncSession +from sqlalchemy import select +import uuid + +from app.models.account import Account +from app.models.user import User +from app.models.tree import Tree +from app.core.security import get_password_hash + + +# ── Helpers ────────────────────────────────────────────────────────────────── + +async def _create_account_and_user(db: AsyncSession, suffix: str) -> tuple[Account, User, str]: + """Create an account + owner user. Returns (account, user, plain_password).""" + account = Account( + name=f"Test Corp {suffix}", + slug=f"test-corp-{suffix}-{uuid.uuid4().hex[:6]}", + ) + db.add(account) + await db.flush() + + password = "TestPass123!" + user = User( + email=f"user-{suffix}-{uuid.uuid4().hex[:6]}@example.com", + name=f"User {suffix}", + hashed_password=get_password_hash(password), + is_active=True, + account_id=account.id, + account_role="owner", + ) + db.add(user) + await db.flush() + return account, user, password + + +async def _login(client: AsyncClient, email: str, password: str) -> dict: + """Return auth headers for a user.""" + resp = await client.post( + "/api/v1/auth/login/json", + json={"email": email, "password": password}, + ) + assert resp.status_code == 200, resp.text + return {"Authorization": f"Bearer {resp.json()['access_token']}"} + + +async def _create_private_tree(db: AsyncSession, account: Account, user: User) -> Tree: + """Create a private tree owned by account.""" + tree = Tree( + name=f"Private Tree {uuid.uuid4().hex[:6]}", + account_id=account.id, + author_id=user.id, + visibility="private", + tree_type="troubleshooting", + tree_structure={"id": "root", "type": "start", "children": []}, + is_active=True, + status="published", + ) + db.add(tree) + await db.flush() + return tree + + +# ── Task 1: Copilot bypass ──────────────────────────────────────────────────── + +@pytest.mark.asyncio +async def test_copilot_cannot_start_conversation_with_other_account_tree( + client: AsyncClient, test_db: AsyncSession +): + """Account A cannot start a copilot conversation using Account B's private tree UUID.""" + acct_a, user_a, pass_a = await _create_account_and_user(test_db, "a") + acct_b, user_b, pass_b = await _create_account_and_user(test_db, "b") + tree_b = await _create_private_tree(test_db, acct_b, user_b) + await test_db.commit() + + headers_a = await _login(client, user_a.email, pass_a) + + resp = await client.post( + "/api/v1/copilot/conversations", + json={"tree_id": str(tree_b.id), "session_id": None, "current_node_id": None}, + headers=headers_a, + ) + # Must be 404 (not 200, not 403 — never confirm existence) + assert resp.status_code == 404, f"Expected 404, got {resp.status_code}: {resp.text}" +``` + +- [ ] **Step 1.2: Run test to confirm it currently fails (i.e., returns 200 instead of 404)** + +```bash +cd backend && python -m pytest tests/test_tenant_isolation_p0.py::test_copilot_cannot_start_conversation_with_other_account_tree -v --override-ini="addopts=" +``` + +Expected: FAIL (test gets 200 back — confirms the vulnerability is real). + +> **Note:** If AI is not configured in test env, the endpoint may return 503 before reaching the tree lookup. In that case, temporarily add `settings.ai_enabled = False` check first: if you get 503, the AI gate fires before the tree check, which means the tree is still loaded but the endpoint errors out at AI quota — not the same vulnerability surface. Force `settings.ai_enabled = True` in the test or mock the `_require_ai_enabled` call to confirm the 200 path. The vulnerability is in the service layer, not the endpoint gate. + +- [ ] **Step 1.3: Fix `copilot_service.py` — add account filter to tree query** + +In `backend/app/services/copilot_service.py`, replace lines 107–112: + +```python +# BEFORE: + # Load tree + result = await db.execute( + select(Tree).options(selectinload(Tree.tags)).where(Tree.id == tree_id) + ) + tree = result.scalar_one_or_none() + if not tree: + raise ValueError(f"Tree {tree_id} not found") +``` + +```python +# AFTER: + # Load tree — must be accessible to this account. + # Allows own account's trees, default trees, and public trees. + # Raises ValueError (caught by endpoint as 404) if not found or not accessible. + from sqlalchemy import or_ + result = await db.execute( + select(Tree).options(selectinload(Tree.tags)).where( + Tree.id == tree_id, + or_( + Tree.account_id == account_id, + Tree.is_default == True, + Tree.visibility == "public", + ), + ) + ) + tree = result.scalar_one_or_none() + if not tree: + raise ValueError(f"Tree {tree_id} not found or not accessible") +``` + +- [ ] **Step 1.4: Run test — must pass** + +```bash +cd backend && python -m pytest tests/test_tenant_isolation_p0.py::test_copilot_cannot_start_conversation_with_other_account_tree -v --override-ini="addopts=" +``` + +Expected: PASS + +- [ ] **Step 1.5: Run full test suite — must stay green** + +```bash +cd backend && python -m pytest --override-ini="addopts=" +``` + +Expected: all tests pass (or same failures as before this change — do not regress). + +- [ ] **Step 1.6: Commit as standalone hotfix** + +```bash +git add backend/app/services/copilot_service.py backend/tests/test_tenant_isolation_p0.py +git commit -m "fix: CRITICAL — scope copilot tree query to current account + +A user who knew another account's tree UUID could start a copilot +conversation, causing the tree's full node structure, names, and +descriptions to be sent to the AI as part of the system prompt. + +Fix: add account_id (or is_default / visibility='public') filter to +the tree SELECT in copilot_service.start_conversation(). Returns 404 +for inaccessible trees. Test added in test_tenant_isolation_p0.py. + +Co-Authored-By: Claude Sonnet 4.6 " +``` + +--- + +## Task 2: Add `tenant_filter()` and `get_tenant_context` + +**Files:** +- Modify: `backend/app/core/filters.py` +- Modify: `backend/app/api/deps.py` + +These are the canonical patterns every subsequent fix and future feature uses. No new tests needed — the patterns are exercised by Tasks 3–5 tests. + +--- + +- [ ] **Step 2.1: Add `tenant_filter()` to `filters.py`** + +In `backend/app/core/filters.py`, replace the entire file: + +```python +""" +Centralized query filters for ResolutionFlow. + +Provides reusable SQLAlchemy filter builders for tree access control, +step visibility, and the canonical tenant_filter used by all queries +on tenant-scoped tables. +""" +from __future__ import annotations +import uuid +from typing import TYPE_CHECKING + +from sqlalchemy import or_, and_, true as sa_true + +if TYPE_CHECKING: + from app.models.user import User + + +def tenant_filter(model, account_id: uuid.UUID): + """Primary app-layer tenant filter. + + MUST be used in every SELECT/UPDATE/DELETE on tenant tables. + RLS (Phase 2) is the safety net — this is the primary enforcement. + + Usage: + stmt = select(Tree).where(tenant_filter(Tree, current_user.account_id), ...) + """ + return model.account_id == account_id + + +def build_tree_access_filter(current_user: User): + """Build the access filter for trees based on user permissions. + + Visibility rules: + - super_admin: sees everything + - is_default: visible to all authenticated users + - visibility='public': visible to all authenticated users + - author_id == me: always visible (regardless of visibility setting) + - visibility='team' AND account_id == mine: visible to account members + - visibility='private': only visible to author (covered by author_id check above) + - visibility='link': only visible to author (share token access is handled separately) + """ + from app.models.tree import Tree + + if current_user.is_super_admin: + return sa_true() + + conditions = [ + Tree.is_default == True, + Tree.visibility == 'public', + Tree.author_id == current_user.id, + ] + if current_user.account_id: + # Team-visible trees: use tenant_filter as the account match + conditions.append( + and_( + Tree.visibility == 'team', + tenant_filter(Tree, current_user.account_id), + ) + ) + return or_(*conditions) + + +def build_step_visibility_filter(current_user: User): + """Build SQLAlchemy filter for step visibility based on user. + + Returns steps that are: + - Public steps (visible to all) + - Team steps (visible to same account members) + - User's own private steps + """ + from app.models.step_library import StepLibrary + + if current_user.account_id: + return or_( + StepLibrary.visibility == 'public', + and_( + StepLibrary.visibility == 'team', + tenant_filter(StepLibrary, current_user.account_id), + ), + StepLibrary.created_by == current_user.id, + ) + else: + return or_( + StepLibrary.visibility == 'public', + StepLibrary.created_by == current_user.id, + ) +``` + +- [ ] **Step 2.2: Add `get_tenant_context` to `deps.py`** + +In `backend/app/api/deps.py`, append at the end of the file (after line 193): + +```python + + +async def get_tenant_context( + current_user: Annotated[User, Depends(get_current_active_user)], +) -> UUID: + """Return the current user's account_id. + + Use this dependency instead of reading current_user.account_id directly. + Raises 403 if the user has no account association (should not happen in + normal flows — users are always associated with an account on registration). + """ + if current_user.account_id is None: + raise HTTPException( + status_code=status.HTTP_403_FORBIDDEN, + detail="User not associated with any account", + ) + return current_user.account_id +``` + +- [ ] **Step 2.3: Run full test suite to confirm no regressions** + +```bash +cd backend && python -m pytest --override-ini="addopts=" +``` + +Expected: all tests pass. + +- [ ] **Step 2.4: Commit** + +```bash +git add backend/app/core/filters.py backend/app/api/deps.py +git commit -m "feat: add tenant_filter() helper and get_tenant_context dependency + +tenant_filter(model, account_id) is the canonical app-layer tenant +scoping expression. Every query on a tenant table must use it. +build_tree_access_filter and build_step_visibility_filter updated +to call tenant_filter() internally for the account_id match. + +get_tenant_context is a FastAPI dependency that returns account_id +or raises 403 if the user has no account — prevents raw access to +current_user.account_id and centralises the null check. + +Co-Authored-By: Claude Sonnet 4.6 " +``` + +--- + +## Task 3: Fix analytics flow endpoint + +**Files:** +- Modify: `backend/app/api/endpoints/analytics.py:294` +- Test: `backend/tests/test_tenant_isolation_p0.py` + +--- + +- [ ] **Step 3.1: Write the failing test** + +Add to `backend/tests/test_tenant_isolation_p0.py`: + +```python +# ── Task 3: Analytics flow endpoint ────────────────────────────────────────── + +@pytest.mark.asyncio +async def test_analytics_flow_cannot_read_other_account_tree( + client: AsyncClient, test_db: AsyncSession +): + """Account A cannot read flow analytics for Account B's private tree.""" + acct_a, user_a, pass_a = await _create_account_and_user(test_db, "anl-a") + acct_b, user_b, pass_b = await _create_account_and_user(test_db, "anl-b") + tree_b = await _create_private_tree(test_db, acct_b, user_b) + await test_db.commit() + + headers_a = await _login(client, user_a.email, pass_a) + + resp = await client.get( + f"/api/v1/analytics/flows/{tree_b.id}", + headers=headers_a, + ) + assert resp.status_code == 404, f"Expected 404, got {resp.status_code}: {resp.text}" +``` + +- [ ] **Step 3.2: Run test — confirm it currently fails (returns 200)** + +```bash +cd backend && python -m pytest tests/test_tenant_isolation_p0.py::test_analytics_flow_cannot_read_other_account_tree -v --override-ini="addopts=" +``` + +Expected: FAIL (returns 200 — confirms gap is real). + +- [ ] **Step 3.3: Fix `analytics.py` — add tenant_filter to tree fetch** + +In `backend/app/api/endpoints/analytics.py`: + +Add the import at the top (after existing imports on line 9): + +```python +from app.core.filters import tenant_filter +``` + +Replace lines 293–297: + +```python +# BEFORE: + # Verify tree exists + result = await db.execute(select(Tree).where(Tree.id == tree_id)) + tree = result.scalar_one_or_none() + if not tree: + raise HTTPException(status_code=404, detail="Flow not found") +``` + +```python +# AFTER: + # Verify tree exists and belongs to the requesting user's account. + result = await db.execute( + select(Tree).where( + Tree.id == tree_id, + tenant_filter(Tree, current_user.account_id), + ) + ) + tree = result.scalar_one_or_none() + if not tree: + raise HTTPException(status_code=404, detail="Flow not found") +``` + +- [ ] **Step 3.4: Run test — must pass** + +```bash +cd backend && python -m pytest tests/test_tenant_isolation_p0.py::test_analytics_flow_cannot_read_other_account_tree -v --override-ini="addopts=" +``` + +Expected: PASS + +- [ ] **Step 3.5: Commit** + +```bash +git add backend/app/api/endpoints/analytics.py backend/tests/test_tenant_isolation_p0.py +git commit -m "fix: scope analytics/flows/{tree_id} to requesting account + +Any authenticated user could read flow analytics (session counts, +completion rates, CSAT) for any tree UUID. Now returns 404 if the +tree doesn't belong to the requesting account. + +Co-Authored-By: Claude Sonnet 4.6 " +``` + +--- + +## Task 4: Fix category tree count scope + +**Files:** +- Modify: `backend/app/api/endpoints/categories.py:112` +- Test: `backend/tests/test_tenant_isolation_p0.py` + +--- + +- [ ] **Step 4.1: Write the failing test** + +Add to `backend/tests/test_tenant_isolation_p0.py`: + +```python +# ── Task 4: Category tree count ─────────────────────────────────────────────── + +@pytest.mark.asyncio +async def test_category_tree_count_scoped_to_account( + client: AsyncClient, test_db: AsyncSession +): + """tree_count on a category must not include trees from other accounts.""" + from app.models.category import TreeCategory + + acct_a, user_a, pass_a = await _create_account_and_user(test_db, "cat-a") + acct_b, user_b, pass_b = await _create_account_and_user(test_db, "cat-b") + + # Shared category (account_id=None means global) + category = TreeCategory( + name="Shared Category", + slug=f"shared-cat-{uuid.uuid4().hex[:6]}", + account_id=None, + is_active=True, + ) + test_db.add(category) + await test_db.flush() + + # 3 trees for account_b under this category + for i in range(3): + tree = Tree( + name=f"B Tree {i}", + account_id=acct_b.id, + author_id=user_b.id, + category_id=category.id, + visibility="team", + tree_type="troubleshooting", + tree_structure={"id": "root", "type": "start", "children": []}, + is_active=True, + status="published", + ) + test_db.add(tree) + + # 1 tree for account_a under this category + tree_a = Tree( + name="A Tree", + account_id=acct_a.id, + author_id=user_a.id, + category_id=category.id, + visibility="team", + tree_type="troubleshooting", + tree_structure={"id": "root", "type": "start", "children": []}, + is_active=True, + status="published", + ) + test_db.add(tree_a) + await test_db.commit() + + headers_a = await _login(client, user_a.email, pass_a) + + resp = await client.get( + f"/api/v1/categories/{category.id}", + headers=headers_a, + ) + assert resp.status_code == 200, resp.text + # account_a should only see their 1 tree, not account_b's 3 + assert resp.json()["tree_count"] == 1, ( + f"Expected tree_count=1 (own trees only), got {resp.json()['tree_count']}" + ) +``` + +- [ ] **Step 4.2: Run test — confirm it currently fails (count=4)** + +```bash +cd backend && python -m pytest tests/test_tenant_isolation_p0.py::test_category_tree_count_scoped_to_account -v --override-ini="addopts=" +``` + +Expected: FAIL (tree_count=4 — includes account_b's trees). + +- [ ] **Step 4.3: Fix `categories.py` — scope tree count query** + +In `backend/app/api/endpoints/categories.py`: + +Add import at top (alongside existing imports): + +```python +from app.core.filters import tenant_filter +``` + +Replace lines 111–117 (the count query block): + +```python +# BEFORE: + # Get tree count + count_query = select(func.count(Tree.id)).where( + Tree.category_id == category.id, + Tree.is_active == True + ) + count_result = await db.execute(count_query) + tree_count = count_result.scalar() or 0 +``` + +```python +# AFTER: + # Get tree count — scoped to the requesting account so cross-account + # trees in shared categories are not counted. + count_query = select(func.count(Tree.id)).where( + Tree.category_id == category.id, + Tree.is_active == True, + tenant_filter(Tree, current_user.account_id), + ) + count_result = await db.execute(count_query) + tree_count = count_result.scalar() or 0 +``` + +- [ ] **Step 4.4: Run test — must pass** + +```bash +cd backend && python -m pytest tests/test_tenant_isolation_p0.py::test_category_tree_count_scoped_to_account -v --override-ini="addopts=" +``` + +Expected: PASS + +- [ ] **Step 4.5: Commit** + +```bash +git add backend/app/api/endpoints/categories.py backend/tests/test_tenant_isolation_p0.py +git commit -m "fix: scope category tree_count to requesting account + +tree_count on GET /categories/{id} was including trees from all +accounts, leaking cross-tenant row counts. + +Co-Authored-By: Claude Sonnet 4.6 " +``` + +--- + +## Task 5: Fix AI session search scope + +**Files:** +- Modify: `backend/app/api/endpoints/ai_sessions.py:765-777` +- Test: `backend/tests/test_tenant_isolation_p0.py` + +**Context:** The search endpoint used `OR(user_id == me, account_id == mine)`, exposing `problem_summary`, `problem_domain`, and `status` of other users' sessions within the same account. Sessions are user-scoped only. The list endpoint (`GET /ai-sessions`) already restricts to `user_id`. Both must behave consistently. + +--- + +- [ ] **Step 5.1: Write the failing test** + +Add to `backend/tests/test_tenant_isolation_p0.py`: + +```python +# ── Task 5: AI session search scope ────────────────────────────────────────── + +@pytest.mark.asyncio +async def test_ai_session_search_cannot_see_other_users_sessions( + client: AsyncClient, test_db: AsyncSession +): + """User A cannot find User B's AI sessions via the search endpoint, + even when both users are in the same account.""" + from app.models.ai_session import AISession + + # Two users in the SAME account + account = Account(name="Shared Corp", slug=f"shared-{uuid.uuid4().hex[:6]}") + test_db.add(account) + await test_db.flush() + + password = "TestPass123!" + user_a = User( + email=f"user-a-{uuid.uuid4().hex[:6]}@shared.com", + name="User A", + hashed_password=get_password_hash(password), + is_active=True, + account_id=account.id, + account_role="engineer", + ) + user_b = User( + email=f"user-b-{uuid.uuid4().hex[:6]}@shared.com", + name="User B", + hashed_password=get_password_hash(password), + is_active=True, + account_id=account.id, + account_role="engineer", + ) + test_db.add_all([user_a, user_b]) + await test_db.flush() + + # Session belonging to user_b with distinctive problem_summary + session_b = AISession( + user_id=user_b.id, + account_id=account.id, + problem_summary="CONFIDENTIAL: user_b's session", + problem_domain="networking", + status="resolved", + ) + test_db.add(session_b) + await test_db.commit() + + headers_a = await _login(client, user_a.email, password) + + resp = await client.get( + "/api/v1/ai-sessions/search", + params={"q": "CONFIDENTIAL"}, + headers=headers_a, + ) + assert resp.status_code == 200, resp.text + results = resp.json() + ids = [r["id"] for r in results] + assert str(session_b.id) not in ids, ( + "User A can see User B's session via search — cross-user leak within account" + ) +``` + +- [ ] **Step 5.2: Run test — confirm it currently fails** + +```bash +cd backend && python -m pytest tests/test_tenant_isolation_p0.py::test_ai_session_search_cannot_see_other_users_sessions -v --override-ini="addopts=" +``` + +Expected: FAIL (user_b's session appears in results). + +- [ ] **Step 5.3: Fix `ai_sessions.py` — restrict search to `user_id`** + +In `backend/app/api/endpoints/ai_sessions.py`, find the `search_sessions` function (around line 755) and replace the `.where()` clause: + +```python +# BEFORE: + result = await db.execute( + select(AISession) + .where( + or_( + AISession.user_id == current_user.id, + AISession.account_id == current_user.account_id, + ), + text("ai_sessions.search_vector @@ plainto_tsquery('english', :q)"), + ) + .params(q=q) + .order_by(AISession.created_at.desc()) + .limit(limit) + ) +``` + +```python +# AFTER: + # Sessions are user-scoped. The list endpoint uses user_id only; + # search must be consistent. Cross-user access requires explicit + # escalation or session sharing — not ambient account membership. + result = await db.execute( + select(AISession) + .where( + AISession.user_id == current_user.id, + text("ai_sessions.search_vector @@ plainto_tsquery('english', :q)"), + ) + .params(q=q) + .order_by(AISession.created_at.desc()) + .limit(limit) + ) +``` + +Also remove the now-unused `or_` import from the `search_sessions` function if it was only used there. (Do not remove `or_` if it appears elsewhere in the file.) + +- [ ] **Step 5.4: Run test — must pass** + +```bash +cd backend && python -m pytest tests/test_tenant_isolation_p0.py::test_ai_session_search_cannot_see_other_users_sessions -v --override-ini="addopts=" +``` + +Expected: PASS + +- [ ] **Step 5.5: Commit** + +```bash +git add backend/app/api/endpoints/ai_sessions.py backend/tests/test_tenant_isolation_p0.py +git commit -m "fix: restrict AI session search to current user only + +Search endpoint used OR(user_id, account_id), exposing other users' +problem_summary and problem_domain within the same account. Sessions +are user-scoped only — cross-user access requires explicit escalation +or sharing. List and search endpoints now behave consistently. + +Co-Authored-By: Claude Sonnet 4.6 " +``` + +--- + +## Task 6: UUID endpoint audit and gap fixes + +**Files:** +- Read: all files in `backend/app/api/endpoints/` +- Modify: whichever files the audit finds gaps in +- Test: `backend/tests/test_tenant_isolation_p0.py` + +**Goal:** Systematically check every route with a `{resource_id}` URL parameter. Verify that each one either (a) filters by `id AND account_id` in the query, or (b) calls a permission function on the fetched object that checks account ownership. + +--- + +- [ ] **Step 6.1: Run the audit** + +For each file listed below, scan every function decorated with `@router.get`, `@router.put`, `@router.patch`, `@router.delete` that has a path like `/{something_id}`. For each such function, answer: + +1. Does the query filter by both `id` AND `account_id` (or `tenant_filter`)? +2. If not, does it call a permission check (`can_access_*`, `can_edit_*`, etc.) on the fetched object? +3. If neither, it's a gap. + +**Endpoint files to audit:** + +``` +backend/app/api/endpoints/trees.py +backend/app/api/endpoints/sessions.py +backend/app/api/endpoints/steps.py +backend/app/api/endpoints/categories.py ← already fixed (Task 4) +backend/app/api/endpoints/analytics.py ← already fixed (Task 3) +backend/app/api/endpoints/copilot.py +backend/app/api/endpoints/ai_sessions.py ← search fixed (Task 5) +backend/app/api/endpoints/assistant_chat.py +backend/app/api/endpoints/integrations.py +backend/app/api/endpoints/flow_proposals.py +backend/app/api/endpoints/maintenance_schedules.py +backend/app/api/endpoints/kb_accelerator.py +backend/app/api/endpoints/flowpilot_analytics.py +backend/app/api/endpoints/shares.py +backend/app/api/endpoints/uploads.py +backend/app/api/endpoints/tags.py +backend/app/api/endpoints/step_categories.py +backend/app/api/endpoints/notifications.py +backend/app/api/endpoints/survey.py +``` + +For each gap found: classify severity (CRITICAL / HIGH / MEDIUM / LOW), document file + line number, fix using the fetch-and-verify pattern: + +```python +# Standard fix pattern — fetch with account_id filter, return 404 if not found +from app.core.filters import tenant_filter + +stmt = select(Model).where( + Model.id == resource_id, + tenant_filter(Model, current_user.account_id), +) +resource = (await db.execute(stmt)).scalar_one_or_none() +if not resource: + raise HTTPException(status_code=404) # Not 403 — never reveal existence +``` + +**Known findings from prior audit (do not re-verify, just fix):** + +| File | Route | Severity | Status | +|---|---|---|---| +| `copilot.py` | `POST /copilot/conversations` | CRITICAL | Fixed (Task 1) | +| `analytics.py` | `GET /analytics/flows/{tree_id}` | LOW | Fixed (Task 3) | +| `categories.py` | tree_count in `GET /categories/{id}` | LOW | Fixed (Task 4) | +| `ai_sessions.py` | `GET /ai-sessions/search` | LOW | Fixed (Task 5) | +| `steps.py` | `get_step_or_404` — 403 vs 404 on non-existent UUID | Very Low | Audit only — fix if severity warrants | +| `flowpilot_analytics.py` | Needs manual review | Unknown | Audit now | + +- [ ] **Step 6.2: For each gap found, write a test, then apply the fix** + +Use the same TDD pattern as Tasks 1–5: + +```python +# Test template for a gap in MyEndpoint +@pytest.mark.asyncio +async def test_cannot_access_other_account_( + client: AsyncClient, test_db: AsyncSession +): + """Account A cannot access Account B's by UUID.""" + acct_a, user_a, pass_a = await _create_account_and_user(test_db, "-a") + acct_b, user_b, pass_b = await _create_account_and_user(test_db, "-b") + resource_b = + await test_db.commit() + + headers_a = await _login(client, user_a.email, pass_a) + resp = await client.get(f"/api/v1//{resource_b.id}", headers=headers_a) + assert resp.status_code == 404, f"Expected 404, got {resp.status_code}: {resp.text}" +``` + +- [ ] **Step 6.3: After all gaps fixed, run full test suite** + +```bash +cd backend && python -m pytest --override-ini="addopts=" +``` + +Expected: all tests pass. + +- [ ] **Step 6.4: Commit all audit fixes together** + +```bash +git add backend/app/api/endpoints/ backend/tests/test_tenant_isolation_p0.py +git commit -m "fix: UUID endpoint audit — add missing ownership checks + +Audit of all {resource_id} endpoints. Gaps found and fixed: + + +All fixed endpoints now return 404 (not 403) for cross-tenant IDs. + +Co-Authored-By: Claude Sonnet 4.6 " +``` + +--- + +## Task 7: TargetList dead code audit + teams orphan check + +**Files:** +- No code changes (audit and report only) +- Report findings documented in a comment on this plan PR or in `docs/` + +--- + +- [ ] **Step 7.1: TargetList dead code audit** + +Run from the repo root: + +```bash +grep -rn "TargetList\|target_list\|target-list\|target_lists" \ + backend/app/api \ + backend/app/services \ + backend/app/schemas \ + frontend/src \ + --include="*.py" --include="*.ts" --include="*.tsx" \ + | grep -v "__pycache__" \ + | grep -v "test_target" +``` + +Record the output. Then query production/staging DB row count: + +```bash +# Run via docker exec or psql connection +docker exec -it resolutionflow_postgres psql -U postgres -d patherly \ + -c "SELECT COUNT(*) FROM target_lists;" +``` + +**Decision:** +- Zero code references AND zero rows → drop the table in Phase 1 migration +- Zero rows but code references exist → add TargetList deprecation cleanup to Phase 1 scope +- Rows exist → add TargetList migration to Phase 1 scope (backfill `account_id` from `team_id → teams → users WHERE is_team_admin`) + +- [ ] **Step 7.2: Teams orphan check** + +```bash +docker exec -it resolutionflow_postgres psql -U postgres -d patherly -c " +SELECT COUNT(*) AS orphaned_teams +FROM teams t +LEFT JOIN users u ON u.team_id = t.id AND u.account_id IS NOT NULL +WHERE u.id IS NULL; +" +``` + +**Decision:** +- Count = 0 → proceed to Phase 1 schema migration without concern +- Count > 0 → document which teams are orphaned, resolve before any Phase 1 backfill involving `team_id → account_id` chains + +- [ ] **Step 7.3: Document results** + +In `docs/superpowers/specs/2026-04-09-tenant-data-isolation-design.md`, Section 9 Open Questions, replace the TargetList row: + +```markdown +# BEFORE: +| TargetList: zero references + zero rows? Or does data exist? | Determines whether table is dropped or migrated. | Phase 0 audit | + +# AFTER (fill in actual result, e.g.): +| TargetList audit complete: zero rows, zero non-test code references. Decision: drop table in Phase 1. | Resolved — drop in Phase 1. | ✓ Done | +``` + +Also add a row for the teams orphan result: + +```markdown +| Teams orphan check: N orphaned teams found. | Phase 1 backfills using team→account chain safe to proceed (or: N teams need resolution before Phase 1). | ✓ Done | +``` + +- [ ] **Step 7.4: Commit the updated spec** + +```bash +git add docs/superpowers/specs/2026-04-09-tenant-data-isolation-design.md +git commit -m "docs: record Phase 0 audit results — TargetList and teams orphan check + +Co-Authored-By: Claude Sonnet 4.6 " +``` + +--- + +## Task 8: CI tenant-filter grep check + +**Files:** +- Create: `backend/scripts/check_tenant_filters.py` +- Modify: `.github/workflows/ci.yml` + +**Goal:** Warn (not yet block) when endpoint or service files contain `select(TenantModel)` patterns without `tenant_filter` or `account_id` in the surrounding context. Active from Phase 1 forward, so all Phase 1 work is gated by it. Switch to block (exit code 1) after 2 weeks of false-positive calibration. + +--- + +- [ ] **Step 8.1: Create the check script** + +Create `backend/scripts/check_tenant_filters.py`: + +```python +""" +Tenant filter enforcement check. + +Scans endpoint and service files for SQLAlchemy select() calls on known +tenant tables and warns when account_id or tenant_filter is not present +in the surrounding 15 lines (the typical extent of a single query). + +Usage: + python scripts/check_tenant_filters.py # warn mode (exits 0) + python scripts/check_tenant_filters.py --fail # block mode (exits 1 on findings) +""" +import re +import sys +from pathlib import Path + +# Tables that must always be filtered by account_id or tenant_filter. +# Extend this list as new tenant tables are added. +TENANT_MODELS = [ + "Tree", "AISession", "Session", "StepLibrary", "FlowProposal", + "CopilotConversation", "AssistantChat", "FileUpload", "KBImport", + "PsaConnection", "PsaPostLog", "PsaMemberMapping", "AIChatSession", + "AIConversation", "AIUsage", "Subscription", "AccountInvite", + "Notification", "NotificationConfig", "SessionShare", "UserFolder", + "UserPinnedTree", "SessionBranch", "SessionHandoff", + "SessionResolutionOutput", "ForkPoint", "AISessionStep", + "AISuggestion", "StepCategory", "TreeCategory", "TreeTag", + "Attachment", "SessionSupportingData", "MaintenanceSchedule", + "AuditLog", "ScriptBuilderSession", "ScriptTemplate", + "StepRating", "StepUsageLog", "AISession", +] + +# Directories to scan +SCAN_DIRS = [ + Path("app/api/endpoints"), + Path("app/services"), +] + +# Patterns that indicate the query is correctly scoped +SAFE_PATTERNS = [ + r"tenant_filter", + r"account_id", + r"is_super_admin", # Super admin queries intentionally bypass tenant filter + r"# cross-tenant: approved", # Explicit approval comment +] + +SKIP_FILES = { + "admin.py", # Super admin endpoints intentionally bypass tenant filter +} + +findings = [] + +for scan_dir in SCAN_DIRS: + if not scan_dir.exists(): + continue + for path in sorted(scan_dir.glob("*.py")): + if path.name in SKIP_FILES: + continue + lines = path.read_text().splitlines() + for i, line in enumerate(lines): + for model in TENANT_MODELS: + if re.search(rf"\bselect\s*\(\s*{model}\b", line): + # Check surrounding 15 lines for a safe pattern + start = max(0, i - 2) + end = min(len(lines), i + 15) + context = "\n".join(lines[start:end]) + if not any(re.search(p, context) for p in SAFE_PATTERNS): + findings.append( + f"{path}:{i + 1}: select({model}) — no tenant_filter or account_id found in context" + ) + +if findings: + print(f"\n⚠ Tenant filter check — {len(findings)} warning(s):\n") + for f in findings: + print(f" {f}") + print() + if "--fail" in sys.argv: + print("Run with --fail: exiting 1") + sys.exit(1) + else: + print("Run in warn mode — not blocking. Pass --fail to block.") + sys.exit(0) +else: + print("✓ Tenant filter check passed — no unscoped tenant table queries found.") + sys.exit(0) +``` + +- [ ] **Step 8.2: Test the script locally — it should find zero issues after Tasks 1–6** + +```bash +cd backend && python scripts/check_tenant_filters.py +``` + +Expected output: +``` +✓ Tenant filter check passed — no unscoped tenant table queries found. +``` + +If it reports false positives, add the file to `SKIP_FILES` or add `# cross-tenant: approved` comment to the query. **Do not suppress real gaps** — fix them as Task 6 additions. + +- [ ] **Step 8.3: Add the check to CI** + +In `.github/workflows/ci.yml`, add this step to the `backend` job, immediately after the `Install dependencies` step and before `Run tests with coverage`: + +```yaml + - name: Check tenant filter enforcement + run: cd backend && python scripts/check_tenant_filters.py + # Warn mode only. Switch to --fail after 2 weeks of calibration. + # See: docs/superpowers/specs/2026-04-09-tenant-data-isolation-design.md Section 3f +``` + +- [ ] **Step 8.4: Run the full CI pipeline locally to confirm the step passes** + +```bash +cd backend && python scripts/check_tenant_filters.py && python -m pytest --override-ini="addopts=" +``` + +Expected: both commands exit 0. + +- [ ] **Step 8.5: Commit** + +```bash +git add backend/scripts/check_tenant_filters.py .github/workflows/ci.yml +git commit -m "chore: add CI tenant-filter grep check (warn mode) + +Scans endpoint and service files for select() calls on tenant tables +without tenant_filter or account_id in the surrounding context. +Running in warn mode (exit 0) — switch to --fail after 2-week +calibration period to block violating PRs. + +See spec Section 3f for background. + +Co-Authored-By: Claude Sonnet 4.6 " +``` + +--- + +## Phase 0 Gate Verification + +Before declaring Phase 0 complete and starting Phase 1, verify every item: + +- [ ] All tests in `test_tenant_isolation_p0.py` pass +- [ ] `python scripts/check_tenant_filters.py` exits 0 with no findings +- [ ] Full test suite passes: `cd backend && python -m pytest --override-ini="addopts="` +- [ ] Frontend builds: `cd frontend && npm run build` +- [ ] TargetList audit result documented in spec (Task 7.3) +- [ ] Teams orphan count documented in spec (Task 7.3) +- [ ] CI `check_tenant_filters` step added to `.github/workflows/ci.yml` + +--- + +## What comes next + +Phase 1 (schema migration) requires its own plan. Write it once Phase 0 gate is green. + +Phase 1 scope (from spec Section 7): +- Add `account_id NOT NULL` to ~20 tables that currently lack it (backfill sequences) +- Make nullable `account_id` NOT NULL on existing models (Users, Trees, etc.) +- Migrate ScriptBuilderSession, ScriptTemplate, ScriptGeneration from `team_id` +- TargetList: drop or migrate per Phase 0 audit result +- Create `template_trees` and `platform_steps` tables + +Spec: `docs/superpowers/specs/2026-04-09-tenant-data-isolation-design.md` Sections 1 and 5. diff --git a/docs/superpowers/specs/2026-04-09-tenant-data-isolation-design.md b/docs/superpowers/specs/2026-04-09-tenant-data-isolation-design.md new file mode 100644 index 00000000..be49cc05 --- /dev/null +++ b/docs/superpowers/specs/2026-04-09-tenant-data-isolation-design.md @@ -0,0 +1,654 @@ +# Tenant Data Isolation — Design Spec + +> **Date:** 2026-04-09 +> **Status:** Approved — ready for implementation planning +> **Approach:** PostgreSQL RLS as safety net; application-layer filtering as primary enforcement + +--- + +## Overview + +ResolutionFlow is a multi-tenant SaaS. The tenant boundary is `account_id` (UUID foreign key to `accounts.id`). Two accounts must be completely isolated at every layer: one client must never be able to see, access, query, or receive data belonging to another client — even if there is a bug in application code. + +This spec establishes the complete foundation that must be in place before further feature development proceeds. + +### Design Principle + +**Application code is the primary enforcement mechanism. PostgreSQL Row-Level Security (RLS) is the safety net.** + +Both layers must always be present. Developers must never rely on RLS to do filtering that application code should be doing. A missing `tenant_filter()` in application code is a code review failure even if RLS would have caught it. + +--- + +## Current State + +- Tenant boundary: `account_id` (not `team_id` — that column is legacy) +- Isolation today: 100% application-layer, manual per-endpoint `.where(account_id == ...)` +- No RLS, no DB session variables, no middleware tenant injection +- Nullable `account_id` on ~8 core models (User, Tree, TreeCategory, etc.) +- ~20 tables with no direct `account_id` (scoped only through join chains) +- 4 models with `team_id` only, no `account_id`: TargetList, ScriptBuilderSession, ScriptTemplate, ScriptGeneration +- Known existing gaps: see Section 4 + +--- + +## Section 1: Schema Changes + +### 1a. Denormalize `account_id` onto all tenant-relevant tables + +Add `account_id UUID NOT NULL` (with FK to `accounts.id` and index) to each of the following tables that currently lack it: + +| Table | Backfill path | +|---|---| +| `sessions` | `sessions.user_id → users.account_id` | +| `attachments` | `attachments.session_id → sessions.user_id → users.account_id` | +| `session_supporting_data` | `session_supporting_data.session_id → sessions.user_id → users.account_id` | +| `audit_logs` | `audit_logs.user_id → users.account_id` | +| `maintenance_schedules` | `maintenance_schedules.tree_id → trees.account_id` | +| `user_folders` | `user_folders.user_id → users.account_id` | +| `user_pinned_trees` | `user_pinned_trees.user_id → users.account_id` | +| `session_branches` | `session_branches.ai_session_id → ai_sessions.account_id` (verify column name — may be `session_id`) | +| `session_handoffs` | `session_handoffs.ai_session_id → ai_sessions.account_id` (verify column name — may be `session_id`) | +| `session_resolution_outputs` | `session_resolution_outputs.session_id → sessions.user_id → users.account_id` | +| `fork_points` | `fork_points.session_id → ai_sessions.account_id` | +| `ai_session_steps` | `ai_session_steps.session_id → ai_sessions.account_id` | +| `ai_suggestions` | `ai_suggestions.tree_id → trees.account_id` | +| `step_ratings` | `step_ratings.user_id → users.account_id` (backfill from rater, not the step) | +| `step_usage_logs` | `step_usage_logs.user_id → users.account_id` (backfill from user, not the step) | +| `psa_post_logs` | `psa_post_logs.psa_connection_id → psa_connections.account_id` (psa_connections already has account_id NOT NULL) | +| `psa_member_mappings` | `psa_member_mappings.psa_connection_id → psa_connections.account_id` | +| `notification_logs` | `notification_logs.notification_config_id → notification_configs.account_id` | + +**Deferred:** `tree_shares` — backfill strategy and RLS policy depend on whether sharing is intra-tenant only or cross-tenant. Must be resolved before RLS is enabled on this table. See Section 7 (Open Questions). + +### 1b. Make existing nullable `account_id` columns NOT NULL + +These tables have `account_id` already but it is nullable: + +- `users` +- `trees` +- `tree_categories` +- `tree_tags` +- `step_categories` +- `step_library` +- `tree_embeddings` +- `feedback` + +Action: Backfill any NULL rows (assign to correct account or delete if orphaned test/seed data), then `ALTER COLUMN account_id SET NOT NULL`. + +### 1c. Global / template content + +Move global content out of tenant tables into dedicated tables with no RLS: + +- **`template_trees`** — default and publicly visible trees (currently `is_default=True` or `visibility='public'` in `trees`). All tenants can read; no RLS. +- **`platform_steps`** — public steps from step library (currently `visibility='public'` in `step_library`). All tenants can read; no RLS. +- Tables already global (no change needed): `script_categories`, `platform_settings`, `plan_limits`, `feature_flags`, `plan_feature_defaults`. + +### 1d. Migration sequence (per table) + +Every table that gains `account_id` follows this exact sequence. Each step must succeed before the next begins: + +1. `ADD COLUMN account_id UUID` (nullable) +2. Backfill via `UPDATE ... JOIN ...` +3. `SELECT COUNT(*) WHERE account_id IS NULL` — must be zero before proceeding +4. `ALTER COLUMN account_id SET NOT NULL` +5. `CREATE INDEX ON (account_id)` +6. Enable RLS (in Phase 3, not here) + +Any migration that cannot reach step 3 (zero NULLs) must roll back completely. No partial state. + +--- + +## Section 2: PostgreSQL Roles & RLS Infrastructure + +### 2a. Database roles + +Two PostgreSQL roles: + +**`resolutionflow_app`** +- Used by all application requests +- Subject to all RLS policies +- Standard table privileges: `SELECT, INSERT, UPDATE, DELETE` on all tenant tables + +**`resolutionflow_admin`** +- Used by: Alembic migrations, seed scripts, super admin API endpoints, scheduled background jobs requiring cross-tenant access +- Has `BYPASSRLS` attribute — not subject to RLS policies +- Same table privileges as `resolutionflow_app` +- Connection string exposed as `DATABASE_ADMIN_URL` env var + +The current `postgres` superuser is replaced in the application by these two roles. The connection string in `DATABASE_URL` transitions to `resolutionflow_app`. Alembic uses `DATABASE_URL_SYNC` pointing to `resolutionflow_admin`. + +### 2b. Per-request tenant context injection + +Every request that passes through `get_db()` must execute, inside a transaction boundary: + +```sql +SET LOCAL app.current_account_id = ''; +``` + +`SET LOCAL` is transaction-scoped — it resets automatically when the transaction ends. No cleanup needed. This is implemented in a modified `get_db()` dependency that receives `current_user` and executes the SET before yielding the session. + +**Fail-closed behavior:** If `app.current_account_id` is not set (e.g., a bug where `SET LOCAL` was skipped), `current_setting('app.current_account_id', false)::uuid` returns NULL. `NULL = NULL` is false in SQL — the RLS policy matches zero rows. This is the correct fail-closed behavior. + +Public endpoints (no authenticated user) do not call `SET LOCAL`. RLS matches zero rows for all tenant tables. This is correct. + +### 2c. RLS policy pattern + +Every tenant table (from Section 1 + all existing tables with account_id) gets: + +```sql +ALTER TABLE
ENABLE ROW LEVEL SECURITY; +ALTER TABLE
FORCE ROW LEVEL SECURITY; + +-- SELECT +CREATE POLICY tenant_select ON
FOR SELECT + USING (account_id = current_setting('app.current_account_id', false)::uuid); + +-- INSERT +CREATE POLICY tenant_insert ON
FOR INSERT + WITH CHECK (account_id = current_setting('app.current_account_id', false)::uuid); + +-- UPDATE +CREATE POLICY tenant_update ON
FOR UPDATE + USING (account_id = current_setting('app.current_account_id', false)::uuid) + WITH CHECK (account_id = current_setting('app.current_account_id', false)::uuid); + +-- DELETE +CREATE POLICY tenant_delete ON
FOR DELETE + USING (account_id = current_setting('app.current_account_id', false)::uuid); +``` + +**`FORCE ROW LEVEL SECURITY`** ensures the table owner is also subject to policies. The `resolutionflow_admin` role bypasses via its `BYPASSRLS` attribute, not via ownership. + +**`audit_logs` exception:** SELECT policy only. No `WITH CHECK` on INSERT (app inserts audit logs freely). No UPDATE or DELETE policies ever. These constraints are permanent and must be documented in the migration comment. + +**Global tables** (`platform_settings`, `plan_limits`, `feature_flags`, `plan_feature_defaults`, `template_trees`, `platform_steps`): No RLS. + +### 2d. Connection pool reuse safety + +The `SET LOCAL` approach is transaction-scoped. A connection pool reuse test (see Section 6) must verify that a connection returned to the pool after tenant A's request does not carry tenant A's `account_id` into tenant B's request. This is guaranteed by `SET LOCAL` (resets on transaction end) but must be explicitly verified. + +--- + +## Section 3: Application-Layer Enforcement Patterns + +### 3a. `tenant_filter()` helper + +Add to `backend/app/core/filters.py`: + +```python +def tenant_filter(model, account_id: uuid.UUID): + """ + Primary app-layer tenant filter. + MUST be used in every SELECT/UPDATE/DELETE on tenant tables. + RLS is the safety net — this is the primary enforcement. + """ + return model.account_id == account_id +``` + +All existing filter helpers (`build_tree_access_filter`, `build_step_visibility_filter`) must internally call `tenant_filter()` as their base constraint. + +### 3b. Fetch-and-verify pattern + +For ID-based lookups, filter by **both** `id` AND `account_id` in the query — not fetch-then-check: + +```python +# Correct +stmt = select(Tree).where( + Tree.id == tree_id, + tenant_filter(Tree, current_user.account_id) +) +tree = (await db.execute(stmt)).scalar_one_or_none() +if not tree: + raise HTTPException(status_code=404) + +# Prohibited — fetch first, then check +tree = await db.get(Tree, tree_id) +if tree.account_id != current_user.account_id: + raise HTTPException(status_code=403) # Reveals existence — also wrong +``` + +Endpoints must return **404, not 403**, for cross-tenant ID lookups. Never confirm that a resource exists. + +### 3c. `get_tenant_context` dependency + +Add to `backend/app/api/deps.py`: + +```python +async def get_tenant_context( + current_user: User = Depends(get_current_active_user) +) -> uuid.UUID: + """ + Returns the current user's account_id. + Raises 403 if the user has no account association. + Inject this instead of accessing current_user.account_id directly. + """ + if current_user.account_id is None: + raise HTTPException(status_code=403, detail="User not associated with any account") + return current_user.account_id +``` + +### 3d. Insert pattern + +All inserts on tenant tables must explicitly set `account_id`. RLS `WITH CHECK` rejects inserts where `account_id` doesn't match the session variable, but application code must also set it: + +```python +new_record = Tree( + account_id=tenant_account_id, # Required — never omit + ... +) +``` + +### 3e. Code review checklist rule + +Every PR that touches a tenant table must be verified against: + +1. Every SELECT/UPDATE/DELETE includes `tenant_filter()` or a wrapper that calls it +2. ID lookups filter by both `id` and `account_id` in the same query +3. Inserts explicitly set `account_id` +4. 404 (not 403) returned for cross-tenant ID lookups +5. A cross-tenant isolation test is included (Phase 2 onwards) + +### 3f. CI grep check + +A grep-based CI check is active from the end of Phase 0 on all PRs. It flags: +- Queries on known tenant tables that don't include `tenant_filter` or `account_id` as a filter term +- Initial implementation: warn (not block) to allow calibration; switch to block after 2 weeks of false-positive tuning + +Pattern to be defined during Phase 0 implementation. + +--- + +## Section 4: Existing Gap Fixes + +### Immediate Hotfix (ships before Phase 0) + +**CRITICAL: Copilot tree access bypass** +- **Files:** `backend/app/api/endpoints/copilot.py`, `backend/app/services/copilot_service.py` +- **Issue:** `start_conversation()` loads a tree by UUID without verifying the requesting user has access. An attacker who knows a tree UUID from another account can extract its full structure, node names, and descriptions via the AI system prompt. +- **Fix:** Add `can_access_tree(current_user, tree)` check in the endpoint after loading the tree. Also add `tenant_filter(Tree, account_id)` to the tree query in `copilot_service.py`. Raise 404 (not 403) if the tree is not found or not accessible. +- Ships as an independent hotfix PR, merged immediately. + +### Phase 0 Fixes + +**LOW: Analytics flow endpoint** (`backend/app/api/endpoints/analytics.py`) +- `GET /analytics/flows/{tree_id}` returns analytics for any tree by UUID with no access check +- Fix: Add `tenant_filter(Tree, current_user.account_id)` to the tree fetch query. 404 if not found. + +**LOW: Category tree count** (`backend/app/api/endpoints/categories.py`) +- Tree count per category includes trees from all accounts +- Fix: Add `tenant_filter(Tree, current_user.account_id)` to the count subquery + +**LOW: AI session scope inconsistency** (`backend/app/api/endpoints/ai_sessions.py`) +- List endpoint is user-scoped (`user_id == current_user.id`); search endpoint uses `OR(user_id, account_id)` exposing other users' session summaries within the same account +- Sessions are user-scoped only. Cross-user access permitted only via explicit escalation or session sharing +- Fix: Restrict search endpoint to `user_id == current_user.id`. List and search must behave consistently. + +**Phase 0 UUID endpoint audit** +- Systematically review every endpoint with a `{resource_id}` URL parameter +- For each: verify that the ID lookup either (a) filters by `id AND account_id` in the query, or (b) calls `can_access_(current_user, resource)` on the fetched object +- Document every instance found, classify by severity, fix all before Phase 0 closes + +--- + +## Section 5: Legacy `team_id` Migration + +### 5a. TargetList — audit-gated + +Before any migration work on `TargetList`: + +1. Run a full codebase reference audit: grep for all references to `TargetList`, `target_list`, `target-list` +2. Query the production and staging databases for row count +3. Decision tree: + - Zero code references AND zero rows → drop the table entirely + - Zero rows but code references exist → deprecate code, then drop + - Rows exist → migrate (see sequence below) +4. Migration only proceeds after audit result is documented and approved + +If migration proceeds: +- Backfill path: `team_id → teams → users WHERE is_team_admin → account_id` +- If any row cannot be backfilled to a valid `account_id` → **full rollback**, no exceptions, no manual review queues + +### 5b. Pre-migration: teams-to-accounts orphan check + +Before any backfill using `team_id → account_id` chains, run: + +```sql +SELECT COUNT(*) FROM teams t +LEFT JOIN users u ON u.team_id = t.id AND u.account_id IS NOT NULL +WHERE u.id IS NULL; +``` + +Count must be zero before any backfill proceeds. Report and resolve orphaned teams first. + +### 5c. Approved migrations (no audit gate) + +**ScriptBuilderSession:** Add `account_id`, backfill from `user_id → users.account_id`, set NOT NULL. + +**ScriptTemplate:** Add `account_id`, backfill from `created_by_id → users.account_id`, set NOT NULL. + +**ScriptGeneration:** Add `account_id`, backfill from `user_id → users.account_id`, set NOT NULL. + +### 5d. team_id cleanup + +Do NOT drop `team_id` columns during migration. Keep until all application code is updated to use `account_id` exclusively. Drop `team_id` columns in a later cleanup migration after verification. + +--- + +## Section 6: Testing Strategy + +### Phase 1: RLS validation tests + +**File:** `backend/tests/test_rls_isolation.py` + +This test suite validates RLS policies at the database layer, independent of any application code. It connects to the test database using the `resolutionflow_app` role. + +**Setup fixture:** +- Creates two accounts (`account_a`, `account_b`) with seed rows in all tenant tables +- Creates an async DB connection using `resolutionflow_app` role +- Sets `SET LOCAL app.current_account_id = ''` before each test + +**Test cases per table (~5 cases per table, ~32 tables ≈ ~160 total):** + +1. **SELECT isolation** — querying as account_a returns zero rows for account_b's data +2. **INSERT enforcement** — inserting with `account_id = account_b_uuid` raises a PostgreSQL exception (rejected by `WITH CHECK`) +3. **INSERT cross-tenant FK** — inserting with correct `account_id` but a FK value (e.g., `tree_id`) that belongs to account_b is also rejected +4. **UPDATE enforcement** — updating account_b's rows as account_a affects zero rows +5. **DELETE enforcement** — deleting account_b's rows as account_a affects zero rows + +**Fail-closed test:** +```python +# Unset app.current_account_id — must raise a database exception +# Acceptable: psycopg2.errors.InvalidTextRepresentation (NULL::uuid cast fails) +# NOT acceptable: query returning zero rows silently +with pytest.raises(asyncpg.PostgresError): + await conn.execute("SELECT * FROM trees") +``` + +The exact exception type must be documented based on the behavior of `current_setting('app.current_account_id', false)::uuid` when unset. The test asserts on that specific exception. + +**audit_logs special cases:** +- No INSERT `WITH CHECK` test — audit logs must be insertable freely +- No UPDATE or DELETE tests — those policies must not exist +- Verify via `pg_policies` that only a SELECT policy exists on `audit_logs` + +**tree_shares:** Intentionally deferred. A TODO comment is included: +```python +# TODO: tree_shares RLS tests deferred pending sharing model decision. +# Must be added before RLS is enabled on the tree_shares table. +# See: docs/superpowers/specs/2026-04-09-tenant-data-isolation-design.md Section 7 +``` + +**Tables covered in Phase 1:** +`trees`, `ai_sessions`, `ai_chat_sessions`, `ai_conversations`, `sessions`, `step_library`, `tree_categories`, `tree_tags`, `step_categories`, `flow_proposals`, `attachments`, `audit_logs`, `psa_connections`, `copilot_conversations`, `file_uploads`, `kb_imports`, `subscriptions`, `account_invites`, `ai_usage`, `notifications`, `session_shares`, `script_builder_sessions`, `script_templates`, `script_generations`, `maintenance_schedules`, `user_folders`, `user_pinned_trees`, `session_supporting_data`, `session_branches`, `session_resolution_outputs`, `psa_post_logs`, `notification_logs` + +**Connection pool reuse test:** +```python +async def test_set_local_does_not_leak_between_connections(): + """SET LOCAL must not leak account_id when a connection is returned to the pool.""" + conn = await pool.acquire() + async with conn.transaction(): + await conn.execute("SET LOCAL app.current_account_id = $1", str(account_a_id)) + # transaction ends, SET LOCAL resets + # Return conn to pool, re-acquire — verify account_id is not set + conn2 = await pool.acquire() + result = await conn2.fetchval("SELECT current_setting('app.current_account_id', true)") + assert result is None or result == "" +``` + +### Phase 2: Per-endpoint cross-tenant tests + +As each endpoint is touched in any PR from Phase 1 onward, a cross-tenant isolation test is added in the same PR: + +```python +async def test_cannot_access_other_account_( + client_account_a, account_b_resource +): + """Account A cannot access Account B's resource by UUID.""" + response = await client_account_a.get(f"//{account_b_resource.id}") + assert response.status_code == 404 # Not 403 — never reveal existence +``` + +This is a hard requirement per the code review checklist. + +--- + +## Section 7: Phased Rollout Plan + +### Immediate Hotfix (before everything else) + +- Fix CRITICAL: Copilot tree access bypass (see Section 4) +- Ships as independent PR, merged immediately, does not wait for Phase 0 + +--- + +### Phase 0 — Foundation + +Goal: Fix all existing gaps; establish patterns and tooling; gate future PRs. + +1. Fix LOW: Analytics flow endpoint missing ownership check +2. Fix LOW: Category tree count scope +3. Fix LOW: AI session search/list inconsistency (restrict both to `user_id`) +4. **Full UUID endpoint audit** — every `{resource_id}` URL param checked. All gaps documented and fixed before Phase 0 closes. +5. Add `get_tenant_context` dependency to `deps.py` +6. Add `tenant_filter()` helper to `filters.py`. Update existing filter helpers to call it. +7. **Dead code audit:** TargetList references + database row count. Report result. +8. **Orphan check:** Teams without a resolvable account_id. Report result. +9. **Define and activate CI grep check** for missing `tenant_filter()` on tenant tables. Active on all PRs from Phase 1 forward. + +Gate: All gaps patched. CI grep check active. TargetList and team orphan audit results documented. + +--- + +### Phase 1 — Schema Migration + +Goal: Every tenant-relevant table has a direct, NOT NULL `account_id` column. + +10. Add `account_id` to all tables in Section 1a (migration per logical domain group: core sessions, PSA, AI, steps, notifications) +11. Make nullable `account_id` NOT NULL on models from Section 1b +12. Migrate `ScriptBuilderSession`, `ScriptTemplate`, `ScriptGeneration` from `team_id` to `account_id` +13. `TargetList`: execute result of Phase 0 audit (drop or migrate) +14. Create `template_trees` and `platform_steps` tables; migrate global content + +Gate: Zero NULL `account_id` values in any tenant table. All backfills verified. Database passes zero-NULL assertion query for every table in scope. + +--- + +### Phase 2 — PostgreSQL Infrastructure + +Goal: Database roles established; `SET LOCAL` wired into every request; RLS test suite green. + +15. Create `resolutionflow_app` and `resolutionflow_admin` PostgreSQL roles; grant privileges +16. Update `DATABASE_URL` to `resolutionflow_app`; add `DATABASE_ADMIN_URL` for admin connections +17. Update Alembic to use `DATABASE_ADMIN_URL` +18. Modify `get_db()` to execute `SET LOCAL app.current_account_id` per request, inside transaction boundary +19. Update super admin endpoints to use `resolutionflow_admin` connection +20. Write `test_rls_isolation.py` — all ~160 RLS tests. Must be 100% green before Phase 3. Connection pool reuse test included. +21. Measure `SET LOCAL` per-request overhead baseline + +Gate: All RLS tests green. All existing integration tests green. Connection pool reuse test green. Performance baseline documented. + +--- + +### Phase 3 — Enable RLS + +Goal: RLS policies active on all tenant tables. Phased by domain, not all at once. + +Enable RLS in these batches. Run full test suite after each batch before proceeding: + +**Batch A: Core data** +`trees`, `tree_categories`, `tree_tags`, `sessions`, `attachments`, `session_supporting_data`, `maintenance_schedules` + +**Batch B: AI & sessions** +`ai_sessions`, `ai_chat_sessions`, `ai_conversations`, `ai_usage`, `ai_session_steps`, `session_branches`, `session_handoffs`, `session_resolution_outputs`, `fork_points`, `copilot_conversations`, `flow_proposals` + +**Batch C: Steps & library** +`step_library`, `step_categories`, `step_ratings`, `step_usage_logs`, `ai_suggestions`, `template_trees` (no RLS), `platform_steps` (no RLS) + +**Batch D: Integrations & users** +`psa_connections`, `psa_post_logs`, `psa_member_mappings`, `subscriptions`, `account_invites`, `file_uploads`, `kb_imports`, `notifications`, `notification_logs`, `session_shares`, `user_folders`, `user_pinned_trees` + +**Batch E: Scripts & auth** +`script_builder_sessions`, `script_templates`, `script_generations`, `audit_logs` + +For each batch migration: `ENABLE ROW LEVEL SECURITY` + `FORCE ROW LEVEL SECURITY` + create all applicable policies. + +Gate: All RLS tests green after each batch. All integration tests green. Staging smoke test after Batch E. No performance regressions. + +--- + +### Phase 4 — Ongoing + +- Every PR that touches a tenant endpoint includes a cross-tenant isolation test +- CI grep check blocks PRs with missing `tenant_filter()` (warn → block after 2-week calibration) +- `tree_shares` RLS: design sharing model, implement tests, enable RLS before any sharing feature ships +- `team_id` columns: drop in a cleanup migration after all application code is fully migrated + +--- + +## Section 8: Background Job Policy + +Background jobs and scheduled tasks that process tenant data are a distinct isolation surface. Unlike request-scoped endpoints, they do not have a `current_user` and must manage tenant context explicitly. + +### Policy + +Every background job that touches tenant tables must comply with one of two patterns: + +**Pattern A — `resolutionflow_admin` with explicit per-query `account_id` filtering** + +Use when: the job is inherently cross-tenant (e.g., processes all pending work across all accounts in a single pass). The admin role bypasses RLS, so explicit `account_id` filters in every query are mandatory — RLS is not the safety net here. + +```python +# Allowed: cross-tenant batch SELECT for IDs only, then loop per-account +result = await db.execute( + select(Model.id, Model.account_id).where(Model.status == "pending") +) +for row in result.all(): + await _process_one(row.id, row.account_id, db) # account_id threaded through + +# In the processing function: all queries must filter by account_id +async def _process_one(record_id, account_id, db): + result = await db.execute( + select(Model).where(Model.id == record_id, Model.account_id == account_id) + ) +``` + +**Pattern B — `resolutionflow_app` with `SET LOCAL` per tenant loop iteration** + +Use when: the job processes tenants one at a time and it is practical to set the tenant context per iteration. + +```python +for account_id in account_ids: + async with async_session_maker() as db: + await db.execute( + text("SET LOCAL app.current_account_id = :id"), + {"id": str(account_id)} + ) + # All queries in this block are RLS-enforced to this tenant +``` + +### No cross-tenant queries without justification + +No background job may issue a SELECT, UPDATE, or DELETE that spans multiple tenants' data in a single query without explicit written justification in the code comment and in this spec. Approved cross-tenant operations are documented below. + +### Inventory: Current Background Jobs + +| Job | File | Pattern | Status | Notes | +|---|---|---|---|---| +| Knowledge Flywheel | `knowledge_flywheel_scheduler.py` | Admin + explicit filter | **Needs update** | Batch SELECT across all accounts with no `account_id` filter. Must thread `account_id` from the session into all processing calls. | +| PSA Retry | `psa_retry_scheduler.py` | Admin + explicit filter | **Needs update** | Batch SELECT across all accounts with no `account_id` filter. Must add `account_id` to batch query and thread through to `retry_failed_push`. | +| Chat Retention Cleanup | `retention_cleanup.py` | Admin + explicit filter | **Correct pattern** | Already loops per-account with explicit `account_id` in all queries. Model for other jobs. Needs role update to `resolutionflow_admin`. | +| Maintenance Schedule Firing | `scheduler.py` (`_fire_maintenance_schedule`) | Admin + explicit filter | **Needs update** | Fetches `MaintenanceSchedule` and `Tree` by ID without `account_id` filter. After Phase 1, `Session` creation must set `account_id`. Add `account_id` to all queries. | +| AI Conversation Expiry | `scheduler.py` (`_cleanup_expired_ai_conversations`) | Admin, cross-tenant approved | **Correct pattern** | Cross-tenant DELETE by `expires_at` is explicitly justified: rows are expired regardless of tenant. Needs role update to `resolutionflow_admin`. Document this approval in code comment. | + +### Approved cross-tenant queries + +The following cross-tenant operations are explicitly approved. All others require a new entry here before shipping: + +| Job | Query | Justification | +|---|---|---| +| AI Conversation Expiry | `DELETE FROM ai_conversations WHERE expires_at < NOW()` | Time-based expiry is tenant-independent. Deleting expired rows does not expose data across tenants. | +| Chat Retention Cleanup | `SELECT id FROM accounts` | Required to iterate per-account. Read of account IDs only; no tenant data accessed. | +| Knowledge Flywheel (after fix) | `SELECT id, account_id FROM ai_sessions WHERE analysis_status='pending'` | Cross-tenant ID harvest only. No tenant data in the SELECT. `account_id` is threaded into per-tenant processing. | +| PSA Retry (after fix) | `SELECT id, account_id FROM psa_post_logs WHERE status='pending_retry'` | Cross-tenant ID harvest only. `account_id` threaded into per-tenant processing. | + +### Checklist additions (Phase 2) + +- [ ] All background jobs updated to use `resolutionflow_admin` role via `DATABASE_ADMIN_URL` +- [ ] Knowledge Flywheel: `account_id` added to batch query and threaded through to `analyze_session` +- [ ] PSA Retry: `account_id` added to batch query and threaded through to `retry_failed_push` +- [ ] Maintenance Schedule Firing: all queries include `account_id`; `Session` creation sets `account_id` after Phase 1 migration +- [ ] AI Conversation Expiry: cross-tenant approval comment added to code +- [ ] All jobs reviewed against this policy before Phase 3 (RLS enable) + +--- + +## Section 9: Open Questions + +| Question | Impact | Owner | +|---|---|---| +| Is tree sharing intra-tenant only, or can trees be shared across accounts? | Determines `tree_shares` schema, backfill strategy, and RLS policy. Tree_shares table deferred until resolved. | Product | +| What is the exact PostgreSQL exception raised when `current_setting('app.current_account_id', false)::uuid` is evaluated with no value set? | Determines the fail-closed test assertion. Must be tested in Phase 2. | Engineering | +| TargetList: zero references + zero rows? Or does data exist? | Determines whether table is dropped or migrated. | Phase 0 audit | + +--- + +## Appendix: Pre-Implementation Checklist + +Everything that must be in place before writing feature code on any tenant-data endpoint: + +### Foundation (Phase 0) +- [ ] Copilot tree access bypass hotfix shipped +- [ ] Analytics flow endpoint ownership check added +- [ ] Category tree count scoped to account +- [ ] AI session list and search both restricted to `user_id` +- [ ] Full UUID endpoint audit completed; all gaps documented and fixed +- [ ] `get_tenant_context` dependency added to `deps.py` +- [ ] `tenant_filter()` helper added to `filters.py` +- [ ] Existing filter helpers updated to use `tenant_filter()` internally +- [ ] TargetList dead code audit result documented +- [ ] Teams orphan count query run and result documented +- [ ] CI grep check defined and active + +### Schema (Phase 1) +- [ ] `account_id NOT NULL` on all tables in Section 1a denormalization list +- [ ] `account_id NOT NULL` on all existing nullable models from Section 1b +- [ ] ScriptBuilderSession, ScriptTemplate, ScriptGeneration migrated from team_id +- [ ] TargetList: dropped or migrated per audit result +- [ ] template_trees and platform_steps tables created +- [ ] Zero NULL assertion passes for every tenant table +- [ ] Migration sequence (add nullable → backfill → verify → NOT NULL → index) followed for each table + +### Infrastructure (Phase 2) +- [ ] `resolutionflow_app` role created with correct privileges +- [ ] `resolutionflow_admin` role created with `BYPASSRLS` +- [ ] `DATABASE_URL` updated to `resolutionflow_app` +- [ ] `DATABASE_ADMIN_URL` added for admin connections +- [ ] Alembic uses `DATABASE_ADMIN_URL` +- [ ] `get_db()` executes `SET LOCAL app.current_account_id` per request inside transaction +- [ ] Super admin endpoints use admin connection +- [ ] All background jobs updated to use `resolutionflow_admin` role (see Section 8) +- [ ] Knowledge Flywheel: `account_id` threaded through batch query and `analyze_session` +- [ ] PSA Retry: `account_id` threaded through batch query and `retry_failed_push` +- [ ] Maintenance Schedule: all queries include `account_id`; `Session` creation sets `account_id` +- [ ] AI Conversation Expiry: cross-tenant approval comment added to code +- [ ] `test_rls_isolation.py` written with ~160 test cases +- [ ] All RLS tests pass (100%) +- [ ] Connection pool reuse test passes +- [ ] Fail-closed exception documented and asserted +- [ ] Performance baseline for `SET LOCAL` overhead documented + +### RLS Active (Phase 3) +- [ ] Batch A RLS policies applied and all tests green +- [ ] Batch B RLS policies applied and all tests green +- [ ] Batch C RLS policies applied and all tests green +- [ ] Batch D RLS policies applied and all tests green +- [ ] Batch E RLS policies applied and all tests green +- [ ] Staging smoke test passed +- [ ] All existing integration tests green with RLS active + +### Ongoing Standards +- [ ] Every new endpoint PR includes cross-tenant isolation test +- [ ] CI grep check blocks PRs with missing `tenant_filter()` +- [ ] `tree_shares` deferred — not enabled until sharing model documented +- [ ] `team_id` cleanup migration scheduled after full migration complete From b3dba57bc5d8a24173e3152710a2164d6b86be6a Mon Sep 17 00:00:00 2001 From: chihlasm Date: Thu, 9 Apr 2026 00:42:19 -0400 Subject: [PATCH 3/4] =?UTF-8?q?feat:=20tenant=20isolation=20Phase=200=20?= =?UTF-8?q?=E2=80=94=20app-layer=20filters,=20UUID=20audit,=20CI=20gate=20?= =?UTF-8?q?(#132)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * docs: add tenant data isolation design spec Complete architecture plan for multi-tenant data isolation across all layers (PostgreSQL RLS, application-layer filtering, schema migration, testing strategy, and phased rollout checklist). Co-Authored-By: Claude Sonnet 4.6 * docs: add background job isolation policy to tenant isolation spec Documents policy for all 5 existing background jobs: - Knowledge Flywheel and PSA Retry flagged for account_id threading - Chat Retention already follows correct pattern (model for others) - Maintenance Schedule Firing needs account_id in queries + Session creation - AI Conversation Expiry approved as cross-tenant with justification Adds approved cross-tenant query registry and Phase 2 checklist items. Co-Authored-By: Claude Sonnet 4.6 * docs: add tenant isolation Phase 0 implementation plan 8 tasks covering: CRITICAL copilot hotfix, tenant_filter() helper, get_tenant_context dependency, analytics/category/AI session gap fixes, full UUID endpoint audit, TargetList dead code audit, teams orphan check, and CI grep check for missing tenant filters. Co-Authored-By: Claude Sonnet 4.6 * feat: add tenant_filter() helper and get_tenant_context dependency tenant_filter(model, account_id) is the canonical app-layer tenant scoping expression. Every query on a tenant table must use it. build_tree_access_filter and build_step_visibility_filter updated to call tenant_filter() internally for the account_id match. get_tenant_context is a FastAPI dependency that returns account_id or raises 403 if the user has no account — prevents raw access to current_user.account_id and centralises the null check. Co-Authored-By: Claude Sonnet 4.6 * fix: scope analytics/flows/{tree_id} to requesting account Any authenticated user could read flow analytics (session counts, completion rates, CSAT) for any tree UUID. Now returns 404 if the tree doesn't belong to the requesting account. Co-Authored-By: Claude Sonnet 4.6 * fix: scope category tree_count to requesting account tree_count on GET /categories/{id} was including trees from all accounts, leaking cross-tenant row counts. Co-Authored-By: Claude Sonnet 4.6 * fix: restrict AI session search to current user only Search endpoint used OR(user_id, account_id), exposing other users' problem_summary and problem_domain within the same account. Sessions are user-scoped only — cross-user access requires explicit escalation or sharing. List and search endpoints now behave consistently. Co-Authored-By: Claude Sonnet 4.6 * fix: add ownership check and 404 responses to ai-sessions endpoints Cross-tenant isolation audit found: - retry-psa-push had NO ownership check (CRITICAL) — any user could retry any session's PSA push - save_task_lane used db.get() without ownership filter, returned 403 revealing existence - get_session returned 403 instead of 404 for unauthorized access - stream_documentation returned 403 instead of 404 All now use query-level user_id filtering and return 404 to avoid revealing existence. Co-Authored-By: Claude Sonnet 4.6 * fix: return 404 instead of 403 for cross-tenant session access All session endpoints (get, update, complete, scratchpad, variables, export, ticket-link) now return 404 instead of 403 when a user tries to access another user's session. This prevents confirming existence of resources across tenant boundaries. Co-Authored-By: Claude Sonnet 4.6 * fix: return 404 instead of 403 for cross-tenant tree access get_tree and update_tree now return 404 when a user cannot access a tree (private tree from another account). Prevents confirming resource existence across tenant boundaries. Co-Authored-By: Claude Sonnet 4.6 * fix: return 404 instead of 403 for cross-tenant step access get_step_or_404 now returns 404 when can_view_step or can_edit_step fails, preventing confirmation of step existence across tenant boundaries. Co-Authored-By: Claude Sonnet 4.6 * fix: return 404 instead of 403 for cross-tenant upload access get_upload_url and delete_upload now return 404 when the upload belongs to a different account/user, preventing resource existence confirmation. Co-Authored-By: Claude Sonnet 4.6 * fix: return 404 instead of 403 for cross-tenant share access revoke_share and create_share now return 404 when the caller is not the owner, preventing resource existence confirmation across users. Co-Authored-By: Claude Sonnet 4.6 * fix: return 404 instead of 403 for cross-team tree access in maintenance schedules _get_tree_or_403 now returns 404 when the user's team does not match, preventing confirmation of tree existence across teams. Co-Authored-By: Claude Sonnet 4.6 * fix: return 404 instead of 403 for cross-account tag access get_tag now returns 404 for account-specific tags that belong to another account, preventing resource existence confirmation. Co-Authored-By: Claude Sonnet 4.6 * fix: return 404 instead of 403 for cross-account step category access get_step_category now returns 404 for account-specific categories that belong to another account, preventing resource existence confirmation. Co-Authored-By: Claude Sonnet 4.6 * test: add cross-tenant isolation tests for Task 6 UUID audit Tests cover: - Tree GET/PUT returns 404 for cross-account access - Session GET returns 404 for cross-user access - AI session GET returns 404 for cross-user access - AI session retry-psa-push requires ownership - Upload URL returns 404 for cross-account access - Share revoke returns 404 for cross-user access Co-Authored-By: Claude Sonnet 4.6 * fix: return 404 (not 403) for get_documentation cross-user access; add missing Task 6 tests get_documentation was revealing session existence via 403. Added pre-check query filtering by session_id AND user_id before calling the engine. Also add cross-tenant isolation tests for steps, tags, step_categories, and maintenance_schedules endpoints fixed in Task 6 (TDD was skipped). Co-Authored-By: Claude Sonnet 4.6 * fix: address Task 6 quality review — rename helper, restore 403 for intra-account, add docs test - Rename _get_tree_or_403 → _get_tree_or_404 in maintenance_schedules.py (function now raises 404, old name was misleading) - Restore HTTP 403 for intra-account permission failures in update_tree: same-account users who can see a tree but can't edit it got 404 (wrong); only cross-account lookups should return 404 to avoid confirming existence - Apply same 403/404 distinction to update_tree_visibility - Add test: get_documentation must return 404 for cross-user session access - Add comment documenting owner-only design for documentation endpoints Co-Authored-By: Claude Sonnet 4.6 * chore: Task 7+8 — TargetList audit, CI tenant-filter grep check Task 7: TargetList dead code audit - Found active code references in 12+ files across backend and frontend (full CRUD API + frontend page + MaintenanceScheduleSection + BatchLaunchModal) - Decision: migrate to account_id in Phase 1 (cannot drop) - DB row count not available from code-server — must verify from VPS SSH before Phase 1 migration - Teams orphan check query documented; must run from VPS SSH before Phase 1 - Results documented in spec Section 9 Task 8: CI tenant-filter enforcement check (warn mode) - Create backend/scripts/check_tenant_filters.py Scans endpoint and service files for select() on tenant tables without tenant_filter/account_id/user_id in surrounding context. Currently reports 109 warnings (Phase 1 backlog). Exits 0 (warn mode). - Add Check tenant filter enforcement step to backend CI job Add --fail flag after Phase 1 backlog clears to make it blocking. Co-Authored-By: Claude Sonnet 4.6 * docs: record Phase 0 audit results — 0 orphaned teams, 0 target_list rows Both checks confirmed 2026-04-09 from production DB. Phase 1 migration is safe to proceed. Co-Authored-By: Claude Sonnet 4.6 --------- Co-authored-by: Claude Sonnet 4.6 --- .github/workflows/ci.yml | 5 + backend/app/api/deps.py | 17 + backend/app/api/endpoints/ai_sessions.py | 52 +- backend/app/api/endpoints/analytics.py | 10 +- backend/app/api/endpoints/categories.py | 7 +- .../api/endpoints/maintenance_schedules.py | 12 +- backend/app/api/endpoints/sessions.py | 28 +- backend/app/api/endpoints/shares.py | 8 +- backend/app/api/endpoints/step_categories.py | 4 +- backend/app/api/endpoints/steps.py | 4 +- backend/app/api/endpoints/tags.py | 4 +- backend/app/api/endpoints/trees.py | 28 +- backend/app/api/endpoints/uploads.py | 8 +- backend/app/core/filters.py | 30 +- backend/scripts/check_tenant_filters.py | 91 +++ backend/tests/test_tenant_isolation_p0.py | 549 ++++++++++++++++-- ...2026-04-09-tenant-data-isolation-design.md | 3 +- 17 files changed, 759 insertions(+), 101 deletions(-) create mode 100644 backend/scripts/check_tenant_filters.py diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index 89d6bbc6..18bf85ec 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -47,6 +47,11 @@ jobs: - name: Install dependencies run: pip install -r backend/requirements.txt -r backend/requirements-dev.txt + - name: Check tenant filter enforcement + run: cd backend && python scripts/check_tenant_filters.py + # Warn mode only (exits 0). Switch to --fail after Phase 1 backlog clears. + # See: docs/superpowers/specs/2026-04-09-tenant-data-isolation-design.md Section 3f + - name: Run tests with coverage run: cd backend && python -m pytest --override-ini="addopts=" --cov=app --cov-report=term-missing --cov-report=json:coverage.json --cov-fail-under=50 diff --git a/backend/app/api/deps.py b/backend/app/api/deps.py index 28536d68..4bd3fd3c 100644 --- a/backend/app/api/deps.py +++ b/backend/app/api/deps.py @@ -190,3 +190,20 @@ async def get_plan_limits_for_user( """Get plan limits for the current user's account.""" from app.core.subscriptions import get_user_plan_limits return await get_user_plan_limits(current_user.account_id, db) + + +async def get_tenant_context( + current_user: Annotated[User, Depends(get_current_active_user)], +) -> UUID: + """Return the current user's account_id. + + Use this dependency instead of reading current_user.account_id directly. + Raises 403 if the user has no account association (should not happen in + normal flows — users are always associated with an account on registration). + """ + if current_user.account_id is None: + raise HTTPException( + status_code=status.HTTP_403_FORBIDDEN, + detail="User not associated with any account", + ) + return current_user.account_id diff --git a/backend/app/api/endpoints/ai_sessions.py b/backend/app/api/endpoints/ai_sessions.py index 38ca0286..425e6421 100644 --- a/backend/app/api/endpoints/ai_sessions.py +++ b/backend/app/api/endpoints/ai_sessions.py @@ -519,11 +519,15 @@ async def save_task_lane( _: None = Depends(require_engineer_or_admin), ): """Save the current task lane state including user's in-progress responses.""" - session = await db.get(AISession, session_id) + result = await db.execute( + select(AISession).where( + AISession.id == session_id, + AISession.user_id == current_user.id, + ) + ) + session = result.scalar_one_or_none() if not session: raise HTTPException(status_code=404, detail="Session not found") - if session.user_id != current_user.id: - raise HTTPException(status_code=403, detail="Not your session") payload = { "questions": [q.model_dump() for q in body.questions], @@ -762,13 +766,13 @@ async def search_sessions( limit: int = Query(5, ge=1, le=20), ): """Search AI sessions by content using full-text search. Used by Command Palette.""" + # Sessions are user-scoped. The list endpoint uses user_id only; + # search must be consistent. Cross-user access requires explicit + # escalation or session sharing — not ambient account membership. result = await db.execute( select(AISession) .where( - or_( - AISession.user_id == current_user.id, - AISession.account_id == current_user.account_id, - ), + AISession.user_id == current_user.id, text("ai_sessions.search_vector @@ plainto_tsquery('english', :q)"), ) .params(q=q) @@ -901,7 +905,7 @@ async def get_session( pkg = session.escalation_package or {} is_handler = pkg.get("picked_up_by") == str(current_user.id) if session.user_id != current_user.id and session.escalated_to_id != current_user.id and not is_handler: - raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail="Not authorized") + raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Session not found") return _build_session_detail(session) @@ -917,6 +921,18 @@ async def get_documentation( db: Annotated[AsyncSession, Depends(get_db)], ): """Get auto-generated documentation for a session.""" + # Verify session ownership — owner only. Documentation endpoints require direct + # ownership; escalated_to_id / picked_up_by handlers use get_session (read-only). + # This is consistent with stream_documentation which has the same owner-only check. + result = await db.execute( + select(AISession).where( + AISession.id == session_id, + AISession.user_id == current_user.id, + ) + ) + if not result.scalar_one_or_none(): + raise HTTPException(status_code=404, detail="Session not found") + try: return await flowpilot_engine.get_session_documentation( session_id=session_id, @@ -942,13 +958,14 @@ async def stream_documentation( # Verify session ownership result = await db.execute( - select(AISession).where(AISession.id == session_id) + select(AISession).where( + AISession.id == session_id, + AISession.user_id == current_user.id, + ) ) session = result.scalar_one_or_none() if not session: raise HTTPException(status_code=404, detail="Session not found") - if session.user_id != current_user.id: - raise HTTPException(status_code=403, detail="Not authorized") async def event_generator(): try: @@ -1043,6 +1060,19 @@ async def retry_psa_push_endpoint( """Manually retry a failed PSA documentation push.""" from app.models.psa_post_log import PsaPostLog + # Verify the session belongs to the current user + session_result = await db.execute( + select(AISession).where( + AISession.id == session_id, + AISession.user_id == current_user.id, + ) + ) + if not session_result.scalar_one_or_none(): + raise HTTPException( + status_code=status.HTTP_404_NOT_FOUND, + detail="Session not found", + ) + # Find the latest failed push log for this session result = await db.execute( select(PsaPostLog) diff --git a/backend/app/api/endpoints/analytics.py b/backend/app/api/endpoints/analytics.py index 41a3b015..d4591b81 100644 --- a/backend/app/api/endpoints/analytics.py +++ b/backend/app/api/endpoints/analytics.py @@ -7,6 +7,7 @@ from sqlalchemy.ext.asyncio import AsyncSession from app.core.database import get_db from app.api.deps import get_current_active_user +from app.core.filters import tenant_filter from app.models import User, Session, Tree, SessionRating from app.schemas.analytics import ( TeamAnalyticsResponse, PersonalAnalyticsResponse, FlowAnalyticsResponse, @@ -290,8 +291,13 @@ async def get_flow_analytics( current_user: User = Depends(get_current_active_user), ): """Analytics for a specific flow.""" - # Verify tree exists - result = await db.execute(select(Tree).where(Tree.id == tree_id)) + # Verify tree exists and belongs to the requesting user's account. + result = await db.execute( + select(Tree).where( + Tree.id == tree_id, + tenant_filter(Tree, current_user.account_id), + ) + ) tree = result.scalar_one_or_none() if not tree: raise HTTPException(status_code=404, detail="Flow not found") diff --git a/backend/app/api/endpoints/categories.py b/backend/app/api/endpoints/categories.py index 73505c05..f0d7d010 100644 --- a/backend/app/api/endpoints/categories.py +++ b/backend/app/api/endpoints/categories.py @@ -12,6 +12,7 @@ from app.models.user import User from app.schemas.category import CategoryCreate, CategoryUpdate, CategoryResponse, CategoryListResponse from app.api.deps import get_current_active_user from app.core.permissions import can_manage_category, can_create_category +from app.core.filters import tenant_filter router = APIRouter(prefix="/categories", tags=["categories"]) @@ -108,10 +109,12 @@ async def get_category( detail="You don't have access to this category" ) - # Get tree count + # Get tree count — scoped to the requesting account so cross-account + # trees in shared categories are not counted. count_query = select(func.count(Tree.id)).where( Tree.category_id == category.id, - Tree.is_active == True + Tree.is_active == True, + tenant_filter(Tree, current_user.account_id), ) count_result = await db.execute(count_query) tree_count = count_result.scalar() or 0 diff --git a/backend/app/api/endpoints/maintenance_schedules.py b/backend/app/api/endpoints/maintenance_schedules.py index b7195637..506da0e3 100644 --- a/backend/app/api/endpoints/maintenance_schedules.py +++ b/backend/app/api/endpoints/maintenance_schedules.py @@ -29,8 +29,8 @@ def _compute_next_run(cron_expression: str, tz_name: str) -> datetime: return cron.get_next(datetime).astimezone(timezone.utc) -async def _get_tree_or_403(tree_id: UUID, current_user: User, db: AsyncSession) -> "Tree": - """Fetch tree and verify the current user's team owns it.""" +async def _get_tree_or_404(tree_id: UUID, current_user: User, db: AsyncSession) -> "Tree": + """Fetch tree and verify the current user's team owns it. Raises 404 if not found or access denied.""" result = await db.execute(select(Tree).where(Tree.id == tree_id)) tree = result.scalar_one_or_none() if not tree: @@ -38,7 +38,7 @@ async def _get_tree_or_403(tree_id: UUID, current_user: User, db: AsyncSession) # Super admins can access any tree; regular users must be on the same team if not getattr(current_user, 'is_super_admin', False): if tree.team_id != current_user.team_id: - raise HTTPException(status_code=403, detail="Access denied") + raise HTTPException(status_code=404, detail="Tree not found") return tree @@ -51,7 +51,7 @@ async def create_schedule( ): """Create a cron schedule for a maintenance flow. One per flow.""" # Verify user's team owns the tree - tree = await _get_tree_or_403(data.tree_id, current_user, db) + tree = await _get_tree_or_404(data.tree_id, current_user, db) if tree.tree_type != "maintenance": raise HTTPException(status_code=400, detail="Schedules are only supported for maintenance flows") @@ -94,7 +94,7 @@ async def get_schedule_for_tree( ): """Get the schedule for a specific maintenance flow.""" # Verify user's team owns the tree before returning schedule data - await _get_tree_or_403(tree_id, current_user, db) + await _get_tree_or_404(tree_id, current_user, db) result = await db.execute( select(MaintenanceSchedule).where(MaintenanceSchedule.tree_id == tree_id) @@ -122,7 +122,7 @@ async def update_schedule( raise HTTPException(status_code=404, detail="Schedule not found") # Verify user's team owns the tree this schedule belongs to - await _get_tree_or_403(schedule.tree_id, current_user, db) + await _get_tree_or_404(schedule.tree_id, current_user, db) update_fields = data.model_fields_set was_active = schedule.is_active diff --git a/backend/app/api/endpoints/sessions.py b/backend/app/api/endpoints/sessions.py index 570b7672..3ad51f5d 100644 --- a/backend/app/api/endpoints/sessions.py +++ b/backend/app/api/endpoints/sessions.py @@ -143,8 +143,8 @@ async def get_session( if session.user_id != current_user.id and session.assigned_to_id != current_user.id: raise HTTPException( - status_code=status.HTTP_403_FORBIDDEN, - detail="You don't have access to this session" + status_code=status.HTTP_404_NOT_FOUND, + detail="Session not found" ) return session @@ -234,8 +234,8 @@ async def update_session( if session.user_id != current_user.id and session.assigned_to_id != current_user.id: raise HTTPException( - status_code=status.HTTP_403_FORBIDDEN, - detail="You don't have access to this session" + status_code=status.HTTP_404_NOT_FOUND, + detail="Session not found" ) if session.completed_at: @@ -281,8 +281,8 @@ async def complete_session( if session.user_id != current_user.id and session.assigned_to_id != current_user.id: raise HTTPException( - status_code=status.HTTP_403_FORBIDDEN, - detail="You don't have access to this session" + status_code=status.HTTP_404_NOT_FOUND, + detail="Session not found" ) if session.completed_at: @@ -319,8 +319,8 @@ async def update_scratchpad( if session.user_id != current_user.id and session.assigned_to_id != current_user.id: raise HTTPException( - status_code=status.HTTP_403_FORBIDDEN, - detail="You don't have access to this session" + status_code=status.HTTP_404_NOT_FOUND, + detail="Session not found" ) session.scratchpad = data.scratchpad @@ -348,8 +348,8 @@ async def update_session_variables( if session.user_id != current_user.id and session.assigned_to_id != current_user.id: raise HTTPException( - status_code=status.HTTP_403_FORBIDDEN, - detail="You don't have access to this session" + status_code=status.HTTP_404_NOT_FOUND, + detail="Session not found" ) if session.completed_at: @@ -387,8 +387,8 @@ async def export_session( if session.user_id != current_user.id and session.assigned_to_id != current_user.id: raise HTTPException( - status_code=status.HTTP_403_FORBIDDEN, - detail="You don't have access to this session" + status_code=status.HTTP_404_NOT_FOUND, + detail="Session not found" ) # PDF export — separate path with binary response @@ -830,8 +830,8 @@ async def link_ticket( if session.user_id != current_user.id and session.assigned_to_id != current_user.id: if not current_user.is_super_admin: raise HTTPException( - status_code=status.HTTP_403_FORBIDDEN, - detail="You don't have access to this session", + status_code=status.HTTP_404_NOT_FOUND, + detail="Session not found", ) # Unlink diff --git a/backend/app/api/endpoints/shares.py b/backend/app/api/endpoints/shares.py index ee81e903..3d67207d 100644 --- a/backend/app/api/endpoints/shares.py +++ b/backend/app/api/endpoints/shares.py @@ -72,8 +72,8 @@ async def create_share( if session.user_id != current_user.id and not current_user.is_super_admin: raise HTTPException( - status_code=status.HTTP_403_FORBIDDEN, - detail="Only the session owner can create share links" + status_code=status.HTTP_404_NOT_FOUND, + detail="Session not found" ) # Require account_id for account-scoped shares @@ -170,8 +170,8 @@ async def revoke_share( if share.created_by != current_user.id and not current_user.is_super_admin: raise HTTPException( - status_code=status.HTTP_403_FORBIDDEN, - detail="Only the share creator can revoke it" + status_code=status.HTTP_404_NOT_FOUND, + detail="Share not found" ) share.is_active = False diff --git a/backend/app/api/endpoints/step_categories.py b/backend/app/api/endpoints/step_categories.py index 5d890225..53770bee 100644 --- a/backend/app/api/endpoints/step_categories.py +++ b/backend/app/api/endpoints/step_categories.py @@ -94,8 +94,8 @@ async def get_step_category( # Check access: global categories visible to all, account categories only to account members if category.account_id and category.account_id != current_user.account_id and not current_user.is_super_admin: raise HTTPException( - status_code=status.HTTP_403_FORBIDDEN, - detail="You don't have access to this step category" + status_code=status.HTTP_404_NOT_FOUND, + detail="Step category not found" ) return StepCategoryResponse( diff --git a/backend/app/api/endpoints/steps.py b/backend/app/api/endpoints/steps.py index 14bc5a7c..992c1725 100644 --- a/backend/app/api/endpoints/steps.py +++ b/backend/app/api/endpoints/steps.py @@ -47,10 +47,10 @@ async def get_step_or_404( raise HTTPException(status_code=404, detail="Step not found") if check_view and not can_view_step(current_user, step): - raise HTTPException(status_code=403, detail="Not authorized to view this step") + raise HTTPException(status_code=404, detail="Step not found") if check_edit and not can_edit_step(current_user, step): - raise HTTPException(status_code=403, detail="Not authorized to modify this step") + raise HTTPException(status_code=404, detail="Step not found") return step diff --git a/backend/app/api/endpoints/tags.py b/backend/app/api/endpoints/tags.py index 334e33f8..b8438e8b 100644 --- a/backend/app/api/endpoints/tags.py +++ b/backend/app/api/endpoints/tags.py @@ -105,8 +105,8 @@ async def get_tag( # Check access: global tags visible to all, account tags only to account members if tag.account_id and tag.account_id != current_user.account_id and not current_user.is_super_admin: raise HTTPException( - status_code=status.HTTP_403_FORBIDDEN, - detail="You don't have access to this tag" + status_code=status.HTTP_404_NOT_FOUND, + detail="Tag not found" ) return TagResponse.model_validate(tag) diff --git a/backend/app/api/endpoints/trees.py b/backend/app/api/endpoints/trees.py index 73927cf8..75cdaf70 100644 --- a/backend/app/api/endpoints/trees.py +++ b/backend/app/api/endpoints/trees.py @@ -392,8 +392,8 @@ async def get_tree( if not tree.is_active or not can_access_tree(current_user, tree): raise HTTPException( - status_code=status.HTTP_403_FORBIDDEN, - detail="You don't have access to this tree" + status_code=status.HTTP_404_NOT_FOUND, + detail="Tree not found" ) return build_full_tree_response(tree) @@ -610,9 +610,17 @@ async def update_tree( ) if not can_edit_tree(current_user, tree): + # If the user can see this tree (same account, team visibility), give a 403 with + # a clear message — returning 404 here would be confusing since GET returns 200. + # For truly inaccessible trees (cross-account), return 404 to avoid confirming existence. + if can_access_tree(current_user, tree): + raise HTTPException( + status_code=status.HTTP_403_FORBIDDEN, + detail="You do not have permission to edit this flow" + ) raise HTTPException( - status_code=status.HTTP_403_FORBIDDEN, - detail="You can only edit your own trees" + status_code=status.HTTP_404_NOT_FOUND, + detail="Tree not found" ) # Extract tags for separate handling @@ -1144,9 +1152,17 @@ async def update_tree_visibility( ) if not can_edit_tree(current_user, tree): + # If the user can see this tree (same account, team visibility), give a 403 with + # a clear message — returning 404 here would be confusing since GET returns 200. + # For truly inaccessible trees (cross-account), return 404 to avoid confirming existence. + if can_access_tree(current_user, tree): + raise HTTPException( + status_code=status.HTTP_403_FORBIDDEN, + detail="You do not have permission to edit this flow" + ) raise HTTPException( - status_code=status.HTTP_403_FORBIDDEN, - detail="You can only edit your own trees" + status_code=status.HTTP_404_NOT_FOUND, + detail="Tree not found" ) # Update visibility diff --git a/backend/app/api/endpoints/uploads.py b/backend/app/api/endpoints/uploads.py index eb9d0e38..1a3efc83 100644 --- a/backend/app/api/endpoints/uploads.py +++ b/backend/app/api/endpoints/uploads.py @@ -255,9 +255,9 @@ async def get_upload_url( if upload is None: raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Upload not found") - # Verify the upload belongs to the user's account + # Verify the upload belongs to the user's account — 404 to avoid revealing existence if upload.account_id != current_user.account_id and not current_user.is_super_admin: - raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail="Access denied") + raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Upload not found") url = storage_service.get_presigned_url(upload.storage_key) return {"url": url} @@ -311,9 +311,9 @@ async def delete_upload( if upload is None: raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Upload not found") - # Verify ownership + # Verify ownership — 404 to avoid revealing existence if upload.uploaded_by != current_user.id and not current_user.is_super_admin: - raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail="Access denied") + raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Upload not found") # Delete from S3 await storage_service.delete_file(upload.storage_key) diff --git a/backend/app/core/filters.py b/backend/app/core/filters.py index 005e269d..f6e629e3 100644 --- a/backend/app/core/filters.py +++ b/backend/app/core/filters.py @@ -1,10 +1,12 @@ """ Centralized query filters for ResolutionFlow. -Provides reusable SQLAlchemy filter builders for tree access control -and step visibility, used across multiple endpoint modules. +Provides reusable SQLAlchemy filter builders for tree access control, +step visibility, and the canonical tenant_filter used by all queries +on tenant-scoped tables. """ from __future__ import annotations +import uuid from typing import TYPE_CHECKING from sqlalchemy import or_, and_, true as sa_true @@ -13,6 +15,18 @@ if TYPE_CHECKING: from app.models.user import User +def tenant_filter(model, account_id: uuid.UUID): + """Primary app-layer tenant filter. + + MUST be used in every SELECT/UPDATE/DELETE on tenant tables. + RLS (Phase 2) is the safety net — this is the primary enforcement. + + Usage: + stmt = select(Tree).where(tenant_filter(Tree, current_user.account_id), ...) + """ + return model.account_id == account_id + + def build_tree_access_filter(current_user: User): """Build the access filter for trees based on user permissions. @@ -36,10 +50,11 @@ def build_tree_access_filter(current_user: User): Tree.author_id == current_user.id, ] if current_user.account_id: + # Team-visible trees: use tenant_filter as the account match conditions.append( and_( Tree.visibility == 'team', - Tree.account_id == current_user.account_id + tenant_filter(Tree, current_user.account_id), ) ) return or_(*conditions) @@ -58,11 +73,14 @@ def build_step_visibility_filter(current_user: User): if current_user.account_id: return or_( StepLibrary.visibility == 'public', - and_(StepLibrary.visibility == 'team', StepLibrary.account_id == current_user.account_id), - StepLibrary.created_by == current_user.id # Own private steps + and_( + StepLibrary.visibility == 'team', + tenant_filter(StepLibrary, current_user.account_id), + ), + StepLibrary.created_by == current_user.id, ) else: return or_( StepLibrary.visibility == 'public', - StepLibrary.created_by == current_user.id + StepLibrary.created_by == current_user.id, ) diff --git a/backend/scripts/check_tenant_filters.py b/backend/scripts/check_tenant_filters.py new file mode 100644 index 00000000..f608d4b7 --- /dev/null +++ b/backend/scripts/check_tenant_filters.py @@ -0,0 +1,91 @@ +""" +Tenant filter enforcement check. + +Scans endpoint and service files for SQLAlchemy select() calls on known +tenant tables and warns when account_id or tenant_filter is not present +in the surrounding 15 lines (the typical extent of a single query). + +Usage: + python scripts/check_tenant_filters.py # warn mode (exits 0) + python scripts/check_tenant_filters.py --fail # block mode (exits 1 on findings) +""" +import re +import sys +from pathlib import Path + +# Tables that must always be filtered by account_id or tenant_filter. +# Extend this list as new tenant tables are added. +TENANT_MODELS = [ + "Tree", "AISession", "Session", "StepLibrary", "FlowProposal", + "CopilotConversation", "AssistantChat", "FileUpload", "KBImport", + "PsaConnection", "PsaPostLog", "PsaMemberMapping", "AIChatSession", + "AIConversation", "AIUsage", "Subscription", "AccountInvite", + "Notification", "NotificationConfig", "SessionShare", "UserFolder", + "UserPinnedTree", "SessionBranch", "SessionHandoff", + "SessionResolutionOutput", "ForkPoint", "AISessionStep", + "AISuggestion", "StepCategory", "TreeCategory", "TreeTag", + "Attachment", "SessionSupportingData", "MaintenanceSchedule", + "AuditLog", "ScriptBuilderSession", "ScriptTemplate", + "StepRating", "StepUsageLog", "TargetList", +] + +# Directories to scan +SCAN_DIRS = [ + Path("app/api/endpoints"), + Path("app/services"), +] + +# Patterns that indicate the query is correctly scoped. +# NOTE: user_id scoping is accepted for user-owned resources (sessions, folders, notifications). +# For account-shared resources (trees, steps, etc.) use tenant_filter or account_id. +SAFE_PATTERNS = [ + r"tenant_filter", + r"account_id", + r"user_id", # User-scoped resources (sessions, folders, notifications, etc.) + r"is_super_admin", # Super admin queries intentionally bypass tenant filter + r"# cross-tenant: approved", # Explicit approval comment +] + +SKIP_FILES = { + "admin.py", # Super admin endpoints intentionally bypass tenant filter + "admin_gallery.py", # Gallery management — super admin only, no tenant scoping needed + "public_templates.py",# Public template browser — intentionally cross-tenant + "auth.py", # Auth/registration — no account context during login/register + "ratings.py", # Session ratings — user-scoped via session lookup chain +} + +findings = [] + +for scan_dir in SCAN_DIRS: + if not scan_dir.exists(): + continue + for path in sorted(scan_dir.glob("*.py")): + if path.name in SKIP_FILES: + continue + lines = path.read_text().splitlines() + for i, line in enumerate(lines): + for model in TENANT_MODELS: + if re.search(rf"\bselect\s*\(\s*{model}\b", line): + # Check surrounding 15 lines for a safe pattern + start = max(0, i - 2) + end = min(len(lines), i + 15) + context = "\n".join(lines[start:end]) + if not any(re.search(p, context) for p in SAFE_PATTERNS): + findings.append( + f"{path}:{i + 1}: select({model}) — no tenant_filter or account_id found in context" + ) + +if findings: + print(f"\n⚠ Tenant filter check — {len(findings)} warning(s):\n") + for f in findings: + print(f" {f}") + print() + if "--fail" in sys.argv: + print("Run with --fail: exiting 1") + sys.exit(1) + else: + print("Run in warn mode — not blocking. Pass --fail to block.") + sys.exit(0) +else: + print("✓ Tenant filter check passed — no unscoped tenant table queries found.") + sys.exit(0) diff --git a/backend/tests/test_tenant_isolation_p0.py b/backend/tests/test_tenant_isolation_p0.py index 04018f30..c7519f59 100644 --- a/backend/tests/test_tenant_isolation_p0.py +++ b/backend/tests/test_tenant_isolation_p0.py @@ -1,9 +1,13 @@ -"""Cross-tenant isolation tests for Phase 0 gap fixes.""" +"""Phase 0 tenant-isolation tests. + +Verifies that endpoints respect account boundaries and don't leak data +across tenants. Each task group tests a specific endpoint fix. +""" +import uuid +import datetime as dt import pytest from httpx import AsyncClient from sqlalchemy.ext.asyncio import AsyncSession -from sqlalchemy import select -import uuid from app.models.account import Account from app.models.user import User @@ -13,23 +17,23 @@ from app.core.security import get_password_hash # ── Helpers ────────────────────────────────────────────────────────────────── -async def _create_account_and_user(db: AsyncSession, suffix: str) -> tuple[Account, User, str]: - """Create an account + owner user. Returns (account, user, plain_password).""" +async def _create_account_and_user(db: AsyncSession, prefix: str): + """Create a fresh account + engineer user. Returns (account, user, plain_password).""" + password = "TestPass123!" account = Account( - name=f"Test Corp {suffix}", + name=f"{prefix}-corp", display_code=uuid.uuid4().hex[:8], ) db.add(account) await db.flush() - password = "TestPass123!" user = User( - email=f"user-{suffix}-{uuid.uuid4().hex[:6]}@example.com", - name=f"User {suffix}", + email=f"{prefix}-{uuid.uuid4().hex[:6]}@example.com", + name=f"{prefix} user", password_hash=get_password_hash(password), is_active=True, account_id=account.id, - account_role="owner", + account_role="engineer", ) db.add(user) await db.flush() @@ -37,17 +41,18 @@ async def _create_account_and_user(db: AsyncSession, suffix: str) -> tuple[Accou async def _login(client: AsyncClient, email: str, password: str) -> dict: - """Return auth headers for a user.""" + """Log in and return Authorization headers.""" resp = await client.post( - "/api/v1/auth/login/json", + "/api/v1/auth/login", json={"email": email, "password": password}, ) - assert resp.status_code == 200, resp.text - return {"Authorization": f"Bearer {resp.json()['access_token']}"} + assert resp.status_code == 200, f"Login failed: {resp.text}" + token = resp.json()["access_token"] + return {"Authorization": f"Bearer {token}"} async def _create_private_tree(db: AsyncSession, account: Account, user: User) -> Tree: - """Create a private tree owned by account.""" + """Create a private tree owned by the given account/user.""" tree = Tree( name=f"Private Tree {uuid.uuid4().hex[:6]}", account_id=account.id, @@ -63,45 +68,511 @@ async def _create_private_tree(db: AsyncSession, account: Account, user: User) - return tree -# ── Task 1: Copilot bypass ──────────────────────────────────────────────────── +# ── Task 3: Analytics flow endpoint ────────────────────────────────────────── @pytest.mark.asyncio -async def test_copilot_cannot_start_conversation_with_other_account_tree( +async def test_analytics_flow_cannot_read_other_account_tree( client: AsyncClient, test_db: AsyncSession ): - """Account A cannot start a copilot conversation using Account B's private tree UUID.""" - acct_a, user_a, pass_a = await _create_account_and_user(test_db, "a") - acct_b, user_b, pass_b = await _create_account_and_user(test_db, "b") + """Account A cannot read flow analytics for Account B's private tree.""" + acct_a, user_a, pass_a = await _create_account_and_user(test_db, "anl-a") + acct_b, user_b, pass_b = await _create_account_and_user(test_db, "anl-b") tree_b = await _create_private_tree(test_db, acct_b, user_b) await test_db.commit() headers_a = await _login(client, user_a.email, pass_a) - resp = await client.post( - "/api/v1/copilot/conversations", - json={"tree_id": str(tree_b.id), "session_id": None, "current_node_id": None}, + resp = await client.get( + f"/api/v1/analytics/flows/{tree_b.id}", headers=headers_a, ) - # Must be 404 (not 200, not 403 — never confirm existence) assert resp.status_code == 404, f"Expected 404, got {resp.status_code}: {resp.text}" -@pytest.mark.asyncio -async def test_copilot_service_rejects_cross_tenant_tree(test_db: AsyncSession): - """copilot_service.start_conversation raises ValueError for cross-tenant tree.""" - from app.services import copilot_service +# ── Task 4: Category tree count ─────────────────────────────────────────────── - acct_a, user_a, pass_a = await _create_account_and_user(test_db, "svc-a") - acct_b, user_b, pass_b = await _create_account_and_user(test_db, "svc-b") +@pytest.mark.asyncio +async def test_category_tree_count_scoped_to_account( + client: AsyncClient, test_db: AsyncSession +): + """tree_count on a category must not include trees from other accounts.""" + from app.models.category import TreeCategory + + acct_a, user_a, pass_a = await _create_account_and_user(test_db, "cat-a") + acct_b, user_b, pass_b = await _create_account_and_user(test_db, "cat-b") + + # Shared category (account_id=None means global) + category = TreeCategory( + name="Shared Category", + slug=f"shared-cat-{uuid.uuid4().hex[:6]}", + account_id=None, + is_active=True, + ) + test_db.add(category) + await test_db.flush() + + # 3 trees for account_b under this category + for i in range(3): + tree = Tree( + name=f"B Tree {i}", + account_id=acct_b.id, + author_id=user_b.id, + category_id=category.id, + visibility="team", + tree_type="troubleshooting", + tree_structure={"id": "root", "type": "start", "children": []}, + is_active=True, + status="published", + ) + test_db.add(tree) + + # 1 tree for account_a under this category + tree_a = Tree( + name="A Tree", + account_id=acct_a.id, + author_id=user_a.id, + category_id=category.id, + visibility="team", + tree_type="troubleshooting", + tree_structure={"id": "root", "type": "start", "children": []}, + is_active=True, + status="published", + ) + test_db.add(tree_a) + await test_db.commit() + + headers_a = await _login(client, user_a.email, pass_a) + + resp = await client.get( + f"/api/v1/categories/{category.id}", + headers=headers_a, + ) + assert resp.status_code == 200, resp.text + # account_a should only see their 1 tree, not account_b's 3 + assert resp.json()["tree_count"] == 1, ( + f"Expected tree_count=1 (own trees only), got {resp.json()['tree_count']}" + ) + + +# ── Task 5: AI session search scope ────────────────────────────────────────── + +@pytest.mark.asyncio +async def test_ai_session_search_cannot_see_other_users_sessions( + client: AsyncClient, test_db: AsyncSession +): + """User A cannot find User B's AI sessions via the search endpoint, + even when both users are in the same account.""" + from app.models.ai_session import AISession + + # Two users in the SAME account + account = Account(name="Shared Corp", display_code=uuid.uuid4().hex[:8]) + test_db.add(account) + await test_db.flush() + + password = "TestPass123!" + user_a = User( + email=f"user-a-{uuid.uuid4().hex[:6]}@shared.com", + name="User A", + password_hash=get_password_hash(password), + is_active=True, + account_id=account.id, + account_role="engineer", + ) + user_b = User( + email=f"user-b-{uuid.uuid4().hex[:6]}@shared.com", + name="User B", + password_hash=get_password_hash(password), + is_active=True, + account_id=account.id, + account_role="engineer", + ) + test_db.add_all([user_a, user_b]) + await test_db.flush() + + # Session belonging to user_b with distinctive problem_summary + session_b = AISession( + user_id=user_b.id, + account_id=account.id, + problem_summary="CONFIDENTIAL: user_b's session", + problem_domain="networking", + status="resolved", + ) + test_db.add(session_b) + await test_db.commit() + + headers_a = await _login(client, user_a.email, password) + + resp = await client.get( + "/api/v1/ai-sessions/search", + params={"q": "CONFIDENTIAL"}, + headers=headers_a, + ) + assert resp.status_code == 200, resp.text + results = resp.json() + ids = [r["id"] for r in results] + assert str(session_b.id) not in ids, ( + "User A can see User B's session via search — cross-user leak within account" + ) + + +# ── Task 6: Cross-tenant UUID audit ───────────────────────────────────────── + + +@pytest.mark.asyncio +async def test_get_tree_returns_404_not_403_for_other_account( + client: AsyncClient, test_db: AsyncSession +): + """Account A gets 404 (not 403) when accessing Account B's private tree.""" + acct_a, user_a, pass_a = await _create_account_and_user(test_db, "t6-tree-a") + acct_b, user_b, pass_b = await _create_account_and_user(test_db, "t6-tree-b") tree_b = await _create_private_tree(test_db, acct_b, user_b) await test_db.commit() - with pytest.raises(ValueError, match="not found"): - await copilot_service.start_conversation( - user_id=user_a.id, - account_id=user_a.account_id, - tree_id=tree_b.id, - session_id=None, - current_node_id=None, - db=test_db, - ) + headers_a = await _login(client, user_a.email, pass_a) + resp = await client.get(f"/api/v1/trees/{tree_b.id}", headers=headers_a) + assert resp.status_code == 404, ( + f"Expected 404 for cross-tenant tree access, got {resp.status_code}" + ) + + +@pytest.mark.asyncio +async def test_update_tree_returns_404_not_403_for_other_account( + client: AsyncClient, test_db: AsyncSession +): + """Account A gets 404 (not 403) when trying to update Account B's tree.""" + acct_a, user_a, pass_a = await _create_account_and_user(test_db, "t6-upd-a") + acct_b, user_b, pass_b = await _create_account_and_user(test_db, "t6-upd-b") + tree_b = await _create_private_tree(test_db, acct_b, user_b) + await test_db.commit() + + headers_a = await _login(client, user_a.email, pass_a) + resp = await client.put( + f"/api/v1/trees/{tree_b.id}", + json={"name": "Hacked"}, + headers=headers_a, + ) + assert resp.status_code == 404, ( + f"Expected 404 for cross-tenant tree update, got {resp.status_code}" + ) + + +@pytest.mark.asyncio +async def test_get_session_returns_404_not_403_for_other_user( + client: AsyncClient, test_db: AsyncSession +): + """User A gets 404 (not 403) when accessing User B's session.""" + from app.models.session import Session + + acct_a, user_a, pass_a = await _create_account_and_user(test_db, "t6-sess-a") + acct_b, user_b, pass_b = await _create_account_and_user(test_db, "t6-sess-b") + tree_b = await _create_private_tree(test_db, acct_b, user_b) + + session_b = Session( + tree_id=tree_b.id, + user_id=user_b.id, + tree_snapshot={"id": "root", "type": "start", "children": []}, + path_taken=[], + decisions=[], + ) + test_db.add(session_b) + await test_db.commit() + + headers_a = await _login(client, user_a.email, pass_a) + resp = await client.get(f"/api/v1/sessions/{session_b.id}", headers=headers_a) + assert resp.status_code == 404, ( + f"Expected 404 for cross-user session access, got {resp.status_code}" + ) + + +@pytest.mark.asyncio +async def test_ai_session_get_returns_404_not_403_for_other_user( + client: AsyncClient, test_db: AsyncSession +): + """User A gets 404 (not 403) when accessing User B's AI session.""" + from app.models.ai_session import AISession + + acct_a, user_a, pass_a = await _create_account_and_user(test_db, "t6-ais-a") + acct_b, user_b, pass_b = await _create_account_and_user(test_db, "t6-ais-b") + + ai_session_b = AISession( + user_id=user_b.id, + account_id=acct_b.id, + problem_summary="Test session", + problem_domain="networking", + status="active", + ) + test_db.add(ai_session_b) + await test_db.commit() + + headers_a = await _login(client, user_a.email, pass_a) + resp = await client.get(f"/api/v1/ai-sessions/{ai_session_b.id}", headers=headers_a) + assert resp.status_code == 404, ( + f"Expected 404 for cross-user AI session access, got {resp.status_code}" + ) + + +@pytest.mark.asyncio +async def test_ai_session_retry_psa_push_requires_ownership( + client: AsyncClient, test_db: AsyncSession +): + """User A cannot retry PSA push for User B's AI session.""" + from app.models.ai_session import AISession + + acct_a, user_a, pass_a = await _create_account_and_user(test_db, "t6-psa-a") + acct_b, user_b, pass_b = await _create_account_and_user(test_db, "t6-psa-b") + + ai_session_b = AISession( + user_id=user_b.id, + account_id=acct_b.id, + problem_summary="PSA test", + problem_domain="networking", + status="resolved", + ) + test_db.add(ai_session_b) + await test_db.commit() + + headers_a = await _login(client, user_a.email, pass_a) + resp = await client.post( + f"/api/v1/ai-sessions/{ai_session_b.id}/retry-psa-push", + headers=headers_a, + ) + assert resp.status_code == 404, ( + f"Expected 404 for cross-user retry-psa-push, got {resp.status_code}" + ) + + +@pytest.mark.asyncio +async def test_upload_url_returns_404_not_403_for_other_account( + client: AsyncClient, test_db: AsyncSession +): + """User A gets 404 (not 403) when accessing User B's upload URL.""" + from app.models.file_upload import FileUpload + + acct_a, user_a, pass_a = await _create_account_and_user(test_db, "t6-upl-a") + acct_b, user_b, pass_b = await _create_account_and_user(test_db, "t6-upl-b") + + upload_b = FileUpload( + account_id=acct_b.id, + uploaded_by=user_b.id, + filename="secret.png", + content_type="image/png", + size_bytes=1024, + storage_key="test/secret.png", + ) + test_db.add(upload_b) + await test_db.commit() + + headers_a = await _login(client, user_a.email, pass_a) + resp = await client.get(f"/api/v1/uploads/{upload_b.id}/url", headers=headers_a) + assert resp.status_code in (404, 503), ( + f"Expected 404 (or 503 if storage not configured) for cross-account upload, got {resp.status_code}" + ) + + +@pytest.mark.asyncio +async def test_share_revoke_returns_404_not_403_for_other_user( + client: AsyncClient, test_db: AsyncSession +): + """User A gets 404 (not 403) when revoking User B's share.""" + from app.models.session import Session + from app.models.session_share import SessionShare + + acct_a, user_a, pass_a = await _create_account_and_user(test_db, "t6-shr-a") + acct_b, user_b, pass_b = await _create_account_and_user(test_db, "t6-shr-b") + tree_b = await _create_private_tree(test_db, acct_b, user_b) + + session_b = Session( + tree_id=tree_b.id, + user_id=user_b.id, + tree_snapshot={"id": "root", "type": "start", "children": []}, + path_taken=[], + decisions=[], + ) + test_db.add(session_b) + await test_db.flush() + + share_b = SessionShare( + session_id=session_b.id, + account_id=acct_b.id, + share_token="test-token-unique-" + uuid.uuid4().hex[:8], + share_name="Test", + visibility="public", + created_by=user_b.id, + ) + test_db.add(share_b) + await test_db.commit() + + headers_a = await _login(client, user_a.email, pass_a) + resp = await client.delete(f"/api/v1/shares/{share_b.id}", headers=headers_a) + assert resp.status_code == 404, ( + f"Expected 404 for cross-user share revoke, got {resp.status_code}" + ) + + +# ── Task 6 (continued): steps, tags, step_categories, maintenance_schedules ── + + +@pytest.mark.asyncio +async def test_cannot_access_other_account_step( + client: AsyncClient, test_db: AsyncSession +): + """User A gets 404 when reading a team-visibility step owned by Account B.""" + from app.models.step_library import StepLibrary + + acct_a, user_a, pass_a = await _create_account_and_user(test_db, "t6-step-a") + acct_b, user_b, pass_b = await _create_account_and_user(test_db, "t6-step-b") + + # Create a team-visibility step owned by account B + step_b = StepLibrary( + title="Account B Confidential Step", + step_type="action", + content={"description": "secret step"}, + created_by=user_b.id, + account_id=acct_b.id, + visibility="team", + is_active=True, + ) + test_db.add(step_b) + await test_db.commit() + + headers_a = await _login(client, user_a.email, pass_a) + resp = await client.get(f"/api/v1/steps/{step_b.id}", headers=headers_a) + assert resp.status_code == 404, ( + f"Expected 404 for cross-account step access, got {resp.status_code}: {resp.text}" + ) + + +@pytest.mark.asyncio +async def test_cannot_access_other_account_tag( + client: AsyncClient, test_db: AsyncSession +): + """User A gets 404 when reading a tag scoped to Account B.""" + from app.models.tag import TreeTag + + acct_a, user_a, pass_a = await _create_account_and_user(test_db, "t6-tag-a") + acct_b, user_b, pass_b = await _create_account_and_user(test_db, "t6-tag-b") + + # Create an account-scoped tag for account B + tag_b = TreeTag( + name=f"account-b-tag-{uuid.uuid4().hex[:6]}", + slug=f"account-b-tag-{uuid.uuid4().hex[:6]}", + account_id=acct_b.id, + ) + test_db.add(tag_b) + await test_db.commit() + + headers_a = await _login(client, user_a.email, pass_a) + resp = await client.get(f"/api/v1/tags/{tag_b.id}", headers=headers_a) + assert resp.status_code == 404, ( + f"Expected 404 for cross-account tag access, got {resp.status_code}: {resp.text}" + ) + + +@pytest.mark.asyncio +async def test_cannot_access_other_account_step_category( + client: AsyncClient, test_db: AsyncSession +): + """User A gets 404 when reading a step category scoped to Account B.""" + from app.models.step_category import StepCategory + + acct_a, user_a, pass_a = await _create_account_and_user(test_db, "t6-scat-a") + acct_b, user_b, pass_b = await _create_account_and_user(test_db, "t6-scat-b") + + # Create an account-scoped step category for account B + category_b = StepCategory( + name=f"Account B Category {uuid.uuid4().hex[:6]}", + slug=f"account-b-cat-{uuid.uuid4().hex[:6]}", + account_id=acct_b.id, + is_active=True, + ) + test_db.add(category_b) + await test_db.commit() + + headers_a = await _login(client, user_a.email, pass_a) + resp = await client.get(f"/api/v1/step-categories/{category_b.id}", headers=headers_a) + assert resp.status_code == 404, ( + f"Expected 404 for cross-account step category access, got {resp.status_code}: {resp.text}" + ) + + +@pytest.mark.asyncio +async def test_maintenance_schedule_returns_404_for_other_team( + client: AsyncClient, test_db: AsyncSession +): + """User A gets 404 when reading a maintenance schedule belonging to Team B's tree.""" + from app.models.team import Team + from app.models.maintenance_schedule import MaintenanceSchedule + + # Create two separate teams + team_a = Team(name="Team A Corp") + team_b = Team(name="Team B Corp") + test_db.add_all([team_a, team_b]) + await test_db.flush() + + # Create accounts and users, assign to respective teams + acct_a, user_a, pass_a = await _create_account_and_user(test_db, "t6-ms-a") + acct_b, user_b, pass_b = await _create_account_and_user(test_db, "t6-ms-b") + user_a.team_id = team_a.id + user_b.team_id = team_b.id + await test_db.flush() + + # Create a maintenance tree owned by team B + tree_b = Tree( + name="Team B Maintenance Flow", + account_id=acct_b.id, + author_id=user_b.id, + team_id=team_b.id, + visibility="team", + tree_type="maintenance", + tree_structure={"id": "root", "type": "start", "children": []}, + is_active=True, + status="published", + ) + test_db.add(tree_b) + await test_db.flush() + + # Create a schedule for that tree + schedule_b = MaintenanceSchedule( + tree_id=tree_b.id, + created_by=user_b.id, + cron_expression="0 2 * * 0", + timezone="UTC", + is_active=True, + next_run_at=dt.datetime(2026, 12, 31, tzinfo=dt.timezone.utc), + ) + test_db.add(schedule_b) + await test_db.commit() + + headers_a = await _login(client, user_a.email, pass_a) + resp = await client.get(f"/api/v1/maintenance-schedules/tree/{tree_b.id}", headers=headers_a) + assert resp.status_code == 404, ( + f"Expected 404 for cross-team maintenance schedule access, got {resp.status_code}: {resp.text}" + ) + + +@pytest.mark.asyncio +async def test_get_documentation_returns_404_for_other_user_session( + client: AsyncClient, test_db: AsyncSession +): + """GET /ai-sessions/{id}/documentation must return 404 (not 403) for cross-user access.""" + from app.models.ai_session import AISession + + acct_a, user_a, pass_a = await _create_account_and_user(test_db, "doc-a") + acct_b, user_b, pass_b = await _create_account_and_user(test_db, "doc-b") + + session_b = AISession( + user_id=user_b.id, + account_id=acct_b.id, + problem_summary="B's confidential session", + problem_domain="networking", + status="resolved", + ) + test_db.add(session_b) + await test_db.commit() + + headers_a = await _login(client, user_a.email, pass_a) + resp = await client.get( + f"/api/v1/ai-sessions/{session_b.id}/documentation", + headers=headers_a, + ) + assert resp.status_code == 404, f"Expected 404, got {resp.status_code}: {resp.text}" diff --git a/docs/superpowers/specs/2026-04-09-tenant-data-isolation-design.md b/docs/superpowers/specs/2026-04-09-tenant-data-isolation-design.md index be49cc05..a00f4cb2 100644 --- a/docs/superpowers/specs/2026-04-09-tenant-data-isolation-design.md +++ b/docs/superpowers/specs/2026-04-09-tenant-data-isolation-design.md @@ -589,7 +589,8 @@ The following cross-tenant operations are explicitly approved. All others requir |---|---|---| | Is tree sharing intra-tenant only, or can trees be shared across accounts? | Determines `tree_shares` schema, backfill strategy, and RLS policy. Tree_shares table deferred until resolved. | Product | | What is the exact PostgreSQL exception raised when `current_setting('app.current_account_id', false)::uuid` is evaluated with no value set? | Determines the fail-closed test assertion. Must be tested in Phase 2. | Engineering | -| TargetList: zero references + zero rows? Or does data exist? | Determines whether table is dropped or migrated. | Phase 0 audit | +| TargetList audit complete: active code references found in 12+ files across backend and frontend (full CRUD API, frontend page, used in MaintenanceScheduleSection and BatchLaunchModal). Cannot be dropped. Row count confirmed: **0 rows** in production. Decision: migrate to account_id in Phase 1 via backfill from team_id → accounts. Zero rows means backfill is trivial (no data to migrate, just schema change). | ✓ Resolved — migrate in Phase 1. Zero rows confirmed 2026-04-09. | ✓ Done | +| Teams orphan check: **0 orphaned teams** confirmed 2026-04-09. Phase 1 backfill using team_id → account_id chain is safe to proceed. | ✓ Resolved — Phase 1 can proceed without team cleanup. | ✓ Done | --- From 85575839f2762a28b9036eb7ba5ede64e57fc8d3 Mon Sep 17 00:00:00 2001 From: Claude Date: Thu, 9 Apr 2026 10:41:21 +0000 Subject: [PATCH 4/4] docs: update CHANGELOG with tenant isolation Phase 0 and security fixes MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Add Tenant Isolation Phase 0 (#132) — app-layer filtering, cross-tenant audit, UUID isolation - Document CRITICAL copilot tree query isolation fix (#131) - Add AI session search, analytics, category, PSA retry, and task lane fixes - Note 404 (not 403) responses for cross-tenant access to avoid confirming resource existence https://claude.ai/code/session_014EUBLi2jHrnzJupcetmdwV --- CHANGELOG.md | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/CHANGELOG.md b/CHANGELOG.md index 88f40644..0cb1e2e0 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -8,6 +8,7 @@ All notable changes to ResolutionFlow are documented here. - Tree Templates + Import/Export marketplace (#66) - Recurring Issue Detection — client-specific pattern alerts (#60) - Step Feedback Flag — "This Step is Wrong" reporting (#58) +- **Tenant Isolation Phase 0** — multi-tenant data isolation (#132) with app-layer filtering helpers (`tenant_filter()`, `get_tenant_context`), cross-tenant access audit (analytics, categories, AI sessions, trees), UUID endpoint isolation with 404 responses for unauthorized access, ownership checks on all sensitive operations, and CI grep gate for missing tenant filters - **Script Library default view** — "All Scripts" tab now displays all accessible scripts (team + library) - **Session documentation overhaul** — reformatted PSA resolution/escalation notes with cleaner headers, inline engineer responses, decimal hour display (0.25 hrs), follow-up recommendations, and improved "What We Know" section from evidence items - **Client communication improvements** — new `request_info` audience type for client-facing information requests, improved status update and email draft prompts with per-context guidance @@ -21,8 +22,15 @@ All notable changes to ResolutionFlow are documented here. - **Status update generation** — fixed option label lookup to use human-readable labels instead of machine values - **Assistant Chat session actions** — moved Pause/Resume/Close actions from action bar to page header for consistency with FlowPilot - **Design system token normalization** — unified FlowPilot, AssistantChat, and ScriptBuilder components to use consistent design tokens +- **Tenant data boundaries** — all session and tree endpoints now return 404 (not 403) for cross-tenant access attempts to avoid confirming resource existence ### Fixed +- **CRITICAL: Copilot tree query isolation** (#131) — user could access any tree UUID if known, exposing full tree structure to AI. Now scoped to current account with 404 for inaccessible trees. +- **AI session search isolation** — search endpoint leaked other users' sessions via OR(user_id, account_id). Now restricted to current user only. +- **Analytics endpoint isolation** — GET `/analytics/flows/{tree_id}` exposed session counts for any tree UUID. Now returns 404 if tree doesn't belong to requesting account. +- **Category tree counts** — cross-tenant row count leakage via tree_count field in GET `/categories/{id}`. Now scoped to requesting account. +- **PSA retry ownership check** — retry-psa-push had no ownership validation (CRITICAL). Now validates user ownership before allowing retry. +- **Task Lane save operation** — invalid task_lane_item UUIDs returned 403 revealing existence. Now returns 404 and uses query-level filtering. - Dark text rendering on blue accent step-number badges across all flow types - Script Library tab ownership filter now preserved across category and search changes - Race conditions in script builder session creation and slug generation