4 Commits

Author SHA1 Message Date
036431aef8 chore(ai): update HANDOFF.md and SESSION_LOG.md for session end
All checks were successful
Mirror to GitHub / mirror (push) Successful in 3s
Reflect current state: dual-agent migration + Codex review round +
branch cleanup (RLS test gating, Phase 9 docs, .remember/ gitignore,
landing-handoff deletion). Working tree clean, no active task, 3
cleanup commits queued to push.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-24 16:16:55 -04:00
b3be1e0749 chore: ignore .remember/ skill runtime state
Runtime hook logs and PIDs from the remember skill — local-only, not
repo content.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-24 16:09:23 -04:00
b3506b5e73 docs(pilot): phase 9 review issues
Review findings companion to docs/FlowAssist_Migration/Issues/phase-8-review-issues.md.
Documents the issues addressed by commit 24972e8 (partial-outcome notes
+ per-fix script-builder remount).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-24 16:09:23 -04:00
b14a16a1ab chore(tests): gate RLS tests behind RUN_RLS_TESTS flag
Continues the test-isolation work from dab740d. RLS migration tests run
against a policy-installed database and fail in the default create_all
suite, so they need to be opt-in:

- pytest.ini: register `rls` marker.
- conftest.py: auto-deselect test_rls_isolation.py unless
  RUN_RLS_TESTS=1. Drops the deprecated session-scoped event_loop
  fixture (not needed since pytest-asyncio 0.23+).
- test_rls_isolation.py: tag module with `rls` marker. Replace
  hardcoded `patherly_test` DB reference with parsed DATABASE_TEST_URL
  (matches conftest.py default `resolutionflow_test`). Updated docstring
  command to show RUN_RLS_TESTS=1.
- requirements-dev.txt: bump pytest-asyncio 0.23.0 → 0.24.0 (loop-scope
  marker behavior required by the RLS module fixture).

Run the RLS suite with:
  RUN_RLS_TESTS=1 DB_APP_ROLE_PASSWORD=... pytest tests/test_rls_isolation.py

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-24 16:09:13 -04:00
8 changed files with 211 additions and 51 deletions

View File

@@ -2,23 +2,34 @@
# HANDOFF.md
**Last updated:** 2026-04-24 (America/New_York) — initial setup
**Last updated:** 2026-04-24 (America/New_York)
**Active task:** See [CURRENT_TASK.md](CURRENT_TASK.md). (No real task yet — handoff system just initialized.)
**Active task:** None — see [CURRENT_TASK.md](CURRENT_TASK.md). Replace it when picking up the next real task.
**Branch:** `feat/flowpilot-migration`
**Branch:** `feat/flowpilot-migration` — a long-running FlowPilot Phase 9 feature branch. The recent AI-handoff migration commits ride on this branch (not on their own branch); they'll merge to `main` whenever Phase 9 does.
**Branch state:** 3 commits ahead of `origin/feat/flowpilot-migration`:
- `b3be1e0 chore: ignore .remember/ skill runtime state`
- `b3506b5 docs(pilot): phase 9 review issues`
- `b14a16a chore(tests): gate RLS tests behind RUN_RLS_TESTS flag`
Earlier in this session (already pushed to origin):
- `9c8ba29 fix(ai): correct stale role-hierarchy and file-listing claims`
- `bee8690 chore(ai): migrate to dual-agent handoff system`
- `e110fed chore: snapshot CLAUDE.md before ai-handoff migration` (tag: `pre-ai-handoff`)
**Where I left off:**
- File: n/a
- Next intended action: first agent to pick up real work should replace `CURRENT_TASK.md` example and start fresh here.
- File: n/a — nothing mid-edit.
- Next intended action: push the 3 unpushed commits when ready (`git push`), then start the next real task (replace `CURRENT_TASK.md`, update this file).
**Uncommitted state:**
- `backend/pytest.ini`, `backend/requirements-dev.txt`, `backend/tests/conftest.py`, `backend/tests/test_rls_isolation.py` modified (pre-existing, unrelated to handoff-system migration).
- Safe to WIP-commit? Ask Michael — these are mid-flight test isolation fixes from `dab740d`.
- Working tree is clean.
**Immediate next steps:**
1. When picking up real work, replace this file's body with the actual resume point.
2. Keep this file under ~2K tokens. If it grows, move older context into `SESSION_LOG.md`.
1. `git push` to publish the 3 local commits (cleanup batch).
2. When starting the next real feature task: replace `CURRENT_TASK.md` with actual goal/DoD, rewrite this file's resume section.
**Open questions / blockers:**
- None.
- None. The dual-agent handoff system is live and has survived one Codex review round (see DECISIONS.md 2026-04-24 entry; corrections in `9c8ba29`).

View File

@@ -20,3 +20,4 @@
- Left for next session: first real feature task should replace the seed `CURRENT_TASK.md` and update `HANDOFF.md` with real resume state.
- Files touched: `.ai/*.md` (created), `CLAUDE.md` (rewritten), `AGENTS.md` (created), `SESSION-HANDOFF.md` (deleted).
- Follow-up (same day): Codex review pass flagged stale SaaS-role claim and incomplete file-listings carried over from the pre-migration CLAUDE.md. Verified against `backend/app/core/permissions.py`, `frontend/src/hooks/usePermissions.ts`, `backend/app/api/deps.py`, `backend/app/api/router.py`, and `backend/app/services/psa/`. Corrected PROJECT_CONTEXT.md role hierarchy (`super_admin > owner > engineer > viewer`, not `team_admin`), added `require_account_owner` / `require_team_admin` to deps list, replaced stale endpoint comment with a summary pointing at `api/router.py`, added `exceptions.py` + `ticket_context.py` to the PSA file list. Also replaced seed-example content in `CURRENT_TASK.md` and `TODO.md` with clearer empty-state sentinels.
- Branch cleanup (same day): committed pending test-isolation work as `b14a16a chore(tests): gate RLS tests behind RUN_RLS_TESTS flag`, new Phase 9 review doc as `b3506b5 docs(pilot): phase 9 review issues`, and `.remember/` gitignore entry as `b3be1e0 chore: ignore .remember/ skill runtime state`. Deleted `docs/landing-handoff/` (prepared for external design work, not meant to live in the repo). Working tree clean; 3 cleanup commits unpushed.

3
.gitignore vendored
View File

@@ -238,3 +238,6 @@ package-lock.json
# graphify knowledge graph outputs
graphify-out/
.graphify_python
# remember skill runtime state (hook logs, PIDs)
.remember/

View File

@@ -27,6 +27,7 @@ markers =
slow: marks tests as slow (deselect with '-m "not slow"')
integration: marks tests as integration tests
unit: marks tests as unit tests
rls: opt-in RLS migration and policy tests (run with RUN_RLS_TESTS=1)
# Ignore paths
testpaths = tests

View File

@@ -3,7 +3,7 @@
# Testing
pytest==7.4.3
pytest-asyncio==0.23.0
pytest-asyncio==0.24.0
httpx>=0.27.0
pytest-cov==4.1.0

View File

@@ -4,8 +4,8 @@ Pytest configuration and fixtures for integration tests.
Provides test database setup, client fixtures, and authentication helpers.
"""
import asyncio
from typing import AsyncGenerator, Generator
import os
from typing import AsyncGenerator
import pytest
import sqlalchemy as sa
from httpx import AsyncClient, ASGITransport
@@ -26,7 +26,6 @@ settings.REQUIRE_INVITE_CODE = False
# would silently nuke the dev database. Only DATABASE_TEST_URL is honored,
# and the safety assertion below refuses to run against a DB whose name
# doesn't contain "test".
import os
TEST_DATABASE_URL = os.environ.get(
"DATABASE_TEST_URL",
"postgresql+asyncpg://postgres:postgres@localhost:5432/resolutionflow_test",
@@ -43,13 +42,27 @@ assert "test" in _test_db_name, (
f"test database (e.g. resolutionflow_test)."
)
_RUN_RLS_TESTS = os.environ.get("RUN_RLS_TESTS") == "1"
_RLS_ISOLATION_FILE = "test_rls_isolation.py"
@pytest.fixture(scope="session")
def event_loop() -> Generator:
"""Create an instance of the default event loop for each test case."""
loop = asyncio.get_event_loop_policy().new_event_loop()
yield loop
loop.close()
def pytest_collection_modifyitems(config, items):
"""Keep migration-managed RLS checks out of the default create_all suite."""
if _RUN_RLS_TESTS:
return
selected = []
deselected = []
for item in items:
item_path = getattr(item, "path", None) or getattr(item, "fspath", None)
if item_path and str(item_path).endswith(_RLS_ISOLATION_FILE):
deselected.append(item)
else:
selected.append(item)
if deselected:
config.hook.pytest_deselected(items=deselected)
items[:] = selected
@pytest.fixture

View File

@@ -11,30 +11,57 @@ Tests bypass FastAPI entirely — raw asyncpg connections only.
MUST FAIL before Task 10 (RLS migration) and PASS after it.
Run with:
DB_APP_ROLE_PASSWORD=app_secret_change_me pytest tests/test_rls_isolation.py -v
RUN_RLS_TESTS=1 DB_APP_ROLE_PASSWORD=app_secret_change_me pytest tests/test_rls_isolation.py -v
The test DB is patherly_test (matches conftest.py default).
The test DB comes from DATABASE_TEST_URL, matching conftest.py.
"""
import os
import subprocess
import sys
import uuid
from pathlib import Path
from urllib.parse import unquote, urlsplit
import asyncpg
import pytest
import pytest_asyncio
# All tests in this module use module-scoped async fixtures (admin_conn,
# seed_rls_test_data) which run on the module event loop. Without this marker,
# pytest-asyncio 0.23+ defaults tests to function-scoped loops, causing
# "Future attached to a different loop" errors on the asyncpg connections.
pytestmark = pytest.mark.asyncio(loop_scope="module")
pytestmark = [
pytest.mark.asyncio(loop_scope="module"),
pytest.mark.rls,
]
_DB_HOST = os.getenv("TEST_DB_HOST", "localhost")
_DB_PORT = int(os.getenv("TEST_DB_PORT", "5432"))
_DB_NAME = os.getenv("TEST_DB_NAME", "patherly_test") # matches conftest.py
_DATABASE_TEST_URL = os.getenv(
"DATABASE_TEST_URL",
"postgresql+asyncpg://postgres:postgres@localhost:5432/resolutionflow_test",
)
_DATABASE_TEST_URL_ASYNCPG = _DATABASE_TEST_URL.replace(
"postgresql+asyncpg://",
"postgresql://",
1,
)
_DATABASE_TEST_URL_SYNC = _DATABASE_TEST_URL_ASYNCPG
_TEST_DB_PARTS = urlsplit(_DATABASE_TEST_URL_ASYNCPG)
_DB_HOST = os.getenv("TEST_DB_HOST", _TEST_DB_PARTS.hostname or "localhost")
_DB_PORT = int(os.getenv("TEST_DB_PORT", str(_TEST_DB_PARTS.port or 5432)))
_DB_NAME = os.getenv(
"TEST_DB_NAME",
unquote(_TEST_DB_PARTS.path.lstrip("/") or "resolutionflow_test"),
)
_ADMIN_USER = os.getenv(
"TEST_DB_ADMIN_USER",
unquote(_TEST_DB_PARTS.username or "postgres"),
)
_ADMIN_PASSWORD = os.getenv(
"TEST_DB_ADMIN_PASSWORD",
unquote(_TEST_DB_PARTS.password or "postgres"),
)
_APP_PASSWORD = os.getenv("DB_APP_ROLE_PASSWORD", "app_secret_change_me")
_ADMIN_DSN = f"postgresql://postgres:postgres@{_DB_HOST}:{_DB_PORT}/{_DB_NAME}"
PLATFORM_ACCOUNT_ID = "00000000-0000-0000-0000-000000000001"
ACCOUNT_A_ID = "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
@@ -55,23 +82,33 @@ def _ensure_rls_schema():
the full migration-managed schema (including RLS policies) is in place.
"""
backend_dir = Path(__file__).parent.parent
env = os.environ.copy()
env["DATABASE_URL"] = _DATABASE_TEST_URL
env["DATABASE_URL_SYNC"] = _DATABASE_TEST_URL_SYNC
subprocess.run(
[sys.executable, "-m", "alembic", "upgrade", "head"],
cwd=backend_dir,
env=env,
check=True,
capture_output=True,
)
@pytest.fixture(scope="module")
@pytest_asyncio.fixture(scope="module", loop_scope="module")
async def admin_conn(_ensure_rls_schema):
"""Superuser asyncpg connection for fixture setup and teardown."""
conn = await asyncpg.connect(_ADMIN_DSN)
conn = await asyncpg.connect(
host=_DB_HOST,
port=_DB_PORT,
database=_DB_NAME,
user=_ADMIN_USER,
password=_ADMIN_PASSWORD,
)
yield conn
await conn.close()
@pytest.fixture(scope="module", autouse=True)
@pytest_asyncio.fixture(scope="module", loop_scope="module", autouse=True)
async def seed_rls_test_data(admin_conn):
"""
Create two isolated test accounts, one user per account, and one private
@@ -154,7 +191,7 @@ async def seed_rls_test_data(admin_conn):
await admin_conn.execute("DELETE FROM tree_tags WHERE slug = 'rls-global-tag'")
@pytest.fixture
@pytest_asyncio.fixture(loop_scope="module")
async def conn_a():
"""App-role connection, tenant context = Account A."""
conn = await asyncpg.connect(
@@ -168,7 +205,7 @@ async def conn_a():
await conn.close()
@pytest.fixture
@pytest_asyncio.fixture(loop_scope="module")
async def conn_b():
"""App-role connection, tenant context = Account B."""
conn = await asyncpg.connect(
@@ -182,7 +219,7 @@ async def conn_b():
await conn.close()
@pytest.fixture
@pytest_asyncio.fixture(loop_scope="module")
async def conn_no_context():
"""App-role connection with NO tenant context set."""
conn = await asyncpg.connect(
@@ -288,7 +325,7 @@ async def test_flow_proposals_account_a_cannot_see_account_b(conn_a):
# Phase 2 fixtures
# ---------------------------------------------------------------------------
@pytest.fixture(scope="module")
@pytest_asyncio.fixture(scope="module", loop_scope="module")
async def session_row_ids(admin_conn):
"""
Insert one `sessions` row and one `ai_sessions` row for each of
@@ -644,13 +681,15 @@ async def test_psa_post_log_account_a_cannot_see_account_b(conn_a, session_row_i
async def test_step_library_account_a_cannot_see_account_b_private_steps(admin_conn, conn_a):
"""Private/non-public steps owned by Account B must not be visible to Account A."""
user_b_id = await _get_user_b_id(admin_conn)
private_step_id = str(uuid.uuid4())
await admin_conn.execute(f"""
INSERT INTO step_library (
id, account_id, title, step_type, content,
id, account_id, created_by, title, step_type, content,
visibility, is_active, created_at, updated_at
) VALUES (
'{private_step_id}', '{ACCOUNT_B_ID}', 'RLS Private Step', 'action',
'{private_step_id}', '{ACCOUNT_B_ID}', '{user_b_id}',
'RLS Private Step', 'action',
'{{}}'::jsonb, 'private', TRUE, NOW(), NOW()
)
""")
@@ -668,13 +707,15 @@ async def test_step_library_account_a_cannot_see_account_b_private_steps(admin_c
async def test_step_library_account_a_can_see_account_b_public_steps(admin_conn, conn_a):
"""Public steps owned by Account B MUST be visible to Account A (cross-tenant visibility)."""
user_b_id = await _get_user_b_id(admin_conn)
public_step_id = str(uuid.uuid4())
await admin_conn.execute(f"""
INSERT INTO step_library (
id, account_id, title, step_type, content,
id, account_id, created_by, title, step_type, content,
visibility, is_active, created_at, updated_at
) VALUES (
'{public_step_id}', '{ACCOUNT_B_ID}', 'RLS Public Step', 'action',
'{public_step_id}', '{ACCOUNT_B_ID}', '{user_b_id}',
'RLS Public Step', 'action',
'{{}}'::jsonb, 'public', TRUE, NOW(), NOW()
)
""")
@@ -728,10 +769,11 @@ async def test_step_ratings_account_a_cannot_see_account_b(admin_conn, conn_a):
step_id = str(uuid.uuid4())
await admin_conn.execute(f"""
INSERT INTO step_library (
id, account_id, title, step_type, content,
id, account_id, created_by, title, step_type, content,
visibility, is_active, created_at, updated_at
) VALUES (
'{step_id}', '{ACCOUNT_B_ID}', 'Phase3 RLS Step', 'action',
'{step_id}', '{ACCOUNT_B_ID}', '{user_b_id}',
'Phase3 RLS Step', 'action',
'{{}}'::jsonb, 'private', TRUE, NOW(), NOW()
)
""")
@@ -768,10 +810,11 @@ async def test_step_usage_log_account_a_cannot_see_account_b(admin_conn, conn_a)
step_id = str(uuid.uuid4())
await admin_conn.execute(f"""
INSERT INTO step_library (
id, account_id, title, step_type, content,
id, account_id, created_by, title, step_type, content,
visibility, is_active, created_at, updated_at
) VALUES (
'{step_id}', '{ACCOUNT_B_ID}', 'Phase3 Usage Step', 'action',
'{step_id}', '{ACCOUNT_B_ID}', '{user_b_id}',
'Phase3 Usage Step', 'action',
'{{}}'::jsonb, 'private', TRUE, NOW(), NOW()
)
""")
@@ -971,10 +1014,10 @@ async def test_script_builder_sessions_account_a_cannot_see_account_b(admin_conn
session_id = str(uuid.uuid4())
await admin_conn.execute(f"""
INSERT INTO script_builder_sessions (
id, user_id, account_id, language, created_at, updated_at
id, user_id, account_id, language, origin, created_at, updated_at
) VALUES (
'{session_id}', '{user_b_id}', '{ACCOUNT_B_ID}',
'powershell', NOW(), NOW()
'powershell', 'standalone', NOW(), NOW()
)
""")
try:
@@ -1001,22 +1044,24 @@ async def test_ai_session_steps_account_a_cannot_see_account_b(admin_conn, conn_
ai_session_id = str(uuid.uuid4())
await admin_conn.execute(f"""
INSERT INTO ai_sessions (
id, user_id, account_id, flow_type, status, confidence_tier,
id, user_id, account_id, session_type, intake_type,
intake_content, status, confidence_tier, confidence_score,
created_at, updated_at
) VALUES (
'{ai_session_id}', '{user_b_id}', '{ACCOUNT_B_ID}',
'troubleshooting', 'active', 'guided', NOW(), NOW()
'guided', 'free_text', '{{}}'::jsonb, 'active', 'guided', 0.0,
NOW(), NOW()
)
""")
step_id = str(uuid.uuid4())
await admin_conn.execute(f"""
INSERT INTO ai_session_steps (
id, session_id, account_id, step_type, content,
id, session_id, account_id, step_order, step_type, content,
created_at
) VALUES (
'{step_id}', '{ai_session_id}', '{ACCOUNT_B_ID}',
'question', 'Phase4 RLS test step', NOW()
1, 'question', '{{"text": "Phase4 RLS test step"}}'::jsonb, NOW()
)
""")
try:
@@ -1040,11 +1085,11 @@ async def test_notifications_account_a_cannot_see_account_b(admin_conn, conn_a):
notif_id = str(uuid.uuid4())
await admin_conn.execute(f"""
INSERT INTO notifications (
id, user_id, account_id, type, title, message,
id, user_id, account_id, event, title, body,
is_read, created_at
) VALUES (
'{notif_id}', '{user_b_id}', '{ACCOUNT_B_ID}',
'info', 'Phase4 RLS Test', 'RLS isolation test notification',
'test_event', 'Phase4 RLS Test', 'RLS isolation test notification',
FALSE, NOW()
)
""")
@@ -1055,4 +1100,3 @@ async def test_notifications_account_a_cannot_see_account_b(admin_conn, conn_a):
assert len(rows) == 0, "Account A should not see Account B notifications"
finally:
await admin_conn.execute(f"DELETE FROM notifications WHERE id = '{notif_id}'")

View File

@@ -0,0 +1,87 @@
# Phase 9 Review Issues
Date: 2026-04-24
Scope reviewed:
- `backend/app/api/endpoints/script_builder.py`
- `backend/app/api/endpoints/session_suggested_fixes.py`
- `backend/app/services/script_builder_service.py`
- `frontend/src/pages/AssistantChatPage.tsx`
- `frontend/src/components/pilot/ScriptBuilderTab.tsx`
- `frontend/src/components/pilot/EscalateInterceptDialog.tsx`
## 1. "Applied partially" from the escalation intercept cannot persist
Severity: High
The escalation intercept offers an "applied partially" choice, but the frontend
sends `applied_partial` without notes. The backend requires notes for that
outcome and returns 400. The frontend catches the error silently and still opens
the conclude modal, so the user can believe the partial outcome was recorded
when it was not.
Relevant files:
- `frontend/src/pages/AssistantChatPage.tsx:659`
- `frontend/src/components/pilot/EscalateInterceptDialog.tsx:56`
- `backend/app/api/endpoints/session_suggested_fixes.py:316`
Why this matters:
- `handleInterceptChoice()` maps the partial button directly to
`patchOutcome(..., "applied_partial")`.
- The call does not provide `notes`.
- `PATCH /suggested-fixes/{fix_id}/outcome` rejects `applied_partial` without
notes.
- The catch block is silent and the UI continues into the conclude flow.
- The recorded fix status therefore remains unchanged while the user sees a
flow that implies the partial outcome was accepted.
Recommended fix:
- Prompt for partial notes before calling `patchOutcome()` with
`applied_partial`.
- Do not proceed to the conclude modal if the partial outcome write fails.
- Consider hiding or disabling the partial option when it is not applicable, or
pass the current fix status into `EscalateInterceptDialog` so it can render
valid choices only.
- Add a regression test covering the partial escalation-intercept path.
## 2. Script Builder can attach stale script state to a newer active fix
Severity: Medium/High
`ScriptBuilderTab` keeps local builder state across active-fix changes within
the same pilot chat. If a new active fix supersedes the previous one while the
tab remains mounted, old messages, `latestScript`, or editor text can remain in
memory while submission uses the new `fix.id`.
Relevant files:
- `frontend/src/components/pilot/ScriptBuilderTab.tsx:55`
- `frontend/src/components/pilot/ScriptBuilderTab.tsx:78`
- `frontend/src/components/pilot/ScriptBuilderTab.tsx:150`
- `frontend/src/pages/AssistantChatPage.tsx:399`
- `frontend/src/pages/AssistantChatPage.tsx:1630`
Why this matters:
- `ScriptBuilderTab` initializes `editorBuffer`, messages, and latest script
from props and builder-session data.
- The create/resume effect depends on `pilotSessionId`, not `fix.id`.
- `AssistantChatPage` detects active-fix changes but only closes the script
panel.
- The rendered `ScriptBuilderTab` is not keyed by active fix id.
- Submitting a stale builder draft calls the script patch endpoint with the
current `fix.id`, so an older script can be attached to a newer fix.
Recommended fix:
- Reset Script Builder local state when `activeFix.id` changes.
- Key the rendered `ScriptBuilderTab` by `activeFix.id` if the intended UX is a
fresh builder surface per fix.
- If inline builder conversations are intended to resume per fix, extend the
backend idempotency model to include the fix id instead of only
`(user_id, ai_session_id)`.
- Add a frontend regression test for an active fix changing while the Script
Builder tab is mounted.
## Review Context
This review was based on code inspection of the latest committed Phase 9
implementation. No tracked working-tree diffs were present at review time.