# PR #193 (Phase 2A — L1 AI Tree Builder) Review Findings **Date:** 2026-06-09 **Reviewed:** `feat/l1-ai-tree-builder-phase-2a` vs `main` (42 files, +2,326/−154) **Process:** 7 independent finder angles, every candidate independently verified against actual code (quoted lines confirmed, not speculation). **Verdict: DO NOT MERGE as-is.** The headline feature (AI-guided walkthrough) is non-functional end-to-end, two tasks recorded as complete in `.ai/HANDOFF.md` were never actually committed, and one DB constraint is a deletion time bomb. --- ## ✅ RESOLUTION (2026-06-09, same day) **All 10 findings resolved.** Two architectural decisions taken (see `.ai/DECISIONS.md`): the **root fix** for Findings 8/9 (real `category` / `problem_text` / `pending_node` columns on `l1_walk_sessions`; the `{"node_type":"meta"}` walked_path convention deleted entirely — migration `61dda4f615c6`), and **restoring the ad-hoc walk** (Finding 5 option a — `adhoc=True` intake + "Walk it ad-hoc" out_of_scope button). - **Finding 1** — `ai_tree_builder._assign_id` stamps `uuid4().hex[:8]` on every node (generated, depth-cap, generation-failed); `current_node_id` now real. Contract test added (`test_ai_build_first_node_carries_id_and_advance_grows_walk`). - **Finding 2a/3** — `L1EscalationsSection` mounted on `EscalationQueuePage`; `ProposalDetail` `/pilot` link gated on `source_session_id`, L1-source block added. - **Finding 2b** — renders `step.question ?? step.text`, `timeAgo`, shows `problem_text`. - **Finding 4** — intake honors explicit `flow_id` (matcher bypassed); suggest card passes `near_miss.flow_id`; the three intake handlers collapsed into one `runIntake`. - **Finding 5** — ad-hoc walk restored (option a). - **Finding 6** — `l1_session_id` FK → `ondelete=CASCADE` (model + migration); cascade-delete test. - **Finding 7** — owner+admin at all three layers (GET dep, route guard, `usePermissions`); `require_account_owner_or_admin` delegates to `User.can_manage_account`; `User.account_role` TS type gains `'admin'`. - **Finding 8** — `pending_node` column; `/next-node` replays the served node on re-mount (no duplicate paid generation); reads context off the session (no ticket re-fetch). - **Finding 9** — meta entry gone → empty walk is falsy (no junk proposal) and the depth cap counts only real steps. - **Finding 10** — `escalate` passes `target_ids or None` (default fallback), filters `deleted_at IS NULL`, warns when empty; two tests. - **Cleanups** — dead `ticket_ref` deleted, `IntakeResponse` per-outcome validator + `ticket_kind` Literal restored, unused `acknowledged` dropped, escalations partial index added, restored the deleted `no_kb_content` audit assertion. **Verification:** full Phase 2A backend set **110 passed / 0 failed**; frontend `tsc -b` + `eslint` + `vite build` clean; migration upgrade→downgrade→upgrade roundtrip clean (columns + FK `confdeltype` + partial index confirmed); anti-parrot guardrail green. How to use this file: work the findings in order. Findings 1–7 are merge blockers; 8–10 can be fast-follows. Each finding lists the verified evidence (file:line) and a suggested fix. Several findings share two root causes — fix those at the root rather than patching symptoms: - **Root cause A:** AI-generated nodes have no `id`, but the advance protocol keys on `node_id`. (Finding 1; touches 8.) - **Root cause B:** The intake category is smuggled into `walked_path` as a fake `{"node_type":"meta"}` entry that every consumer must know to skip — and most don't. (Findings 2b, 9; the deeper fix is a real `category` column on `l1_walk_sessions`, plus `problem_text` while you're there — see Finding 8's note.) --- ## MERGE BLOCKERS ### 1. AI walkthrough can never advance past the first question (showstopper) **Evidence:** - `backend/app/services/ai_tree_builder.py:37-41` — SYSTEM_PROMPT's JSON output shapes (`{"node_type":"question","text":...}` etc.) define **no `id` field**; `validate_node` (lines 71-79) returns the node unchanged; nothing anywhere assigns an id. - `backend/app/services/l1_session_service.py:156` — `advance_ai_build` only appends to `walked_path` `if node_id is not None`; docstring (line 139) says "On the first call (node_id is None) nothing is appended." - `frontend/src/components/l1/L1WalkTreeVariant.tsx:52-54` — sends `node_id: node.id`, which is `undefined` at runtime (server never sends an id; `JSON.stringify` drops undefined keys) → backend always receives `node_id=None`. - `l1_session_service.py:174` — `session.current_node_id = next_node.get("id")` is always `None`. **User impact:** Tech answers the first question → answer is discarded, the same (or a re-rolled) first question regenerates forever. `walked_path` never grows past the meta entry, the depth cap never fires, and resolve captures an empty tree. **Why tests missed it:** `test_l1_api_ai_build` and friends mock `advance_ai_build` / hand-craft nodes **with** `id` keys — a shape the real model is never instructed to produce. **Fix:** Assign a server-side id to every generated node before returning it (e.g., `uuid4().hex[:8]` in `generate_next_node` after `validate_node`), persist it as `session.current_node_id`, and add a test that runs the real (unmocked-shape) prompt contract: generate → assert node has id → advance with that id → assert walked_path grew. Do NOT ask the LLM to invent ids. ### 2. Escalations from AI sessions go nowhere (two linked defects) **2a — Component never mounted.** - `grep -rn "L1EscalationsSection" frontend/src` → exactly one hit: its own definition (`frontend/src/components/l1/L1EscalationsSection.tsx:10`). It is imported nowhere. - `frontend/src/pages/EscalationQueuePage.tsx:3` imports only `EscalationQueue, EscalationMetricCard` from `@/components/flowpilot`. - `backend/app/services/notification_service.py:449` — `"l1.session.escalated": "/escalations"` deep-link → `frontend/src/router.tsx:299` renders `EscalationQueuePage` → engineer sees only FlowPilot escalations; the L1 handoff is invisible. `GET /l1/escalations` (`backend/app/api/endpoints/l1.py:330`) has no UI surface. - **Note:** `.ai/HANDOFF.md:38` claims "L1EscalationsSection on EscalationQueuePage" — that claim is false (documentation drift; see Finding 3). **2b — Component renders wrong fields once mounted.** - `backend/app/services/l1_session_service.py:162-168` — ai_build entries are `{"node_type", "id", "text", "answer", "l1_note"}` (key is `text`); legacy `record_step` (lines 199-204) uses `question`/`node_id`. - `L1EscalationsSection.tsx:61` renders `{step.question}` → blank for every ai_build entry. - `L1EscalationsSection.tsx:46` — `{s.walked_path.length} steps walked` counts the hidden meta entry → "N+1 steps walked". - `backend/app/api/endpoints/l1.py:41` — `_to_response` returns `walked_path` raw (meta entry included). **Fix:** Mount `L1EscalationsSection` on `EscalationQueuePage` (or fold L1 rows into the existing queue). Render `step.question ?? step.text`. Filter `node_type === 'meta'` entries — ideally server-side in `_to_response` (or eliminate the meta entry entirely per Root cause B). Also use the shared `timeAgo` util (`frontend/src/lib/timeAgo.ts`) instead of `new Date(...).toLocaleString()` at line 50, to match every sibling queue. ### 3. Two tasks recorded as complete were never committed - Task 16 ("ProposalDetail L1-source block") and Task 17 (mounting the escalations section) appear in `.ai/HANDOFF.md` / `SESSION_LOG.md` as done, but **no hunk for `ProposalDetail.tsx` exists in the diff**, and Finding 2a proves the mount never happened. - Concrete user impact today: `frontend/src/components/flowpilot/ProposalDetail.tsx:91-101` renders the "Source Session" card unconditionally; line 95 is `` to={`/pilot/${proposal.source_session_id}`} `` with no null guard. L1-sourced proposals (created with `source_session_id=None`, `l1_session_id=`) reach the review queue as `pending` → engineers get a broken **`/pilot/null`** link. **Fix:** Implement the missing work: in `ProposalDetail.tsx`, gate the `/pilot/` link on `source_session_id != null` and render an L1-source block (problem statement, category, link to the L1 session / escalations view) when `l1_session_id` is set. The backend already serves `l1_session_id` via `backend/app/schemas/flow_proposal.py`. Then correct `.ai/HANDOFF.md`. ### 4. "Use this flow" button silently does nothing - `frontend/src/pages/l1/L1Dashboard.tsx:77-86` — `useSuggestedFlow` re-POSTs `/l1/intake` with the same text, no `flow_id`. The in-code comment ("it matches again and returns a `matched` outcome") is factually wrong: the same text scores in the same 0.60–0.75 suggest band (`backend/app/services/match_or_build.py:66-72`, `MATCH_THRESHOLD = 0.75` line 21) → `suggest` again, no `session_id` → handler falls to `resetPrompts()` and the card vanishes. The suggested flow can never be started. - `backend/app/api/endpoints/l1.py` — the rewritten intake **never reads `payload.flow_id`** (old branch deleted, diff confirms); `IntakeRequest.flow_id` (`backend/app/schemas/l1.py:13`) is now dead. **Fix:** Make intake honor an explicit `flow_id` (bypass the matcher, call `start_flow_session` directly — restores the deleted behavior), and have the suggest card pass `near_miss.flow_id`. This also kills the wasteful re-run of the embedding + pgvector + keyword pipeline just to rediscover a flow_id the client already holds. ### 5. Out-of-scope problems lost the ad-hoc walk fallback - Old intake had `else: start_adhoc_session(...)`; the rewrite (`backend/app/api/endpoints/l1.py:88-102`) dispatches only matched/build/suggest/out_of_scope. `start_adhoc_session` (`l1_session_service.py:82`) now has **zero callers** — ad-hoc sessions are unreachable product-wide (the only remaining `session_kind="adhoc"` creation is `escalate_without_walk`, an audit record, not walkable). - `L1Dashboard.tsx:269-292` — out_of_scope prompt offers only "Escalate to engineering" / "Cancel". - Stale copy: `frontend/src/pages/account/L1CategoriesPage.tsx:57-58` still promises "Disabled categories fall back to an ad-hoc walk or escalation." A diff test comment also claims adhoc "is offered from the out_of_scope prompt" — it is not. **Fix (decide deliberately, don't drift):** Either (a) add a "Walk it ad-hoc" option to the out_of_scope prompt that hits a path creating an adhoc session (restore the capability), or (b) if dropping ad-hoc is intentional, fix the L1CategoriesPage copy and the test comment, and note the decision in `.ai/DECISIONS.md`. Option (a) preserves pre-existing user capability; recommend (a). ### 6. DB constraint makes L1-session deletion always fail (time bomb) - `backend/app/models/flow_proposal.py:60-63` — `CheckConstraint("(source_session_id IS NOT NULL) <> (l1_session_id IS NOT NULL)")` (XOR). - Line 87-92 — `l1_session_id` FK is `ondelete="SET NULL"`; `source_session_id` (line 83) is `ondelete="CASCADE"`. - Migration `backend/alembic/versions/1fd88a68b145_flow_proposal_l1_source_linkage.py:32-45` ships the same DDL. - Postgres CHECK constraints are non-deferrable and ARE evaluated on the UPDATE produced by `ON DELETE SET NULL`. So deleting any `l1_walk_sessions` row referenced by a proposal (whose `source_session_id` is NULL by construction) → both columns NULL → CHECK violation → the DELETE fails. The `SET NULL` action can literally never fire successfully. - Reachable today via `backend/app/api/endpoints/admin.py:1336` (`hard_delete_user` → `db.delete(account)`, DB-side cascades with unspecified ordering), and via any future GDPR/retention purge. **Fix:** Change `l1_session_id` to `ondelete="CASCADE"` (matching `source_session_id`'s behavior — proposal dies with its source), in both the model and a new migration. Keep the XOR check. Alternative (`num_nonnulls(...) >= 1` style relaxation) is weaker; prefer CASCADE. ### 7. Account admins locked out of L1 category settings (3-layer inconsistency) - Frontend route: `frontend/src/router.tsx:368-372` — `requiredRole="owner"`; `frontend/src/hooks/usePermissions.ts:21-28` (`getEffectiveRole`) has **no admin branch** → `account_role='admin'` maps to `viewer` → bounced to /trees. - Backend GET: `backend/app/api/endpoints/accounts.py:175-178` uses `require_l1_or_above` (`deps.py:235-242`: `l1_tech/engineer/owner` only) → admin gets **403 on read**. - Backend PATCH: `accounts.py:193-197` uses `require_account_owner_or_admin` (`deps.py:279-289`) → admin **can write**. - `admin` is a real role: `backend/app/models/user.py:25` CHECK constraint; `user.py:132` treats admin as account-manager. **Fix:** Pick one rule — owner+admin manage L1 categories — and apply it at all three layers: GET should use `require_account_owner_or_admin` too (or a combined dep), and the route guard needs admins to pass (either add an admin branch to `getEffectiveRole` — check blast radius on other `requiredRole` uses first — or a dedicated `canManageAccount`-style guard for this route). Also note `require_account_owner_or_admin` duplicates `User.can_manage_account` (`user.py:130-132`); delegate to it. --- ## FAST-FOLLOWS (real bugs, lower urgency) ### 8. Every walk-view mount fires a fresh paid LLM call (and may swap the question) - The served-but-unanswered node is never persisted: `l1_session_service.py:156-174` — `node_id is None` path goes straight to `generate_next_node`; only `current_node_id` (always None today, see Finding 1) is stored. No replay branch. - `L1WalkTreeVariant.tsx:26-44` — mount effect unconditionally POSTs `/next-node {}`; `frontend/src/main.tsx:4,34` — StrictMode is on, so dev double-mounts double-generate. **Impact:** Refresh/back-forward = duplicate Sonnet spend, multi-second stall, and possibly a *different* question than the one the tech was answering. **Fix:** Persist the pending node (e.g., a `pending_node` JSONB column on `l1_walk_sessions`, or reuse `current_node_id` + stored payload) and replay it when `node_id is None` and a pending node exists. Note: if adding columns, this is the moment to also add `category` and `problem_text` columns and delete the meta-entry convention (Root cause B) — `/next-node` currently re-fetches the internal ticket and re-scans walked_path on every step (`l1.py:302-310`) just to recover these immutable values. ### 9. Hidden meta entry: junk proposals + depth cap off-by-one - Junk proposal: `l1_session_service.py:270` — `if helpful and session.session_kind == "ai_build" and session.walked_path:` — a meta-only walked_path (seeded at intake, `l1.py:132-134`) is truthy. `normalize_walked_path` (`ai_tree_builder.py:131-137`) strips meta → empty → returns the `"Empty walk — needs authoring."` stub, which "passes the proposal approval guard" per its own docstring → a `status="pending"`, `validated_by_outcome=True` junk proposal reaches the review queue when a tech resolves immediately after intake. - Depth cap: `l1_session_service.py:172-173` passes the **raw** walked_path; `ai_tree_builder.py:82-83,96-98` — `len(walked_path) >= MAX_DEPTH` (12) counts the meta entry → cap fires after 11 real steps. `_strip_meta` is applied only downstream. **Fix (symptom-level):** strip meta before both the truthiness guard and `escalate_if_depth_exceeded`. **Fix (root):** real `category` column, delete the meta convention (see Finding 8 note). Root fix preferred per project principle (correct architecture over minimal diff). ### 10. Escalation notification silently dropped when recipient query is empty - `notification_service.py:180` changed `if target_user_ids:` → `if target_user_ids is not None:` (intentional, documented at lines 176-178). - `l1_session_service.py:371-381` — `escalate()` passes its computed `target_ids` unconditionally; if all owners/admins/engineers are inactive, `[]` → zero in-app notifications, no log, no fallback. (Existing callers are safe — they all use `[x] if x else None` patterns.) - Bonus divergence: escalate's hand-rolled query filters only `is_active`, while `handoff_manager.py:323-333` also filters `deleted_at IS NULL` — soft-deleted engineers would be notified. **Fix:** In `escalate()`: `target_user_ids=target_ids or None` (falls back to default recipients) plus a warning log when empty; add the `deleted_at` filter. Longer-term: give `_resolve_recipients` a roles parameter so callers stop hand-rolling recipient queries. --- ## Cleanups (optional, do alongside adjacent fixes) - `L1Dashboard.tsx:47-110` — `handleStart` / `useSuggestedFlow` / `buildNew` are three near-identical intake calls; collapse to one `runIntake(opts)` switching on `response.outcome` (this also prevents Finding-4-class drift). - `backend/app/schemas/l1.py` — `IntakeRequest.flow_id` is dead unless Finding 4 revives it; `NextNodeRequest.acknowledged` is sent by the frontend but never read by the backend (advance infers ack from `answer is None`) — wire it or drop it. `IntakeResponse` lost its per-outcome guarantees (all fields Optional, `ticket_kind` no longer `Literal["psa","internal"]`); add a `model_validator(mode="after")` requiring `session_id`/`ticket_id` when outcome is matched/build, and add a `session_id` null-guard before `navigate()` in `handleStart` (`L1Dashboard.tsx:58-59` — currently navigates to `/l1/walk/undefined` on a regression). - `backend/app/services/match_or_build.py:55` — unused positional `ticket_ref` param (only caller passes `""`); delete it. Also note `classify()` is a second bespoke LLM intake classifier alongside `flowpilot_engine._classify_intake`; `l1.py` passes `problem_domain=None` to matching, losing the domain signal the existing classifier provides — consider unifying in Phase 2B. - `backend/app/services/l1_category_service.py:17` + `models/account.py:75-81` — DEFAULT_L1_CATEGORIES duplicated as a hand-escaped JSON `server_default`; derive one from the other (migration copy stays frozen). - `frontend/src/pages/account/L1CategoriesPage.tsx` — local `prettify()` duplicates `humanizeFeatureKey` (`UpgradePrompt.tsx:62`); page skips shared `PageHeader`/`Spinner` used by sibling settings pages. - Missing index for `GET /l1/escalations` (`l1.py:338`): consider `CREATE INDEX ... ON l1_walk_sessions (account_id, last_step_at DESC) WHERE status = 'escalated'`. - `backend/tests/test_l1_session_service.py` — the `escalation_reason_category == "no_kb_content"` assertion was deleted from `test_escalate_without_walk_writes_audit_log`, weakening audit coverage; restore it. - Per-step `walked_path` rewrite is O(n²) cumulative bytes (`session.walked_path = [*session.walked_path, entry]`); bounded by MAX_DEPTH=12 so fine today — note for Phase 2B if depth grows. --- ## Suggested execution order 1. Finding 1 (node ids) — unblocks everything; add the contract test. 2. Finding 6 (FK/constraint) — new migration; do early so it ships in the same release. 3. Findings 2 + 3 together (mount section, fix field names/meta filter, ProposalDetail L1 block + null-guard the /pilot link). 4. Finding 4 (intake honors flow_id; suggest card passes it). 5. Finding 5 (decide adhoc: restore option (a) recommended, or fix copy + DECISIONS.md). 6. Finding 7 (align all three permission layers). 7. Findings 8 + 9 via the root fix: add `category`/`problem_text` (+ optionally `pending_node`) columns, delete the meta-entry convention, strip-meta fixes become moot. 8. Finding 10 (one-line guard + deleted_at filter). 9. Cleanups opportunistically alongside the file they touch. After fixes: run the 11 Phase 2A backend test files together (authoritative gate per HANDOFF — do NOT trust a full local serial `pytest tests/`; use `--override-ini="addopts="`), frontend `tsc -b` + lint + build, and migration downgrade/upgrade roundtrip. Update `.ai/HANDOFF.md` to correct the Task 16/17 record.