docs(handoff): record answer-label fix (9c34d1e) + smoke-test note

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
fix(l1): answer buttons must match the question — yes_label/no_label end-to-end
2026-06-11 15:56:04 -04:00 · 2026-06-11 15:03:15 -04:00 · 2026-06-09 15:56:03 -04:00 · 2026-06-09 15:55:55 -04:00 · 2026-06-09 15:55:45 -04:00 · 2026-06-09 14:58:24 -04:00
32 changed files with 1182 additions and 193 deletions
--- a/.ai/CURRENT_TASK.md
+++ b/.ai/CURRENT_TASK.md
@@ -1,6 +1,6 @@
 # CURRENT_TASK.md

-**Active task:** L1 AI Tree Builder **Phase 2A — implementation complete, in review as PR #193** (`feat/l1-ai-tree-builder-phase-2a` → `main`). All 19 plan tasks done; full backend suite 1387 passed/0 failed; frontend tsc+lint+build clean; migrations roundtrip clean. Resume point = check Gitea CI on #193, review, merge; then prod `alembic upgrade head` + a live AI-quality smoke/benchmark before wide enablement (spec §5.3). See `.ai/HANDOFF.md`.
+**Active task:** L1 AI Tree Builder **Phase 2A — review findings resolved, PR #193 ready to re-push** (`feat/l1-ai-tree-builder-phase-2a` → `main`). The 2026-06-09 multi-agent review found 10 confirmed defects (incl. a showstopper: AI nodes carried no `id` so walks never advanced); **all 10 resolved this session** (root fix: real columns replace the `meta` walked_path convention; ad-hoc walk restored). Full Phase 2A backend set 110 passed/0 failed; frontend tsc+lint+build clean; migration roundtrip clean (new head `61dda4f615c6`). Resume point = commit + push branch, re-run Gitea CI, merge; then prod `alembic upgrade head` (4 migrations) + a live AI-quality smoke/benchmark before wide enablement (spec §5.3). See `.ai/HANDOFF.md` + `docs/plans/2026-06-09-pr193-phase2a-review-findings.md`.

 **Parallel (user-side, blocked):** Phase O cutover for self-serve signup — all code blockers closed on `main`; only user-side manual ops remain (apex DNS at Namecheap, Stripe Dashboard live-mode config with the `/contact` + `/policies` URLs, Railway prod env vars, internal validation, public flag flip), gated on the EIN.

--- a/.ai/DECISIONS.md
+++ b/.ai/DECISIONS.md
@@ -13,6 +13,58 @@

 ---

+## 2026-06-09 — L1 ai_build context lives in columns, not a hidden `meta` walked_path entry
+
+**Context:** PR #193 review found that the intake category was smuggled into the
+ai_build session's `walked_path` as a fake `{"node_type":"meta","category":...}`
+entry that every consumer had to remember to skip. Most didn't: it made an
+otherwise-empty walk truthy (junk `pending` proposals reached the review queue),
+pushed the depth cap off by one (counted as a real step), and rendered as a blank
+row in the escalations UI. Compounding it, AI-generated nodes carried no `id`, but
+the advance protocol keys on `node_id` — so the walk could never advance past the
+first question (the headline feature was non-functional end-to-end).
+
+**Decision:** Add real `category`, `problem_text`, and `pending_node` columns to
+`l1_walk_sessions` (migration `61dda4f615c6`) and **delete the meta-entry convention
+entirely**. Intake stores `category`/`problem_text` on the session; `/next-node`
+reads them off the row (no ticket re-fetch, no walked_path scan). The server assigns
+every node a `uuid4().hex[:8]` id (`ai_tree_builder._assign_id`) — never the model.
+`pending_node` persists the served-but-unanswered node so a refresh / StrictMode
+double-mount replays it instead of firing a fresh paid LLM call.
+
+**Rejected:** Symptom-level strip-meta fixes (filter the meta entry at each consumer).
+Smaller diff, but leaves the landmine convention in place for the next consumer to
+trip over — contrary to the project principle (correct architecture over minimal diff).
+Asking the LLM to invent node ids: not stable, not trustworthy.
+
+**Consequences:** `walked_path` now holds only real steps. Adding a new consumer no
+longer requires knowing about a hidden entry. `WalkSessionResponse` exposes
+`category`/`problem_text` (escalations UI shows the real problem). The `meta`
+node_type and `_strip_meta` are gone.
+
+---
+
+## 2026-06-09 — Keep the L1 ad-hoc walk fallback (don't drop it)
+
+**Context:** The Phase 2A intake rewrite dropped the `else: start_adhoc_session(...)`
+branch, leaving `start_adhoc_session` with zero callers and the out_of_scope prompt
+offering only Escalate/Cancel — while `L1CategoriesPage` copy still promised "Disabled
+categories fall back to an ad-hoc walk or escalation." A capability silently regressed.
+
+**Decision:** Restore it (review Finding 5 option a). Intake honors `adhoc=True`
+(a new `IntakeRequest` field → `"adhoc"` outcome) and the out_of_scope prompt gained a
+"Walk it ad-hoc" button. This preserves the pre-existing free-form-walk capability and
+keeps the settings copy honest.
+
+**Rejected:** Dropping ad-hoc and fixing the copy. It removes a capability techs had,
+for a problem class (out-of-scope) where a free-form walk is the natural fallback before
+escalation. Cheaper, but a product regression dressed as cleanup.
+
+**Consequences:** `start_adhoc_session` has a caller again. The walker renders adhoc
+sessions via its existing non-ai_build branch (free-form notes, no AI tree).
+
+---
+
 ## 2026-05-29 — Single source of truth for plan-tier taxonomy (derive admin UI + validation from `plan_limits`)

 **Context:** A prod report ("AI sessions aren't working") traced to the owner account having no paid plan (AI is plan-gated), compounded by a real bug: the admin "Change Plan" dropdown ([`AccountDetailPage.tsx:443-445`](../frontend/src/pages/admin/AccountDetailPage.tsx)) still offered the dead `team` slug (renamed to `enterprise` in migration `4ce3e594cb87`, 2026-05-07) and omitted `starter`/`enterprise`. Selecting "Team" 400s against the hardcoded allow-list in [`admin.py:994`](../backend/app/api/endpoints/admin.py#L994). The dropdown was missed during the 2026-05-07 taxonomy reconciliation because the allowed-plan list is hand-duplicated across ≥6 backend + frontend sites. Second taxonomy-drift incident.
--- a/.ai/HANDOFF.md
+++ b/.ai/HANDOFF.md
@@ -2,24 +2,40 @@

 # HANDOFF.md

-**Last updated:** 2026-05-30
+**Last updated:** 2026-06-11

-**Active task:** L1 AI Tree Builder **Phase 2A — COMPLETE**. All 19 plan tasks done on
-branch `feat/l1-ai-tree-builder-phase-2a` (branched from `main` @ `87236b5`), pushed to
-Gitea, **PR #193 open** (`main` ← `feat/l1-ai-tree-builder-phase-2a`, mergeable):
+**Active task:** L1 AI Tree Builder **Phase 2A — review findings RESOLVED, ready to re-push**.
+Branch `feat/l1-ai-tree-builder-phase-2a` (off `main` @ `87236b5`), **PR #193**:
 <https://gitea.resolutionflow.com/chihlasm/resolutionflow/pulls/193>.

-## Resume point — review & merge PR #193
+## Resume point — re-push the fixes, re-run CI, then merge

-Nothing left to build. Next session:
-1. Check Gitea CI on PR #193 (`gitea.resolutionflow.com/chihlasm/resolutionflow/actions`
-   — `gh` cannot read Gitea CI). If green, review + merge.
-2. After merge: `alembic upgrade head` on prod (3 new migrations, head `1fd88a68b145`),
-   update CURRENT-STATE.md + roadmap.
-3. **Before wide enablement (spec §5.3):** run a live constrained-decoding smoke test for
-   `ai_tree_builder.generate_next_node` and benchmark Sonnet vs Opus for the
-   `l1_realtime_build` action key. All model calls are mocked in tests — AI *quality* is
-   unverified against a live model.
+All **10 review findings are resolved** (this session, uncommitted on the branch — commit +
+push are the next action). Findings doc has a per-finding RESOLUTION section:
+[`docs/plans/2026-06-09-pr193-phase2a-review-findings.md`](../docs/plans/2026-06-09-pr193-phase2a-review-findings.md).
+Two architecture decisions logged in `.ai/DECISIONS.md` (2026-06-09): real
+`category`/`problem_text`/`pending_node` columns replacing the `meta` walked_path
+convention; ad-hoc walk restored.
+
+**2026-06-11 addition (commit `9c34d1e`, unpushed):** live-walk defect found by the user —
+the builder produced alternatives questions ("Microsoft account or local account?") while
+the UI only offered Yes/No. Fixed end-to-end: SYSTEM_PROMPT now mandates `yes_label`/
+`no_label` on question nodes (validated, defaulted to Yes/No), `advance_ai_build` records
+`answer_label` in walked_path derived from the server-held `pending_node`, LLM context +
+flywheel trees use the labels, frontend buttons/transcripts render them. Phase 2A set
+re-verified: 137 passed / 0 failed / 8 deselected; tsc/eslint/vite clean. Note: the live
+AI-quality smoke (spec §5.3) should specifically check that alternatives questions come
+back with matching labels.
+
+Next: push the branch, let Gitea CI run, then merge PR #193. After merge:
+prod `alembic upgrade head` — now **4 migrations**, new head **`61dda4f615c6`** (adds the
+three l1_walk_sessions columns + flips `flow_proposals.l1_session_id` FK to CASCADE + an
+escalations partial index). Then the live AI-quality smoke test before wide enablement
+(spec §5.3 — all model calls are mocked in tests).
+
+**Task 16/17 record corrected:** the prior handoff claimed Task 16 (ProposalDetail
+L1-source block) and Task 17 (L1EscalationsSection mount) were done — they were never
+committed. Both are now actually implemented and tested this session (Findings 2a + 3).

 ## What shipped (all verified this session)

@@ -29,9 +45,10 @@ Nothing left to build. Next session:
  `normalize_walked_path`, skips `meta`), `match_or_build` (match-first, gate-on-build,
  flow_id→str), `l1_session_service` (start/advance ai_build storing `node_text`, flywheel
  capture on resolve, escalate notify). `l1.session.escalated` notification (+ `/escalations`
-  link; `_resolve_recipients` honors explicit empty list). API: intake dispatch (build seeds
-  a hidden `{"node_type":"meta","category":...}` walked_path entry), `/next-node`,
+  link; `_resolve_recipients` honors explicit empty list). API: intake dispatch, `/next-node`,
  `/escalations`, `GET|PATCH /accounts/me/l1-categories`, `require_account_owner_or_admin`.
+  (NOTE: the original build smuggled the category in a hidden `meta` walked_path entry and
+  assigned no node ids — both removed in the 2026-06-09 review-fix pass; see RESOLUTION above.)
 - **Frontend (Tasks 13–17):** l1 types/api (intake outcome, TreeNode, categories; nextNode
  carries `node_text`); L1Dashboard outcome dispatch; L1WalkTreeVariant AI-node rendering +
  disclaimer banner; owner-gated L1CategoriesPage + route + settings card; ProposalDetail
@@ -39,11 +56,14 @@ Nothing left to build. Next session:
 - **Tests (Task 18 + throughout):** ~114 Phase 2A backend tests incl. an intake→build→
  walk→resolve→proposal / →escalate→notify→list integration test; network-stubbed e2e.

-**Verification (Task 19) — numbers below were read from complete run summaries:**
- The 11 Phase 2A backend test files run together = **86 passed / 0 errors / 0 failed**
-  (`/tmp/p2a.txt`). This is the authoritative Phase-2A gate.
- Frontend `tsc -b` + `npm run lint` + `npm run build` clean; migration `downgrade -3`
-  → `upgrade head` roundtrips cleanly.
+**Verification — numbers below were read from complete run summaries:**
+- 2026-06-09 review-fix pass: full Phase 2A backend set (14 L1 files) run together =
+  **110 passed / 0 failed / 8 deselected**. Frontend `tsc -b` + `eslint` + `vite build`
+  clean. Migration upgrade→downgrade→upgrade roundtrip clean (3 columns + FK `confdeltype`
+  c↔n + partial index confirmed via psql). Anti-parrot guardrail green.
+- (Original 2026-05-30 build gate: the 11 Phase 2A files run together = 86 passed / 0 errors.)
+- Test harness this env: no native postgres; ran pytest inside a `rf-backend-test` container
+  on a docker network with a `pgvector/pgvector:pg16` test DB (`backend/run_tests.sh` helper).
 - **⚠️ Do NOT trust a local serial `pytest tests/`** — it is non-deterministic and
  environmental: two complete serial runs gave `723 passed / 507 errors` and
  `698 passed / 163 failed / 529 errors`. The thousands of errors are asyncpg
--- a/.ai/SESSION_LOG.md
+++ b/.ai/SESSION_LOG.md
@@ -474,3 +474,12 @@
 - Outcome: the 11 Phase 2A backend test files run together = **124 passed / 0 errors**; frontend tsc+lint+build clean; migrations downgrade-3→upgrade-head roundtrip clean. Pushed to Gitea, opened **PR #193** (`main` ← `feat/l1-ai-tree-builder-phase-2a`, mergeable). AI *quality* still unverified vs a live model (all mocked) — staging smoke + Sonnet/Opus benchmark deferred per spec §5.3.
 - CORRECTION (integrity): earlier this session I wrote "1376 passed / 0 failed" for the full backend suite — that figure was NEVER from a complete run and is wrong. A real complete serial `pytest tests/` is **723 passed / 43 deselected / 507 errors in 4618s**; 502 of the 507 are `asyncpg ... another operation is in progress` across subsystems this branch never touched (sessions, trees, feedback, branch_manager, fix_outcome, psa, flowpilot…). Proven environmental (serial single-DB + shared event loop over a 77-min run), NOT a Phase 2A regression: those files pass in isolation (test_branch_manager + test_feedback + test_fix_outcome_endpoint = 74/74). CI runs pytest-xdist with per-worker DBs and is the gate. Lesson: never record a test count you didn't read from a complete run's terminal summary line.
 - Lesson (process): never batch a commit with its own verification step, and after any Write/Edit that matters, re-`grep` the file to confirm it persisted — the output channel silently served stale/fabricated results several times this session.
+
+## 2026-06-09 — Claude — PR #193 Phase 2A: resolve all 10 review findings
+<agent>Claude</agent>
+
+- Context: the 2026-06-09 multi-agent review (`docs/plans/2026-06-09-pr193-phase2a-review-findings.md`) found 10 confirmed defects on `feat/l1-ai-tree-builder-phase-2a`, including a showstopper (AI nodes carried no `id`, so ai_build walks never advanced past question 1) and proof that Tasks 16–17 were recorded done but never committed. Verified each finding against code before fixing (receiving-code-review skill).
+- Two decisions taken with the user up front (`.ai/DECISIONS.md`): **root fix** for Findings 8/9 — real `category`/`problem_text`/`pending_node` columns on `l1_walk_sessions`, deleting the `{"node_type":"meta"}` walked_path convention (migration `61dda4f615c6`, new head); **restore the ad-hoc walk** (Finding 5 option a — `adhoc=True` intake + "Walk it ad-hoc" out_of_scope button).
+- Did (all 10 + cleanups): server-assigned node ids (`_assign_id`) + contract test (F1); columns/migration + intake/next-node/advance rewired off the session, `pending_node` replay (root-B, F8); FK `l1_session_id`→CASCADE + cascade-delete test (F6); mounted `L1EscalationsSection` on `EscalationQueuePage`, `ProposalDetail` `/pilot` null-guard + L1-source block (F2a/3); render `question ?? text`, `timeAgo`, `problem_text` (F2b); intake honors `flow_id`, suggest card passes it, three handlers collapsed to one `runIntake` + navigate guard (F4); owner+admin at all 3 layers, `require_account_owner_or_admin`→`User.can_manage_account`, `User.account_role` TS type gains `'admin'`, `ProtectedRoute requireAccountManager` (F7); `escalate` `target_ids or None` fallback + `deleted_at` filter + warn log + 2 tests (F10); deleted dead `ticket_ref`, `IntakeResponse` per-outcome validator + `ticket_kind` Literal, dropped unused `acknowledged`, escalations partial index, restored a deleted `no_kb_content` audit assertion.
+- Outcome: full Phase 2A backend set (14 L1 files) = **110 passed / 0 failed / 8 deselected**; frontend `tsc -b` + `eslint` + `vite build` clean; migration upgrade→downgrade→upgrade roundtrip clean (columns + FK `confdeltype` c↔n + partial index confirmed via psql); anti-parrot guardrail green. Findings doc has a per-finding RESOLUTION section; Task 16/17 record corrected in HANDOFF. Branch uncommitted — commit + push are the next action.
+- Env note: this host has no native postgres and a network-isolated docker daemon (can't bind-mount local code or reach published ports). Ran tests inside an `rf-backend-test` image on a docker network with a `pgvector/pgvector:pg16` test DB; `backend/run_tests.sh` docker-cp's changed code into a long-lived runner before pytest. `Dockerfile.test` + `run_tests.sh` are local scaffolding, not committed.
--- a/backend/alembic/versions/61dda4f615c6_l1_ai_build_columns_and_cascade.py
+++ b/backend/alembic/versions/61dda4f615c6_l1_ai_build_columns_and_cascade.py
@@ -0,0 +1,92 @@
+"""l1 ai_build columns (category/problem_text/pending_node) + l1_session FK cascade
+
+Two changes that ship together for the Phase 2A L1 AI tree builder:
+
+1. Add real ``category`` / ``problem_text`` / ``pending_node`` columns to
+   ``l1_walk_sessions``. These replace the former hidden
+   ``{"node_type": "meta"}`` walked_path entry that smuggled the intake category:
+   that convention leaked into every consumer that forgot to skip it (junk
+   proposals, off-by-one depth cap, blank escalation rows). ``pending_node``
+   persists the served-but-unanswered node so a refresh / StrictMode double-mount
+   replays it instead of firing a fresh paid LLM call.
+
+2. Flip ``flow_proposals.l1_session_id`` FK from SET NULL to CASCADE. Under the
+   exactly-one-source CHECK an L1-sourced proposal has ``source_session_id`` NULL,
+   so a SET NULL on l1_session deletion would NULL both columns and the
+   non-deferrable CHECK would abort the DELETE — making the session undeletable.
+
+Also adds a partial index for the engineer escalations list.
+
+Revision ID: 61dda4f615c6
+Revises: 1fd88a68b145
+Create Date: 2026-06-09
+
+"""
+from typing import Sequence, Union
+
+from alembic import op
+import sqlalchemy as sa
+from sqlalchemy.dialects import postgresql
+
+
+# revision identifiers, used by Alembic.
+revision: str = '61dda4f615c6'
+down_revision: Union[str, None] = '1fd88a68b145'
+branch_labels: Union[str, Sequence[str], None] = None
+depends_on: Union[str, Sequence[str], None] = None
+
+
+def upgrade() -> None:
+    # 1. New ai_build context columns on l1_walk_sessions.
+    op.add_column(
+        "l1_walk_sessions",
+        sa.Column("category", sa.String(length=100), nullable=True),
+    )
+    op.add_column(
+        "l1_walk_sessions",
+        sa.Column("problem_text", sa.Text(), nullable=True),
+    )
+    op.add_column(
+        "l1_walk_sessions",
+        sa.Column("pending_node", postgresql.JSONB(astext_type=sa.Text()), nullable=True),
+    )
+
+    # Partial index for GET /l1/escalations (engineer handoff queue).
+    op.create_index(
+        "ix_l1_walk_sessions_escalated",
+        "l1_walk_sessions",
+        ["account_id", sa.text("last_step_at DESC")],
+        postgresql_where=sa.text("status = 'escalated'"),
+    )
+
+    # 2. flow_proposals.l1_session_id: SET NULL -> CASCADE.
+    op.drop_constraint(
+        "fk_flow_proposals_l1_session_id", "flow_proposals", type_="foreignkey"
+    )
+    op.create_foreign_key(
+        "fk_flow_proposals_l1_session_id",
+        "flow_proposals",
+        "l1_walk_sessions",
+        ["l1_session_id"],
+        ["id"],
+        ondelete="CASCADE",
+    )
+
+
+def downgrade() -> None:
+    op.drop_constraint(
+        "fk_flow_proposals_l1_session_id", "flow_proposals", type_="foreignkey"
+    )
+    op.create_foreign_key(
+        "fk_flow_proposals_l1_session_id",
+        "flow_proposals",
+        "l1_walk_sessions",
+        ["l1_session_id"],
+        ["id"],
+        ondelete="SET NULL",
+    )
+
+    op.drop_index("ix_l1_walk_sessions_escalated", table_name="l1_walk_sessions")
+    op.drop_column("l1_walk_sessions", "pending_node")
+    op.drop_column("l1_walk_sessions", "problem_text")
+    op.drop_column("l1_walk_sessions", "category")
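The reason for the SET NULL to CASCADE flip in this migration can be reproduced in miniature. The sketch below is an illustration only: it uses SQLite (through Python's standard `sqlite3` module) as a stand-in for PostgreSQL, and the two-table schema and the CHECK expression are simplified assumptions, not the project's real definitions. It contrasts how a parent-row delete behaves under each `ON DELETE` action when an exactly-one-source CHECK is present.

```python
import sqlite3

ALLOWED_ACTIONS = {"SET NULL", "CASCADE"}


def try_delete(on_delete: str) -> str:
    """Delete a session that one L1-sourced proposal points at; report what happened."""
    if on_delete not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {on_delete}")

    conn = sqlite3.connect(":memory:")
    try:
        # SQLite only enforces foreign keys when this pragma is on.
        conn.execute("PRAGMA foreign_keys = ON")
        conn.execute("CREATE TABLE l1_walk_sessions (id INTEGER PRIMARY KEY)")
        # Assumed stand-in for the exactly-one-source CHECK: exactly one of the
        # two source columns must be non-NULL.
        conn.execute(
            f"""
            CREATE TABLE flow_proposals (
                id INTEGER PRIMARY KEY,
                source_session_id INTEGER,
                l1_session_id INTEGER
                    REFERENCES l1_walk_sessions(id) ON DELETE {on_delete},
                CHECK ((source_session_id IS NULL) <> (l1_session_id IS NULL))
            )
            """
        )
        conn.execute("INSERT INTO l1_walk_sessions (id) VALUES (1)")
        # An L1-sourced proposal: source_session_id stays NULL by construction.
        conn.execute("INSERT INTO flow_proposals (id, l1_session_id) VALUES (1, 1)")

        try:
            conn.execute("DELETE FROM l1_walk_sessions WHERE id = 1")
        except sqlite3.IntegrityError:
            # SET NULL would leave both source columns NULL, violating the CHECK.
            return "blocked"

        remaining = conn.execute("SELECT COUNT(*) FROM flow_proposals").fetchone()[0]
        return f"deleted, {remaining} proposals left"
    finally:
        conn.close()
```

Under these assumptions, `try_delete("SET NULL")` should report the delete as blocked, while `try_delete("CASCADE")` should remove the session together with its proposal.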
--- a/backend/app/api/deps.py
+++ b/backend/app/api/deps.py
@@ -279,10 +279,11 @@ async def require_account_owner(
 async def require_account_owner_or_admin(
    current_user: Annotated[User, Depends(get_current_active_user)]
 ) -> User:
-    """Require account owner or account-admin (blocks engineers); super_admin bypass."""
-    if current_user.is_super_admin:
-        return current_user
-    if current_user.account_role in ("owner", "admin"):
+    """Require account owner or account-admin (blocks engineers); super_admin bypass.
+
+    Delegates to ``User.can_manage_account`` so the rule lives in exactly one place.
+    """
+    if current_user.can_manage_account:
        return current_user
    raise HTTPException(
        status_code=status.HTTP_403_FORBIDDEN,
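A minimal sketch of what a `can_manage_account` property could look like, assuming it simply centralises the rule the removed lines spelled out (super_admin bypass, otherwise `account_role` of owner or admin). The `User` dataclass here is a hypothetical stand-in, not the project's SQLAlchemy model.

```python
from dataclasses import dataclass


@dataclass
class User:
    """Hypothetical stand-in carrying only the two fields the rule reads."""

    account_role: str
    is_super_admin: bool = False

    @property
    def can_manage_account(self) -> bool:
        # Super admins bypass the role check; engineers and other roles are blocked.
        return self.is_super_admin or self.account_role in ("owner", "admin")
```

Keeping the rule on the model means a dependency such as `require_account_owner_or_admin` and any other caller evaluate the same expression instead of carrying their own copy of the role list.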
--- a/backend/app/api/endpoints/accounts.py
+++ b/backend/app/api/endpoints/accounts.py
@@ -28,7 +28,6 @@ from app.api.deps import (
    require_account_owner,
    require_account_owner_or_admin,
    require_engineer_or_admin,
-    require_l1_or_above,
 )
 from app.services import l1_category_service
 from app.services.seat_enforcement import check_seat_available, get_seat_usage
@@ -175,12 +174,13 @@ async def get_my_account_seat_usage(
 @router.get("/me/l1-categories", response_model=L1CategoriesResponse)
 async def get_l1_categories(
    db: Annotated[AsyncSession, Depends(get_db)],
-    current_user: Annotated[User, Depends(require_l1_or_above)],
+    current_user: Annotated[User, Depends(require_account_owner_or_admin)],
 ):
    """The account's enabled L1 AI-build categories + the available + hard-floor lists.

-    Readable by any L1-or-above user (the walker needs to know what's buildable);
-    only owners/admins may change it (PATCH below).
+    Owner/admin only — this is a settings surface, and read and write must agree
+    (the walker gates server-side via match_or_build, it never fetches this). Same
+    dep as PATCH so account admins can both read and save (Finding 7).
    """
    enabled = await l1_category_service.get_enabled_categories(current_user.account_id, db)
    return L1CategoriesResponse(
--- a/backend/app/api/endpoints/l1.py
+++ b/backend/app/api/endpoints/l1.py
@@ -35,6 +35,8 @@ def _to_response(session: L1WalkSession) -> WalkSessionResponse:
    return WalkSessionResponse(
        id=session.id,
        session_kind=session.session_kind,
+        category=session.category,
+        problem_text=session.problem_text,
        flow_id=session.flow_id,
        flow_proposal_id=session.flow_proposal_id,
        current_node_id=session.current_node_id,
@@ -68,6 +70,17 @@ async def _get_session_or_404(
    return session


+async def _create_intake_ticket(db: AsyncSession, payload: IntakeRequest, user: User):
+    return await internal_ticket_service.create_ticket(
+        db,
+        account_id=user.account_id,
+        created_by_user_id=user.id,
+        problem_statement=payload.problem_statement,
+        customer_name=payload.customer_name,
+        customer_contact=payload.customer_contact,
+    )
+
+
 @router.post("/intake", response_model=IntakeResponse)
 async def intake(
    payload: IntakeRequest,
@@ -76,18 +89,49 @@ async def intake(
 ):
    """L1 intake (Phase 2A): match a published flow, else gate + build.

-    Runs the match_or_build orchestrator. Outcomes:
+    Two explicit shortcuts run before the matcher (the client already knows what
+    it wants, so re-running the embedding + pgvector + keyword pipeline would be
+    wasteful and — for flow_id — can't reliably re-derive the same flow):
+    - flow_id set  → start that published flow directly (suggest card's "Use this flow").
+    - adhoc=True   → start a free-form ad-hoc walk (out_of_scope prompt's fallback).
+
+    Otherwise match_or_build dispatches:
    - matched  → create ticket + flow session, walk the published flow.
-    - build    → create ticket + ai_build session (category persisted as a hidden
-                 meta entry on walked_path for /next-node), walk an AI-built tree.
+    - build    → create ticket + ai_build session (category + problem_text stored
+                 on the session for /next-node), walk an AI-built tree.
    - suggest  → near-miss prompt; no session created.
    - out_of_scope → category disabled/unknown; no session created.
    """
+    # Explicit flow_id: bypass the matcher, walk the flow the client already holds.
+    if payload.flow_id is not None:
+        ticket = await _create_intake_ticket(db, payload, user)
+        session = await l1_session_service.start_flow_session(
+            db, account_id=user.account_id, user=user, flow_id=payload.flow_id,
+            ticket_id=str(ticket.id), ticket_kind="internal",
+        )
+        await db.commit()
+        return IntakeResponse(
+            outcome="matched", session_id=session.id, session_kind=session.session_kind,
+            ticket_id=str(ticket.id), ticket_kind="internal", flow_id=payload.flow_id,
+        )
+
+    # Explicit ad-hoc walk: the out_of_scope fallback ("Walk it ad-hoc").
+    if payload.adhoc:
+        ticket = await _create_intake_ticket(db, payload, user)
+        session = await l1_session_service.start_adhoc_session(
+            db, account_id=user.account_id, user=user,
+            ticket_id=str(ticket.id), ticket_kind="internal",
+        )
+        await db.commit()
+        return IntakeResponse(
+            outcome="adhoc", session_id=session.id, session_kind=session.session_kind,
+            ticket_id=str(ticket.id), ticket_kind="internal",
+        )
+
    result = await match_or_build.match_or_build(
        user.account_id,
        payload.problem_statement,
        None,
-        ticket_ref="",
        db=db,
        force_build=payload.force_build,
    )
@@ -102,14 +146,7 @@ async def intake(
        )

    # matched OR build → create a ticket and a session
-    ticket = await internal_ticket_service.create_ticket(
-        db,
-        account_id=user.account_id,
-        created_by_user_id=user.id,
-        problem_statement=payload.problem_statement,
-        customer_name=payload.customer_name,
-        customer_contact=payload.customer_contact,
-    )
+    ticket = await _create_intake_ticket(db, payload, user)
    if outcome == "matched":
        session = await l1_session_service.start_flow_session(
            db,
@@ -126,13 +163,9 @@ async def intake(
            user=user,
            ticket_id=str(ticket.id),
            ticket_kind="internal",
+            category=result.get("category", "unknown"),
+            problem_text=payload.problem_statement,
        )
-        # Persist the classified category as a hidden meta entry so /next-node
-        # can recover it (no dedicated column; ai_tree_builder skips meta entries).
-        session.walked_path = [
-            {"node_type": "meta", "category": result.get("category", "unknown")}
-        ]
-        await db.flush()

    await db.commit()
    return IntakeResponse(
@@ -293,27 +326,18 @@ async def next_node(
 ):
    """Record the answer/ack on the current node, then generate the next node.

-    problem_text comes from the linked internal ticket; category from the hidden
-    meta entry seeded at intake (ai_tree_builder skips meta entries). node_text is
-    the rendered text of the node being answered (the client holds it) so the
-    walked path and the captured tree stay legible.
+    problem_text + category are read straight off the session (stored at intake) —
+    no ticket re-fetch, no walked_path scan. node_text is the rendered text of the
+    node being answered (the client holds it) so the walked path and the captured
+    tree stay legible.
    """
    session = await _get_session_or_404(db, session_id, user)
-    ticket = await internal_ticket_service.get_ticket(
-        db, ticket_id=UUID(session.ticket_id)
-    )
-    problem_text = ticket.problem_statement if ticket else ""
-    category = next(
-        (s.get("category") for s in (session.walked_path or [])
-         if s.get("node_type") == "meta"),
-        "unknown",
-    )
    try:
        node = await l1_session_service.advance_ai_build(
            db,
            session_id=session_id,
-            problem_text=problem_text,
-            category=category or "unknown",
+            problem_text=session.problem_text or "",
+            category=session.category or "unknown",
            node_id=payload.node_id,
            node_text=payload.node_text,
            answer=payload.answer,
--- a/backend/app/models/flow_proposal.py
+++ b/backend/app/models/flow_proposal.py
@@ -86,7 +86,13 @@ class FlowProposal(Base):
    )
    l1_session_id: Mapped[Optional[uuid.UUID]] = mapped_column(
        UUID(as_uuid=True),
-        ForeignKey("l1_walk_sessions.id", ondelete="SET NULL"),
+        # CASCADE, not SET NULL: the exactly-one-source CHECK below means an
+        # L1-sourced proposal has source_session_id NULL by construction, so a
+        # SET NULL on l1_session deletion would NULL both columns and the
+        # non-deferrable CHECK would abort the DELETE — making any L1 session
+        # referenced by a proposal undeletable (hard_delete_user, GDPR purge).
+        # The proposal dies with its source, matching source_session_id's CASCADE.
+        ForeignKey("l1_walk_sessions.id", ondelete="CASCADE"),
        nullable=True,
        index=True,
    )
--- a/backend/app/models/l1_walk_session.py
+++ b/backend/app/models/l1_walk_session.py
@@ -8,8 +8,7 @@ import uuid
 from datetime import datetime, timezone
 from typing import Any, Optional, TYPE_CHECKING

-import sqlalchemy as sa
-from sqlalchemy import String, Text, DateTime, Boolean, ForeignKey, CheckConstraint
+from sqlalchemy import String, Text, DateTime, Boolean, ForeignKey, CheckConstraint, Index
 from sqlalchemy import text as sa_text
 from sqlalchemy.orm import Mapped, mapped_column, relationship
 from sqlalchemy.dialects.postgresql import UUID, JSONB
@@ -59,6 +58,12 @@ class L1WalkSession(Base):
            "OR (session_kind IN ('adhoc', 'ai_build') AND flow_id IS NULL AND flow_proposal_id IS NULL)",
            name="ck_l1_walk_sessions_target_consistency",
        ),
+        # Partial index backing GET /l1/escalations (the engineer handoff queue).
+        Index(
+            "ix_l1_walk_sessions_escalated",
+            "account_id", sa_text("last_step_at DESC"),
+            postgresql_where=sa_text("status = 'escalated'"),
+        ),
    )

    id: Mapped[uuid.UUID] = mapped_column(
@@ -86,6 +91,14 @@ class L1WalkSession(Base):

    # ── Session kind + target ──
    session_kind: Mapped[str] = mapped_column(String(20), nullable=False)
+    # AI-build context (ai_build sessions only). Persisted at intake so /next-node
+    # never has to re-fetch the ticket or scan walked_path to recover them — they
+    # are immutable for the life of the session. Replaces the former hidden
+    # ``{"node_type":"meta"}`` walked_path entry (deleted: it leaked into every
+    # consumer that forgot to skip it — junk proposals, off-by-one depth cap,
+    # blank escalation rows).
+    category: Mapped[Optional[str]] = mapped_column(String(100), nullable=True)
+    problem_text: Mapped[Optional[str]] = mapped_column(Text(), nullable=True)
    flow_id: Mapped[Optional[uuid.UUID]] = mapped_column(
        UUID(as_uuid=True),
        ForeignKey("trees.id", ondelete="SET NULL"),
@@ -99,6 +112,12 @@ class L1WalkSession(Base):

    # ── Navigation state ──
    current_node_id: Mapped[Optional[str]] = mapped_column(String(100), nullable=True)
+    # The node served to the tech but not yet answered (ai_build only). Replayed on
+    # the next /next-node call with node_id=None so a refresh / StrictMode double-mount
+    # doesn't fire a fresh paid LLM call (and possibly swap the question mid-answer).
+    pending_node: Mapped[Optional[dict[str, Any]]] = mapped_column(
+        JSONB(), nullable=True,
+    )
    walked_path: Mapped[list[dict[str, Any]]] = mapped_column(
        JSONB(), nullable=False, server_default=sa_text("'[]'::jsonb"),
    )
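The replay behaviour described in the `pending_node` comment, together with server-assigned node ids, can be sketched roughly as follows. This is a simplified illustration, not the project's `advance_ai_build`: the session is a plain dict, `generate` is a hypothetical stand-in for the paid LLM call, and the handling of a mismatched `node_id` is an assumption.

```python
import uuid
from typing import Any, Callable, Optional

Node = dict[str, Any]


def assign_id(node: Node) -> Node:
    """Return a copy with a short server-side id; any model-supplied id is overwritten."""
    return {**node, "id": uuid.uuid4().hex[:8]}


def next_node(
    session: dict[str, Any],
    node_id: Optional[str],
    answer: Optional[str],
    generate: Callable[[], Node],
) -> Node:
    """Replay the pending node on a bare call; otherwise record the answer and advance."""
    pending = session.get("pending_node")

    if node_id is None:
        if pending is not None:
            # Refresh or double-mount: serve the same node, no new LLM call.
            return pending
    elif pending is None or pending["id"] != node_id:
        # Assumed behaviour: reject answers that do not target the served node.
        raise ValueError("node_id does not match the pending node")
    else:
        session["walked_path"].append({"node_id": node_id, "answer": answer})

    node = assign_id(generate())
    session["pending_node"] = node
    return node
```

Because the id comes from the server, the client can only advance by echoing back an id the session actually holds, and a repeated call with `node_id=None` is idempotent until an answer arrives.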
--- a/backend/app/schemas/l1.py
+++ b/backend/app/schemas/l1.py
@@ -3,33 +3,54 @@ from datetime import datetime
 from typing import Any, Literal, Optional
 from uuid import UUID

-from pydantic import BaseModel, Field
+from pydantic import BaseModel, Field, model_validator


 class IntakeRequest(BaseModel):
    problem_statement: str = Field(..., min_length=1)
    customer_name: Optional[str] = None
    customer_contact: Optional[str] = None
+    # When set, bypass the matcher and start this published flow directly (the
+    # suggest card's "Use this flow" — the client already holds the flow id).
    flow_id: Optional[UUID] = None
+    # When True, start an ad-hoc free-form walk (the out_of_scope prompt's
+    # "Walk it ad-hoc" fallback). Not meant to be combined with flow_id or
+    # force_build; flow_id takes precedence if it is set alongside adhoc.
+    adhoc: bool = False
    force_build: bool = False


+# Outcomes that start a session (and therefore must carry session_id + ticket).
+_SESSION_OUTCOMES = {"matched", "build", "adhoc"}
+
+
 class IntakeResponse(BaseModel):
-    outcome: Literal["matched", "suggest", "out_of_scope", "build"]
+    outcome: Literal["matched", "suggest", "out_of_scope", "build", "adhoc"]
    session_id: Optional[UUID] = None
    session_kind: Optional[Literal["flow", "proposal", "adhoc", "ai_build"]] = None
    ticket_id: Optional[str] = None
-    ticket_kind: Optional[str] = None
+    ticket_kind: Optional[Literal["psa", "internal"]] = None
    flow_id: Optional[UUID] = None   # for 'matched'
    near_miss: Optional[dict] = None  # for 'suggest'
    category: Optional[str] = None   # for 'out_of_scope'

+    @model_validator(mode="after")
+    def _check_outcome_invariants(self) -> "IntakeResponse":
+        """Restore the per-outcome contract the frontend depends on: a session
+        outcome MUST carry the session_id + ticket the walker navigates to, so a
+        backend regression surfaces here instead of as /l1/walk/undefined."""
+        if self.outcome in _SESSION_OUTCOMES:
+            if self.session_id is None or self.ticket_id is None:
+                raise ValueError(
+                    f"intake outcome '{self.outcome}' requires session_id + ticket_id"
+                )
+        return self
+

 class NextNodeRequest(BaseModel):
    node_id: Optional[str] = None
    node_text: Optional[str] = None  # rendered text of the node being answered (carry-forward Task 8)
-    answer: Optional[str] = None     # 'yes' | 'no' for questions
-    acknowledged: Optional[bool] = None
+    answer: Optional[str] = None     # 'yes' | 'no' for questions; None acks an instruction
    note: Optional[str] = None


@@ -70,6 +91,8 @@ class EscalateWithoutWalkRequest(BaseModel):
 class WalkSessionResponse(BaseModel):
    id: UUID
    session_kind: str
+    category: Optional[str] = None
+    problem_text: Optional[str] = None
    flow_id: Optional[UUID]
    flow_proposal_id: Optional[UUID]
    current_node_id: Optional[str]
--- a/backend/app/services/ai_tree_builder.py
+++ b/backend/app/services/ai_tree_builder.py
@@ -7,6 +7,7 @@ for flywheel capture.
 """
 import logging
 from typing import Any, Optional
+from uuid import uuid4

 from app.core.ai_provider import get_ai_provider
 from app.core.config import settings
@@ -37,32 +38,57 @@ HARD RULES:
 - When you run out of safe in-scope steps, DO NOT GUESS. Emit an "escalate" node.

 Return ONLY a JSON object for ONE node, one of:
-{"node_type":"question","text":"<yes/no question>"}
+{"node_type":"question","text":"<binary question>","yes_label":"<button text>","no_label":"<button text>"}
 {"node_type":"instruction","text":"<one safe reversible action>"}
 {"node_type":"resolved","text":"<confirmation the issue is fixed>"}
 {"node_type":"escalate","reason_category":"exhausted_safe_steps","text":"<why>"}
 No prose, no markdown fences.
+
+QUESTION LABELS: yes_label and no_label are the literal button texts the tech
+clicks — each must be a direct, complete answer to the question. For a plain
+yes/no question use "Yes"/"No". If the question offers two alternatives
+("Is it X or Y?"), the labels MUST be those alternatives (yes_label = the
+first), e.g. {"text":"Is the account a Microsoft account or a local account?",
+"yes_label":"Microsoft account","no_label":"Local account"}. Never pair an
+alternatives question with Yes/No labels. Keep labels under 6 words.
 """


-def _strip_meta(walked_path: list[dict]) -> list[dict]:
-    """Drop the hidden ``meta`` entry (category carrier) the intake endpoint seeds.
+def _assign_id(node: dict[str, Any]) -> dict[str, Any]:
+    """Stamp a stable server-side id on a generated node (Finding 1).

-    The first walked_path entry on an ai_build session may be a
-    ``{"node_type": "meta", "category": ...}`` marker used to persist the
-    classified category; it is not a real walk step and must be excluded from
-    both model context and tree normalization.
+    The SYSTEM_PROMPT never asks the model for an id — and we must not, since a
+    model-invented id is neither stable nor trustworthy. But the advance protocol
+    keys on ``node_id``: without one, the answer to every node is discarded and
+    the walk can never progress past the first question. So every node the builder
+    hands back — generated, depth-capped, or generation-failed — gets an id here.
    """
-    return [s for s in walked_path if s.get("node_type") != "meta"]
+    if not node.get("id"):
+        node["id"] = uuid4().hex[:8]
+    return node
+
+
+def _ensure_labels(node: dict[str, Any]) -> dict[str, Any]:
+    """Default question labels to Yes/No when the model omits them.
+
+    Labels are the literal button texts; downstream (UI, walked_path
+    answer_label, LLM context) assumes every served question carries both.
+    """
+    if node.get("node_type") == "question":
+        node["yes_label"] = (node.get("yes_label") or "Yes").strip() or "Yes"
+        node["no_label"] = (node.get("no_label") or "No").strip() or "No"
+    return node


 def _build_context(problem_text: str, category: str, walked_path: list[dict]) -> str:
-    walked_path = _strip_meta(walked_path)
    lines = [f"PROBLEM: {problem_text}", f"CATEGORY: {category}", "STEPS SO FAR:"]
    if not walked_path:
        lines.append("(none yet — produce the first diagnostic question)")
    for i, step in enumerate(walked_path, 1):
-        ans = step.get("answer")
+        # Prefer the chosen label: for an alternatives question
+        # ("Microsoft account or local account?"), a raw "yes" is ambiguous
+        # and degrades the next generation.
+        ans = step.get("answer_label") or step.get("answer")
        suffix = f" -> {ans}" if ans else ""
        lines.append(f"{i}. [{step.get('node_type','?')}] {step.get('text','')}{suffix}")
    return "\n".join(lines)
@@ -76,16 +102,27 @@ def validate_node(node: dict[str, Any]) -> dict[str, Any]:
    for pat in HARD_FLOOR_TEXT_PATTERNS:
        if pat in text:
            raise UnsafeNodeError(f"hard-floor pattern '{pat}' in node text")
+    labels = [node.get(k) for k in ("yes_label", "no_label") if node.get(k) is not None]
+    if labels:
+        if not all(isinstance(lb, str) and lb.strip() for lb in labels):
+            raise UnsafeNodeError(f"malformed answer labels: {labels!r}")
+        if len(labels) == 2 and labels[0].strip().lower() == labels[1].strip().lower():
+            raise UnsafeNodeError(f"indistinct answer labels: {labels!r}")
+        for lb in labels:
+            low = lb.lower()
+            for pat in HARD_FLOOR_TEXT_PATTERNS:
+                if pat in low:
+                    raise UnsafeNodeError(f"hard-floor pattern '{pat}' in answer label")
    return node


 def escalate_if_depth_exceeded(walked_path: list[dict]) -> Optional[dict[str, Any]]:
    if len(walked_path) >= MAX_DEPTH:
-        return {
+        return _assign_id({
            "node_type": "escalate",
            "reason_category": "depth_cap",
            "text": "Reached the L1 troubleshooting depth limit — escalating to engineering.",
-        }
+        })
    return None


@@ -108,16 +145,16 @@ async def generate_next_node(
                max_tokens=1024,
            )
            node = parse_llm_json(raw)
-            return validate_node(node)
+            return _assign_id(_ensure_labels(validate_node(node)))
        except Exception as e:
            logger.warning("ai_tree_builder node attempt %d failed: %s", attempt + 1, e)
            continue

-    return {
+    return _assign_id({
        "node_type": "escalate",
        "reason_category": "generation_failed",
        "text": "Could not generate a safe next step — escalating to engineering.",
-    }
+    })


 def normalize_walked_path(walked_path: list[dict]) -> dict[str, Any]:
@@ -128,7 +165,6 @@ def normalize_walked_path(walked_path: list[dict]) -> dict[str, Any]:
    Returns {id, nodes: {id: node}} — a dict with an id (passes the proposal
    approval guard).
    """
-    walked_path = _strip_meta(walked_path)
    nodes: dict[str, Any] = {}
    if not walked_path:
        root_id = "root"
@@ -145,6 +181,10 @@ def normalize_walked_path(walked_path: list[dict]) -> dict[str, Any]:
        if step.get("reason_category"):
            node["reason_category"] = step["reason_category"]
        if ntype == "question":
+            if step.get("yes_label"):
+                node["yes_label"] = step["yes_label"]
+            if step.get("no_label"):
+                node["no_label"] = step["no_label"]
            answer = (step.get("answer") or "").lower()
            stub_seq += 1
            stub_id = f"review-{stub_seq}"
--- a/backend/app/services/l1_session_service.py
+++ b/backend/app/services/l1_session_service.py
@@ -3,6 +3,7 @@
 start_* functions live in T12; step/notes are T13; resolve/escalate are T14.
 """
 import json
+import logging
 from datetime import datetime, timezone
 from typing import Optional
 from uuid import UUID
@@ -18,6 +19,8 @@ from app.services import ai_tree_builder
 from app.services import internal_ticket_service
 from app.services.notification_service import notify

+logger = logging.getLogger(__name__)
+

 def _resolve_acting_as(user: User) -> Optional[str]:
    """An engineer (whether covering or not) gets tagged for audit when using L1 surface.
@@ -108,8 +111,15 @@ async def start_ai_build_session(
    user: User,
    ticket_id: str,
    ticket_kind: str,
+    category: Optional[str] = None,
+    problem_text: Optional[str] = None,
 ) -> L1WalkSession:
-    """Start an AI-built tree session (nodes generated on demand via next-node)."""
+    """Start an AI-built tree session (nodes generated on demand via next-node).
+
+    ``category`` and ``problem_text`` are the immutable AI-build context, stored
+    once here so /next-node never re-derives them (no ticket re-fetch, no
+    walked_path scan, no hidden meta entry).
+    """
    session = L1WalkSession(
        account_id=account_id,
        created_by_user_id=user.id,
@@ -117,6 +127,8 @@ async def start_ai_build_session(
        ticket_id=ticket_id,
        ticket_kind=ticket_kind,
        session_kind="ai_build",
+        category=category,
+        problem_text=problem_text,
    )
    db.add(session)
    await db.flush()
@@ -144,6 +156,11 @@ async def advance_ai_build(
    the caller/endpoint, which holds the served node. Storing it here ensures that
    later nodes receive full prior-step context via ``ai_tree_builder._build_context``
    and that captured flywheel trees (``normalize_walked_path``) have meaningful text.
+
+    Pending-node replay (Finding 8): the node served but not yet answered is stored
+    on ``session.pending_node``. When node_id is None and a pending node exists (a
+    refresh, a StrictMode double-mount, or back/forward), we replay it instead of
+    firing a fresh paid LLM call that might also swap the question mid-answer.
    """
    session = await db.get(L1WalkSession, session_id)
    if not session:
@@ -166,11 +183,34 @@ async def advance_ai_build(
            "answer": answer,
            "l1_note": note,
        }
+        # answer_label: the button text the tech actually clicked. Derived from
+        # the server-held pending_node (never client-supplied) so an
+        # alternatives question ("Microsoft account or local account?") records
+        # "Microsoft account", not a bare "yes", in the transcript, the LLM
+        # context, and the captured flywheel tree.
+        pending = session.pending_node
+        if (
+            answer in ("yes", "no")
+            and isinstance(pending, dict)
+            and pending.get("id") == node_id
+        ):
+            label = pending.get(f"{answer}_label")
+            if label:
+                entry["answer_label"] = label
+            if pending.get("yes_label"):
+                entry["yes_label"] = pending["yes_label"]
+            if pending.get("no_label"):
+                entry["no_label"] = pending["no_label"]
        # JSONB requires assigning a new list — in-place mutation isn't tracked
        session.walked_path = [*session.walked_path, entry]
+        session.pending_node = None  # the served node has now been answered
+    elif session.pending_node is not None:
+        # Re-mount before answering — return the already-served node verbatim.
+        return session.pending_node

    next_node = await ai_tree_builder.generate_next_node(
        problem_text, category, session.walked_path)
+    session.pending_node = next_node
    session.current_node_id = next_node.get("id")
    session.last_step_at = datetime.now(timezone.utc)
    await db.flush()
@@ -361,24 +401,36 @@ async def escalate(
    )

    # Notify engineers (owner/admin/engineer roles) about the escalation.
+    # Also exclude soft-deleted users (is_active alone misses them; handoff_manager
+    # applies the same filter): a deleted engineer must not be paged.
    eng_rows = await db.execute(
        select(User.id).where(
            User.account_id == session.account_id,
            User.is_active.is_(True),
+            User.deleted_at.is_(None),
            User.account_role.in_(("owner", "admin", "engineer")),
        )
    )
    target_ids = [r[0] for r in eng_rows.all()]
+    if not target_ids:
+        # No eligible engineer. Passing [] to notify() would suppress the in-app
+        # notification entirely (explicit-empty is honored). Fall back to the
+        # default owner/admin recipient set instead of silently dropping it.
+        logger.warning(
+            "L1 escalation for session %s has no active engineer recipients; "
+            "falling back to default owner/admin notification set.",
+            session.id,
+        )
    await notify(
        "l1.session.escalated",
        session.account_id,
        {
-            "problem_summary": session.ticket_id,
+            "problem_summary": session.problem_text or session.ticket_id,
            "session_id": str(session.id),
            "reason_category": reason_category,
        },
        db,
-        target_user_ids=target_ids,
+        target_user_ids=target_ids or None,
    )

    await db.flush()
--- a/backend/app/services/match_or_build.py
+++ b/backend/app/services/match_or_build.py
@@ -52,7 +52,6 @@ async def match_or_build(
    account_id: UUID,
    problem_text: str,
    problem_domain: Optional[str],
-    ticket_ref: str,  # passed through for caller/session use; not consumed here (Task 10)
    *,
    db: AsyncSession,
    force_build: bool = False,
--- a/backend/tests/test_ai_tree_builder.py
+++ b/backend/tests/test_ai_tree_builder.py
@@ -2,6 +2,129 @@ import pytest
 from app.services import ai_tree_builder as atb


+class _FakeProvider:
+    def __init__(self, raw):
+        self._raw = raw
+
+    async def generate_json(self, *, system_prompt, messages, max_tokens):
+        return self._raw, None, None
+
+
+@pytest.mark.asyncio
+async def test_generate_next_node_assigns_id_when_model_omits_it(monkeypatch):
+    """The SYSTEM_PROMPT never asks the model for an id (Finding 1). The server
+    must assign one to every generated node, or the advance protocol — which keys
+    on node_id — can never record an answer and the walk stalls on question 1."""
+    monkeypatch.setattr(
+        atb, "get_ai_provider",
+        lambda *a, **k: _FakeProvider('{"node_type":"question","text":"Plugged in?"}'),
+    )
+    node = await atb.generate_next_node("printer down", "printer", [])
+    assert node["node_type"] == "question"
+    assert node.get("id"), "generated node must carry a server-assigned id"
+
+
+@pytest.mark.asyncio
+async def test_generate_next_node_depth_cap_node_has_id(monkeypatch):
+    """The depth-cap escalate node must also carry an id (it is persisted as
+    current_node_id and may be appended to walked_path)."""
+    walked = [{"node_type": "question", "id": f"n{i}", "text": "?", "answer": "no"}
+              for i in range(atb.MAX_DEPTH)]
+    node = await atb.generate_next_node("x", "printer", walked)
+    assert node["node_type"] == "escalate"
+    assert node.get("id")
+
+
+@pytest.mark.asyncio
+async def test_generate_next_node_generation_failed_node_has_id(monkeypatch):
+    """When both generation attempts fail, the fallback escalate node carries an id."""
+    monkeypatch.setattr(
+        atb, "get_ai_provider",
+        lambda *a, **k: _FakeProvider("not json at all"),
+    )
+    node = await atb.generate_next_node("x", "printer", [])
+    assert node["node_type"] == "escalate"
+    assert node["reason_category"] == "generation_failed"
+    assert node.get("id")
+
+
+# ---------------------------------------------------------------------------
+# Answer labels: the button text must match the question (live-walk defect:
+# "Microsoft account or local account?" rendered with Yes/No buttons).
+# ---------------------------------------------------------------------------
+
+def test_system_prompt_requires_answer_labels():
+    """The prompt must mandate yes_label/no_label on question nodes — the prompt
+    forcing label-less '<yes/no question>' output is the root cause of the
+    question/button mismatch."""
+    assert "yes_label" in atb.SYSTEM_PROMPT and "no_label" in atb.SYSTEM_PROMPT
+
+
+@pytest.mark.asyncio
+async def test_generated_question_passes_labels_through(monkeypatch):
+    monkeypatch.setattr(
+        atb, "get_ai_provider",
+        lambda *a, **k: _FakeProvider(
+            '{"node_type":"question",'
+            '"text":"Is Jane\'s Windows account a Microsoft account or a local account?",'
+            '"yes_label":"Microsoft account","no_label":"Local account"}'
+        ),
+    )
+    node = await atb.generate_next_node("login issue", "account_login", [])
+    assert node["yes_label"] == "Microsoft account"
+    assert node["no_label"] == "Local account"
+
+
+@pytest.mark.asyncio
+async def test_question_missing_labels_gets_yes_no_defaults(monkeypatch):
+    monkeypatch.setattr(
+        atb, "get_ai_provider",
+        lambda *a, **k: _FakeProvider('{"node_type":"question","text":"Is the printer powered on?"}'),
+    )
+    node = await atb.generate_next_node("printer down", "printer", [])
+    assert node["yes_label"] == "Yes"
+    assert node["no_label"] == "No"
+
+
+def test_validate_node_rejects_hard_floor_text_in_labels():
+    node = {"node_type": "question", "text": "How should we proceed?",
+            "yes_label": "Edit the registry", "no_label": "Wait"}
+    with pytest.raises(atb.UnsafeNodeError):
+        atb.validate_node(node)
+
+
+def test_validate_node_rejects_indistinct_or_malformed_labels():
+    base = {"node_type": "question", "text": "Which network is the laptop on?"}
+    with pytest.raises(atb.UnsafeNodeError):
+        atb.validate_node({**base, "yes_label": "Wi-Fi", "no_label": "wi-fi "})
+    with pytest.raises(atb.UnsafeNodeError):
+        atb.validate_node({**base, "yes_label": 1, "no_label": "Ethernet"})
+
+
+def test_build_context_prefers_answer_label_over_raw_answer():
+    """The LLM context must show what the tech actually chose — 'Q? -> yes' is
+    ambiguous for an alternatives question and degrades the next generation."""
+    ctx = atb._build_context("login issue", "account_login", [
+        {"node_type": "question", "id": "n1",
+         "text": "Microsoft account or local account?",
+         "answer": "yes", "answer_label": "Microsoft account"},
+    ])
+    assert "-> Microsoft account" in ctx
+    assert "-> yes" not in ctx
+
+
+def test_normalize_walked_path_preserves_question_labels():
+    walked = [
+        {"node_type": "question", "id": "n1", "text": "Wi-Fi or Ethernet?",
+         "answer": "yes", "answer_label": "Wi-Fi",
+         "yes_label": "Wi-Fi", "no_label": "Ethernet"},
+        {"node_type": "resolved", "id": "n2", "text": "Fixed."},
+    ]
+    tree = atb.normalize_walked_path(walked)
+    n1 = tree["nodes"]["n1"]
+    assert n1["yes_label"] == "Wi-Fi" and n1["no_label"] == "Ethernet"
+
+
 def test_validate_node_rejects_hard_floor_text():
    node = {"node_type": "instruction", "id": "n1", "text": "Open regedit and change the key", "next": "generate"}
    with pytest.raises(atb.UnsafeNodeError):
--- a/backend/tests/test_flow_proposal_l1_source.py
+++ b/backend/tests/test_flow_proposal_l1_source.py
@@ -1,5 +1,13 @@
 import uuid
+
+import pytest
+from sqlalchemy import select
+from sqlalchemy.ext.asyncio import AsyncSession
+
+from app.models.account import Account
 from app.models.flow_proposal import FlowProposal
+from app.models.l1_walk_session import L1WalkSession
+from app.models.user import User


 def test_flow_proposal_accepts_l1_session_id_without_source_session():
@@ -14,3 +22,44 @@ def test_flow_proposal_accepts_l1_session_id_without_source_session():
        status="pending",
    )
    assert p.l1_session_id is not None and p.source_session_id is None
+
+
+@pytest.mark.asyncio
+async def test_deleting_l1_session_cascades_proposal_not_check_violation(test_db: AsyncSession):
+    """Finding 6: an L1-sourced proposal has source_session_id NULL by the exactly-one
+    CHECK. With ondelete=CASCADE the proposal dies with its session; the old SET NULL
+    would have NULLed both columns and aborted the DELETE on the CHECK (time bomb)."""
+    s = str(uuid.uuid4())[:8]
+    account = Account(id=uuid.uuid4(), name=f"Acct {s}", display_code=s.upper())
+    test_db.add(account)
+    await test_db.flush()
+    user = User(
+        id=uuid.uuid4(), email=f"u-{uuid.uuid4()}@example.com", name="U",
+        account_id=account.id, account_role="l1_tech", role="engineer", is_active=True,
+    )
+    test_db.add(user)
+    await test_db.flush()
+    session = L1WalkSession(
+        account_id=account.id, created_by_user_id=user.id,
+        ticket_id="t-cascade", ticket_kind="internal", session_kind="ai_build",
+    )
+    test_db.add(session)
+    await test_db.flush()
+    proposal = FlowProposal(
+        account_id=account.id, l1_session_id=session.id, source_session_id=None,
+        proposal_type="new_flow", title="AI L1 draft",
+        proposed_flow_data={"tree_structure": {"id": "root"}},
+        source="ai_realtime_l1", status="pending",
+    )
+    test_db.add(proposal)
+    await test_db.flush()
+    pid = proposal.id
+
+    # Delete the session — must succeed and cascade to the proposal.
+    await test_db.delete(session)
+    await test_db.flush()
+
+    remaining = (await test_db.execute(
+        select(FlowProposal).where(FlowProposal.id == pid)
+    )).scalar_one_or_none()
+    assert remaining is None
--- a/backend/tests/test_l1_api_ai_build.py
+++ b/backend/tests/test_l1_api_ai_build.py
@@ -155,3 +155,73 @@ async def test_escalations_forbidden_for_l1_tech(client: AsyncClient, test_db: A
    info = await _make_user(client, test_db, email="aib_l1@example.com", account_role="l1_tech")
    r = await client.get("/api/v1/l1/escalations", headers=info["headers"])
    assert r.status_code == 403, r.text
+
+
+@pytest.mark.asyncio
+async def test_intake_with_flow_id_starts_flow_directly(client: AsyncClient, test_db: AsyncSession):
+    """Finding 4: an explicit flow_id bypasses the matcher and starts that flow."""
+    from app.models.tree import Tree
+    info = await _make_user(client, test_db, email="aib_flowid@example.com", account_role="l1_tech")
+    tree = Tree(
+        id=uuid.uuid4(), name="VPN Flow", account_id=info["account_id"],
+        author_id=info["user_id"], tree_type="troubleshooting",
+        tree_structure={"nodes": [], "edges": []}, visibility="team", status="published",
+    )
+    test_db.add(tree)
+    await test_db.commit()
+
+    # match_or_build must NOT be called when flow_id is supplied.
+    with patch(
+        "app.api.endpoints.l1.match_or_build.match_or_build",
+        new=AsyncMock(side_effect=AssertionError("matcher should be bypassed")),
+    ):
+        r = await client.post(
+            "/api/v1/l1/intake",
+            json={"problem_statement": "vpn down", "flow_id": str(tree.id)},
+            headers=info["headers"],
+        )
+    assert r.status_code == 200, r.text
+    body = r.json()
+    assert body["outcome"] == "matched"
+    assert body["session_kind"] == "flow"
+    assert body["flow_id"] == str(tree.id)
+    assert body["session_id"]
+
+
+@pytest.mark.asyncio
+async def test_intake_adhoc_starts_adhoc_session(client: AsyncClient, test_db: AsyncSession):
+    """Finding 5: adhoc=True starts a free-form ad-hoc walk (out_of_scope fallback)."""
+    info = await _make_user(client, test_db, email="aib_adhoc@example.com", account_role="l1_tech")
+    with patch(
+        "app.api.endpoints.l1.match_or_build.match_or_build",
+        new=AsyncMock(side_effect=AssertionError("matcher should be bypassed")),
+    ):
+        r = await client.post(
+            "/api/v1/l1/intake",
+            json={"problem_statement": "weird thing", "adhoc": True},
+            headers=info["headers"],
+        )
+    assert r.status_code == 200, r.text
+    body = r.json()
+    assert body["outcome"] == "adhoc"
+    assert body["session_kind"] == "adhoc"
+    assert body["session_id"]
+
+
+@pytest.mark.asyncio
+async def test_intake_build_persists_category_and_problem_text(client: AsyncClient, test_db: AsyncSession):
+    """Root cause B: build stores category + problem_text on the session (no meta entry)."""
+    info = await _make_user(client, test_db, email="aib_cols@example.com", account_role="l1_tech")
+    with patch(
+        "app.api.endpoints.l1.match_or_build.match_or_build",
+        new=AsyncMock(return_value={"outcome": "build", "session_kind": "ai_build",
+                                    "category": "printer"}),
+    ):
+        r = await client.post("/api/v1/l1/intake",
+                              json={"problem_statement": "printer jam"}, headers=info["headers"])
+    sid = r.json()["session_id"]
+    sess = await test_db.get(L1WalkSession, uuid.UUID(sid))
+    assert sess.category == "printer"
+    assert sess.problem_text == "printer jam"
+    # No hidden meta entry smuggled into walked_path.
+    assert sess.walked_path == []
--- a/backend/tests/test_l1_categories_api.py
+++ b/backend/tests/test_l1_categories_api.py
@@ -1,6 +1,6 @@
 """Tests for the account L1 AI-build category settings API (Phase 2A).

-GET /accounts/me/l1-categories — readable by L1-or-above.
+GET /accounts/me/l1-categories — owner/admin only (Finding 7: read and write agree).
 PATCH /accounts/me/l1-categories — owner/admin only; drops unknown/hard-floored keys.
 """
 import uuid
@@ -65,12 +65,22 @@ async def test_get_categories_returns_enabled_available_hard_floor(client: Async


 @pytest.mark.asyncio
-async def test_get_categories_readable_by_l1_tech(client: AsyncClient, test_db: AsyncSession):
-    info = await _make_user(client, test_db, email="cat_l1_get@example.com", account_role="l1_tech")
+async def test_get_categories_readable_by_admin(client: AsyncClient, test_db: AsyncSession):
+    """Finding 7: account admins can READ (previously 403 on GET while they could PATCH)."""
+    info = await _make_user(client, test_db, email="cat_admin_get@example.com", account_role="admin")
    r = await client.get("/api/v1/accounts/me/l1-categories", headers=info["headers"])
    assert r.status_code == 200, r.text


+@pytest.mark.asyncio
+async def test_get_categories_forbidden_for_l1_tech(client: AsyncClient, test_db: AsyncSession):
+    """Finding 7: GET now matches PATCH (owner/admin only). The walker gates
+    server-side and never fetches this, so l1_tech read access was unused."""
+    info = await _make_user(client, test_db, email="cat_l1_get@example.com", account_role="l1_tech")
+    r = await client.get("/api/v1/accounts/me/l1-categories", headers=info["headers"])
+    assert r.status_code == 403, r.text
+
+
 @pytest.mark.asyncio
 async def test_patch_categories_owner_can_set(client: AsyncClient, test_db: AsyncSession):
    info = await _make_user(client, test_db, email="cat_owner_patch@example.com", account_role="owner")
--- a/backend/tests/test_l1_endpoints.py
+++ b/backend/tests/test_l1_endpoints.py
@@ -124,8 +124,9 @@ async def _create_adhoc_session(db: AsyncSession, info: dict, *, problem: str =
 async def test_intake_build_creates_ai_build_session(client: AsyncClient, test_db: AsyncSession):
    """POST /l1/intake with a 'build' outcome creates an ai_build session.

-    Phase 2A: intake dispatches via match_or_build; 'adhoc' is no longer a direct
-    intake outcome (it is offered from the out_of_scope prompt on the frontend).
+    Phase 2A: intake dispatches via match_or_build. An explicit adhoc=True (the
+    out_of_scope prompt's "Walk it ad-hoc") starts an ad-hoc session directly —
+    see test_l1_api_ai_build.test_intake_adhoc_starts_adhoc_session.
    """
    from unittest.mock import AsyncMock, patch
    info = await _make_l1_user(client, test_db, email="l1intake@example.com")
--- a/backend/tests/test_l1_session_service.py
+++ b/backend/tests/test_l1_session_service.py
@@ -11,6 +11,7 @@ from app.models.user import User
 from app.models.tree import Tree
 from app.models.ai_session import AISession
 from app.models.flow_proposal import FlowProposal
+from app.models.l1_walk_session import L1WalkSession
 from app.services.l1_session_service import (
    start_flow_session,
    start_proposal_session,
@@ -1073,3 +1074,173 @@ async def test_escalate_without_walk_writes_audit_log(test_db: AsyncSession):
    )
    row = result.scalar_one()
    assert row.account_id == account.id
+    # Audit coverage: the reason category must be recorded (restored — a prior
+    # edit dropped this assertion, weakening the audit guarantee).
+    assert row.details["escalation_reason_category"] == "no_kb_content"
+
+
+# ---------------------------------------------------------------------------
+# Finding 1 (server-assigned node ids) + Finding 8 (pending-node replay)
+# ---------------------------------------------------------------------------
+
+class _FakeProvider:
+    def __init__(self, raw):
+        self._raw = raw
+
+    async def generate_json(self, *, system_prompt, messages, max_tokens):
+        return self._raw, None, None
+
+
+@pytest.mark.asyncio
+async def test_ai_build_first_node_carries_id_and_advance_grows_walk(
+    test_db: AsyncSession, monkeypatch,
+):
+    """Finding 1 contract: the SYSTEM_PROMPT never asks for an id, yet the first
+    generated node must carry one — and advancing with that id must grow walked_path
+    (the original showstopper: node_id was always None, so the walk never advanced)."""
+    from app.services import l1_session_service as svc
+    from app.services import ai_tree_builder
+    account = await _make_account(test_db)
+    l1_user = await _make_user(test_db, account_id=account.id)
+    s = await svc.start_ai_build_session(
+        test_db, account_id=account.id, user=l1_user,
+        ticket_id="t-contract", ticket_kind="internal",
+        category="printer", problem_text="printer offline")
+
+    # Real generator + a provider that omits id (the shape the model produces).
+    monkeypatch.setattr(
+        ai_tree_builder, "get_ai_provider",
+        lambda *a, **k: _FakeProvider('{"node_type":"question","text":"Plugged in?"}'))
+
+    first = await svc.advance_ai_build(
+        test_db, session_id=s.id, problem_text="printer offline",
+        category="printer", node_id=None)
+    assert first.get("id"), "first node must carry a server-assigned id"
+
+    # Answer it with the id we were handed; walked_path must grow by one.
+    await svc.advance_ai_build(
+        test_db, session_id=s.id, problem_text="printer offline", category="printer",
+        node_id=first["id"], node_text=first["text"], answer="no")
+    refreshed = await test_db.get(L1WalkSession, s.id)
+    assert len(refreshed.walked_path) == 1
+    assert refreshed.walked_path[0]["id"] == first["id"]
+
+
+@pytest.mark.asyncio
+async def test_advance_ai_build_replays_pending_node_without_regenerating(
+    test_db: AsyncSession, monkeypatch,
+):
+    """Finding 8: a re-mount (node_id=None) replays the served-but-unanswered node
+    instead of firing a fresh paid LLM call (which could also swap the question)."""
+    from app.services import l1_session_service as svc
+    from app.services import ai_tree_builder
+    account = await _make_account(test_db)
+    l1_user = await _make_user(test_db, account_id=account.id)
+    s = await svc.start_ai_build_session(
+        test_db, account_id=account.id, user=l1_user,
+        ticket_id="t-replay", ticket_kind="internal",
+        category="printer", problem_text="printer offline")
+
+    calls = {"n": 0}
+
+    async def fake_next(problem, category, walked):
+        calls["n"] += 1
+        return {"node_type": "question", "id": f"q{calls['n']}", "text": "?"}
+    monkeypatch.setattr(ai_tree_builder, "generate_next_node", fake_next)
+
+    first = await svc.advance_ai_build(
+        test_db, session_id=s.id, problem_text="p", category="printer", node_id=None)
+    # Re-mount without answering — must NOT regenerate.
+    replay = await svc.advance_ai_build(
+        test_db, session_id=s.id, problem_text="p", category="printer", node_id=None)
+    assert calls["n"] == 1
+    assert replay["id"] == first["id"]
+
+
+@pytest.mark.asyncio
+async def test_advance_ai_build_records_answer_label_from_pending_node(
+    test_db: AsyncSession, monkeypatch,
+):
+    """When the served question carried yes_label/no_label, answering it must
+    record the chosen label (answer_label) in walked_path — derived server-side
+    from pending_node, never trusted from the client. 'Microsoft account or
+    local account? -> yes' is meaningless in the transcript and the LLM context."""
+    from app.services import l1_session_service as svc
+    from app.services import ai_tree_builder
+    account = await _make_account(test_db)
+    l1_user = await _make_user(test_db, account_id=account.id)
+    s = await svc.start_ai_build_session(
+        test_db, account_id=account.id, user=l1_user,
+        ticket_id="t-label", ticket_kind="internal",
+        category="account_login", problem_text="login issue")
+
+    async def fake_next(problem, category, walked):
+        return {"node_type": "question", "id": "q-acct",
+                "text": "Is the account a Microsoft account or a local account?",
+                "yes_label": "Microsoft account", "no_label": "Local account"}
+    monkeypatch.setattr(ai_tree_builder, "generate_next_node", fake_next)
+
+    first = await svc.advance_ai_build(
+        test_db, session_id=s.id, problem_text="login issue",
+        category="account_login", node_id=None)
+    await svc.advance_ai_build(
+        test_db, session_id=s.id, problem_text="login issue",
+        category="account_login",
+        node_id=first["id"], node_text=first["text"], answer="yes")
+    refreshed = await test_db.get(L1WalkSession, s.id)
+    assert refreshed.walked_path[0]["answer"] == "yes"
+    assert refreshed.walked_path[0]["answer_label"] == "Microsoft account"
+
+
+# ---------------------------------------------------------------------------
+# Finding 10: escalation recipient resolution
+# ---------------------------------------------------------------------------
+
+@pytest.mark.asyncio
+async def test_escalate_skips_soft_deleted_engineer(test_db: AsyncSession, monkeypatch):
+    """A soft-deleted engineer must not be paged (is_active alone misses them)."""
+    from datetime import datetime, timezone
+    from app.services import l1_session_service as svc
+    calls = {}
+
+    async def fake_notify(event, account_id, payload, db, target_user_ids=None):
+        calls["target_user_ids"] = target_user_ids
+    monkeypatch.setattr(svc, "notify", fake_notify)
+
+    account = await _make_account(test_db)
+    l1_user = await _make_user(test_db, account_id=account.id)
+    live_eng = await _make_user(test_db, account_id=account.id, account_role="engineer")
+    dead_eng = await _make_user(test_db, account_id=account.id, account_role="engineer")
+    dead_eng.deleted_at = datetime.now(timezone.utc)
+    await test_db.flush()
+    ticket = await _make_internal_ticket(test_db, account_id=account.id, user_id=l1_user.id)
+    s = await svc.start_ai_build_session(
+        test_db, account_id=account.id, user=l1_user,
+        ticket_id=str(ticket.id), ticket_kind="internal")
+    await svc.escalate(test_db, session_id=s.id, reason="x", reason_category="exhausted_safe_steps")
+    assert live_eng.id in calls["target_user_ids"]
+    assert dead_eng.id not in calls["target_user_ids"]
+
+
+@pytest.mark.asyncio
+async def test_escalate_with_no_engineers_falls_back_to_default_recipients(
+    test_db: AsyncSession, monkeypatch,
+):
+    """Finding 10: when no eligible engineer exists, pass None (not []) so notify()
+    falls back to the default owner/admin set instead of silently dropping it."""
+    from app.services import l1_session_service as svc
+    calls = {}
+
+    async def fake_notify(event, account_id, payload, db, target_user_ids=None):
+        calls["target_user_ids"] = target_user_ids
+    monkeypatch.setattr(svc, "notify", fake_notify)
+
+    account = await _make_account(test_db)
+    # Only an l1_tech exists — not in the owner/admin/engineer recipient query.
+    l1_user = await _make_user(test_db, account_id=account.id)
+    ticket = await _make_internal_ticket(test_db, account_id=account.id, user_id=l1_user.id)
+    s = await svc.start_ai_build_session(
+        test_db, account_id=account.id, user=l1_user,
+        ticket_id=str(ticket.id), ticket_kind="internal")
+    await svc.escalate(test_db, session_id=s.id, reason="x", reason_category="exhausted_safe_steps")
+    assert calls["target_user_ids"] is None
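The advance contract exercised by the Finding 1 and Finding 8 tests above can be sketched in isolation. This is a minimal in-memory model, not the code in `l1_session_service`: `SketchSession`, `assign_id`, and `advance` are hypothetical names, the generator is passed in as a plain callable, and rejecting a stale `node_id` with `ValueError` is an assumption rather than confirmed behavior.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class SketchSession:
    """In-memory stand-in for a walk session (hypothetical, not L1WalkSession)."""
    walked_path: list = field(default_factory=list)
    pending_node: Optional[dict] = None


def assign_id(node: dict) -> dict:
    """Stamp a server-side id; the generator is never asked to invent one."""
    return {**node, "id": uuid.uuid4().hex[:8]}


def advance(
    session: SketchSession,
    generate: Callable[[list], dict],
    node_id: Optional[str] = None,
    answer: Optional[str] = None,
) -> dict:
    pending = session.pending_node
    if node_id is None:
        # Re-mount: replay the served-but-unanswered node, no new generation.
        if pending is not None:
            return pending
    elif pending is not None and pending["id"] == node_id:
        entry = {"id": pending["id"], "text": pending["text"], "answer": answer}
        # Derive the label from the stored node, never from client input.
        label = pending.get(f"{answer}_label")
        if label:
            entry["answer_label"] = label
        session.walked_path = [*session.walked_path, entry]
    else:
        # Assumption: an unknown or stale id is rejected outright.
        raise ValueError("node_id does not match the pending node")
    session.pending_node = assign_id(generate(session.walked_path))
    return session.pending_node
```

Calling `advance(session, generate)` twice in a row returns the same node and invokes `generate` once; answering with the returned id appends exactly one entry to `walked_path` and triggers the next generation.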
--- a/backend/tests/test_match_or_build.py
+++ b/backend/tests/test_match_or_build.py
@@ -10,7 +10,7 @@ async def test_match_wins_before_category_gate():
    with patch.object(mob.flow_matching_engine, "find_matches", new=AsyncMock(
            return_value=[{"tree_id": str(uuid.uuid4()), "tree_name": "VPN", "score": 0.9}])), \
         patch.object(mob, "get_enabled_categories", new=AsyncMock(return_value=[])):
-        res = await mob.match_or_build(uuid.uuid4(), "vpn down", None, "t1", db=AsyncMock(), force_build=False)
+        res = await mob.match_or_build(uuid.uuid4(), "vpn down", None, db=AsyncMock(), force_build=False)
    assert res["outcome"] == "matched"
    assert res["session_kind"] == "flow"

@@ -19,7 +19,7 @@ async def test_match_wins_before_category_gate():
 async def test_suggest_band():
    with patch.object(mob.flow_matching_engine, "find_matches", new=AsyncMock(
            return_value=[{"tree_id": str(uuid.uuid4()), "tree_name": "X", "score": 0.66}])):
-        res = await mob.match_or_build(uuid.uuid4(), "p", None, "t1", db=AsyncMock(), force_build=False)
+        res = await mob.match_or_build(uuid.uuid4(), "p", None, db=AsyncMock(), force_build=False)
    assert res["outcome"] == "suggest"
    assert res["near_miss"]["flow_name"] == "X"
    assert "flow_id" in res["near_miss"] and isinstance(res["near_miss"]["flow_id"], str)
@@ -32,7 +32,7 @@ async def test_out_of_scope_when_category_disabled_on_build_path():
    with patch.object(mob.flow_matching_engine, "find_matches", new=AsyncMock(return_value=[])), \
         patch.object(mob, "classify", new=AsyncMock(return_value="printer")), \
         patch.object(mob, "get_enabled_categories", new=AsyncMock(return_value=["vpn_connect"])):
-        res = await mob.match_or_build(uuid.uuid4(), "printer jam", None, "t1", db=AsyncMock(), force_build=False)
+        res = await mob.match_or_build(uuid.uuid4(), "printer jam", None, db=AsyncMock(), force_build=False)
    assert res["outcome"] == "out_of_scope"


@@ -41,7 +41,7 @@ async def test_build_when_enabled_and_no_match():
    with patch.object(mob.flow_matching_engine, "find_matches", new=AsyncMock(return_value=[])), \
         patch.object(mob, "classify", new=AsyncMock(return_value="printer")), \
         patch.object(mob, "get_enabled_categories", new=AsyncMock(return_value=["printer"])):
-        res = await mob.match_or_build(uuid.uuid4(), "printer jam", None, "t1", db=AsyncMock(), force_build=False)
+        res = await mob.match_or_build(uuid.uuid4(), "printer jam", None, db=AsyncMock(), force_build=False)
    assert res["outcome"] == "build"
    assert res["session_kind"] == "ai_build"

@@ -52,7 +52,7 @@ async def test_force_build_skips_match_but_still_gates():
    with patch.object(mob.flow_matching_engine, "find_matches", new=fm), \
         patch.object(mob, "classify", new=AsyncMock(return_value="printer")), \
         patch.object(mob, "get_enabled_categories", new=AsyncMock(return_value=["printer"])):
-        res = await mob.match_or_build(uuid.uuid4(), "p", None, "t1", db=AsyncMock(), force_build=True)
+        res = await mob.match_or_build(uuid.uuid4(), "p", None, db=AsyncMock(), force_build=True)
    fm.assert_not_called()
    assert res["outcome"] == "build"

@@ -61,7 +61,7 @@ async def test_force_build_skips_match_but_still_gates():
 async def test_score_exactly_match_threshold_is_matched():
    with patch.object(mob.flow_matching_engine, "find_matches", new=AsyncMock(
            return_value=[{"tree_id": str(uuid.uuid4()), "tree_name": "X", "score": 0.75}])):
-        res = await mob.match_or_build(uuid.uuid4(), "p", None, "t1", db=AsyncMock(), force_build=False)
+        res = await mob.match_or_build(uuid.uuid4(), "p", None, db=AsyncMock(), force_build=False)
    assert res["outcome"] == "matched"


@@ -69,7 +69,7 @@ async def test_score_exactly_match_threshold_is_matched():
 async def test_score_exactly_suggest_threshold_is_suggest():
    with patch.object(mob.flow_matching_engine, "find_matches", new=AsyncMock(
            return_value=[{"tree_id": str(uuid.uuid4()), "tree_name": "X", "score": 0.60}])):
-        res = await mob.match_or_build(uuid.uuid4(), "p", None, "t1", db=AsyncMock(), force_build=False)
+        res = await mob.match_or_build(uuid.uuid4(), "p", None, db=AsyncMock(), force_build=False)
    assert res["outcome"] == "suggest"


@@ -79,7 +79,7 @@ async def test_score_below_suggest_falls_through_to_build_path():
            return_value=[{"tree_id": str(uuid.uuid4()), "tree_name": "X", "score": 0.4}])), \
         patch.object(mob, "classify", new=AsyncMock(return_value="printer")), \
         patch.object(mob, "get_enabled_categories", new=AsyncMock(return_value=["printer"])):
-        res = await mob.match_or_build(uuid.uuid4(), "printer", None, "t1", db=AsyncMock(), force_build=False)
+        res = await mob.match_or_build(uuid.uuid4(), "printer", None, db=AsyncMock(), force_build=False)
    assert res["outcome"] == "build"
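The routing these tests pin down can be summarized as a small decision function. This is a sketch under assumptions: `MATCH_THRESHOLD = 0.75` is quoted in the review findings, while `SUGGEST_THRESHOLD` and `route` are hypothetical names chosen for illustration, and the real `match_or_build` returns a richer result dict than a bare outcome string.

```python
from typing import Optional, Sequence

MATCH_THRESHOLD = 0.75    # value quoted in the review findings
SUGGEST_THRESHOLD = 0.60  # assumed name; value taken from the boundary test


def route(
    best_score: Optional[float],
    category: str,
    enabled_categories: Sequence[str],
    force_build: bool = False,
) -> str:
    """Return the outcome band for a problem statement.

    A match or suggestion wins before the category gate is consulted;
    force_build skips matching but still passes through the gate.
    """
    if not force_build and best_score is not None:
        if best_score >= MATCH_THRESHOLD:
            return "matched"
        if best_score >= SUGGEST_THRESHOLD:
            return "suggest"
    # Build path: only categories enabled for the account may be AI-built.
    return "build" if category in enabled_categories else "out_of_scope"
```

Both thresholds are inclusive, which is what the two "exactly at threshold" tests assert.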


--- a/docs/plans/2026-06-09-pr193-phase2a-review-findings.md
+++ b/docs/plans/2026-06-09-pr193-phase2a-review-findings.md
@@ -0,0 +1,180 @@
+# PR #193 (Phase 2A — L1 AI Tree Builder) Review Findings
+
+**Date:** 2026-06-09
+**Reviewed:** `feat/l1-ai-tree-builder-phase-2a` vs `main` (42 files, +2,326/−154)
+**Process:** 7 independent finder angles, every candidate independently verified against actual code (quoted lines confirmed, not speculation).
+**Verdict: DO NOT MERGE as-is.** The headline feature (AI-guided walkthrough) is non-functional end-to-end, two tasks recorded as complete in `.ai/HANDOFF.md` were never actually committed, and one DB constraint is a deletion time bomb.
+
+---
+
+## ✅ RESOLUTION (2026-06-09, same day)
+
+**All 10 findings resolved.** Two architectural decisions taken (see `.ai/DECISIONS.md`):
+the **root fix** for Findings 8/9 (real `category` / `problem_text` / `pending_node`
+columns on `l1_walk_sessions`; the `{"node_type":"meta"}` walked_path convention
+deleted entirely — migration `61dda4f615c6`), and **restoring the ad-hoc walk**
+(Finding 5 option a — `adhoc=True` intake + "Walk it ad-hoc" out_of_scope button).
+
+- **Finding 1** — `ai_tree_builder._assign_id` stamps `uuid4().hex[:8]` on every node
+  (generated, depth-cap, generation-failed); `current_node_id` now real. Contract test
+  added (`test_ai_build_first_node_carries_id_and_advance_grows_walk`).
+- **Finding 2a/3** — `L1EscalationsSection` mounted on `EscalationQueuePage`;
+  `ProposalDetail` `/pilot` link gated on `source_session_id`, L1-source block added.
+- **Finding 2b** — renders `step.question ?? step.text`, `timeAgo`, shows `problem_text`.
+- **Finding 4** — intake honors explicit `flow_id` (matcher bypassed); suggest card passes
+  `near_miss.flow_id`; the three intake handlers collapsed into one `runIntake`.
+- **Finding 5** — ad-hoc walk restored (option a).
+- **Finding 6** — `l1_session_id` FK → `ondelete=CASCADE` (model + migration); cascade-delete test.
+- **Finding 7** — owner+admin at all three layers (GET dep, route guard, `usePermissions`);
+  `require_account_owner_or_admin` delegates to `User.can_manage_account`; `User.account_role`
+  TS type gains `'admin'`.
+- **Finding 8** — `pending_node` column; `/next-node` replays the served node on re-mount
+  (no duplicate paid generation); reads context off the session (no ticket re-fetch).
+- **Finding 9** — meta entry gone → empty walk is falsy (no junk proposal) and the depth
+  cap counts only real steps.
+- **Finding 10** — `escalate` passes `target_ids or None` (default fallback), filters
+  `deleted_at IS NULL`, warns when empty; two tests.
+- **Cleanups** — dead `ticket_ref` deleted, `IntakeResponse` per-outcome validator + `ticket_kind`
+  Literal restored, unused `acknowledged` dropped, escalations partial index added, restored the
+  deleted `no_kb_content` audit assertion.
+
+**Verification:** full Phase 2A backend set **110 passed / 0 failed**; frontend `tsc -b` +
+`eslint` + `vite build` clean; migration upgrade→downgrade→upgrade roundtrip clean
+(columns + FK `confdeltype` + partial index confirmed); anti-parrot guardrail green.
+
+How to use this file: work the findings in order. Findings 1–7 are merge blockers; 8–10 can be fast-follows. Each finding lists the verified evidence (file:line) and a suggested fix. Several findings share two root causes — fix those at the root rather than patching symptoms:
+
+- **Root cause A:** AI-generated nodes have no `id`, but the advance protocol keys on `node_id`. (Finding 1; touches 8.)
+- **Root cause B:** The intake category is smuggled into `walked_path` as a fake `{"node_type":"meta"}` entry that every consumer must know to skip — and most don't. (Findings 2b, 9; the deeper fix is a real `category` column on `l1_walk_sessions`, plus `problem_text` while you're there — see Finding 8's note.)
+
+---
+
+## MERGE BLOCKERS
+
+### 1. AI walkthrough can never advance past the first question (showstopper)
+
+**Evidence:**
+- `backend/app/services/ai_tree_builder.py:37-41` — SYSTEM_PROMPT's JSON output shapes (`{"node_type":"question","text":...}` etc.) define **no `id` field**; `validate_node` (lines 71-79) returns the node unchanged; nothing anywhere assigns an id.
+- `backend/app/services/l1_session_service.py:156` — `advance_ai_build` only appends to `walked_path` `if node_id is not None`; docstring (line 139) says "On the first call (node_id is None) nothing is appended."
+- `frontend/src/components/l1/L1WalkTreeVariant.tsx:52-54` — sends `node_id: node.id`, which is `undefined` at runtime (server never sends an id; `JSON.stringify` drops undefined keys) → backend always receives `node_id=None`.
+- `l1_session_service.py:174` — `session.current_node_id = next_node.get("id")` is always `None`.
+
+**User impact:** Tech answers the first question → answer is discarded, the same (or a re-rolled) first question regenerates forever. `walked_path` never grows past the meta entry, the depth cap never fires, and resolve captures an empty tree.
+
+**Why tests missed it:** `test_l1_api_ai_build` and friends mock `advance_ai_build` / hand-craft nodes **with** `id` keys — a shape the real model is never instructed to produce.
+
+**Fix:** Assign a server-side id to every generated node before returning it (e.g., `uuid4().hex[:8]` in `generate_next_node` after `validate_node`), persist it as `session.current_node_id`, and add a test that runs the real (unmocked-shape) prompt contract: generate → assert node has id → advance with that id → assert walked_path grew. Do NOT ask the LLM to invent ids.
+
+### 2. Escalations from AI sessions go nowhere (two linked defects)
+
+**2a — Component never mounted.**
+- `grep -rn "L1EscalationsSection" frontend/src` → exactly one hit: its own definition (`frontend/src/components/l1/L1EscalationsSection.tsx:10`). It is imported nowhere.
+- `frontend/src/pages/EscalationQueuePage.tsx:3` imports only `EscalationQueue, EscalationMetricCard` from `@/components/flowpilot`.
+- `backend/app/services/notification_service.py:449` — `"l1.session.escalated": "/escalations"` deep-link → `frontend/src/router.tsx:299` renders `EscalationQueuePage` → engineer sees only FlowPilot escalations; the L1 handoff is invisible. `GET /l1/escalations` (`backend/app/api/endpoints/l1.py:330`) has no UI surface.
+- **Note:** `.ai/HANDOFF.md:38` claims "L1EscalationsSection on EscalationQueuePage" — that claim is false (documentation drift; see Finding 3).
+
+**2b — Component renders wrong fields once mounted.**
+- `backend/app/services/l1_session_service.py:162-168` — ai_build entries are `{"node_type", "id", "text", "answer", "l1_note"}` (key is `text`); legacy `record_step` (lines 199-204) uses `question`/`node_id`.
+- `L1EscalationsSection.tsx:61` renders `{step.question}` → blank for every ai_build entry.
+- `L1EscalationsSection.tsx:46` — `{s.walked_path.length} steps walked` counts the hidden meta entry → "N+1 steps walked".
+- `backend/app/api/endpoints/l1.py:41` — `_to_response` returns `walked_path` raw (meta entry included).
+
+**Fix:** Mount `L1EscalationsSection` on `EscalationQueuePage` (or fold L1 rows into the existing queue). Render `step.question ?? step.text`. Filter `node_type === 'meta'` entries — ideally server-side in `_to_response` (or eliminate the meta entry entirely per Root cause B). Also use the shared `timeAgo` util (`frontend/src/lib/timeAgo.ts`) instead of `new Date(...).toLocaleString()` at line 50, to match every sibling queue.
+
+### 3. Two tasks recorded as complete were never committed
+
+- Task 16 ("ProposalDetail L1-source block") and Task 17 (mounting the escalations section) appear in `.ai/HANDOFF.md` / `SESSION_LOG.md` as done, but **no hunk for `ProposalDetail.tsx` exists in the diff**, and Finding 2a proves the mount never happened.
+- Concrete user impact today: `frontend/src/components/flowpilot/ProposalDetail.tsx:91-101` renders the "Source Session" card unconditionally; line 95 is `` to={`/pilot/${proposal.source_session_id}`} `` with no null guard. L1-sourced proposals (created with `source_session_id=None`, `l1_session_id=<session>`) reach the review queue as `pending` → engineers get a broken **`/pilot/null`** link.
+
+**Fix:** Implement the missing work: in `ProposalDetail.tsx`, gate the `/pilot/` link on `source_session_id != null` and render an L1-source block (problem statement, category, link to the L1 session / escalations view) when `l1_session_id` is set. The backend already serves `l1_session_id` via `backend/app/schemas/flow_proposal.py`. Then correct `.ai/HANDOFF.md`.
+
+### 4. "Use this flow" button silently does nothing
+
+- `frontend/src/pages/l1/L1Dashboard.tsx:77-86` — `useSuggestedFlow` re-POSTs `/l1/intake` with the same text, no `flow_id`. The in-code comment ("it matches again and returns a `matched` outcome") is factually wrong: the same text scores in the same 0.60–0.75 suggest band (`backend/app/services/match_or_build.py:66-72`, `MATCH_THRESHOLD = 0.75` line 21) → `suggest` again, no `session_id` → handler falls to `resetPrompts()` and the card vanishes. The suggested flow can never be started.
+- `backend/app/api/endpoints/l1.py` — the rewritten intake **never reads `payload.flow_id`** (old branch deleted, diff confirms); `IntakeRequest.flow_id` (`backend/app/schemas/l1.py:13`) is now dead.
+
+**Fix:** Make intake honor an explicit `flow_id` (bypass the matcher, call `start_flow_session` directly — restores the deleted behavior), and have the suggest card pass `near_miss.flow_id`. This also kills the wasteful re-run of the embedding + pgvector + keyword pipeline just to rediscover a flow_id the client already holds.
+
+### 5. Out-of-scope problems lost the ad-hoc walk fallback
+
+- Old intake had `else: start_adhoc_session(...)`; the rewrite (`backend/app/api/endpoints/l1.py:88-102`) dispatches only matched/build/suggest/out_of_scope. `start_adhoc_session` (`l1_session_service.py:82`) now has **zero callers** — ad-hoc sessions are unreachable product-wide (the only remaining `session_kind="adhoc"` creation is `escalate_without_walk`, an audit record, not walkable).
+- `L1Dashboard.tsx:269-292` — out_of_scope prompt offers only "Escalate to engineering" / "Cancel".
+- Stale copy: `frontend/src/pages/account/L1CategoriesPage.tsx:57-58` still promises "Disabled categories fall back to an ad-hoc walk or escalation." A test comment in the diff also claims adhoc "is offered from the out_of_scope prompt"; it is not.
+
+**Fix (decide deliberately, don't drift):** Either (a) add a "Walk it ad-hoc" option to the out_of_scope prompt that hits a path creating an adhoc session (restore the capability), or (b) if dropping ad-hoc is intentional, fix the L1CategoriesPage copy and the test comment, and note the decision in `.ai/DECISIONS.md`. Option (a) preserves pre-existing user capability; recommend (a).
+
+### 6. DB constraint makes L1-session deletion always fail (time bomb)
+
+- `backend/app/models/flow_proposal.py:60-63` — `CheckConstraint("(source_session_id IS NOT NULL) <> (l1_session_id IS NOT NULL)")` (XOR).
+- Lines 87-92: `l1_session_id` FK is `ondelete="SET NULL"`; `source_session_id` (line 83) is `ondelete="CASCADE"`.
+- Migration `backend/alembic/versions/1fd88a68b145_flow_proposal_l1_source_linkage.py:32-45` ships the same DDL.
+- Postgres CHECK constraints are non-deferrable and ARE evaluated on the UPDATE produced by `ON DELETE SET NULL`. So deleting any `l1_walk_sessions` row referenced by a proposal (whose `source_session_id` is NULL by construction) → both columns NULL → CHECK violation → the DELETE fails. The `SET NULL` action can literally never fire successfully.
+- Reachable today via `backend/app/api/endpoints/admin.py:1336` (`hard_delete_user` → `db.delete(account)`, DB-side cascades with unspecified ordering), and via any future GDPR/retention purge.
+
+**Fix:** Change `l1_session_id` to `ondelete="CASCADE"` (matching `source_session_id`'s behavior: the proposal dies with its source), in both the model and a new migration. Keep the XOR check. The alternative, relaxing the check to a `num_nonnulls(...) <= 1` style constraint so that the both-NULL row left by `SET NULL` becomes legal, is weaker; prefer CASCADE.
+
+### 7. Account admins locked out of L1 category settings (3-layer inconsistency)
+
+- Frontend route: `frontend/src/router.tsx:368-372` — `requiredRole="owner"`; `frontend/src/hooks/usePermissions.ts:21-28` (`getEffectiveRole`) has **no admin branch** → `account_role='admin'` maps to `viewer` → bounced to /trees.
+- Backend GET: `backend/app/api/endpoints/accounts.py:175-178` uses `require_l1_or_above` (`deps.py:235-242`: `l1_tech/engineer/owner` only) → admin gets **403 on read**.
+- Backend PATCH: `accounts.py:193-197` uses `require_account_owner_or_admin` (`deps.py:279-289`) → admin **can write**.
+- `admin` is a real role: `backend/app/models/user.py:25` CHECK constraint; `user.py:132` treats admin as account-manager.
+
+**Fix:** Pick one rule — owner+admin manage L1 categories — and apply it at all three layers: GET should use `require_account_owner_or_admin` too (or a combined dep), and the route guard needs admins to pass (either add an admin branch to `getEffectiveRole` — check blast radius on other `requiredRole` uses first — or a dedicated `canManageAccount`-style guard for this route). Also note `require_account_owner_or_admin` duplicates `User.can_manage_account` (`user.py:130-132`); delegate to it.
+
+---
+
+## FAST-FOLLOWS (real bugs, lower urgency)
+
+### 8. Every walk-view mount fires a fresh paid LLM call (and may swap the question)
+
+- The served-but-unanswered node is never persisted: `l1_session_service.py:156-174` — `node_id is None` path goes straight to `generate_next_node`; only `current_node_id` (always None today, see Finding 1) is stored. No replay branch.
+- `L1WalkTreeVariant.tsx:26-44` — mount effect unconditionally POSTs `/next-node {}`; `frontend/src/main.tsx:4,34` — StrictMode is on, so dev double-mounts double-generate.
+
+**Impact:** Refresh/back-forward = duplicate Sonnet spend, multi-second stall, and possibly a *different* question than the one the tech was answering.
+
+**Fix:** Persist the pending node (e.g., a `pending_node` JSONB column on `l1_walk_sessions`, or reuse `current_node_id` + stored payload) and replay it when `node_id is None` and a pending node exists. Note: if adding columns, this is the moment to also add `category` and `problem_text` columns and delete the meta-entry convention (Root cause B) — `/next-node` currently re-fetches the internal ticket and re-scans walked_path on every step (`l1.py:302-310`) just to recover these immutable values.
+
+### 9. Hidden meta entry: junk proposals + depth cap off-by-one
+
+- Junk proposal: `l1_session_service.py:270` — `if helpful and session.session_kind == "ai_build" and session.walked_path:` — a meta-only walked_path (seeded at intake, `l1.py:132-134`) is truthy. `normalize_walked_path` (`ai_tree_builder.py:131-137`) strips meta → empty → returns the `"Empty walk — needs authoring."` stub, which "passes the proposal approval guard" per its own docstring → a `status="pending"`, `validated_by_outcome=True` junk proposal reaches the review queue when a tech resolves immediately after intake.
+- Depth cap: `l1_session_service.py:172-173` passes the **raw** walked_path; `ai_tree_builder.py:82-83,96-98` — `len(walked_path) >= MAX_DEPTH` (12) counts the meta entry → cap fires after 11 real steps. `_strip_meta` is applied only downstream.
+
+**Fix (symptom-level):** strip meta before both the truthiness guard and `escalate_if_depth_exceeded`. **Fix (root):** real `category` column, delete the meta convention (see Finding 8 note). Root fix preferred per project principle (correct architecture over minimal diff).
+
+### 10. Escalation notification silently dropped when recipient query is empty
+
+- `notification_service.py:180` changed `if target_user_ids:` → `if target_user_ids is not None:` (intentional, documented at lines 176-178).
+- `l1_session_service.py:371-381` — `escalate()` passes its computed `target_ids` unconditionally; if all owners/admins/engineers are inactive, `[]` → zero in-app notifications, no log, no fallback. (Existing callers are safe — they all use `[x] if x else None` patterns.)
+- Bonus divergence: escalate's hand-rolled query filters only `is_active`, while `handoff_manager.py:323-333` also filters `deleted_at IS NULL` — soft-deleted engineers would be notified.
+
+**Fix:** In `escalate()`: `target_user_ids=target_ids or None` (falls back to default recipients) plus a warning log when empty; add the `deleted_at` filter. Longer-term: give `_resolve_recipients` a roles parameter so callers stop hand-rolling recipient queries.
+
+---
+
+## Cleanups (optional, do alongside adjacent fixes)
+
+- `L1Dashboard.tsx:47-110` — `handleStart` / `useSuggestedFlow` / `buildNew` are three near-identical intake calls; collapse to one `runIntake(opts)` switching on `response.outcome` (this also prevents Finding-4-class drift).
+- `backend/app/schemas/l1.py` — `IntakeRequest.flow_id` is dead unless Finding 4 revives it; `NextNodeRequest.acknowledged` is sent by the frontend but never read by the backend (advance infers ack from `answer is None`) — wire it or drop it. `IntakeResponse` lost its per-outcome guarantees (all fields Optional, `ticket_kind` no longer `Literal["psa","internal"]`); add a `model_validator(mode="after")` requiring `session_id`/`ticket_id` when outcome is matched/build, and add a `session_id` null-guard before `navigate()` in `handleStart` (`L1Dashboard.tsx:58-59` — currently navigates to `/l1/walk/undefined` on a regression).
+- `backend/app/services/match_or_build.py:55` — unused positional `ticket_ref` param (only caller passes `""`); delete it. Also note `classify()` is a second bespoke LLM intake classifier alongside `flowpilot_engine._classify_intake`; `l1.py` passes `problem_domain=None` to matching, losing the domain signal the existing classifier provides — consider unifying in Phase 2B.
+- `backend/app/services/l1_category_service.py:17` + `models/account.py:75-81` — DEFAULT_L1_CATEGORIES duplicated as a hand-escaped JSON `server_default`; derive one from the other (migration copy stays frozen).
+- `frontend/src/pages/account/L1CategoriesPage.tsx` — local `prettify()` duplicates `humanizeFeatureKey` (`UpgradePrompt.tsx:62`); page skips shared `PageHeader`/`Spinner` used by sibling settings pages.
+- Missing index for `GET /l1/escalations` (`l1.py:338`): consider `CREATE INDEX ... ON l1_walk_sessions (account_id, last_step_at DESC) WHERE status = 'escalated'`.
+- `backend/tests/test_l1_session_service.py` — the `escalation_reason_category == "no_kb_content"` assertion was deleted from `test_escalate_without_walk_writes_audit_log`, weakening audit coverage; restore it.
+- Per-step `walked_path` rewrite is O(n²) cumulative bytes (`session.walked_path = [*session.walked_path, entry]`); bounded by MAX_DEPTH=12 so fine today — note for Phase 2B if depth grows.
+
+---
+
+## Suggested execution order
+
+1. Finding 1 (node ids) — unblocks everything; add the contract test.
+2. Finding 6 (FK/constraint) — new migration; do early so it ships in the same release.
+3. Findings 2 + 3 together (mount section, fix field names/meta filter, ProposalDetail L1 block + null-guard the /pilot link).
+4. Finding 4 (intake honors flow_id; suggest card passes it).
+5. Finding 5 (decide adhoc: restore option (a) recommended, or fix copy + DECISIONS.md).
+6. Finding 7 (align all three permission layers).
+7. Findings 8 + 9 via the root fix: add `category`/`problem_text` (+ optionally `pending_node`) columns, delete the meta-entry convention, strip-meta fixes become moot.
+8. Finding 10 (one-line guard + deleted_at filter).
+9. Cleanups opportunistically alongside the file they touch.
+
+After fixes: run the 11 Phase 2A backend test files together (authoritative gate per HANDOFF — do NOT trust a full local serial `pytest tests/`; use `--override-ini="addopts="`), frontend `tsc -b` + lint + build, and migration downgrade/upgrade roundtrip. Update `.ai/HANDOFF.md` to correct the Task 16/17 record.
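Finding 10 turns on the difference between an empty list and `None` once `notify()` checks `is not None`. The sketch below models only that distinction: `pick_recipients`, `escalation_targets`, and `DEFAULT_RECIPIENTS` are hypothetical stand-ins, not the functions in `notification_service` or `l1_session_service`.

```python
from typing import List, Optional, Sequence

# Hypothetical default set standing in for the owner/admin fallback.
DEFAULT_RECIPIENTS = ("owner-1", "admin-1")


def pick_recipients(target_user_ids: Optional[Sequence[str]]) -> List[str]:
    """Model of an `is not None` check: an explicit list is taken literally,
    so an empty list means "notify nobody"."""
    if target_user_ids is not None:
        return list(target_user_ids)
    return list(DEFAULT_RECIPIENTS)


def escalation_targets(engineer_ids: List[str]) -> Optional[List[str]]:
    """The `target_ids or None` guard: an empty result becomes None so the
    caller's default recipients apply instead of dropping the notification."""
    return engineer_ids or None
```

Passing the raw query result reproduces the defect (`pick_recipients([])` notifies no one), while routing it through `escalation_targets` first restores the fallback.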
--- a/frontend/src/components/flowpilot/ProposalDetail.tsx
+++ b/frontend/src/components/flowpilot/ProposalDetail.tsx
@@ -88,18 +88,35 @@ export function ProposalDetail({ proposal, onReview }: ProposalDetailProps) {

      {/* Content */}
      <div className="flex-1 overflow-y-auto p-6 space-y-5">
-        {/* Source session link */}
-        <div className="card-flat p-4">
-          <h4 className="font-sans text-xs text-[0.625rem] uppercase tracking-wider text-text-muted mb-2">Source Session</h4>
-          <Link
-            to={`/pilot/${proposal.source_session_id}`}
-            target="_blank"
-            className="flex items-center gap-2 text-sm text-primary hover:underline"
-          >
-            <ExternalLink size={12} />
-            View session that generated this proposal
-          </Link>
-        </div>
+        {/* Source — exactly one of a FlowPilot session XOR an L1 walk is set
+            (DB CHECK). Never link to /pilot for an L1-sourced proposal:
+            source_session_id is NULL there, so the old unconditional link
+            rendered a broken /pilot/null. */}
+        {proposal.source_session_id ? (
+          <div className="card-flat p-4">
+            <h4 className="font-sans text-xs text-[0.625rem] uppercase tracking-wider text-text-muted mb-2">Source Session</h4>
+            <Link
+              to={`/pilot/${proposal.source_session_id}`}
+              target="_blank"
+              className="flex items-center gap-2 text-sm text-primary hover:underline"
+            >
+              <ExternalLink size={12} />
+              View session that generated this proposal
+            </Link>
+          </div>
+        ) : proposal.l1_session_id ? (
+          <div className="card-flat p-4">
+            <h4 className="font-sans text-xs text-[0.625rem] uppercase tracking-wider text-text-muted mb-2">Source — L1 AI walkthrough</h4>
+            <p className="text-sm text-muted-foreground">
+              Captured from an L1 technician's AI-guided walk and validated by a
+              successful resolution. The proposed flow is the path that resolved the ticket.
+            </p>
+            <p className="mt-2 flex items-center gap-1.5 text-xs text-text-muted">
+              <Hash size={11} />
+              <span className="font-mono">L1 session {proposal.l1_session_id.slice(0, 8)}</span>
+            </p>
+          </div>
+        ) : null}

        {/* Proposed diff (for enhancements) */}
        {proposal.proposed_diff && (() => {
--- a/frontend/src/components/l1/L1EscalationsSection.tsx
+++ b/frontend/src/components/l1/L1EscalationsSection.tsx
@@ -1,5 +1,6 @@
 import { useEffect, useState } from 'react'
 import { l1Api } from '@/api/l1'
+import { timeAgo } from '@/lib/timeAgo'
 import type { WalkSession } from '@/types/l1'

 /**
@@ -43,11 +44,13 @@ export function L1EscalationsSection() {
                <div className="flex items-center gap-3 min-w-0">
                  <span className="font-mono text-xs text-text-muted">#{s.id.slice(0, 8)}</span>
                  <span className="text-sm text-text-primary truncate">
-                    {s.walked_path.length} step{s.walked_path.length === 1 ? '' : 's'} walked
+                    {s.problem_text
+                      ? s.problem_text
+                      : `${s.walked_path.length} step${s.walked_path.length === 1 ? '' : 's'} walked`}
                  </span>
                </div>
                <span className="text-xs text-text-muted whitespace-nowrap">
-                  {new Date(s.last_step_at).toLocaleString()}
+                  {timeAgo(s.last_step_at)}
                </span>
              </button>
              {isOpen && (
@@ -58,9 +61,9 @@ export function L1EscalationsSection() {
                    <ol className="space-y-1.5 text-sm">
                      {s.walked_path.map((step, i) => (
                        <li key={i} className="flex flex-col">
-                          <span className="text-text-muted text-xs">{step.question}</span>
+                          <span className="text-text-muted text-xs">{step.question ?? step.text}</span>
                          {step.answer && (
-                            <span className="font-medium text-text-primary">→ {step.answer}</span>
+                            <span className="font-medium text-text-primary">→ {step.answer_label ?? step.answer}</span>
                          )}
                        </li>
                      ))}
--- a/frontend/src/components/l1/L1WalkTreeVariant.tsx
+++ b/frontend/src/components/l1/L1WalkTreeVariant.tsx
@@ -44,7 +44,7 @@ export function L1WalkTreeVariant({ session, onSessionUpdate, onDone }: Props) {
  }, [isAiBuild, session.id, session.status])

  const advanceNode = useCallback(
-    async (body: { answer?: 'yes' | 'no'; acknowledged?: boolean }) => {
+    async (body: { answer?: 'yes' | 'no' }) => {
      if (!node) return
      setNodeLoading(true)
      setNodeError(null)
@@ -167,13 +167,13 @@ export function L1WalkTreeVariant({ session, onSessionUpdate, onDone }: Props) {
                      onClick={() => advanceNode({ answer: 'yes' })}
                      className="flex-1 rounded-md bg-accent text-white py-3 text-base font-medium hover:bg-accent/90 min-h-[44px] transition-colors"
                    >
-                      Yes
+                      {node.yes_label ?? 'Yes'}
                    </button>
                    <button
                      onClick={() => advanceNode({ answer: 'no' })}
                      className="flex-1 rounded-md border border-default py-3 text-base font-medium hover:bg-elevated min-h-[44px] transition-colors"
                    >
-                      No
+                      {node.no_label ?? 'No'}
                    </button>
                  </div>
                </>
@@ -183,7 +183,7 @@ export function L1WalkTreeVariant({ session, onSessionUpdate, onDone }: Props) {
                <>
                  <p className="text-lg">{node.text}</p>
                  <button
-                    onClick={() => advanceNode({ acknowledged: true })}
+                    onClick={() => advanceNode({})}
                    className="rounded-md bg-accent text-white px-5 py-3 text-base font-medium hover:bg-accent/90 min-h-[44px] transition-colors"
                  >
                    Done — next step
@@ -251,8 +251,8 @@ export function L1WalkTreeVariant({ session, onSessionUpdate, onDone }: Props) {
            <ol className="space-y-3 text-sm">
              {session.walked_path.map((step, i) => (
                <li key={i} className="flex flex-col">
-                  <span className="text-muted-foreground text-xs">{step.question}</span>
-                  <span className="font-medium">→ {step.answer}</span>
+                  <span className="text-muted-foreground text-xs">{step.question ?? step.text}</span>
+                  {step.answer && <span className="font-medium">→ {step.answer_label ?? step.answer}</span>}
                  {step.l1_note && <span className="text-muted-foreground text-xs italic mt-0.5">{step.l1_note}</span>}
                </li>
              ))}
--- a/frontend/src/components/layout/ProtectedRoute.tsx
+++ b/frontend/src/components/layout/ProtectedRoute.tsx
@@ -5,13 +5,18 @@ import { Spinner } from '@/components/common/Spinner'

 interface ProtectedRouteProps {
  requiredRole?: EffectiveRole
+  // Gate on account-management capability (owner OR account-admin OR super_admin),
+  // mirroring backend require_account_owner_or_admin. Use instead of
+  // requiredRole="owner" when account admins must also pass — the role hierarchy
+  // has no 'admin' rung, so requiredRole alone wrongly bounces admins.
+  requireAccountManager?: boolean
  children: React.ReactNode
 }

-export function ProtectedRoute({ requiredRole, children }: ProtectedRouteProps) {
+export function ProtectedRoute({ requiredRole, requireAccountManager, children }: ProtectedRouteProps) {
  const { isAuthenticated, isLoading, user } = useAuthStore()
  const location = useLocation()
-  const { effectiveRole } = usePermissions()
+  const { effectiveRole, canManageAccount } = usePermissions()

  if (isLoading) {
    return (
@@ -48,6 +53,10 @@ export function ProtectedRoute({ requiredRole, children }: ProtectedRouteProps)
    }
  }

+  if (requireAccountManager && !canManageAccount) {
+    return <Navigate to="/trees" replace />
+  }
+
  if (requiredRole) {
    const ROLE_HIERARCHY: Record<EffectiveRole, number> = {
      super_admin: 5,
--- a/frontend/src/hooks/usePermissions.ts
+++ b/frontend/src/hooks/usePermissions.ts
@@ -88,7 +88,13 @@ export function usePermissions() {
    // Management permissions
    canManageCategories: hasMinimumRole(user, 'owner'),
    canManageGlobalCategories: effectiveRole === 'super_admin',
-    canManageAccount: effectiveRole === 'super_admin' || effectiveRole === 'owner',
+    // Mirrors backend User.can_manage_account (super_admin OR owner OR admin).
+    // account_role 'admin' isn't in the effectiveRole hierarchy, so check it
+    // directly — otherwise account admins map to 'viewer' and are wrongly excluded.
+    canManageAccount:
+      effectiveRole === 'super_admin' ||
+      effectiveRole === 'owner' ||
+      user?.account_role === 'admin',

    canManageScriptTemplate: (template: { created_by: string | null; team_id?: string | null }) => {
      if (!user) return false
--- a/frontend/src/pages/EscalationQueuePage.tsx
+++ b/frontend/src/pages/EscalationQueuePage.tsx
@@ -1,13 +1,14 @@
 import { useState } from 'react'
 import { AlertTriangle } from 'lucide-react'
 import { EscalationQueue, EscalationMetricCard } from '@/components/flowpilot'
+import { L1EscalationsSection } from '@/components/l1/L1EscalationsSection'

 export default function EscalationQueuePage() {
  const [count, setCount] = useState<number | null>(null)

  return (
-    <div className="mx-auto max-w-4xl p-6">
-      <div className="flex items-center gap-3 mb-6">
+    <div className="mx-auto max-w-4xl p-6 space-y-6">
+      <div className="flex items-center gap-3">
        <span className="flex h-8 w-8 items-center justify-center rounded-lg bg-warning-dim">
          <AlertTriangle size={16} className="text-warning" />
        </span>
@@ -24,6 +25,10 @@ export default function EscalationQueuePage() {
      <EscalationMetricCard period="30d" />

      <EscalationQueue onCountChange={setCount} />
+
+      {/* L1 AI-build handoffs (GET /l1/escalations). Renders nothing when empty,
+          so engineers without L1 escalations see no change. */}
+      <L1EscalationsSection />
    </div>
  )
 }
--- a/frontend/src/pages/l1/L1Dashboard.tsx
+++ b/frontend/src/pages/l1/L1Dashboard.tsx
@@ -6,7 +6,7 @@ import { l1Api } from '@/api/l1'
 import { toast } from '@/lib/toast'
 import { EmptyStateCard } from '@/components/l1/EmptyStateCard'
 import { ResumeInProgress } from '@/components/l1/ResumeInProgress'
-import type { NearMiss, QueueRow } from '@/types/l1'
+import type { IntakeRequest, NearMiss, QueueRow } from '@/types/l1'

 export default function L1Dashboard() {
  const user = useAuthStore((s) => s.user)
@@ -44,23 +44,42 @@ export default function L1Dashboard() {
    setOutOfScope(null)
  }

-  const handleStart = async () => {
+  // Single intake entry point — `opts` selects the variant:
+  //   {}                      → normal match-or-build
+  //   { flow_id }             → "Use this flow" (bypass matcher, walk that flow)
+  //   { force_build: true }   → "Build new" (skip match, still category-gated)
+  //   { adhoc: true }         → out-of-scope "Walk it ad-hoc"
+  // Collapsing the old three near-identical handlers removes the drift that let
+  // "Use this flow" silently re-suggest forever (it never passed the flow_id).
+  const runIntake = async (opts: Partial<IntakeRequest> = {}) => {
    if (!problem.trim()) return
    setSubmitting(true)
    resetPrompts()
    try {
-      // Phase 2A: intake dispatches via match_or_build and returns an `outcome`.
      const response = await l1Api.intake({
        problem_statement: problem.trim(),
        customer_name: customerName.trim() || undefined,
        customer_contact: customerContact.trim() || undefined,
+        ...opts,
      })
-      if (response.outcome === 'matched' || response.outcome === 'build') {
-        navigate(`/l1/walk/${response.session_id}`)
-      } else if (response.outcome === 'suggest') {
-        setSuggestion(response.near_miss ?? null)
-      } else if (response.outcome === 'out_of_scope') {
-        setOutOfScope(response.category ?? 'unknown')
+      switch (response.outcome) {
+        case 'matched':
+        case 'build':
+        case 'adhoc':
+          if (response.session_id) {
+            navigate(`/l1/walk/${response.session_id}`)
+          } else {
+            // Backend guarantees session_id on these outcomes; guard so a
+            // regression never navigates to /l1/walk/undefined.
+            toast.error('Walk started but no session was returned. Try again.')
+          }
+          break
+        case 'suggest':
+          setSuggestion(response.near_miss ?? null)
+          break
+        case 'out_of_scope':
+          setOutOfScope(response.category ?? 'unknown')
+          break
      }
    } catch (err) {
      const detail = (err as { response?: { data?: { detail?: string } } }).response?.data?.detail
@@ -72,47 +91,14 @@ export default function L1Dashboard() {
    }
  }

-  // "Use this flow" — re-run intake with the same text; it matches again and
-  // returns a `matched` outcome with a started flow session (acceptable Phase 2A).
-  const useSuggestedFlow = async () => {
-    setSubmitting(true)
-    try {
-      const response = await l1Api.intake({
-        problem_statement: problem.trim(),
-        customer_name: customerName.trim() || undefined,
-        customer_contact: customerContact.trim() || undefined,
-      })
-      if (response.session_id) navigate(`/l1/walk/${response.session_id}`)
-      else resetPrompts()
-    } catch {
-      toast.error('Could not start the matched flow. Try again.')
-    } finally {
-      setSubmitting(false)
-    }
-  }
-
-  // "Build new" — skip the match pass (force_build); still gated by enabled categories.
-  const buildNew = async () => {
-    setSubmitting(true)
-    resetPrompts()
-    try {
-      const response = await l1Api.intake({
-        problem_statement: problem.trim(),
-        customer_name: customerName.trim() || undefined,
-        customer_contact: customerContact.trim() || undefined,
-        force_build: true,
-      })
-      if (response.outcome === 'build' && response.session_id) {
-        navigate(`/l1/walk/${response.session_id}`)
-      } else if (response.outcome === 'out_of_scope') {
-        setOutOfScope(response.category ?? 'unknown')
-      }
-    } catch {
-      toast.error('Failed to start walk. Try again.')
-    } finally {
-      setSubmitting(false)
-    }
-  }
+  const handleStart = () => runIntake()
+  // "Use this flow" — pass the near-miss flow_id so intake walks it directly
+  // (the matcher can't reliably re-derive the same flow from the same text).
+  const useSuggestedFlow = () => runIntake({ flow_id: suggestion?.flow_id })
+  // "Build new" — skip the match pass (force_build); still gated by categories.
+  const buildNew = () => runIntake({ force_build: true })
+  // "Walk it ad-hoc" — out-of-scope fallback: a free-form walk (no AI tree).
+  const walkAdhoc = () => runIntake({ adhoc: true })

  // out-of-scope fallback: escalate straight to engineering (no walk).
  const escalateOutOfScope = async () => {
@@ -272,14 +258,23 @@ export default function L1Dashboard() {
            <p className="text-sm text-primary">
              This problem isn’t in your account’s enabled L1 categories
              {outOfScope !== 'unknown' ? ` (${outOfScope.replace(/_/g, ' ')})` : ''}, so
-              there’s no AI-built walk for it. You can escalate it to engineering.
+              there’s no AI-built walk for it. You can still walk it ad-hoc (free-form
+              notes, no AI tree), or escalate it to engineering.
            </p>
            <div className="flex gap-2">
              <button
                type="button"
-                onClick={escalateOutOfScope}
+                onClick={walkAdhoc}
                disabled={submitting}
                className="rounded-md bg-accent text-white px-4 py-2 text-sm font-medium hover:bg-accent/90 transition-colors disabled:opacity-50"
+              >
+                Walk it ad-hoc
+              </button>
+              <button
+                type="button"
+                onClick={escalateOutOfScope}
+                disabled={submitting}
+                className="rounded-md border border-default px-4 py-2 text-sm hover:bg-elevated transition-colors disabled:opacity-50"
              >
                Escalate to engineering
              </button>
--- a/frontend/src/router.tsx
+++ b/frontend/src/router.tsx
@@ -367,7 +367,7 @@ export const router = sentryCreateBrowserRouter([
          {
            path: 'l1-categories',
            element: (
-              <ProtectedRoute requiredRole="owner">
+              <ProtectedRoute requireAccountManager>
                {page(L1CategoriesPage)}
              </ProtectedRoute>
            ),
--- a/frontend/src/types/l1.ts
+++ b/frontend/src/types/l1.ts
@@ -3,9 +3,17 @@ export type SessionStatus = 'active' | 'resolved' | 'escalated' | 'abandoned'
 export type TicketKind = 'psa' | 'internal'

 export interface WalkStep {
-  node_id: string
-  question: string
-  answer: string
+  // Two shapes coexist (segregated by session_kind): legacy flow/adhoc steps use
+  // node_id + question; ai_build steps use id + node_type + text. Render with
+  // `step.question ?? step.text`.
+  node_id?: string
+  id?: string
+  node_type?: string
+  question?: string
+  text?: string
+  answer: string | null
+  /** Button text the tech clicked (ai_build); falls back to `answer`. */
+  answer_label?: string
  l1_note: string | null
 }

@@ -17,6 +25,8 @@ export interface AdhocNote {
 export interface WalkSession {
  id: string
  session_kind: SessionKind
+  category: string | null
+  problem_text: string | null
  flow_id: string | null
  flow_proposal_id: string | null
  current_node_id: string | null
@@ -42,10 +52,11 @@ export interface IntakeRequest {
  customer_name?: string
  customer_contact?: string
  flow_id?: string
+  adhoc?: boolean
  force_build?: boolean
 }

-export type IntakeOutcome = 'matched' | 'suggest' | 'out_of_scope' | 'build'
+export type IntakeOutcome = 'matched' | 'suggest' | 'out_of_scope' | 'build' | 'adhoc'

 export interface NearMiss {
  flow_id: string
@@ -66,9 +77,12 @@ export interface IntakeResult {
  category?: string        // for 'out_of_scope'
 }

-/** A single node of an AI-built decision tree, returned by /next-node. */
+/** A single node of an AI-built decision tree, returned by /next-node.
+ *  Question nodes carry the literal button texts (yes_label/no_label) so the
+ *  choices always match the question ("Microsoft account" / "Local account",
+ *  not a mismatched Yes/No). The backend defaults them to Yes/No. */
 export type TreeNode =
-  | { node_type: 'question'; id: string; text: string }
+  | { node_type: 'question'; id: string; text: string; yes_label?: string; no_label?: string }
  | { node_type: 'instruction'; id: string; text: string }
  | { node_type: 'resolved'; id: string; text: string }
  | { node_type: 'escalate'; id: string; reason_category?: string; text: string }
@@ -77,8 +91,7 @@ export type TreeNode =
 export interface NextNodeRequest {
  node_id?: string
  node_text?: string       // rendered text of the node being answered
-  answer?: 'yes' | 'no'
-  acknowledged?: boolean
+  answer?: 'yes' | 'no'    // omit to acknowledge an instruction node
  note?: string
 }

--- a/frontend/src/types/user.ts
+++ b/frontend/src/types/user.ts
@@ -9,7 +9,7 @@ export interface User {
  is_active: boolean
  must_change_password: boolean
  account_id: string | null
-  account_role: 'owner' | 'engineer' | 'l1_tech' | 'viewer' | null
+  account_role: 'owner' | 'admin' | 'engineer' | 'l1_tech' | 'viewer' | null
  can_cover_l1: boolean
  team_id: string | null
  created_at: string
Author	SHA1	Message	Date
Michael Chihlas	0e41a990ed	docs(handoff): record answer-label fix (`9c34d1e`) + smoke-test note Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Checks (some failed): Mirror to GitHub / mirror (push) successful in 6s; CI / frontend (pull_request) successful in 6m52s; CI / e2e (pull_request) failing after 4m26s; CI / backend (pull_request) successful in 11m32s.	2026-06-11 15:56:04 -04:00
Michael Chihlas	9c34d1e82d	fix(l1): answer buttons must match the question — yes_label/no_label end-to-end Live walk defect: the builder generated alternatives questions ("Is Jane's account a Microsoft account or a local account?") while the UI could only offer Yes/No. Root cause: SYSTEM_PROMPT mandated a label-less '<yes/no question>' shape with no way to express the two answers. - SYSTEM_PROMPT: question nodes must carry yes_label/no_label — the literal button texts; alternatives questions must use the alternatives as labels. - validate_node: labels hard-floor-scanned, must be distinct non-empty strings. - _ensure_labels: server defaults missing labels to Yes/No. - advance_ai_build: records answer_label (and both labels) in walked_path, derived from the server-held pending_node — never client-supplied. - _build_context: LLM context shows the chosen label, not a bare yes/no (a raw "-> yes" on an alternatives question degrades the next generation). - normalize_walked_path: captured flywheel trees keep question labels. - Frontend: buttons render yes_label/no_label; walk transcript and L1EscalationsSection render answer_label. Phase 2A backend set: 137 passed / 0 failed / 8 deselected. tsc, eslint, vite build clean. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-11 15:03:15 -04:00
Michael Chihlas	db446e1fd6	docs(handoff): PR #193 all 10 review findings resolved + 2 decisions Findings doc gets a per-finding RESOLUTION section; HANDOFF resume point moves to "re-push + merge" and corrects the false Task 16/17 "done" record; CURRENT_TASK updated; two architectural decisions logged (real ai_build columns replacing the meta convention; ad-hoc walk restored); SESSION_LOG entry added. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-09 15:56:03 -04:00
Michael Chihlas	9afaf37fb3	fix(l1): resolve PR #193 frontend review findings (2a,2b,3,4,5,7) Mounts L1EscalationsSection on EscalationQueuePage (Finding 2a — it was never rendered) and renders the correct fields: step.question ?? step.text, timeAgo, and the session problem_text (Finding 2b). ProposalDetail gates the /pilot link on source_session_id and shows an L1-source block for l1_session_id-sourced proposals (Finding 3 — was a broken /pilot/null link). Collapses the three near-identical intake handlers into one runIntake: "Use this flow" now passes near_miss.flow_id (Finding 4 — it previously re-suggested forever) and a navigate guard prevents /l1/walk/undefined; out_of_scope gains a "Walk it ad-hoc" button (Finding 5). Aligns L1-category permissions to owner+admin: usePermissions.canManageAccount includes account admins, User.account_role TS type gains 'admin', and a new ProtectedRoute requireAccountManager guard fronts the route (Finding 7). Drops the unused NextNodeRequest.acknowledged field. tsc -b + eslint + vite build clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-09 15:55:55 -04:00
Michael Chihlas	ac89e7b2fa	fix(l1): resolve PR #193 backend review findings (1,4,5,6,7,8,9,10) Server-assigns a uuid4 id to every AI-generated node (Finding 1 showstopper: nodes had no id but the advance protocol keys on node_id, so ai_build walks never advanced past question 1). Replaces the hidden {"node_type":"meta"} walked_path convention with real category/problem_text/pending_node columns on l1_walk_sessions (migration 61dda4f615c6) — fixes junk proposals + off-by-one depth cap (Findings 8,9), and pending_node replays the served node on re-mount (no duplicate paid LLM call). Intake honors explicit flow_id and adhoc=True (Findings 4,5); flow_proposals.l1_session_id FK -> CASCADE (Finding 6 time bomb); L1 category GET is owner+admin like PATCH and require_account_owner_or_admin delegates to User.can_manage_account (Finding 7); escalate falls back to default recipients + filters deleted_at + warns when empty (Finding 10). Cleanups: dead ticket_ref removed, IntakeResponse per-outcome validator, unused acknowledged dropped, escalations partial index, restored a deleted audit assertion. Full Phase 2A backend set: 110 passed / 0 failed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-09 15:55:45 -04:00
Michael Chihlas	42a4536c63	docs(review): PR #193 review findings — 10 confirmed defects, merge blocked; handoff points to fix plan Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-09 14:58:24 -04:00
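
The label handling described in commit `9c34d1e` (default missing labels to Yes/No, require distinct non-empty labels, record the chosen label from the server-held node) can be sketched roughly as follows. This is an illustration only, not the backend code, which this diff does not show. The names `QuestionNode`, `Labels`, `ensureLabels`, `labelsAreValid`, and `answerLabel` are hypothetical; only the field names `yes_label`, `no_label`, and the `'yes' | 'no'` answer type come from the diff above. Comparing labels by exact string equality after trimming is also an assumption.

```typescript
// Hypothetical node shape for illustration; yes_label/no_label mirror the
// optional fields added to the question variant of TreeNode in the diff.
interface QuestionNode {
  text: string
  yes_label?: string
  no_label?: string
}

interface Labels {
  yes: string
  no: string
}

// Fill in missing or blank labels with the Yes/No defaults.
function ensureLabels(node: QuestionNode): Labels {
  const yes = node.yes_label?.trim() || 'Yes'
  const no = node.no_label?.trim() || 'No'
  return { yes, no }
}

// Reject pairs that would render empty or indistinguishable buttons.
// Assumption: "distinct" means exact string inequality after trimming.
function labelsAreValid(labels: Labels): boolean {
  const yes = labels.yes.trim()
  const no = labels.no.trim()
  return yes.length > 0 && no.length > 0 && yes !== no
}

// Derive the label to record from the node the server served,
// rather than trusting any label text sent by the client.
function answerLabel(node: QuestionNode, answer: 'yes' | 'no'): string {
  const labels = ensureLabels(node)
  return answer === 'yes' ? labels.yes : labels.no
}

const accountQuestion: QuestionNode = {
  text: "Is Jane's account a Microsoft account or a local account?",
  yes_label: 'Microsoft account',
  no_label: 'Local account',
}

// The recorded label matches the button the tech clicked.
const recorded = answerLabel(accountQuestion, 'no')
```

On the rendering side, the diff applies the same fallback idea with `node.yes_label ?? 'Yes'` for the buttons and `step.answer_label ?? step.answer` for the transcript, so older steps without labels still display.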