From 5b58702b20b9c6c4ec15c419ed354cf0c30a4ed4 Mon Sep 17 00:00:00 2001 From: Michael Chihlas Date: Fri, 29 May 2026 01:22:37 -0400 Subject: [PATCH 1/4] =?UTF-8?q?docs(spec):=20L1=20AI=20decision-tree=20bui?= =?UTF-8?q?lder=20=E2=80=94=20Phase=202A=20design?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Brainstormed design for real-time AI tree building when no KB/flow matches. Overrides the original "no empty-KB build" rule: build from generic L1 knowledge under a layered safety model (classification gate, constrained generation, per-node validation with a hard floor, standing disclaimer). Approach C — dedicated ai_tree_builder + match_or_build orchestrator, reusing flow_matching_engine and the knowledge_flywheel proposal pipeline. Scope: streaming node-by-node builder, admin-configurable categories, flywheel capture of resolved trees, minimum escalation handoff (notify + engineer surface). KB ingestion/connectors, PSA reassign, escalation package, and AI chat handoff deferred to later phases. Co-Authored-By: Claude Opus 4.7 --- ...5-29-l1-ai-tree-builder-phase-2a-design.md | 220 ++++++++++++++++++ 1 file changed, 220 insertions(+) create mode 100644 docs/superpowers/specs/2026-05-29-l1-ai-tree-builder-phase-2a-design.md diff --git a/docs/superpowers/specs/2026-05-29-l1-ai-tree-builder-phase-2a-design.md b/docs/superpowers/specs/2026-05-29-l1-ai-tree-builder-phase-2a-design.md new file mode 100644 index 00000000..cdaa77be --- /dev/null +++ b/docs/superpowers/specs/2026-05-29-l1-ai-tree-builder-phase-2a-design.md @@ -0,0 +1,220 @@ +# L1 AI Decision-Tree Builder — Phase 2A Design + +**Status:** Draft for review +**Date:** 2026-05-29 +**Author:** previous session (brainstorming) +**Predecessor:** [`2026-05-28-l1-workspace-design.md`](2026-05-28-l1-workspace-design.md) (full L1 vision), [`2026-05-28-l1-workspace-phase-1-acceptance.md`](2026-05-28-l1-workspace-phase-1-acceptance.md) (what shipped in Phase 1) + +--- + +## 1. 
Goal + +When an L1 tech describes a problem and there is **no matching authored flow or AI draft**, the platform builds a yes/no decision tree **in real time from the model's general L1 knowledge** and walks the tech through it node by node. Scoped to L1-appropriate troubleshooting: simple yes/no questions and reversible step-by-step instructions. Successful trees are captured as outcome-validated drafts for engineer review, compounding the account's knowledge base from real resolutions. + +This **overrides** the original spec's "no empty-KB build" rule (§8.1 of the predecessor), which aborted to a degradation screen when no KB existed. Instead of aborting, we build from generic knowledge under a layered safety model. + +KB grounding (RAG over ingested documents) is **explicitly deferred to Phase 2B** — Phase 2A builds from generic knowledge only, plus matching against already-authored flows. + +## 2. Scope + +**In scope (Phase 2A):** +- `match_or_build` orchestrator inserted at L1 intake (match-first, build-on-miss). +- `ai_tree_builder` service: node-by-node ("streaming") tree generation, constrained + escalate-early. +- Admin-configurable L1 category allowlist (Account Owner/Admin control panel). +- Standing AI-disclaimer banner on AI-built walks. +- Flywheel capture: resolved AI trees become outcome-validated `FlowProposal`s. +- Minimum escalation handoff: engineer bell-badge notification + an engineer-visible "escalated from L1" surface. + +**Deferred:** +- KB document ingestion + connectors (IT Glue, Hudu, SharePoint/OneDrive) — Phase 2B. +- RAG grounding of the builder on ingested KB — Phase 2B. +- PSA ticket reassign on escalation, escalation-package generation, AI chat handoff — later phase. +- `BuildAbortedNoKB` screen from the original spec — **dropped** (superseded by build-from-generic). + +## 3. Architecture (Approach C) + +Dedicated builder for the constrained node generation; reuse existing rails for matching and capture. 
+ +**New services:** +| File | Responsibility | +|---|---| +| `backend/app/services/match_or_build.py` | Orchestrator. `match_or_build(account_id, problem_text, ticket_ref, *, force_build=False) -> MatchOrBuildResult`. Classify → category gate → match pass → build/suggest/out-of-scope decision. | +| `backend/app/services/ai_tree_builder.py` | Node-by-node generation. `generate_next_node(problem_text, category, walked_path) -> TreeNode`. Reuses `get_ai_provider` + `generate_json` + `parse_llm_json`. Owns the constrained system prompt and per-node validation. | +| `backend/app/services/l1_category_service.py` | Read/write an account's enabled L1 categories; expose the default allowlist and the always-forbidden hard floor. | + +**Reused as-is:** +- `flow_matching_engine.find_matches()` — semantic + keyword + recency match pass. +- `knowledge_flywheel` proposal-creation + dedupe (`_find_similar_pending_proposal`) — outcome-validated capture. +- `notification_service` — engineer escalation notification. +- Phase 1 `L1WalkTreeVariant` walker — its stubbed synthetic-step UI is replaced by real AI node rendering. + +**Intake decision flow:** +``` +POST /l1/intake (problem_statement, customer_*, force_build?) + → match_or_build(account_id, problem_text, ticket_ref, force_build): + 1. category = classify(problem_text) # new + 2. if category not in account.enabled_l1_categories: + return {outcome: 'out_of_scope', category} + 3. if not force_build: + hits = flow_matching_engine.find_matches(problem_text) + best = max(hits, default=None) + if best.score >= MATCH_THRESHOLD: + return {outcome: 'matched', target_id, session_kind} # flow|proposal + if best.score >= SUGGEST_THRESHOLD: + return {outcome: 'suggest', near_miss, can_build: true} + 4. return {outcome: 'build', session_kind: 'ai_build', category} +``` +Frontend dispatches on `outcome`: +- `matched` → start a `flow`/`proposal` walk (Phase 1 paths). 
+- `suggest` → inline prompt ("Found a similar flow — use it, or build new?"); "Build new" re-calls intake with `force_build=true`. +- `out_of_scope` → inline prompt offering ad-hoc walk or escalate-without-walk (Phase 1 paths). +- `build` → create an `ai_build` session, navigate to the walker, fetch the first node. + +## 4. The streaming build & node schema + +`ai_tree_builder.generate_next_node()` is called with the problem statement, the resolved category, and the **full walked path so far**. It returns exactly one node. Passing the whole path every call is what keeps independently-generated nodes coherent and lets the model decide when it has exhausted safe steps. + +**Node shape (`proposed_flow_data` node, also the live `walked_path` entry):** +```json +// question — yes/no branch; both branches regenerate +{ "node_type": "question", "id": "n3", "text": "Is the printer showing a 'ready' status light?", + "yes_next": "generate", "no_next": "generate" } + +// instruction — a single safe, reversible action; advances on acknowledgement +{ "node_type": "instruction", "id": "n4", "text": "Unplug the printer for 30 seconds, then power it back on.", + "next": "generate" } + +// resolved — terminal success +{ "node_type": "resolved", "id": "n7", "text": "Printer is back online and printing test pages." } + +// escalate — terminal handoff (escalate-early safety valve) +{ "node_type": "escalate", "id": "n7", "reason_category": "exhausted_safe_steps", + "text": "This looks like a driver-level fault beyond L1 scope — escalating to engineering." } +``` + +`"generate"` is a sentinel meaning "call `generate_next_node` again with the new answer appended." The first node is fetched synchronously on `ai_build` session creation (intake). Each subsequent node is fetched when the tech answers/acknowledges — target latency ~2–4s per node; show a per-node "Thinking through the next step…" affordance. 
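The sentinel-driven loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the planned implementation: `walk`, `scripted_generator`, the `{"node", "answer"}` shape of a walked-path entry, and the `depth_cap` reason value are all hypothetical. A real `generate_next_node` would call the model; here a scripted stand-in replays the printer example.

```python
from typing import Callable, Optional

L1_BUILD_MAX_DEPTH = 12  # default depth cap named in the spec's tunables
TERMINAL_TYPES = {"resolved", "escalate"}


def walk(
    problem_text: str,
    category: str,
    answers: list[str],
    generate_next_node: Callable[[str, str, list[dict]], dict],
    max_depth: int = L1_BUILD_MAX_DEPTH,
) -> list[dict]:
    """Replay a build: every call receives the full walked path so far."""
    walked_path: list[dict] = []
    remaining = list(answers)  # yes/no answers, consumed by question nodes only
    while True:
        if len(walked_path) >= max_depth:
            # Depth cap reached: force an escalate node instead of generating.
            node = {
                "node_type": "escalate",
                "id": f"n{len(walked_path) + 1}",
                "reason_category": "depth_cap",  # assumed value, not from the spec
                "text": "Depth cap reached; escalating to engineering.",
            }
        else:
            node = generate_next_node(problem_text, category, walked_path)

        if node["node_type"] in TERMINAL_TYPES:
            walked_path.append({"node": node, "answer": None})
            return walked_path

        answer: Optional[str] = None  # instruction nodes advance on acknowledgement
        if node["node_type"] == "question":
            if not remaining:
                return walked_path  # waiting on the tech; nothing left to replay
            answer = remaining.pop(0)
        walked_path.append({"node": node, "answer": answer})


def scripted_generator(problem_text: str, category: str, walked_path: list[dict]) -> dict:
    """Stand-in for the model call: picks the next node from the path length."""
    script = [
        {"node_type": "question", "id": "n1",
         "text": "Is the printer showing a 'ready' status light?",
         "yes_next": "generate", "no_next": "generate"},
        {"node_type": "instruction", "id": "n2",
         "text": "Unplug the printer for 30 seconds, then power it back on.",
         "next": "generate"},
        {"node_type": "resolved", "id": "n3",
         "text": "Printer is back online and printing test pages."},
    ]
    return script[len(walked_path)]


path = walk("Printer is offline", "printer", ["no"], scripted_generator)
```

In this sketch the `"generate"` sentinel never needs to be resolved at walk time: the loop simply calls the generator again with the extended path, which is the behaviour the sentinel stands for.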
+ +**Endpoint:** `POST /l1/sessions/{id}/next-node` body `{node_id, answer?: 'yes'|'no', acknowledged?: true, note?}`. Appends the answered node to `walked_path`, then generates and returns the next node (or a terminal node). Replaces the Phase 1 synthetic stepping in `L1WalkTreeVariant`. + +## 5. Safety model (layered) + +**Layer 1 — classification gate.** `classify(problem_text)` maps the problem to a category via a lightweight model call (low token budget, returns one category key from the enabled set or `unknown`); on model failure it falls back to keyword matching against category aliases. If the result is not in the account's enabled set (or is `unknown`), intake returns `out_of_scope`; no build happens. + +**Layer 2 — constrained generation.** The `ai_tree_builder` system prompt restricts output to: +- Safe, reversible, observe-or-restart-class steps only (toggle/restart/reconnect/re-enter, check-status questions). +- A **hard floor of always-forbidden actions** (see §5.1) that NO category may unlock. +- An explicit instruction to emit an `escalate` node — never guess — once it runs out of in-scope safe steps. + +**Layer 3 — per-node validation.** Server-side, every generated node is checked before being returned: +- Reject (and regenerate once, then escalate) nodes whose text matches forbidden-action patterns (§5.1). +- Enforce a **depth cap** (default `L1_BUILD_MAX_DEPTH = 12`): once the walked path hits the cap, force an `escalate` node. +- Validate node JSON shape (Pydantic); malformed → regenerate once, then escalate. + +**Layer 4 — standing disclaimer.** Persistent banner on every `ai_build` walk: + +> *"These are high-confidence troubleshooting steps, but they come from outside your organization's knowledge base — review them before acting. 
When in doubt, escalate early."* + +### 5.1 Hard floor — always forbidden (admins cannot enable) +Regardless of enabled categories, the builder must never produce steps that: +- Modify the Windows registry, system files, or boot configuration. +- Delete, format, or repartition data/disks; remove user profiles or mailboxes. +- Change credentials, MFA, security/firewall/AV settings, or disable protections. +- Run scripts/commands with elevated/admin privileges. +- Touch domain controllers, DNS, DHCP, or production server config. +- Make purchases, license changes, or anything with billing impact. + +*(This list is a product decision — review and edit during spec review.)* + +### 5.2 Default enabled category allowlist (admin-editable) +Ships enabled by default; Account Owners/Admins toggle per account: +`password_reset`, `account_lockout`, `printer`, `email_outlook_client`, `wifi_network_basics`, `vpn_connect`, `teams_zoom_av`, `browser_cache_cookies`, `peripheral_reconnect`, `os_restart_update`. + +*(This list is a product decision — review and edit during spec review.)* + +### 5.3 Tunables +| Setting | Default | Notes | +|---|---|---| +| `MATCH_THRESHOLD` | 0.75 | Carried from predecessor spec §8.1. | +| `SUGGEST_THRESHOLD` | 0.60 | Carried from predecessor spec §8.1. | +| `L1_BUILD_MAX_DEPTH` | 12 | Force escalate beyond this many nodes. | +| `get_model_for_action('l1_realtime_build')` | Sonnet | Latency-sensitive; benchmark Sonnet vs Opus during plan. | +| Per-node max_tokens | 1024 | One node is small. | + +## 6. Flywheel capture + +On `resolve` of an `ai_build` session (`l1_session_service.resolve` extension): +1. Build `proposed_flow_data` from the `walked_path` (the nodes that were actually traversed, normalized into a tree structure). +2. Create a `FlowProposal`: `source='ai_realtime_l1'`, `validated_by_outcome=true`, `proposed_flow_data=`, `linked_ticket_id/kind=`, `problem_domain=`, `status='pending'`. +3. 
Run the existing `_find_similar_pending_proposal` dedupe — merge (bump supporting count) if a near-duplicate pending proposal exists, else insert. +4. Emit the existing `proposal.pending` notification to the review queue. + +Engineers promote good proposals to authored flows in the existing review queue. Promoted flows are then found by `flow_matching_engine` on future intakes → the KB compounds. No new review UI needed; `source='ai_realtime_l1'` rows surface in the existing queue (optionally badge them "AI · outcome-validated"). + +## 7. Minimum escalation handoff + +On `escalate` (terminal node reached, or the L1 hits the Escalate modal during an `ai_build` walk) — extends `l1_session_service.escalate`: +1. **Notify engineers** — `notification_service` bell-badge event `l1.session.escalated` to the account's engineers (and `is_team_admin`/owner). Payload: ticket ref, problem summary, escalation reason category, link. +2. **Engineer-visible surface** — escalated L1 sessions appear in an engineer-facing list. Reuse/extend the existing `/escalations` queue (`EscalationQueuePage`) with an "L1 escalations" section, or a dedicated `GET /l1/escalations` consumed there. Each row shows problem, the walked path summary, who escalated, when. + +**Still deferred** (documented, not built): PSA ticket reassign, escalation-package markdown generation, AI chat handoff/session creation. + +## 8. Data model & migrations + +**Migration 1 — `ai_build` session kind.** +- Extend `l1_walk_sessions` `ck_l1_walk_sessions_session_kind` CHECK to include `'ai_build'`. +- Extend `ck_l1_walk_sessions_target_consistency`: for `ai_build`, both `flow_id` and `flow_proposal_id` are NULL (same as `adhoc`). + +**Migration 2 — account L1 category settings.** +- Add `accounts.enabled_l1_categories` `JSONB NOT NULL DEFAULT ''::jsonb` (list of category keys). RLS already covers `accounts`. 
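As a rough illustration of how a list of category keys might be checked before it is written to `accounts.enabled_l1_categories`, the sketch below validates a requested list against the default allowlist. The function names are hypothetical, and it assumes the set of available categories equals the default allowlist, which the spec does not state.

```python
# Default allowlist keys as listed in the spec's category section.
DEFAULT_L1_CATEGORIES = (
    "password_reset", "account_lockout", "printer", "email_outlook_client",
    "wifi_network_basics", "vpn_connect", "teams_zoom_av",
    "browser_cache_cookies", "peripheral_reconnect", "os_restart_update",
)


def validate_enabled_categories(requested: list[str]) -> list[str]:
    """Return a de-duplicated, order-preserving list; raise on unknown keys."""
    unknown = sorted(set(requested) - set(DEFAULT_L1_CATEGORIES))
    if unknown:
        raise ValueError(f"Unknown L1 categories: {', '.join(unknown)}")
    seen: set[str] = set()
    cleaned: list[str] = []
    for key in requested:
        if key not in seen:
            seen.add(key)
            cleaned.append(key)
    return cleaned


def is_build_allowed(category: str, enabled: list[str]) -> bool:
    """Gate for the build path: 'unknown' is never buildable."""
    return category != "unknown" and category in enabled


enabled = validate_enabled_categories(["printer", "vpn_connect", "printer"])
```

Rejecting unknown keys at write time keeps the stored list trustworthy, so the gate itself can stay a plain membership test.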
+ +No new tables — live build state rides on the existing `l1_walk_sessions.walked_path`; persisted trees ride on `FlowProposal.proposed_flow_data`. + +## 9. API surface + +| Method | Path | Notes | Auth | +|---|---|---|---| +| POST | `/l1/intake` | **Extended**: now runs `match_or_build`; response carries `outcome` (`matched`/`suggest`/`out_of_scope`/`build`). | `require_l1_or_coverage` | +| POST | `/l1/sessions/{id}/next-node` | **New**: record answer/ack on current node, generate + return next node (or terminal). | `require_l1_or_coverage` | +| GET | `/accounts/me/l1-categories` | **New**: list enabled + available categories + hard-floor (read-only) list. | `require_l1_or_above` (read) | +| PATCH | `/accounts/me/l1-categories` | **New**: set enabled categories. | `require_engineer_or_admin` (owner/admin) | +| GET | `/l1/escalations` | **New** (or extend `/escalations`): engineer-visible escalated-from-L1 list. | `require_engineer_or_admin` | + +## 10. Frontend + +- `L1WalkTreeVariant` — replace synthetic stepping with real node rendering driven by `/next-node`; render `question` (yes/no), `instruction` (acknowledge), `resolved`/`escalate` (terminal). Per-node loading affordance. Disclaimer banner mounted for `ai_build` sessions. +- `L1Dashboard` intake handler — dispatch on `match_or_build` `outcome` (suggest prompt, out-of-scope prompt, build → walker). +- New admin settings panel (under `/account`) — toggle enabled L1 categories; show hard-floor list as read-only "always excluded." +- Engineer escalations surface — "L1 escalations" section/list. + +## 11. Testing strategy + +**Backend unit:** +- `ai_tree_builder.generate_next_node` — returns valid node per type; escalate-early when path is deep / model signals exhaustion; regenerate-then-escalate on malformed/forbidden output; depth cap forces escalate. +- Per-node validation — forbidden-action patterns rejected; hard-floor enforced even if a category is enabled. 
+- `match_or_build` — all four outcomes at threshold boundaries (`score == MATCH_THRESHOLD`, `== SUGGEST_THRESHOLD`), `force_build` bypasses match, `out_of_scope` when category disabled. +- `classify` — known categories map correctly; unknown → out_of_scope. +- Flywheel capture — resolve creates `ai_realtime_l1` proposal; dedupe merges near-duplicate. +- Escalation handoff — notification fired; escalated session appears in engineer query. + +**Backend integration:** +- Full intake→build→resolve creates an outcome-validated proposal. +- Intake→build→escalate notifies engineers and surfaces in the escalations list. +- Migrations roundtrip; `ai_build` CHECK + target-consistency hold. + +**Frontend e2e (extend `l1-workspace.spec.ts`):** +- L1 intake with no match → AI build → answer nodes → resolve → proposal created. +- L1 build → escalate node → escalate handoff. +- Admin toggles a category off → that problem class returns out-of-scope. + +**AI quality (plan-time):** small eval set of common L1 problems; assert trees stay in-scope, reach resolution or escalate cleanly, never emit hard-floor actions. Benchmark Sonnet vs Opus for the model-tier decision. + +## 12. Risks & open questions + +- **Hallucinated-but-plausible steps** for niche/company-specific apps. Mitigation: classification gate + constrained prompt + escalate-early + disclaimer. Residual risk accepted for v1; eval set bounds it. +- **Latency on a live call.** Node-by-node means ~2–4s per branch. Mitigation: Sonnet, small per-node token budget, clear loading affordance. Benchmark at plan time. +- **Coherence across independently-generated nodes.** Mitigation: full walked-path context every call. +- **Classification accuracy.** A misclassify could wrongly gate a valid problem out, or let a borderline one through. Mitigation: hard floor is category-independent; out-of-scope still offers adhoc/escalate (no dead end). 
+- **Open (product, for spec review):** the default category allowlist (§5.2) and the hard-floor list (§5.1) — confirm/edit. Model tier — confirm Sonnet pending benchmark. + +## 13. Out of scope (restated) +KB ingestion + connectors, RAG grounding, PSA reassign, escalation-package generation, AI chat handoff. Each is its own later phase with its own spec. -- 2.49.1 From f62712d11c314cfab82370e7515886f0db44c9c4 Mon Sep 17 00:00:00 2001 From: Michael Chihlas Date: Fri, 29 May 2026 03:04:49 -0400 Subject: [PATCH 2/4] docs(spec): resolve 6 Codex review findings on L1 AI tree builder spec MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Blocker: FlowProposal can't link an l1_walk_session (source_session_id is NOT NULL FK→ai_sessions, UI links /pilot). Add nullable l1_session_id + exactly-one CHECK + read-only walked-path link for L1-sourced proposals. - High: flow_matching_engine matches published flows only; scope match pass to flows, defer proposal-matching. - High: notification system is FlowPilot-shaped; enumerate the 3 changes for l1.session.escalated (VALID_EVENTS, link+body builder, explicit engineer recipients). Engineer-visible surface is the primary handoff. - Medium: match before category gate so authored flows aren't blocked. - Medium: define normalize_walked_path → valid tree with root id, unexplored branches as needs_review stubs. - Medium: category write auth needs owner/admin, not engineer; add require_account_owner_or_admin dep. 
Co-Authored-By: Claude Opus 4.7 --- ...5-29-l1-ai-tree-builder-phase-2a-design.md | 92 ++++++++++++++----- 1 file changed, 69 insertions(+), 23 deletions(-) diff --git a/docs/superpowers/specs/2026-05-29-l1-ai-tree-builder-phase-2a-design.md b/docs/superpowers/specs/2026-05-29-l1-ai-tree-builder-phase-2a-design.md index cdaa77be..4a714eb8 100644 --- a/docs/superpowers/specs/2026-05-29-l1-ai-tree-builder-phase-2a-design.md +++ b/docs/superpowers/specs/2026-05-29-l1-ai-tree-builder-phase-2a-design.md @@ -49,24 +49,30 @@ Dedicated builder for the constrained node generation; reuse existing rails for - Phase 1 `L1WalkTreeVariant` walker — its stubbed synthetic-step UI is replaced by real AI node rendering. **Intake decision flow:** + +Order matters: **match first, gate only the build path.** The category allowlist exists to bound *generic AI building* for safety — it must not block a human-authored flow that already exists for that problem. So matching against published flows runs before any category check; the category gate applies only when we fall through to building. + ``` POST /l1/intake (problem_statement, customer_*, force_build?) - → match_or_build(account_id, problem_text, ticket_ref, force_build): - 1. category = classify(problem_text) # new - 2. if category not in account.enabled_l1_categories: - return {outcome: 'out_of_scope', category} - 3. if not force_build: - hits = flow_matching_engine.find_matches(problem_text) - best = max(hits, default=None) - if best.score >= MATCH_THRESHOLD: - return {outcome: 'matched', target_id, session_kind} # flow|proposal - if best.score >= SUGGEST_THRESHOLD: + → match_or_build(account_id, problem_text, problem_domain, ticket_ref, force_build): + 1. 
if not force_build: + hits = flow_matching_engine.find_matches(problem_text, problem_domain, account_id) + best = max(hits, default=None) # published flows (Trees) only + if best and best.score >= MATCH_THRESHOLD: + return {outcome: 'matched', flow_id, session_kind: 'flow'} + if best and best.score >= SUGGEST_THRESHOLD: return {outcome: 'suggest', near_miss, can_build: true} + 2. category = classify(problem_text) # new — only on build path + 3. if category not in account.enabled_l1_categories: + return {outcome: 'out_of_scope', category} 4. return {outcome: 'build', session_kind: 'ai_build', category} ``` + +**Match scope (Finding 2):** `flow_matching_engine.find_matches()` matches **published flows (`trees`) only** — it returns `{tree_id, tree_name, score, ...}` and has no notion of `FlowProposal`s. Phase 2A therefore matches against published flows only; the `matched` outcome is always `session_kind: 'flow'`. This is sufficient because the flywheel promotes good AI drafts to published flows (§6), which then become matchable on future intakes. Matching against not-yet-promoted proposals is a deferred enhancement (would require extending the engine), noted in §13. + Frontend dispatches on `outcome`: -- `matched` → start a `flow`/`proposal` walk (Phase 1 paths). -- `suggest` → inline prompt ("Found a similar flow — use it, or build new?"); "Build new" re-calls intake with `force_build=true`. +- `matched` → start a `flow` walk (Phase 1 path). +- `suggest` → inline prompt ("Found a similar flow — use it, or build new?"); "Build new" re-calls intake with `force_build=true` (which skips the match pass and runs the category gate before building). - `out_of_scope` → inline prompt offering ad-hoc walk or escalate-without-walk (Phase 1 paths). - `build` → create an `ai_build` session, navigate to the walker, fetch the first node. @@ -98,7 +104,7 @@ Frontend dispatches on `outcome`: ## 5. 
Safety model (layered) -**Layer 1 — classification gate.** `classify(problem_text)` maps the problem to a category via a lightweight model call (low token budget, returns one category key from the enabled set or `unknown`); on model failure it falls back to keyword matching against category aliases. If the result is not in the account's enabled set (or is `unknown`), intake returns `out_of_scope`; no build happens. +**Layer 1 — classification gate (build path only).** Runs only after the match pass misses (§3) — a human-authored flow is never blocked by category settings. `classify(problem_text)` maps the problem to a category via a lightweight model call (low token budget, returns one category key from the enabled set or `unknown`); on model failure it falls back to keyword matching against category aliases. If the result is not in the account's enabled set (or is `unknown`), intake returns `out_of_scope` (offer adhoc/escalate); no build happens. **Layer 2 — constrained generation.** The `ai_tree_builder` system prompt restricts output to: - Safe, reversible, observe-or-restart-class steps only (toggle/restart/reconnect/re-enter, check-status questions). @@ -143,18 +149,37 @@ Ships enabled by default; Account Owners/Admins toggle per account: ## 6. Flywheel capture On `resolve` of an `ai_build` session (`l1_session_service.resolve` extension): -1. Build `proposed_flow_data` from the `walked_path` (the nodes that were actually traversed, normalized into a tree structure). -2. Create a `FlowProposal`: `source='ai_realtime_l1'`, `validated_by_outcome=true`, `proposed_flow_data=`, `linked_ticket_id/kind=`, `problem_domain=`, `status='pending'`. +1. **Normalize** the `walked_path` into a complete, valid `tree_structure` (§6.1) — approval requires a dict with a real `id` (see Finding 5 / `_create_tree_from_proposal`). +2. 
Create a `FlowProposal`: `source='ai_realtime_l1'`, `validated_by_outcome=true`, `proposed_flow_data={tree_structure, match_keywords}`, `l1_session_id=` (NOT `source_session_id` — see §6.2 / Finding 1), `linked_ticket_id/kind=`, `problem_domain=`, `status='pending'`. 3. Run the existing `_find_similar_pending_proposal` dedupe — merge (bump supporting count) if a near-duplicate pending proposal exists, else insert. 4. Emit the existing `proposal.pending` notification to the review queue. -Engineers promote good proposals to authored flows in the existing review queue. Promoted flows are then found by `flow_matching_engine` on future intakes → the KB compounds. No new review UI needed; `source='ai_realtime_l1'` rows surface in the existing queue (optionally badge them "AI · outcome-validated"). +Engineers promote good proposals to authored flows in the existing review queue. Promoted flows are then found by `flow_matching_engine` on future intakes → the KB compounds. `source='ai_realtime_l1'` rows surface in the existing queue (badge them "AI · outcome-validated"). + +### 6.1 Tree normalization (Finding 5) +The live `walked_path` holds only traversed nodes, and `"generate"` is a runtime sentinel, not a real edge — that is not a valid tree and would fail the `_create_tree_from_proposal` guard (`tree_structure` must be a dict with an `id`). At resolve time, `ai_tree_builder.normalize_walked_path(walked_path) -> tree_structure` produces a complete object: +- Assign stable string `id`s to every node; the first node becomes the root and `tree_structure.id` = root id. +- `question` nodes: the **traversed** branch (`yes`/`no` the tech actually chose) points to the next traversed node; the **untraversed** branch points to a terminal `{node_type: 'needs_review', text: 'Branch not explored during the originating call'}` stub. +- `instruction` nodes point to the next traversed node. +- The traversal ends at the real terminal node (`resolved` or `escalate`). 
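The normalization rules above could be sketched like this. It is an illustration under assumptions, not the actual `normalize_walked_path`: the `{"node", "answer"}` entry shape, the flat `{"id", "nodes"}` output layout, and the stub id scheme are all hypothetical, since the spec only requires a dict with a root `id`.

```python
STUB_TEXT = "Branch not explored during the originating call"
TERMINAL_TYPES = {"resolved", "escalate"}


def normalize_walked_path(walked_path: list[dict]) -> dict:
    """Turn traversed nodes into a tree dict; untraversed branches become stubs."""
    if not walked_path:
        raise ValueError("walked_path is empty")
    if walked_path[-1]["node"]["node_type"] not in TERMINAL_TYPES:
        raise ValueError("walked_path must end at a terminal node")

    # Stable ids by position; the first node is the root.
    ids = [f"n{i + 1}" for i in range(len(walked_path))]
    nodes: dict[str, dict] = {}

    for i, entry in enumerate(walked_path):
        src = entry["node"]
        kind = src["node_type"]
        node = {"id": ids[i], "node_type": kind, "text": src["text"]}
        next_id = ids[i + 1] if i + 1 < len(ids) else None

        if kind == "question":
            taken = entry["answer"]
            if taken not in ("yes", "no"):
                raise ValueError(f"question {ids[i]} has no yes/no answer")
            other = "no" if taken == "yes" else "yes"
            stub_id = f"{ids[i]}_{other}_stub"
            nodes[stub_id] = {"id": stub_id, "node_type": "needs_review", "text": STUB_TEXT}
            node[f"{taken}_next"] = next_id   # real edge, replaces the sentinel
            node[f"{other}_next"] = stub_id   # untraversed branch
        elif kind == "instruction":
            node["next"] = next_id
        elif kind == "escalate" and "reason_category" in src:
            node["reason_category"] = src["reason_category"]

        nodes[ids[i]] = node

    return {"id": ids[0], "nodes": nodes}


tree = normalize_walked_path([
    {"node": {"node_type": "question", "id": "a",
              "text": "Is the printer showing a 'ready' status light?"}, "answer": "no"},
    {"node": {"node_type": "instruction", "id": "b",
              "text": "Unplug the printer for 30 seconds, then power it back on."}, "answer": None},
    {"node": {"node_type": "resolved", "id": "c",
              "text": "Printer is back online and printing test pages."}, "answer": None},
])
```

No `"generate"` sentinel survives in the output: every edge is either a real node id or a `needs_review` stub id.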
+This yields a structurally valid, reviewable tree: engineers fill in the `needs_review` branches when promoting. (Trees are `tree_type='troubleshooting'`.) + +### 6.2 FlowProposal L1 source linkage (Finding 1 — Blocker) +`FlowProposal.source_session_id` is currently `nullable=False` FK → `ai_sessions`, and the review UI (`ProposalDetail.tsx`) links the "Source Session" to `/pilot/{source_session_id}` (a FlowPilot chat surface). An L1 `ai_build` session is an `l1_walk_session`, not an `ai_session`, so it cannot populate `source_session_id`. Changes: +- **Model/migration:** add `FlowProposal.l1_session_id` (nullable FK → `l1_walk_sessions.id`, `ondelete=SET NULL`, indexed). Make `source_session_id` **nullable**. Add CHECK `((source_session_id IS NOT NULL) <> (l1_session_id IS NOT NULL))` — exactly one source set. +- **Review UI:** when `l1_session_id` is set (source `ai_realtime_l1`), render the "Source" block as a read-only walked-path summary (problem statement + the resolved path) instead of a `/pilot/...` link. Existing ai_session-sourced proposals are unchanged. +- **Tree promotion:** `_create_tree_from_proposal` sets `Tree.source_session_id` from the proposal — for L1-sourced proposals leave it NULL (confirm `Tree.source_session_id` is nullable; if not, include in the migration). ## 7. Minimum escalation handoff -On `escalate` (terminal node reached, or the L1 hits the Escalate modal during an `ai_build` walk) — extends `l1_session_service.escalate`: -1. **Notify engineers** — `notification_service` bell-badge event `l1.session.escalated` to the account's engineers (and `is_team_admin`/owner). Payload: ticket ref, problem summary, escalation reason category, link. -2. **Engineer-visible surface** — escalated L1 sessions appear in an engineer-facing list. Reuse/extend the existing `/escalations` queue (`EscalationQueuePage`) with an "L1 escalations" section, or a dedicated `GET /l1/escalations` consumed there. 
Each row shows problem, the walked path summary, who escalated, when. +On `escalate` (terminal node reached, or the L1 hits the Escalate modal during an `ai_build` walk) — extends `l1_session_service.escalate`. **The engineer-visible surface is the primary, dependency-free handoff; the bell-badge notification is a thin addition that requires three specific extensions to the FlowPilot-shaped notification system (Finding 3).** + +1. **Engineer-visible surface (primary).** Escalated L1 sessions appear in an engineer-facing list — extend the existing `/escalations` queue (`EscalationQueuePage`) with an "L1 escalations" section, backed by a new `GET /l1/escalations`. Each row: problem statement, walked-path summary, who escalated, when, reason category. Pollable; no dependency on the notification subsystem. + +2. **Bell-badge notification (Finding 3 — three explicit changes).** The notification system is currently FlowPilot-specific: + - `VALID_EVENTS` (`backend/app/schemas/notification.py`) has no `l1.session.escalated`. **Add it** to the set (and to the default `events_enabled` map). + - `_build_notification_link` (`notification_service.py`) only knows `session.escalated → /pilot/{session_id}?pickup=true`. **Add** `l1.session.escalated → /escalations` and **add** a body template for the new event. The existing `session.escalated` event must NOT be reused — an L1 escalation has no ai_session and no `/pilot` pickup flow. + - Default recipients (`_resolve_recipients`, ~line 184) are owner/admin/team_admin only — ordinary **engineers are excluded**. Since L1 escalations must reach engineers who can pick them up, the call **must pass explicit `target_user_ids`** = the account's active `engineer`-role users (plus owner/admin), not rely on the default set. **Still deferred** (documented, not built): PSA ticket reassign, escalation-package markdown generation, AI chat handoff/session creation. 
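The three notification changes can be pictured with a small stand-alone sketch. Everything here is a hypothetical stand-in: the real `VALID_EVENTS`, `_build_notification_link`, and `_resolve_recipients` live in the codebase and may differ in shape, and the `User` type exists only for this example.

```python
from dataclasses import dataclass
from typing import Optional

# Stand-in for the schema's event set, with the new L1 event added.
VALID_EVENTS = {"session.escalated", "l1.session.escalated"}


def build_notification_link(event: str, session_id: str) -> Optional[str]:
    """Map an event to its UI link; L1 escalations never point at /pilot."""
    if event not in VALID_EVENTS:
        raise ValueError(f"Unknown notification event: {event}")
    if event == "session.escalated":
        return f"/pilot/{session_id}?pickup=true"
    if event == "l1.session.escalated":
        return "/escalations"
    return None


@dataclass(frozen=True)
class User:
    id: str
    account_role: str
    is_active: bool = True


def l1_escalation_recipients(users: list[User]) -> list[str]:
    """Explicit target_user_ids: active engineers plus owner/admin."""
    wanted = {"engineer", "owner", "admin"}
    return [u.id for u in users if u.is_active and u.account_role in wanted]


recipients = l1_escalation_recipients([
    User("u1", "engineer"),
    User("u2", "l1"),
    User("u3", "owner"),
    User("u4", "engineer", is_active=False),
])
```

Computing the recipient list explicitly, rather than widening the default set, would leave existing FlowPilot notifications unaffected.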
@@ -167,6 +192,12 @@ On `escalate` (terminal node reached, or the L1 hits the Escalate modal during a **Migration 2 — account L1 category settings.** - Add `accounts.enabled_l1_categories` `JSONB NOT NULL DEFAULT ''::jsonb` (list of category keys). RLS already covers `accounts`. +**Migration 3 — FlowProposal L1 source linkage (Finding 1).** +- Add `flow_proposals.l1_session_id` nullable FK → `l1_walk_sessions.id` (`ondelete=SET NULL`, indexed). +- Make `flow_proposals.source_session_id` **nullable** (was `NOT NULL`). +- Add CHECK `((source_session_id IS NOT NULL) <> (l1_session_id IS NOT NULL))` — exactly one source. +- Confirm `trees.source_session_id` is nullable (L1-promoted trees leave it NULL); if not, drop its NOT NULL here. + No new tables — live build state rides on the existing `l1_walk_sessions.walked_path`; persisted trees ride on `FlowProposal.proposed_flow_data`. ## 9. API surface @@ -176,9 +207,11 @@ No new tables — live build state rides on the existing `l1_walk_sessions.walke | POST | `/l1/intake` | **Extended**: now runs `match_or_build`; response carries `outcome` (`matched`/`suggest`/`out_of_scope`/`build`). | `require_l1_or_coverage` | | POST | `/l1/sessions/{id}/next-node` | **New**: record answer/ack on current node, generate + return next node (or terminal). | `require_l1_or_coverage` | | GET | `/accounts/me/l1-categories` | **New**: list enabled + available categories + hard-floor (read-only) list. | `require_l1_or_above` (read) | -| PATCH | `/accounts/me/l1-categories` | **New**: set enabled categories. | `require_engineer_or_admin` (owner/admin) | +| PATCH | `/accounts/me/l1-categories` | **New**: set enabled categories. | `require_account_owner_or_admin` (Finding 6) | | GET | `/l1/escalations` | **New** (or extend `/escalations`): engineer-visible escalated-from-L1 list. | `require_engineer_or_admin` | +**Finding 6 — new auth dep.** The category control is an owner/admin setting, but `require_engineer_or_admin` also admits `engineer`. 
No existing dep matches "owner or account-admin" (`require_account_owner` is owner-only; `require_admin` is super-admin-only). Add `require_account_owner_or_admin` to `deps.py`: allow `super_admin` bypass, then `account_role in ('owner', 'admin')`, else 403. Use it for the PATCH. + ## 10. Frontend - `L1WalkTreeVariant` — replace synthetic stepping with real node rendering driven by `/next-node`; render `question` (yes/no), `instruction` (acknowledge), `resolved`/`escalate` (terminal). Per-node loading affordance. Disclaimer banner mounted for `ai_build` sessions. @@ -191,10 +224,11 @@ No new tables — live build state rides on the existing `l1_walk_sessions.walke **Backend unit:** - `ai_tree_builder.generate_next_node` — returns valid node per type; escalate-early when path is deep / model signals exhaustion; regenerate-then-escalate on malformed/forbidden output; depth cap forces escalate. - Per-node validation — forbidden-action patterns rejected; hard-floor enforced even if a category is enabled. -- `match_or_build` — all four outcomes at threshold boundaries (`score == MATCH_THRESHOLD`, `== SUGGEST_THRESHOLD`), `force_build` bypasses match, `out_of_scope` when category disabled. +- `match_or_build` — all four outcomes at threshold boundaries (`score == MATCH_THRESHOLD`, `== SUGGEST_THRESHOLD`); **match runs before the category gate** (a matched published flow is returned even when its category is disabled — Finding 4); `force_build` skips match but still applies the category gate; `out_of_scope` only on the build path when category disabled/unknown. - `classify` — known categories map correctly; unknown → out_of_scope. -- Flywheel capture — resolve creates `ai_realtime_l1` proposal; dedupe merges near-duplicate. -- Escalation handoff — notification fired; escalated session appears in engineer query. 
+- `normalize_walked_path` (Finding 5) — produces a dict with a root `id`; untraversed `question` branches become `needs_review` stubs; output passes the `_create_tree_from_proposal` validity guard. +- Flywheel capture — resolve creates `ai_realtime_l1` proposal with `l1_session_id` set and `source_session_id` NULL (Finding 1); CHECK accepts exactly-one-source; dedupe merges near-duplicate. +- Escalation handoff — `l1.session.escalated` accepted by the notification schema (Finding 3); link resolves to `/escalations`; explicit engineer `target_user_ids` receive it; escalated session appears in `GET /l1/escalations`. **Backend integration:** - Full intake→build→resolve creates an outcome-validated proposal. @@ -218,3 +252,15 @@ No new tables — live build state rides on the existing `l1_walk_sessions.walke ## 13. Out of scope (restated) KB ingestion + connectors, RAG grounding, PSA reassign, escalation-package generation, AI chat handoff. Each is its own later phase with its own spec. + +**Also deferred (surfaced in review):** +- **Matching against unpromoted `FlowProposal`s** (Finding 2). `flow_matching_engine` matches published flows only. Extending it to also surface outcome-validated drafts before promotion is a later enhancement; Phase 2A relies on engineer promotion (draft → published flow → matchable). + +## 14. Review revisions (2026-05-29 Codex review) +All six findings verified against code and resolved in this spec: +1. **Blocker — FlowProposal source linkage:** §6.2 + §8 Migration 3 (new nullable `l1_session_id`, `source_session_id` made nullable, exactly-one CHECK, review-UI link change). +2. **High — match scope:** §3 (match published flows only; proposal-matching deferred §13). +3. **High — escalation notification:** §7 (engineer surface is primary; three explicit notification-system changes enumerated). +4. **Medium — gate ordering:** §3 + §5 Layer 1 (match first; category gate only on the build path). +5. 
**Medium — flywheel tree shape:** §6.1 (`normalize_walked_path` produces a valid tree with root `id`; unexplored branches → `needs_review` stubs). +6. **Medium — category write auth:** §9 (new `require_account_owner_or_admin` dep; `require_engineer_or_admin` was too broad). -- 2.49.1 From 23dbcec86e20c8e23f802f655fc9d909fd3ab7ea Mon Sep 17 00:00:00 2001 From: Michael Chihlas Date: Fri, 29 May 2026 03:16:10 -0400 Subject: [PATCH 3/4] =?UTF-8?q?docs(plan):=20L1=20AI=20decision-tree=20bui?= =?UTF-8?q?lder=20=E2=80=94=20Phase=202A=20implementation=20plan?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit 19 TDD tasks from the approved spec: 3 migrations (ai_build kind, account categories, FlowProposal l1_session_id), ai_tree_builder (constrained node gen + validation + normalize), match_or_build orchestrator (match-first, gate-on-build), session-service ai_build start/advance, flywheel capture on resolve, engineer escalation notification, category settings API, and the frontend walker/dispatch/settings/escalations surfaces + e2e. Co-Authored-By: Claude Opus 4.7 --- .../2026-05-29-l1-ai-tree-builder-phase-2a.md | 1966 +++++++++++++++++ 1 file changed, 1966 insertions(+) create mode 100644 docs/superpowers/plans/2026-05-29-l1-ai-tree-builder-phase-2a.md diff --git a/docs/superpowers/plans/2026-05-29-l1-ai-tree-builder-phase-2a.md b/docs/superpowers/plans/2026-05-29-l1-ai-tree-builder-phase-2a.md new file mode 100644 index 00000000..af4ce758 --- /dev/null +++ b/docs/superpowers/plans/2026-05-29-l1-ai-tree-builder-phase-2a.md @@ -0,0 +1,1966 @@ +# L1 AI Decision-Tree Builder — Phase 2A Implementation Plan + +> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. 
+
+**Goal:** When an L1 tech describes a problem with no matching published flow, build a yes/no decision tree in real time from generic L1 knowledge (constrained + escalate-early), walk it node-by-node, capture resolved trees as outcome-validated drafts, and route escalations to engineers.
+
+**Architecture:** Approach C — a dedicated `ai_tree_builder` service for constrained node-by-node generation and a `match_or_build` orchestrator that matches published flows first and gates generic building behind admin-configured categories, reusing `flow_matching_engine` (match), `knowledge_flywheel`/`FlowProposal` (capture), and `notification_service` (escalation).
+
+**Tech Stack:** Python 3.12 · FastAPI · SQLAlchemy 2.0 async · Alembic · PostgreSQL 16 (RLS) · React 19 + Vite + TS + Tailwind v4 · Playwright.
+
+**Source spec:** [`docs/superpowers/specs/2026-05-29-l1-ai-tree-builder-phase-2a-design.md`](../specs/2026-05-29-l1-ai-tree-builder-phase-2a-design.md)
+
+**Conventions (read before starting):**
+- Migrations are **hand-written**: `alembic revision -m "msg"` then edit `upgrade()`/`downgrade()` by hand. **Never** `--autogenerate`, **never** `--rev-id`. Current head is `b3358ba0e48c`; each new migration chains from the previous.
+- Backend tests run in the container: `docker exec resolutionflow_backend pytest -v`. The suite uses pytest-xdist in CI; single-module runs work locally.
+- JSONB columns require **reassignment** (`x = [*x, item]`), not in-place mutation (see `record_step`).
+- Commit after each task with the message shown. Git trailer: `Co-Authored-By: Claude Opus 4.7 `.
+- Model tiers: `settings.get_model_for_action(key)` → `ACTION_MODEL_MAP[key]` → `AI_MODEL_TIERS[tier]`. `fast`=Haiku, `standard`=Sonnet.
+
+---
+
+## File Structure
+
+**New backend files:**
+- `backend/app/services/ai_tree_builder.py` — node Pydantic models, constrained system prompt, `generate_next_node`, per-node validation, `normalize_walked_path`.
+- `backend/app/services/match_or_build.py` — orchestrator (`match_or_build`, `classify`). +- `backend/app/services/l1_category_service.py` — `DEFAULT_L1_CATEGORIES`, `HARD_FLOOR_FORBIDDEN`, get/set enabled categories. +- `backend/app/schemas/l1_categories.py` — category settings request/response. +- `backend/tests/test_ai_tree_builder.py`, `test_match_or_build.py`, `test_l1_category_service.py`, `test_l1_ai_build_flow.py` (integration). +- 3 Alembic migrations. + +**Modified backend files:** +- `backend/app/models/l1_walk_session.py` — `ai_build` in CHECK constraints. +- `backend/app/models/account.py` — `enabled_l1_categories` column. +- `backend/app/models/flow_proposal.py` — `l1_session_id`, nullable `source_session_id`, exactly-one CHECK. +- `backend/app/core/config.py` — `l1_realtime_build` + `l1_classify` action keys. +- `backend/app/api/deps.py` — `require_account_owner_or_admin`. +- `backend/app/api/endpoints/l1.py` — intake dispatch, `/sessions/{id}/next-node`, `/escalations`. +- `backend/app/api/endpoints/accounts.py` — `/me/l1-categories` GET/PATCH. +- `backend/app/schemas/l1.py` — `IntakeResponse.outcome`, `ai_build` literal, `NextNode*` schemas. +- `backend/app/services/l1_session_service.py` — `start_ai_build_session`, flywheel capture in `resolve`, engineer notification in `escalate`. +- `backend/app/services/notification_service.py` + `backend/app/schemas/notification.py` — `l1.session.escalated` event. + +**Modified frontend files:** +- `frontend/src/api/l1.ts`, `frontend/src/types/l1.ts` — next-node, outcome, categories. +- `frontend/src/pages/l1/L1Dashboard.tsx` — dispatch on intake `outcome`. +- `frontend/src/components/l1/L1WalkTreeVariant.tsx` — real node rendering + disclaimer. +- `frontend/src/components/flowpilot/ProposalDetail.tsx` — L1-sourced source block. +- `frontend/src/pages/EscalationQueuePage.tsx` — L1 escalations section. +- New: `frontend/src/pages/account/L1CategoriesPage.tsx` + route + nav. 
+- `frontend/e2e/l1-workspace.spec.ts` — AI build flow tests. + +--- + +## Task 1: Migration + model — `ai_build` session kind + +**Files:** +- Create: `backend/alembic/versions/_add_ai_build_session_kind.py` +- Modify: `backend/app/models/l1_walk_session.py:42-61` +- Test: `backend/tests/test_l1_ai_build_model.py` + +- [ ] **Step 1: Write the failing test** + +```python +# backend/tests/test_l1_ai_build_model.py +import uuid +import pytest +from app.models.l1_walk_session import L1WalkSession + + +def test_ai_build_session_kind_allowed_by_model_constraint(): + """ai_build is a valid session_kind with both target FKs null (like adhoc).""" + s = L1WalkSession( + account_id=uuid.uuid4(), + created_by_user_id=uuid.uuid4(), + ticket_id="t1", + ticket_kind="internal", + session_kind="ai_build", + ) + assert s.session_kind == "ai_build" + assert s.flow_id is None and s.flow_proposal_id is None +``` + +- [ ] **Step 2: Run test to verify it fails** + +Run: `docker exec resolutionflow_backend pytest tests/test_l1_ai_build_model.py -v` +Expected: PASS at the Python level already (model has no enum on the attribute) — the real enforcement is the DB CHECK. If it errors on import, fix the import first. Treat this task's true verification as the migration roundtrip in Step 6. 
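For intuition, the target-consistency rule that the database CHECK enforces once `ai_build` is allowed can be mirrored as a small Python predicate. This is an illustrative sketch only: `target_consistent` is a hypothetical helper, not part of the codebase, and the CHECK constraint remains the source of truth.

```python
from typing import Optional


def target_consistent(
    session_kind: str,
    flow_id: Optional[str],
    flow_proposal_id: Optional[str],
) -> bool:
    """Mirror of the session-kind/target rule (hypothetical helper)."""
    if session_kind == "flow":
        # A flow walk must point at a flow and nothing else.
        return flow_id is not None and flow_proposal_id is None
    if session_kind == "proposal":
        # A proposal walk must point at a proposal and nothing else.
        return flow_proposal_id is not None and flow_id is None
    if session_kind in ("adhoc", "ai_build"):
        # Ad-hoc and AI-built walks have no pre-existing target.
        return flow_id is None and flow_proposal_id is None
    return False  # unknown kinds are rejected
```

The point of the sketch is that `ai_build` behaves exactly like `adhoc` with respect to targets: both foreign keys stay NULL.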
+ +- [ ] **Step 3: Update model CHECK constraints** + +In `backend/app/models/l1_walk_session.py`, update the two constraints: + +```python + CheckConstraint( + "session_kind IN ('flow', 'proposal', 'adhoc', 'ai_build')", + name="ck_l1_walk_sessions_session_kind", + ), + CheckConstraint( + "(session_kind = 'flow' AND flow_id IS NOT NULL AND flow_proposal_id IS NULL) " + "OR (session_kind = 'proposal' AND flow_proposal_id IS NOT NULL AND flow_id IS NULL) " + "OR (session_kind IN ('adhoc', 'ai_build') AND flow_id IS NULL AND flow_proposal_id IS NULL)", + name="ck_l1_walk_sessions_target_consistency", + ), +``` + +- [ ] **Step 4: Create the migration** + +Run: `docker exec resolutionflow_backend alembic revision -m "add ai_build session kind"` +Then edit the generated file so `down_revision` is the current head (`b3358ba0e48c` unless a later task already advanced it) and the body drops+recreates the two CHECK constraints: + +```python +def upgrade() -> None: + op.drop_constraint("ck_l1_walk_sessions_session_kind", "l1_walk_sessions", type_="check") + op.create_check_constraint( + "ck_l1_walk_sessions_session_kind", "l1_walk_sessions", + "session_kind IN ('flow', 'proposal', 'adhoc', 'ai_build')", + ) + op.drop_constraint("ck_l1_walk_sessions_target_consistency", "l1_walk_sessions", type_="check") + op.create_check_constraint( + "ck_l1_walk_sessions_target_consistency", "l1_walk_sessions", + "(session_kind = 'flow' AND flow_id IS NOT NULL AND flow_proposal_id IS NULL) " + "OR (session_kind = 'proposal' AND flow_proposal_id IS NOT NULL AND flow_id IS NULL) " + "OR (session_kind IN ('adhoc', 'ai_build') AND flow_id IS NULL AND flow_proposal_id IS NULL)", + ) + + +def downgrade() -> None: + op.drop_constraint("ck_l1_walk_sessions_target_consistency", "l1_walk_sessions", type_="check") + op.create_check_constraint( + "ck_l1_walk_sessions_target_consistency", "l1_walk_sessions", + "(session_kind = 'flow' AND flow_id IS NOT NULL AND flow_proposal_id IS NULL) " + "OR 
(session_kind = 'proposal' AND flow_proposal_id IS NOT NULL AND flow_id IS NULL) " + "OR (session_kind = 'adhoc' AND flow_id IS NULL AND flow_proposal_id IS NULL)", + ) + op.drop_constraint("ck_l1_walk_sessions_session_kind", "l1_walk_sessions", type_="check") + op.create_check_constraint( + "ck_l1_walk_sessions_session_kind", "l1_walk_sessions", + "session_kind IN ('flow', 'proposal', 'adhoc')", + ) +``` + +- [ ] **Step 5: Apply the migration** + +Run: `docker exec resolutionflow_backend alembic upgrade head` +Expected: `Running upgrade b3358ba0e48c -> , add ai_build session kind` + +- [ ] **Step 6: Verify roundtrip + insert an ai_build row** + +Run: +```bash +docker exec resolutionflow_postgres psql -U postgres -d resolutionflow -c \ +"INSERT INTO l1_walk_sessions (id, account_id, created_by_user_id, ticket_id, ticket_kind, session_kind, walked_path, walk_notes, status, started_at, last_step_at) \ + SELECT gen_random_uuid(), a.id, u.id, 't-smoke', 'internal', 'ai_build', '[]'::jsonb, '[]'::jsonb, 'active', now(), now() \ + FROM accounts a JOIN users u ON u.account_id=a.id LIMIT 1 RETURNING id;" +``` +Expected: one row id returned (no CHECK violation). 
Then clean up: `DELETE FROM l1_walk_sessions WHERE ticket_id='t-smoke';`
+
+- [ ] **Step 7: Commit**
+
+```bash
+git add backend/app/models/l1_walk_session.py backend/alembic/versions/ backend/tests/test_l1_ai_build_model.py
+git commit -m "feat(l1): add ai_build session kind (model + migration)"
+```
+
+---
+
+## Task 2: Migration + model — account `enabled_l1_categories`
+
+**Files:**
+- Create: `backend/alembic/versions/_add_enabled_l1_categories.py`
+- Modify: `backend/app/models/account.py`
+- Test: `backend/tests/test_account_l1_categories_column.py`
+
+- [ ] **Step 1: Write the failing test**
+
+```python
+# backend/tests/test_account_l1_categories_column.py
+from app.models.account import Account
+
+
+def test_account_has_enabled_l1_categories_default():
+    a = Account(name="Acme", display_code="ABC12345")
+    # Column default is applied at flush; attribute may be None pre-flush.
+    assert hasattr(a, "enabled_l1_categories")
+```
+
+- [ ] **Step 2: Run test to verify it fails**
+
+Run: `docker exec resolutionflow_backend pytest tests/test_account_l1_categories_column.py -v`
+Expected: FAIL — `AssertionError` (`hasattr` returns `False` until the column exists on the model).
+
+- [ ] **Step 3: Add the model column**
+
+In `backend/app/models/account.py`, after `sso_config` (or near other JSONB columns), add:
+
+```python
+    enabled_l1_categories: Mapped[list[str]] = mapped_column(
+        JSONB(), nullable=False,
+        server_default=sa_text(
+            "'[\"password_reset\",\"account_lockout\",\"printer\","
+            "\"email_outlook_client\",\"wifi_network_basics\",\"vpn_connect\","
+            "\"teams_zoom_av\",\"browser_cache_cookies\",\"peripheral_reconnect\","
+            "\"os_restart_update\"]'::jsonb"
+        ),
+    )
+```
+
+Ensure imports exist at top of file: `from sqlalchemy.dialects.postgresql import JSONB` and `from sqlalchemy import text as sa_text` (add if missing).
+
+- [ ] **Step 4: Run test to verify it passes**
+
+Run: `docker exec resolutionflow_backend pytest tests/test_account_l1_categories_column.py -v`
+Expected: PASS.
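Because the `server_default` literal is escaped by hand, a quick parse of its JSON payload is a cheap sanity check. A minimal sketch, assuming the payload is copied by hand into a local constant (`SERVER_DEFAULT_JSON` is a name invented for this example and is not imported from the model):

```python
import json

# JSON payload of the server default, copied by hand from the column
# definition (without the surrounding SQL quotes and ::jsonb cast).
SERVER_DEFAULT_JSON = (
    '["password_reset","account_lockout","printer",'
    '"email_outlook_client","wifi_network_basics","vpn_connect",'
    '"teams_zoom_av","browser_cache_cookies","peripheral_reconnect",'
    '"os_restart_update"]'
)

# json.loads raises ValueError if the hand-written literal is malformed.
categories = json.loads(SERVER_DEFAULT_JSON)

# Duplicates would silently survive in a JSONB list, so check for them too.
has_duplicates = len(set(categories)) != len(categories)
```

The same check could be pointed at the migration's default string to confirm the two stay in sync.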
+ +- [ ] **Step 5: Create + apply migration** + +Run: `docker exec resolutionflow_backend alembic revision -m "add enabled_l1_categories to accounts"` +Edit body: + +```python +import sqlalchemy as sa +from sqlalchemy.dialects import postgresql + +_DEFAULT = ('["password_reset","account_lockout","printer","email_outlook_client",' + '"wifi_network_basics","vpn_connect","teams_zoom_av","browser_cache_cookies",' + '"peripheral_reconnect","os_restart_update"]') + +def upgrade() -> None: + op.add_column("accounts", sa.Column( + "enabled_l1_categories", postgresql.JSONB(), nullable=False, + server_default=sa.text(f"'{_DEFAULT}'::jsonb"), + )) + +def downgrade() -> None: + op.drop_column("accounts", "enabled_l1_categories") +``` + +Run: `docker exec resolutionflow_backend alembic upgrade head` +Expected: upgrade applied; existing accounts backfill to the default list. + +- [ ] **Step 6: Commit** + +```bash +git add backend/app/models/account.py backend/alembic/versions/ backend/tests/test_account_l1_categories_column.py +git commit -m "feat(l1): add accounts.enabled_l1_categories with default allowlist" +``` + +--- + +## Task 3: Migration + model — FlowProposal L1 source linkage (Finding 1) + +**Files:** +- Create: `backend/alembic/versions/_flow_proposal_l1_source.py` +- Modify: `backend/app/models/flow_proposal.py:42-82` +- Test: `backend/tests/test_flow_proposal_l1_source.py` + +- [ ] **Step 1: Write the failing test** + +```python +# backend/tests/test_flow_proposal_l1_source.py +import uuid +from app.models.flow_proposal import FlowProposal + + +def test_flow_proposal_accepts_l1_session_id_without_source_session(): + p = FlowProposal( + account_id=uuid.uuid4(), + l1_session_id=uuid.uuid4(), + source_session_id=None, + proposal_type="new_flow", + title="AI L1 draft", + proposed_flow_data={"tree_structure": {"id": "root"}}, + source="ai_realtime_l1", + status="pending", + ) + assert p.l1_session_id is not None and p.source_session_id is None +``` + +- [ ] **Step 2: Run 
test to verify it fails** + +Run: `docker exec resolutionflow_backend pytest tests/test_flow_proposal_l1_source.py -v` +Expected: FAIL — `TypeError`/unexpected kwarg `l1_session_id`. + +- [ ] **Step 3: Update the model** + +In `backend/app/models/flow_proposal.py`: make `source_session_id` nullable, add `l1_session_id`, add the exactly-one CHECK in `__table_args__`. + +```python + source_session_id: Mapped[Optional[uuid.UUID]] = mapped_column( + UUID(as_uuid=True), + ForeignKey("ai_sessions.id", ondelete="CASCADE"), + nullable=True, + index=True, + ) + l1_session_id: Mapped[Optional[uuid.UUID]] = mapped_column( + UUID(as_uuid=True), + ForeignKey("l1_walk_sessions.id", ondelete="SET NULL"), + nullable=True, + index=True, + ) +``` + +Add to `__table_args__` (alongside the existing source/linked_ticket checks): + +```python + CheckConstraint( + "(source_session_id IS NOT NULL) <> (l1_session_id IS NOT NULL)", + name="ck_flow_proposals_exactly_one_source", + ), +``` + +- [ ] **Step 4: Run test to verify it passes** + +Run: `docker exec resolutionflow_backend pytest tests/test_flow_proposal_l1_source.py -v` +Expected: PASS. 
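The exactly-one CHECK compares two booleans with `<>`, which behaves as an exclusive OR: it holds only when precisely one of the two source columns is set. A minimal Python mirror of that logic, offered as a sketch (`exactly_one_source` is a hypothetical helper, not part of the model):

```python
from typing import Optional
from uuid import UUID, uuid4


def exactly_one_source(
    source_session_id: Optional[UUID],
    l1_session_id: Optional[UUID],
) -> bool:
    """True when exactly one source is present (boolean XOR)."""
    return (source_session_id is not None) != (l1_session_id is not None)


# Truth table for the four combinations.
ai_only = exactly_one_source(uuid4(), None)      # FlowPilot-sourced proposal
l1_only = exactly_one_source(None, uuid4())      # L1-sourced proposal
neither = exactly_one_source(None, None)         # rejected
both = exactly_one_source(uuid4(), uuid4())      # rejected
```

Note that the Python model test above constructs the object without touching the database, so the CHECK itself is only exercised once the migration is applied.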
+ +- [ ] **Step 5: Create + apply migration** + +Run: `docker exec resolutionflow_backend alembic revision -m "flow_proposal l1 source linkage"` +Edit body: + +```python +import sqlalchemy as sa +from sqlalchemy.dialects import postgresql + +def upgrade() -> None: + op.add_column("flow_proposals", sa.Column( + "l1_session_id", postgresql.UUID(as_uuid=True), nullable=True)) + op.create_index("ix_flow_proposals_l1_session_id", "flow_proposals", ["l1_session_id"]) + op.create_foreign_key( + "fk_flow_proposals_l1_session_id", "flow_proposals", "l1_walk_sessions", + ["l1_session_id"], ["id"], ondelete="SET NULL") + op.alter_column("flow_proposals", "source_session_id", nullable=True) + op.create_check_constraint( + "ck_flow_proposals_exactly_one_source", "flow_proposals", + "(source_session_id IS NOT NULL) <> (l1_session_id IS NOT NULL)") + +def downgrade() -> None: + op.drop_constraint("ck_flow_proposals_exactly_one_source", "flow_proposals", type_="check") + op.alter_column("flow_proposals", "source_session_id", nullable=False) + op.drop_constraint("fk_flow_proposals_l1_session_id", "flow_proposals", type_="foreignkey") + op.drop_index("ix_flow_proposals_l1_session_id", "flow_proposals") + op.drop_column("flow_proposals", "l1_session_id") +``` + +Run: `docker exec resolutionflow_backend alembic upgrade head`. Expected: applied cleanly (no existing rows violate the new CHECK because all current proposals have a non-null `source_session_id` and null `l1_session_id`). + +- [ ] **Step 6: Verify Tree.source_session_id is nullable** + +Run: `docker exec resolutionflow_postgres psql -U postgres -d resolutionflow -c "\d trees" | grep source_session_id` +Expected: shows the column **without** `not null`. If it shows `not null`, add `op.alter_column("trees","source_session_id",nullable=True)` to this migration's `upgrade()` and re-run. (L1-promoted trees leave it NULL.) 
+ +- [ ] **Step 7: Commit** + +```bash +git add backend/app/models/flow_proposal.py backend/alembic/versions/ backend/tests/test_flow_proposal_l1_source.py +git commit -m "feat(l1): FlowProposal l1_session_id source linkage (nullable source_session_id + exactly-one check)" +``` + +--- + +## Task 4: Category service + model action keys + +**Files:** +- Create: `backend/app/services/l1_category_service.py` +- Modify: `backend/app/core/config.py` (ACTION_MODEL_MAP) +- Test: `backend/tests/test_l1_category_service.py` + +- [ ] **Step 1: Write the failing test** + +```python +# backend/tests/test_l1_category_service.py +from app.services.l1_category_service import ( + DEFAULT_L1_CATEGORIES, HARD_FLOOR_FORBIDDEN, is_category_enabled, +) + + +def test_defaults_and_hard_floor_present(): + assert "password_reset" in DEFAULT_L1_CATEGORIES + assert "registry_edit" in HARD_FLOOR_FORBIDDEN # representative forbidden action key + assert len(DEFAULT_L1_CATEGORIES) == 10 + + +def test_is_category_enabled(): + enabled = ["printer", "vpn_connect"] + assert is_category_enabled("printer", enabled) is True + assert is_category_enabled("registry_edit", enabled) is False + assert is_category_enabled("unknown", enabled) is False +``` + +- [ ] **Step 2: Run test to verify it fails** + +Run: `docker exec resolutionflow_backend pytest tests/test_l1_category_service.py -v` +Expected: FAIL — module not found. + +- [ ] **Step 3: Implement the service** + +```python +# backend/app/services/l1_category_service.py +"""L1 category allowlist + the always-forbidden hard floor. + +DEFAULT_L1_CATEGORIES seeds an account's enabled set. HARD_FLOOR_FORBIDDEN is a +category-independent safety floor the AI tree builder must never emit and admins +cannot enable. See spec §5.1/§5.2. 
+""" +from uuid import UUID + +from sqlalchemy import select +from sqlalchemy.ext.asyncio import AsyncSession + +from app.models.account import Account + +DEFAULT_L1_CATEGORIES: list[str] = [ + "password_reset", "account_lockout", "printer", "email_outlook_client", + "wifi_network_basics", "vpn_connect", "teams_zoom_av", + "browser_cache_cookies", "peripheral_reconnect", "os_restart_update", +] + +# Always-forbidden action classes (keys are stable identifiers; the human-readable +# phrasing lives in the builder system prompt). Admins cannot enable these. +HARD_FLOOR_FORBIDDEN: list[str] = [ + "registry_edit", "system_file_or_boot_edit", "data_or_disk_deletion", + "credential_or_mfa_change", "security_or_av_or_firewall_change", + "elevated_or_admin_script", "domain_dns_dhcp_change", + "server_or_production_config", "billing_or_license_change", +] + +# Substrings that, if present in a generated node's text, indicate a hard-floor +# violation. Used by ai_tree_builder per-node validation (defense in depth). 
+HARD_FLOOR_TEXT_PATTERNS: list[str] = [
+    "regedit", "registry", "format ", "delete partition", "diskpart",
+    "reset password for", "disable firewall", "disable antivirus", "disable defender",
+    "run as administrator", "sudo ", "domain controller", "dns record", "dhcp scope",
+    "uninstall security", "bitlocker",
+]
+
+
+def is_category_enabled(category: str, enabled: list[str]) -> bool:
+    """A category is buildable only if explicitly enabled and not hard-floored."""
+    if category in HARD_FLOOR_FORBIDDEN:
+        return False
+    return category in enabled
+
+
+async def get_enabled_categories(account_id: UUID, db: AsyncSession) -> list[str]:
+    acct = (await db.execute(select(Account).where(Account.id == account_id))).scalar_one()
+    return list(acct.enabled_l1_categories or [])
+
+
+async def set_enabled_categories(
+    account_id: UUID, categories: list[str], db: AsyncSession
+) -> list[str]:
+    """Persist the enabled set, dropping anything unknown or hard-floored."""
+    cleaned = [c for c in categories if c in DEFAULT_L1_CATEGORIES]
+    acct = (await db.execute(select(Account).where(Account.id == account_id))).scalar_one()
+    acct.enabled_l1_categories = cleaned
+    await db.flush()
+    return cleaned
+```
+
+- [ ] **Step 4: Add model action keys**
+
+In `backend/app/core/config.py`, add to `ACTION_MODEL_MAP`:
+
+```python
+    # L1 AI tree builder (Phase 2A): per-node generation is latency-sensitive
+    # on a live call → Sonnet; classification is a short label task → Haiku.
+    "l1_realtime_build": "standard",
+    "l1_classify": "fast",
+```
+
+- [ ] **Step 5: Run tests**
+
+Run: `docker exec resolutionflow_backend pytest tests/test_l1_category_service.py -v`
+Expected: PASS (2 tests).
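The text patterns are matched as plain lowercase substrings, so the check errs toward rejecting: benign wording can trip it. The sketch below illustrates that behavior with a few patterns copied from `HARD_FLOOR_TEXT_PATTERNS`; the helper name `first_hard_floor_hit` and the sample sentences are invented for the example.

```python
from typing import Optional

# A subset of the plan's patterns, copied by hand for illustration.
PATTERNS = ["regedit", "registry", "format ", "sudo ", "bitlocker"]


def first_hard_floor_hit(text: str, patterns: list[str] = PATTERNS) -> Optional[str]:
    """Return the first pattern found in the text, or None if it is clean."""
    lowered = text.lower()
    for pat in patterns:
        if pat in lowered:
            return pat
    return None


# An obviously unsafe instruction is caught.
unsafe_hit = first_hard_floor_hit("Open Regedit and change the key")

# A safe instruction passes.
safe_hit = first_hard_floor_hit("Restart the printer.")

# A benign printer instruction is also flagged, because "format " is a
# substring of "paper format setting".
benign_hit = first_hard_floor_hit("Check the paper format setting")
```

Over-matching is arguably the safe direction for a defense-in-depth layer, but it is worth knowing when reviewing why a walk escalated earlier than expected.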
+ +- [ ] **Step 6: Commit** + +```bash +git add backend/app/services/l1_category_service.py backend/app/core/config.py backend/tests/test_l1_category_service.py +git commit -m "feat(l1): category service (defaults + hard floor) and AI action keys" +``` + +--- + +## Task 5: `ai_tree_builder` — node schema, prompt, generation, validation + +**Files:** +- Create: `backend/app/services/ai_tree_builder.py` +- Test: `backend/tests/test_ai_tree_builder.py` + +- [ ] **Step 1: Write the failing tests** + +```python +# backend/tests/test_ai_tree_builder.py +import pytest +from app.services import ai_tree_builder as atb + + +def test_validate_node_rejects_hard_floor_text(): + node = {"node_type": "instruction", "id": "n1", "text": "Open regedit and change the key", "next": "generate"} + with pytest.raises(atb.UnsafeNodeError): + atb.validate_node(node) + + +def test_validate_node_accepts_safe_instruction(): + node = {"node_type": "instruction", "id": "n1", "text": "Restart the printer.", "next": "generate"} + assert atb.validate_node(node)["node_type"] == "instruction" + + +def test_depth_cap_forces_escalate(): + walked = [{"node_type": "question", "id": f"n{i}", "text": "?", "answer": "no"} for i in range(atb.MAX_DEPTH)] + node = atb.escalate_if_depth_exceeded(walked) + assert node is not None and node["node_type"] == "escalate" + + +def test_normalize_walked_path_builds_valid_tree(): + walked = [ + {"node_type": "question", "id": "n1", "text": "Powered on?", "answer": "no"}, + {"node_type": "instruction", "id": "n2", "text": "Power it on.", "answer": "ack"}, + {"node_type": "resolved", "id": "n3", "text": "Fixed."}, + ] + tree = atb.normalize_walked_path(walked) + assert isinstance(tree, dict) and tree.get("id") == "n1" + # untraversed 'yes' branch of n1 became a needs_review stub + assert any(n["node_type"] == "needs_review" for n in tree["nodes"].values()) +``` + +- [ ] **Step 2: Run tests to verify they fail** + +Run: `docker exec resolutionflow_backend pytest 
tests/test_ai_tree_builder.py -v` +Expected: FAIL — module not found. + +- [ ] **Step 3: Implement the builder** + +```python +# backend/app/services/ai_tree_builder.py +"""Constrained, node-by-node L1 decision-tree generation (spec §4/§5/§6.1). + +Each call produces ONE node given the problem, category, and full walked path. +Generation is constrained to safe/reversible L1 steps and biased to escalate +early. normalize_walked_path() turns a resolved walk into a valid tree object +for flywheel capture. +""" +import json +import logging +from typing import Any, Optional + +from app.core.ai_provider import get_ai_provider +from app.core.config import settings +from app.services.l1_category_service import HARD_FLOOR_TEXT_PATTERNS +from app.services.llm_utils import parse_llm_json + +logger = logging.getLogger(__name__) + +MAX_DEPTH = 12 +VALID_NODE_TYPES = {"question", "instruction", "resolved", "escalate"} + + +class UnsafeNodeError(ValueError): + """Raised when a generated node violates the hard floor or is malformed.""" + + +SYSTEM_PROMPT = """\ +You are an L1 helpdesk troubleshooting guide builder. Given a problem and the +steps already tried, produce the SINGLE next node of a yes/no decision tree. + +HARD RULES: +- Only safe, reversible, observe-or-restart-class steps: checking status, toggling, + restarting, reconnecting, re-entering credentials the USER already knows. +- NEVER produce steps that: edit the registry/system files/boot config; delete or + format data/disks; change credentials/MFA/security/firewall/AV; run elevated or + admin scripts; touch domain controllers/DNS/DHCP or production servers; or have + billing/license impact. These are out of L1 scope. +- When you run out of safe in-scope steps, DO NOT GUESS. Emit an "escalate" node. 
+ +Return ONLY a JSON object for ONE node, one of: +{"node_type":"question","text":""} +{"node_type":"instruction","text":""} +{"node_type":"resolved","text":""} +{"node_type":"escalate","reason_category":"exhausted_safe_steps","text":""} +No prose, no markdown fences. +""" + + +def _build_context(problem_text: str, category: str, walked_path: list[dict]) -> str: + lines = [f"PROBLEM: {problem_text}", f"CATEGORY: {category}", "STEPS SO FAR:"] + if not walked_path: + lines.append("(none yet — produce the first diagnostic question)") + for i, step in enumerate(walked_path, 1): + ans = step.get("answer") + suffix = f" -> {ans}" if ans else "" + lines.append(f"{i}. [{step.get('node_type','?')}] {step.get('text','')}{suffix}") + return "\n".join(lines) + + +def validate_node(node: dict[str, Any]) -> dict[str, Any]: + """Shape + hard-floor validation. Raises UnsafeNodeError on violation.""" + if not isinstance(node, dict) or node.get("node_type") not in VALID_NODE_TYPES: + raise UnsafeNodeError(f"invalid node_type: {node!r}") + text = (node.get("text") or "").lower() + for pat in HARD_FLOOR_TEXT_PATTERNS: + if pat in text: + raise UnsafeNodeError(f"hard-floor pattern '{pat}' in node text") + return node + + +def escalate_if_depth_exceeded(walked_path: list[dict]) -> Optional[dict[str, Any]]: + if len(walked_path) >= MAX_DEPTH: + return { + "node_type": "escalate", + "reason_category": "depth_cap", + "text": "Reached the L1 troubleshooting depth limit — escalating to engineering.", + } + return None + + +async def generate_next_node( + problem_text: str, category: str, walked_path: list[dict] +) -> dict[str, Any]: + """Generate + validate the next node. 
Regenerate once on failure, then escalate.""" + capped = escalate_if_depth_exceeded(walked_path) + if capped: + return capped + + provider = get_ai_provider(settings.get_model_for_action("l1_realtime_build")) + context = _build_context(problem_text, category, walked_path) + + for attempt in range(2): + try: + raw, _, _ = await provider.generate_json( + system_prompt=SYSTEM_PROMPT, + messages=[{"role": "user", "content": context}], + max_tokens=1024, + ) + node = parse_llm_json(raw) + return validate_node(node) + except (UnsafeNodeError, ValueError) as e: + logger.warning("ai_tree_builder node attempt %d failed: %s", attempt + 1, e) + continue + + return { + "node_type": "escalate", + "reason_category": "generation_failed", + "text": "Could not generate a safe next step — escalating to engineering.", + } + + +def normalize_walked_path(walked_path: list[dict]) -> dict[str, Any]: + """Turn a resolved walk into a valid troubleshooting tree (spec §6.1). + + Root = first node's id; question nodes' traversed branch points to the next + node, the untraversed branch to a needs_review stub; terminal node ends it. + Returns {id, nodes: {id: node}} — a dict with an id (passes the proposal + approval guard). 
+ """ + nodes: dict[str, Any] = {} + if not walked_path: + root_id = "root" + nodes[root_id] = {"id": root_id, "node_type": "needs_review", + "text": "Empty walk — needs authoring."} + return {"id": root_id, "nodes": nodes} + + stub_seq = 0 + for i, step in enumerate(walked_path): + nid = step.get("id") or f"n{i+1}" + ntype = step.get("node_type", "question") + nxt = walked_path[i + 1].get("id", f"n{i+2}") if i + 1 < len(walked_path) else None + node: dict[str, Any] = {"id": nid, "node_type": ntype, "text": step.get("text", "")} + if ntype == "question": + answer = (step.get("answer") or "").lower() + stub_seq += 1 + stub_id = f"review-{stub_seq}" + nodes[stub_id] = {"id": stub_id, "node_type": "needs_review", + "text": "Branch not explored during the originating call."} + node["yes_next"] = nxt if answer == "yes" else stub_id + node["no_next"] = nxt if answer == "no" else stub_id + elif ntype == "instruction": + node["next"] = nxt + nodes[nid] = node + + return {"id": walked_path[0].get("id", "n1"), "nodes": nodes} +``` + +- [ ] **Step 4: Run tests to verify they pass** + +Run: `docker exec resolutionflow_backend pytest tests/test_ai_tree_builder.py -v` +Expected: PASS (4 tests). `generate_next_node` is not unit-tested against a live model here; the integration test in Task 12 replaces it with a deterministic stub. 
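To make the normalized shape concrete, the sketch below hand-writes the tree that `normalize_walked_path` would return for a walk of question (answered "no"), instruction, resolved, and then replays it. The field names (`yes_next`, `no_next`, `next`, `needs_review`) come from the implementation above; the `replay` helper and the sample node texts are hypothetical and exist only for illustration.

```python
from typing import Any

# Hand-written example of the {id, nodes} shape; texts are made up.
TREE: dict[str, Any] = {
    "id": "n1",
    "nodes": {
        "review-1": {"id": "review-1", "node_type": "needs_review",
                     "text": "Branch not explored during the originating call."},
        "n1": {"id": "n1", "node_type": "question",
               "text": "Is the printer powered on?",
               "yes_next": "review-1", "no_next": "n2"},
        "n2": {"id": "n2", "node_type": "instruction",
               "text": "Press the power button.", "next": "n3"},
        "n3": {"id": "n3", "node_type": "resolved",
               "text": "Printer is back online."},
    },
}


def replay(tree: dict[str, Any], answers: list[str]) -> list[str]:
    """Hypothetical helper: follow the tree from its root, one answer per question."""
    path: list[str] = []
    remaining = list(answers)
    node_id = tree["id"]
    while node_id is not None:
        node = tree["nodes"][node_id]
        path.append(node_id)
        kind = node["node_type"]
        if kind == "question":
            if not remaining:
                break  # no more answers to consume; stop at this question
            answer = remaining.pop(0)
            node_id = node["yes_next"] if answer == "yes" else node["no_next"]
        elif kind == "instruction":
            node_id = node.get("next")
        else:
            # resolved, escalate, and needs_review all end the walk
            node_id = None
    return path


traversed = replay(TREE, ["no"])
unexplored = replay(TREE, ["yes"])
```

Replaying the originating answer reaches the `resolved` node, while the opposite answer lands on the `needs_review` stub, which is the part an engineer would still have to author before approving the proposal.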
+ +- [ ] **Step 5: Commit** + +```bash +git add backend/app/services/ai_tree_builder.py backend/tests/test_ai_tree_builder.py +git commit -m "feat(l1): ai_tree_builder — constrained node generation, validation, normalize" +``` + +--- + +## Task 6: `match_or_build` orchestrator + `classify` + +**Files:** +- Create: `backend/app/services/match_or_build.py` +- Test: `backend/tests/test_match_or_build.py` + +- [ ] **Step 1: Write the failing tests** + +```python +# backend/tests/test_match_or_build.py +import uuid +import pytest +from unittest.mock import AsyncMock, patch +from app.services import match_or_build as mob + + +@pytest.mark.asyncio +async def test_match_wins_before_category_gate(): + """A strong published-flow match returns 'matched' even if category disabled.""" + with patch.object(mob.flow_matching_engine, "find_matches", new=AsyncMock( + return_value=[{"tree_id": str(uuid.uuid4()), "tree_name": "VPN", "score": 0.9}])), \ + patch.object(mob, "get_enabled_categories", new=AsyncMock(return_value=[])): + res = await mob.match_or_build(uuid.uuid4(), "vpn down", None, "t1", db=AsyncMock(), force_build=False) + assert res["outcome"] == "matched" + assert res["session_kind"] == "flow" + + +@pytest.mark.asyncio +async def test_suggest_band(): + with patch.object(mob.flow_matching_engine, "find_matches", new=AsyncMock( + return_value=[{"tree_id": str(uuid.uuid4()), "tree_name": "X", "score": 0.66}])): + res = await mob.match_or_build(uuid.uuid4(), "p", None, "t1", db=AsyncMock(), force_build=False) + assert res["outcome"] == "suggest" + + +@pytest.mark.asyncio +async def test_out_of_scope_when_category_disabled_on_build_path(): + with patch.object(mob.flow_matching_engine, "find_matches", new=AsyncMock(return_value=[])), \ + patch.object(mob, "classify", new=AsyncMock(return_value="printer")), \ + patch.object(mob, "get_enabled_categories", new=AsyncMock(return_value=["vpn_connect"])): + res = await mob.match_or_build(uuid.uuid4(), "printer jam", None, "t1", 
db=AsyncMock(), force_build=False) + assert res["outcome"] == "out_of_scope" + + +@pytest.mark.asyncio +async def test_build_when_enabled_and_no_match(): + with patch.object(mob.flow_matching_engine, "find_matches", new=AsyncMock(return_value=[])), \ + patch.object(mob, "classify", new=AsyncMock(return_value="printer")), \ + patch.object(mob, "get_enabled_categories", new=AsyncMock(return_value=["printer"])): + res = await mob.match_or_build(uuid.uuid4(), "printer jam", None, "t1", db=AsyncMock(), force_build=False) + assert res["outcome"] == "build" + assert res["session_kind"] == "ai_build" + + +@pytest.mark.asyncio +async def test_force_build_skips_match_but_still_gates(): + fm = AsyncMock(return_value=[{"tree_id": str(uuid.uuid4()), "tree_name": "X", "score": 0.99}]) + with patch.object(mob.flow_matching_engine, "find_matches", new=fm), \ + patch.object(mob, "classify", new=AsyncMock(return_value="printer")), \ + patch.object(mob, "get_enabled_categories", new=AsyncMock(return_value=["printer"])): + res = await mob.match_or_build(uuid.uuid4(), "p", None, "t1", db=AsyncMock(), force_build=True) + fm.assert_not_called() + assert res["outcome"] == "build" +``` + +- [ ] **Step 2: Run tests to verify they fail** + +Run: `docker exec resolutionflow_backend pytest tests/test_match_or_build.py -v` +Expected: FAIL — module not found. + +- [ ] **Step 3: Implement the orchestrator** + +```python +# backend/app/services/match_or_build.py +"""Intake orchestrator: match published flows first, gate generic build behind +the account's enabled categories (spec §3). 
Match runs BEFORE the category gate +so an authored flow is never blocked by category settings (Finding 4).""" +import logging +from typing import Any, Optional +from uuid import UUID + +from sqlalchemy.ext.asyncio import AsyncSession + +from app.core.ai_provider import get_ai_provider +from app.core.config import settings +from app.services import flow_matching_engine +from app.services.l1_category_service import ( + DEFAULT_L1_CATEGORIES, get_enabled_categories, is_category_enabled, +) +from app.services.llm_utils import parse_llm_json + +logger = logging.getLogger(__name__) + +MATCH_THRESHOLD = 0.75 +SUGGEST_THRESHOLD = 0.60 + +_CLASSIFY_PROMPT = ( + "Classify the IT support problem into exactly one of these category keys, " + "or 'unknown'. Return JSON {\"category\":\"\"} only.\nKEYS: " + + ", ".join(DEFAULT_L1_CATEGORIES) +) + + +async def classify(problem_text: str) -> str: + """Map a problem to a category key via a short model call; keyword fallback.""" + try: + provider = get_ai_provider(settings.get_model_for_action("l1_classify")) + raw, _, _ = await provider.generate_json( + system_prompt=_CLASSIFY_PROMPT, + messages=[{"role": "user", "content": problem_text}], + max_tokens=64, + ) + cat = parse_llm_json(raw).get("category", "unknown") + return cat if cat in DEFAULT_L1_CATEGORIES else "unknown" + except Exception as e: # noqa: BLE001 — fall back, never hard-fail intake + logger.warning("classify model call failed (%s); keyword fallback", e) + text = problem_text.lower() + for cat in DEFAULT_L1_CATEGORIES: + if any(tok in text for tok in cat.split("_")): + return cat + return "unknown" + + +async def match_or_build( + account_id: UUID, + problem_text: str, + problem_domain: Optional[str], + ticket_ref: str, + *, + db: AsyncSession, + force_build: bool = False, +) -> dict[str, Any]: + if not force_build: + hits = await flow_matching_engine.find_matches( + problem_text, problem_domain, account_id, db) + best = max(hits, key=lambda h: h["score"], 
default=None) if hits else None + if best and best["score"] >= MATCH_THRESHOLD: + return {"outcome": "matched", "flow_id": best["tree_id"], "session_kind": "flow"} + if best and best["score"] >= SUGGEST_THRESHOLD: + return {"outcome": "suggest", + "near_miss": {"flow_id": best["tree_id"], "flow_name": best["tree_name"], + "score": best["score"]}, + "can_build": True} + + category = await classify(problem_text) + enabled = await get_enabled_categories(account_id, db) + if not is_category_enabled(category, enabled): + return {"outcome": "out_of_scope", "category": category} + return {"outcome": "build", "session_kind": "ai_build", "category": category} +``` + +- [ ] **Step 4: Run tests to verify they pass** + +Run: `docker exec resolutionflow_backend pytest tests/test_match_or_build.py -v` +Expected: PASS (5 tests). + +- [ ] **Step 5: Commit** + +```bash +git add backend/app/services/match_or_build.py backend/tests/test_match_or_build.py +git commit -m "feat(l1): match_or_build orchestrator + classify (match-first, gate-on-build)" +``` + +--- + +## Task 7: Session service — `start_ai_build_session` + +**Files:** +- Modify: `backend/app/services/l1_session_service.py` +- Test: `backend/tests/test_l1_session_service.py` (add) + +- [ ] **Step 1: Write the failing test** + +```python +# add to backend/tests/test_l1_session_service.py +@pytest.mark.asyncio +async def test_start_ai_build_session(db_session, l1_user): + from app.services import l1_session_service as svc + s = await svc.start_ai_build_session( + db_session, account_id=l1_user.account_id, user=l1_user, + ticket_id="t-ai", ticket_kind="internal", + ) + assert s.session_kind == "ai_build" + assert s.flow_id is None and s.flow_proposal_id is None + assert s.status == "active" +``` + +(Use the same fixtures the existing tests in this file use for `db_session`/`l1_user`.) 
+ +- [ ] **Step 2: Run test to verify it fails** + +Run: `docker exec resolutionflow_backend pytest tests/test_l1_session_service.py::test_start_ai_build_session -v` +Expected: FAIL — `AttributeError: start_ai_build_session`. + +- [ ] **Step 3: Implement (mirror `start_adhoc_session`)** + +In `backend/app/services/l1_session_service.py`, after `start_adhoc_session`: + +```python +async def start_ai_build_session( + db: AsyncSession, + *, + account_id: UUID, + user: User, + ticket_id: str, + ticket_kind: str, +) -> L1WalkSession: + """Start an AI-built tree session (nodes generated on demand via next-node).""" + session = L1WalkSession( + account_id=account_id, + created_by_user_id=user.id, + acting_as=_resolve_acting_as(user), + ticket_id=ticket_id, + ticket_kind=ticket_kind, + session_kind="ai_build", + ) + db.add(session) + await db.flush() + return session +``` + +- [ ] **Step 4: Run test to verify it passes** + +Run: `docker exec resolutionflow_backend pytest tests/test_l1_session_service.py::test_start_ai_build_session -v` +Expected: PASS. 
+ +- [ ] **Step 5: Commit** + +```bash +git add backend/app/services/l1_session_service.py backend/tests/test_l1_session_service.py +git commit -m "feat(l1): start_ai_build_session" +``` + +--- + +## Task 8: Session service — `advance_ai_build` (record answer + generate next node) + +**Files:** +- Modify: `backend/app/services/l1_session_service.py` +- Test: `backend/tests/test_l1_session_service.py` (add) + +- [ ] **Step 1: Write the failing test** + +```python +# add to backend/tests/test_l1_session_service.py +@pytest.mark.asyncio +async def test_advance_ai_build_appends_and_returns_next(db_session, l1_user, monkeypatch): + from app.services import l1_session_service as svc + from app.services import ai_tree_builder + s = await svc.start_ai_build_session( + db_session, account_id=l1_user.account_id, user=l1_user, + ticket_id="t-ai", ticket_kind="internal") + + async def fake_next(problem, category, walked): + return {"node_type": "resolved", "id": "done", "text": "Fixed."} + monkeypatch.setattr(ai_tree_builder, "generate_next_node", fake_next) + + next_node = await svc.advance_ai_build( + db_session, session_id=s.id, problem_text="printer", category="printer", + node_id="n1", answer="no", note=None) + assert next_node["node_type"] == "resolved" + refreshed = await db_session.get(type(s), s.id) + assert len(refreshed.walked_path) == 1 + assert refreshed.walked_path[0]["answer"] == "no" +``` + +- [ ] **Step 2: Run test to verify it fails** + +Run: `docker exec resolutionflow_backend pytest tests/test_l1_session_service.py::test_advance_ai_build_appends_and_returns_next -v` +Expected: FAIL — `AttributeError: advance_ai_build`. 
+ +- [ ] **Step 3: Implement** + +Add to `l1_session_service.py` (imports at top: `from app.services import ai_tree_builder`): + +```python +async def advance_ai_build( + db: AsyncSession, + *, + session_id: UUID, + problem_text: str, + category: str, + node_id: Optional[str] = None, + answer: Optional[str] = None, + note: Optional[str] = None, +) -> dict: + """Append the answered/acked node to walked_path, then generate the next node. + + On the first call (node_id is None) nothing is appended; only the first node + is generated. Returns the next node dict and updates current_node_id. + Raises ValueError on missing/inactive/non-ai_build session. + """ + session = await db.get(L1WalkSession, session_id) + if not session: + raise ValueError(f"L1WalkSession {session_id} not found") + if session.session_kind != "ai_build": + raise ValueError("advance_ai_build requires an ai_build session") + if session.status != "active": + raise ValueError(f"Session {session_id} is not active (status={session.status})") + + if node_id is not None: + # The served node's text is not available here, so record id + answer + # only; the node type is inferred from whether a yes/no answer was given. + entry = {"node_type": "question" if answer in ("yes", "no") else "instruction", + "id": node_id, "answer": answer, "l1_note": note} + session.walked_path = [*session.walked_path, entry] + + next_node = await ai_tree_builder.generate_next_node( + problem_text, category, session.walked_path) + session.current_node_id = next_node.get("id") + session.last_step_at = datetime.now(timezone.utc) + await db.flush() + return next_node +``` + +> Note: `advance_ai_build` records only the id and answer of each traversed node; the node `text` is not stored. If a richer transcript is wanted, the endpoint layer (Task 10), which knows the node it last served, can add the text to the entry. Phase 2A keeps the minimal form (id + answer). 
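The entry-recording rule in `advance_ai_build` can be checked in isolation. The sketch below is a pure-function restatement of that rule; `record_step` is a hypothetical name, not part of the service. It keeps the same copy-then-reassign style as `session.walked_path = [*session.walked_path, entry]`, which is presumably used so the ORM sees a new value on the JSON column rather than an in-place mutation (an assumption about the model, not something the plan states).

```python
from typing import Any, Optional


def record_step(
    walked_path: list[dict[str, Any]],
    node_id: Optional[str],
    answer: Optional[str] = None,
    note: Optional[str] = None,
) -> list[dict[str, Any]]:
    """Hypothetical helper: return a new path with the answered node appended."""
    if node_id is None:
        # First call of a session: nothing has been answered yet.
        return list(walked_path)
    entry = {
        # A yes/no answer marks a question; anything else is an acked instruction.
        "node_type": "question" if answer in ("yes", "no") else "instruction",
        "id": node_id,
        "answer": answer,
        "l1_note": note,
    }
    # Build a new list instead of appending in place.
    return [*walked_path, entry]


start: list[dict[str, Any]] = []
after_first_call = record_step(start, None)
after_question = record_step(after_first_call, "n1", answer="no")
after_instruction = record_step(after_question, "n2", note="user rebooted")
```

Because the node type is inferred from the answer alone, an instruction acknowledged with a stray `"yes"` would be recorded as a question; callers should send `answer` only for question nodes.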
+ +- [ ] **Step 4: Run test to verify it passes** + +Run: `docker exec resolutionflow_backend pytest tests/test_l1_session_service.py::test_advance_ai_build_appends_and_returns_next -v` +Expected: PASS. + +- [ ] **Step 5: Commit** + +```bash +git add backend/app/services/l1_session_service.py backend/tests/test_l1_session_service.py +git commit -m "feat(l1): advance_ai_build — record answer + generate next node" +``` + +--- + +## Task 9: Session service — flywheel capture on resolve + engineer notification on escalate + +**Files:** +- Modify: `backend/app/services/l1_session_service.py` (`resolve`, `escalate`) +- Modify: `backend/app/schemas/notification.py` (VALID_EVENTS) +- Modify: `backend/app/services/notification_service.py` (link + body) +- Test: `backend/tests/test_l1_session_service.py` (add) + +- [ ] **Step 1: Write the failing tests** + +```python +# add to backend/tests/test_l1_session_service.py +@pytest.mark.asyncio +async def test_resolve_ai_build_creates_outcome_validated_proposal(db_session, l1_user, monkeypatch): + from app.services import l1_session_service as svc + from app.models.flow_proposal import FlowProposal + from sqlalchemy import select + s = await svc.start_ai_build_session( + db_session, account_id=l1_user.account_id, user=l1_user, + ticket_id="t-ai", ticket_kind="internal") + s.walked_path = [ + {"node_type": "question", "id": "n1", "text": "On?", "answer": "no"}, + {"node_type": "resolved", "id": "n2", "text": "Fixed."}, + ] + await db_session.flush() + await svc.resolve(db_session, session_id=s.id, helpful=True, resolution_notes="ok") + props = (await db_session.execute( + select(FlowProposal).where(FlowProposal.l1_session_id == s.id))).scalars().all() + assert len(props) == 1 + assert props[0].source == "ai_realtime_l1" + assert props[0].validated_by_outcome is True + assert props[0].source_session_id is None + assert props[0].proposed_flow_data["tree_structure"]["id"] == "n1" + + +@pytest.mark.asyncio +async def 
test_escalate_notifies_engineers(db_session, l1_user, monkeypatch): + from app.services import l1_session_service as svc + calls = {} + async def fake_notify(event, account_id, payload, db, target_user_ids=None): + calls["event"] = event + calls["target_user_ids"] = target_user_ids + monkeypatch.setattr(svc, "notify", fake_notify) + s = await svc.start_ai_build_session( + db_session, account_id=l1_user.account_id, user=l1_user, + ticket_id="t-ai", ticket_kind="internal") + await svc.escalate(db_session, session_id=s.id, reason="stuck", reason_category="exhausted_safe_steps") + assert calls["event"] == "l1.session.escalated" + assert calls["target_user_ids"] is not None # explicit engineer recipients +``` + +- [ ] **Step 2: Run tests to verify they fail** + +Run: `docker exec resolutionflow_backend pytest tests/test_l1_session_service.py -k "ai_build_creates_outcome or notifies_engineers" -v` +Expected: FAIL — no proposal created / `notify` not called with the new event. + +- [ ] **Step 3a: Add notification event + link + body** + +`backend/app/schemas/notification.py` — add to `VALID_EVENTS`: +```python + "l1.session.escalated", +``` + +`backend/app/services/notification_service.py` — in `_build_notification_link` `links` dict add: +```python + "l1.session.escalated": "/escalations", +``` +and in the body-template builder (the `bodies` dict near `_build_notification_link`) add: +```python + "l1.session.escalated": "L1 escalated a ticket: {problem_summary}", +``` + +- [ ] **Step 3b: Flywheel capture in `resolve`** + +In `l1_session_service.resolve`, after the existing `proposal.validated_by_outcome` block and before the ticket close, add (imports: `from app.services import ai_tree_builder`, `from app.models.flow_proposal import FlowProposal` already present): + +```python + if helpful and session.session_kind == "ai_build" and session.walked_path: + tree_structure = ai_tree_builder.normalize_walked_path(session.walked_path) + db.add(FlowProposal( + 
account_id=session.account_id, + l1_session_id=session.id, + source_session_id=None, + proposal_type="new_flow", + title=(session.resolution_notes or "AI L1 resolution")[:255], + proposed_flow_data={"tree_structure": tree_structure, "match_keywords": []}, + source="ai_realtime_l1", + validated_by_outcome=True, + linked_ticket_id=session.ticket_id, + linked_ticket_kind=session.ticket_kind, + status="pending", + )) +``` + +> Dedupe via `_find_similar_pending_proposal` is a nice-to-have; Phase 2A inserts directly. If duplicate noise appears in QA, wire the existing dedupe helper here. + +- [ ] **Step 3c: Engineer notification in `escalate`** + +In `l1_session_service.escalate`, after `await log_audit(...)` and before the final `await db.flush()`, add (imports: `from app.services.notification_service import notify`, `from app.models.user import User`, `from sqlalchemy import select`): + +```python + eng_rows = await db.execute( + select(User.id).where( + User.account_id == session.account_id, + User.is_active.is_(True), + User.account_role.in_(("owner", "admin", "engineer")), + ) + ) + target_ids = [r[0] for r in eng_rows.all()] + await notify( + "l1.session.escalated", + session.account_id, + {"problem_summary": session.ticket_id, "session_id": str(session.id), + "reason_category": reason_category}, + db, + target_user_ids=target_ids, + ) +``` + +- [ ] **Step 4: Run tests to verify they pass** + +Run: `docker exec resolutionflow_backend pytest tests/test_l1_session_service.py -k "ai_build_creates_outcome or notifies_engineers" -v` +Expected: PASS. + +- [ ] **Step 5: Run notification schema test** + +Run: `docker exec resolutionflow_backend pytest tests/ -k notification -v` +Expected: PASS (the new event is accepted by `validate_event_keys`). 
+ +- [ ] **Step 6: Commit** + +```bash +git add backend/app/services/l1_session_service.py backend/app/schemas/notification.py backend/app/services/notification_service.py backend/tests/test_l1_session_service.py +git commit -m "feat(l1): flywheel capture on resolve + engineer notification on escalate" +``` + +--- + +## Task 10: API — intake dispatch, next-node, escalations; schemas + deps + +**Files:** +- Modify: `backend/app/schemas/l1.py` +- Modify: `backend/app/api/deps.py` +- Modify: `backend/app/api/endpoints/l1.py` +- Test: `backend/tests/test_l1_api_ai_build.py` + +- [ ] **Step 1: Write the failing test** + +```python +# backend/tests/test_l1_api_ai_build.py +import pytest +from unittest.mock import AsyncMock, patch + + +@pytest.mark.asyncio +async def test_intake_build_outcome_creates_ai_build_session(l1_client): + with patch("app.api.endpoints.l1.match_or_build.match_or_build", + new=AsyncMock(return_value={"outcome": "build", "session_kind": "ai_build", "category": "printer"})): + r = await l1_client.post("/api/v1/l1/intake", json={"problem_statement": "printer jam"}) + assert r.status_code == 200 + body = r.json() + assert body["outcome"] == "build" + assert body["session_kind"] == "ai_build" + assert body["session_id"] + + +@pytest.mark.asyncio +async def test_intake_out_of_scope(l1_client): + with patch("app.api.endpoints.l1.match_or_build.match_or_build", + new=AsyncMock(return_value={"outcome": "out_of_scope", "category": "unknown"})): + r = await l1_client.post("/api/v1/l1/intake", json={"problem_statement": "weird"}) + assert r.status_code == 200 + assert r.json()["outcome"] == "out_of_scope" +``` + +(Use the existing L1 client fixture pattern from `test_l1_api*`/conftest; `l1_client` is an authed AsyncClient for an `l1_tech` user.) + +- [ ] **Step 2: Run test to verify it fails** + +Run: `docker exec resolutionflow_backend pytest tests/test_l1_api_ai_build.py -v` +Expected: FAIL — `outcome` not in response / KeyError. 
+ +- [ ] **Step 3a: Schemas** + +In `backend/app/schemas/l1.py`: +- Change `IntakeResponse.session_kind` literal to include `ai_build` and make `session_id`/`session_kind` optional (non-build outcomes have no session): + +```python +class IntakeResponse(BaseModel): + outcome: Literal["matched", "suggest", "out_of_scope", "build"] + session_id: Optional[UUID] = None + session_kind: Optional[Literal["flow", "proposal", "adhoc", "ai_build"]] = None + ticket_id: Optional[str] = None + ticket_kind: Optional[str] = None + flow_id: Optional[UUID] = None # for 'matched' + near_miss: Optional[dict] = None # for 'suggest' + category: Optional[str] = None # for 'out_of_scope' +``` +Add `NextNodeRequest` / `NextNodeResponse`: +```python +class NextNodeRequest(BaseModel): + node_id: Optional[str] = None + answer: Optional[str] = None # 'yes' | 'no' for questions + acknowledged: Optional[bool] = None + note: Optional[str] = None + +class NextNodeResponse(BaseModel): + node: dict + session_status: str +``` +Ensure `IntakeRequest` has an optional `force_build: bool = False` and `flow_id` is no longer required. 
+ +- [ ] **Step 3b: Auth dep** + +In `backend/app/api/deps.py`, after `require_account_owner`: +```python +async def require_account_owner_or_admin( + current_user: Annotated[User, Depends(get_current_active_user)] +) -> User: + """Require account owner or account-admin (blocks engineers); super_admin bypass.""" + if current_user.is_super_admin: + return current_user + if current_user.account_role in ("owner", "admin"): + return current_user + raise HTTPException( + status_code=status.HTTP_403_FORBIDDEN, + detail="Account owner or admin access required", + ) +``` + +- [ ] **Step 3c: Rewrite intake + add next-node + escalations in `l1.py`** + +Replace the intake body to run the orchestrator (imports: `from app.services import match_or_build`): + +```python +@router.post("/intake", response_model=IntakeResponse) +async def intake( + payload: IntakeRequest, + db: Annotated[AsyncSession, Depends(get_db)], + user: Annotated[User, Depends(require_l1_or_coverage)], +): + """L1 intake: match a published flow, else gate + build, else suggest/out-of-scope.""" + result = await match_or_build.match_or_build( + user.account_id, payload.problem_statement, None, ticket_ref="", + db=db, force_build=payload.force_build, + ) + outcome = result["outcome"] + + if outcome in ("suggest", "out_of_scope"): + await db.commit() + return IntakeResponse(outcome=outcome, near_miss=result.get("near_miss"), + category=result.get("category")) + + # matched OR build → create a ticket and a session + ticket = await internal_ticket_service.create_ticket( + db, account_id=user.account_id, created_by_user_id=user.id, + problem_statement=payload.problem_statement, + customer_name=payload.customer_name, customer_contact=payload.customer_contact, + ) + if outcome == "matched": + session = await l1_session_service.start_flow_session( + db, account_id=user.account_id, user=user, flow_id=UUID(result["flow_id"]), + ticket_id=str(ticket.id), ticket_kind="internal") + else: # build + session = await 
l1_session_service.start_ai_build_session( + db, account_id=user.account_id, user=user, + ticket_id=str(ticket.id), ticket_kind="internal") + await db.commit() + return IntakeResponse( + outcome=outcome, session_id=session.id, session_kind=session.session_kind, + ticket_id=str(ticket.id), ticket_kind="internal", + flow_id=UUID(result["flow_id"]) if outcome == "matched" else None, + ) +``` + +Add next-node endpoint: +```python +@router.post("/sessions/{session_id}/next-node", response_model=NextNodeResponse) +async def next_node( + session_id: UUID, + payload: NextNodeRequest, + db: Annotated[AsyncSession, Depends(get_db)], + user: Annotated[User, Depends(require_l1_or_coverage)], +): + session = await _get_session_or_404(db, session_id, user) + # problem_text + category come from the linked internal ticket + stored category. + ticket = await internal_ticket_service.get_ticket(db, ticket_id=UUID(session.ticket_id)) + problem_text = ticket.problem_statement if ticket else "" + category = session.walked_path[0].get("category") if session.walked_path else None + try: + node = await l1_session_service.advance_ai_build( + db, session_id=session_id, problem_text=problem_text, + category=category or "unknown", node_id=payload.node_id, + answer=payload.answer, note=payload.note) + except ValueError as e: + raise HTTPException(status_code=http_status.HTTP_409_CONFLICT, detail=str(e)) + await db.commit() + return NextNodeResponse(node=node, session_status=session.status) + + +@router.get("/escalations", response_model=list[WalkSessionResponse]) +async def l1_escalations( + db: Annotated[AsyncSession, Depends(get_db)], + user: Annotated[User, Depends(require_engineer_or_admin)], + limit: int = 50, +): + rows = await db.execute( + select(L1WalkSession) + .where(L1WalkSession.account_id == user.account_id, + L1WalkSession.status == "escalated") + .order_by(L1WalkSession.resolved_at.desc()).limit(limit)) + return [_to_response(s) for s in rows.scalars()] +``` +Update the import 
line for deps: `from app.api.deps import get_db, require_l1_or_coverage, require_engineer_or_admin` and add the new schema imports (`NextNodeRequest, NextNodeResponse`). + +> Category persistence: the category classified at intake has to reach every later next-node call, and Phase 2A does not add a `category` column to the session. **Decision for implementer:** in the intake `build` branch, after `start_ai_build_session` and before the commit, set `session.walked_path = [{"node_type": "meta", "category": result["category"]}]`. Intake does not call `advance_ai_build`; the first node is generated by the first next-node request. `_build_context` and `normalize_walked_path` must skip `meta` entries (one-line filter, Step 4). `escalate_if_depth_exceeded` counts raw entries, so the `meta` entry uses one slot of `MAX_DEPTH` unless it is filtered there too. + +- [ ] **Step 4: Handle the `meta` entry** + +In `ai_tree_builder._build_context` and `normalize_walked_path`, skip entries with `node_type == "meta"`: +```python + walked_path = [s for s in walked_path if s.get("node_type") != "meta"] +``` +(add as the first line of both functions). In the next-node endpoint, read category from the meta entry: +```python + category = next((s.get("category") for s in session.walked_path if s.get("node_type") == "meta"), "unknown") +``` + +- [ ] **Step 5: Run tests to verify they pass** + +Run: `docker exec resolutionflow_backend pytest tests/test_l1_api_ai_build.py -v` +Expected: PASS. 
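The `meta` entry handling from Step 4 can be sketched end to end without the service layer. The three helper names below (`seed_category`, `visible_steps`, `stored_category`) are hypothetical and only wrap the expressions given in the plan; the sample category and step are made up.

```python
from typing import Any


def seed_category(category: str) -> list[dict[str, Any]]:
    """Initial walked_path written by the intake build branch."""
    return [{"node_type": "meta", "category": category}]


def visible_steps(walked_path: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Same filter as the first line added to _build_context / normalize_walked_path."""
    return [s for s in walked_path if s.get("node_type") != "meta"]


def stored_category(walked_path: list[dict[str, Any]]) -> str:
    """Same lookup as the next-node endpoint, defaulting to 'unknown'."""
    return next(
        (s.get("category") for s in walked_path if s.get("node_type") == "meta"),
        "unknown",
    )


path = seed_category("printer")
path = [*path, {"node_type": "question", "id": "n1", "answer": "no"}]

steps = visible_steps(path)
category = stored_category(path)
fallback = stored_category([])
```

Keeping the category inside `walked_path` avoids a schema change, at the cost that every consumer of the path has to remember the filter; anything that uses `len(walked_path)` directly sees one extra entry.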
+ +- [ ] **Step 6: Commit** + +```bash +git add backend/app/schemas/l1.py backend/app/api/deps.py backend/app/api/endpoints/l1.py backend/tests/test_l1_api_ai_build.py +git commit -m "feat(l1): intake dispatch + next-node + escalations endpoints, owner/admin dep" +``` + +--- + +## Task 11: Category settings API + +**Files:** +- Create: `backend/app/schemas/l1_categories.py` +- Modify: `backend/app/api/endpoints/accounts.py` +- Test: `backend/tests/test_l1_categories_api.py` + +- [ ] **Step 1: Write the failing test** + +```python +# backend/tests/test_l1_categories_api.py +import pytest + + +@pytest.mark.asyncio +async def test_get_categories(owner_client): + r = await owner_client.get("/api/v1/accounts/me/l1-categories") + assert r.status_code == 200 + body = r.json() + assert "enabled" in body and "available" in body and "hard_floor" in body + + +@pytest.mark.asyncio +async def test_patch_categories_owner_only(owner_client, engineer_client): + r = await engineer_client.patch("/api/v1/accounts/me/l1-categories", + json={"enabled": ["printer"]}) + assert r.status_code == 403 + r2 = await owner_client.patch("/api/v1/accounts/me/l1-categories", + json={"enabled": ["printer", "vpn_connect"]}) + assert r2.status_code == 200 + assert set(r2.json()["enabled"]) == {"printer", "vpn_connect"} +``` + +- [ ] **Step 2: Run test to verify it fails** + +Run: `docker exec resolutionflow_backend pytest tests/test_l1_categories_api.py -v` +Expected: FAIL — 404 (routes not defined). 
+ +- [ ] **Step 3: Schema + endpoints** + +```python +# backend/app/schemas/l1_categories.py +from pydantic import BaseModel + +class L1CategoriesResponse(BaseModel): + enabled: list[str] + available: list[str] + hard_floor: list[str] + +class L1CategoriesUpdate(BaseModel): + enabled: list[str] +``` + +In `backend/app/api/endpoints/accounts.py` (imports: the category service + new deps/schemas): +```python +@router.get("/me/l1-categories", response_model=L1CategoriesResponse) +async def get_l1_categories( + db: Annotated[AsyncSession, Depends(get_db)], + user: Annotated[User, Depends(require_l1_or_above)], +): + enabled = await l1_category_service.get_enabled_categories(user.account_id, db) + return L1CategoriesResponse( + enabled=enabled, + available=l1_category_service.DEFAULT_L1_CATEGORIES, + hard_floor=l1_category_service.HARD_FLOOR_FORBIDDEN, + ) + + +@router.patch("/me/l1-categories", response_model=L1CategoriesResponse) +async def set_l1_categories( + payload: L1CategoriesUpdate, + db: Annotated[AsyncSession, Depends(get_db)], + user: Annotated[User, Depends(require_account_owner_or_admin)], +): + enabled = await l1_category_service.set_enabled_categories(user.account_id, payload.enabled, db) + await db.commit() + return L1CategoriesResponse( + enabled=enabled, + available=l1_category_service.DEFAULT_L1_CATEGORIES, + hard_floor=l1_category_service.HARD_FLOOR_FORBIDDEN, + ) +``` +Add imports: `from app.services import l1_category_service`, `from app.api.deps import require_l1_or_above, require_account_owner_or_admin`, `from app.schemas.l1_categories import L1CategoriesResponse, L1CategoriesUpdate`. + +- [ ] **Step 4: Run test to verify it passes** + +Run: `docker exec resolutionflow_backend pytest tests/test_l1_categories_api.py -v` +Expected: PASS. (If `engineer_client`/`owner_client` fixtures don't exist, add them mirroring `l1_client` with `account_role` `engineer`/`owner`.) 
+
+- [ ] **Step 5: Commit**
+
+```bash
+git add backend/app/schemas/l1_categories.py backend/app/api/endpoints/accounts.py backend/tests/test_l1_categories_api.py
+git commit -m "feat(l1): account L1 category settings API (owner/admin write)"
+```
+
+---
+
+## Task 12: Backend integration test — full intake→build→resolve and →escalate
+
+**Files:**
+- Test: `backend/tests/test_l1_ai_build_flow.py`
+
+- [ ] **Step 1: Write the integration test**
+
+```python
+# backend/tests/test_l1_ai_build_flow.py
+import pytest
+from unittest.mock import AsyncMock, patch
+from sqlalchemy import select
+from app.models.flow_proposal import FlowProposal
+
+
+@pytest.mark.asyncio
+async def test_intake_build_walk_resolve_creates_proposal(l1_client, db_session, monkeypatch):
+    from app.services import ai_tree_builder
+    # 1. force a build outcome
+    with patch("app.api.endpoints.l1.match_or_build.match_or_build",
+               new=AsyncMock(return_value={"outcome": "build", "session_kind": "ai_build", "category": "printer"})):
+        r = await l1_client.post("/api/v1/l1/intake", json={"problem_statement": "printer jam"})
+    sid = r.json()["session_id"]
+
+    # 2. drive next-node deterministically to a resolved node
+    seq = iter([
+        {"node_type": "question", "id": "n1", "text": "Powered on?"},
+        {"node_type": "resolved", "id": "n2", "text": "Fixed."},
+    ])
+    async def fake_next(problem, category, walked):
+        return next(seq)
+    monkeypatch.setattr(ai_tree_builder, "generate_next_node", fake_next)
+
+    r1 = await l1_client.post(f"/api/v1/l1/sessions/{sid}/next-node", json={})
+    assert r1.json()["node"]["node_type"] == "question"
+    r2 = await l1_client.post(f"/api/v1/l1/sessions/{sid}/next-node",
+                              json={"node_id": "n1", "answer": "no"})
+    assert r2.json()["node"]["node_type"] == "resolved"
+
+    # 3. resolve → proposal
+    await l1_client.post(f"/api/v1/l1/sessions/{sid}/resolve",
+                         json={"helpful": True, "resolution_notes": "ok"})
+    props = (await db_session.execute(
+        select(FlowProposal).where(FlowProposal.source == "ai_realtime_l1"))).scalars().all()
+    assert len(props) >= 1
+```
+
+- [ ] **Step 2: Run test**
+
+Run: `docker exec resolutionflow_backend pytest tests/test_l1_ai_build_flow.py -v`
+Expected: PASS. Fix any wiring gaps surfaced here (this is the end-to-end backend gate).
+
+- [ ] **Step 3: Run the full L1 backend suite for regressions**
+
+Run: `docker exec resolutionflow_backend pytest tests/ -k "l1 or match_or_build or ai_tree_builder or notification" -q`
+Expected: all pass.
+
+- [ ] **Step 4: Commit**
+
+```bash
+git add backend/tests/test_l1_ai_build_flow.py
+git commit -m "test(l1): integration — intake build → walk → resolve → proposal"
+```
+
+---
+
+## Task 13: Frontend — API client + types
+
+**Files:**
+- Modify: `frontend/src/types/l1.ts`, `frontend/src/api/l1.ts`
+
+- [ ] **Step 1: Add types**
+
+In `frontend/src/types/l1.ts`:
+```typescript
+export type IntakeOutcome = 'matched' | 'suggest' | 'out_of_scope' | 'build'
+
+export interface IntakeResult {
+  outcome: IntakeOutcome
+  session_id?: string
+  session_kind?: 'flow' | 'proposal' | 'adhoc' | 'ai_build'
+  ticket_id?: string
+  ticket_kind?: string
+  flow_id?: string
+  near_miss?: { flow_id: string; flow_name: string; score: number }
+  category?: string
+}
+
+export type TreeNode =
+  | { node_type: 'question'; id: string; text: string }
+  | { node_type: 'instruction'; id: string; text: string }
+  | { node_type: 'resolved'; id: string; text: string }
+  | { node_type: 'escalate'; id: string; reason_category?: string; text: string }
+  | { node_type: 'needs_review'; id: string; text: string }
+
+export interface NextNodeResult { node: TreeNode; session_status: string }
+
+export interface L1Categories { enabled: string[]; available: string[]; hard_floor: string[] }
+```
+
+- [ ] **Step 2: Add API methods**
+
+In `frontend/src/api/l1.ts`:
+```typescript
+  nextNode: (sessionId: string, body: { node_id?: string; answer?: 'yes' | 'no'; acknowledged?: boolean; note?: string }) =>
+    apiClient.post(`/l1/sessions/${sessionId}/next-node`, body).then(r => r.data),
+
+  getCategories: () =>
+    apiClient.get('/accounts/me/l1-categories').then(r => r.data),
+
+  setCategories: (enabled: string[]) =>
+    apiClient.patch('/accounts/me/l1-categories', { enabled }).then(r => r.data),
+
+  escalations: () =>
+    apiClient.get('/l1/escalations').then(r => r.data),
+```
+Update the existing `intake` method's return type to `IntakeResult`.
+
+- [ ] **Step 3: Type-check**
+
+Run: `docker exec resolutionflow_frontend sh -c 'cd /app && npx tsc --noEmit -p tsconfig.app.json'`
+Expected: clean (callers updated in Tasks 14-15; if tsc flags `intake` callers, proceed — they're fixed next).
+
+- [ ] **Step 4: Commit**
+
+```bash
+git add frontend/src/types/l1.ts frontend/src/api/l1.ts
+git commit -m "feat(l1): frontend api/types for next-node, outcome, categories"
+```
+
+---
+
+## Task 14: Frontend — L1Dashboard intake dispatch
+
+**Files:**
+- Modify: `frontend/src/pages/l1/L1Dashboard.tsx`
+
+- [ ] **Step 1: Replace `handleStart` to dispatch on outcome**
+
+```typescript
+  const handleStart = async () => {
+    if (!problem.trim()) return
+    setSubmitting(true)
+    try {
+      const res = await l1Api.intake({
+        problem_statement: problem.trim(),
+        customer_name: customerName.trim() || undefined,
+        customer_contact: customerContact.trim() || undefined,
+      })
+      if (res.outcome === 'matched' || res.outcome === 'build') {
+        navigate(`/l1/walk/${res.session_id}`)
+      } else if (res.outcome === 'suggest') {
+        setSuggestion(res.near_miss ?? null) // render an inline prompt (below)
+      } else if (res.outcome === 'out_of_scope') {
+        setOutOfScope(res.category ?? 'unknown')
+      }
+    } catch (err) {
+      const detail = (err as { response?: { data?: { detail?: string } } }).response?.data?.detail
+      toast.error(typeof detail === 'string' ? detail : 'Failed to start. Try again.')
+    } finally {
+      setSubmitting(false)
+    }
+  }
+
+  const buildNew = async () => {
+    setSuggestion(null)
+    const res = await l1Api.intake({ problem_statement: problem.trim(), force_build: true })
+    if (res.outcome === 'build') navigate(`/l1/walk/${res.session_id}`)
+    else if (res.outcome === 'out_of_scope') setOutOfScope(res.category ?? 'unknown')
+  }
+```
+
+Add state near the top: `const [suggestion, setSuggestion] = useState<{flow_id:string;flow_name:string;score:number}|null>(null)` and `const [outOfScope, setOutOfScope] = useState<string | null>(null)`. Add `force_build` as an optional field in the `l1Api.intake` body type.
+
+- [ ] **Step 2: Render the suggest + out-of-scope prompts**
+
+Below the intake card, add:
+```tsx
+{suggestion && (
+

Found a similar flow: {suggestion.flow_name}.

+
+ + +
+
+)} +{outOfScope && ( +
+

This problem isn’t in your enabled L1 categories. Start an ad-hoc walk or escalate.

+ {/* reuse existing adhoc/escalate CTAs from Phase 1 */} +
+)}
+```
+
+> For "Use this flow", reuse the Phase-1 matched-flow path. The `matched` outcome already created a session and navigated, so it needs no second intake call. For `suggest → Use this flow`, re-calling `l1Api.intake({problem_statement, ...})` is not guaranteed to take the matched path again. Implementer: on "Use this flow", POST intake with the original text (it is expected to match again and return `matched` with a session); acceptable for Phase 2A.
+
+- [ ] **Step 3: Type-check + lint**
+
+Run: `docker exec resolutionflow_frontend sh -c 'cd /app && npx tsc --noEmit -p tsconfig.app.json && npx eslint src/pages/l1/L1Dashboard.tsx'`
+Expected: clean.
+
+- [ ] **Step 4: Commit**
+
+```bash
+git add frontend/src/pages/l1/L1Dashboard.tsx
+git commit -m "feat(l1): dashboard intake dispatch on match_or_build outcome"
+```
+
+---
+
+## Task 15: Frontend — L1WalkTreeVariant real node rendering + disclaimer
+
+**Files:**
+- Modify: `frontend/src/components/l1/L1WalkTreeVariant.tsx`
+
+- [ ] **Step 1: Drive nodes from `/next-node`**
+
+Replace the synthetic stepping. On mount, if `session.session_kind === 'ai_build'`, fetch the first node (`l1Api.nextNode(session.id, {})`). On answer/ack, POST the current node id + answer, render the returned node. Terminal nodes (`resolved`/`escalate`/`needs_review`) switch to the existing Resolve/Escalate modal affordances.
+
+```tsx
+const [node, setNode] = useState<TreeNode | null>(null)
+const [loading, setLoading] = useState(false)
+
+useEffect(() => {
+  if (session.session_kind !== 'ai_build') return
+  setLoading(true)
+  l1Api.nextNode(session.id, {}).then(r => setNode(r.node)).finally(() => setLoading(false))
+}, [session.id, session.session_kind])
+
+const answer = async (a: 'yes' | 'no') => {
+  if (!node) return
+  setLoading(true)
+  try {
+    const r = await l1Api.nextNode(session.id, { node_id: node.id, answer: a })
+    setNode(r.node)
+  } finally { setLoading(false) }
+}
+
+const acknowledge = async () => {
+  if (!node) return
+  setLoading(true)
+  try {
+    const r = await l1Api.nextNode(session.id, { node_id: node.id, acknowledged: true })
+    setNode(r.node)
+  } finally { setLoading(false) }
+}
+```
+
+- [ ] **Step 2: Render by node_type + disclaimer banner**
+
+```tsx
+{session.session_kind === 'ai_build' && (
+ These are high-confidence troubleshooting steps, but they come from outside your + organization’s knowledge base — review them before acting. When in doubt, escalate early. +
+)} +{loading &&

Thinking through the next step…

} +{node?.node_type === 'question' && ( + <> +

{node.text}

+
+ + +
+ +)} +{node?.node_type === 'instruction' && ( + <> +

{node.text}

+ + +)} +{(node?.node_type === 'resolved') && ( + /* opens existing Resolve modal */ +)} +{(node?.node_type === 'escalate' || node?.node_type === 'needs_review') && ( + /* opens existing Escalate modal */ +)} +``` +Wire `ResolveCta`/`EscalateCta` to the existing `WalkModals` Resolve/Escalate handlers already in this component (reuse, don't duplicate). + +- [ ] **Step 3: Type-check + lint** + +Run: `docker exec resolutionflow_frontend sh -c 'cd /app && npx tsc --noEmit -p tsconfig.app.json && npx eslint src/components/l1/L1WalkTreeVariant.tsx'` +Expected: clean. + +- [ ] **Step 4: Commit** + +```bash +git add frontend/src/components/l1/L1WalkTreeVariant.tsx +git commit -m "feat(l1): walker renders AI-built nodes via next-node + disclaimer banner" +``` + +--- + +## Task 16: Frontend — admin category settings page + +**Files:** +- Create: `frontend/src/pages/account/L1CategoriesPage.tsx` +- Modify: router + account nav (follow the existing `/account/*` child-route pattern) + +- [ ] **Step 1: Build the page** + +```tsx +// frontend/src/pages/account/L1CategoriesPage.tsx +import { useEffect, useState } from 'react' +import { l1Api } from '@/api/l1' +import { toast } from '@/lib/toast' +import type { L1Categories } from '@/types/l1' + +export default function L1CategoriesPage() { + const [data, setData] = useState(null) + useEffect(() => { l1Api.getCategories().then(setData) }, []) + if (!data) return null + const toggle = async (cat: string) => { + const enabled = data.enabled.includes(cat) + ? data.enabled.filter(c => c !== cat) : [...data.enabled, cat] + const updated = await l1Api.setCategories(enabled) + setData({ ...data, enabled: updated.enabled }) + toast.success('L1 categories updated') + } + return ( +
+

L1 AI build categories

+

+ Problems in enabled categories can be built into AI troubleshooting trees when no + flow exists. Disabled categories fall back to ad-hoc or escalation. +

+
+ {data.available.map(cat => ( + + ))} +
+
+

Always excluded (safety)

+
    + {data.hard_floor.map(h =>
  • {h.replace(/_/g, ' ')}
  • )} +
+
+
+ ) +} +``` + +- [ ] **Step 2: Register route + nav** + +Add a lazy import + a child route under the `/account` subtree in `frontend/src/router.tsx` (mirror existing account children, e.g. `{ path: 'l1-categories', element: page(L1CategoriesPage) }` under the AccountLayout route), and a nav entry in the account settings sidebar/menu following the existing pattern. Gate visibility to owner/admin in the menu (reuse `usePermissions`). + +- [ ] **Step 3: Type-check + lint + build** + +Run: `docker exec resolutionflow_frontend sh -c 'cd /app && npx tsc --noEmit -p tsconfig.app.json && npx eslint src/pages/account/L1CategoriesPage.tsx'` +Expected: clean. + +- [ ] **Step 4: Commit** + +```bash +git add frontend/src/pages/account/L1CategoriesPage.tsx frontend/src/router.tsx +git commit -m "feat(l1): admin L1 category settings page" +``` + +--- + +## Task 17: Frontend — ProposalDetail L1 source + engineer escalations section + +**Files:** +- Modify: `frontend/src/components/flowpilot/ProposalDetail.tsx` +- Modify: `frontend/src/pages/EscalationQueuePage.tsx` + +- [ ] **Step 1: ProposalDetail — L1-sourced source block (Finding 1)** + +Where it currently renders the `/pilot/{source_session_id}` link, branch on the new `l1_session_id`: +```tsx +{proposal.l1_session_id ? ( +
+ Source: AI L1 walk (outcome-validated). Unexplored branches are marked + needs review below. +
+) : proposal.source_session_id ? ( + + {/* existing link */} + +) : null} +``` +Add `l1_session_id?: string | null` to the proposal type used here. + +- [ ] **Step 2: EscalationQueuePage — L1 escalations section** + +Fetch `l1Api.escalations()` and render a section above/below the existing queue: +```tsx +const [l1Escalations, setL1Escalations] = useState([]) +useEffect(() => { l1Api.escalations().then(setL1Escalations).catch(() => setL1Escalations([])) }, []) +// render: problem (from ticket), walked-path length, escalated-at, reason +``` +Each row shows the walked-path summary and links to a read-only view (Phase 2A: a simple expandable row is sufficient; no new route required). + +- [ ] **Step 3: Type-check + lint + build** + +Run: `docker exec resolutionflow_frontend sh -c 'cd /app && npx tsc --noEmit -p tsconfig.app.json && npm run build'` +Expected: tsc clean, build succeeds. + +- [ ] **Step 4: Commit** + +```bash +git add frontend/src/components/flowpilot/ProposalDetail.tsx frontend/src/pages/EscalationQueuePage.tsx +git commit -m "feat(l1): proposal L1 source block + engineer L1-escalations section" +``` + +--- + +## Task 18: E2E — AI build flow + +**Files:** +- Modify: `frontend/e2e/l1-workspace.spec.ts` + +- [ ] **Step 1: Add an AI-build e2e test** + +Because the builder calls a live model, stub the network at the Playwright layer: intercept `POST **/l1/intake` to return `{outcome:'build', session_kind:'ai_build', session_id:}` and `POST **/l1/sessions/*/next-node` to return scripted nodes (question → resolved). Assert: L1 lands on the walker, sees the disclaimer banner, answers the question, reaches the resolved CTA. 
+
+```typescript
+test('L1 AI build: intake → answer node → resolve CTA', async ({ page }) => {
+  await login(page, L1_EMAIL)
+  await page.route('**/api/v1/l1/intake', route => route.fulfill({
+    status: 200, contentType: 'application/json',
+    body: JSON.stringify({ outcome: 'build', session_kind: 'ai_build', session_id: 'e2e-sess', ticket_id: 't', ticket_kind: 'internal' }),
+  }))
+  let call = 0
+  await page.route('**/api/v1/l1/sessions/*/next-node', route => {
+    call += 1
+    const node = call === 1
+      ? { node_type: 'question', id: 'n1', text: 'Is it powered on?' }
+      : { node_type: 'resolved', id: 'n2', text: 'Resolved.' }
+    route.fulfill({ status: 200, contentType: 'application/json',
+      body: JSON.stringify({ node, session_status: 'active' }) })
+  })
+  // also stub GET session fetch the walker does on load, if any, to return an ai_build session
+  await page.goto('/l1')
+  await page.getByPlaceholder(/What's the user calling about/i).fill('printer jam')
+  await page.getByRole('button', { name: /Start walk/i }).click()
+  await expect(page.getByText(/outside your organization’s knowledge base/i)).toBeVisible()
+  await expect(page.getByText('Is it powered on?')).toBeVisible()
+  await page.getByRole('button', { name: 'No' }).click()
+  await expect(page.getByText(/Resolved\./i)).toBeVisible()
+})
+```
+Adjust selectors/route patterns to the actual walker data-loading (stub the session GET the walker performs so it reports `session_kind: 'ai_build'`).
+
+- [ ] **Step 2: Run e2e locally only if chromium available; otherwise rely on CI**
+
+This container cannot launch chromium (sandbox). Push and let CI run `npm run test:e2e`. Do not block on local e2e.
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add frontend/e2e/l1-workspace.spec.ts
+git commit -m "test(l1): e2e AI build flow (network-stubbed)"
+```
+
+---
+
+## Task 19: Final verification
+
+- [ ] **Step 1: Backend suite**
+
+Run: `docker exec resolutionflow_backend pytest tests/ -q`
+Expected: all pass (note any pre-existing xdist-only failures per Phase-1 acceptance report §7).
+
+- [ ] **Step 2: Frontend gates**
+
+Run: `docker exec resolutionflow_frontend sh -c 'cd /app && npx tsc --noEmit -p tsconfig.app.json && npm run lint && npm run build'`
+Expected: tsc clean, lint 0 errors, build succeeds.
+
+- [ ] **Step 3: Migration roundtrip on a clean DB**
+
+Run: `docker exec resolutionflow_backend alembic downgrade -3 && docker exec resolutionflow_backend alembic upgrade head`
+Expected: clean down+up for the three new migrations (run against a DB without `ai_build`/L1-proposal rows, or accept the documented downgrade caveat).
+
+- [ ] **Step 4: Open PR**
+
+Push the branch and open a PR to `main` summarizing Phase 2A, linking the spec, and listing the deferred items (KB grounding/connectors, PSA reassign, escalation package, AI chat handoff, proposal-matching).
+
+---
+
+## Self-Review notes (author)
+
+- **Spec coverage:** §3 match_or_build → Task 6/10; §4 streaming + node schema → Task 5/8/10/15; §5 safety (classify, constrained prompt, validation, depth cap, disclaimer) → Task 4/5/15; §6 flywheel + §6.1 normalize + §6.2 linkage → Task 3/9; §7 escalation handoff → Task 9/10/17; §8 migrations → Task 1/2/3; §9 API → Task 10/11; §10 frontend → Task 13-17; §11 testing → throughout + Task 12/18.
+- **Known soft spots flagged for the implementer:** category persistence via a `meta` walked_path entry (Task 10 Step 3c/Step 4) and the "Use this flow" suggest path (Task 14 Step 2) are the two places to validate carefully during review.
+- **Model calls** are mocked/stubbed in tests; a live constrained-decoding smoke test + the Sonnet-vs-Opus benchmark for `l1_realtime_build` should run in staging before wide enablement (spec §5.3).
-- 
2.49.1

From 5c38fb890463bf3c26e47f3afd42d2bd466feb9c Mon Sep 17 00:00:00 2001
From: Michael Chihlas
Date: Fri, 29 May 2026 11:25:28 -0400
Subject: [PATCH 4/4] docs(decisions): record plan-tier taxonomy centralization decision (Option B)

Captures the 2026-05-29 decision to derive admin plan dropdown + validation from the plan_limits table rather than hand-duplicating the allow-list across 6+ sites. Triggered by the prod "AI sessions down" report that traced to the admin dropdown still offering the dead 'team' slug. Adds the matching backlog entry to TODO.md with duplication sites enumerated.

Co-Authored-By: Claude Opus 4.7
---
 .ai/DECISIONS.md | 12 ++++++++++++
 .ai/TODO.md | 2 ++
 2 files changed, 14 insertions(+)

diff --git a/.ai/DECISIONS.md b/.ai/DECISIONS.md
index c1d37240..884deed5 100644
--- a/.ai/DECISIONS.md
+++ b/.ai/DECISIONS.md
@@ -13,6 +13,18 @@
 
 ---
 
+## 2026-05-29 — Single source of truth for plan-tier taxonomy (derive admin UI + validation from `plan_limits`)
+
+**Context:** A prod report ("AI sessions aren't working") traced to the owner account having no paid plan (AI is plan-gated), compounded by a real bug: the admin "Change Plan" dropdown ([`AccountDetailPage.tsx:443-445`](../frontend/src/pages/admin/AccountDetailPage.tsx)) still offered the dead `team` slug (renamed to `enterprise` in migration `4ce3e594cb87`, 2026-05-07) and omitted `starter`/`enterprise`. Selecting "Team" 400s against the hardcoded allow-list in [`admin.py:994`](../backend/app/api/endpoints/admin.py#L994). The dropdown was missed during the 2026-05-07 taxonomy reconciliation because the allowed-plan list is hand-duplicated across ≥6 backend + frontend sites. Second taxonomy-drift incident.
+
+**Decision:** Option B — make `plan_limits` the single source of truth: admin dropdown + pricing/checkout derive plan options from a plans endpoint (filter `is_public`, order by `sort_order`, label from `display_name`), and backend validation checks against actual `plan_limits` rows rather than a hardcoded tuple. Implementation deferred (active work is on another branch); fully specced in [TODO.md](TODO.md). A trivial dropdown-options fix may land first to unblock the admin tool.
+
+**Rejected:** Option A (patch only the `AccountDetailPage` dropdown). Fixes the symptom but leaves the duplication that has now caused two drift incidents — and there is no outage forcing a minimal diff (bug is admin-only and was already worked around via direct Pro assignment). Conflicts with the repo principle "prefer correct architecture over minimal diff."
+
+**Consequences:** New plan tiers become a data change (a `plan_limits` row) instead of a multi-file code edit; UI and validation can no longer drift from the catalog. Requires a public-plans read endpoint (or extending billing state) consumed by the admin UI + pricing page. The `'team'` visibility string (`Tree.visibility` / `StepLibrary.visibility`) is a separate domain and is explicitly out of scope.
+
+---
+
 ## 2026-05-28 — Scope Anthropic structured outputs to flat-array JSON only
 
 **Context:** Optimizing the existing Claude API usage (no model change). The Anthropic path in `generate_json` (`ai_provider.py`) had no equivalent to the Gemini path's `response_mime_type="application/json"` — it prompted for JSON and relied on downstream defenses: `_strip_markdown_fences` (ai_fix), `parse_llm_json` (knowledge_flywheel), and `_try_repair_json` (kb_conversion, which balances unclosed braces on truncated output). Anthropic structured outputs (`output_config.format` with a JSON schema) guarantee valid, parseable JSON and would eliminate those band-aids. The question was which of the four `generate_json` call sites can adopt it.
diff --git a/.ai/TODO.md b/.ai/TODO.md index 438c6e20..a6c9640b 100644 --- a/.ai/TODO.md +++ b/.ai/TODO.md @@ -23,3 +23,5 @@ None selected. Pick from the backlog below or `03-DEVELOPMENT-ROADMAP.md`. - [ ] **`bg-card-hover` Tailwind class doesn't resolve.** [`frontend/src/components/layout/CommandPalette.tsx:450-451`](../frontend/src/components/layout/CommandPalette.tsx) uses `bg-card-hover` as a Tailwind utility, but Tailwind v4 generates `bg-{token}` from `--color-{token}` — and the token in [`frontend/src/index.css:15`](../frontend/src/index.css) is `--color-bg-card-hover`, which generates `bg-bg-card-hover`, not `bg-card-hover`. So those classes silently produce nothing. Other call sites (KnowledgeBaseCards, TeamSummary, ProposalBanner) use the explicit `hover:bg-[var(--color-bg-card-hover)]` form which works. Fix: change the CommandPalette classes to the explicit-var form, OR add a `--color-card-hover` semantic mapping in index.css alongside `--color-card`. Surfaced 2026-05-01 during impeccable polish sweep. - [ ] **`ConcludeSessionModal` paused/escalated step forces single-artifact choice — should allow multi-select.** [`frontend/src/components/assistant/ConcludeSessionModal.tsx`](../frontend/src/components/assistant/ConcludeSessionModal.tsx) ~lines 430-474 ("Paused/Escalated: status update options"). Today the engineer clicks ONE of Ticket Notes / Client Update / Email Draft, the buttons disappear, and the result replaces them. Real MSP escalations almost always need at least two: technical notes for the next engineer's PSA AND a non-technical client update. Same for pause (client update + ticket notes for context when resuming). Recommended shape: multi-select with smart defaults — three checkboxes (`☑ Ticket Notes ☑ Client Update ☐ Email Draft`); for `escalated` pre-check Ticket Notes + Client Update; for `paused` pre-check Client Update only. 
One "Generate" button fires all selected in parallel via existing `aiSessionsApi.generateStatusUpdate(...)` (already supports the three `audience` values: `ticket_notes`, `client_update`, `email_draft`). Each result renders in its own card with its own Copy / Post-to-PSA / Send-Email action. Surfaced 2026-05-01. Feature work, not polish — touches streaming wiring for parallel calls. + +- [ ] **Centralize plan-tier taxonomy — derive admin plan dropdown (and validation) from `plan_limits`, not hardcoded lists.** Chose **Option B** over a one-line patch (see [DECISIONS.md](DECISIONS.md) 2026-05-29). *Surfaced by a prod bug (2026-05-28):* the admin "Change Plan" dropdown at [`AccountDetailPage.tsx:443-445`](../frontend/src/pages/admin/AccountDetailPage.tsx) still offered `free / pro / team` — the dead `team` slug (renamed to `enterprise` in migration `4ce3e594cb87`, 2026-05-07) and missing `starter`/`enterprise`. Selecting "Team" sends `{plan:"team"}` to `PUT /admin/accounts/{id}/subscription/plan`, which 400s on `if data.plan not in ("free","pro","starter","enterprise")` ([admin.py:994](../backend/app/api/endpoints/admin.py#L994), duplicated at [:975](../backend/app/api/endpoints/admin.py#L975)). The 400 detail was swallowed by a generic `toast.error('Failed to update plan')` ([AccountDetailPage.tsx:196](../frontend/src/pages/admin/AccountDetailPage.tsx)), so it presented as "AI sessions are down" (real cause: owner account had no paid plan; AI is plan-gated). **Root cause of the root cause:** the allowed-plan list is hand-duplicated across ≥6 sites and drifted (2nd such incident). 
**Duplication sites to consolidate:** backend [`admin.py:975`](../backend/app/api/endpoints/admin.py#L975) + [`:994`](../backend/app/api/endpoints/admin.py#L994) (tuple, twice), [`schemas/admin.py:128`](../backend/app/schemas/admin.py) (`AdminAccountCreate.plan` Literal), frontend `AccountDetailPage.tsx` dropdown, `AccountsPage.tsx` create-account dropdown, `types/admin.ts` + `types/account.ts` + `types/billing.ts`, `hooks/useSubscription.ts` (`isPaidPlan`), `components/subscription/CheckoutButton.tsx` (`planLabels`). **Source of truth:** the `plan_limits` table (rows: free/starter/pro/enterprise) — `PlanLimitWithBillingResponse` already exposes `is_public` + `sort_order` + `display_name` for ordering/labels. **End state (B):** admin dropdown + pricing/checkout derive options from a plans endpoint backed by `plan_limits` (filter `is_public`, order by `sort_order`, label from `display_name`); backend validation checks against actual `plan_limits` rows instead of a hardcoded tuple. **Trivial first commit (land anytime to unblock the admin tool):** fix the `AccountDetailPage` dropdown to `Free / Starter / Pro / Enterprise` and surface the backend error detail in the toast. ⚠️ The `'team'` string in `Tree.visibility` / `StepLibrary.visibility` is a *separate domain* (shared-with-account) — do NOT touch it. -- 2.49.1
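The Option B end state in the TODO entry (derive dropdown options from `plan_limits`, validate against the rows) might look roughly like the sketch below. It uses plain in-memory rows rather than the real ORM model, and `PlanRow`, `public_plan_options`, `is_valid_plan`, and the non-public `internal_trial` row are all hypothetical names invented for illustration. Only the field names `is_public`, `sort_order`, and `display_name` and the four plan slugs come from the entry itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanRow:
    """Stand-in for a plan_limits row (hypothetical shape, not the real model)."""
    slug: str
    display_name: str
    is_public: bool
    sort_order: int


# Illustrative rows, deliberately out of order. internal_trial is hypothetical.
PLAN_LIMITS = [
    PlanRow("pro", "Pro", True, 2),
    PlanRow("free", "Free", True, 0),
    PlanRow("internal_trial", "Internal trial", False, 9),
    PlanRow("enterprise", "Enterprise", True, 3),
    PlanRow("starter", "Starter", True, 1),
]


def public_plan_options(rows: list[PlanRow]) -> list[tuple[str, str]]:
    """Dropdown options as (slug, label): public rows only, ordered by sort_order."""
    visible = [r for r in rows if r.is_public]
    return [(r.slug, r.display_name) for r in sorted(visible, key=lambda r: r.sort_order)]


def is_valid_plan(slug: str, rows: list[PlanRow]) -> bool:
    """Validate a requested plan against the catalog instead of a hardcoded tuple."""
    return any(r.slug == slug for r in rows)
```

With data shaped like this, the dead `team` slug fails validation and never appears in the dropdown, because both are derived from the same rows.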