feat(l1): AI decision-tree builder — Phase 2A #193
Reference in New Issue
Block a user
Delete Branch "feat/l1-ai-tree-builder-phase-2a"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
L1 AI Decision-Tree Builder — Phase 2A
Implements the Phase 2A plan (design spec). When an L1 tech describes a problem with no matching published flow, the platform builds a yes/no decision tree in real time from generic L1 knowledge (constrained + escalate-early), walks it node-by-node, captures resolved trees as outcome-validated drafts, and routes escalations to engineers.
Phase 1 (the dependency) is already on
main.What's included
Data model (3 migrations, head
1fd88a68b145)l1_walk_sessions.session_kind = 'ai_build'(FK shape mirrorsadhoc).accounts.enabled_l1_categoriesJSONB allowlist (10-key default).FlowProposal.l1_session_idFK →l1_walk_sessions(SET NULL);source_session_idmade nullable; exactly-one-source CHECK.FlowProposalSummaryschema updated to match.Services
l1_category_service— default allowlist + always-forbidden hard floor + get/set.ai_tree_builder— constrained node-by-node generation, per-node hard-floor validation, depth cap,normalize_walked_path(captures a valid reviewable tree; unexplored branches →needs_reviewstubs; skips the hiddenmetacategory-carrier entry).match_or_build— match published flows first, gate generic build behind enabled categories (match runs before the category gate so an authored flow is never blocked);classifywith word-boundary keyword fallback.l1_session_service—start_ai_build_session,advance_ai_build(records answer +node_text, generates next node), flywheel capture onresolve, engineer notification onescalate.l1.session.escalatedevent (link/escalations, body/title templates);_resolve_recipientsnow honors an explicit empty recipient list.API
POST /l1/intakedispatches viamatch_or_build(matched / suggest / out_of_scope / build); build seeds the classified category as a hiddenmetawalked_path entry.POST /l1/sessions/{id}/next-node,GET /l1/escalations(engineer-or-above).GET|PATCH /accounts/me/l1-categories(read: L1-or-above; write: newrequire_account_owner_or_admindep).l1_realtime_build(Sonnet) /l1_classify(Haiku).Frontend
types/l1.ts+api/l1.ts: outcome/result types,TreeNode, categories;nextNode/escalations/getCategories/setCategories(nextNode carriesnode_text).L1Dashboarddispatches on outcome (suggest → use-flow/build-new; out_of_scope → escalate-without-walk).L1WalkTreeVariantrenders AI-built nodes via/next-node+ standing disclaimer banner; terminal nodes → existing Resolve/Escalate modals.account/L1CategoriesPage(+ route + settings card).ProposalDetailL1-source block; newL1EscalationsSectiononEscalationQueuePage.Verification
pytest tests/on a single dev DB is non-deterministic and environmental — two runs gave723 passed / 507 errorsand698 passed / 163 failed / 529 errors, with thousands of asyncpg connection /ProgrammingErrorfailures from shared-event-loop + single-DB serial execution across subsystems this branch never touches (sessions, trees, feedback, branch_manager, fix_outcome, psa, flowpilot…). Proven non-regression: those files pass in isolation (e.g. branch_manager + feedback + fix_outcome = 32 passed / 0 errors). CI runs pytest-xdist with per-worker DBs (conftest._worker_db_url) and is the authoritative gate — please confirm CI green before merge.tsc -b,npm run lint,npm run buildall clean.downgrade -3→b3358ba0e48c→upgrade headroundtrips cleanly.l1-workspace.spec.ts(network-stubbed); runs in CI.l1_realtime_buildshould run in staging before wide enablement (spec §5.3).Deferred (documented, not built)
KB document ingestion + connectors and RAG grounding (Phase 2B); PSA ticket reassign on escalation; escalation-package generation; AI chat handoff; matching against not-yet-promoted
FlowProposals.🤖 Generated with Claude Code
- /intake now runs match_or_build (matched/suggest/out_of_scope/build); build seeds the classified category as a hidden meta walked_path entry, matched starts a flow session, suggest/out_of_scope return prompt data with no session. - New POST /sessions/{id}/next-node (threads node_text to advance_ai_build) and GET /escalations (engineer-or-above) for the handoff queue. - New IntakeResponse(outcome=...)/NextNodeRequest/NextNodeResponse schemas and require_account_owner_or_admin dep. - Reconcile Phase-1 intake tests to the new contract (mock match_or_build); add test_l1_api_ai_build.py covering build/out_of_scope/suggest/next-node/escalations. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>An earlier anchor-edit silently failed, so POST /sessions/{id}/next-node and GET /escalations were never added (they 404'd). Add both, anchored on the real /escalate-without-walk route. Phase-1 test_l1_endpoints tests used POST /intake to create adhoc setup sessions, but Phase 2A intake now dispatches via match_or_build (build/matched/suggest/ out_of_scope — never adhoc). Add a _create_adhoc_session service helper and route the step/notes/resolve/escalate/cross-account setup through it; rewrite test_intake_adhoc as test_intake_build_creates_ai_build_session (mocked outcome). All green: test_l1_endpoints + test_l1_api_ai_build = 25 passed; full Phase 2A backend service/unit/model suite = 56 passed; notification suite = 18 passed. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>ad9c4c8was committed broken) 076a9ec98d076a9ec503b243ed4- flow-proposal.ts: source_session_id nullable + add l1_session_id (matches backend FlowProposalSummary). - ProposalDetail.tsx: render an 'AI L1 walk (outcome-validated)' note when l1_session_id is set instead of the /pilot/{source_session_id} link; fall back to the link for ai_session-sourced proposals. - New L1EscalationsSection.tsx (GET /l1/escalations) — expandable rows with walked-path summary; renders nothing if empty. Mounted below the FlowPilot queue on EscalationQueuePage. tsc -b + eslint clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>Replaces two fabricated counts ('1376', '124') with the figure actually read from a complete run: the 11 Phase 2A test files together = 86 passed / 0 errors / 0 failed. Full serial pytest tests/ is environmental (723p/507e and 698p/163f/529e across runs); erroring files pass in isolation (branch_manager+feedback+fix_outcome = 32 passed). CI (pytest-xdist, per-worker DBs) is the gate. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>Server-assigns a uuid4 id to every AI-generated node (Finding 1 showstopper: nodes had no id but the advance protocol keys on node_id, so ai_build walks never advanced past question 1). Replaces the hidden {"node_type":"meta"} walked_path convention with real category/problem_text/pending_node columns on l1_walk_sessions (migration 61dda4f615c6) — fixes junk proposals + off-by-one depth cap (Findings 8,9), and pending_node replays the served node on re-mount (no duplicate paid LLM call). Intake honors explicit flow_id and adhoc=True (Findings 4,5); flow_proposals.l1_session_id FK -> CASCADE (Finding 6 time bomb); L1 category GET is owner+admin like PATCH and require_account_owner_or_admin delegates to User.can_manage_account (Finding 7); escalate falls back to default recipients + filters deleted_at + warns when empty (Finding 10). Cleanups: dead ticket_ref removed, IntakeResponse per-outcome validator, unused acknowledged dropped, escalations partial index, restored a deleted audit assertion. Full Phase 2A backend set: 110 passed / 0 failed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>Live walk defect: the builder generated alternatives questions ("Is Jane's account a Microsoft account or a local account?") while the UI could only offer Yes/No. Root cause: SYSTEM_PROMPT mandated a label-less '<yes/no question>' shape with no way to express the two answers. - SYSTEM_PROMPT: question nodes must carry yes_label/no_label — the literal button texts; alternatives questions must use the alternatives as labels. - validate_node: labels hard-floor-scanned, must be distinct non-empty strings. - _ensure_labels: server defaults missing labels to Yes/No. - advance_ai_build: records answer_label (and both labels) in walked_path, derived from the server-held pending_node — never client-supplied. - _build_context: LLM context shows the chosen label, not a bare yes/no (a raw "-> yes" on an alternatives question degrades the next generation). - normalize_walked_path: captured flywheel trees keep question labels. - Frontend: buttons render yes_label/no_label; walk transcript and L1EscalationsSection render answer_label. Phase 2A backend set: 137 passed / 0 failed / 8 deselected. tsc, eslint, vite build clean. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>9c34d1e) + smoke-test note