L1 AI Decision-Tree Builder — Phase 2A Design
Status: Draft for review
Date: 2026-05-29
Author: previous session (brainstorming)
Predecessor: 2026-05-28-l1-workspace-design.md (full L1 vision), 2026-05-28-l1-workspace-phase-1-acceptance.md (what shipped in Phase 1)
1. Goal
When an L1 tech describes a problem and there is no matching authored flow or AI draft, the platform builds a yes/no decision tree in real time from the model's general L1 knowledge and walks the tech through it node by node. Scoped to L1-appropriate troubleshooting: simple yes/no questions and reversible step-by-step instructions. Successful trees are captured as outcome-validated drafts for engineer review, compounding the account's knowledge base from real resolutions.
This overrides the original spec's "no empty-KB build" rule (§8.1 of the predecessor), which aborted to a degradation screen when no KB existed. Instead of aborting, we build from generic knowledge under a layered safety model.
KB grounding (RAG over ingested documents) is explicitly deferred to Phase 2B — Phase 2A builds from generic knowledge only, plus matching against already-authored flows.
2. Scope
In scope (Phase 2A):
- match_or_build orchestrator inserted at L1 intake (match-first, build-on-miss).
- ai_tree_builder service: node-by-node ("streaming") tree generation, constrained + escalate-early.
- Admin-configurable L1 category allowlist (Account Owner/Admin control panel).
- Standing AI-disclaimer banner on AI-built walks.
- Flywheel capture: resolved AI trees become outcome-validated FlowProposals.
- Minimum escalation handoff: engineer bell-badge notification + an engineer-visible "escalated from L1" surface.
Deferred:
- KB document ingestion + connectors (IT Glue, Hudu, SharePoint/OneDrive) — Phase 2B.
- RAG grounding of the builder on ingested KB — Phase 2B.
- PSA ticket reassign on escalation, escalation-package generation, AI chat handoff — later phase.
- BuildAbortedNoKB screen from the original spec: dropped (superseded by build-from-generic).
3. Architecture (Approach C)
Dedicated builder for the constrained node generation; reuse existing rails for matching and capture.
New services:
| File | Responsibility |
|---|---|
| backend/app/services/match_or_build.py | Orchestrator. match_or_build(account_id, problem_text, ticket_ref, *, force_build=False) -> MatchOrBuildResult. Match pass → classify → category gate → build/suggest/out-of-scope decision. |
| backend/app/services/ai_tree_builder.py | Node-by-node generation. generate_next_node(problem_text, category, walked_path) -> TreeNode. Reuses get_ai_provider + generate_json + parse_llm_json. Owns the constrained system prompt and per-node validation. |
| backend/app/services/l1_category_service.py | Read/write an account's enabled L1 categories; expose the default allowlist and the always-forbidden hard floor. |
Reused as-is:
- flow_matching_engine.find_matches(): semantic + keyword + recency match pass.
- knowledge_flywheel proposal-creation + dedupe (_find_similar_pending_proposal): outcome-validated capture.
- notification_service: engineer escalation notification (reused, with the three small extensions listed in §7).
- Phase 1 L1WalkTreeVariant walker: its stubbed synthetic-step UI is replaced by real AI node rendering.
Intake decision flow:
Order matters: match first, gate only the build path. The category allowlist exists to bound generic AI building for safety — it must not block a human-authored flow that already exists for that problem. So matching against published flows runs before any category check; the category gate applies only when we fall through to building.
    POST /l1/intake (problem_statement, customer_*, force_build?)
      → match_or_build(account_id, problem_text, problem_domain, ticket_ref, force_build):
        1. if not force_build:
             hits = flow_matching_engine.find_matches(problem_text, problem_domain, account_id)
             best = max(hits, key=score, default=None)   # published flows (Trees) only
             if best and best.score >= MATCH_THRESHOLD:
                 return {outcome: 'matched', flow_id, session_kind: 'flow'}
             if best and best.score >= SUGGEST_THRESHOLD:
                 return {outcome: 'suggest', near_miss, can_build: true}
        2. category = classify(problem_text)             # new, only on build path
        3. if category not in account.enabled_l1_categories:
             return {outcome: 'out_of_scope', category}
        4. return {outcome: 'build', session_kind: 'ai_build', category}
Match scope (Finding 2): flow_matching_engine.find_matches() matches published flows (trees) only — it returns {tree_id, tree_name, score, ...} and has no notion of FlowProposals. Phase 2A therefore matches against published flows only; the matched outcome is always session_kind: 'flow'. This is sufficient because the flywheel promotes good AI drafts to published flows (§6), which then become matchable on future intakes. Matching against not-yet-promoted proposals is a deferred enhancement (would require extending the engine), noted in §13.
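The decision order above can be sketched in Python as follows. This is a minimal illustration, not the service's actual code: the `find_matches` and `classify` callables are stand-ins injected for testability, the result dataclass fields are assumed from the outcome payloads listed above, and `account_id`, `problem_domain`, and `ticket_ref` are omitted for brevity.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

MATCH_THRESHOLD = 0.75    # defaults from §5.3
SUGGEST_THRESHOLD = 0.60


@dataclass
class MatchOrBuildResult:
    outcome: str                        # 'matched' | 'suggest' | 'out_of_scope' | 'build'
    session_kind: Optional[str] = None
    flow_id: Optional[str] = None
    near_miss: Optional[dict] = None
    category: Optional[str] = None
    can_build: bool = False


def match_or_build(
    problem_text: str,
    enabled_categories: Sequence[str],
    find_matches: Callable[[str], List[dict]],  # stand-in for flow_matching_engine.find_matches
    classify: Callable[[str], str],             # stand-in for the Layer 1 classifier
    *,
    force_build: bool = False,
) -> MatchOrBuildResult:
    # 1. Match pass first: a published flow is never blocked by category settings.
    if not force_build:
        hits = find_matches(problem_text)
        best = max(hits, key=lambda h: h["score"], default=None)
        if best and best["score"] >= MATCH_THRESHOLD:
            return MatchOrBuildResult("matched", session_kind="flow", flow_id=best["tree_id"])
        if best and best["score"] >= SUGGEST_THRESHOLD:
            return MatchOrBuildResult("suggest", near_miss=best, can_build=True)
    # 2-3. The category gate applies only once we fall through to building.
    category = classify(problem_text)
    if category not in enabled_categories:
        return MatchOrBuildResult("out_of_scope", category=category)
    # 4. Nothing matched and the category is enabled: build in real time.
    return MatchOrBuildResult("build", session_kind="ai_build", category=category)
```

Note that both thresholds are inclusive, so a score exactly equal to MATCH_THRESHOLD counts as a match; the boundary tests in §11 depend on that choice.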
Frontend dispatches on outcome:
- matched → start a flow walk (Phase 1 path).
- suggest → inline prompt ("Found a similar flow — use it, or build new?"); "Build new" re-calls intake with force_build=true (which skips the match pass and runs the category gate before building).
- out_of_scope → inline prompt offering ad-hoc walk or escalate-without-walk (Phase 1 paths).
- build → create an ai_build session, navigate to the walker, fetch the first node.
4. The streaming build & node schema
ai_tree_builder.generate_next_node() is called with the problem statement, the resolved category, and the full walked path so far. It returns exactly one node. Passing the whole path every call is what keeps independently-generated nodes coherent and lets the model decide when it has exhausted safe steps.
Node shape (proposed_flow_data node, also the live walked_path entry):
// question — yes/no branch; both branches regenerate
{ "node_type": "question", "id": "n3", "text": "Is the printer showing a 'ready' status light?",
"yes_next": "generate", "no_next": "generate" }
// instruction — a single safe, reversible action; advances on acknowledgement
{ "node_type": "instruction", "id": "n4", "text": "Unplug the printer for 30 seconds, then power it back on.",
"next": "generate" }
// resolved — terminal success
{ "node_type": "resolved", "id": "n7", "text": "Printer is back online and printing test pages." }
// escalate — terminal handoff (escalate-early safety valve)
{ "node_type": "escalate", "id": "n7", "reason_category": "exhausted_safe_steps",
"text": "This looks like a driver-level fault beyond L1 scope — escalating to engineering." }
"generate" is a sentinel meaning "call generate_next_node again with the new answer appended." The first node is fetched synchronously on ai_build session creation (intake). Each subsequent node is fetched when the tech answers/acknowledges — target latency ~2–4s per node; show a per-node "Thinking through the next step…" affordance.
Endpoint: POST /l1/sessions/{id}/next-node body {node_id, answer?: 'yes'|'no', acknowledged?: true, note?}. Appends the answered node to walked_path, then generates and returns the next node (or a terminal node). Replaces the Phase 1 synthetic stepping in L1WalkTreeVariant.
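As a rough sketch of what the endpoint does per call, the function below validates the answer against the current node type, appends the answered node to the path, and asks the builder for the next node. The function name `record_and_advance` and the `answer`/`note` keys stored on each walked-path entry are assumptions made for illustration; the spec does not fix how answers are persisted.

```python
from typing import Callable, List, Optional, Tuple

TERMINAL_TYPES = {"resolved", "escalate"}


def record_and_advance(
    walked_path: List[dict],
    current_node: dict,
    generate_next_node: Callable[[List[dict]], dict],  # stand-in for ai_tree_builder
    *,
    answer: Optional[str] = None,
    acknowledged: bool = False,
    note: Optional[str] = None,
) -> Tuple[List[dict], dict]:
    node_type = current_node["node_type"]
    if node_type in TERMINAL_TYPES:
        raise ValueError("session already reached a terminal node")
    if node_type == "question" and answer not in ("yes", "no"):
        raise ValueError("question nodes need answer 'yes' or 'no'")
    if node_type == "instruction" and not acknowledged:
        raise ValueError("instruction nodes need acknowledged=true")

    # Record what the tech actually did on this node (hypothetical entry shape).
    entry = dict(current_node)
    if node_type == "question":
        entry["answer"] = answer
    if note:
        entry["note"] = note

    # Pass the whole path so the next node stays coherent with earlier ones.
    new_path = walked_path + [entry]
    return new_path, generate_next_node(new_path)
```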
5. Safety model (layered)
Layer 1 — classification gate (build path only). Runs only after the match pass misses (§3) — a human-authored flow is never blocked by category settings. classify(problem_text) maps the problem to a category via a lightweight model call (low token budget, returns one category key from the enabled set or unknown); on model failure it falls back to keyword matching against category aliases. If the result is not in the account's enabled set (or is unknown), intake returns out_of_scope (offer adhoc/escalate); no build happens.
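A minimal sketch of the Layer 1 classifier with its keyword fallback might look like this. The `model_classify` callable and the `aliases` mapping are hypothetical stand-ins; the spec does not define the alias table or the model call's signature.

```python
from typing import Callable, Dict, List, Sequence


def classify(
    problem_text: str,
    enabled: Sequence[str],
    aliases: Dict[str, List[str]],                       # hypothetical alias table per category key
    model_classify: Callable[[str, List[str]], str],     # stand-in for the lightweight model call
) -> str:
    try:
        key = model_classify(problem_text, list(enabled))
        # Anything outside the enabled set is treated as unknown.
        return key if key in enabled else "unknown"
    except Exception:
        pass  # model failure: fall through to keyword matching

    text = problem_text.lower()
    for key in enabled:
        if any(alias in text for alias in aliases.get(key, [])):
            return key
    return "unknown"
```

Returning "unknown" rather than raising keeps the intake path simple: the caller only has to check membership in the enabled set to decide between build and out_of_scope.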
Layer 2 — constrained generation. The ai_tree_builder system prompt restricts output to:
- Safe, reversible, observe-or-restart-class steps only (toggle/restart/reconnect/re-enter, check-status questions).
- A hard floor of always-forbidden actions (see §5.1) that NO category may unlock.
- An explicit instruction to emit an escalate node (never guess) once it runs out of in-scope safe steps.
Layer 3 — per-node validation. Server-side, every generated node is checked before being returned:
- Reject (and regenerate once, then escalate) nodes whose text matches forbidden-action patterns (§5.1).
- Enforce a depth cap (default L1_BUILD_MAX_DEPTH = 12): once the walked path hits the cap, force an escalate node.
- Validate node JSON shape (Pydantic); malformed → regenerate once, then escalate.
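The three Layer 3 checks can be sketched together as one wrapper around the generator. This is an illustration under stated assumptions: the forbidden-action regexes are a tiny illustrative sample rather than the real §5.1 pattern set, the `depth_cap` and `validation_failed` reason categories are invented here, and shape validation uses a plain required-fields table in place of the Pydantic models the spec calls for.

```python
import re
from typing import Callable, List

L1_BUILD_MAX_DEPTH = 12

# Illustrative sample only; the real pattern list would cover all of §5.1.
FORBIDDEN_PATTERNS = [
    re.compile(p, re.IGNORECASE)
    for p in (r"\bregedit\b", r"\bregistry\b", r"\bas administrator\b",
              r"\bformat\b.*\b(disk|drive)\b", r"\bdisable\b.*\b(firewall|antivirus|mfa)\b")
]

REQUIRED_FIELDS = {
    "question": {"id", "text", "yes_next", "no_next"},
    "instruction": {"id", "text", "next"},
    "resolved": {"id", "text"},
    "escalate": {"id", "text", "reason_category"},
}


def is_valid(node: object) -> bool:
    """Shape check plus forbidden-action check for one generated node."""
    if not isinstance(node, dict):
        return False
    required = REQUIRED_FIELDS.get(node.get("node_type"))
    if required is None or not required.issubset(node):
        return False
    text = node["text"]
    return isinstance(text, str) and not any(p.search(text) for p in FORBIDDEN_PATTERNS)


def escalate_node(node_id: str, reason_category: str, text: str) -> dict:
    return {"node_type": "escalate", "id": node_id,
            "reason_category": reason_category, "text": text}


def next_validated_node(walked_path: List[dict],
                        generate: Callable[[List[dict]], dict]) -> dict:
    node_id = f"n{len(walked_path) + 1}"
    if len(walked_path) >= L1_BUILD_MAX_DEPTH:
        return escalate_node(node_id, "depth_cap", "Step limit reached; escalating.")
    for _ in range(2):  # first attempt plus exactly one regeneration
        node = generate(walked_path)
        if is_valid(node):
            return node
    return escalate_node(node_id, "validation_failed",
                         "Could not produce a safe next step; escalating.")
```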
Layer 4 — standing disclaimer. Persistent banner on every ai_build walk:
"These are high-confidence troubleshooting steps, but they come from outside your organization's knowledge base — review them before acting. When in doubt, escalate early."
5.1 Hard floor — always forbidden (admins cannot enable)
Regardless of enabled categories, the builder must never produce steps that:
- Modify the Windows registry, system files, or boot configuration.
- Delete, format, or repartition data/disks; remove user profiles or mailboxes.
- Change credentials, MFA, security/firewall/AV settings, or disable protections.
- Run scripts/commands with elevated/admin privileges.
- Touch domain controllers, DNS, DHCP, or production server config.
- Make purchases, license changes, or anything with billing impact.
(This list is a product decision — review and edit during spec review.)
5.2 Default enabled category allowlist (admin-editable)
Ships enabled by default; Account Owners/Admins toggle per account:
password_reset, account_lockout, printer, email_outlook_client, wifi_network_basics, vpn_connect, teams_zoom_av, browser_cache_cookies, peripheral_reconnect, os_restart_update.
(This list is a product decision — review and edit during spec review.)
5.3 Tunables
| Setting | Default | Notes |
|---|---|---|
| MATCH_THRESHOLD | 0.75 | Carried from predecessor spec §8.1. |
| SUGGEST_THRESHOLD | 0.60 | Carried from predecessor spec §8.1. |
| L1_BUILD_MAX_DEPTH | 12 | Force escalate beyond this many nodes. |
| get_model_for_action('l1_realtime_build') | Sonnet | Latency-sensitive; benchmark Sonnet vs Opus during plan. |
| Per-node max_tokens | 1024 | One node is small. |
6. Flywheel capture
On resolve of an ai_build session (l1_session_service.resolve extension):
- Normalize the walked_path into a complete, valid tree_structure (§6.1); approval requires a dict with a real id (see Finding 5 / _create_tree_from_proposal).
- Create a FlowProposal: source='ai_realtime_l1', validated_by_outcome=true, proposed_flow_data={tree_structure, match_keywords}, l1_session_id=<this session> (NOT source_session_id; see §6.2 / Finding 1), linked_ticket_id/kind=<session ticket>, problem_domain=<category>, status='pending'.
- Run the existing _find_similar_pending_proposal dedupe: merge (bump supporting count) if a near-duplicate pending proposal exists, else insert.
- Emit the existing proposal.pending notification to the review queue.
Engineers promote good proposals to authored flows in the existing review queue. Promoted flows are then found by flow_matching_engine on future intakes → the KB compounds. source='ai_realtime_l1' rows surface in the existing queue (badge them "AI · outcome-validated").
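The capture-then-dedupe step could look roughly like the sketch below. It works on plain dicts and an in-memory list instead of the real ORM, the `is_similar` callable stands in for _find_similar_pending_proposal, and the `supporting_count` and `ticket_id` field names are assumptions for illustration only.

```python
from typing import Callable, List, Sequence


def capture_ai_build_proposal(
    session: dict,
    tree_structure: dict,
    match_keywords: Sequence[str],
    pending: List[dict],                               # stand-in for the pending-proposal table
    is_similar: Callable[[dict, dict], bool],          # stand-in for the existing dedupe check
) -> dict:
    candidate = {
        "source": "ai_realtime_l1",
        "validated_by_outcome": True,
        "proposed_flow_data": {
            "tree_structure": tree_structure,
            "match_keywords": list(match_keywords),
        },
        "l1_session_id": session["id"],     # L1 sessions never populate source_session_id
        "source_session_id": None,
        "linked_ticket_id": session.get("ticket_id"),
        "problem_domain": session["category"],
        "status": "pending",
        "supporting_count": 1,              # hypothetical counter name
    }
    # Merge into a near-duplicate pending proposal if one exists, else insert.
    for existing in pending:
        if is_similar(existing, candidate):
            existing["supporting_count"] = existing.get("supporting_count", 1) + 1
            return existing
    pending.append(candidate)
    return candidate
```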
6.1 Tree normalization (Finding 5)
The live walked_path holds only traversed nodes, and "generate" is a runtime sentinel, not a real edge — that is not a valid tree and would fail the _create_tree_from_proposal guard (tree_structure must be a dict with an id). At resolve time, ai_tree_builder.normalize_walked_path(walked_path) -> tree_structure produces a complete object:
- Assign stable string ids to every node; the first node becomes the root and tree_structure.id = root id.
- question nodes: the traversed branch (yes/no the tech actually chose) points to the next traversed node; the untraversed branch points to a terminal {node_type: 'needs_review', text: 'Branch not explored during the originating call'} stub.
- instruction nodes point to the next traversed node.
- The traversal ends at the real terminal node (resolved or escalate).

This yields a structurally valid, reviewable tree: engineers fill in the needs_review branches when promoting. (Trees are tree_type='troubleshooting'.)
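The normalization rules can be sketched as follows. Two things here are assumptions rather than the spec's definition: each question entry in walked_path is taken to carry an `answer` key ('yes' or 'no'), and the output container shape (`{"id", "tree_type", "nodes"}` with nodes keyed by id) is invented for illustration, since the real tree_structure layout is whatever _create_tree_from_proposal expects.

```python
from typing import Dict, List

STUB_TEXT = "Branch not explored during the originating call"


def normalize_walked_path(walked_path: List[dict]) -> dict:
    if not walked_path:
        raise ValueError("walked_path is empty")
    if walked_path[-1]["node_type"] not in ("resolved", "escalate"):
        raise ValueError("walked_path must end at a terminal node")

    # Reassign stable ids in traversal order; the first node is the root.
    ids = [f"n{i + 1}" for i in range(len(walked_path))]
    nodes: Dict[str, dict] = {}

    for i, entry in enumerate(walked_path):
        node_id = ids[i]
        next_id = ids[i + 1] if i + 1 < len(ids) else None
        kind = entry["node_type"]
        node = {"node_type": kind, "id": node_id, "text": entry["text"]}

        if kind == "question":
            taken = entry["answer"]
            if taken not in ("yes", "no"):
                raise ValueError(f"question {node_id} has no recorded answer")
            # The branch the tech did not take becomes a needs_review stub.
            stub_id = f"{node_id}_unexplored"
            nodes[stub_id] = {"node_type": "needs_review", "id": stub_id, "text": STUB_TEXT}
            node["yes_next"] = next_id if taken == "yes" else stub_id
            node["no_next"] = next_id if taken == "no" else stub_id
        elif kind == "instruction":
            node["next"] = next_id
        elif kind == "escalate":
            node["reason_category"] = entry.get("reason_category")

        nodes[node_id] = node

    return {"id": ids[0], "tree_type": "troubleshooting", "nodes": nodes}
```

No "generate" sentinel survives in the output: every edge points either at a real traversed node or at a stub.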
6.2 FlowProposal L1 source linkage (Finding 1 — Blocker)
FlowProposal.source_session_id is currently nullable=False FK → ai_sessions, and the review UI (ProposalDetail.tsx) links the "Source Session" to /pilot/{source_session_id} (a FlowPilot chat surface). An L1 ai_build session is an l1_walk_session, not an ai_session, so it cannot populate source_session_id. Changes:
- Model/migration: add FlowProposal.l1_session_id (nullable FK → l1_walk_sessions.id, ondelete=SET NULL, indexed). Make source_session_id nullable. Add CHECK ((source_session_id IS NOT NULL) <> (l1_session_id IS NOT NULL)), so exactly one source is set.
- Review UI: when l1_session_id is set (source ai_realtime_l1), render the "Source" block as a read-only walked-path summary (problem statement + the resolved path) instead of a /pilot/... link. Existing ai_session-sourced proposals are unchanged.
- Tree promotion: _create_tree_from_proposal sets Tree.source_session_id from the proposal; for L1-sourced proposals leave it NULL (confirm Tree.source_session_id is nullable; if not, include in the migration).
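To see why the `<>` form enforces "exactly one source", the constraint can be tried against an in-memory SQLite table. SQLite is only a convenient stand-in for the production database here, and the table is reduced to the two columns involved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE flow_proposals (
        id INTEGER PRIMARY KEY,
        source_session_id TEXT,
        l1_session_id TEXT,
        -- True only when exactly one of the two columns is non-NULL (an XOR).
        CHECK ((source_session_id IS NOT NULL) <> (l1_session_id IS NOT NULL))
    )
    """
)


def try_insert(source_session_id, l1_session_id) -> bool:
    """Return True if the row is accepted, False if the CHECK rejects it."""
    try:
        conn.execute(
            "INSERT INTO flow_proposals (source_session_id, l1_session_id) VALUES (?, ?)",
            (source_session_id, l1_session_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

Rows with only source_session_id or only l1_session_id are accepted; rows with both set or both NULL are rejected.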
7. Minimum escalation handoff
On escalate (terminal node reached, or the L1 hits the Escalate modal during an ai_build walk) — extends l1_session_service.escalate. The engineer-visible surface is the primary, dependency-free handoff; the bell-badge notification is a thin addition that requires three specific extensions to the FlowPilot-shaped notification system (Finding 3).
- Engineer-visible surface (primary). Escalated L1 sessions appear in an engineer-facing list: extend the existing /escalations queue (EscalationQueuePage) with an "L1 escalations" section, backed by a new GET /l1/escalations. Each row: problem statement, walked-path summary, who escalated, when, reason category. Pollable; no dependency on the notification subsystem.
- Bell-badge notification (Finding 3, three explicit changes). The notification system is currently FlowPilot-specific:
  - VALID_EVENTS (backend/app/schemas/notification.py) has no l1.session.escalated. Add it to the set (and to the default events_enabled map).
  - _build_notification_link (notification_service.py) only knows session.escalated → /pilot/{session_id}?pickup=true. Add l1.session.escalated → /escalations and add a body template for the new event. The existing session.escalated event must NOT be reused: an L1 escalation has no ai_session and no /pilot pickup flow.
  - Default recipients (_resolve_recipients, ~line 184) are owner/admin/team_admin only; ordinary engineers are excluded. Since L1 escalations must reach engineers who can pick them up, the call must pass explicit target_user_ids = the account's active engineer-role users (plus owner/admin), not rely on the default set.
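The link mapping and the explicit recipient list might be sketched like this. The `User` dataclass, its field names, and both helper functions are hypothetical; they only illustrate the two behaviours described above and do not mirror notification_service's real signatures.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Event → link template. The L1 event deliberately gets its own entry
# instead of reusing the FlowPilot pickup link.
LINK_TEMPLATES = {
    "session.escalated": "/pilot/{session_id}?pickup=true",
    "l1.session.escalated": "/escalations",
}

ESCALATION_ROLES = {"engineer", "owner", "admin"}


@dataclass(frozen=True)
class User:               # hypothetical minimal user record
    id: str
    account_role: str
    is_active: bool


def build_notification_link(event: str, session_id: str) -> str:
    return LINK_TEMPLATES[event].format(session_id=session_id)


def l1_escalation_recipients(users: Iterable[User]) -> List[str]:
    """Explicit target_user_ids: active engineers plus owner/admin."""
    return sorted(u.id for u in users
                  if u.is_active and u.account_role in ESCALATION_ROLES)
```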
Still deferred (documented, not built): PSA ticket reassign, escalation-package markdown generation, AI chat handoff/session creation.
8. Data model & migrations
Migration 1 — ai_build session kind.
- Extend l1_walk_sessions ck_l1_walk_sessions_session_kind CHECK to include 'ai_build'.
- Extend ck_l1_walk_sessions_target_consistency: for ai_build, both flow_id and flow_proposal_id are NULL (same as adhoc).
Migration 2 — account L1 category settings.
- Add accounts.enabled_l1_categories JSONB NOT NULL DEFAULT '<default allowlist>'::jsonb (list of category keys). RLS already covers accounts.
Migration 3 — FlowProposal L1 source linkage (Finding 1).
- Add flow_proposals.l1_session_id nullable FK → l1_walk_sessions.id (ondelete=SET NULL, indexed).
- Make flow_proposals.source_session_id nullable (was NOT NULL).
- Add CHECK ((source_session_id IS NOT NULL) <> (l1_session_id IS NOT NULL)), so exactly one source is set.
- Confirm trees.source_session_id is nullable (L1-promoted trees leave it NULL); if not, drop its NOT NULL here.
No new tables — live build state rides on the existing l1_walk_sessions.walked_path; persisted trees ride on FlowProposal.proposed_flow_data.
9. API surface
| Method | Path | Notes | Auth |
|---|---|---|---|
| POST | /l1/intake | Extended: now runs match_or_build; response carries outcome (matched/suggest/out_of_scope/build). | require_l1_or_coverage |
| POST | /l1/sessions/{id}/next-node | New: record answer/ack on current node, generate + return next node (or terminal). | require_l1_or_coverage |
| GET | /accounts/me/l1-categories | New: list enabled + available categories + hard-floor (read-only) list. | require_l1_or_above (read) |
| PATCH | /accounts/me/l1-categories | New: set enabled categories. | require_account_owner_or_admin (Finding 6) |
| GET | /l1/escalations | New (or extend /escalations): engineer-visible escalated-from-L1 list. | require_engineer_or_admin |
Finding 6 — new auth dep. The category control is an owner/admin setting, but require_engineer_or_admin also admits engineer. No existing dep matches "owner or account-admin" (require_account_owner is owner-only; require_admin is super-admin-only). Add require_account_owner_or_admin to deps.py: allow super_admin bypass, then account_role in ('owner', 'admin'), else 403. Use it for the PATCH.
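The check order described above (super_admin bypass, then account role, else 403) can be sketched framework-free as below. The `User` fields and the `Forbidden` exception are assumptions; the real dependency in deps.py would take the current user from the request and raise the web framework's own HTTP error.

```python
from dataclasses import dataclass


class Forbidden(Exception):
    """Stand-in for the framework's HTTP 403 error."""
    status_code = 403


@dataclass(frozen=True)
class User:                      # hypothetical minimal user record
    account_role: str
    is_super_admin: bool = False


def require_account_owner_or_admin(user: User) -> User:
    if user.is_super_admin:                       # platform-level bypass
        return user
    if user.account_role in ("owner", "admin"):   # account-level owner/admin only
        return user
    raise Forbidden("account owner or admin required")
```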
10. Frontend
- L1WalkTreeVariant: replace synthetic stepping with real node rendering driven by /next-node; render question (yes/no), instruction (acknowledge), resolved/escalate (terminal). Per-node loading affordance. Disclaimer banner mounted for ai_build sessions.
- L1Dashboard intake handler: dispatch on match_or_build outcome (suggest prompt, out-of-scope prompt, build → walker).
- New admin settings panel (under /account): toggle enabled L1 categories; show hard-floor list as read-only "always excluded."
- Engineer escalations surface: "L1 escalations" section/list.
11. Testing strategy
Backend unit:
- ai_tree_builder.generate_next_node: returns valid node per type; escalate-early when path is deep / model signals exhaustion; regenerate-then-escalate on malformed/forbidden output; depth cap forces escalate.
- Per-node validation: forbidden-action patterns rejected; hard-floor enforced even if a category is enabled.
- match_or_build: all four outcomes at threshold boundaries (score == MATCH_THRESHOLD, == SUGGEST_THRESHOLD); match runs before the category gate (a matched published flow is returned even when its category is disabled; Finding 4); force_build skips match but still applies the category gate; out_of_scope only on the build path when category disabled/unknown.
- classify: known categories map correctly; unknown → out_of_scope.
- normalize_walked_path (Finding 5): produces a dict with a root id; untraversed question branches become needs_review stubs; output passes the _create_tree_from_proposal validity guard.
- Flywheel capture: resolve creates ai_realtime_l1 proposal with l1_session_id set and source_session_id NULL (Finding 1); CHECK accepts exactly-one-source; dedupe merges near-duplicate.
- Escalation handoff: l1.session.escalated accepted by the notification schema (Finding 3); link resolves to /escalations; explicit engineer target_user_ids receive it; escalated session appears in GET /l1/escalations.
Backend integration:
- Full intake→build→resolve creates an outcome-validated proposal.
- Intake→build→escalate notifies engineers and surfaces in the escalations list.
- Migrations roundtrip; ai_build CHECK + target-consistency hold.
Frontend e2e (extend l1-workspace.spec.ts):
- L1 intake with no match → AI build → answer nodes → resolve → proposal created.
- L1 build → escalate node → escalate handoff.
- Admin toggles a category off → that problem class returns out-of-scope.
AI quality (plan-time): small eval set of common L1 problems; assert trees stay in-scope, reach resolution or escalate cleanly, never emit hard-floor actions. Benchmark Sonnet vs Opus for the model-tier decision.
12. Risks & open questions
- Hallucinated-but-plausible steps for niche/company-specific apps. Mitigation: classification gate + constrained prompt + escalate-early + disclaimer. Residual risk accepted for v1; eval set bounds it.
- Latency on a live call. Node-by-node means ~2–4s per branch. Mitigation: Sonnet, small per-node token budget, clear loading affordance. Benchmark at plan time.
- Coherence across independently-generated nodes. Mitigation: full walked-path context every call.
- Classification accuracy. A misclassify could wrongly gate a valid problem out, or let a borderline one through. Mitigation: hard floor is category-independent; out-of-scope still offers adhoc/escalate (no dead end).
- Open (product, for spec review): the default category allowlist (§5.2) and the hard-floor list (§5.1) — confirm/edit. Model tier — confirm Sonnet pending benchmark.
13. Out of scope (restated)
KB ingestion + connectors, RAG grounding, PSA reassign, escalation-package generation, AI chat handoff. Each is its own later phase with its own spec.
Also deferred (surfaced in review):
- Matching against unpromoted FlowProposals (Finding 2). flow_matching_engine matches published flows only. Extending it to also surface outcome-validated drafts before promotion is a later enhancement; Phase 2A relies on engineer promotion (draft → published flow → matchable).
14. Review revisions (2026-05-29 Codex review)
All six findings verified against code and resolved in this spec:
- Blocker, FlowProposal source linkage: §6.2 + §8 Migration 3 (new nullable l1_session_id, source_session_id made nullable, exactly-one CHECK, review-UI link change).
- High, match scope: §3 (match published flows only; proposal-matching deferred §13).
- High, escalation notification: §7 (engineer surface is primary; three explicit notification-system changes enumerated).
- Medium, gate ordering: §3 + §5 Layer 1 (match first; category gate only on the build path).
- Medium, flywheel tree shape: §6.1 (normalize_walked_path produces a valid tree with root id; unexplored branches → needs_review stubs).
- Medium, category write auth: §9 (new require_account_owner_or_admin dep; require_engineer_or_admin was too broad).
require_account_owner_or_admindep;require_engineer_or_adminwas too broad).