Brainstormed design for real-time AI tree building when no KB/flow matches. Overrides the original "no empty-KB build" rule: build from generic L1 knowledge under a layered safety model (classification gate, constrained generation, per-node validation with a hard floor, standing disclaimer). Approach C — dedicated ai_tree_builder + match_or_build orchestrator, reusing flow_matching_engine and the knowledge_flywheel proposal pipeline. Scope: streaming node-by-node builder, admin-configurable categories, flywheel capture of resolved trees, minimum escalation handoff (notify + engineer surface). KB ingestion/connectors, PSA reassign, escalation package, and AI chat handoff deferred to later phases. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
L1 AI Decision-Tree Builder — Phase 2A Design
Status: Draft for review
Date: 2026-05-29
Author: previous session (brainstorming)
Predecessor: 2026-05-28-l1-workspace-design.md (full L1 vision), 2026-05-28-l1-workspace-phase-1-acceptance.md (what shipped in Phase 1)
1. Goal
When an L1 tech describes a problem and there is no matching authored flow or AI draft, the platform builds a yes/no decision tree in real time from the model's general L1 knowledge and walks the tech through it node by node. Scoped to L1-appropriate troubleshooting: simple yes/no questions and reversible step-by-step instructions. Successful trees are captured as outcome-validated drafts for engineer review, compounding the account's knowledge base from real resolutions.
This overrides the original spec's "no empty-KB build" rule (§8.1 of the predecessor), which aborted to a degradation screen when no KB existed. Instead of aborting, we build from generic knowledge under a layered safety model.
KB grounding (RAG over ingested documents) is explicitly deferred to Phase 2B — Phase 2A builds from generic knowledge only, plus matching against already-authored flows.
2. Scope
In scope (Phase 2A):
- match_or_build orchestrator inserted at L1 intake (match-first, build-on-miss).
- ai_tree_builder service: node-by-node ("streaming") tree generation, constrained + escalate-early.
- Admin-configurable L1 category allowlist (Account Owner/Admin control panel).
- Standing AI-disclaimer banner on AI-built walks.
- Flywheel capture: resolved AI trees become outcome-validated FlowProposals.
- Minimum escalation handoff: engineer bell-badge notification + an engineer-visible "escalated from L1" surface.
Deferred:
- KB document ingestion + connectors (IT Glue, Hudu, SharePoint/OneDrive) — Phase 2B.
- RAG grounding of the builder on ingested KB — Phase 2B.
- PSA ticket reassign on escalation, escalation-package generation, AI chat handoff — later phase.
- BuildAbortedNoKB screen from the original spec: dropped (superseded by build-from-generic).
3. Architecture (Approach C)
Dedicated builder for the constrained node generation; reuse existing rails for matching and capture.
New services:
| File | Responsibility |
|---|---|
| backend/app/services/match_or_build.py | Orchestrator. match_or_build(account_id, problem_text, ticket_ref, *, force_build=False) -> MatchOrBuildResult. Classify → category gate → match pass → build/suggest/out-of-scope decision. |
| backend/app/services/ai_tree_builder.py | Node-by-node generation. generate_next_node(problem_text, category, walked_path) -> TreeNode. Reuses get_ai_provider + generate_json + parse_llm_json. Owns the constrained system prompt and per-node validation. |
| backend/app/services/l1_category_service.py | Read/write an account's enabled L1 categories; expose the default allowlist and the always-forbidden hard floor. |
Reused as-is:
- flow_matching_engine.find_matches(): semantic + keyword + recency match pass.
- knowledge_flywheel proposal-creation + dedupe (_find_similar_pending_proposal): outcome-validated capture.
- notification_service: engineer escalation notification.
- Phase 1 L1WalkTreeVariant walker: its stubbed synthetic-step UI is replaced by real AI node rendering.
Intake decision flow:
POST /l1/intake (problem_statement, customer_*, force_build?)
  → match_or_build(account_id, problem_text, ticket_ref, force_build):
      1. category = classify(problem_text)                    # new
      2. if category not in account.enabled_l1_categories:
             return {outcome: 'out_of_scope', category}
      3. if not force_build:
             hits = flow_matching_engine.find_matches(problem_text)
             best = max(hits, key=score, default=None)        # None when there are no hits
             if best and best.score >= MATCH_THRESHOLD:
                 return {outcome: 'matched', target_id, session_kind}   # flow|proposal
             if best and best.score >= SUGGEST_THRESHOLD:
                 return {outcome: 'suggest', near_miss, can_build: true}
      4. return {outcome: 'build', session_kind: 'ai_build', category}
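The decision flow above can be sketched as a small pure function, which makes the threshold boundaries easy to unit-test. This is an illustration under assumptions, not the planned implementation: the `Hit` dataclass, the `decide` name, and the injected `classify` / `find_matches` callables are hypothetical stand-ins for the real services, and the sketch guards the case where the match pass returns no hits at all.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

MATCH_THRESHOLD = 0.75    # defaults from the tunables in §5.3
SUGGEST_THRESHOLD = 0.60


@dataclass
class Hit:
    """Hypothetical shape of one match result (not the real engine's type)."""
    target_id: str
    session_kind: str  # 'flow' or 'proposal'
    score: float


def decide(
    problem_text: str,
    enabled_categories: set,
    classify: Callable[[str], str],
    find_matches: Callable[[str], Sequence[Hit]],
    force_build: bool = False,
) -> dict:
    # Step 1-2: classification gate.
    category = classify(problem_text)
    if category not in enabled_categories:
        return {"outcome": "out_of_scope", "category": category}

    # Step 3: match pass, skipped when the tech chose "Build new".
    if not force_build:
        hits = find_matches(problem_text)
        best = max(hits, key=lambda h: h.score, default=None)
        if best is not None and best.score >= MATCH_THRESHOLD:
            return {"outcome": "matched", "target_id": best.target_id,
                    "session_kind": best.session_kind}
        if best is not None and best.score >= SUGGEST_THRESHOLD:
            return {"outcome": "suggest", "near_miss": best.target_id,
                    "can_build": True}

    # Step 4: nothing close enough, so build from generic knowledge.
    return {"outcome": "build", "session_kind": "ai_build", "category": category}
```

Both thresholds are inclusive here, matching the `>=` comparisons in the flow, so a score of exactly 0.75 matches and exactly 0.60 suggests.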
Frontend dispatches on outcome:
- matched → start a flow/proposal walk (Phase 1 paths).
- suggest → inline prompt ("Found a similar flow: use it, or build new?"); "Build new" re-calls intake with force_build=true.
- out_of_scope → inline prompt offering ad-hoc walk or escalate-without-walk (Phase 1 paths).
- build → create an ai_build session, navigate to the walker, fetch the first node.
4. The streaming build & node schema
ai_tree_builder.generate_next_node() is called with the problem statement, the resolved category, and the full walked path so far. It returns exactly one node. Passing the whole path every call is what keeps independently-generated nodes coherent and lets the model decide when it has exhausted safe steps.
Node shape (proposed_flow_data node, also the live walked_path entry):
// question — yes/no branch; both branches regenerate
{ "node_type": "question", "id": "n3", "text": "Is the printer showing a 'ready' status light?",
"yes_next": "generate", "no_next": "generate" }
// instruction — a single safe, reversible action; advances on acknowledgement
{ "node_type": "instruction", "id": "n4", "text": "Unplug the printer for 30 seconds, then power it back on.",
"next": "generate" }
// resolved — terminal success
{ "node_type": "resolved", "id": "n7", "text": "Printer is back online and printing test pages." }
// escalate — terminal handoff (escalate-early safety valve)
{ "node_type": "escalate", "id": "n7", "reason_category": "exhausted_safe_steps",
"text": "This looks like a driver-level fault beyond L1 scope — escalating to engineering." }
"generate" is a sentinel meaning "call generate_next_node again with the new answer appended." The first node is fetched synchronously on ai_build session creation (intake). Each subsequent node is fetched when the tech answers/acknowledges — target latency ~2–4s per node; show a per-node "Thinking through the next step…" affordance.
Endpoint: POST /l1/sessions/{id}/next-node body {node_id, answer?: 'yes'|'no', acknowledged?: true, note?}. Appends the answered node to walked_path, then generates and returns the next node (or a terminal node). Replaces the Phase 1 synthetic stepping in L1WalkTreeVariant.
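A minimal sketch of what the next-node step might do, assuming the session is a plain dict that also tracks a `current_node` (that key, the `record_and_advance` name, and the stale-`node_id` check are assumptions for illustration; the spec only states that the answered node is appended to walked_path before the next node is generated).

```python
from typing import Callable, Optional


def record_and_advance(
    session: dict,
    node_id: str,
    generate_next_node: Callable[[str, str, list], dict],
    answer: Optional[str] = None,
    acknowledged: bool = False,
    note: Optional[str] = None,
) -> dict:
    current = session["current_node"]
    # Assumption: reject answers that target a node the session has moved past.
    if current["id"] != node_id:
        raise ValueError("node_id does not match the session's current node")

    if current["node_type"] == "question":
        if answer not in ("yes", "no"):
            raise ValueError("question nodes need answer 'yes' or 'no'")
    elif current["node_type"] == "instruction":
        if not acknowledged:
            raise ValueError("instruction nodes must be acknowledged")
    else:
        raise ValueError("session is already at a terminal node")

    # Append the answered node, then pass the whole path to the builder.
    session["walked_path"].append(
        {**current, "answer": answer, "acknowledged": acknowledged, "note": note}
    )
    next_node = generate_next_node(
        session["problem_text"], session["category"], session["walked_path"]
    )
    session["current_node"] = next_node
    return next_node
```

Passing the generator in as a callable keeps the handler testable without a model call.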
5. Safety model (layered)
Layer 1 — classification gate. classify(problem_text) maps the problem to a category via a lightweight model call (low token budget, returns one category key from the enabled set or unknown); on model failure it falls back to keyword matching against category aliases. If the result is not in the account's enabled set (or is unknown), intake returns out_of_scope; no build happens.
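The model-then-keyword fallback in Layer 1 could look roughly like the sketch below. The `model_classify` callable, the `aliases` mapping, and the "most alias hits wins" tie-breaking are assumptions made for illustration; the spec does not define how alias matches are scored.

```python
from typing import Callable, Mapping, Sequence, Set

UNKNOWN = "unknown"


def keyword_classify(problem_text: str, aliases: Mapping[str, Sequence[str]]) -> str:
    """Pick the category with the most alias hits; 'unknown' if none hit."""
    text = problem_text.lower()
    best, best_hits = UNKNOWN, 0
    for category, words in aliases.items():
        hits = sum(1 for word in words if word.lower() in text)
        if hits > best_hits:
            best, best_hits = category, hits
    return best


def classify(
    problem_text: str,
    enabled: Set[str],
    aliases: Mapping[str, Sequence[str]],
    model_classify: Callable[[str, list], str],
) -> str:
    try:
        key = model_classify(problem_text, sorted(enabled))
    except Exception:
        # Model failure: fall back to keyword matching on enabled categories only.
        enabled_aliases = {c: aliases.get(c, ()) for c in sorted(enabled)}
        return keyword_classify(problem_text, enabled_aliases)
    # Anything outside the enabled set is treated as unknown, never trusted.
    return key if key in enabled else UNKNOWN
```

Treating an off-list model answer as `unknown` means a confused or manipulated classifier can only gate a problem out, not let it in.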
Layer 2 — constrained generation. The ai_tree_builder system prompt restricts output to:
- Safe, reversible, observe-or-restart-class steps only (toggle/restart/reconnect/re-enter, check-status questions).
- A hard floor of always-forbidden actions (see §5.1) that NO category may unlock.
- An explicit instruction to emit an escalate node (never guess) once it runs out of in-scope safe steps.
Layer 3 — per-node validation. Server-side, every generated node is checked before being returned:
- Reject (and regenerate once, then escalate) nodes whose text matches forbidden-action patterns (§5.1).
- Enforce a depth cap (default L1_BUILD_MAX_DEPTH = 12): once the walked path hits the cap, force an escalate node.
- Validate node JSON shape (Pydantic); malformed → regenerate once, then escalate.
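The three Layer 3 checks can be sketched together as one wrapper around the generator. This is a dependency-free illustration: the spec names Pydantic for shape validation, while the sketch uses a plain key check; the regex patterns are placeholders rather than the real §5.1 pattern set; and the `depth_cap` / `validation_failed` reason categories are hypothetical (the spec only shows `exhausted_safe_steps`).

```python
import re
from typing import Any, Callable

L1_BUILD_MAX_DEPTH = 12  # default from §5.3

# Required keys per node type, taken from the example node shapes in §4.
REQUIRED_KEYS = {
    "question": {"id", "text", "yes_next", "no_next"},
    "instruction": {"id", "text", "next"},
    "resolved": {"id", "text"},
    "escalate": {"id", "text", "reason_category"},
}

# Placeholder patterns only; the real list would be derived from §5.1.
FORBIDDEN_PATTERNS = [re.compile(p, re.IGNORECASE) for p in (
    r"\bregedit\b|\bregistry\b",
    r"\bformat\b.*\b(disk|drive)\b",
    r"\bas an? administrator\b",
    r"\bdisable\b.*\b(firewall|antivirus|mfa)\b",
)]


def shape_ok(node: Any) -> bool:
    if not isinstance(node, dict):
        return False
    node_type = node.get("node_type")
    if not isinstance(node_type, str) or node_type not in REQUIRED_KEYS:
        return False
    if not REQUIRED_KEYS[node_type] <= set(node):
        return False
    return isinstance(node["text"], str) and bool(node["text"].strip())


def violates_hard_floor(text: str) -> bool:
    return any(p.search(text) for p in FORBIDDEN_PATTERNS)


def escalate_node(node_id: str, reason: str) -> dict:
    return {"node_type": "escalate", "id": node_id, "reason_category": reason,
            "text": "Escalating to engineering."}


def next_safe_node(walked_path: list, generate: Callable[[list], Any]) -> dict:
    node_id = f"n{len(walked_path) + 1}"
    # Depth cap: do not even call the model once the path is at the limit.
    if len(walked_path) >= L1_BUILD_MAX_DEPTH:
        return escalate_node(node_id, "depth_cap")
    # First attempt plus exactly one regeneration, then escalate.
    for _attempt in range(2):
        node = generate(walked_path)
        if shape_ok(node) and not violates_hard_floor(node["text"]):
            return node
    return escalate_node(node_id, "validation_failed")
```

Pattern matching on free text is a coarse backstop; it complements the constrained prompt in Layer 2 rather than replacing it.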
Layer 4 — standing disclaimer. Persistent banner on every ai_build walk:
"These are high-confidence troubleshooting steps, but they come from outside your organization's knowledge base — review them before acting. When in doubt, escalate early."
5.1 Hard floor — always forbidden (admins cannot enable)
Regardless of enabled categories, the builder must never produce steps that:
- Modify the Windows registry, system files, or boot configuration.
- Delete, format, or repartition data/disks; remove user profiles or mailboxes.
- Change credentials, MFA, security/firewall/AV settings, or disable protections.
- Run scripts/commands with elevated/admin privileges.
- Touch domain controllers, DNS, DHCP, or production server config.
- Make purchases, license changes, or anything with billing impact.
(This list is a product decision — review and edit during spec review.)
5.2 Default enabled category allowlist (admin-editable)
Ships enabled by default; Account Owners/Admins toggle per account:
password_reset, account_lockout, printer, email_outlook_client, wifi_network_basics, vpn_connect, teams_zoom_av, browser_cache_cookies, peripheral_reconnect, os_restart_update.
(This list is a product decision — review and edit during spec review.)
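As a rough sketch of how the category service might validate a PATCH payload: the function name is hypothetical, and the sketch assumes the set of available categories is exactly the default allowlist above (the spec does not say whether admins can add custom category keys).

```python
# Default allowlist from §5.2; assumed here to also be the full available set.
DEFAULT_L1_CATEGORIES = (
    "password_reset", "account_lockout", "printer", "email_outlook_client",
    "wifi_network_basics", "vpn_connect", "teams_zoom_av",
    "browser_cache_cookies", "peripheral_reconnect", "os_restart_update",
)


def normalize_enabled_categories(requested: list) -> list:
    """Reject unknown keys, drop duplicates, and return keys in canonical order."""
    wanted = set(requested)
    unknown = sorted(wanted - set(DEFAULT_L1_CATEGORIES))
    if unknown:
        raise ValueError("unknown L1 categories: " + ", ".join(unknown))
    return [c for c in DEFAULT_L1_CATEGORIES if c in wanted]
```

Storing the keys in a canonical order keeps the JSONB value stable across repeated saves of the same selection.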
5.3 Tunables
| Setting | Default | Notes |
|---|---|---|
| MATCH_THRESHOLD | 0.75 | Carried from predecessor spec §8.1. |
| SUGGEST_THRESHOLD | 0.60 | Carried from predecessor spec §8.1. |
| L1_BUILD_MAX_DEPTH | 12 | Force escalate beyond this many nodes. |
| get_model_for_action('l1_realtime_build') | Sonnet | Latency-sensitive; benchmark Sonnet vs Opus during plan. |
| Per-node max_tokens | 1024 | One node is small. |
6. Flywheel capture
On resolve of an ai_build session (l1_session_service.resolve extension):
- Build proposed_flow_data from the walked_path (the nodes that were actually traversed, normalized into a tree structure).
- Create a FlowProposal: source='ai_realtime_l1', validated_by_outcome=true, proposed_flow_data=<tree>, linked_ticket_id/kind=<session ticket>, problem_domain=<category>, status='pending'.
- Run the existing _find_similar_pending_proposal dedupe: merge (bump supporting count) if a near-duplicate pending proposal exists, else insert.
- Emit the existing proposal.pending notification to the review queue.
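The capture steps above might be sketched as follows. Several details are assumptions for illustration: the walked path is taken to include the terminal resolved node, untaken question branches keep the "generate" sentinel, the `{root, nodes}` layout and the `supporting_count` field name are hypothetical, and `find_similar` stands in for the existing `_find_similar_pending_proposal` dedupe.

```python
from typing import Callable, Optional

WALK_ONLY_KEYS = ("answer", "acknowledged", "note")


def build_proposed_flow_data(walked_path: list) -> dict:
    """Link the traversed nodes into a chain along the branches actually taken."""
    nodes = []
    for i, entry in enumerate(walked_path):
        node = {k: v for k, v in entry.items() if k not in WALK_ONLY_KEYS}
        next_id = walked_path[i + 1]["id"] if i + 1 < len(walked_path) else None
        if node["node_type"] == "question":
            taken = "yes_next" if entry.get("answer") == "yes" else "no_next"
            node[taken] = next_id  # the untaken branch stays "generate"
        elif node["node_type"] == "instruction":
            node["next"] = next_id
        nodes.append(node)
    return {"root": nodes[0]["id"] if nodes else None, "nodes": nodes}


def capture_resolved_tree(
    walked_path: list,
    category: str,
    ticket_id: str,
    pending: list,
    find_similar: Callable[[dict, list], Optional[dict]],
) -> dict:
    proposal = {
        "source": "ai_realtime_l1",
        "validated_by_outcome": True,
        "proposed_flow_data": build_proposed_flow_data(walked_path),
        "linked_ticket_id": ticket_id,
        "problem_domain": category,
        "status": "pending",
        "supporting_count": 1,  # hypothetical field name
    }
    existing = find_similar(proposal, pending)
    if existing is not None:
        existing["supporting_count"] += 1  # merge into the near-duplicate
        return existing
    pending.append(proposal)
    return proposal
```

Emitting the proposal.pending notification is left out of the sketch; it would follow the insert-or-merge step.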
Engineers promote good proposals to authored flows in the existing review queue. Promoted flows are then found by flow_matching_engine on future intakes → the KB compounds. No new review UI needed; source='ai_realtime_l1' rows surface in the existing queue (optionally badge them "AI · outcome-validated").
7. Minimum escalation handoff
On escalate (terminal node reached, or the L1 hits the Escalate modal during an ai_build walk) — extends l1_session_service.escalate:
- Notify engineers: notification_service bell-badge event l1.session.escalated to the account's engineers (and is_team_admin/owner). Payload: ticket ref, problem summary, escalation reason category, link.
- Engineer-visible surface: escalated L1 sessions appear in an engineer-facing list. Reuse/extend the existing /escalations queue (EscalationQueuePage) with an "L1 escalations" section, or a dedicated GET /l1/escalations consumed there. Each row shows problem, the walked path summary, who escalated, when.
Still deferred (documented, not built): PSA ticket reassign, escalation-package markdown generation, AI chat handoff/session creation.
8. Data model & migrations
Migration 1 — ai_build session kind.
- Extend l1_walk_sessions ck_l1_walk_sessions_session_kind CHECK to include 'ai_build'.
- Extend ck_l1_walk_sessions_target_consistency: for ai_build, both flow_id and flow_proposal_id are NULL (same as adhoc).
Migration 2 — account L1 category settings.
- Add accounts.enabled_l1_categories JSONB NOT NULL DEFAULT '<default allowlist>'::jsonb (list of category keys). RLS already covers accounts.
No new tables — live build state rides on the existing l1_walk_sessions.walked_path; persisted trees ride on FlowProposal.proposed_flow_data.
9. API surface
| Method | Path | Notes | Auth |
|---|---|---|---|
| POST | /l1/intake | Extended: now runs match_or_build; response carries outcome (matched/suggest/out_of_scope/build). | require_l1_or_coverage |
| POST | /l1/sessions/{id}/next-node | New: record answer/ack on current node, generate + return next node (or terminal). | require_l1_or_coverage |
| GET | /accounts/me/l1-categories | New: list enabled + available categories + hard-floor (read-only) list. | require_l1_or_above (read) |
| PATCH | /accounts/me/l1-categories | New: set enabled categories. | require_engineer_or_admin (owner/admin) |
| GET | /l1/escalations | New (or extend /escalations): engineer-visible escalated-from-L1 list. | require_engineer_or_admin |
10. Frontend
- L1WalkTreeVariant: replace synthetic stepping with real node rendering driven by /next-node; render question (yes/no), instruction (acknowledge), resolved/escalate (terminal). Per-node loading affordance. Disclaimer banner mounted for ai_build sessions.
- L1Dashboard intake handler: dispatch on match_or_build outcome (suggest prompt, out-of-scope prompt, build → walker).
- New admin settings panel (under /account): toggle enabled L1 categories; show hard-floor list as read-only "always excluded."
- Engineer escalations surface: "L1 escalations" section/list.
11. Testing strategy
Backend unit:
- ai_tree_builder.generate_next_node: returns valid node per type; escalate-early when path is deep / model signals exhaustion; regenerate-then-escalate on malformed/forbidden output; depth cap forces escalate.
- Per-node validation: forbidden-action patterns rejected; hard-floor enforced even if a category is enabled.
- match_or_build: all four outcomes at threshold boundaries (score == MATCH_THRESHOLD, == SUGGEST_THRESHOLD), force_build bypasses match, out_of_scope when category disabled.
- classify: known categories map correctly; unknown → out_of_scope.
- Flywheel capture: resolve creates ai_realtime_l1 proposal; dedupe merges near-duplicate.
- Escalation handoff: notification fired; escalated session appears in engineer query.
Backend integration:
- Full intake→build→resolve creates an outcome-validated proposal.
- Intake→build→escalate notifies engineers and surfaces in the escalations list.
- Migrations roundtrip; ai_build CHECK + target-consistency hold.
Frontend e2e (extend l1-workspace.spec.ts):
- L1 intake with no match → AI build → answer nodes → resolve → proposal created.
- L1 build → escalate node → escalate handoff.
- Admin toggles a category off → that problem class returns out-of-scope.
AI quality (plan-time): small eval set of common L1 problems; assert trees stay in-scope, reach resolution or escalate cleanly, never emit hard-floor actions. Benchmark Sonnet vs Opus for the model-tier decision.
12. Risks & open questions
- Hallucinated-but-plausible steps for niche/company-specific apps. Mitigation: classification gate + constrained prompt + escalate-early + disclaimer. Residual risk accepted for v1; eval set bounds it.
- Latency on a live call. Node-by-node means ~2–4s per branch. Mitigation: Sonnet, small per-node token budget, clear loading affordance. Benchmark at plan time.
- Coherence across independently-generated nodes. Mitigation: full walked-path context every call.
- Classification accuracy. A misclassification could wrongly gate a valid problem out, or let a borderline one through. Mitigation: hard floor is category-independent; out-of-scope still offers adhoc/escalate (no dead end).
- Open (product, for spec review): the default category allowlist (§5.2) and the hard-floor list (§5.1) — confirm/edit. Model tier — confirm Sonnet pending benchmark.
13. Out of scope (restated)
KB ingestion + connectors, RAG grounding, PSA reassign, escalation-package generation, AI chat handoff. Each is its own later phase with its own spec.