Merge PR #191: docs: L1 Phase 2A design/plan + plan-taxonomy decision
This commit was merged in pull request #191.
@@ -13,6 +13,18 @@
---

## 2026-05-29 — Single source of truth for plan-tier taxonomy (derive admin UI + validation from `plan_limits`)

**Context:** A prod report ("AI sessions aren't working") traced to the owner account having no paid plan (AI is plan-gated), compounded by a real bug: the admin "Change Plan" dropdown ([`AccountDetailPage.tsx:443-445`](../frontend/src/pages/admin/AccountDetailPage.tsx)) still offered the dead `team` slug (renamed to `enterprise` in migration `4ce3e594cb87`, 2026-05-07) and omitted `starter`/`enterprise`. Selecting "Team" 400s against the hardcoded allow-list in [`admin.py:994`](../backend/app/api/endpoints/admin.py#L994). The dropdown was missed during the 2026-05-07 taxonomy reconciliation because the allowed-plan list is hand-duplicated across ≥6 backend + frontend sites. Second taxonomy-drift incident.

**Decision:** Option B — make `plan_limits` the single source of truth: admin dropdown + pricing/checkout derive plan options from a plans endpoint (filter `is_public`, order by `sort_order`, label from `display_name`), and backend validation checks against actual `plan_limits` rows rather than a hardcoded tuple. Implementation deferred (active work is on another branch); fully specced in [TODO.md](TODO.md). A trivial dropdown-options fix may land first to unblock the admin tool.

**Rejected:** Option A (patch only the `AccountDetailPage` dropdown). Fixes the symptom but leaves the duplication that has now caused two drift incidents — and there is no outage forcing a minimal diff (bug is admin-only and was already worked around via direct Pro assignment). Conflicts with the repo principle "prefer correct architecture over minimal diff."

**Consequences:** New plan tiers become a data change (a `plan_limits` row) instead of a multi-file code edit; UI and validation can no longer drift from the catalog. Requires a public-plans read endpoint (or extending billing state) consumed by the admin UI + pricing page. The `'team'` visibility string (`Tree.visibility` / `StepLibrary.visibility`) is a separate domain and is explicitly out of scope.

---

## 2026-05-28 — Scope Anthropic structured outputs to flat-array JSON only

**Context:** Optimizing the existing Claude API usage (no model change). The Anthropic path in `generate_json` (`ai_provider.py`) had no equivalent to the Gemini path's `response_mime_type="application/json"` — it prompted for JSON and relied on downstream defenses: `_strip_markdown_fences` (ai_fix), `parse_llm_json` (knowledge_flywheel), and `_try_repair_json` (kb_conversion, which balances unclosed braces on truncated output). Anthropic structured outputs (`output_config.format` with a JSON schema) guarantee valid, parseable JSON and would eliminate those band-aids. The question was which of the four `generate_json` call sites can adopt it.
@@ -23,3 +23,5 @@ None selected. Pick from the backlog below or `03-DEVELOPMENT-ROADMAP.md`.
- [ ] **`bg-card-hover` Tailwind class doesn't resolve.** [`frontend/src/components/layout/CommandPalette.tsx:450-451`](../frontend/src/components/layout/CommandPalette.tsx) uses `bg-card-hover` as a Tailwind utility, but Tailwind v4 generates `bg-{token}` from `--color-{token}` — and the token in [`frontend/src/index.css:15`](../frontend/src/index.css) is `--color-bg-card-hover`, which generates `bg-bg-card-hover`, not `bg-card-hover`. So those classes silently produce nothing. Other call sites (KnowledgeBaseCards, TeamSummary, ProposalBanner) use the explicit `hover:bg-[var(--color-bg-card-hover)]` form which works. Fix: change the CommandPalette classes to the explicit-var form, OR add a `--color-card-hover` semantic mapping in index.css alongside `--color-card`. Surfaced 2026-05-01 during impeccable polish sweep.
- [ ] **`ConcludeSessionModal` paused/escalated step forces single-artifact choice — should allow multi-select.** [`frontend/src/components/assistant/ConcludeSessionModal.tsx`](../frontend/src/components/assistant/ConcludeSessionModal.tsx) ~lines 430-474 ("Paused/Escalated: status update options"). Today the engineer clicks ONE of Ticket Notes / Client Update / Email Draft, the buttons disappear, and the result replaces them. Real MSP escalations almost always need at least two: technical notes for the next engineer's PSA AND a non-technical client update. Same for pause (client update + ticket notes for context when resuming). Recommended shape: multi-select with smart defaults — three checkboxes (`☑ Ticket Notes ☑ Client Update ☐ Email Draft`); for `escalated` pre-check Ticket Notes + Client Update; for `paused` pre-check Client Update only. One "Generate" button fires all selected in parallel via existing `aiSessionsApi.generateStatusUpdate(...)` (already supports the three `audience` values: `ticket_notes`, `client_update`, `email_draft`). Each result renders in its own card with its own Copy / Post-to-PSA / Send-Email action. Surfaced 2026-05-01. Feature work, not polish — touches streaming wiring for parallel calls.
- [ ] **Centralize plan-tier taxonomy — derive admin plan dropdown (and validation) from `plan_limits`, not hardcoded lists.** Chose **Option B** over a one-line patch (see [DECISIONS.md](DECISIONS.md) 2026-05-29). *Surfaced by a prod bug (2026-05-28):* the admin "Change Plan" dropdown at [`AccountDetailPage.tsx:443-445`](../frontend/src/pages/admin/AccountDetailPage.tsx) still offered `free / pro / team` — the dead `team` slug (renamed to `enterprise` in migration `4ce3e594cb87`, 2026-05-07) and missing `starter`/`enterprise`. Selecting "Team" sends `{plan:"team"}` to `PUT /admin/accounts/{id}/subscription/plan`, which 400s on `if data.plan not in ("free","pro","starter","enterprise")` ([admin.py:994](../backend/app/api/endpoints/admin.py#L994), duplicated at [:975](../backend/app/api/endpoints/admin.py#L975)). The 400 detail was swallowed by a generic `toast.error('Failed to update plan')` ([AccountDetailPage.tsx:196](../frontend/src/pages/admin/AccountDetailPage.tsx)), so it presented as "AI sessions are down" (real cause: owner account had no paid plan; AI is plan-gated). **Root cause of the root cause:** the allowed-plan list is hand-duplicated across ≥6 sites and drifted (2nd such incident). **Duplication sites to consolidate:** backend [`admin.py:975`](../backend/app/api/endpoints/admin.py#L975) + [`:994`](../backend/app/api/endpoints/admin.py#L994) (tuple, twice), [`schemas/admin.py:128`](../backend/app/schemas/admin.py) (`AdminAccountCreate.plan` Literal), frontend `AccountDetailPage.tsx` dropdown, `AccountsPage.tsx` create-account dropdown, `types/admin.ts` + `types/account.ts` + `types/billing.ts`, `hooks/useSubscription.ts` (`isPaidPlan`), `components/subscription/CheckoutButton.tsx` (`planLabels`). **Source of truth:** the `plan_limits` table (rows: free/starter/pro/enterprise) — `PlanLimitWithBillingResponse` already exposes `is_public` + `sort_order` + `display_name` for ordering/labels. 
**End state (B):** admin dropdown + pricing/checkout derive options from a plans endpoint backed by `plan_limits` (filter `is_public`, order by `sort_order`, label from `display_name`); backend validation checks against actual `plan_limits` rows instead of a hardcoded tuple. **Trivial first commit (land anytime to unblock the admin tool):** fix the `AccountDetailPage` dropdown to `Free / Starter / Pro / Enterprise` and surface the backend error detail in the toast. ⚠️ The `'team'` string in `Tree.visibility` / `StepLibrary.visibility` is a *separate domain* (shared-with-account) — do NOT touch it.
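A minimal sketch of that end state, assuming the `plan_limits` rows are already loaded into memory. The `PlanLimit` dataclass, its `plan` column name, and both helper functions are hypothetical stand-ins for illustration; only `is_public`, `sort_order`, and `display_name` come from the item above.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class PlanLimit:
    """Hypothetical in-memory stand-in for one `plan_limits` row."""
    plan: str            # slug such as "starter" (column name is an assumption)
    display_name: str
    sort_order: int
    is_public: bool


def plan_options(rows: Iterable[PlanLimit]) -> List[dict]:
    """Dropdown options: public plans only, ordered by sort_order."""
    public = sorted((r for r in rows if r.is_public), key=lambda r: r.sort_order)
    return [{"value": r.plan, "label": r.display_name} for r in public]


def validate_plan(slug: str, rows: Iterable[PlanLimit]) -> None:
    """Reject any slug that has no plan_limits row (stands in for the hardcoded tuple)."""
    known: Set[str] = {r.plan for r in rows}
    if slug not in known:
        raise ValueError(f"Unknown plan {slug!r}; expected one of {sorted(known)}")


# Example catalog, deliberately unsorted to show that ordering comes from sort_order.
ROWS = [
    PlanLimit("pro", "Pro", 2, True),
    PlanLimit("free", "Free", 0, True),
    PlanLimit("enterprise", "Enterprise", 3, True),
    PlanLimit("starter", "Starter", 1, True),
]
```

With this shape, adding a tier means adding a row; neither the option list nor the validator needs a code change, and a stale slug such as `team` fails validation with a message that names the valid set.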
1966
docs/superpowers/plans/2026-05-29-l1-ai-tree-builder-phase-2a.md
Normal file
File diff suppressed because it is too large
@@ -0,0 +1,266 @@
# L1 AI Decision-Tree Builder — Phase 2A Design

**Status:** Draft for review
**Date:** 2026-05-29
**Author:** previous session (brainstorming)
**Predecessor:** [`2026-05-28-l1-workspace-design.md`](2026-05-28-l1-workspace-design.md) (full L1 vision), [`2026-05-28-l1-workspace-phase-1-acceptance.md`](2026-05-28-l1-workspace-phase-1-acceptance.md) (what shipped in Phase 1)

---

## 1. Goal

When an L1 tech describes a problem and there is **no matching authored flow or AI draft**, the platform builds a yes/no decision tree **in real time from the model's general L1 knowledge** and walks the tech through it node by node. Scoped to L1-appropriate troubleshooting: simple yes/no questions and reversible step-by-step instructions. Successful trees are captured as outcome-validated drafts for engineer review, compounding the account's knowledge base from real resolutions.

This **overrides** the original spec's "no empty-KB build" rule (§8.1 of the predecessor), which aborted to a degradation screen when no KB existed. Instead of aborting, we build from generic knowledge under a layered safety model.

KB grounding (RAG over ingested documents) is **explicitly deferred to Phase 2B** — Phase 2A builds from generic knowledge only, plus matching against already-authored flows.

## 2. Scope

**In scope (Phase 2A):**
- `match_or_build` orchestrator inserted at L1 intake (match-first, build-on-miss).
- `ai_tree_builder` service: node-by-node ("streaming") tree generation, constrained + escalate-early.
- Admin-configurable L1 category allowlist (Account Owner/Admin control panel).
- Standing AI-disclaimer banner on AI-built walks.
- Flywheel capture: resolved AI trees become outcome-validated `FlowProposal`s.
- Minimum escalation handoff: engineer bell-badge notification + an engineer-visible "escalated from L1" surface.

**Deferred:**
- KB document ingestion + connectors (IT Glue, Hudu, SharePoint/OneDrive) — Phase 2B.
- RAG grounding of the builder on ingested KB — Phase 2B.
- PSA ticket reassign on escalation, escalation-package generation, AI chat handoff — later phase.
- `BuildAbortedNoKB` screen from the original spec — **dropped** (superseded by build-from-generic).
## 3. Architecture (Approach C)

Dedicated builder for the constrained node generation; reuse existing rails for matching and capture.

**New services:**
| File | Responsibility |
|---|---|
| `backend/app/services/match_or_build.py` | Orchestrator. `match_or_build(account_id, problem_text, ticket_ref, *, force_build=False) -> MatchOrBuildResult`. Classify → category gate → match pass → build/suggest/out-of-scope decision. |
| `backend/app/services/ai_tree_builder.py` | Node-by-node generation. `generate_next_node(problem_text, category, walked_path) -> TreeNode`. Reuses `get_ai_provider` + `generate_json` + `parse_llm_json`. Owns the constrained system prompt and per-node validation. |
| `backend/app/services/l1_category_service.py` | Read/write an account's enabled L1 categories; expose the default allowlist and the always-forbidden hard floor. |

**Reused as-is:**
- `flow_matching_engine.find_matches()` — semantic + keyword + recency match pass.
- `knowledge_flywheel` proposal-creation + dedupe (`_find_similar_pending_proposal`) — outcome-validated capture.
- `notification_service` — engineer escalation notification.
- Phase 1 `L1WalkTreeVariant` walker — its stubbed synthetic-step UI is replaced by real AI node rendering.

**Intake decision flow:**

Order matters: **match first, gate only the build path.** The category allowlist exists to bound *generic AI building* for safety — it must not block a human-authored flow that already exists for that problem. So matching against published flows runs before any category check; the category gate applies only when we fall through to building.

```
POST /l1/intake  (problem_statement, customer_*, force_build?)
  → match_or_build(account_id, problem_text, problem_domain, ticket_ref, force_build):
      1. if not force_build:
           hits = flow_matching_engine.find_matches(problem_text, problem_domain, account_id)
           best = max(hits, key=lambda h: h.score, default=None)   # published flows (Trees) only
           if best and best.score >= MATCH_THRESHOLD:
               return {outcome: 'matched', flow_id, session_kind: 'flow'}
           if best and best.score >= SUGGEST_THRESHOLD:
               return {outcome: 'suggest', near_miss, can_build: true}
      2. category = classify(problem_text)          # new — only on build path
      3. if category not in account.enabled_l1_categories:
           return {outcome: 'out_of_scope', category}
      4. return {outcome: 'build', session_kind: 'ai_build', category}
```

**Match scope (Finding 2):** `flow_matching_engine.find_matches()` matches **published flows (`trees`) only** — it returns `{tree_id, tree_name, score, ...}` and has no notion of `FlowProposal`s. Phase 2A therefore matches against published flows only; the `matched` outcome is always `session_kind: 'flow'`. This is sufficient because the flywheel promotes good AI drafts to published flows (§6), which then become matchable on future intakes. Matching against not-yet-promoted proposals is a deferred enhancement (would require extending the engine), noted in §13.

Frontend dispatches on `outcome`:
- `matched` → start a `flow` walk (Phase 1 path).
- `suggest` → inline prompt ("Found a similar flow — use it, or build new?"); "Build new" re-calls intake with `force_build=true` (which skips the match pass and runs the category gate before building).
- `out_of_scope` → inline prompt offering ad-hoc walk or escalate-without-walk (Phase 1 paths).
- `build` → create an `ai_build` session, navigate to the walker, fetch the first node.
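The intake decision flow can be sketched as a pure function, which makes the "match first, gate only the build path" ordering easy to test at the threshold boundaries. This is an illustration under stated assumptions, not the planned implementation: the `Match` dataclass and the injected `find_matches` / `classify` callables are hypothetical stand-ins for `flow_matching_engine` and the Layer 1 classifier, and the signature is simplified.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set

MATCH_THRESHOLD = 0.75     # defaults listed in §5.3
SUGGEST_THRESHOLD = 0.60


@dataclass
class Match:
    """Hypothetical stand-in for one find_matches() hit."""
    tree_id: str
    score: float


def match_or_build(
    problem_text: str,
    enabled_categories: Set[str],
    find_matches: Callable[[str], List[Match]],
    classify: Callable[[str], str],
    force_build: bool = False,
) -> dict:
    # Step 1: the match pass runs first and is never gated by category.
    if not force_build:
        best: Optional[Match] = max(
            find_matches(problem_text), key=lambda m: m.score, default=None
        )
        if best is not None and best.score >= MATCH_THRESHOLD:
            return {"outcome": "matched", "flow_id": best.tree_id, "session_kind": "flow"}
        if best is not None and best.score >= SUGGEST_THRESHOLD:
            return {"outcome": "suggest", "near_miss": best.tree_id, "can_build": True}
    # Steps 2-3: only the build path is classified and gated.
    category = classify(problem_text)
    if category not in enabled_categories:
        return {"outcome": "out_of_scope", "category": category}
    # Step 4: fall through to a real-time AI build.
    return {"outcome": "build", "session_kind": "ai_build", "category": category}
```

Injecting the matcher and classifier keeps the ordering rule visible in one place: a hit at or above `MATCH_THRESHOLD` returns before `classify` is ever called, so a disabled category cannot block an existing flow.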
## 4. The streaming build & node schema

`ai_tree_builder.generate_next_node()` is called with the problem statement, the resolved category, and the **full walked path so far**. It returns exactly one node. Passing the whole path every call is what keeps independently-generated nodes coherent and lets the model decide when it has exhausted safe steps.

**Node shape (`proposed_flow_data` node, also the live `walked_path` entry):**
```json
// question — yes/no branch; both branches regenerate
{ "node_type": "question", "id": "n3", "text": "Is the printer showing a 'ready' status light?",
  "yes_next": "generate", "no_next": "generate" }

// instruction — a single safe, reversible action; advances on acknowledgement
{ "node_type": "instruction", "id": "n4", "text": "Unplug the printer for 30 seconds, then power it back on.",
  "next": "generate" }

// resolved — terminal success
{ "node_type": "resolved", "id": "n7", "text": "Printer is back online and printing test pages." }

// escalate — terminal handoff (escalate-early safety valve)
{ "node_type": "escalate", "id": "n7", "reason_category": "exhausted_safe_steps",
  "text": "This looks like a driver-level fault beyond L1 scope — escalating to engineering." }
```

`"generate"` is a sentinel meaning "call `generate_next_node` again with the new answer appended." The first node is fetched synchronously on `ai_build` session creation (intake). Each subsequent node is fetched when the tech answers/acknowledges — target latency ~2–4s per node; show a per-node "Thinking through the next step…" affordance.

**Endpoint:** `POST /l1/sessions/{id}/next-node` body `{node_id, answer?: 'yes'|'no', acknowledged?: true, note?}`. Appends the answered node to `walked_path`, then generates and returns the next node (or a terminal node). Replaces the Phase 1 synthetic stepping in `L1WalkTreeVariant`.
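One way to picture the `next-node` step is as a small function that records the tech's response and then resolves the `"generate"` sentinel by handing the full path back to the builder. This is a sketch only: `record_and_advance` is a hypothetical name, the builder is injected as a plain callable, and storing the response under `answer` / `acknowledged` keys on the path entry is an assumption the design does not spell out.

```python
from typing import Callable, List, Optional

TERMINAL = {"resolved", "escalate"}


def record_and_advance(
    walked_path: List[dict],
    current: dict,
    generate_next_node: Callable[[List[dict]], dict],  # stand-in for ai_tree_builder
    answer: Optional[str] = None,
    acknowledged: bool = False,
) -> dict:
    """Append the answered node to walked_path, then fetch exactly one more node."""
    kind = current["node_type"]
    if kind in TERMINAL:
        raise ValueError("walk already ended; there is no next node to generate")
    if kind == "question":
        if answer not in ("yes", "no"):
            raise ValueError("question nodes need answer 'yes' or 'no'")
        entry = {**current, "answer": answer}
    elif kind == "instruction":
        if not acknowledged:
            raise ValueError("instruction nodes must be acknowledged")
        entry = {**current, "acknowledged": True}
    else:
        raise ValueError(f"unknown node_type {kind!r}")
    walked_path.append(entry)
    # The "generate" sentinel resolves here: the whole path goes back to the builder.
    return generate_next_node(walked_path)
```

Because the builder always receives the whole path, the request body only needs to carry the response to the current node; nothing about earlier nodes has to be resent by the client.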
## 5. Safety model (layered)

**Layer 1 — classification gate (build path only).** Runs only after the match pass misses (§3) — a human-authored flow is never blocked by category settings. `classify(problem_text)` maps the problem to a category via a lightweight model call (low token budget, returns one category key from the enabled set or `unknown`); on model failure it falls back to keyword matching against category aliases. If the result is not in the account's enabled set (or is `unknown`), intake returns `out_of_scope` (offer adhoc/escalate); no build happens.

**Layer 2 — constrained generation.** The `ai_tree_builder` system prompt restricts output to:
- Safe, reversible, observe-or-restart-class steps only (toggle/restart/reconnect/re-enter, check-status questions).
- A **hard floor of always-forbidden actions** (see §5.1) that NO category may unlock.
- An explicit instruction to emit an `escalate` node — never guess — once it runs out of in-scope safe steps.

**Layer 3 — per-node validation.** Server-side, every generated node is checked before being returned:
- Reject (and regenerate once, then escalate) nodes whose text matches forbidden-action patterns (§5.1).
- Enforce a **depth cap** (default `L1_BUILD_MAX_DEPTH = 12`): once the walked path hits the cap, force an `escalate` node.
- Validate node JSON shape (Pydantic); malformed → regenerate once, then escalate.

**Layer 4 — standing disclaimer.** Persistent banner on every `ai_build` walk:

> *"These are high-confidence troubleshooting steps, but they come from outside your organization's knowledge base — review them before acting. When in doubt, escalate early."*
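For Layer 1, the keyword fallback used when the model call fails could look roughly like the sketch below. The alias table is entirely hypothetical (the design does not list aliases), and `classify_by_keyword` is an illustrative name rather than an existing function.

```python
import re
from typing import Dict, Iterable, List

# Hypothetical aliases for illustration; the real alias table is not specified here.
CATEGORY_ALIASES: Dict[str, List[str]] = {
    "printer": ["printer", "print queue", "toner"],
    "vpn_connect": ["vpn"],
    "password_reset": ["password reset", "forgot my password"],
}


def classify_by_keyword(problem_text: str, enabled: Iterable[str]) -> str:
    """Return the first enabled category whose alias appears as a whole word, else 'unknown'."""
    text = problem_text.lower()
    for category in sorted(enabled):          # sorted for a deterministic result
        for alias in CATEGORY_ALIASES.get(category, []):
            if re.search(r"\b" + re.escape(alias) + r"\b", text):
                return category
    return "unknown"
```

Only enabled categories are searched, so the fallback can never return a disabled category; anything it cannot place becomes `unknown`, which intake treats as `out_of_scope`.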
### 5.1 Hard floor — always forbidden (admins cannot enable)
Regardless of enabled categories, the builder must never produce steps that:
- Modify the Windows registry, system files, or boot configuration.
- Delete, format, or repartition data/disks; remove user profiles or mailboxes.
- Change credentials, MFA, security/firewall/AV settings, or disable protections.
- Run scripts/commands with elevated/admin privileges.
- Touch domain controllers, DNS, DHCP, or production server config.
- Make purchases, license changes, or anything with billing impact.

*(This list is a product decision — review and edit during spec review.)*
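The Layer 3 checks (shape validation, forbidden-action patterns, one regeneration, then escalate, plus the depth cap) can be sketched with the standard library alone. Everything here is illustrative: the regex list is a small sample and not a vetted encoding of the hard floor, the design calls for Pydantic where this uses plain dict checks, and the `depth_cap` / `validation_failed` reason categories are invented for the example.

```python
import re
from typing import Callable, List

L1_BUILD_MAX_DEPTH = 12  # default from §5.3

# Sample patterns only; a real list would need review against the hard floor above.
FORBIDDEN_PATTERNS = [
    re.compile(p, re.IGNORECASE)
    for p in (
        r"\bregedit\b|\bregistry\b",
        r"\bformat\b.*\b(disk|drive)\b",
        r"\bdisable\b.*\b(firewall|antivirus|mfa)\b",
        r"\brun as administrator\b",
    )
]

REQUIRED_KEYS = {
    "question": {"id", "text", "yes_next", "no_next"},
    "instruction": {"id", "text", "next"},
    "resolved": {"id", "text"},
    "escalate": {"id", "text", "reason_category"},
}


def escalate_node(node_id: str, reason: str) -> dict:
    return {"node_type": "escalate", "id": node_id, "reason_category": reason,
            "text": "Escalating to engineering."}


def is_acceptable(node: object) -> bool:
    """Shape check plus forbidden-pattern scan on the node text."""
    if not isinstance(node, dict):
        return False
    kind = node.get("node_type")
    required = REQUIRED_KEYS.get(kind) if isinstance(kind, str) else None
    if required is None or not required.issubset(node.keys()):
        return False
    text = node["text"]
    return isinstance(text, str) and not any(p.search(text) for p in FORBIDDEN_PATTERNS)


def next_validated_node(walked_path: List[dict],
                        generate: Callable[[List[dict]], object]) -> dict:
    node_id = f"n{len(walked_path) + 1}"
    if len(walked_path) >= L1_BUILD_MAX_DEPTH:
        return escalate_node(node_id, "depth_cap")
    for _ in range(2):                 # first attempt plus one regeneration
        node = generate(walked_path)
        if is_acceptable(node):
            return node
    return escalate_node(node_id, "validation_failed")
```

Pattern matching on free text is a coarse filter, which is presumably why the design treats it as one layer among four rather than the sole safeguard.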
### 5.2 Default enabled category allowlist (admin-editable)
Ships enabled by default; Account Owners/Admins toggle per account:
`password_reset`, `account_lockout`, `printer`, `email_outlook_client`, `wifi_network_basics`, `vpn_connect`, `teams_zoom_av`, `browser_cache_cookies`, `peripheral_reconnect`, `os_restart_update`.

*(This list is a product decision — review and edit during spec review.)*

### 5.3 Tunables
| Setting | Default | Notes |
|---|---|---|
| `MATCH_THRESHOLD` | 0.75 | Carried from predecessor spec §8.1. |
| `SUGGEST_THRESHOLD` | 0.60 | Carried from predecessor spec §8.1. |
| `L1_BUILD_MAX_DEPTH` | 12 | Force escalate beyond this many nodes. |
| `get_model_for_action('l1_realtime_build')` | Sonnet | Latency-sensitive; benchmark Sonnet vs Opus during plan. |
| Per-node max_tokens | 1024 | One node is small. |
## 6. Flywheel capture

On `resolve` of an `ai_build` session (`l1_session_service.resolve` extension):
1. **Normalize** the `walked_path` into a complete, valid `tree_structure` (§6.1) — approval requires a dict with a real `id` (see Finding 5 / `_create_tree_from_proposal`).
2. Create a `FlowProposal`: `source='ai_realtime_l1'`, `validated_by_outcome=true`, `proposed_flow_data={tree_structure, match_keywords}`, `l1_session_id=<this session>` (NOT `source_session_id` — see §6.2 / Finding 1), `linked_ticket_id/kind=<session ticket>`, `problem_domain=<category>`, `status='pending'`.
3. Run the existing `_find_similar_pending_proposal` dedupe — merge (bump supporting count) if a near-duplicate pending proposal exists, else insert.
4. Emit the existing `proposal.pending` notification to the review queue.

Engineers promote good proposals to authored flows in the existing review queue. Promoted flows are then found by `flow_matching_engine` on future intakes → the KB compounds. `source='ai_realtime_l1'` rows surface in the existing queue (badge them "AI · outcome-validated").

### 6.1 Tree normalization (Finding 5)
The live `walked_path` holds only traversed nodes, and `"generate"` is a runtime sentinel, not a real edge — that is not a valid tree and would fail the `_create_tree_from_proposal` guard (`tree_structure` must be a dict with an `id`). At resolve time, `ai_tree_builder.normalize_walked_path(walked_path) -> tree_structure` produces a complete object:
- Assign stable string `id`s to every node; the first node becomes the root and `tree_structure.id` = root id.
- `question` nodes: the **traversed** branch (`yes`/`no` the tech actually chose) points to the next traversed node; the **untraversed** branch points to a terminal `{node_type: 'needs_review', text: 'Branch not explored during the originating call'}` stub.
- `instruction` nodes point to the next traversed node.
- The traversal ends at the real terminal node (`resolved` or `escalate`).

This yields a structurally valid, reviewable tree: engineers fill in the `needs_review` branches when promoting. (Trees are `tree_type='troubleshooting'`.)
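The normalization rules above can be sketched as follows. Two things are assumptions made for the example: each `question` entry in `walked_path` carries the chosen branch under an `answer` key, and the output uses a flat `{"id": ..., "nodes": {...}}` layout. The design only requires a dict with a root `id`, so the real `tree_structure` schema may differ.

```python
from typing import Dict, List

NEEDS_REVIEW_TEXT = "Branch not explored during the originating call"


def normalize_walked_path(walked_path: List[dict]) -> dict:
    """Turn a linear walked path into a complete tree with needs_review stubs."""
    if not walked_path or walked_path[-1]["node_type"] not in ("resolved", "escalate"):
        raise ValueError("walked_path must end at a resolved or escalate node")
    ids = [f"n{i + 1}" for i in range(len(walked_path))]   # stable, position-based ids
    nodes: Dict[str, dict] = {}
    for i, entry in enumerate(walked_path):
        kind = entry["node_type"]
        node = {"id": ids[i], "node_type": kind, "text": entry["text"]}
        next_id = ids[i + 1] if i + 1 < len(ids) else None
        if kind == "question":
            taken = entry["answer"]                        # "yes" or "no" (assumed key)
            other = "no" if taken == "yes" else "yes"
            stub_id = f"{ids[i]}_{other}_review"
            nodes[stub_id] = {"id": stub_id, "node_type": "needs_review",
                              "text": NEEDS_REVIEW_TEXT}
            node[f"{taken}_next"] = next_id                # traversed branch
            node[f"{other}_next"] = stub_id                # untraversed branch
        elif kind == "instruction":
            node["next"] = next_id
        elif kind == "escalate":
            node["reason_category"] = entry.get("reason_category")
        nodes[ids[i]] = node
    return {"id": ids[0], "nodes": nodes}
```

No `"generate"` sentinel survives in the output: every edge points either at a traversed node or at a `needs_review` stub, which is what lets the result pass a "dict with an `id`" guard.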
### 6.2 FlowProposal L1 source linkage (Finding 1 — Blocker)
`FlowProposal.source_session_id` is currently `nullable=False` FK → `ai_sessions`, and the review UI (`ProposalDetail.tsx`) links the "Source Session" to `/pilot/{source_session_id}` (a FlowPilot chat surface). An L1 `ai_build` session is an `l1_walk_session`, not an `ai_session`, so it cannot populate `source_session_id`. Changes:
- **Model/migration:** add `FlowProposal.l1_session_id` (nullable FK → `l1_walk_sessions.id`, `ondelete=SET NULL`, indexed). Make `source_session_id` **nullable**. Add CHECK `((source_session_id IS NOT NULL) <> (l1_session_id IS NOT NULL))` — exactly one source set.
- **Review UI:** when `l1_session_id` is set (source `ai_realtime_l1`), render the "Source" block as a read-only walked-path summary (problem statement + the resolved path) instead of a `/pilot/...` link. Existing ai_session-sourced proposals are unchanged.
- **Tree promotion:** `_create_tree_from_proposal` sets `Tree.source_session_id` from the proposal — for L1-sourced proposals leave it NULL (confirm `Tree.source_session_id` is nullable; if not, include in the migration).
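The "exactly one source" CHECK behaves like an XOR over the two nullable columns. The sketch below demonstrates that behaviour with an in-memory SQLite table purely as a stand-in; the production database and migration tooling are not shown here, the table is reduced to the two relevant columns, and `try_insert` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE flow_proposals (
        id INTEGER PRIMARY KEY,
        source_session_id TEXT,
        l1_session_id TEXT,
        -- Each side evaluates to 0 or 1, so <> is true only when exactly one is set.
        CHECK ((source_session_id IS NOT NULL) <> (l1_session_id IS NOT NULL))
    )
    """
)


def try_insert(source_session_id, l1_session_id) -> bool:
    """Return True if the row satisfies the constraint, False if it is rejected."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO flow_proposals (source_session_id, l1_session_id) VALUES (?, ?)",
                (source_session_id, l1_session_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Rows with only an `ai_sessions` source or only an L1 source are accepted; rows with both or neither are rejected. Existing rows all have `source_session_id` set and no `l1_session_id`, so they should already satisfy the constraint when it is added.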
## 7. Minimum escalation handoff

On `escalate` (terminal node reached, or the L1 hits the Escalate modal during an `ai_build` walk) — extends `l1_session_service.escalate`. **The engineer-visible surface is the primary, dependency-free handoff; the bell-badge notification is a thin addition that requires three specific extensions to the FlowPilot-shaped notification system (Finding 3).**

1. **Engineer-visible surface (primary).** Escalated L1 sessions appear in an engineer-facing list — extend the existing `/escalations` queue (`EscalationQueuePage`) with an "L1 escalations" section, backed by a new `GET /l1/escalations`. Each row: problem statement, walked-path summary, who escalated, when, reason category. Pollable; no dependency on the notification subsystem.

2. **Bell-badge notification (Finding 3 — three explicit changes).** The notification system is currently FlowPilot-specific:
   - `VALID_EVENTS` (`backend/app/schemas/notification.py`) has no `l1.session.escalated`. **Add it** to the set (and to the default `events_enabled` map).
   - `_build_notification_link` (`notification_service.py`) only knows `session.escalated → /pilot/{session_id}?pickup=true`. **Add** `l1.session.escalated → /escalations` and **add** a body template for the new event. The existing `session.escalated` event must NOT be reused — an L1 escalation has no ai_session and no `/pilot` pickup flow.
   - Default recipients (`_resolve_recipients`, ~line 184) are owner/admin/team_admin only — ordinary **engineers are excluded**. Since L1 escalations must reach engineers who can pick them up, the call **must pass explicit `target_user_ids`** = the account's active `engineer`-role users (plus owner/admin), not rely on the default set.

**Still deferred** (documented, not built): PSA ticket reassign, escalation-package markdown generation, AI chat handoff/session creation.
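The link mapping and the explicit recipient list from Finding 3 can be illustrated with a few lines of plain Python. The `User` dataclass, `l1_escalation_targets`, and `build_link` are hypothetical names for this sketch; the real `_build_notification_link` and `_resolve_recipients` signatures are not shown in this document.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class User:
    """Hypothetical stand-in for an account user."""
    id: str
    account_role: str
    is_active: bool = True


# Roles that should receive L1 escalations, per the third bullet above.
L1_ESCALATION_ROLES = {"engineer", "owner", "admin"}

# Separate entries per event: the L1 event never reuses the /pilot pickup link.
NOTIFICATION_LINKS = {
    "session.escalated": "/pilot/{session_id}?pickup=true",
    "l1.session.escalated": "/escalations",
}


def l1_escalation_targets(users: Iterable[User]) -> List[str]:
    """Explicit target_user_ids: active engineers plus owner/admin."""
    return [u.id for u in users if u.is_active and u.account_role in L1_ESCALATION_ROLES]


def build_link(event: str, session_id: str) -> str:
    return NOTIFICATION_LINKS[event].format(session_id=session_id)
```

Computing the target list at the call site keeps the default recipient rules untouched for the existing FlowPilot events.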
## 8. Data model & migrations

**Migration 1 — `ai_build` session kind.**
- Extend `l1_walk_sessions` `ck_l1_walk_sessions_session_kind` CHECK to include `'ai_build'`.
- Extend `ck_l1_walk_sessions_target_consistency`: for `ai_build`, both `flow_id` and `flow_proposal_id` are NULL (same as `adhoc`).

**Migration 2 — account L1 category settings.**
- Add `accounts.enabled_l1_categories` `JSONB NOT NULL DEFAULT '<default allowlist>'::jsonb` (list of category keys). RLS already covers `accounts`.

**Migration 3 — FlowProposal L1 source linkage (Finding 1).**
- Add `flow_proposals.l1_session_id` nullable FK → `l1_walk_sessions.id` (`ondelete=SET NULL`, indexed).
- Make `flow_proposals.source_session_id` **nullable** (was `NOT NULL`).
- Add CHECK `((source_session_id IS NOT NULL) <> (l1_session_id IS NOT NULL))` — exactly one source.
- Confirm `trees.source_session_id` is nullable (L1-promoted trees leave it NULL); if not, drop its NOT NULL here.

No new tables — live build state rides on the existing `l1_walk_sessions.walked_path`; persisted trees ride on `FlowProposal.proposed_flow_data`.

## 9. API surface

| Method | Path | Notes | Auth |
|---|---|---|---|
| POST | `/l1/intake` | **Extended**: now runs `match_or_build`; response carries `outcome` (`matched`/`suggest`/`out_of_scope`/`build`). | `require_l1_or_coverage` |
| POST | `/l1/sessions/{id}/next-node` | **New**: record answer/ack on current node, generate + return next node (or terminal). | `require_l1_or_coverage` |
| GET | `/accounts/me/l1-categories` | **New**: list enabled + available categories + hard-floor (read-only) list. | `require_l1_or_above` (read) |
| PATCH | `/accounts/me/l1-categories` | **New**: set enabled categories. | `require_account_owner_or_admin` (Finding 6) |
| GET | `/l1/escalations` | **New** (or extend `/escalations`): engineer-visible escalated-from-L1 list. | `require_engineer_or_admin` |

**Finding 6 — new auth dep.** The category control is an owner/admin setting, but `require_engineer_or_admin` also admits `engineer`. No existing dep matches "owner or account-admin" (`require_account_owner` is owner-only; `require_admin` is super-admin-only). Add `require_account_owner_or_admin` to `deps.py`: allow `super_admin` bypass, then `account_role in ('owner', 'admin')`, else 403. Use it for the PATCH.
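The rule in Finding 6 reduces to a three-way check. Below is a framework-free sketch: `CurrentUser`, its `is_super_admin` flag, and the `Forbidden` exception are placeholders for whatever `deps.py` actually uses (the source does not show them), and a real dependency would raise the web framework's own 403 error instead.

```python
from dataclasses import dataclass


class Forbidden(Exception):
    """Placeholder for the framework's HTTP 403 error."""
    status_code = 403


@dataclass
class CurrentUser:
    """Hypothetical stand-in for the authenticated user object."""
    account_role: str
    is_super_admin: bool = False


def require_account_owner_or_admin(user: CurrentUser) -> CurrentUser:
    # Super admins bypass the account-role check entirely.
    if user.is_super_admin:
        return user
    # Otherwise only the account owner or an account admin may proceed.
    if user.account_role in ("owner", "admin"):
        return user
    raise Forbidden("Account owner or admin required")
```

Unlike `require_engineer_or_admin`, this check turns away the `engineer` role, which is the gap the finding describes.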
## 10. Frontend

- `L1WalkTreeVariant`: replace synthetic stepping with real node rendering driven by `/next-node`; render `question` (yes/no), `instruction` (acknowledge), `resolved`/`escalate` (terminal). Per-node loading affordance. Disclaimer banner mounted for `ai_build` sessions.
- `L1Dashboard` intake handler: dispatch on `match_or_build` `outcome` (suggest prompt, out-of-scope prompt, build → walker).
- New admin settings panel (under `/account`): toggle enabled L1 categories; show hard-floor list as read-only "always excluded."
- Engineer escalations surface: "L1 escalations" section/list.

## 11. Testing strategy

**Backend unit:**
- `ai_tree_builder.generate_next_node`: returns valid node per type; escalate-early when path is deep / model signals exhaustion; regenerate-then-escalate on malformed/forbidden output; depth cap forces escalate.
- Per-node validation: forbidden-action patterns rejected; hard-floor enforced even if a category is enabled.
- `match_or_build`: all four outcomes at threshold boundaries (`score == MATCH_THRESHOLD`, `== SUGGEST_THRESHOLD`); **match runs before the category gate** (a matched published flow is returned even when its category is disabled; Finding 4); `force_build` skips match but still applies the category gate; `out_of_scope` only on the build path when category disabled/unknown.
- `classify`: known categories map correctly; unknown → out_of_scope.
- `normalize_walked_path` (Finding 5): produces a dict with a root `id`; untraversed `question` branches become `needs_review` stubs; output passes the `_create_tree_from_proposal` validity guard.
- Flywheel capture: resolve creates `ai_realtime_l1` proposal with `l1_session_id` set and `source_session_id` NULL (Finding 1); CHECK accepts exactly-one-source; dedupe merges near-duplicate.
- Escalation handoff: `l1.session.escalated` accepted by the notification schema (Finding 3); link resolves to `/escalations`; explicit engineer `target_user_ids` receive it; escalated session appears in `GET /l1/escalations`.

**Backend integration:**
- Full intake→build→resolve creates an outcome-validated proposal.
- Intake→build→escalate notifies engineers and surfaces in the escalations list.
- Migrations roundtrip; `ai_build` CHECK + target-consistency hold.

**Frontend e2e (extend `l1-workspace.spec.ts`):**
- L1 intake with no match → AI build → answer nodes → resolve → proposal created.
- L1 build → escalate node → escalate handoff.
- Admin toggles a category off → that problem class returns out-of-scope.

**AI quality (plan-time):** small eval set of common L1 problems; assert trees stay in-scope, reach resolution or escalate cleanly, never emit hard-floor actions. Benchmark Sonnet vs Opus for the model-tier decision.

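The `match_or_build` boundary cases listed under the backend unit tests can be made concrete with a small decision function. This is a sketch under stated assumptions, not the real implementation: the threshold values are placeholders (this section names the constants but gives no numbers), the boundaries are assumed inclusive, and `decide_outcome` and its parameters are hypothetical names.

```python
from typing import Optional

# Placeholder values; only the constant names come from the spec.
MATCH_THRESHOLD = 0.80
SUGGEST_THRESHOLD = 0.50


def decide_outcome(
    score: Optional[float],
    category: Optional[str],
    enabled_categories: set[str],
    force_build: bool = False,
) -> str:
    """Return one of: matched, suggest, out_of_scope, build."""
    # Match runs first (Finding 4), so a published flow can be returned
    # even when its category is disabled. force_build skips this step.
    if not force_build and score is not None:
        if score >= MATCH_THRESHOLD:    # assumption: boundary is inclusive
            return "matched"
        if score >= SUGGEST_THRESHOLD:  # assumption: boundary is inclusive
            return "suggest"
    # Build path only: the category gate applies here, force_build or not.
    if category is None or category not in enabled_categories:
        return "out_of_scope"
    return "build"
```

With this shape, each boundary test becomes a single call with a score exactly at a threshold, and the gate-ordering test is a high score paired with an empty `enabled_categories` set.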
## 12. Risks & open questions

- **Hallucinated-but-plausible steps** for niche/company-specific apps. Mitigation: classification gate + constrained prompt + escalate-early + disclaimer. Residual risk accepted for v1; eval set bounds it.
- **Latency on a live call.** Node-by-node means ~2–4s per branch. Mitigation: Sonnet, small per-node token budget, clear loading affordance. Benchmark at plan time.
- **Coherence across independently-generated nodes.** Mitigation: full walked-path context every call.
- **Classification accuracy.** A misclassification could wrongly gate a valid problem out, or let a borderline one through. Mitigation: hard floor is category-independent; out-of-scope still offers adhoc/escalate (no dead end).
- **Open (product, for spec review):** the default category allowlist (§5.2) and the hard-floor list (§5.1) need confirming or editing. Model tier: confirm Sonnet pending benchmark.

## 13. Out of scope (restated)

KB ingestion + connectors, RAG grounding, PSA reassign, escalation-package generation, AI chat handoff. Each is its own later phase with its own spec.

**Also deferred (surfaced in review):**
- **Matching against unpromoted `FlowProposal`s** (Finding 2). `flow_matching_engine` matches published flows only. Extending it to also surface outcome-validated drafts before promotion is a later enhancement; Phase 2A relies on engineer promotion (draft → published flow → matchable).

## 14. Review revisions (2026-05-29 Codex review)

All six findings verified against code and resolved in this spec:
1. **Blocker (FlowProposal source linkage):** §6.2 + §8 Migration 3 (new nullable `l1_session_id`, `source_session_id` made nullable, exactly-one CHECK, review-UI link change).
2. **High (match scope):** §3 (match published flows only; proposal-matching deferred §13).
3. **High (escalation notification):** §7 (engineer surface is primary; three explicit notification-system changes enumerated).
4. **Medium (gate ordering):** §3 + §5 Layer 1 (match first; category gate only on the build path).
5. **Medium (flywheel tree shape):** §6.1 (`normalize_walked_path` produces a valid tree with root `id`; unexplored branches → `needs_review` stubs).
6. **Medium (category write auth):** §9 (new `require_account_owner_or_admin` dep; `require_engineer_or_admin` was too broad).

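The exactly-one CHECK from finding 1 can be illustrated with an in-memory SQLite table. This is only a sketch: the table name `flow_proposals` is hypothetical, SQLite stands in for whatever database the migration actually targets, and the constraint expression is one possible way to state "exactly one of the two source columns is set."

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE flow_proposals (            -- hypothetical table name
        id INTEGER PRIMARY KEY,
        source_session_id INTEGER,           -- nullable after Migration 3
        l1_session_id INTEGER,               -- new nullable column
        -- Exactly one source: the two NULL-tests must differ.
        CHECK ((source_session_id IS NULL) <> (l1_session_id IS NULL))
    )
    """
)


def try_insert(source_session_id, l1_session_id) -> bool:
    """Return True if the row is accepted, False if the CHECK rejects it."""
    try:
        conn.execute(
            "INSERT INTO flow_proposals (source_session_id, l1_session_id) "
            "VALUES (?, ?)",
            (source_session_id, l1_session_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

An `ai_realtime_l1` proposal corresponds to `try_insert(None, some_l1_session_id)`; rows with both columns set or both NULL are rejected.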