docs(design): L1 workspace feature spec

New seat tier between engineer and viewer. Dedicated /l1 surface (dashboard + walker + drafts) for first-call helpdesk staff. Walk-in intake + PSA queue both produce tickets. Match-or-build pipeline prefers authored flows, then outcome-validated AI drafts, then builds fresh from KB. Three KB connectors: IT Glue, Hudu, SharePoint/OneDrive. Escalation via package + PSA reassign, picked up in chat. Engineer coverage via per-user can_cover_l1 flag with audit-log tagging. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-28 03:33:32 -04:00
parent 41f5519916
commit d1cf77cd41
1 changed files with 717 additions and 0 deletions
--- a/docs/superpowers/specs/2026-05-28-l1-workspace-design.md
+++ b/docs/superpowers/specs/2026-05-28-l1-workspace-design.md
@@ -0,0 +1,717 @@
+# L1 Workspace — Design Spec
+
+**Date:** 2026-05-28
+**Status:** Draft (pending implementation plan)
+**Audience for this doc:** engineers + reviewers building the L1 workspace feature
+
+---
+
+## 1. Summary
+
+Introduce a dedicated **L1 helpdesk** workspace as a new seat tier in ResolutionFlow. L1 techs walk customers through yes/no decision trees on inbound tickets and phone calls. The platform either matches an existing authored flow, reuses an outcome-validated AI draft, or builds a fresh decision tree in real time from the MSP's ingested knowledge base. Drafts that resolve a call become "outcome-validated" and surface first in the engineer review queue for promotion to authored flows. KB ingestion supports manual upload plus three MSP-native connectors: IT Glue, Hudu, and Microsoft SharePoint/OneDrive.
+
+This re-introduces the original deterministic tree-walker UX — which had been deprecated in favor of chat-primary FlowPilot — and repositions it as a frontline-tier product surface distinct from the engineer chat surface.
+
+---
+
+## 2. Motivation
+
+The current ResolutionFlow product funnels every user — regardless of skill tier — into a single chat-primary surface (`AssistantChatPage` mounted at `/pilot`). The chat is excellent for engineers but is the wrong primitive for L1 helpdesk staff who:
+
+- Take inbound phone calls and need a fast, deterministic click-through UX
+- Resolve simple, recurring problems (password resets, mailbox connection issues, VPN disconnects, printer queue clears, etc.)
+- Are not authorized to escalate complex issues themselves; they hand off to engineers
+
+A tree-walker UX serves this audience natively. The substrate already exists in the codebase — decision-tree data model, authoring tools, RAG, KB Accelerator, escalation packaging — but no first-class L1 surface ties it together. This spec defines that surface and the supporting AI/KB pipeline.
+
+---
+
+## 3. Users & roles
+
+### 3.1 Role hierarchy
+
+`super_admin > owner > engineer > l1_tech > viewer`
+
+`l1_tech` is added to the `account_role` enum. Permissions enforced via `app/core/permissions.py` and `app/api/deps.py`.
+
+### 3.2 What L1 can do
+
+- Use the `/l1/*` surface
+- Open tickets from their queue (PSA-fed or internal)
+- Intake walk-in/phone-call problems (creates a ticket as a side effect)
+- Walk authored flows and AI-built FlowProposal drafts
+- Resolve or escalate a session
+- View their own AI drafts list (read-only — outcome tags shown)
+
+### 3.3 What L1 cannot do
+
+- See the chat surface (`/pilot`) — sidebar hidden, route 403s
+- Author or edit flows
+- See `/review-queue` or `/escalations` (engineer inboxes)
+- See team analytics (only `/analytics/me`)
+- Promote AI drafts (engineers/owners only, via existing review queue)
+- Configure KB connectors (owner-only)
+
+### 3.4 Engineer L1 coverage
+
+Engineers do NOT see the L1 surface by default. Owners can toggle `users.can_cover_l1 = true` on individual engineer users. Engineers with that flag (and all owners/super_admins) see an "L1 Workspace" entry in their sidebar. Clicking it puts them in `/l1/*` with a sticky banner: *"Covering L1 — actions logged as coverage."* Coverage actions are audit-logged with `acting_as = 'l1_coverage'`.
+
+Backend dep: `require_l1_or_coverage` = `l1_tech | (engineer AND can_cover_l1) | owner | super_admin`.
+
+This mirrors the existing orthogonal-flag pattern (`is_team_admin`) — no new architectural concept.
+
+### 3.5 Billing data model
+
+- `accounts.l1_seats_purchased INTEGER NOT NULL DEFAULT 0` (new column)
+- Existing `accounts.seats_purchased` continues to represent engineer seats
+- New Stripe SKU placeholder for L1 seat; actual pricing set in Stripe dashboard out-of-band
+
+---
+
+## 4. Architecture overview
+
+### 4.1 New components
+
+**Frontend:**
+- `pages/l1/L1Dashboard.tsx` — landing page; ticket queue + describe-the-problem intake
+- `pages/l1/L1WalkPage.tsx` — purpose-built walker; yes/no cards, transcript, persistent escalate/resolve
+- `pages/l1/L1DraftsPage.tsx` — read-only list of the L1's AI drafts and promotion status
+- `pages/l1/L1TicketsPage.tsx` — full-page queue (PSA + internal merged)
+- `components/l1/L1CoverageBanner.tsx` — slim banner shown to engineer-coverers
+
+**Backend:**
+- `services/match_or_build.py` — orchestrator (RAG match → fallback to AI build)
+- `services/ai_tree_builder.py` — real-time AI tree generation via Anthropic
+- `services/kb_connectors/` package — base, registry, encryption, plus `itglue.py`, `hudu.py`, `microsoft_graph.py`
+- `services/kb_ingestion_writer.py` — shared writer used by manual upload + all connectors
+- `services/kb_ingestion_scheduler.py` — APScheduler job, `max_instances=1`, per-connector sync
+- `services/internal_ticket_service.py` — CRUD + status transitions for the no-PSA fallback
+- `services/l1_session_service.py` — walking-session lifecycle
+- `api/endpoints/l1.py` — L1-role endpoints
+- `api/endpoints/kb_connectors.py` — KB connector config endpoints (owner-only for write)
+
+**Reused / extended:**
+- `services/rag_service.py` — flow & KB matching (existing)
+- `services/flow_matching_engine.py` — existing
+- `services/escalation_package_generator.py` — extended to include walked path, AI draft pointer, KB citations
+- `models/FlowProposal` — new columns (see §5)
+- `services/psa/` — already supports ticket create + reassign across CW/Autotask/HaloPSA
+- `services/embedding_service.py` — used by KB ingestion writer
+- New `kb_documents` + `kb_document_chunks` tables for RAG-retrievable document storage, separate from the existing `kb_imports` (which is a document→tree conversion record, not a persistent KB store — see §5)
+- Audit log writer — gains `acting_as` field
+
+### 4.2 Data flow — walk-in / phone-call intake
+
+```
+L1 types: "User can't connect Outlook after password reset"
+  POST /api/v1/l1/intake
+    body: { problem_statement, customer_name?, customer_contact? }
+    → create ticket
+        - PSA if configured: psa_provider.create_ticket(...)
+        - else: internal_tickets row
+    → match_or_build(account_id, problem_text, ticket_ref)
+        → rag_service.match_flows(...) → top hit; if score ≥ threshold return as 'flow'
+        → rag_service.match_proposals(... where validated_by_outcome=true)
+                                           → top hit; if score ≥ threshold return as 'proposal'
+        → ai_tree_builder.build(problem_text, kb_chunks, nearest_flows)
+                                           → persist FlowProposal(source='ai_realtime_l1',
+                                                                  linked_ticket_id,
+                                                                  linked_ticket_kind,
+                                                                  validated_by_outcome=false)
+                                           → return as 'proposal'
+    → l1_session_service.start(...)
+    → return { session_id, target_kind, target_id, intake_type }
+  → navigate to /l1/walk/{session_id}
+```
+
+### 4.3 Data flow — PSA-queue intake
+
+The L1 dashboard polls the L1's PSA queue plus their internal tickets. Clicking a ticket row calls `POST /api/v1/l1/tickets/{ticket_ref}/start` which is the same `match_or_build` path (the `problem_statement` is the ticket subject + description) followed by walker navigation.
+
+---
+
+## 5. Data model
+
+All new tenant-isolated tables get RLS policies (account-scoped, WITH CHECK). All TIMESTAMPs are `TIMESTAMPTZ`. No `--rev-id` on Alembic; no `--autogenerate` for enum/RLS work.
+
+### 5.1 `FlowProposal` — extended
+
+Existing AI-draft model. Add columns:
+
+| Column | Type | Notes |
+|---|---|---|
+| `source` | `VARCHAR(30) NOT NULL` | `'ai_realtime_l1' \| 'kb_accelerator' \| 'manual_draft'`. Backfill existing rows to `'manual_draft'`. |
+| `linked_ticket_id` | `VARCHAR(64) NULL` | PSA id or internal_tickets UUID (stored as text) |
+| `linked_ticket_kind` | `VARCHAR(10) NULL` | `'psa' \| 'internal'` |
+| `validated_by_outcome` | `BOOLEAN NOT NULL DEFAULT FALSE` | Flipped to true when L1 resolves and marks helpful=true |
+| `walked_path_snapshot` | `JSONB NULL` | Frozen at resolve/escalate; shape `[{node_id, question, answer, l1_note}]` |
+
+Engineer review queue sort:
+```sql
+ORDER BY validated_by_outcome DESC, created_at DESC
+```
+
+### 5.2 `internal_tickets` — new
+
+```
+id                        UUID PRIMARY KEY
+account_id                UUID NOT NULL  (RLS-scoped)
+created_by_user_id        UUID NOT NULL  (the L1 who took the call)
+customer_name             VARCHAR(120)
+customer_contact          VARCHAR(200) NULL    (email or phone, free text)
+problem_statement         TEXT NOT NULL
+status                    VARCHAR(30) NOT NULL  -- 'open' | 'walking' | 'resolved' | 'escalated'
+flow_id                   UUID NULL FK trees
+flow_proposal_id          UUID NULL FK flow_proposals
+ai_session_id             UUID NULL FK ai_sessions (set when engineer picks up in chat post-escalation)
+assigned_user_id          UUID NULL    (engineer post-escalation)
+resolution_notes          TEXT NULL
+psa_promoted_ticket_id    VARCHAR(64) NULL   (set if later promoted to PSA)
+created_at                TIMESTAMPTZ NOT NULL
+updated_at                TIMESTAMPTZ NOT NULL
+resolved_at               TIMESTAMPTZ NULL
+```
+
+RLS: account-scoped, WITH CHECK on insert/update.
+
+### 5.3 `kb_connector_configs` — new
+
+```
+id                        UUID PRIMARY KEY
+account_id                UUID NOT NULL  (RLS-scoped)
+provider                  VARCHAR(20) NOT NULL  -- 'itglue' | 'hudu' | 'microsoft_graph'
+display_name              VARCHAR(80) NOT NULL
+credentials_encrypted     BYTEA NOT NULL        -- Fernet, same pattern as services/psa/encryption.py
+is_active                 BOOLEAN NOT NULL DEFAULT TRUE
+sync_interval_minutes     INTEGER NOT NULL DEFAULT 360
+last_sync_at              TIMESTAMPTZ NULL
+last_sync_status          VARCHAR(20) NULL      -- 'success' | 'error' | 'running'
+last_sync_error           TEXT NULL
+created_by_user_id        UUID NOT NULL
+created_at                TIMESTAMPTZ NOT NULL
+updated_at                TIMESTAMPTZ NOT NULL
+UNIQUE (account_id, provider, display_name)
+```
+
+RLS: account-scoped, WITH CHECK.
+
+### 5.4 New tables: `kb_documents` + `kb_document_chunks`
+
+The existing `kb_imports` table is a document→tree conversion record (status lifecycle `processing | ready | committed | failed`, target `tree_id`) — designed to turn one document into one authored flow. It is NOT a persistent KB document store and does not power RAG retrieval.
+
+The L1 feature needs a separate pair of tables that store ingested docs in RAG-retrievable form:
+
+**`kb_documents`** — one row per ingested document:
+
+```
+id                        UUID PRIMARY KEY
+account_id                UUID NOT NULL  (RLS-scoped)
+source_kind               VARCHAR(20) NOT NULL  -- 'upload' | 'paste' | 'itglue' | 'hudu' | 'microsoft_graph'
+source_ref                VARCHAR(200) NULL     -- provider-side document ID for re-sync
+connector_config_id       UUID NULL FK kb_connector_configs
+title                     VARCHAR(500) NOT NULL
+content                   TEXT NOT NULL          -- full post-extraction text
+content_hash              VARCHAR(64) NOT NULL   -- sha256 for change-detection
+metadata                  JSONB NULL             -- provider-specific (org_id, drive_id, etc.)
+last_synced_at            TIMESTAMPTZ NULL
+deleted_at                TIMESTAMPTZ NULL       -- soft-delete on connector removal
+created_at                TIMESTAMPTZ NOT NULL
+updated_at                TIMESTAMPTZ NOT NULL
+```
+
+Unique partial index: `(connector_config_id, source_ref) WHERE source_ref IS NOT NULL`.
+
+**`kb_document_chunks`** — chunks with embeddings, used by `rag_service.match_kb_chunks`:
+
+```
+id                        UUID PRIMARY KEY
+document_id               UUID NOT NULL FK kb_documents ON DELETE CASCADE
+account_id                UUID NOT NULL  -- denormalized for RLS
+chunk_index               INTEGER NOT NULL
+content                   TEXT NOT NULL
+embedding                 VECTOR(<dim>) NOT NULL  -- dim matches embedding_service
+metadata                  JSONB NULL              -- section title, page number, etc.
+created_at                TIMESTAMPTZ NOT NULL
+UNIQUE (document_id, chunk_index)
+```
+
+Pgvector index (ivfflat or hnsw) on `embedding`; choice tuned during implementation.
+
+RLS on both tables: account-scoped, WITH CHECK on insert.
+
+**Coexistence with `kb_imports`:** when an L1 (or owner) uploads a doc, the system can populate **both** — the existing KBImport pipeline produces a draft tree, and the new ingestion writer additionally chunks+embeds the doc into `kb_documents` for RAG. Both paths share the upload endpoint but write to independent tables. Connectors only write to `kb_documents` (no auto-tree-conversion from synced docs in v1).
+
+### 5.5 Other column additions
+
+- `users.can_cover_l1 BOOLEAN NOT NULL DEFAULT FALSE`
+- `accounts.l1_seats_purchased INTEGER NOT NULL DEFAULT 0`
+- `audit_logs.acting_as VARCHAR(30) NULL` — `'l1_coverage'` when engineer is in coverage mode; null otherwise
+- `account_role` enum: add `'l1_tech'`
+
+### 5.6 Migration ordering
+
+Six manual Alembic revisions (no `--rev-id`, no `--autogenerate`):
+
+1. Add `'l1_tech'` to `account_role` enum.
+2. Add `users.can_cover_l1`, `accounts.l1_seats_purchased`, `audit_logs.acting_as`.
+3. Extend `flow_proposals` with new columns + backfill existing rows to `source='manual_draft'`.
+4. Create `internal_tickets` + RLS policies (account-scoped, WITH CHECK).
+5. Create `kb_connector_configs` + RLS policies.
+6. Create `kb_documents` + `kb_document_chunks` tables + RLS policies + pgvector index on chunks.
+
+Per Lesson on tenant-isolated tables: any service-construction site that creates rows on these tables must pass `account_id=` explicitly. Grep all `Model(` sites before merge.
+
+---
+
+## 6. Backend services & endpoints
+
+### 6.1 New services
+
+| Module | Purpose |
+|---|---|
+| `services/match_or_build.py` | Orchestrator. Single async entrypoint `match_or_build(account_id, problem_text, ticket_ref) -> MatchOrBuildResult`. |
+| `services/ai_tree_builder.py` | Real-time AI tree generation. Anthropic via existing `_call_anthropic_cached` pattern. Model tier via `settings.get_model_for_action('l1_realtime_build')`. Output validated against the flow node schema with Pydantic; rejects malformed output. |
+| `services/kb_connectors/base.py` | Abstract `KBConnector` with `test_credentials`, `list_documents`, `fetch_content`, `subscribe_to_changes` (optional). |
+| `services/kb_connectors/itglue.py` | IT Glue REST client. |
+| `services/kb_connectors/hudu.py` | Hudu REST client. |
+| `services/kb_connectors/microsoft_graph.py` | Microsoft Graph (SharePoint/OneDrive) client. |
+| `services/kb_connectors/registry.py` | `KBConnectorRegistry` (mirrors `PsaProviderRegistry`). |
+| `services/kb_connectors/encryption.py` | Fernet wrapper (or reuse the PSA one if generic). |
+| `services/kb_ingestion_writer.py` | Shared writer: chunk → embed → upsert. Used by manual upload AND connector sync. |
+| `services/kb_ingestion_scheduler.py` | APScheduler interval job, `max_instances=1`. Sequential per account; concurrency cap = 4 accounts simultaneously. |
+| `services/internal_ticket_service.py` | CRUD + status transitions for `internal_tickets`. |
+| `services/l1_session_service.py` | Walking-session lifecycle: start, step, resolve, escalate. Bridges `ai_sessions` and the walked target. |
+
+### 6.2 Extended services
+
+- `services/escalation_package_generator.py` — adds inputs: `walked_path`, `ai_draft_proposal_id`, `kb_citations`. New caller path from `l1_session_service.escalate(...)`.
+- KB Accelerator endpoint — accepts ingested content via the shared `kb_ingestion_writer`. Manual upload and connector sync share the same persistence path.
+
+### 6.3 New endpoints
+
+All under `require_l1_or_coverage` unless noted. Mounted under `/api/v1/l1`.
+
+| Method | Path | Purpose | Auth |
+|---|---|---|---|
+| GET | `/l1/queue` | Merged ticket queue (PSA + internal). Pagination + status filter. | `require_l1_or_coverage` |
+| POST | `/l1/intake` | Walk-in intake. Body `{problem_statement, customer_name?, customer_contact?}`. Creates ticket, returns `{session_id, target_kind, target_id, intake_type}`. | `require_l1_or_coverage` |
+| POST | `/l1/tickets/{ticket_ref}/start` | Start walker from an existing ticket. Internally same as intake but skips ticket creation. | `require_l1_or_coverage` |
+| POST | `/l1/sessions/{id}/step` | Record an answer. Body `{node_id, answer, note?}`. Appends to `walked_path_snapshot`. | `require_l1_or_coverage` |
+| POST | `/l1/sessions/{id}/resolve` | Close as resolved. Body `{resolution_notes, helpful: bool}`. Sets `validated_by_outcome=true` on the proposal when `helpful=true` AND target was a proposal. Closes the ticket. | `require_l1_or_coverage` |
+| POST | `/l1/sessions/{id}/escalate` | Generate escalation package + reassign ticket. Body `{reason, reason_category}`. | `require_l1_or_coverage` |
+| GET | `/l1/drafts` | List current user's AI drafts with promotion status. | `require_l1_or_coverage` |
+
+KB connector endpoints (`/api/v1/kb-connectors`):
+
+| Method | Path | Purpose | Auth |
+|---|---|---|---|
+| GET | `/kb-connectors` | List configured connectors for account. | `require_l1_or_above` |
+| POST | `/kb-connectors` | Create. OAuth handoff for Microsoft Graph; API token entry for IT Glue/Hudu. | `require_account_owner` |
+| DELETE | `/kb-connectors/{id}` | Remove (soft-disable). | `require_account_owner` |
+| POST | `/kb-connectors/{id}/sync` | Trigger immediate sync (enqueued). | `require_account_owner` |
+| GET | `/kb-connectors/{id}/status` | Sync status + doc count + last error. | `require_l1_or_above` |
+
+Internal ticket endpoints (`/api/v1/internal-tickets`):
+
+| Method | Path | Purpose | Auth |
+|---|---|---|---|
+| GET | `/internal-tickets` | List (account-scoped). | `require_l1_or_coverage` |
+| GET | `/internal-tickets/{id}` | Detail. | `require_l1_or_coverage` |
+| POST | `/internal-tickets/{id}/promote-to-psa` | Push to configured PSA, set `psa_promoted_ticket_id`. | `require_account_owner` |
+
+User management addition:
+
+| Method | Path | Purpose | Auth |
+|---|---|---|---|
+| PATCH | `/users/{id}/coverage` | Set `can_cover_l1` flag. Body `{can_cover_l1: bool}`. | `require_account_owner` |
+
+---
+
+## 7. Frontend surface
+
+### 7.1 Sidebar — L1 view
+
+```
+LOGO
+─────────────
+Workspace      /l1
+Tickets        /l1/tickets
+My Drafts      /l1/drafts
+─────────────
+Guides         /guides
+Account        /account     (filtered — no integrations, no categories)
+```
+
+No `/pilot`, no `/trees`, no `/flows`, no `/review-queue`, no `/escalations`, no team analytics. Sidebar.tsx picks the nav array by role.
+
+### 7.2 Sidebar — engineer coverage view
+
+Engineer's existing sidebar plus a single appended entry "L1 Workspace" → `/l1`. Shown when `canCoverL1 || isOwner || isSuperAdmin`.
+
+### 7.3 `/l1` dashboard layout
+
+Three vertical zones, single column, max width ~1100px:
+
+1. **Greeting** — uppercase tracking date label + Bricolage 700 hero ("Good morning, {firstName}.")
+2. **Describe the problem** card — large textarea (autofocus on load), optional `customer_name` + `customer_contact` fields, single primary CTA "Start walk →" (the only electric-blue element on the page)
+3. **Open tickets** — section label, count, table rows (merged PSA + internal with origin badges), row hover `bg-elevated`
+4. **Resume in progress** — shown only when L1 has a half-walked session
+
+Tailwind v4 tokens: `bg-page` base, `bg-card` zones, `bg-elevated` row hover, electric-blue accent only on primary CTA. No `text-secondary`. All borders `border-default`.
+
+### 7.4 `/l1/walk/{sessionId}` walker
+
+Sticky header + two-pane body, full-height (flex chain per Lesson — every ancestor needs `flex` + `flex-1` + `min-h-0`).
+
+**Header:**
+- Back arrow + ticket ref + customer name + AI-built badge (when target is proposal)
+- Problem statement line
+- Persistent action buttons: `[ Escalate ]` `[ Resolve ✓ ]`
+
+**Left pane (main):**
+- "Step N · estimated M" label
+- Current node card — large yes/no/answer buttons (min 44px tap target)
+- Optional note textarea below the card (appended to `walked_path_snapshot`)
+- On a fresh proposal that's still building: shimmer placeholder + "Building from KB… ~10s"
+
+**Right pane (transcript):**
+- Walked-so-far list (node title + answer chosen)
+- Current step highlight
+- "Source:" section listing KB citations for the current node (proposal walks only)
+
+**Resolve modal:**
+- "Did this resolve it?" `[ Yes ]` `[ No ]`
+- Resolution notes textarea
+- Yes + target was proposal → sets `validated_by_outcome=true`
+- No → prompt to escalate instead
+
+**Escalate modal:**
+- Reason category dropdown: *Out of L1 scope · Customer demanding senior · Tree dead-ended · AI tree wrong · Other*
+- Free-text reason
+- Confirm
+
+### 7.5 `/l1/drafts` page
+
+Read-only list, columns: `created` · `problem (truncated)` · `ticket #` · `status` (pending review / outcome-validated / promoted / retired). Click → read-only detail view showing tree + walked path. No edit affordances.
+
+### 7.6 `/l1/tickets` page
+
+Full-page version of the dashboard queue widget. Filter by status, origin (PSA/internal), assigned-to-me.
+
+### 7.7 Coverage banner
+
+`<L1CoverageBanner />` — slim ~32px band, info-cyan-dim background, mounted at the top of all `/l1/*` pages when `!isL1Tech && (canCoverL1 || isOwner || isSuperAdmin)`:
+
+```
+You're covering L1. Actions logged as coverage. [Switch back →]
+```
+
+The "Switch back" link returns to `/`.
+
+### 7.8 Routing
+
+```tsx
+const L1Dashboard = lazyWithRetry(() => import('@/pages/l1/L1Dashboard'))
+const L1WalkPage = lazyWithRetry(() => import('@/pages/l1/L1WalkPage'))
+const L1DraftsPage = lazyWithRetry(() => import('@/pages/l1/L1DraftsPage'))
+const L1TicketsPage = lazyWithRetry(() => import('@/pages/l1/L1TicketsPage'))
+```
+
+Mounted under the `/` ProtectedRoute branch at:
+- `/l1` → `L1Dashboard`
+- `/l1/walk/:sessionId` → `L1WalkPage`
+- `/l1/drafts` → `L1DraftsPage`
+- `/l1/tickets` → `L1TicketsPage`
+
+Wrapped in `L1RouteGuard` (403 if not `l1_tech` AND not coverage-flagged). `ProtectedRoute.tsx` post-login redirect: L1 users land on `/l1` instead of `/`.
+
+`lazyWithRetry`, not `React.lazy` (per existing convention).
+
+---
+
+## 8. AI match-or-build pipeline
+
+### 8.1 Match-or-build algorithm
+
+```
+match_or_build(account_id, problem_text, ticket_ref):
+  embedding = embedding_service.embed(problem_text)
+
+  # 1. Match authored flows
+  flow_hits = rag_service.match_flows(account_id, embedding, k=5)
+  if flow_hits and flow_hits[0].score >= MATCH_THRESHOLD:
+      return {kind: 'flow', id: flow_hits[0].flow_id, score: ...}
+
+  # 2. Match outcome-validated proposals only
+  proposal_hits = rag_service.match_proposals(
+      account_id, embedding, k=5,
+      where=validated_by_outcome=true,
+  )
+  if proposal_hits and proposal_hits[0].score >= MATCH_THRESHOLD:
+      return {kind: 'proposal', id: proposal_hits[0].proposal_id, score: ...}
+
+  # 3. Build fresh
+  kb_chunks = rag_service.match_kb_chunks(account_id, embedding, k=8)
+  if not kb_chunks:
+      raise BuildAbortedNoKB(
+          "Cannot build a tree with no KB content. "
+          "Upload docs or wait for a connector sync."
+      )
+  nearest_flows = flow_hits[:3]
+  proposal = ai_tree_builder.build(
+      problem_text, kb_chunks, nearest_flows, account_id, ticket_ref
+  )
+  return {kind: 'proposal', id: proposal.id, score: None}
+```
+
+`MATCH_THRESHOLD` — per-account configurable; default `0.75` (cosine).
+
+The "no empty KB build" rule is enforced because an AI tree built on the model's general knowledge — without MSP-specific grounding — risks suggesting unsafe or hallucinated fixes.
+
+### 8.2 AI tree-build details
+
+**Model:** `settings.get_model_for_action('l1_realtime_build')`. Recommend Sonnet for v1 (latency-sensitive).
+
+**Schema:** output validated against the existing flow node schema (matches `tree_editor` output). Validation failure aborts the build rather than persisting malformed data.
+
+**Prompt strategy** (per Lesson on prompt anti-parrot — critical):
+- System prompt: role definition + output schema using `<placeholder>` notation only. Never literal field values.
+- Few-shot examples loaded as user/assistant messages from a separate file, never inline in the system prompt.
+- User message: `{problem_statement}` + `{kb_context: [doc_title, section, content]}` + `{nearest_flow_summaries}` + instruction to cite KB chunks per node.
+- Output includes `kb_citations: [{node_id, kb_doc_id, snippet}]` for walker's "Source:" pane and engineer review.
+
+**Latency:** whole-tree-then-return (~5–15s typical). UX is a shimmer "Building from KB…" placeholder. Streaming node-by-node deferred to v2.
+
+**Anthropic SDK config** (per Lesson): `max_retries=1`. Prompt caching enabled on the stable system+few-shot bundle (high cache hit rate expected per account).
+
+**Telemetry:**
+- `l1.match_or_build.duration_ms`, `l1.match_or_build.outcome` (`flow_match`/`proposal_match`/`built`/`aborted_no_kb`)
+- `anthropic.cache` events (existing pattern) tagged `action=l1_realtime_build`
+- `l1.tree_build.tokens_in`, `tokens_out`
+
+**Anti-parrot guardrail:** the existing `tests/test_prompt_anti_parrot.py` auto-discovers new prompt constants via pattern match on `*_PROMPT` / `*_SCHEMA` / `*_PROTOCOL` / `*_FORMAT`. No new test required.
+
+### 8.3 Hallucinated-citation defense
+
+After build, the writer verifies every `kb_doc_id` in `kb_citations` exists in the account's KB. Unverified citations are stripped from the walker's "Source:" pane (the node still renders, just without a source). Engineer review surfaces stripped citations as a warning.
+
+---
+
+## 9. KB ingestion
+
+### 9.1 Connector interface
+
+```python
+class KBConnector(ABC):
+    async def test_credentials(self) -> bool
+    async def list_documents(self, since: datetime | None) -> AsyncIterator[KBDocRef]
+    async def fetch_content(self, ref: KBDocRef) -> KBDocContent
+    async def subscribe_to_changes(self) -> AsyncIterator[ChangeEvent]   # optional, no-op v1
+```
+
+Registry dispatches by `provider` string. Credentials encrypted at rest via Fernet (reuse `services/psa/encryption.py` pattern).
+
+### 9.2 Per-connector specifics
+
+| | IT Glue | Hudu | Microsoft Graph (SharePoint/OneDrive) |
+|---|---|---|---|
+| Auth | API token (header) | API key (header) | OAuth 2.0 |
+| Ingested types | Documents, KB Articles | Articles | docx, pdf, md, txt |
+| Never ingested | Passwords, Configurations, sensitive flex assets | Passwords, sensitive items | Files in folders matching `(secret\|confidential\|private)` heuristic; files with a tenant Sensitivity Label |
+| Filtering | Per-org (techs see all client orgs they have permission to) | Per-folder | Per-site / per-drive (owner picks at config time) |
+| Rate limits | ~100/min token bucket | ~250/min token bucket | Built-in Graph throttling backoff |
+
+All three deliver content to `kb_ingestion_writer` which:
+1. Chunks (paragraph-aware, configurable size with overlap)
+2. Embeds via `embedding_service`
+3. Upserts into `kb_documents` keyed on `(connector_config_id, source_ref)`; chunks into `kb_document_chunks`
+
+Cross-connector conflicts: same doc text appearing in two connectors yields two rows (provider-scoped `source_ref`). Engineers can dedup manually if needed.
+
+### 9.3 Sync scheduling
+
+`kb_ingestion_scheduler.py` runs as APScheduler interval job, `max_instances=1`. Per cycle:
+1. Query active `kb_connector_configs` where `last_sync_at` is older than `sync_interval_minutes` (default 360 = 6h).
+2. Dispatch per account; concurrency cap = 4 simultaneous accounts.
+3. For each connector: `list_documents(since=last_sync_at)` → for each ref, `fetch_content` → write.
+4. Compute the diff between current refs and existing rows (same `connector_config_id`); soft-delete missing ones via `deleted_at`.
+5. Update `last_sync_at`, `last_sync_status`, `last_sync_error`.
+
+Must use `_admin_session_factory()` not `get_db()` for startup-side and scheduler-side queries (per Lesson on RLS at startup — no `app.current_account_id` set).
+
+Immediate sync via `POST /api/v1/kb-connectors/{id}/sync` enqueues a job; scheduler picks it up within ~30s.
+
+---
+
+## 10. Escalation flow
+
+1. L1 clicks **Escalate** → modal (reason category + optional free text).
+2. `POST /api/v1/l1/sessions/{id}/escalate` → backend:
+   - Calls extended `escalation_package_generator.generate(session_id, include_l1_walk=true)`. Package contents:
+     ```
+     problem_statement, customer_name, customer_contact,
+     ticket_ref (PSA id or internal id),
+     target_kind ('flow' | 'proposal'), target_id,
+     walked_path,
+     ai_draft_proposal_id,
+     kb_citations,
+     escalation_reason, reason_category, l1_user_id
+     ```
+   - Creates an `ai_session` with the package serialized into system context for the chat surface.
+   - If PSA-backed: `psa_provider.reassign_ticket(ticket_id, to=account.engineer_queue_name)`. Default `'Tier 2'`. Owner configurable in `/account/integrations`.
+   - If internal-backed: `internal_tickets.status='escalated'`, `assigned_user_id=null` (round-robin assignment is out of scope).
+   - Writes notification via existing `notification_service` — bell badge to all engineers in account.
+   - Audit log entry; `acting_as` reflects whether L1 or coverage-engineer escalated.
+3. Toast on L1 side, return to `/l1`.
+4. Engineer clicks notification → `/pilot/{sessionId}` → chat surface renders the package as a sticky "Escalation context" card; engineer continues in chat.
+
+**Un-escalate is out of scope.** If engineer wants to bounce back, they reassign in PSA manually.
+
+---
+
+## 11. Internal ticket fallback
+
+When the account has no active PSA provider:
+- Intake creates `internal_tickets` row instead of a PSA ticket.
+- Queue surface merges PSA + internal with `Internal` / `PSA` origin badge.
+- Escalation flips `internal_tickets.status='escalated'` and assigns engineer (or leaves null for any engineer to claim — v1 behavior).
+- Engineer post-escalation sees the internal ticket as a session; no PSA roundtrip.
+
+**Promote to PSA:** owner-only action on any internal ticket. Pushes the ticket into the configured PSA provider, sets `psa_promoted_ticket_id`. Manual; not automatic on PSA-install. Lets MSPs adopt PSA mid-flight without orphaning prior internal tickets.
+
+---
+
+## 12. Outcome-validation lifecycle
+
+```
+1. L1 intake → match_or_build → FlowProposal(source='ai_realtime_l1',
+                                              validated_by_outcome=false,
+                                              linked_ticket_id=...)
+2. L1 walks → POST /l1/sessions/{id}/step appends to walked_path_snapshot
+3. L1 hits Resolve:
+     modal: "Did this resolve it?" [Yes] [No] + resolution_notes
+4. helpful=true → flow_proposal.validated_by_outcome = true
+                 → walked_path_snapshot frozen
+                 → ticket closed (PSA or internal)
+   helpful=false → validated_by_outcome stays false
+                  → L1 prompted: "Escalate instead?"
+5. Engineer review queue:
+     ORDER BY validated_by_outcome DESC, created_at DESC
+     - Outcome-validated drafts surface first
+     - Promote / edit-and-promote / retire
+6. Promote → new flow with source='ai_promoted'; original proposal kept with status='promoted'
+           → future match_or_build matches the new flow on the flow-match pass
+```
+
+---
+
+## 13. Out of scope (v1 non-goals)
+
+- End-user / self-service portal ("L0" tier).
+- Engineer warm-transfer / live take-over during a call.
+- L1 ↔ engineer real-time chat during a call.
+- Multi-language UI / customer-language toggle in walker.
+- Auto-promote internal tickets to PSA on integration install.
+- AI tree streaming (node-by-node).
+- KB write-back to IT Glue/Hudu/SharePoint (read-only ingestion).
+- Confluence connector.
+- Per-step KB citation editing in engineer review (engineers edit the tree, not citations).
+- Final Stripe pricing SKU (data model supports differential pricing; price set in Stripe dashboard).
+- "Switch to L1 mode" persistent toggle for engineers (coverage flag + banner only).
+- Cancel/un-escalate flow.
+- Round-robin engineer assignment on internal-ticket escalations.
+
+---
+
+## 14. Testing strategy
+
+### 14.1 Backend (pytest)
+
+- Unit: `match_or_build` covers all four paths (flow-match, proposal-match, built, aborted_no_kb).
+- Unit: `ai_tree_builder` schema validation — assert rejection of malformed Anthropic output before persistence.
+- Unit: each connector's `list_documents` + `fetch_content` against recorded HTTP fixtures.
+- Integration: intake → walk → resolve(helpful=true) → assert `FlowProposal.validated_by_outcome=true`, ticket closed.
+- Integration: intake → walk → escalate → assert PSA `reassign_ticket` invoked, `ai_session` created with package, audit log entry, notification dispatched.
+- Integration: KB scheduler — `max_instances=1`, sequential per-account, soft-delete on removal.
+- **RLS regression** (highest priority): `l1_tech` user in account A cannot read account B's tickets, drafts, KB docs, or connector configs. Added to existing RLS test suite.
+- Anti-parrot: existing CI test auto-discovers new prompt module.
+
+### 14.2 Frontend
+
+- Unit: `usePermissions` — L1 sees L1 paths, blocked from engineer paths. Coverage flag opens L1 paths.
+- Unit: `L1WalkPage` — node advance, escalate modal, resolve modal flips `validated_by_outcome` correctly.
+- Unit: `L1CoverageBanner` — visible for engineer-with-flag on `/l1/*`, hidden for L1 users.
+- E2E (Playwright, scoped selectors per Lesson):
+  - L1 sign-in → dashboard → intake → walker → resolve → verify ticket closed + proposal flagged.
+  - Engineer with `can_cover_l1` → sidebar entry visible → click → coverage banner shows → walks a session → audit log records `acting_as='l1_coverage'`.
+  - L1 hitting `/pilot`, `/trees/new`, `/escalations` → 403 or redirect.
+
+---
+
+## 15. Acceptance criteria (v1 ships when…)
+
+- L1 role assignable; assigned L1 sees L1 sidebar only; no engineer route reachable.
+- L1 intake creates a ticket (PSA or internal) and lands in walker session.
+- Walker handles both flows and proposals; AI-built badge + sources shown for proposals.
+- Escalate generates package, reassigns ticket, notifies engineers.
+- Resolve flips `validated_by_outcome`; review queue prioritizes outcome-validated drafts.
+- All three KB connectors configurable; initial sync + periodic re-sync + soft-delete on removal.
+- AI build refuses with informative error when account KB is empty.
+- Coverage flag works end-to-end with audit-log tagging.
+- RLS blocks cross-tenant reads on every new table.
+- L1 seat count tracked separately from engineer seats in admin/billing UI.
+
+---
+
+## 16. Risks & mitigations
+
+| Risk | Mitigation |
+|---|---|
+| AI builds an unsafe tree | Schema validation rejects malformed output. Engineer review is the gate before draft becomes "real" flow. v1 refuses to build when KB is empty. |
+| Hallucinated KB citations | Post-build verification that each `kb_doc_id` exists; unverified citations stripped from walker, surfaced as warning in engineer review. |
+| Duplicate proposals for same problem | Validated-proposal match pass deduplicates after one L1 validates; pre-validation dups are tolerated and dedup'd during engineer review. |
+| KB ingestion captures sensitive content | Per-connector deny-lists (passwords, sensitive flex assets, MS Graph Sensitivity Labels). Owners exclude specific folders/sites at config. All ingested docs visible in `/account/kb` for manual deletion. |
+| AI build latency frustrates customer on call | Build-progress UI sets expectation. Escalate button visible from page load. Future: pre-warm builds on PSA-ticket-landed event. |
+| Three connectors is more scope than originally proposed | Acknowledged. Each connector is ~1–2 weeks of work. Plan should sequence them and allow shipping with IT Glue + Hudu first if SharePoint slips. |
+| Engineer review queue backlog stalls library growth | Validated-proposal match pass means good drafts get reused without engineer review. Backlog only delays the move from `'proposal'` to `'flow'`, not the L1's ability to use validated content. |
+
+---
+
+## 17. Naming reference
+
+| Layer | Value |
+|---|---|
+| DB enum (`account_role`) | `l1_tech` |
+| UI display | "L1 Tech" / "L1" |
+| Sidebar entry | "L1 Workspace" |
+| URL prefix | `/l1` |
+| Coverage flag column | `users.can_cover_l1` |
+| Coverage audit tag | `acting_as = 'l1_coverage'` |
+| Pricing label | "L1 seat" |
+| Stripe SKU | Set in Stripe dashboard at launch — data model supports differential pricing now |
+
+---
+
+## 18. Open implementation decisions (deferred to plan, not blocking design)
+
+- Specific `MATCH_THRESHOLD` default value validation (initial 0.75, tune from telemetry post-launch).
+- Specific Anthropic model choice for `l1_realtime_build` (Sonnet vs Opus — pick based on quality benchmark during plan).
+- Chunk size + overlap for KB ingestion writer (tune in implementation).
+- Engineer queue label default (`'Tier 2'` vs `'Engineering'`) — owner-configurable anyway.
+- Exact look of the build-progress shimmer animation — design-system handoff.
+
+These are tuning/UX-polish details, not architectural forks. They land during the writing-plans phase, not here.
+
+### Note on scope and phasing
+
+This is a substantive feature: new role, four frontend pages, ~12 endpoints, AI tree-builder, three KB connectors, escalation extensions, and six migrations. The implementation plan will almost certainly phase the work — a reasonable cut is:
+
+- **Phase 1:** role + L1 surface against existing authored flows (no AI build, no connectors yet). Validates the seat model, walker UX, escalation, internal ticket fallback, and coverage flag end-to-end.
+- **Phase 2:** `kb_documents` schema + AI tree-builder + match-or-build pipeline. Enables real-time AI flows grounded on manually-uploaded KB.
+- **Phase 3:** the three KB connectors (IT Glue, Hudu, SharePoint/OneDrive). Each is roughly self-contained — can ship one at a time and reorder if a connector blocks.
+
+Phasing is a plan-level decision; the spec captures the full feature.
+
+---
+
+*End of spec.*