docs(design): L1 workspace feature spec
New seat tier between engineer and viewer. Dedicated /l1 surface (dashboard + walker + drafts) for first-call helpdesk staff. Walk-in intake + PSA queue both produce tickets. Match-or-build pipeline prefers authored flows, then outcome-validated AI drafts, then builds fresh from KB. Three KB connectors: IT Glue, Hudu, SharePoint/OneDrive. Escalation via package + PSA reassign, picked up in chat. Engineer coverage via per-user can_cover_l1 flag with audit-log tagging.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
717
docs/superpowers/specs/2026-05-28-l1-workspace-design.md
Normal file
@@ -0,0 +1,717 @@
# L1 Workspace — Design Spec

**Date:** 2026-05-28
**Status:** Draft (pending implementation plan)
**Audience for this doc:** engineers + reviewers building the L1 workspace feature

---
## 1. Summary

Introduce a dedicated **L1 helpdesk** workspace as a new seat tier in ResolutionFlow. L1 techs walk customers through yes/no decision trees on inbound tickets and phone calls. The platform either matches an existing authored flow, reuses an outcome-validated AI draft, or builds a fresh decision tree in real time from the MSP's ingested knowledge base. Drafts that resolve a call become "outcome-validated" and surface first in the engineer review queue for promotion to authored flows. KB ingestion supports manual upload plus three MSP-native connectors: IT Glue, Hudu, and Microsoft SharePoint/OneDrive.

This re-introduces the original deterministic tree-walker UX — which had been deprecated in favor of chat-primary FlowPilot — and repositions it as a frontline-tier product surface distinct from the engineer chat surface.

---
## 2. Motivation

The current ResolutionFlow product funnels every user — regardless of skill tier — into a single chat-primary surface (`AssistantChatPage` mounted at `/pilot`). The chat is excellent for engineers but is the wrong primitive for L1 helpdesk staff who:

- Take inbound phone calls and need a fast, deterministic click-through UX
- Resolve simple, recurring problems (password resets, mailbox connection issues, VPN disconnects, printer queue clears, etc.)
- Are not authorized to resolve complex issues themselves; they hand off to engineers

A tree-walker UX serves this audience natively. The substrate already exists in the codebase — decision-tree data model, authoring tools, RAG, KB Accelerator, escalation packaging — but no first-class L1 surface ties it together. This spec defines that surface and the supporting AI/KB pipeline.

---
## 3. Users & roles

### 3.1 Role hierarchy

`super_admin > owner > engineer > l1_tech > viewer`

`l1_tech` is added to the `account_role` enum. Permissions enforced via `app/core/permissions.py` and `app/api/deps.py`.

### 3.2 What L1 can do

- Use the `/l1/*` surface
- Open tickets from their queue (PSA-fed or internal)
- Intake walk-in/phone-call problems (creates a ticket as a side effect)
- Walk authored flows and AI-built FlowProposal drafts
- Resolve or escalate a session
- View their own AI drafts list (read-only — outcome tags shown)

### 3.3 What L1 cannot do

- See the chat surface (`/pilot`) — sidebar hidden, route 403s
- Author or edit flows
- See `/review-queue` or `/escalations` (engineer inboxes)
- See team analytics (only `/analytics/me`)
- Promote AI drafts (engineers/owners only, via existing review queue)
- Configure KB connectors (owner-only)

### 3.4 Engineer L1 coverage

Engineers do NOT see the L1 surface by default. Owners can toggle `users.can_cover_l1 = true` on individual engineer users. Engineers with that flag (and all owners/super_admins) see an "L1 Workspace" entry in their sidebar. Clicking it puts them in `/l1/*` with a sticky banner: *"Covering L1 — actions logged as coverage."* Coverage actions are audit-logged with `acting_as = 'l1_coverage'`.

Backend dep: `require_l1_or_coverage` = `l1_tech | (engineer AND can_cover_l1) | owner | super_admin`.

This mirrors the existing orthogonal-flag pattern (`is_team_admin`) — no new architectural concept.
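The access rule above can be sketched as a pure predicate. This is a minimal illustration, not the real dependency in `app/api/deps.py`: the `User` class and both function names are hypothetical. Tagging owners and super_admins as `'l1_coverage'` is an assumption drawn from the coverage-banner condition later in this spec; §3.4 only states it for engineers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class User:
    """Hypothetical stand-in for the real user model."""
    role: str                   # an account_role value, e.g. 'engineer'
    can_cover_l1: bool = False  # per-user coverage flag set by an owner


def may_use_l1_surface(user: User) -> bool:
    """l1_tech | (engineer AND can_cover_l1) | owner | super_admin."""
    if user.role in ("l1_tech", "owner", "super_admin"):
        return True
    return user.role == "engineer" and user.can_cover_l1


def acting_as(user: User) -> Optional[str]:
    """Audit tag for actions on /l1/*: None for real L1 techs, else 'l1_coverage'.

    Assumption: anyone allowed in who is not an l1_tech counts as coverage.
    """
    if not may_use_l1_surface(user):
        raise PermissionError("user may not act on the L1 surface")
    return None if user.role == "l1_tech" else "l1_coverage"
```

Keeping the rule in one predicate means the route guard and the audit writer cannot drift apart; a viewer with a stray `can_cover_l1 = true` is still rejected because the flag only counts for engineers.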
### 3.5 Billing data model

- `accounts.l1_seats_purchased INTEGER NOT NULL DEFAULT 0` (new column)
- Existing `accounts.seats_purchased` continues to represent engineer seats
- New Stripe SKU placeholder for L1 seat; actual pricing set in Stripe dashboard out-of-band

---
## 4. Architecture overview

### 4.1 New components

**Frontend:**
- `pages/l1/L1Dashboard.tsx` — landing page; ticket queue + describe-the-problem intake
- `pages/l1/L1WalkPage.tsx` — purpose-built walker; yes/no cards, transcript, persistent escalate/resolve
- `pages/l1/L1DraftsPage.tsx` — read-only list of the L1's AI drafts and promotion status
- `pages/l1/L1TicketsPage.tsx` — full-page queue (PSA + internal merged)
- `components/l1/L1CoverageBanner.tsx` — slim banner shown to engineer-coverers

**Backend:**
- `services/match_or_build.py` — orchestrator (RAG match → fallback to AI build)
- `services/ai_tree_builder.py` — real-time AI tree generation via Anthropic
- `services/kb_connectors/` package — base, registry, encryption, plus `itglue.py`, `hudu.py`, `microsoft_graph.py`
- `services/kb_ingestion_writer.py` — shared writer used by manual upload + all connectors
- `services/kb_ingestion_scheduler.py` — APScheduler job, `max_instances=1`, per-connector sync
- `services/internal_ticket_service.py` — CRUD + status transitions for the no-PSA fallback
- `services/l1_session_service.py` — walking-session lifecycle
- `api/endpoints/l1.py` — L1-role endpoints
- `api/endpoints/kb_connectors.py` — KB connector config endpoints (owner-only for write)

**Reused / extended:**
- `services/rag_service.py` — flow & KB matching (existing)
- `services/flow_matching_engine.py` — existing
- `services/escalation_package_generator.py` — extended to include walked path, AI draft pointer, KB citations
- `models/FlowProposal` — new columns (see §5)
- `services/psa/` — already supports ticket create + reassign across CW/Autotask/HaloPSA
- `services/embedding_service.py` — used by KB ingestion writer
- New `kb_documents` + `kb_document_chunks` tables for RAG-retrievable document storage, separate from the existing `kb_imports` (which is a document→tree conversion record, not a persistent KB store — see §5)
- Audit log writer — gains `acting_as` field
### 4.2 Data flow — walk-in / phone-call intake

```
L1 types: "User can't connect Outlook after password reset"
POST /api/v1/l1/intake
  body: { problem_statement, customer_name?, customer_contact? }
  → create ticket
      - PSA if configured: psa_provider.create_ticket(...)
      - else: internal_tickets row
  → match_or_build(account_id, problem_text, ticket_ref)
      → rag_service.match_flows(...) → top hit; if score ≥ threshold return as 'flow'
      → rag_service.match_proposals(... where validated_by_outcome=true)
          → top hit; if score ≥ threshold return as 'proposal'
      → ai_tree_builder.build(problem_text, kb_chunks, nearest_flows)
          → persist FlowProposal(source='ai_realtime_l1',
                                 linked_ticket_id,
                                 linked_ticket_kind,
                                 validated_by_outcome=false)
          → return as 'proposal'
  → l1_session_service.start(...)
  → return { session_id, target_kind, target_id, intake_type }
→ navigate to /l1/walk/{session_id}
```

### 4.3 Data flow — PSA-queue intake

The L1 dashboard polls the L1's PSA queue plus their internal tickets. Clicking a ticket row calls `POST /api/v1/l1/tickets/{ticket_ref}/start` which is the same `match_or_build` path (the `problem_statement` is the ticket subject + description) followed by walker navigation.

---
## 5. Data model

All new tenant-isolated tables get RLS policies (account-scoped, WITH CHECK). All TIMESTAMPs are `TIMESTAMPTZ`. No `--rev-id` on Alembic; no `--autogenerate` for enum/RLS work.

### 5.1 `FlowProposal` — extended

Existing AI-draft model. Add columns:

| Column | Type | Notes |
|---|---|---|
| `source` | `VARCHAR(30) NOT NULL` | `'ai_realtime_l1' \| 'kb_accelerator' \| 'manual_draft'`. Backfill existing rows to `'manual_draft'`. |
| `linked_ticket_id` | `VARCHAR(64) NULL` | PSA id or internal_tickets UUID (stored as text) |
| `linked_ticket_kind` | `VARCHAR(10) NULL` | `'psa' \| 'internal'` |
| `validated_by_outcome` | `BOOLEAN NOT NULL DEFAULT FALSE` | Flipped to true when L1 resolves and marks helpful=true |
| `walked_path_snapshot` | `JSONB NULL` | Frozen at resolve/escalate; shape `[{node_id, question, answer, l1_note}]` |

Engineer review queue sort:
```sql
ORDER BY validated_by_outcome DESC, created_at DESC
```
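For unit tests that do not hit the database, the same ordering can be reproduced in memory. A small sketch, assuming proposals are plain dicts (the function name and the sample rows are made up for illustration):

```python
from datetime import datetime, timezone


def review_queue_order(proposals):
    """Outcome-validated drafts first, newest first within each group."""
    return sorted(
        proposals,
        key=lambda p: (p["validated_by_outcome"], p["created_at"]),
        reverse=True,  # True sorts above False, later timestamps above earlier
    )


drafts = [
    {"id": "a", "validated_by_outcome": False,
     "created_at": datetime(2026, 5, 28, tzinfo=timezone.utc)},
    {"id": "b", "validated_by_outcome": True,
     "created_at": datetime(2026, 5, 20, tzinfo=timezone.utc)},
    {"id": "c", "validated_by_outcome": True,
     "created_at": datetime(2026, 5, 25, tzinfo=timezone.utc)},
]
ordered_ids = [p["id"] for p in review_queue_order(drafts)]
```

Here the newest draft (`a`) still sorts last because it has not been validated by an outcome.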
### 5.2 `internal_tickets` — new

```
id                      UUID PRIMARY KEY
account_id              UUID NOT NULL              (RLS-scoped)
created_by_user_id      UUID NOT NULL              (the L1 who took the call)
customer_name           VARCHAR(120)
customer_contact        VARCHAR(200) NULL          (email or phone, free text)
problem_statement       TEXT NOT NULL
status                  VARCHAR(30) NOT NULL       -- 'open' | 'walking' | 'resolved' | 'escalated'
flow_id                 UUID NULL FK trees
flow_proposal_id        UUID NULL FK flow_proposals
ai_session_id           UUID NULL FK ai_sessions   (set when engineer picks up in chat post-escalation)
assigned_user_id        UUID NULL                  (engineer post-escalation)
resolution_notes        TEXT NULL
psa_promoted_ticket_id  VARCHAR(64) NULL           (set if later promoted to PSA)
created_at              TIMESTAMPTZ NOT NULL
updated_at              TIMESTAMPTZ NOT NULL
resolved_at             TIMESTAMPTZ NULL
```

RLS: account-scoped, WITH CHECK on insert/update.
### 5.3 `kb_connector_configs` — new

```
id                     UUID PRIMARY KEY
account_id             UUID NOT NULL              (RLS-scoped)
provider               VARCHAR(20) NOT NULL       -- 'itglue' | 'hudu' | 'microsoft_graph'
display_name           VARCHAR(80) NOT NULL
credentials_encrypted  BYTEA NOT NULL             -- Fernet, same pattern as services/psa/encryption.py
is_active              BOOLEAN NOT NULL DEFAULT TRUE
sync_interval_minutes  INTEGER NOT NULL DEFAULT 360
last_sync_at           TIMESTAMPTZ NULL
last_sync_status       VARCHAR(20) NULL           -- 'success' | 'error' | 'running'
last_sync_error        TEXT NULL
created_by_user_id     UUID NOT NULL
created_at             TIMESTAMPTZ NOT NULL
updated_at             TIMESTAMPTZ NOT NULL
UNIQUE (account_id, provider, display_name)
```

RLS: account-scoped, WITH CHECK.
### 5.4 New tables: `kb_documents` + `kb_document_chunks`

The existing `kb_imports` table is a document→tree conversion record (status lifecycle `processing | ready | committed | failed`, target `tree_id`) — designed to turn one document into one authored flow. It is NOT a persistent KB document store and does not power RAG retrieval.

The L1 feature needs a separate pair of tables that store ingested docs in RAG-retrievable form:

**`kb_documents`** — one row per ingested document:

```
id                   UUID PRIMARY KEY
account_id           UUID NOT NULL          (RLS-scoped)
source_kind          VARCHAR(20) NOT NULL   -- 'upload' | 'paste' | 'itglue' | 'hudu' | 'microsoft_graph'
source_ref           VARCHAR(200) NULL      -- provider-side document ID for re-sync
connector_config_id  UUID NULL FK kb_connector_configs
title                VARCHAR(500) NOT NULL
content              TEXT NOT NULL          -- full post-extraction text
content_hash         VARCHAR(64) NOT NULL   -- sha256 for change-detection
metadata             JSONB NULL             -- provider-specific (org_id, drive_id, etc.)
last_synced_at       TIMESTAMPTZ NULL
deleted_at           TIMESTAMPTZ NULL       -- soft-delete on connector removal
created_at           TIMESTAMPTZ NOT NULL
updated_at           TIMESTAMPTZ NOT NULL
```

Unique partial index: `(connector_config_id, source_ref) WHERE source_ref IS NOT NULL`.

**`kb_document_chunks`** — chunks with embeddings, used by `rag_service.match_kb_chunks`:

```
id           UUID PRIMARY KEY
document_id  UUID NOT NULL FK kb_documents ON DELETE CASCADE
account_id   UUID NOT NULL           -- denormalized for RLS
chunk_index  INTEGER NOT NULL
content      TEXT NOT NULL
embedding    VECTOR(<dim>) NOT NULL  -- dim matches embedding_service
metadata     JSONB NULL              -- section title, page number, etc.
created_at   TIMESTAMPTZ NOT NULL
UNIQUE (document_id, chunk_index)
```

pgvector index (ivfflat or hnsw) on `embedding`; choice tuned during implementation.

RLS on both tables: account-scoped, WITH CHECK on insert.

**Coexistence with `kb_imports`:** when an L1 (or owner) uploads a doc, the system can populate **both** — the existing KBImport pipeline produces a draft tree, and the new ingestion writer additionally chunks+embeds the doc into `kb_documents` for RAG. Both paths share the upload endpoint but write to independent tables. Connectors only write to `kb_documents` (no auto-tree-conversion from synced docs in v1).
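The `content_hash` column on `kb_documents` lets the writer skip re-chunking and re-embedding documents whose text has not changed between syncs. A minimal sketch of that check, using only the standard library; the function names are hypothetical and the comparison policy (re-ingest on any hash difference) is an assumption:

```python
import hashlib
from typing import Optional


def content_hash(text: str) -> str:
    """sha256 hex digest of the extracted text (64 hex chars, fits VARCHAR(64))."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def needs_reingest(stored_hash: Optional[str], fetched_text: str) -> bool:
    """True for a new document (no stored hash) or when the text changed."""
    return stored_hash != content_hash(fetched_text)
```

Hashing the post-extraction text rather than the raw file bytes keeps the check stable when a provider re-encodes a file without changing what it says.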
### 5.5 Other column additions

- `users.can_cover_l1 BOOLEAN NOT NULL DEFAULT FALSE`
- `accounts.l1_seats_purchased INTEGER NOT NULL DEFAULT 0`
- `audit_logs.acting_as VARCHAR(30) NULL` — `'l1_coverage'` when engineer is in coverage mode; null otherwise
- `account_role` enum: add `'l1_tech'`

### 5.6 Migration ordering

Six manual Alembic revisions (no `--rev-id`, no `--autogenerate`):

1. Add `'l1_tech'` to `account_role` enum.
2. Add `users.can_cover_l1`, `accounts.l1_seats_purchased`, `audit_logs.acting_as`.
3. Extend `flow_proposals` with new columns + backfill existing rows to `source='manual_draft'`.
4. Create `internal_tickets` + RLS policies (account-scoped, WITH CHECK).
5. Create `kb_connector_configs` + RLS policies.
6. Create `kb_documents` + `kb_document_chunks` tables + RLS policies + pgvector index on chunks.

Per Lesson on tenant-isolated tables: any service-construction site that creates rows on these tables must pass `account_id=` explicitly. Grep all `Model(` sites before merge.

---
## 6. Backend services & endpoints

### 6.1 New services

| Module | Purpose |
|---|---|
| `services/match_or_build.py` | Orchestrator. Single async entrypoint `match_or_build(account_id, problem_text, ticket_ref) -> MatchOrBuildResult`. |
| `services/ai_tree_builder.py` | Real-time AI tree generation. Anthropic via existing `_call_anthropic_cached` pattern. Model tier via `settings.get_model_for_action('l1_realtime_build')`. Output validated against the flow node schema with Pydantic; rejects malformed output. |
| `services/kb_connectors/base.py` | Abstract `KBConnector` with `test_credentials`, `list_documents`, `fetch_content`, `subscribe_to_changes` (optional). |
| `services/kb_connectors/itglue.py` | IT Glue REST client. |
| `services/kb_connectors/hudu.py` | Hudu REST client. |
| `services/kb_connectors/microsoft_graph.py` | Microsoft Graph (SharePoint/OneDrive) client. |
| `services/kb_connectors/registry.py` | `KBConnectorRegistry` (mirrors `PsaProviderRegistry`). |
| `services/kb_connectors/encryption.py` | Fernet wrapper (or reuse the PSA one if generic). |
| `services/kb_ingestion_writer.py` | Shared writer: chunk → embed → upsert. Used by manual upload AND connector sync. |
| `services/kb_ingestion_scheduler.py` | APScheduler interval job, `max_instances=1`. Sequential per account; concurrency cap = 4 accounts simultaneously. |
| `services/internal_ticket_service.py` | CRUD + status transitions for `internal_tickets`. |
| `services/l1_session_service.py` | Walking-session lifecycle: start, step, resolve, escalate. Bridges `ai_sessions` and the walked target. |

### 6.2 Extended services

- `services/escalation_package_generator.py` — adds inputs: `walked_path`, `ai_draft_proposal_id`, `kb_citations`. New caller path from `l1_session_service.escalate(...)`.
- KB Accelerator endpoint — accepts ingested content via the shared `kb_ingestion_writer`. Manual upload and connector sync share the same persistence path.
### 6.3 New endpoints

All under `require_l1_or_coverage` unless noted. Mounted under `/api/v1/l1`.

| Method | Path | Purpose | Auth |
|---|---|---|---|
| GET | `/l1/queue` | Merged ticket queue (PSA + internal). Pagination + status filter. | `require_l1_or_coverage` |
| POST | `/l1/intake` | Walk-in intake. Body `{problem_statement, customer_name?, customer_contact?}`. Creates ticket, returns `{session_id, target_kind, target_id, intake_type}`. | `require_l1_or_coverage` |
| POST | `/l1/tickets/{ticket_ref}/start` | Start walker from an existing ticket. Internally same as intake but skips ticket creation. | `require_l1_or_coverage` |
| POST | `/l1/sessions/{id}/step` | Record an answer. Body `{node_id, answer, note?}`. Appends to `walked_path_snapshot`. | `require_l1_or_coverage` |
| POST | `/l1/sessions/{id}/resolve` | Close as resolved. Body `{resolution_notes, helpful: bool}`. Sets `validated_by_outcome=true` on the proposal when `helpful=true` AND target was a proposal. Closes the ticket. | `require_l1_or_coverage` |
| POST | `/l1/sessions/{id}/escalate` | Generate escalation package + reassign ticket. Body `{reason, reason_category}`. | `require_l1_or_coverage` |
| GET | `/l1/drafts` | List current user's AI drafts with promotion status. | `require_l1_or_coverage` |

KB connector endpoints (`/api/v1/kb-connectors`):

| Method | Path | Purpose | Auth |
|---|---|---|---|
| GET | `/kb-connectors` | List configured connectors for account. | `require_l1_or_above` |
| POST | `/kb-connectors` | Create. OAuth handoff for Microsoft Graph; API token entry for IT Glue/Hudu. | `require_account_owner` |
| DELETE | `/kb-connectors/{id}` | Remove (soft-disable). | `require_account_owner` |
| POST | `/kb-connectors/{id}/sync` | Trigger immediate sync (enqueued). | `require_account_owner` |
| GET | `/kb-connectors/{id}/status` | Sync status + doc count + last error. | `require_l1_or_above` |

Internal ticket endpoints (`/api/v1/internal-tickets`):

| Method | Path | Purpose | Auth |
|---|---|---|---|
| GET | `/internal-tickets` | List (account-scoped). | `require_l1_or_coverage` |
| GET | `/internal-tickets/{id}` | Detail. | `require_l1_or_coverage` |
| POST | `/internal-tickets/{id}/promote-to-psa` | Push to configured PSA, set `psa_promoted_ticket_id`. | `require_account_owner` |

User management addition:

| Method | Path | Purpose | Auth |
|---|---|---|---|
| PATCH | `/users/{id}/coverage` | Set `can_cover_l1` flag. Body `{can_cover_l1: bool}`. | `require_account_owner` |
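The `/l1/sessions/{id}/step` and `/l1/sessions/{id}/resolve` rows in the first table carry the outcome-validation rule: a draft is validated only when the walk targeted a proposal and the L1 marked it helpful. The sketch below models that rule in memory. `WalkSession` is a hypothetical stand-in, not the real `l1_session_service`; passing `question` into `step` is an assumption (the endpoint body only carries `node_id`, `answer`, and `note`, so the service would presumably look the question up from the node).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class WalkSession:
    """Hypothetical in-memory stand-in for an L1 walking session."""
    target_kind: str                                   # 'flow' | 'proposal'
    walked_path: List[dict] = field(default_factory=list)
    validated_by_outcome: bool = False
    frozen: bool = False

    def step(self, node_id: str, question: str, answer: str,
             note: Optional[str] = None) -> None:
        """Append one answered node, using the walked_path_snapshot shape."""
        if self.frozen:
            raise ValueError("session already resolved or escalated")
        self.walked_path.append(
            {"node_id": node_id, "question": question,
             "answer": answer, "l1_note": note}
        )

    def resolve(self, helpful: bool) -> None:
        """Freeze the path; validate the draft only for a helpful proposal walk."""
        if helpful and self.target_kind == "proposal":
            self.validated_by_outcome = True
        self.frozen = True
```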
---

## 7. Frontend surface

### 7.1 Sidebar — L1 view

```
LOGO
─────────────
Workspace        /l1
Tickets          /l1/tickets
My Drafts        /l1/drafts
─────────────
Guides           /guides
Account          /account (filtered — no integrations, no categories)
```

No `/pilot`, no `/trees`, no `/flows`, no `/review-queue`, no `/escalations`, no team analytics. Sidebar.tsx picks the nav array by role.
### 7.2 Sidebar — engineer coverage view

Engineer's existing sidebar plus a single appended entry "L1 Workspace" → `/l1`. Shown when `canCoverL1 || isOwner || isSuperAdmin`.

### 7.3 `/l1` dashboard layout

Three vertical zones plus a conditional fourth, single column, max width ~1100px:

1. **Greeting** — uppercase tracking date label + Bricolage 700 hero ("Good morning, {firstName}.")
2. **Describe the problem** card — large textarea (autofocus on load), optional `customer_name` + `customer_contact` fields, single primary CTA "Start walk →" (the only electric-blue element on the page)
3. **Open tickets** — section label, count, table rows (merged PSA + internal with origin badges), row hover `bg-elevated`
4. **Resume in progress** — shown only when L1 has a half-walked session

Tailwind v4 tokens: `bg-page` base, `bg-card` zones, `bg-elevated` row hover, electric-blue accent only on primary CTA. No `text-secondary`. All borders `border-default`.
### 7.4 `/l1/walk/{sessionId}` walker

Sticky header + two-pane body, full-height (flex chain per Lesson — every ancestor needs `flex` + `flex-1` + `min-h-0`).

**Header:**
- Back arrow + ticket ref + customer name + AI-built badge (when target is proposal)
- Problem statement line
- Persistent action buttons: `[ Escalate ]` `[ Resolve ✓ ]`

**Left pane (main):**
- "Step N · estimated M" label
- Current node card — large yes/no/answer buttons (min 44px tap target)
- Optional note textarea below the card (appended to `walked_path_snapshot`)
- On a fresh proposal that's still building: shimmer placeholder + "Building from KB… ~10s"

**Right pane (transcript):**
- Walked-so-far list (node title + answer chosen)
- Current step highlight
- "Source:" section listing KB citations for the current node (proposal walks only)

**Resolve modal:**
- "Did this resolve it?" `[ Yes ]` `[ No ]`
- Resolution notes textarea
- Yes + target was proposal → sets `validated_by_outcome=true`
- No → prompt to escalate instead

**Escalate modal:**
- Reason category dropdown: *Out of L1 scope · Customer demanding senior · Tree dead-ended · AI tree wrong · Other*
- Free-text reason
- Confirm
### 7.5 `/l1/drafts` page

Read-only list, columns: `created` · `problem (truncated)` · `ticket #` · `status` (pending review / outcome-validated / promoted / retired). Click → read-only detail view showing tree + walked path. No edit affordances.

### 7.6 `/l1/tickets` page

Full-page version of the dashboard queue widget. Filter by status, origin (PSA/internal), assigned-to-me.

### 7.7 Coverage banner

`<L1CoverageBanner />` — slim ~32px band, info-cyan-dim background, mounted at the top of all `/l1/*` pages when `!isL1Tech && (canCoverL1 || isOwner || isSuperAdmin)`:

```
You're covering L1. Actions logged as coverage.   [Switch back →]
```

The "Switch back" link returns to `/`.
### 7.8 Routing

```tsx
const L1Dashboard = lazyWithRetry(() => import('@/pages/l1/L1Dashboard'))
const L1WalkPage = lazyWithRetry(() => import('@/pages/l1/L1WalkPage'))
const L1DraftsPage = lazyWithRetry(() => import('@/pages/l1/L1DraftsPage'))
const L1TicketsPage = lazyWithRetry(() => import('@/pages/l1/L1TicketsPage'))
```

Mounted under the `/` ProtectedRoute branch at:
- `/l1` → `L1Dashboard`
- `/l1/walk/:sessionId` → `L1WalkPage`
- `/l1/drafts` → `L1DraftsPage`
- `/l1/tickets` → `L1TicketsPage`

Wrapped in `L1RouteGuard` (403 if not `l1_tech` AND not coverage-flagged). `ProtectedRoute.tsx` post-login redirect: L1 users land on `/l1` instead of `/`.

`lazyWithRetry`, not `React.lazy` (per existing convention).

---
## 8. AI match-or-build pipeline

### 8.1 Match-or-build algorithm

```
match_or_build(account_id, problem_text, ticket_ref):
    embedding = embedding_service.embed(problem_text)

    # 1. Match authored flows
    flow_hits = rag_service.match_flows(account_id, embedding, k=5)
    if flow_hits and flow_hits[0].score >= MATCH_THRESHOLD:
        return {kind: 'flow', id: flow_hits[0].flow_id, score: ...}

    # 2. Match outcome-validated proposals only
    proposal_hits = rag_service.match_proposals(
        account_id, embedding, k=5,
        where=validated_by_outcome=true,
    )
    if proposal_hits and proposal_hits[0].score >= MATCH_THRESHOLD:
        return {kind: 'proposal', id: proposal_hits[0].proposal_id, score: ...}

    # 3. Build fresh
    kb_chunks = rag_service.match_kb_chunks(account_id, embedding, k=8)
    if not kb_chunks:
        raise BuildAbortedNoKB(
            "Cannot build a tree with no KB content. "
            "Upload docs or wait for a connector sync."
        )
    nearest_flows = flow_hits[:3]
    proposal = ai_tree_builder.build(
        problem_text, kb_chunks, nearest_flows, account_id, ticket_ref
    )
    return {kind: 'proposal', id: proposal.id, score: None}
```

`MATCH_THRESHOLD` — per-account configurable; default `0.75` (cosine).

The "no empty KB build" rule is enforced because an AI tree built on the model's general knowledge — without MSP-specific grounding — risks suggesting unsafe or hallucinated fixes.
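To make the `0.75` cosine threshold concrete, here is a small standard-library sketch of the "top hit, if it clears the threshold" check that steps 1 and 2 rely on. It is an illustration only: `cosine` and `top_hit` are hypothetical helpers, and in practice the similarity search would presumably run inside pgvector via `rag_service` rather than in Python.

```python
import math
from typing import Dict, List, Optional, Tuple

MATCH_THRESHOLD = 0.75  # spec default; per-account configurable


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_hit(query: List[float],
            candidates: Dict[str, List[float]],
            threshold: float = MATCH_THRESHOLD) -> Optional[Tuple[str, float]]:
    """Return (id, score) of the best candidate, or None if it misses the threshold."""
    if not candidates:
        return None
    best_id, best_score = max(
        ((cid, cosine(query, vec)) for cid, vec in candidates.items()),
        key=lambda pair: pair[1],
    )
    return (best_id, best_score) if best_score >= threshold else None
```

A `None` result at step 1 falls through to step 2, and a `None` at step 2 falls through to the KB-grounded build, so raising the threshold trades fewer wrong matches for more fresh builds.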
### 8.2 AI tree-build details

**Model:** `settings.get_model_for_action('l1_realtime_build')`. Recommend Sonnet for v1 (latency-sensitive).

**Schema:** output validated against the existing flow node schema (matches `tree_editor` output). Validation failure aborts the build rather than persisting malformed data.

**Prompt strategy** (per Lesson on prompt anti-parrot — critical):
- System prompt: role definition + output schema using `<placeholder>` notation only. Never literal field values.
- Few-shot examples loaded as user/assistant messages from a separate file, never inline in the system prompt.
- User message: `{problem_statement}` + `{kb_context: [doc_title, section, content]}` + `{nearest_flow_summaries}` + instruction to cite KB chunks per node.
- Output includes `kb_citations: [{node_id, kb_doc_id, snippet}]` for walker's "Source:" pane and engineer review.

**Latency:** whole-tree-then-return (~5–15s typical). UX is a shimmer "Building from KB…" placeholder. Streaming node-by-node deferred to v2.

**Anthropic SDK config** (per Lesson): `max_retries=1`. Prompt caching enabled on the stable system+few-shot bundle (high cache hit rate expected per account).

**Telemetry:**
- `l1.match_or_build.duration_ms`, `l1.match_or_build.outcome` (`flow_match`/`proposal_match`/`built`/`aborted_no_kb`)
- `anthropic.cache` events (existing pattern) tagged `action=l1_realtime_build`
- `l1.tree_build.tokens_in`, `tokens_out`

**Anti-parrot guardrail:** the existing `tests/test_prompt_anti_parrot.py` auto-discovers new prompt constants via pattern match on `*_PROMPT` / `*_SCHEMA` / `*_PROTOCOL` / `*_FORMAT`. No new test required.

### 8.3 Hallucinated-citation defense

After build, the writer verifies every `kb_doc_id` in `kb_citations` exists in the account's KB. Unverified citations are stripped from the walker's "Source:" pane (the node still renders, just without a source). Engineer review surfaces stripped citations as a warning.
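The verification step can be sketched as a simple partition of the model's citations against the set of document IDs the account actually owns. `split_citations` is a hypothetical helper name; how the set of known IDs is loaded (presumably one account-scoped query on `kb_documents`) is left out as an assumption.

```python
from typing import Iterable, List, Set, Tuple


def split_citations(citations: Iterable[dict],
                    account_doc_ids: Set[str]) -> Tuple[List[dict], List[dict]]:
    """Partition citations into (verified, stripped) by kb_doc_id membership."""
    verified: List[dict] = []
    stripped: List[dict] = []
    for citation in citations:
        if citation["kb_doc_id"] in account_doc_ids:
            verified.append(citation)   # shown in the walker's "Source:" pane
        else:
            stripped.append(citation)   # surfaced to engineer review as a warning
    return verified, stripped
```

Returning both lists, rather than silently dropping the bad ones, is what lets the review queue show the warning described above.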
|
||||
|
||||
---

## 9. KB ingestion

### 9.1 Connector interface

```python
from __future__ import annotations  # KBDocRef, KBDocContent, ChangeEvent are defined elsewhere
from abc import ABC
from collections.abc import AsyncIterator
from datetime import datetime

class KBConnector(ABC):
    async def test_credentials(self) -> bool: ...
    async def list_documents(self, since: datetime | None) -> AsyncIterator[KBDocRef]: ...
    async def fetch_content(self, ref: KBDocRef) -> KBDocContent: ...
    async def subscribe_to_changes(self) -> AsyncIterator[ChangeEvent]: ...  # optional, no-op v1
```

Registry dispatches by `provider` string. Credentials encrypted at rest via Fernet (reuse `services/psa/encryption.py` pattern).
### 9.2 Per-connector specifics

| | IT Glue | Hudu | Microsoft Graph (SharePoint/OneDrive) |
|---|---|---|---|
| Auth | API token (header) | API key (header) | OAuth 2.0 |
| Ingested types | Documents, KB Articles | Articles | docx, pdf, md, txt |
| Never ingested | Passwords, Configurations, sensitive flex assets | Passwords, sensitive items | Files in folders matching `(secret\|confidential\|private)` heuristic; files with a tenant Sensitivity Label |
| Filtering | Per-org (techs see all client orgs they have permission to) | Per-folder | Per-site / per-drive (owner picks at config time) |
| Rate limits | ~100/min token bucket | ~250/min token bucket | Built-in Graph throttling backoff |
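The "token bucket" entries in the rate-limit row refer to a client-side limiter. A minimal sketch of one follows, with the clock passed in explicitly so it can be tested deterministically. The class and its parameters are illustrative assumptions; the burst capacity in particular is not specified by this spec.

```python
class TokenBucket:
    """Client-side limiter: refills at rate_per_min, allows bursts up to capacity."""

    def __init__(self, rate_per_min: float, capacity: float, now: float = 0.0) -> None:
        self.rate_per_sec = rate_per_min / 60.0
        self.capacity = float(capacity)
        self.tokens = float(capacity)   # start full
        self.updated_at = now

    def try_acquire(self, now: float) -> bool:
        """Take one token if available; `now` is a monotonic time in seconds."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_per_sec)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A connector would call `try_acquire(time.monotonic())` before each request and sleep briefly on `False`; for IT Glue that would mean roughly `TokenBucket(rate_per_min=100, capacity=...)`, with the capacity chosen during implementation.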
|
||||
|
||||
All three deliver content to `kb_ingestion_writer` which:
1. Chunks (paragraph-aware, configurable size with overlap)
2. Embeds via `embedding_service`
3. Upserts into `kb_documents` keyed on `(connector_config_id, source_ref)`; chunks into `kb_document_chunks`

Cross-connector conflicts: same doc text appearing in two connectors yields two rows (provider-scoped `source_ref`). Engineers can dedup manually if needed.
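Step 1 of the writer, paragraph-aware chunking with overlap, could look roughly like the sketch below. It is one possible reading of that phrase, not the planned implementation: the blank-line paragraph split, the character budget, and the choice to overlap by whole paragraphs are all assumptions.

```python
from typing import List


def chunk_paragraphs(text: str, max_chars: int = 1200,
                     overlap_paragraphs: int = 1) -> List[str]:
    """Pack whole paragraphs into chunks of at most ~max_chars characters.

    Each new chunk starts with the last `overlap_paragraphs` paragraphs of the
    previous one, so context that straddles a boundary appears in both.
    A single paragraph longer than max_chars is kept whole, not split.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: List[str] = []
    current: List[str] = []
    for para in paragraphs:
        if current and len("\n\n".join(current + [para])) > max_chars:
            chunks.append("\n\n".join(current))
            current = current[-overlap_paragraphs:] if overlap_paragraphs > 0 else []
        current.append(para)
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

The position of each chunk in the returned list would map onto `chunk_index` in `kb_document_chunks`.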
### 9.3 Sync scheduling

`kb_ingestion_scheduler.py` runs as APScheduler interval job, `max_instances=1`. Per cycle:
1. Query active `kb_connector_configs` where `last_sync_at` is older than `sync_interval_minutes` (default 360 = 6h).
2. Dispatch per account; concurrency cap = 4 simultaneous accounts.
3. For each connector: `list_documents(since=last_sync_at)` → for each ref, `fetch_content` → write.
4. Compute the diff between current refs and existing rows (same `connector_config_id`); soft-delete missing ones via `deleted_at`. This diff needs the connector's full ref listing: the `since`-filtered listing from step 3 omits unchanged documents, which would otherwise look deleted.
5. Update `last_sync_at`, `last_sync_status`, `last_sync_error`.
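Steps 1 and 4 of the cycle reduce to two small pure functions, sketched here. The function names are hypothetical. Treating a connector that has never synced (`last_sync_at` is null) as due is an assumption, as is computing the soft-delete set from a full listing of provider refs rather than the `since`-filtered one.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable, Optional, Set


def is_due(last_sync_at: Optional[datetime], sync_interval_minutes: int,
           now: datetime) -> bool:
    """Step 1: a connector is due if it never synced or its interval has elapsed."""
    if last_sync_at is None:
        return True
    return now - last_sync_at >= timedelta(minutes=sync_interval_minutes)


def refs_to_soft_delete(existing_refs: Iterable[str],
                        current_refs: Iterable[str]) -> Set[str]:
    """Step 4: source_refs stored for this connector that the provider no longer lists."""
    return set(existing_refs) - set(current_refs)
```

For example, with the default 360-minute interval a connector last synced five hours ago is skipped, while one synced exactly six hours ago is picked up.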
|
||||
|
||||
Must use `_admin_session_factory()` not `get_db()` for startup-side and scheduler-side queries (per Lesson on RLS at startup — no `app.current_account_id` set).

Immediate sync via `POST /api/v1/kb-connectors/{id}/sync` enqueues a job; scheduler picks it up within ~30s.

---
## 10. Escalation flow
|
||||
|
||||
1. L1 clicks **Escalate** → modal (reason category + optional free text).
|
||||
2. `POST /api/v1/l1/sessions/{id}/escalate` → backend:
|
||||
- Calls extended `escalation_package_generator.generate(session_id, include_l1_walk=true)`. Package contents:
|
||||
```
|
||||
problem_statement, customer_name, customer_contact,
|
||||
ticket_ref (PSA id or internal id),
|
||||
target_kind ('flow' | 'proposal'), target_id,
|
||||
walked_path,
|
||||
ai_draft_proposal_id,
|
||||
kb_citations,
|
||||
escalation_reason, reason_category, l1_user_id
|
||||
```
|
||||
- Creates an `ai_session` with the package serialized into system context for the chat surface.
|
||||
- If PSA-backed: `psa_provider.reassign_ticket(ticket_id, to=account.engineer_queue_name)`. Default `'Tier 2'`. Owner configurable in `/account/integrations`.
|
||||
- If internal-backed: `internal_tickets.status='escalated'`, `assigned_user_id=null` (round-robin assignment is out of scope).
|
||||
- Writes notification via existing `notification_service` — bell badge to all engineers in account.
|
||||
- Audit log entry; `acting_as` reflects whether L1 or coverage-engineer escalated.
|
||||
3. Toast on L1 side, return to `/l1`.
|
||||
4. Engineer clicks notification → `/pilot/{sessionId}` → chat surface renders the package as a sticky "Escalation context" card; engineer continues in chat.
|
||||
|
||||
**Un-escalate is out of scope.** If engineer wants to bounce back, they reassign in PSA manually.
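The package fields listed in step 2 could be modeled roughly as follows. This is a sketch under stated assumptions: the field names come from the list above, but the field types, the `EscalationPackage` class, and the `to_system_context` helper are hypothetical, and JSON is only one plausible way to serialize the package into the chat session's system context.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class EscalationPackage:
    # Field names follow the spec; all types here are assumptions.
    problem_statement: str
    customer_name: str
    customer_contact: str
    ticket_ref: str  # PSA id or internal id
    target_kind: str  # 'flow' or 'proposal'
    target_id: str
    escalation_reason: str
    reason_category: str
    l1_user_id: str
    walked_path: List[dict] = field(default_factory=list)
    ai_draft_proposal_id: Optional[str] = None
    kb_citations: List[dict] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject anything other than the two target kinds the spec names.
        if self.target_kind not in ("flow", "proposal"):
            raise ValueError(f"unknown target_kind: {self.target_kind!r}")


def to_system_context(pkg: EscalationPackage) -> str:
    """Serialize the package for injection into the ai_session (hypothetical format)."""
    return "Escalation context:\n" + json.dumps(asdict(pkg), indent=2, sort_keys=True)


if __name__ == "__main__":
    pkg = EscalationPackage(
        problem_statement="Outlook will not open",
        customer_name="Example Customer",
        customer_contact="customer@example.com",
        ticket_ref="INT-1",
        target_kind="proposal",
        target_id="p-1",
        escalation_reason="Steps exhausted",
        reason_category="no_resolution",
        l1_user_id="u-1",
    )
    print(to_system_context(pkg))
```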

---

## 11. Internal ticket fallback

When the account has no active PSA provider:

- Intake creates an `internal_tickets` row instead of a PSA ticket.
- The queue surface merges PSA and internal tickets, each with an `Internal` or `PSA` origin badge.
- Escalation flips `internal_tickets.status='escalated'`. In v1 the assignee stays null so that any engineer can claim the ticket.
- After escalation, the engineer sees the internal ticket as a session, with no PSA round trip.

**Promote to PSA:** an owner-only action available on any internal ticket. It pushes the ticket into the configured PSA provider and sets `psa_promoted_ticket_id`. Promotion is manual; it does not run automatically when a PSA is installed. This lets MSPs adopt a PSA mid-flight without orphaning prior internal tickets.

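The merged queue with origin badges might look like the sketch below. Everything here is illustrative: the `QueueRow` shape, the input dictionary keys, and the oldest-first ordering are assumptions, since the spec only states that the two sources are merged and badged.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List


@dataclass(frozen=True)
class QueueRow:
    # Hypothetical row shape for the L1 queue surface.
    ticket_ref: str
    summary: str
    created_at: datetime
    origin: str  # "PSA" or "Internal", rendered as the origin badge


def merged_queue(psa_tickets: Iterable[dict], internal_tickets: Iterable[dict]) -> List[QueueRow]:
    """Combine both ticket sources into one badged queue."""
    rows = [
        QueueRow(t["id"], t["summary"], t["created_at"], "PSA") for t in psa_tickets
    ]
    rows += [
        QueueRow(t["id"], t["summary"], t["created_at"], "Internal")
        for t in internal_tickets
    ]
    # Assumption: oldest tickets first; the spec does not define queue order.
    return sorted(rows, key=lambda row: row.created_at)


if __name__ == "__main__":
    psa = [{"id": "PSA-7", "summary": "VPN down", "created_at": datetime(2026, 5, 2)}]
    internal = [{"id": "INT-3", "summary": "Printer offline", "created_at": datetime(2026, 5, 1)}]
    for row in merged_queue(psa, internal):
        print(row.origin, row.ticket_ref, row.summary)
```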

---

## 12. Outcome-validation lifecycle

```
1. L1 intake → match_or_build → FlowProposal(source='ai_realtime_l1',
                                              validated_by_outcome=false,
                                              linked_ticket_id=...)
2. L1 walks → POST /l1/sessions/{id}/step appends to walked_path_snapshot
3. L1 hits Resolve:
     modal: "Did this resolve it?" [Yes] [No] + resolution_notes
4. helpful=true  → flow_proposal.validated_by_outcome = true
                 → walked_path_snapshot frozen
                 → ticket closed (PSA or internal)
   helpful=false → validated_by_outcome stays false
                 → L1 prompted: "Escalate instead?"
5. Engineer review queue:
     ORDER BY validated_by_outcome DESC, created_at DESC
     - Outcome-validated drafts surface first
     - Promote / edit-and-promote / retire
6. Promote → new flow with source='ai_promoted'; original proposal kept with status='promoted'
           → future match_or_build matches the new flow on the flow-match pass
```
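Steps 4 and 5 of the lifecycle can be illustrated with a small in-memory sketch. The `FlowProposal` dataclass below is a simplified stand-in, and the `snapshot_frozen` flag, the `'draft'` status value, and the returned action strings are hypothetical; only the `validated_by_outcome` flip and the review-queue ordering come from the diagram.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class FlowProposal:
    # Simplified stand-in for the real model; extra fields are assumptions.
    id: int
    created_at: datetime
    source: str = "ai_realtime_l1"
    validated_by_outcome: bool = False
    status: str = "draft"
    walked_path_snapshot: List[str] = field(default_factory=list)
    snapshot_frozen: bool = False


def resolve(proposal: FlowProposal, helpful: bool) -> str:
    """Apply the Resolve modal answer and return the next action."""
    if helpful:
        proposal.validated_by_outcome = True
        proposal.snapshot_frozen = True  # walked_path_snapshot frozen
        return "close_ticket"
    # validated_by_outcome stays false; the L1 is asked whether to escalate.
    return "prompt_escalate"


def review_queue(proposals: List[FlowProposal]) -> List[FlowProposal]:
    """ORDER BY validated_by_outcome DESC, created_at DESC."""
    return sorted(
        proposals,
        key=lambda p: (p.validated_by_outcome, p.created_at),
        reverse=True,
    )


if __name__ == "__main__":
    older = FlowProposal(1, datetime(2026, 5, 1))
    newer = FlowProposal(2, datetime(2026, 5, 2))
    resolve(older, helpful=True)
    print([p.id for p in review_queue([newer, older])])
```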

---

## 13. Out of scope (v1 non-goals)

- End-user / self-service portal ("L0" tier).
- Engineer warm-transfer / live take-over during a call.
- L1 ↔ engineer real-time chat during a call.
- Multi-language UI / customer-language toggle in walker.
- Auto-promote internal tickets to PSA on integration install.
- AI tree streaming (node-by-node).
- KB write-back to IT Glue/Hudu/SharePoint (read-only ingestion).
- Confluence connector.
- Per-step KB citation editing in engineer review (engineers edit the tree, not citations).
- Final Stripe pricing SKU (data model supports differential pricing; price set in Stripe dashboard).
- "Switch to L1 mode" persistent toggle for engineers (coverage flag + banner only).
- Cancel/un-escalate flow.
- Round-robin engineer assignment on internal-ticket escalations.

---

## 14. Testing strategy

### 14.1 Backend (pytest)

- Unit: `match_or_build` covers all four paths (flow-match, proposal-match, built, aborted_no_kb).
- Unit: `ai_tree_builder` schema validation: assert that malformed Anthropic output is rejected before persistence.
- Unit: each connector's `list_documents` + `fetch_content` against recorded HTTP fixtures.
- Integration: intake → walk → resolve(helpful=true) → assert `FlowProposal.validated_by_outcome=true`, ticket closed.
- Integration: intake → walk → escalate → assert PSA `reassign_ticket` invoked, `ai_session` created with package, audit log entry, notification dispatched.
- Integration: KB scheduler: `max_instances=1`, sequential per-account, soft-delete on removal.
- **RLS regression** (highest priority): `l1_tech` user in account A cannot read account B's tickets, drafts, KB docs, or connector configs. Added to existing RLS test suite.
- Anti-parrot: the existing CI test auto-discovers the new prompt module.

### 14.2 Frontend

- Unit: `usePermissions`: L1 sees L1 paths and is blocked from engineer paths. The coverage flag opens L1 paths.
- Unit: `L1WalkPage`: node advance, escalate modal, resolve modal flips `validated_by_outcome` correctly.
- Unit: `L1CoverageBanner`: visible for engineer-with-flag on `/l1/*`, hidden for L1 users.
- E2E (Playwright, scoped selectors per Lesson):
  - L1 sign-in → dashboard → intake → walker → resolve → verify ticket closed + proposal flagged.
  - Engineer with `can_cover_l1` → sidebar entry visible → click → coverage banner shows → walks a session → audit log records `acting_as='l1_coverage'`.
  - L1 hitting `/pilot`, `/trees/new`, `/escalations` → 403 or redirect.

---

## 15. Acceptance criteria (v1 ships when…)

- L1 role assignable; assigned L1 sees L1 sidebar only; no engineer route reachable.
- L1 intake creates a ticket (PSA or internal) and lands in walker session.
- Walker handles both flows and proposals; AI-built badge + sources shown for proposals.
- Escalate generates package, reassigns ticket, notifies engineers.
- Resolve flips `validated_by_outcome`; review queue prioritizes outcome-validated drafts.
- All three KB connectors configurable; initial sync + periodic re-sync + soft-delete on removal.
- AI build refuses with informative error when account KB is empty.
- Coverage flag works end-to-end with audit-log tagging.
- RLS blocks cross-tenant reads on every new table.
- L1 seat count tracked separately from engineer seats in admin/billing UI.

---

## 16. Risks & mitigations

| Risk | Mitigation |
|---|---|
| AI builds an unsafe tree | Schema validation rejects malformed output. Engineer review is the gate before draft becomes "real" flow. v1 refuses to build when KB is empty. |
| Hallucinated KB citations | Post-build verification that each `kb_doc_id` exists; unverified citations stripped from walker, surfaced as warning in engineer review. |
| Duplicate proposals for same problem | Validated-proposal match pass deduplicates after one L1 validates; pre-validation dups are tolerated and dedup'd during engineer review. |
| KB ingestion captures sensitive content | Per-connector deny-lists (passwords, sensitive flex assets, MS Graph Sensitivity Labels). Owners exclude specific folders/sites at config. All ingested docs visible in `/account/kb` for manual deletion. |
| AI build latency frustrates customer on call | Build-progress UI sets expectation. Escalate button visible from page load. Future: pre-warm builds on PSA-ticket-landed event. |
| Three connectors is more scope than originally proposed | Acknowledged. Each connector is ~1–2 weeks of work. Plan should sequence them and allow shipping with IT Glue + Hudu first if SharePoint slips. |
| Engineer review queue backlog stalls library growth | Validated-proposal match pass means good drafts get reused without engineer review. Backlog only delays the move from `'proposal'` to `'flow'`, not the L1's ability to use validated content. |

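The post-build citation check from the "Hallucinated KB citations" row could be sketched as below. The node shape (a dict with an `id` and a `kb_citations` list of `{"kb_doc_id": ...}` entries) and the `verify_citations` name are assumptions made for illustration; the spec only states that unverified citations are stripped from the walker and surfaced as warnings in engineer review.

```python
from typing import Iterable, List, Tuple


def verify_citations(
    nodes: List[dict], existing_doc_ids: Iterable[str]
) -> Tuple[List[dict], List[str]]:
    """Strip citations whose kb_doc_id is unknown; collect review warnings.

    Returns new node dicts and leaves the input nodes unmodified.
    """
    known = set(existing_doc_ids)
    cleaned: List[dict] = []
    warnings: List[str] = []
    for node in nodes:
        citations = node.get("kb_citations", [])
        kept = [c for c in citations if c.get("kb_doc_id") in known]
        for citation in citations:
            doc_id = citation.get("kb_doc_id")
            if doc_id not in known:
                # Surfaced to engineers in review, never shown in the walker.
                warnings.append(
                    f"node {node.get('id')}: unverified citation {doc_id!r}"
                )
        cleaned.append({**node, "kb_citations": kept})
    return cleaned, warnings


if __name__ == "__main__":
    tree = [
        {"id": "n1", "kb_citations": [{"kb_doc_id": "d1"}, {"kb_doc_id": "ghost"}]},
        {"id": "n2"},
    ]
    cleaned_nodes, review_warnings = verify_citations(tree, ["d1", "d2"])
    print(cleaned_nodes)
    print(review_warnings)
```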

---

## 17. Naming reference

| Layer | Value |
|---|---|
| DB enum (`account_role`) | `l1_tech` |
| UI display | "L1 Tech" / "L1" |
| Sidebar entry | "L1 Workspace" |
| URL prefix | `/l1` |
| Coverage flag column | `users.can_cover_l1` |
| Coverage audit tag | `acting_as = 'l1_coverage'` |
| Pricing label | "L1 seat" |
| Stripe SKU | Set in Stripe dashboard at launch; the data model already supports differential pricing |

---

## 18. Open implementation decisions (deferred to plan, not blocking design)

- Validation of the `MATCH_THRESHOLD` default value (initially 0.75; tune from telemetry post-launch).
- Specific Anthropic model choice for `l1_realtime_build` (Sonnet vs Opus; pick based on a quality benchmark during planning).
- Chunk size + overlap for KB ingestion writer (tune in implementation).
- Engineer queue label default (`'Tier 2'` vs `'Engineering'`), which is owner-configurable either way.
- Exact look of the build-progress shimmer animation (design-system handoff).

These are tuning/UX-polish details, not architectural forks. They land during the writing-plans phase, not here.

### Note on scope and phasing

This is a substantive feature: new role, four frontend pages, ~12 endpoints, AI tree-builder, three KB connectors, escalation extensions, and six migrations. The implementation plan will almost certainly phase the work; a reasonable cut is:

- **Phase 1:** role + L1 surface against existing authored flows (no AI build, no connectors yet). Validates the seat model, walker UX, escalation, internal ticket fallback, and coverage flag end-to-end.
- **Phase 2:** `kb_documents` schema + AI tree-builder + match-or-build pipeline. Enables real-time AI flows grounded on manually-uploaded KB.
- **Phase 3:** the three KB connectors (IT Glue, Hudu, SharePoint/OneDrive). Each is roughly self-contained, so they can ship one at a time and be reordered if a connector blocks.

Phasing is a plan-level decision; the spec captures the full feature.

---

*End of spec.*