Compare commits: 87bd0b7c56...fix/e2e-te (1 commit)

| Author | SHA1 | Date | |
|---|---|---|---|
| | 37c4e0c99e | | |
@@ -1,42 +1,33 @@
|
|||||||
# CURRENT_TASK.md
|
# CURRENT_TASK.md
|
||||||
|
|
||||||
**Task:** Build **Escalation Mode** — the wedge for ResolutionFlow's GTM (first paying-customer push). When a junior tech escalates a FlowPilot session, the senior tech sees structured handoff context in seconds instead of running a 5-minute verbal "tell me what you tried" call.
|
**Task:** none — replace this file when starting the next real task.
|
||||||
|
|
||||||
**Status:** in-flight on `feat/escalation-mode` (currently `feat/escalation-metric-endpoint`). Backend metric + role gate + email notification shipped. Frontend stat-card mounted. **Next:** WebSocket/SSE push (live-arrival half of the dual-path) and the magic-moment handoff-context screen.
|
**Status:** not-started
|
||||||
|
|
||||||
**Plan:** [`docs/plans/2026-04-27-escalation-mode-wedge-design.md`](../docs/plans/2026-04-27-escalation-mode-wedge-design.md). Reviewed by `/office-hours`, `/plan-eng-review`, `/plan-design-review`, `/codex review`. Eng + Design CLEARED. Codex's two-metric correction + claim-role-gate + per-channel notification model all applied to the plan and the code.
|
**Definition of Done:** n/a
|
||||||
|
|
||||||
**Test plan artifact:** [`docs/plans/2026-04-27-escalation-mode-wedge-test-plan.md`](../docs/plans/2026-04-27-escalation-mode-wedge-test-plan.md) — primary input for `/qa` once the build is feature-complete.
|
**Assumptions:** n/a
|
||||||
|
|
||||||
## Done so far on `feat/escalation-metric-endpoint`
|
**Out of scope:** n/a
|
||||||
|
|
||||||
| Commit | What it ships |
|
---
|
||||||
|---|---|
|
|
||||||
| `d51e95c` | Plan + test-plan artifacts checked in |
|
|
||||||
| `52f6d03` | `GET /analytics/flowpilot/escalations` — in-product time-to-first-action; account-scoped, engineer-or-admin gated; 9 tests including multi-tenant isolation |
|
|
||||||
| `7a5b853` | Role-gate POST `/handoffs/{id}/claim` to engineer-or-admin (was viewer-claimable); 2 tests |
|
|
||||||
| `07d0db9` | `HandoffManager.dispatch_escalation_notifications` — emails engineer/admin teammates on intent=escalate; graceful-degradation regression test; 4 tests |
|
|
||||||
| `9f0bfd4` | `EscalationMetricCard` mounted above the queue list; consumes the new endpoint; matches DESIGN-SYSTEM tokens |
|
|
||||||
|
|
||||||
20 backend tests green across handoff_manager + session_handoffs_api + flowpilot_analytics_escalations. Frontend `tsc -b` clean. Nothing pushed yet.
|
<!-- When you start a real task, replace the block above with:
|
||||||
|
|
||||||
## Remaining work on this branch
|
**Task:** One-sentence goal.
|
||||||
|
|
||||||
1. **WebSocket/SSE push** for live escalation arrival in the queue — the second half of the notification dual-path. Senior already on the queue page sees a new card slide in within ~1s of the junior hitting Escalate. ~3-4 days of work split across multiple commits (connection manager, auth-scoped fan-out, frontend EventSource handling, reconnect, slide-in animation, tab-title flash).
|
**Status:** not-started | in-progress | blocked | ready-for-review | complete
|
||||||
2. **Magic-moment handoff-context screen** — 4-section view (problem summary / what's been tried / AI assessment / Start here CTA) that loads on Pick Up before dissolving into the regular FlowPilot session view. ~1.5-2 days.
|
|
||||||
3. **Owner-facing analytics page** at `/analytics/escalations` — period selector, conversion-rate, trend chart. ~0.5d.
|
|
||||||
4. **Playwright e2e** for the magic-moment demo flow (junior escalates → senior receives → senior claims → opens session). Critical so the GTM Loom demo does not crash mid-recording.
|
|
||||||
|
|
||||||
## Two-metric framing — read this before quoting numbers to anyone
|
**Definition of Done:**
|
||||||
|
- [ ] Testable criterion 1
|
||||||
|
- [ ] Testable criterion 2
|
||||||
|
- [ ] Tests added or updated
|
||||||
|
- [ ] `npm run build` passes (frontend) / `pytest` passes (backend)
|
||||||
|
|
||||||
The in-product endpoint measures *post-claim time-to-first-action*. The "minutes recovered" sales claim is `manual_baseline − in_product_metric`. Manual baseline comes from the founder's stopwatch on the next 5 escalations (The Assignment in the design doc). Don't roll the in-product number alone into "minutes recovered" — that's the apples-to-oranges miscount Codex caught.
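The two-metric arithmetic can be sketched as follows. This is only an illustration: the sample values, the choice of a median for the stopwatch baseline, and the name `minutes_recovered` are assumptions, not taken from the design doc or the endpoint.

```python
from statistics import median


def minutes_recovered(manual_baseline_min: float, in_product_min: float) -> float:
    """Savings claim = manual baseline minus in-product time-to-first-action."""
    return manual_baseline_min - in_product_min


# Hypothetical stopwatch readings (minutes) from five manual escalations.
manual_samples = [5.0, 6.5, 4.0, 5.5, 6.0]
# Hypothetical in-product post-claim time-to-first-action (minutes).
in_product = 1.2

# Assumption: summarize the stopwatch samples with a median.
baseline = median(manual_samples)
savings = minutes_recovered(baseline, in_product)
print(round(savings, 2))
```

The in-product number on its own is never reported as savings here; it only enters as the subtrahend against the measured manual baseline.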
|
**Assumptions:**
|
||||||
|
- What we're treating as given
|
||||||
|
|
||||||
## Kill-switch
|
**Out of scope:**
|
||||||
|
- What this task explicitly does NOT cover
|
||||||
|
|
||||||
Week 8: if 0 of 3 pilots produce a verifiable hours-saved-per-week number above 1.0, revisit the wedge. The design doc names the alternative (deterministic-ops territory) for context, but don't pivot before the data lands.
|
-->
|
||||||
|
|
||||||
## Previous task — closed out
|
|
||||||
|
|
||||||
**Task:** Land PR #153 — fix the `AssistantChatPage` prefill `currentChatRef` bug. **Status:** complete (2026-04-26). Merged as `68fcdc6` on `main`. E2e regression test now in the suite.
|
|
||||||
|
|
||||||
**Background CI item, not blocking:** promoting `CI / e2e (pull_request)` to required on `main`. Two consecutive green PR runs (#150 and #153) cleared the threshold. Ops-only.
|
|
||||||
|
|||||||
@@ -2,49 +2,34 @@
|
|||||||
|
|
||||||
# HANDOFF.md
|
# HANDOFF.md
|
||||||
|
|
||||||
**Last updated:** 2026-04-27 EDT
|
**Last updated:** 2026-04-24 (America/New_York)
|
||||||
|
|
||||||
**Active task:** **Escalation Mode** wedge build. See [`CURRENT_TASK.md`](CURRENT_TASK.md) for the full status; this file holds the resume point only.
|
**Active task:** None — see [CURRENT_TASK.md](CURRENT_TASK.md). Replace it when picking up the next real task.
|
||||||
|
|
||||||
**Branch:** `feat/escalation-metric-endpoint` — five commits stacked on top of `main` (`c0ed6d9`). Nothing pushed yet.
|
**Branch:** `feat/flowpilot-migration` — a long-running FlowPilot Phase 9 feature branch. The recent AI-handoff migration commits ride on this branch (not on their own branch); they'll merge to `main` whenever Phase 9 does.
|
||||||
|
|
||||||
```
|
**Branch state:** 3 commits ahead of `origin/feat/flowpilot-migration`:
|
||||||
9f0bfd4 feat(escalations): mount time-to-first-action stat-card on /escalations
|
|
||||||
07d0db9 feat(handoff): email engineer-or-admin teammates on escalation
|
|
||||||
7a5b853 feat(api): role-gate handoff claim to engineer-or-admin
|
|
||||||
52f6d03 feat(analytics): add escalation time-to-first-action metric endpoint
|
|
||||||
d51e95c docs(plans): add escalation-mode wedge design + test plan
|
|
||||||
```
|
|
||||||
|
|
||||||
## Resume point
|
- `b3be1e0 chore: ignore .remember/ skill runtime state`
|
||||||
|
- `b3506b5 docs(pilot): phase 9 review issues`
|
||||||
|
- `b14a16a chore(tests): gate RLS tests behind RUN_RLS_TESTS flag`
|
||||||
|
|
||||||
Pick up the **WebSocket/SSE push** — the live-arrival half of the notification dual-path. Email is already wired (commit `07d0db9`); push is the second channel that makes the demo's "30-second magic moment" undeniable when the receiving senior is online and on the queue page.
|
Earlier in this session (already pushed to origin):
|
||||||
|
|
||||||
Suggested first slice: a thin server-side SSE endpoint scoped to `current_user.account_id`, fan out from `HandoffManager.dispatch_escalation_notifications` (alongside email), and hook the frontend `EscalationQueue` to subscribe and prepend new cards with the locked 200ms slide-in. Reconnect logic, tab-title flash, and `prefers-reduced-motion` respect are part of this slice per the locked UI spec in the design doc.
|
- `9c8ba29 fix(ai): correct stale role-hierarchy and file-listing claims`
|
||||||
|
- `bee8690 chore(ai): migrate to dual-agent handoff system`
|
||||||
|
- `e110fed chore: snapshot CLAUDE.md before ai-handoff migration` (tag: `pre-ai-handoff`)
|
||||||
|
|
||||||
After the dual-path is feature-complete, the **magic-moment handoff-context screen** is next (4 sections, dissolves into the FlowPilot session view on first action).
|
**Where I left off:**
|
||||||
|
- File: n/a — nothing mid-edit.
|
||||||
|
- Next intended action: push the 3 unpushed commits when ready (`git push`), then start the next real task (replace `CURRENT_TASK.md`, update this file).
|
||||||
|
|
||||||
## Where things stand
|
**Uncommitted state:**
|
||||||
|
- Working tree is clean.
|
||||||
|
|
||||||
- CI on `main` still healthy. Branch protection: `CI / frontend (pull_request)` required, `CI / backend (pull_request)` required, `CI / e2e (pull_request)` not yet required (ops-only follow-up — two consecutive green runs cleared the threshold).
|
**Immediate next steps:**
|
||||||
- 20 backend tests green on this branch (handoff_manager, session_handoffs_api, flowpilot_analytics_escalations). Frontend `tsc -b` clean. Branch has not been pushed; no CI runs yet.
|
1. `git push` to publish the 3 local commits (cleanup batch).
|
||||||
- The plan doc at [`docs/plans/2026-04-27-escalation-mode-wedge-design.md`](../docs/plans/2026-04-27-escalation-mode-wedge-design.md) is the source of truth for every UI / metric / scope decision. The embedded **GSTACK REVIEW REPORT** at the bottom shows Eng + Design CLEARED and Codex INFO with the disposition of all 12 of its findings.
|
2. When starting the next real feature task: replace `CURRENT_TASK.md` with actual goal/DoD, rewrite this file's resume section.
|
||||||
|
|
||||||
## Useful breadcrumbs
|
**Open questions / blockers:**
|
||||||
|
- None. The dual-agent handoff system is live and has survived one Codex review round (see DECISIONS.md 2026-04-24 entry; corrections in `9c8ba29`).
|
||||||
- New endpoint: [`backend/app/api/endpoints/flowpilot_analytics.py`](../backend/app/api/endpoints/flowpilot_analytics.py) — `get_escalation_metrics` at the bottom of the file.
|
|
||||||
- Notification dispatch: [`backend/app/services/handoff_manager.py`](../backend/app/services/handoff_manager.py) — `dispatch_escalation_notifications`. Wired in [`backend/app/api/endpoints/session_handoffs.py`](../backend/app/api/endpoints/session_handoffs.py) **after** `db.commit()` so a rolled-back handoff never emails.
|
|
||||||
- Frontend stat-card: [`frontend/src/components/flowpilot/EscalationMetricCard.tsx`](../frontend/src/components/flowpilot/EscalationMetricCard.tsx). Renders `n_with_action / n_claimed`, avg + median, and the metric_definition disclaimer.
|
|
||||||
- Two-metric framing — required reading before quoting any number to a pilot. The in-product endpoint measures *post-claim time-to-first-action*; the savings claim is `manual_baseline − in_product`. Manual baseline comes from the founder's stopwatch on the next 5 escalations (The Assignment in the design doc).
|
|
||||||
- The `notification_sent` boolean is intentionally NOT being written. Per Codex's correction it should be replaced by per-channel delivery records; v1.x story. For now, application logs are the audit trail.
|
|
||||||
- Two TODOs added during this session: peer-tech escalation (deferred to v2) and the claim role gate (since moved in scope for v1). See [`TODO.md`](TODO.md).
|
|
||||||
|
|
||||||
## Watch-outs
|
|
||||||
|
|
||||||
- `ai_session_step` has NO `user_id` column — the metric query keys "first action by senior" off `session_id + created_at > claimed_at`, which is fine because session activity post-claim IS the senior's activity (the session is reactivated under `escalated_to_id`). If a future change adds `user_id` to `ai_session_step`, the metric query can become more precise.
|
|
||||||
- `account_id` is denormalized on `ai_session_step` (Phase 4 RLS pattern). The metric query and any new SSE subscription scoping must use it directly, not join through `ai_sessions`.
|
|
||||||
- POST `/handoff` still requires the session owner to be the escalator (`AISession.user_id == current_user.id`). Peer-tech escalation is captured as a v2 TODO. Don't widen this without a UX decision.
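
The first two watch-outs can be illustrated with a small in-memory sketch. The `Step` dataclass and `time_to_first_action` function below are hypothetical stand-ins, not the real model or query; they only show the idea of keying off `session_id` plus `created_at > claimed_at` while scoping by the denormalized `account_id`.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional


@dataclass(frozen=True)
class Step:
    """Hypothetical stand-in for an ai_session_step row (note: no user_id)."""
    session_id: str
    account_id: str
    created_at: datetime


def time_to_first_action(
    steps: Iterable[Step],
    session_id: str,
    account_id: str,
    claimed_at: datetime,
) -> Optional[timedelta]:
    """Delay between the claim and the first later step in the same session.

    Filters on the step's own account_id (no join through sessions) and treats
    any step created after claimed_at as the senior's activity.
    """
    later = [
        s.created_at
        for s in steps
        if s.session_id == session_id
        and s.account_id == account_id
        and s.created_at > claimed_at
    ]
    return min(later) - claimed_at if later else None


claimed = datetime(2026, 4, 27, 10, 0, 0)
steps = [
    Step("s1", "acct-a", claimed - timedelta(minutes=3)),   # pre-claim, ignored
    Step("s1", "acct-a", claimed + timedelta(seconds=45)),  # first post-claim step
    Step("s1", "acct-a", claimed + timedelta(minutes=4)),
    Step("s1", "acct-b", claimed + timedelta(seconds=5)),   # other tenant, ignored
]
delay = time_to_first_action(steps, "s1", "acct-a", claimed)
```

A session with no post-claim steps yields `None`, which is one plausible way to separate "claimed" from "claimed with action" counts.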
|
|
||||||
|
|
||||||
## Kill-switch (week 8)
|
|
||||||
|
|
||||||
If 0 of 3 pilots produce a verifiable hours-saved-per-week number above 1.0, revisit the wedge. The design doc names the alternative direction (deterministic-ops territory) but data lands first.
|
|
||||||
|
|||||||
@@ -12,63 +12,6 @@
|
|||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 2026-04-26 03:50 EDT — Claude Code — Ship AssistantChatPage prefill `currentChatRef` fix; close out PR #150
|
|
||||||
|
|
||||||
- User reported a troubleshooting-session bug: after answering a subset of task-lane questions and clicking *Send N of M Responses*, no AI response appeared. Traced to `AssistantChatPage`: the dashboard prefill effect set `activeChatId` after creating a new chat session but never updated `currentChatRef.current`. The `currentChatRef.current !== sentForChatId` guard in `handleSend` and `handleTaskSubmit` then bailed silently on every later request and discarded the AI's reply. The user message was already pushed to the chat before the await, so the user saw their answers but nothing else.
|
|
||||||
- Fix: one-line addition mirroring `handleNewChat` and `handleResumeNew` — assign `currentChatRef.current = session.session_id` immediately after `setActiveChatId(session.session_id)` in the prefill effect. Branched off `origin/main` as `fix/tasklane-prefill-ref`; PR #153 opened on Gitea.
|
|
||||||
- Authored a Playwright regression test `frontend/e2e/assistant-chat-prefill.spec.ts` that drives the real dashboard prefill flow against the real backend, stubs `/ai-sessions/*/chat` with `page.route` for deterministic turn-1/turn-2 responses, and asserts the second AI message renders. Confirmed the test fails on unfixed code at the exact assertion (`Got it — based on your answer…` never appears) and passes once the fix is restored.
|
|
||||||
- Verified locally inside `mcr.microsoft.com/playwright:v1.58.2-noble` against the running dev stack: new spec passes, adjacent `flowpilot-chat` spec still passes, `tsc -b` clean. `resume.spec` and `history.spec` failures observed are pre-existing real-backend fixture collisions, unrelated to this change.
|
|
||||||
- First CI run on PR #153 failed on infrastructure issues already addressed by PR #150: backend hit `Bind for 0.0.0.0:5432 failed: port is already allocated`, frontend hit `actions/upload-artifact@v4 not supported on GHES`. PR #150 was already merged (commit `87bb20b` on `main`). Rebased `fix/tasklane-prefill-ref` onto new `main` (force-push `1a8cb06` → `1559feb`), resolved a `.ai/TODO.md` conflict by keeping both backlog item sets, kicked off CI on the rebased SHA.
|
|
||||||
- Confirmed `CI / backend (pull_request)` is now in branch protection's required-status-checks list (added during PR #150 close-out). `CI / e2e (pull_request)` left as not-required pending one more clean PR run as the threshold.
|
|
||||||
- Recorded the broader silent-return concern in TODO backlog: the `currentChatRef.current !== sentForChatId` guard is applied across `handleSend`, `handleTaskSubmit`, `selectChat`, `refreshFacts`, `refreshActiveFix`, and `refreshPreview`. PR #153 fixes one symptom but the same pattern can mask other drift. Either log a Sentry breadcrumb on the mismatch path or distinguish "expected stale" (chat switch) from "unexpected stale" (ref never updated) so the latter alerts.
|
|
||||||
- First CI run on the rebased SHA passed backend and frontend but failed e2e: the new prefill regression test couldn't render the task-lane question text. Diagnosed via the job log: `POST /api/v1/ai-sessions` calls `_require_ai_enabled()` and returns 503 when no provider key is set. The e2e CI job had neither `ANTHROPIC_API_KEY` nor `GOOGLE_AI_API_KEY` in env. Locally the dev backend has a real key, hence the local pass. The Playwright `page.route` stub on `/chat` was correct but never had a chance to fire because the upstream session-creation call was 503-ing.
|
|
||||||
- Fix: added a stub `ANTHROPIC_API_KEY: ci-stub-key-not-used-by-tests` to the e2e job env in `.gitea/workflows/ci.yml`. The Playwright stub still intercepts the actual `/chat` call in the browser, so the backend never contacts Anthropic — the gate just needs to clear. Documented the convention in a workflow comment so future AI-touching e2e tests know what to expect. Pushed `11fe32f`; CI went all-green.
|
|
||||||
- Merged PR #153 as `68fcdc6` on `main`. Local feature branch and remote both deleted via Gitea's `delete_branch_after_merge`.
|
|
||||||
- Opened a small follow-up `chore/post-153-handoff` PR to refresh the now-stale `.ai/` files (this entry, plus `CURRENT_TASK.md` rolling forward to "no active task — pick from `TODO.md`" and `HANDOFF.md` updating to the post-merge home position). The `data-testid` audit at the top of `TODO.md` "Up next" or the `currentChatRef` silent-return audit added in this session's backlog are the natural next pickups.
|
|
||||||
- Files touched: `frontend/src/pages/AssistantChatPage.tsx` (the one-line fix + comment), `frontend/e2e/assistant-chat-prefill.spec.ts` (new regression test), `.gitea/workflows/ci.yml` (stub `ANTHROPIC_API_KEY` for e2e), `.ai/TODO.md` (silent-return follow-up entry, plus conflict resolution preserving PR #150's backlog additions), `.ai/CURRENT_TASK.md`, `.ai/HANDOFF.md`, `.ai/SESSION_LOG.md` (this entry).
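
The e2e failure and the stub-key fix described in this entry can be pictured with a rough sketch. It is not the real `_require_ai_enabled()` implementation: the helper names and the 201 success status are assumptions, and only the two environment variable names and the 503 come from the entry above.

```python
from typing import Mapping

# Provider keys named in the entry above.
PROVIDER_KEYS = ("ANTHROPIC_API_KEY", "GOOGLE_AI_API_KEY")


def ai_enabled(env: Mapping[str, str]) -> bool:
    """True when at least one provider key is present and non-empty."""
    return any(env.get(key) for key in PROVIDER_KEYS)


def session_create_status(env: Mapping[str, str]) -> int:
    """Hypothetical gate: 503 when AI is disabled, otherwise a success code."""
    return 201 if ai_enabled(env) else 503  # 201 is an assumed success status


ci_env_before = {"SECRET_KEY": "ci-test-secret-key-not-for-production"}
ci_env_after = {**ci_env_before, "ANTHROPIC_API_KEY": "ci-stub-key-not-used-by-tests"}

print(session_create_status(ci_env_before))
print(session_create_status(ci_env_after))
```

The stub value only has to be non-empty for the gate to clear; in the scenario above the browser-side `page.route` stub answers the `/chat` call, so the key is never sent to a provider.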
|
|
||||||
|
|
||||||
## 2026-04-25 16:41 EDT — Codex — Stabilize PR #150 e2e selectors
|
|
||||||
|
|
||||||
- Investigated the remaining PR #150 failure after backend and frontend CI were green. The e2e resume smoke test was not failing because of product behavior; it used `.bg-card` plus text filtering and matched the tree filter `<select>` before the intended session card.
|
|
||||||
- Added stable test IDs to flow session, tree, and share cards, then updated affected e2e tests to target those cards instead of Tailwind class names.
|
|
||||||
- Hardened the CI workflow by making Postgres healthchecks authenticate as `postgres` and baking `VITE_API_URL="${PLAYWRIGHT_API_ORIGIN}"` into the e2e frontend build.
|
|
||||||
- Verified with `git diff --check`, frontend build in Docker, no remaining `.bg-card` e2e selectors, and focused Playwright runs in an Actions-like Ubuntu container: resume spec passed, then history/library/library-start/resume/shares passed (`6 passed`).
|
|
||||||
- Left for next session: push this WIP commit to PR #150, watch CI, merge when all three jobs are green, then enable backend branch protection and consider the e2e gate after a reliable green run.
|
|
||||||
- Files touched: `.gitea/workflows/ci.yml`, `frontend/e2e/history.spec.ts`, `frontend/e2e/library-start.spec.ts`, `frontend/e2e/library.spec.ts`, `frontend/e2e/resume.spec.ts`, `frontend/e2e/shares.spec.ts`, `frontend/src/components/library/TreeGridView.tsx`, `frontend/src/components/library/TreeListView.tsx`, `frontend/src/pages/MySharesPage.tsx`, `frontend/src/pages/SessionHistoryPage.tsx`, `.ai/HANDOFF.md`, `.ai/CURRENT_TASK.md`, `.ai/SESSION_LOG.md`.
|
|
||||||
|
|
||||||
## 2026-04-25 12:00 America/New_York — Claude Code — Mock final AI-provider test, cache CI deps, parallelize backend with pytest-xdist
|
|
||||||
|
|
||||||
- Diagnosed why CI was still red despite Codex's local run reporting 1076 passed: a single test (`test_record_decision_persists_and_bumps_state_version`) needed `ANTHROPIC_API_KEY` because the `decision: draft_template` path calls `TemplateExtractionService` → AI provider. Patched `_extract_template_parameters` with an `AsyncMock` so the test no longer depends on AI availability. Verified.
|
|
||||||
- Pushed Codex's WIP commit `49f8856` to PR #150 (had been local-only per handoff protocol).
|
|
||||||
- PR #150 (`fix/ci-workflow-config`) extended with cheap CI wins: `actions/cache@v3` for pip + npm in all three jobs; dropped `--cov-report=term-missing` (the custom display step parses JSON); added `--maxfail=10` so structural breakage exits fast.
|
|
||||||
- PR #151 (`fix/ci-pytest-xdist`) opened, stacked on #150: pytest-xdist with per-worker DB isolation. `conftest.py` reads `PYTEST_XDIST_WORKER`, computes a per-worker DB URL like `…_gw0`, and synchronously CREATEs the DB on first import. The per-test `DROP SCHEMA public CASCADE` then operates on the worker's isolated DB. Verified locally: backend suite went from 22m 27s serial → 4m 28s parallel (8 workers), 1076 passed in both cases. ~5× speedup.
|
|
||||||
- Decided NOT to do per-test transactional rollback (bigger refactor); captured for future TODO consideration.
|
|
||||||
- Left for next session: watch CI on both PRs, merge in order (#150 first, #151 second), then enable `CI / backend (pull_request)` as a required status check on main.
|
|
||||||
- Files touched: `backend/tests/test_session_suggested_fixes_api.py`, `backend/tests/conftest.py`, `backend/requirements-dev.txt`, `.gitea/workflows/ci.yml`, `.ai/HANDOFF.md`, `.ai/CURRENT_TASK.md`, `.ai/TODO.md`.
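
The per-worker database naming described for PR #151 might look roughly like this. `worker_db_url` is a hypothetical helper name and the real `conftest.py` may differ; the sketch relies only on pytest-xdist exporting `PYTEST_XDIST_WORKER` (for example `gw0`) inside worker processes.

```python
import os
from typing import Optional


def worker_db_url(base_url: str, worker_id: Optional[str]) -> str:
    """Append the xdist worker id so each worker gets its own database.

    With no worker id (a plain serial run), the base URL is used unchanged.
    """
    if not worker_id:
        return base_url
    return f"{base_url}_{worker_id}"


base = "postgresql+asyncpg://postgres:postgres@postgres:5432/resolutionflow_test"
# PYTEST_XDIST_WORKER is unset outside xdist workers, so this falls back to base.
url = worker_db_url(base, os.environ.get("PYTEST_XDIST_WORKER"))
print(url)
```

Creating the per-worker database itself (the synchronous CREATE on first import) is left out of the sketch.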
|
|
||||||
|
|
||||||
## 2026-04-25 06:12 EDT — Codex — Fix backend suite to green
|
|
||||||
|
|
||||||
- Fixed the real backend failures left after the CI-infra cleanup: tenant-scoped seed drift, missing production `account_id` writes, public route mounting for survey/share links, Script Builder library saves, resolution output async loading, AI search schema metadata, disabled-AI fixture leakage, and prompt marker guardrails.
|
|
||||||
- Added backend CI/dev system packages required by WeasyPrint PDF export.
|
|
||||||
- Stabilized the pytest harness for pytest-asyncio/asyncpg teardown ResourceWarnings under `filterwarnings = error`.
|
|
||||||
- Verified `pytest --override-ini="addopts=" -q` inside `resolutionflow_backend`: `1076 passed, 35 deselected in 1347.41s`.
|
|
||||||
- Left for next session: commit/push if needed, check and merge PR #150 when Gitea CI is green, add backend CI as a required branch-protection check, and rerun frontend lint if final DoD requires it.
|
|
||||||
- Files touched: `.gitea/workflows/ci.yml`, `backend/Dockerfile.dev`, `backend/app/api/endpoints/folders.py`, `backend/app/api/endpoints/script_builder.py`, `backend/app/api/endpoints/shares.py`, `backend/app/api/router.py`, `backend/app/models/ai_session.py`, `backend/app/schemas/user.py`, `backend/app/services/assistant_chat_service.py`, `backend/app/services/resolution_output_generator.py`, `backend/app/services/script_builder_service.py`, `backend/pytest.ini`, `backend/tests/conftest.py`, and focused backend tests.
|
|
||||||
|
|
||||||
## 2026-04-25 02:00 America/New_York — Claude Code — Land FlowPilot + PSA, recover CI from 488 errors to ~4
|
|
||||||
|
|
||||||
- Started session by completing pending FlowPilot Phase 9 QA: ran `/qa` against the seeded fixtures, found and fixed four latent layout/state bugs (`ResolutionNotePreview` off-screen, `TemplateMatchPanel` deadlock when TaskLane closed, `EscalateInterceptDialog` clipped above viewport, `seed_test_users.py` `cancel_at_period_end` NOT NULL crash). Added a new fixture seeder `backend/scripts/seed_phase9_qa_fixtures.py` that pre-bakes the four backend states the AI orchestrator needs to emit, so future QA can exercise all 7 conditional Phase 9 components without depending on stochastic AI behavior.
|
|
||||||
- Discovered PR #141 (PSA ticket management) and `feat/flowpilot-migration` had 5 overlapping files but only 2 real conflicts (`CLAUDE.md`, `AssistantChatPage.tsx`). Both conflicts were additive, so they were resolved by keeping both sides rather than choosing one.
|
|
||||||
- Merged PSA first (PR #141), then merged FlowPilot (PR #147), each through Gitea API. `tsc -b` clean and visual smoke-test confirmed PSA's Tickets sidebar coexists with Phase 9 ProposalBanner.
|
|
||||||
- Discovered main had been merging through a broken CI gate for several merges. Initially recommended "stop the line, fix CI before shipping." After scoping the actual rot (~50% of tests red, ~600 errors on a clean run), reversed the recommendation: ship the queue first because FlowPilot itself carried significant test-infra repairs that would be duplicated work on a fresh recovery branch.
|
|
||||||
- PR #148: two surgical fixes to main (network_diagrams JSONB `server_default` triple-quote bug, deprecated session-scoped `event_loop` fixture in conftest). +78 passing / -114 errors.
|
|
||||||
- PR #149: frontend lint `20 errors → 0`, `requirements-dev.txt` pytest pin bumped to satisfy `pytest-asyncio==0.24.0`'s `pytest>=8.2`, and a one-line `from app import models as _models` in conftest that registers all ~60 models with `Base.metadata` before `create_all`. The conftest fix collapsed 484 of the remaining 488 backend errors. `1018 passed / 4 errors / 54 failed` after.
|
|
||||||
- Enabled Gitea branch protection on `main`: PR-only merges, `CI / frontend (pull_request)` required, force-push blocked, no review required.
|
|
||||||
- Discovered CI on the merge commit STILL showed red despite local pytest being mostly green. Root cause: workflow only set `DATABASE_URL`, but conftest reads only `DATABASE_TEST_URL` (per `dab740d`'s safety hardening). 638 connection-refused errors on every fixture setup. Plus `actions/upload-artifact@v4` not supported by Gitea Actions. PR #150 fixes both.
|
|
||||||
- Left for next session: merge PR #150 once CI confirms green, add `CI / backend (pull_request)` to required status checks, then root-cause and fix the 54 real backend test failures (one sample seen — `test_user` fixture leaking across calls causing duplicate-email violations).
|
|
||||||
- Files touched (committed): `backend/scripts/seed_test_users.py`, `backend/scripts/seed_phase9_qa_fixtures.py` (new), `backend/app/models/network_diagram.py`, `backend/tests/conftest.py`, `backend/requirements-dev.txt`, `frontend/src/components/pilot/ResolutionNotePreview.tsx`, `frontend/src/components/pilot/EscalateInterceptDialog.tsx`, `frontend/src/components/pilot/ScriptBuilderTab.tsx`, `frontend/src/pages/AssistantChatPage.tsx`, `frontend/src/pages/FlowPilotSessionPage.tsx`, `frontend/src/pages/TicketsPage.tsx`, `frontend/src/hooks/useFlowPilotSession.ts`, `frontend/src/hooks/useMediaQuery.ts`, `frontend/src/components/dashboard/TicketQueue.tsx`, `frontend/src/components/network/nodes/DeviceNode.tsx`, `frontend/src/components/network/nodes/GroupNode.tsx`, `frontend/src/components/routing/AssistantSessionRedirect.tsx` (new), `frontend/src/router.tsx`, `.gitea/workflows/ci.yml`, `.claude/settings.json` (new), `.claude/hooks/check-gstack.sh` (new), `.gitignore`, `CLAUDE.md`, `.gstack/qa-reports/phase9-*/` (QA artifacts).
|
|
||||||
- Net merges to main: PR #141 (PSA), PR #147 (FlowPilot), PR #148 (CI fixes part 1), PR #149 (CI fixes part 2). PR #150 still open at session end.
|
|
||||||
|
|
||||||
## 2026-04-24 — Claude Code — Migrate to dual-agent handoff system
|
## 2026-04-24 — Claude Code — Migrate to dual-agent handoff system
|
||||||
|
|
||||||
- Split CLAUDE.md into `.ai/PROJECT_CONTEXT.md` + shared-protocol root files (`CLAUDE.md`, `AGENTS.md`).
|
- Split CLAUDE.md into `.ai/PROJECT_CONTEXT.md` + shared-protocol root files (`CLAUDE.md`, `AGENTS.md`).
|
||||||
|
|||||||
15
.ai/TODO.md
@@ -5,19 +5,8 @@
|
|||||||
|
|
||||||
## Up next
|
## Up next
|
||||||
|
|
||||||
- [ ] **Parallelize backend pytest with pytest-xdist.** ✅ landing as PR #151. Verified locally: backend suite 22 min → 4m 28s with `-n auto` on the 8-core homelab runner. Per-worker DB isolation via `PYTEST_XDIST_WORKER` in conftest.py.
|
- [ ] No queued backlog yet.
|
||||||
|
|
||||||
## Backlog
|
## Backlog
|
||||||
|
|
||||||
- [ ] **Frontend lint warnings cleanup.** 23 `react-hooks/exhaustive-deps` warnings remain after PR #149 (mostly missing-deps in useEffect). Either fix them or audit them for known-safe ones and add eslint-disable comments. Not blocking CI today.
|
- [ ] No queued backlog yet.
|
||||||
- [ ] **Audit `filterwarnings` ignores added in `wip(handoff): restore backend suite to green`.** Codex added narrow `ResourceWarning` filters for unclosed socket/transport/event-loop noise from pytest-asyncio teardown. Worth periodically reviewing whether those are still needed (e.g. when bumping pytest-asyncio) — if a real warning appears in those forms it would be silenced.
|
|
||||||
- [ ] **Add `data-testid` attributes to e2e-critical interactive elements.** PR #152 fixed five Playwright tests by chasing UI-text changes (`Sessions` → `Session History`, `Account Settings` → `Account Management`, `/assistant` → `/pilot`, "Flow Sessions" tab, Resume button on session cards). Each was a one-line selector update, but every UI churn re-breaks them. Adding stable `data-testid` attributes on the targeted elements (page heading wrappers, tab nav, primary action buttons) and switching tests to `getByTestId` would make these immune to copy/route renames. Scope it small — start with `SessionHistoryPage` heading, the AI/Flow Sessions tab buttons, the per-session `Resume` button, and the command-palette FlowPilot option.
|
|
||||||
- [ ] **Per-test transactional rollback in `test_db` fixture.** A larger engineering effort than xdist (which we already shipped). Instead of `DROP SCHEMA public CASCADE` per test, wrap each test in a savepoint and roll back at teardown. ~30-40% additional speedup on top of xdist for test-DB-heavy tests. Real refactor; only worth it if the suite gets significantly larger or runs more frequently.
|
|
||||||
- [ ] **Consider `pytest-testmon` for PR-time test selection.** Tracks which tests touched which source files and only re-runs affected ones. Best for small PRs touching only a few files. Adds cache-invalidation complexity; only worth it if the suite stays painfully long even after xdist.
|
|
||||||
- [ ] **AssistantChatPage `currentChatRef` guard is a silent return** — `handleSend`, `handleTaskSubmit`, `selectChat`, `refreshFacts`, `refreshActiveFix`, and `refreshPreview` all bail with `if (currentChatRef.current !== sentForChatId) return` when stale. This is by design for chat switching, but it also silently masked the prefill-ref bug fixed in PR #153 — the user just saw "no AI response" with no log, no toast, no Sentry event. Either (a) log a `console.warn`/Sentry breadcrumb on the mismatch path so future drift is visible, or (b) split "expected stale" (chat switch) from "unexpected stale" (ref never updated) so only the latter alerts. Pair with an audit of every `currentChatRef.current = ...` assignment vs every `setActiveChatId(...)` call to make sure they're paired everywhere.
|
|
||||||
|
|
||||||
- [ ] **Allow peer-tech to escalate a colleague's session.** Today `POST /ai-sessions/{session_id}/handoff` in [endpoints/session_handoffs.py:48](backend/app/api/endpoints/session_handoffs.py#L48) filters by `AISession.user_id == current_user.id`, so only the session owner can escalate. Real MSP shops have peer hand-offs: Junior A is on lunch, Junior B sees the session is stuck and should be able to escalate it. Auth tweak: switch from session-owner check to `require_engineer_or_admin` + same-account scope. Record the actual escalator in the `handed_off_by` audit column (already exists on `SessionHandoff`) so the original-owner-vs-actual-escalator distinction is preserved. Surfaced from /plan-eng-review on the Escalation-Mode wedge plan; v1 wedge demo doesn't need this (solo-founder pilot), but capture for v2 once 3+ pilots are live and a peer-claim need surfaces.
|
|
||||||
|
|
||||||
- [ ] **Mobile/responsive design for EscalationQueue + handoff-context screen.** Pre-PMF wedge demo targets desktop only — MSP techs work on laptops/desktops in shop environments. Once 3+ paying customers exist and a tech requests mobile (likely on-call use case), spec the responsive behavior: stacked card layout below `sm:` breakpoint, full-bleed handoff-context overlay on mobile, swipe-to-claim gesture instead of Pick Up button. Surfaced from /plan-design-review on the Escalation-Mode wedge plan.
|
|
||||||
|
|
||||||
- [ ] **(MOVED IN-SCOPE for Escalation Mode v1, 2026-04-27)** ~~Add role gate to handoff claim endpoint.~~ Codex review correctly flagged this as wedge-relevant (the race-condition story depends on auth gating). Now part of the Escalation Mode v1 build, not a deferred TODO.
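
For the per-test transactional rollback item in the backlog above, the savepoint idea can be sketched in a self-contained way. This uses the standard-library `sqlite3` module purely so the example runs anywhere; the project's suite targets Postgres through its own fixtures, and the `run_isolated` helper and `users` table are hypothetical.

```python
import sqlite3
from typing import Callable


def run_isolated(conn: sqlite3.Connection, test_body: Callable[[sqlite3.Connection], None]) -> None:
    """Run test_body inside a savepoint and undo its writes afterwards."""
    conn.execute("SAVEPOINT test_case")
    try:
        test_body(conn)
    finally:
        # Undo everything written since the savepoint, then drop the savepoint.
        conn.execute("ROLLBACK TO SAVEPOINT test_case")
        conn.execute("RELEASE SAVEPOINT test_case")


# isolation_level=None puts the connection in autocommit mode, so transaction
# boundaries are controlled only by the explicit SAVEPOINT statements.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE users (email TEXT UNIQUE)")


def fake_test(c: sqlite3.Connection) -> None:
    c.execute("INSERT INTO users (email) VALUES ('test@example.com')")
    assert c.execute("SELECT COUNT(*) FROM users").fetchone()[0] == 1


run_isolated(conn, fake_test)
run_isolated(conn, fake_test)  # same email again: no duplicate-key error
remaining = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

The schema is created once and survives across tests; only the rows written inside each savepoint are discarded, which is where the saving over dropping and recreating the schema would come from.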
|
|
||||||
|
|||||||
@@ -17,13 +17,10 @@ jobs:
|
|||||||
POSTGRES_USER: postgres
|
POSTGRES_USER: postgres
|
||||||
POSTGRES_PASSWORD: postgres
|
POSTGRES_PASSWORD: postgres
|
||||||
POSTGRES_DB: resolutionflow_test
|
POSTGRES_DB: resolutionflow_test
|
||||||
# No host port mapping. Tests connect to `postgres:5432` (the service
|
ports:
|
||||||
# container's docker-network DNS name), not `localhost:5432`. With
|
- 5432:5432
|
||||||
# multiple Gitea runners on the same homelab box, host-port mapping
|
|
||||||
# would race — two backend/e2e jobs both binding 0.0.0.0:5432 → the
|
|
||||||
# second fails with "port is already allocated".
|
|
||||||
options: >-
|
options: >-
|
||||||
--health-cmd "pg_isready -U postgres"
|
--health-cmd pg_isready
|
||||||
--health-interval 10s
|
--health-interval 10s
|
||||||
--health-timeout 5s
|
--health-timeout 5s
|
||||||
--health-retries 5
|
--health-retries 5
|
||||||
@@ -31,12 +28,6 @@ jobs:
|
|||||||
env:
|
env:
|
||||||
DATABASE_URL: postgresql+asyncpg://postgres:postgres@postgres:5432/resolutionflow_test
|
DATABASE_URL: postgresql+asyncpg://postgres:postgres@postgres:5432/resolutionflow_test
|
||||||
DATABASE_URL_SYNC: postgresql://postgres:postgres@postgres:5432/resolutionflow_test
|
DATABASE_URL_SYNC: postgresql://postgres:postgres@postgres:5432/resolutionflow_test
|
||||||
# conftest.py reads DATABASE_TEST_URL only (DATABASE_URL is intentionally
|
|
||||||
# not consulted after the dab740d test-isolation hardening). The CI test
|
|
||||||
# DB is the same postgres service, so point DATABASE_TEST_URL at it
|
|
||||||
# explicitly — without this, conftest falls back to localhost:5432 and
|
|
||||||
# all tests fail at fixture setup with "connection refused".
|
|
||||||
DATABASE_TEST_URL: postgresql+asyncpg://postgres:postgres@postgres:5432/resolutionflow_test
|
|
||||||
SECRET_KEY: ci-test-secret-key-not-for-production
|
SECRET_KEY: ci-test-secret-key-not-for-production
|
||||||
DEBUG: "true"
|
DEBUG: "true"
|
||||||
APP_NAME: ResolutionFlow
|
APP_NAME: ResolutionFlow
|
||||||
@@ -46,19 +37,6 @@ jobs:
|
|||||||
steps:
|
steps:
|
||||||
- uses: actions/checkout@v4
|
- uses: actions/checkout@v4
|
||||||
|
|
||||||
- name: Cache pip
|
|
||||||
uses: actions/cache@v3
|
|
||||||
with:
|
|
||||||
path: ~/.cache/pip
|
|
||||||
key: pip-${{ runner.os }}-${{ hashFiles('backend/requirements.txt', 'backend/requirements-dev.txt') }}
|
|
||||||
restore-keys: |
|
|
||||||
pip-${{ runner.os }}-
|
|
||||||
|
|
||||||
- name: Install system dependencies
|
|
||||||
run: |
|
|
||||||
apt-get update
|
|
||||||
apt-get install -y libpango1.0-dev libcairo2-dev libgdk-pixbuf-2.0-dev libffi-dev libjpeg-dev zlib1g-dev
|
|
||||||
|
|
||||||
- name: Install dependencies
|
- name: Install dependencies
|
||||||
run: pip install --break-system-packages -r backend/requirements.txt -r backend/requirements-dev.txt
|
run: pip install --break-system-packages -r backend/requirements.txt -r backend/requirements-dev.txt
|
||||||
|
|
||||||
@@ -69,15 +47,7 @@ jobs:
|
|||||||
run: cd backend && python scripts/check_tenant_filters.py
|
run: cd backend && python scripts/check_tenant_filters.py
|
||||||
|
|
||||||
- name: Run tests with coverage
|
- name: Run tests with coverage
|
||||||
# `-n auto` parallelizes across all runner cores via pytest-xdist.
|
run: cd backend && python -m pytest --override-ini="addopts=" --cov=app --cov-report=term-missing --cov-report=json:coverage.json --cov-fail-under=50
|
||||||
# conftest.py creates a per-worker DB (resolutionflow_test_gw0,
|
|
||||||
# resolutionflow_test_gw1, …) so the per-test DROP SCHEMA doesn't
|
|
||||||
# race across workers. Master/serial runs keep the base DB.
|
|
||||||
# term-missing dropped — the custom "Display coverage summary" step
|
|
||||||
# below parses coverage.json and prints the same info more concisely.
|
|
||||||
# --maxfail=10 short-circuits on structural breakage so we don't burn
|
|
||||||
# 25 minutes when a fixture explodes.
|
|
||||||
run: cd backend && python -m pytest --override-ini="addopts=" -n auto --maxfail=10 --cov=app --cov-report=json:coverage.json --cov-fail-under=50
|
|
||||||
|
|
||||||
- name: Display coverage summary
|
- name: Display coverage summary
|
||||||
if: always()
|
if: always()
|
||||||
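The workflow comments above describe two conftest behaviours: only `DATABASE_TEST_URL` is read, and each pytest-xdist worker gets its own database (`resolutionflow_test_gw0`, `resolutionflow_test_gw1`, …). A minimal sketch of how such a resolver might look follows. The function name `resolve_test_db_url` and the `DEFAULT_TEST_URL` constant are hypothetical and not taken from the repository's `conftest.py`. The `PYTEST_XDIST_WORKER` environment variable is the one pytest-xdist sets inside worker processes.

```python
import os
from typing import Mapping, Optional

# Hypothetical fallback; the CI comments only say conftest falls back to localhost:5432.
DEFAULT_TEST_URL = (
    "postgresql+asyncpg://postgres:postgres@localhost:5432/resolutionflow_test"
)


def resolve_test_db_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the test DB URL, suffixed with the xdist worker id when present."""
    env = os.environ if env is None else env
    # Only DATABASE_TEST_URL is consulted; DATABASE_URL is deliberately ignored.
    base = env.get("DATABASE_TEST_URL", DEFAULT_TEST_URL)
    # pytest-xdist sets PYTEST_XDIST_WORKER to "gw0", "gw1", ... in each worker.
    worker = env.get("PYTEST_XDIST_WORKER")
    if not worker:
        # Serial run (or the xdist controller process): keep the base database.
        return base
    prefix, _, db_name = base.rpartition("/")
    return f"{prefix}/{db_name}_{worker}"
```

Passing the environment as a mapping keeps the function easy to exercise without mutating `os.environ`.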
@@ -105,14 +75,6 @@ jobs:
|
|||||||
steps:
|
steps:
|
||||||
- uses: actions/checkout@v4
|
- uses: actions/checkout@v4
|
||||||
|
|
||||||
- name: Cache npm
|
|
||||||
uses: actions/cache@v3
|
|
||||||
with:
|
|
||||||
path: ~/.npm
|
|
||||||
key: npm-${{ runner.os }}-${{ hashFiles('frontend/package-lock.json') }}
|
|
||||||
restore-keys: |
|
|
||||||
npm-${{ runner.os }}-
|
|
||||||
|
|
||||||
- name: Install dependencies
|
- name: Install dependencies
|
||||||
run: cd frontend && npm ci
|
run: cd frontend && npm ci
|
||||||
|
|
||||||
@@ -125,14 +87,15 @@ jobs:
|
|||||||
- name: Build
|
- name: Build
|
||||||
run: cd frontend && NODE_OPTIONS="--max-old-space-size=4096" npm run build
|
run: cd frontend && NODE_OPTIONS="--max-old-space-size=4096" npm run build
|
||||||
|
|
||||||
# Build artifact intentionally NOT uploaded. The e2e job below builds
|
- name: Upload build artifact
|
||||||
# its own frontend rather than downloading one from this job, so there
|
uses: actions/upload-artifact@v4
|
||||||
# is no need for the cross-job artifact handoff (which previously broke
|
with:
|
||||||
# on actions/upload-artifact@v4 GHES support and forced a v3 pin).
|
name: frontend-dist
|
||||||
# Decoupling also lets e2e start immediately rather than waiting for
|
path: frontend/dist
|
||||||
# this job to finish — important on a multi-runner setup.
|
retention-days: 1
|
||||||
|
|
||||||
e2e:
|
e2e:
|
||||||
|
needs: [frontend]
|
||||||
runs-on: ubuntu-latest
|
runs-on: ubuntu-latest
|
||||||
|
|
||||||
services:
|
services:
|
||||||
@@ -142,13 +105,10 @@ jobs:
|
|||||||
POSTGRES_USER: postgres
|
POSTGRES_USER: postgres
|
||||||
POSTGRES_PASSWORD: postgres
|
POSTGRES_PASSWORD: postgres
|
||||||
POSTGRES_DB: resolutionflow_test
|
POSTGRES_DB: resolutionflow_test
|
||||||
# No host port mapping. Tests connect to `postgres:5432` (the service
|
ports:
|
||||||
# container's docker-network DNS name), not `localhost:5432`. With
|
- 5432:5432
|
||||||
# multiple Gitea runners on the same homelab box, host-port mapping
|
|
||||||
# would race — two backend/e2e jobs both binding 0.0.0.0:5432 → the
|
|
||||||
# second fails with "port is already allocated".
|
|
||||||
options: >-
|
options: >-
|
||||||
--health-cmd "pg_isready -U postgres"
|
--health-cmd pg_isready
|
||||||
--health-interval 10s
|
--health-interval 10s
|
||||||
--health-timeout 5s
|
--health-timeout 5s
|
||||||
--health-retries 5
|
--health-retries 5
|
||||||
@@ -161,45 +121,21 @@ jobs:
|
|||||||
PLAYWRIGHT_SECRET_KEY: ci-playwright-secret-key
|
PLAYWRIGHT_SECRET_KEY: ci-playwright-secret-key
|
||||||
PLAYWRIGHT_TEST_EMAIL: teamadmin@resolutionflow.example.com
|
PLAYWRIGHT_TEST_EMAIL: teamadmin@resolutionflow.example.com
|
||||||
PLAYWRIGHT_TEST_PASSWORD: TestPass123!
|
PLAYWRIGHT_TEST_PASSWORD: TestPass123!
|
||||||
# AI-touching endpoints (POST /ai-sessions, /chat, /respond, etc.) are
|
|
||||||
# gated by `_require_ai_enabled()`, which returns 503 when no provider
|
|
||||||
# key is set. Tests that exercise those flows stub the AI calls in the
|
|
||||||
# browser via `page.route`, so the backend never actually contacts
|
|
||||||
# Anthropic — but the gate still has to pass. A stub value is enough.
|
|
||||||
ANTHROPIC_API_KEY: ci-stub-key-not-used-by-tests
|
|
||||||
|
|
||||||
steps:
|
steps:
|
||||||
- uses: actions/checkout@v4
|
- uses: actions/checkout@v4
|
||||||
|
|
||||||
- name: Cache pip
|
|
||||||
uses: actions/cache@v3
|
|
||||||
with:
|
|
||||||
path: ~/.cache/pip
|
|
||||||
key: pip-${{ runner.os }}-${{ hashFiles('backend/requirements.txt', 'backend/requirements-dev.txt') }}
|
|
||||||
restore-keys: |
|
|
||||||
pip-${{ runner.os }}-
|
|
||||||
|
|
||||||
- name: Cache npm
|
|
||||||
uses: actions/cache@v3
|
|
||||||
with:
|
|
||||||
path: ~/.npm
|
|
||||||
key: npm-${{ runner.os }}-${{ hashFiles('frontend/package-lock.json') }}
|
|
||||||
restore-keys: |
|
|
||||||
npm-${{ runner.os }}-
|
|
||||||
|
|
||||||
- name: Install backend dependencies
|
- name: Install backend dependencies
|
||||||
run: pip install --break-system-packages -r backend/requirements.txt -r backend/requirements-dev.txt
|
run: pip install --break-system-packages -r backend/requirements.txt -r backend/requirements-dev.txt
|
||||||
|
|
||||||
- name: Install frontend dependencies
|
- name: Install frontend dependencies
|
||||||
run: cd frontend && npm ci
|
run: cd frontend && npm ci
|
||||||
|
|
||||||
- name: Build frontend
|
- name: Download frontend build
|
||||||
# Building inline (instead of downloading an artifact from the
|
uses: actions/download-artifact@v4
|
||||||
# frontend job) drops the cross-job dependency, so e2e can start
|
with:
|
||||||
# immediately on a free runner. Adds ~1-2 min of build time, but
|
name: frontend-dist
|
||||||
# eliminates the artifact-upload mechanism entirely (no more
|
path: frontend/dist
|
||||||
# v3/v4 GHES headaches) and saves ~5 min of waiting.
|
|
||||||
run: cd frontend && NODE_OPTIONS="--max-old-space-size=4096" VITE_API_URL="${PLAYWRIGHT_API_ORIGIN}" npm run build
|
|
||||||
|
|
||||||
- name: Install Playwright browser
|
- name: Install Playwright browser
|
||||||
run: cd frontend && npx playwright install --with-deps chromium
|
run: cd frontend && npx playwright install --with-deps chromium
|
||||||
@@ -209,7 +145,7 @@ jobs:
|
|||||||
|
|
||||||
- name: Upload Playwright report
|
- name: Upload Playwright report
|
||||||
if: always()
|
if: always()
|
||||||
uses: actions/upload-artifact@v3
|
uses: actions/upload-artifact@v4
|
||||||
with:
|
with:
|
||||||
name: playwright-report
|
name: playwright-report
|
||||||
path: |
|
path: |
|
||||||
|
|||||||
@@ -5,12 +5,6 @@ WORKDIR /app
|
|||||||
RUN apt-get update && apt-get install -y \
|
RUN apt-get update && apt-get install -y \
|
||||||
gcc \
|
gcc \
|
||||||
libpq-dev \
|
libpq-dev \
|
||||||
libpango1.0-dev \
|
|
||||||
libcairo2-dev \
|
|
||||||
libgdk-pixbuf-2.0-dev \
|
|
||||||
libffi-dev \
|
|
||||||
libjpeg-dev \
|
|
||||||
zlib1g-dev \
|
|
||||||
&& rm -rf /var/lib/apt/lists/*
|
&& rm -rf /var/lib/apt/lists/*
|
||||||
|
|
||||||
COPY requirements.txt requirements-dev.txt ./
|
COPY requirements.txt requirements-dev.txt ./
|
||||||
@@ -18,4 +12,4 @@ RUN pip install --no-cache-dir -r requirements-dev.txt
|
|||||||
|
|
||||||
EXPOSE 8000
|
EXPOSE 8000
|
||||||
|
|
||||||
CMD [ "uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000", "--reload" ]
|
CMD [ "uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000", "--reload" ]
|
||||||
@@ -3,10 +3,8 @@
|
|||||||
Endpoints:
|
Endpoints:
|
||||||
GET /analytics/flowpilot?period=30d — Main dashboard data
|
GET /analytics/flowpilot?period=30d — Main dashboard data
|
||||||
GET /analytics/flowpilot/knowledge-gaps — Knowledge gap report
|
GET /analytics/flowpilot/knowledge-gaps — Knowledge gap report
|
||||||
GET /analytics/flowpilot/escalations?period=30d — Escalation handoff metrics
|
|
||||||
"""
|
"""
|
||||||
import logging
|
import logging
|
||||||
import statistics
|
|
||||||
from datetime import datetime, timezone, timedelta
|
from datetime import datetime, timezone, timedelta
|
||||||
from typing import Annotated, Optional
|
from typing import Annotated, Optional
|
||||||
|
|
||||||
@@ -15,17 +13,10 @@ from sqlalchemy import select, func, case, cast, Date, extract
|
|||||||
from sqlalchemy.ext.asyncio import AsyncSession
|
from sqlalchemy.ext.asyncio import AsyncSession
|
||||||
|
|
||||||
from app.core.rate_limit import limiter
|
from app.core.rate_limit import limiter
|
||||||
from app.api.deps import (
|
from app.api.deps import get_current_active_user, get_db, require_team_admin
|
||||||
get_current_active_user,
|
|
||||||
get_db,
|
|
||||||
require_engineer_or_admin,
|
|
||||||
require_team_admin,
|
|
||||||
)
|
|
||||||
from app.models.user import User
|
from app.models.user import User
|
||||||
from app.models.tree import Tree
|
from app.models.tree import Tree
|
||||||
from app.models.ai_session import AISession
|
from app.models.ai_session import AISession
|
||||||
from app.models.ai_session_step import AISessionStep
|
|
||||||
from app.models.session_handoff import SessionHandoff
|
|
||||||
from app.models.flow_proposal import FlowProposal
|
from app.models.flow_proposal import FlowProposal
|
||||||
from app.models.psa_activity_log import PsaActivityLog
|
from app.models.psa_activity_log import PsaActivityLog
|
||||||
from app.models.psa_post_log import PsaPostLog
|
from app.models.psa_post_log import PsaPostLog
|
||||||
@@ -45,7 +36,6 @@ from app.schemas.flowpilot_analytics import (
|
|||||||
EnhancedPsaMetrics,
|
EnhancedPsaMetrics,
|
||||||
PsaFunnel,
|
PsaFunnel,
|
||||||
PsaDailyTrend,
|
PsaDailyTrend,
|
||||||
EscalationMetrics,
|
|
||||||
)
|
)
|
||||||
from app.services.knowledge_gap_service import get_knowledge_gaps, KnowledgeGapReport
|
from app.services.knowledge_gap_service import get_knowledge_gaps, KnowledgeGapReport
|
||||||
|
|
||||||
@@ -737,104 +727,3 @@ async def get_enhanced_psa_metrics(
|
|||||||
push_funnel=push_funnel,
|
push_funnel=push_funnel,
|
||||||
daily_trend=daily_trend,
|
daily_trend=daily_trend,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
# ─── Escalation Mode metrics (wedge stat for /escalations queue + analytics page)
|
|
||||||
#
|
|
||||||
# Pulls all (handoff.claimed_at, first_step_after_claim.created_at) pairs in the
|
|
||||||
# window and aggregates avg/median/p95 of the delta in Python. Pilot scale
|
|
||||||
# (~1k rows max per account per month) makes this cheaper and clearer than
|
|
||||||
# Postgres percentile_cont gymnastics.
|
|
||||||
#
|
|
||||||
# IMPORTANT: this is the in-product metric only. The "minutes recovered"
|
|
||||||
# sales claim requires manual baseline measurement (see The Assignment in
|
|
||||||
# docs/plans/2026-04-27-escalation-mode-wedge-design.md).
|
|
||||||
|
|
||||||
|
|
||||||
@router.get("/escalations", response_model=EscalationMetrics)
|
|
||||||
@limiter.limit("30/minute")
|
|
||||||
async def get_escalation_metrics(
|
|
||||||
request: Request,
|
|
||||||
current_user: Annotated[User, Depends(get_current_active_user)],
|
|
||||||
db: Annotated[AsyncSession, Depends(get_db)],
|
|
||||||
_: None = Depends(require_engineer_or_admin),
|
|
||||||
period: str = Query("30d", pattern="^(7d|30d|90d)$"),
|
|
||||||
) -> EscalationMetrics:
|
|
||||||
"""Time-to-first-action after escalation claim, account-scoped.
|
|
||||||
|
|
||||||
Returns:
|
|
||||||
n_handoffs_claimed: handoffs in window that were claimed by a senior.
|
|
||||||
n_handoffs_with_action: subset where the senior took at least one
|
|
||||||
action (an ai_session_step row created after claimed_at).
|
|
||||||
avg/median/p95_seconds_to_first_action: aggregates of
|
|
||||||
(first_step.created_at - claimed_at) in seconds.
|
|
||||||
|
|
||||||
Handoffs where claimed_at IS NULL (never claimed) are excluded entirely.
|
|
||||||
Claimed handoffs with no ai_session_step after the claim are left out of
|
|
||||||
the timing aggregates but still count toward n_handoffs_claimed, so the
|
|
||||||
conversion rate is visible.
|
|
||||||
"""
|
|
||||||
if not current_user.account_id:
|
|
||||||
raise HTTPException(
|
|
||||||
status_code=status.HTTP_403_FORBIDDEN, detail="No account"
|
|
||||||
)
|
|
||||||
|
|
||||||
account_id = current_user.account_id
|
|
||||||
period_start = _get_period_start(period)
|
|
||||||
|
|
||||||
# First-action timestamp per handoff via correlated scalar subquery.
|
|
||||||
first_action_subq = (
|
|
||||||
select(func.min(AISessionStep.created_at))
|
|
||||||
.where(
|
|
||||||
AISessionStep.session_id == SessionHandoff.session_id,
|
|
||||||
AISessionStep.created_at > SessionHandoff.claimed_at,
|
|
||||||
)
|
|
||||||
.correlate(SessionHandoff)
|
|
||||||
.scalar_subquery()
|
|
||||||
)
|
|
||||||
|
|
||||||
rows = (
|
|
||||||
await db.execute(
|
|
||||||
select(
|
|
||||||
SessionHandoff.claimed_at,
|
|
||||||
first_action_subq.label("first_action_at"),
|
|
||||||
).where(
|
|
||||||
SessionHandoff.account_id == account_id,
|
|
||||||
SessionHandoff.claimed_at.isnot(None),
|
|
||||||
SessionHandoff.claimed_at >= period_start,
|
|
||||||
)
|
|
||||||
)
|
|
||||||
).all()
|
|
||||||
|
|
||||||
n_handoffs_claimed = len(rows)
|
|
||||||
deltas: list[float] = []
|
|
||||||
for claimed_at, first_action_at in rows:
|
|
||||||
if first_action_at is None:
|
|
||||||
continue
|
|
||||||
delta_s = (first_action_at - claimed_at).total_seconds()
|
|
||||||
# Floor at zero — clock drift between rows could in theory yield a
|
|
||||||
# tiny negative if a step's created_at races claimed_at. Surface as
|
|
||||||
# 0s rather than absurd negative deltas.
|
|
||||||
if delta_s < 0:
|
|
||||||
delta_s = 0.0
|
|
||||||
deltas.append(delta_s)
|
|
||||||
|
|
||||||
n_handoffs_with_action = len(deltas)
|
|
||||||
if n_handoffs_with_action == 0:
|
|
||||||
return EscalationMetrics(
|
|
||||||
period=period,
|
|
||||||
n_handoffs_claimed=n_handoffs_claimed,
|
|
||||||
n_handoffs_with_action=0,
|
|
||||||
)
|
|
||||||
|
|
||||||
sorted_deltas = sorted(deltas)
|
|
||||||
p95_idx = max(0, int(round(0.95 * (n_handoffs_with_action - 1))))
|
|
||||||
|
|
||||||
return EscalationMetrics(
|
|
||||||
period=period,
|
|
||||||
n_handoffs_claimed=n_handoffs_claimed,
|
|
||||||
n_handoffs_with_action=n_handoffs_with_action,
|
|
||||||
avg_seconds_to_first_action=round(statistics.fmean(deltas), 2),
|
|
||||||
median_seconds_to_first_action=round(statistics.median(deltas), 2),
|
|
||||||
p95_seconds_to_first_action=round(sorted_deltas[p95_idx], 2),
|
|
||||||
)
|
|
||||||
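The aggregation step of `get_escalation_metrics` can be read in isolation from the database query. The sketch below reproduces the same arithmetic (floor negative deltas at zero, then mean, median, and a rounded-index p95) on a plain list of seconds. The helper name `summarize_deltas` and the dictionary keys are illustrative assumptions, not part of the endpoint.

```python
import statistics
from typing import Optional


def summarize_deltas(deltas: list[float]) -> Optional[dict[str, float]]:
    """Aggregate claim-to-first-action deltas (in seconds)."""
    if not deltas:
        # Mirrors the endpoint's early return when no handoff has an action yet.
        return None
    # Floor at zero so a step racing the claim never yields a negative delta.
    ordered = sorted(max(0.0, d) for d in deltas)
    # Rounded-index percentile: pick the element nearest to 95% of the way in.
    p95_idx = max(0, int(round(0.95 * (len(ordered) - 1))))
    return {
        "avg": round(statistics.fmean(ordered), 2),
        "median": round(statistics.median(ordered), 2),
        "p95": round(ordered[p95_idx], 2),
    }
```

With very few samples the p95 is simply the largest value, which is worth keeping in mind when reading the stat card for a new pilot account.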
|
|||||||
@@ -194,7 +194,6 @@ async def create_folder(
|
|||||||
|
|
||||||
new_folder = UserFolder(
|
new_folder = UserFolder(
|
||||||
user_id=current_user.id,
|
user_id=current_user.id,
|
||||||
account_id=current_user.account_id,
|
|
||||||
name=folder_data.name,
|
name=folder_data.name,
|
||||||
color=folder_data.color,
|
color=folder_data.color,
|
||||||
icon=folder_data.icon,
|
icon=folder_data.icon,
|
||||||
|
|||||||
@@ -260,7 +260,6 @@ async def save_to_library(
|
|||||||
category_id=data.category_id,
|
category_id=data.category_id,
|
||||||
share_with_team=data.share_with_team,
|
share_with_team=data.share_with_team,
|
||||||
user_id=current_user.id,
|
user_id=current_user.id,
|
||||||
account_id=current_user.account_id,
|
|
||||||
team_id=current_user.team_id,
|
team_id=current_user.team_id,
|
||||||
script_body=data.script_body,
|
script_body=data.script_body,
|
||||||
parameters_schema=data.parameters_schema,
|
parameters_schema=data.parameters_schema,
|
||||||
|
|||||||
@@ -1,24 +1,19 @@
|
|||||||
"""Handoff endpoints — unified park/escalate.
|
"""Handoff endpoints — unified park/escalate.
|
||||||
|
|
||||||
POST /ai-sessions/{id}/handoff — Create handoff
|
POST /ai-sessions/{id}/handoff — Create handoff
|
||||||
GET /ai-sessions/{id}/handoffs — Handoff history
|
GET /ai-sessions/{id}/handoffs — Handoff history
|
||||||
POST /ai-sessions/{id}/handoffs/{hid}/claim — Claim session
|
POST /ai-sessions/{id}/handoffs/{hid}/claim — Claim session
|
||||||
GET /ai-sessions/queue — Team queue
|
GET /ai-sessions/queue — Team queue
|
||||||
GET /ai-sessions/escalations/stream — SSE: live escalation arrivals
|
|
||||||
"""
|
"""
|
||||||
import asyncio
|
|
||||||
import json
|
|
||||||
import logging
|
import logging
|
||||||
from typing import Annotated, AsyncGenerator
|
from typing import Annotated
|
||||||
from uuid import UUID
|
from uuid import UUID
|
||||||
|
|
||||||
from fastapi import APIRouter, Depends, HTTPException, Request, status
|
from fastapi import APIRouter, Depends, HTTPException, status
|
||||||
from fastapi.responses import StreamingResponse
|
|
||||||
from sqlalchemy import select
|
from sqlalchemy import select
|
||||||
from sqlalchemy.ext.asyncio import AsyncSession
|
from sqlalchemy.ext.asyncio import AsyncSession
|
||||||
|
|
||||||
from app.api.deps import get_current_active_user, get_db, require_engineer_or_admin
|
from app.api.deps import get_current_active_user, get_db
|
||||||
from app.core.escalation_bus import bus as escalation_bus
|
|
||||||
from app.models.user import User
|
from app.models.user import User
|
||||||
from app.models.ai_session import AISession
|
from app.models.ai_session import AISession
|
||||||
from app.models.session_handoff import SessionHandoff
|
from app.models.session_handoff import SessionHandoff
|
||||||
@@ -68,13 +63,6 @@ async def create_handoff(
|
|||||||
raise HTTPException(status_code=400, detail=str(e))
|
raise HTTPException(status_code=400, detail=str(e))
|
||||||
|
|
||||||
await db.commit()
|
await db.commit()
|
||||||
|
|
||||||
# Best-effort notification dispatch AFTER commit so we never email about
|
|
||||||
# a rolled-back handoff. Failures are swallowed inside the manager —
|
|
||||||
# handoff creation is authoritative; notifications are advisory.
|
|
||||||
if handoff.intent == "escalate":
|
|
||||||
await manager.dispatch_escalation_notifications(handoff)
|
|
||||||
|
|
||||||
return HandoffResponse.model_validate(handoff)
|
return HandoffResponse.model_validate(handoff)
|
||||||
|
|
||||||
|
|
||||||
@@ -98,16 +86,10 @@ async def list_handoffs(
|
|||||||
async def claim_handoff(
|
async def claim_handoff(
|
||||||
session_id: UUID,
|
session_id: UUID,
|
||||||
handoff_id: UUID,
|
handoff_id: UUID,
|
||||||
current_user: Annotated[User, Depends(require_engineer_or_admin)],
|
current_user: Annotated[User, Depends(get_current_active_user)],
|
||||||
db: Annotated[AsyncSession, Depends(get_db)],
|
db: Annotated[AsyncSession, Depends(get_db)],
|
||||||
) -> HandoffResponse:
|
) -> HandoffResponse:
|
||||||
"""Claim a handed-off session.
|
"""Claim a handed-off session."""
|
||||||
|
|
||||||
Role-gated to engineer/admin/owner — viewers cannot claim. The race-condition
|
|
||||||
story (two seniors clicking Pick Up simultaneously) depends on auth gating
|
|
||||||
for audit integrity. Codex review flagged this as wedge-relevant; locked
|
|
||||||
in-scope for Escalation Mode v1.
|
|
||||||
"""
|
|
||||||
manager = HandoffManager(db)
|
manager = HandoffManager(db)
|
||||||
try:
|
try:
|
||||||
handoff = await manager.claim_session(
|
handoff = await manager.claim_session(
|
||||||
@@ -132,80 +114,3 @@ async def get_queue(
|
|||||||
team_id=current_user.team_id,
|
team_id=current_user.team_id,
|
||||||
account_id=current_user.account_id,
|
account_id=current_user.account_id,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
# ─── Live escalation arrivals (SSE) ──────────────────────────────────────────
|
|
||||||
#
|
|
||||||
# Streams `handoff_created` events to subscribers in the same account_id as
|
|
||||||
# the new handoff. Connected EscalationQueue instances prepend the new card
|
|
||||||
# with the locked 200ms slide-in. Account-scoped: cross-tenant leakage is
|
|
||||||
# prevented at the bus.publish boundary (only handoff.account_id subscribers
|
|
||||||
# are notified) and re-enforced here by binding the subscription to
|
|
||||||
# current_user.account_id.
|
|
||||||
#
|
|
||||||
# Heartbeat: a `: keepalive\n\n` SSE comment every 25s keeps the connection
|
|
||||||
# alive through Railway / nginx default 60s idle timeouts. Reconnect policy
|
|
||||||
# is on the client (browser EventSource auto-reconnects; our fetch-based
|
|
||||||
# reader retries with backoff).
|
|
||||||
|
|
||||||
|
|
||||||
_HEARTBEAT_INTERVAL_S = 25
|
|
||||||
_QUEUE_GET_TIMEOUT_S = 25 # equals the heartbeat interval, so an idle stream emits a keepalive every 25s
|
|
||||||
|
|
||||||
|
|
||||||
@queue_router.get("/escalations/stream")
|
|
||||||
async def stream_escalations(
|
|
||||||
request: Request,
|
|
||||||
current_user: Annotated[User, Depends(require_engineer_or_admin)],
|
|
||||||
):
|
|
||||||
"""SSE stream of new escalation arrivals for the current user's account.
|
|
||||||
|
|
||||||
Role-gated to engineer/admin/owner so viewers can't subscribe (matches
|
|
||||||
the queue + claim role surface). One open connection per browser tab is
|
|
||||||
expected; the bus handles fan-out.
|
|
||||||
"""
|
|
||||||
if not current_user.account_id:
|
|
||||||
raise HTTPException(
|
|
||||||
status_code=status.HTTP_403_FORBIDDEN, detail="No account"
|
|
||||||
)
|
|
||||||
|
|
||||||
account_id = current_user.account_id
|
|
||||||
|
|
||||||
async def event_generator() -> AsyncGenerator[str, None]:
|
|
||||||
queue = await escalation_bus.subscribe(account_id)
|
|
||||||
try:
|
|
||||||
# Initial hello so the client knows the stream is live.
|
|
||||||
yield (
|
|
||||||
"event: ready\n"
|
|
||||||
f"data: {json.dumps({'account_id': str(account_id)})}\n\n"
|
|
||||||
)
|
|
||||||
|
|
||||||
while True:
|
|
||||||
if await request.is_disconnected():
|
|
||||||
break
|
|
||||||
try:
|
|
||||||
event = await asyncio.wait_for(
|
|
||||||
queue.get(), timeout=_QUEUE_GET_TIMEOUT_S
|
|
||||||
)
|
|
||||||
except asyncio.TimeoutError:
|
|
||||||
# Heartbeat keeps the connection alive through proxies.
|
|
||||||
yield ": keepalive\n\n"
|
|
||||||
continue
|
|
||||||
|
|
||||||
event_type = event.get("type", "message")
|
|
||||||
yield (
|
|
||||||
f"event: {event_type}\n"
|
|
||||||
f"data: {json.dumps(event)}\n\n"
|
|
||||||
)
|
|
||||||
finally:
|
|
||||||
await escalation_bus.unsubscribe(account_id, queue)
|
|
||||||
|
|
||||||
return StreamingResponse(
|
|
||||||
event_generator(),
|
|
||||||
media_type="text/event-stream",
|
|
||||||
headers={
|
|
||||||
"Cache-Control": "no-cache",
|
|
||||||
"Connection": "keep-alive",
|
|
||||||
"X-Accel-Buffering": "no",
|
|
||||||
},
|
|
||||||
)
|
|
||||||
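The streaming loop in `stream_escalations` combines two ideas: SSE framing (`event:` and `data:` lines ending in a blank line) and a queue read with a timeout that falls back to a comment-line heartbeat. The sketch below shows both without FastAPI. The names `format_sse`, `frames`, and `demo`, and the `max_frames` bound, are assumptions added so the example terminates; the real endpoint loops until the client disconnects.

```python
import asyncio
import json
from typing import Any, AsyncGenerator


def format_sse(event: dict[str, Any]) -> str:
    """Render one event as an SSE frame; a blank line terminates the frame."""
    event_type = event.get("type", "message")
    return f"event: {event_type}\ndata: {json.dumps(event)}\n\n"


async def frames(
    queue: asyncio.Queue, idle_timeout_s: float, max_frames: int
) -> AsyncGenerator[str, None]:
    """Yield SSE frames, or a keepalive comment when the queue stays idle."""
    sent = 0
    while sent < max_frames:
        try:
            event = await asyncio.wait_for(queue.get(), timeout=idle_timeout_s)
        except asyncio.TimeoutError:
            # Lines starting with ":" are SSE comments, ignored by clients.
            yield ": keepalive\n\n"
        else:
            yield format_sse(event)
        sent += 1


async def demo() -> list[str]:
    queue: asyncio.Queue = asyncio.Queue()
    queue.put_nowait({"type": "handoff_created", "handoff_id": "h-1"})
    # First frame is the queued event; the second is a heartbeat after the timeout.
    return [f async for f in frames(queue, idle_timeout_s=0.01, max_frames=2)]
```

Running `asyncio.run(demo())` produces one event frame followed by one keepalive frame.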
|
|||||||
@@ -20,7 +20,6 @@ from app.core.audit import log_audit
|
|||||||
from app.core.rate_limit import limiter
|
from app.core.rate_limit import limiter
|
||||||
|
|
||||||
router = APIRouter(tags=["shares"])
|
router = APIRouter(tags=["shares"])
|
||||||
public_router = APIRouter(tags=["shares"])
|
|
||||||
|
|
||||||
|
|
||||||
def build_share_response(share: SessionShare) -> ShareResponse:
|
def build_share_response(share: SessionShare) -> ShareResponse:
|
||||||
@@ -207,7 +206,7 @@ async def _get_optional_user(request: Request, db: AsyncSession) -> Optional[Use
|
|||||||
return None
|
return None
|
||||||
|
|
||||||
|
|
||||||
@public_router.get("/share/{share_token}", response_model=SharePublicView)
|
@router.get("/share/{share_token}", response_model=SharePublicView)
|
||||||
@limiter.limit("30/minute")
|
@limiter.limit("30/minute")
|
||||||
async def access_share(
|
async def access_share(
|
||||||
share_token: str,
|
share_token: str,
|
||||||
|
|||||||
@@ -78,11 +78,9 @@ api_router = APIRouter()
|
|||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
api_router.include_router(auth.router)
|
api_router.include_router(auth.router)
|
||||||
api_router.include_router(shared.router) # Public share links (no auth)
|
api_router.include_router(shared.router) # Public share links (no auth)
|
||||||
api_router.include_router(shares.public_router) # Public session share links (optional auth)
|
|
||||||
api_router.include_router(beta_signup.router)
|
api_router.include_router(beta_signup.router)
|
||||||
api_router.include_router(webhooks.router) # Stripe webhook receiver
|
api_router.include_router(webhooks.router) # Stripe webhook receiver
|
||||||
api_router.include_router(public_templates.router) # Public gallery (no auth, rate-limited)
|
api_router.include_router(public_templates.router) # Public gallery (no auth, rate-limited)
|
||||||
api_router.include_router(survey.router) # Public survey flow (no auth, rate-limited)
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
# Admin endpoints — super_admin only
|
# Admin endpoints — super_admin only
|
||||||
@@ -127,6 +125,7 @@ api_router.include_router(ai_fix.router, dependencies=_tenant_deps)
|
|||||||
api_router.include_router(ai_chat.router, dependencies=_tenant_deps)
|
api_router.include_router(ai_chat.router, dependencies=_tenant_deps)
|
||||||
api_router.include_router(copilot.router, dependencies=_tenant_deps)
|
api_router.include_router(copilot.router, dependencies=_tenant_deps)
|
||||||
api_router.include_router(assistant_chat.router, dependencies=_tenant_deps)
|
api_router.include_router(assistant_chat.router, dependencies=_tenant_deps)
|
||||||
|
api_router.include_router(survey.router, dependencies=_tenant_deps)
|
||||||
api_router.include_router(tree_transfer.router, dependencies=_tenant_deps)
|
api_router.include_router(tree_transfer.router, dependencies=_tenant_deps)
|
||||||
api_router.include_router(ai_suggestions.router, dependencies=_tenant_deps)
|
api_router.include_router(ai_suggestions.router, dependencies=_tenant_deps)
|
||||||
api_router.include_router(kb_accelerator.router, dependencies=_tenant_deps)
|
api_router.include_router(kb_accelerator.router, dependencies=_tenant_deps)
|
||||||
|
|||||||
@@ -1,97 +0,0 @@
|
|||||||
"""In-memory pub/sub bus for live escalation events.
|
|
||||||
|
|
||||||
Single-process, non-durable. When a handoff fires, every connected SSE
|
|
||||||
subscriber for the same `account_id` receives the event. Subscribers come
|
|
||||||
and go as senior techs open and close the EscalationQueue page.
|
|
||||||
|
|
||||||
Pre-PMF scale (3 pilots × 5-20 techs/MSP = ~15-60 concurrent subscribers
|
|
||||||
total, single Railway replica) makes in-memory the right call. When the
|
|
||||||
deployment scales horizontally, swap this for Redis pub/sub or similar —
|
|
||||||
the public surface (`publish` / `subscribe`) is intentionally narrow so
|
|
||||||
the swap is local.
|
|
-
-Events are JSON-serializable dicts. `publish()` is non-blocking (drops the
-event if a subscriber's queue is full rather than back-pressuring the
-caller). `subscribe()` MUST be paired with `unsubscribe()` in a finally
-block, or you leak queues.
-"""
-from __future__ import annotations
-
-import asyncio
-import logging
-from typing import Any
-from uuid import UUID
-
-logger = logging.getLogger(__name__)
-
-
-# Bound how many unconsumed events can sit in a subscriber's queue before
-# we start dropping. 64 is generous for the queue-page use case; if a
-# subscriber is that far behind, they're probably gone or stuck.
-_QUEUE_MAXSIZE = 64
-
-
-class EscalationBus:
-    """Account-scoped pub/sub for escalation arrival events."""
-
-    def __init__(self) -> None:
-        self._subscribers: dict[UUID, set[asyncio.Queue[dict[str, Any]]]] = {}
-        self._lock = asyncio.Lock()
-
-    async def subscribe(self, account_id: UUID) -> asyncio.Queue[dict[str, Any]]:
-        """Register a new subscriber queue for an account.
-
-        Caller must invoke `unsubscribe(account_id, queue)` when the
-        consumer disconnects.
-        """
-        queue: asyncio.Queue[dict[str, Any]] = asyncio.Queue(
-            maxsize=_QUEUE_MAXSIZE
-        )
-        async with self._lock:
-            self._subscribers.setdefault(account_id, set()).add(queue)
-        return queue
-
-    async def unsubscribe(
-        self, account_id: UUID, queue: asyncio.Queue[dict[str, Any]]
-    ) -> None:
-        async with self._lock:
-            subs = self._subscribers.get(account_id)
-            if subs is None:
-                return
-            subs.discard(queue)
-            if not subs:
-                self._subscribers.pop(account_id, None)
-
-    async def publish(self, account_id: UUID, event: dict[str, Any]) -> int:
-        """Fan event out to every subscriber for `account_id`.
-
-        Returns the number of subscribers that successfully received the
-        event. Drops the event for any subscriber whose queue is full
-        (logs at warning level).
-        """
-        async with self._lock:
-            subs = list(self._subscribers.get(account_id, ()))
-        if not subs:
-            return 0
-        delivered = 0
-        for queue in subs:
-            try:
-                queue.put_nowait(event)
-                delivered += 1
-            except asyncio.QueueFull:
-                logger.warning(
-                    "EscalationBus: dropped event for full subscriber queue "
-                    "(account_id=%s, event=%s)",
-                    account_id,
-                    event.get("type", "?"),
-                )
-        return delivered
-
-    def subscriber_count(self, account_id: UUID) -> int:
-        """Diagnostic — number of active subscribers for an account."""
-        return len(self._subscribers.get(account_id, ()))
-
-
-# Module-level singleton. FastAPI imports this; `subscribe()` and `publish()`
-# are coroutine-safe via the internal Lock.
-bus = EscalationBus()
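The module docstring above requires every `subscribe()` to be paired with an `unsubscribe()` in a `finally` block. A minimal sketch of that pairing follows. It is an illustration under stated assumptions, not the project's code: `MiniBus` is a hypothetical stand-in that mirrors only the subscribe/unsubscribe/publish surface visible in this diff (without the lock or logging), and `consume_one` and `demo` are invented helper names.

```python
from __future__ import annotations

import asyncio
from typing import Any
from uuid import UUID, uuid4


class MiniBus:
    """Hypothetical stand-in for the bus surface shown in the diff (no lock, no logging)."""

    def __init__(self) -> None:
        self._subscribers: dict[UUID, set[asyncio.Queue[dict[str, Any]]]] = {}

    async def subscribe(self, account_id: UUID) -> asyncio.Queue[dict[str, Any]]:
        queue: asyncio.Queue[dict[str, Any]] = asyncio.Queue(maxsize=64)
        self._subscribers.setdefault(account_id, set()).add(queue)
        return queue

    async def unsubscribe(
        self, account_id: UUID, queue: asyncio.Queue[dict[str, Any]]
    ) -> None:
        subs = self._subscribers.get(account_id)
        if subs is None:
            return
        subs.discard(queue)
        if not subs:
            self._subscribers.pop(account_id, None)

    async def publish(self, account_id: UUID, event: dict[str, Any]) -> int:
        delivered = 0
        for queue in list(self._subscribers.get(account_id, ())):
            try:
                queue.put_nowait(event)
                delivered += 1
            except asyncio.QueueFull:
                pass  # drop the event for a full queue instead of blocking the publisher
        return delivered

    def subscriber_count(self, account_id: UUID) -> int:
        return len(self._subscribers.get(account_id, ()))


async def consume_one(bus: MiniBus, account_id: UUID, timeout: float = 1.0) -> dict[str, Any]:
    """Wait for a single event; the finally block guarantees the queue is released."""
    queue = await bus.subscribe(account_id)
    try:
        return await asyncio.wait_for(queue.get(), timeout=timeout)
    finally:
        # Runs on success, timeout, and cancellation alike, so no queue leaks.
        await bus.unsubscribe(account_id, queue)


async def demo() -> tuple[int, dict[str, Any], int]:
    bus = MiniBus()
    account = uuid4()
    consumer = asyncio.create_task(consume_one(bus, account))
    while bus.subscriber_count(account) == 0:
        await asyncio.sleep(0)  # yield until the consumer has subscribed
    delivered = await bus.publish(account, {"type": "handoff_created"})
    event = await consumer
    return delivered, event, bus.subscriber_count(account)
```

Running `asyncio.run(demo())` exercises one full subscribe, publish, unsubscribe cycle; the last element of the returned tuple is the subscriber count after the consumer's `finally` block has run.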
@@ -10,7 +10,7 @@ from typing import Optional, Any, TYPE_CHECKING
 from sqlalchemy import String, Text, DateTime, ForeignKey, Boolean, Integer, Float, CheckConstraint
 import sqlalchemy as sa
 from sqlalchemy.orm import Mapped, mapped_column, relationship
-from sqlalchemy.dialects.postgresql import UUID, JSONB, TSVECTOR
+from sqlalchemy.dialects.postgresql import UUID, JSONB

 from app.core.database import Base

@@ -46,7 +46,6 @@ class AISession(Base):
             "confidence_tier IN ('guided', 'exploring', 'discovery')",
             name="ck_ai_sessions_confidence_tier",
         ),
-        sa.Index("idx_ai_sessions_search", "search_vector", postgresql_using="gin"),
     )

     id: Mapped[uuid.UUID] = mapped_column(
@@ -151,18 +150,6 @@ class AISession(Base):
         Text, nullable=True,
         comment="Why escalated (set on escalation)",
     )
-    search_vector: Mapped[Optional[str]] = mapped_column(
-        TSVECTOR,
-        sa.Computed(
-            "to_tsvector('english', "
-            "coalesce(problem_summary, '') || ' ' || "
-            "coalesce(resolution_summary, '') || ' ' || "
-            "coalesce(escalation_reason, '') || ' ' || "
-            "coalesce(problem_domain, ''))",
-            persisted=True,
-        ),
-        nullable=True,
-    )
     escalation_package: Mapped[Optional[dict[str, Any]]] = mapped_column(
         JSONB, nullable=True,
         comment="Context package for receiving engineer: steps_tried, hypotheses, suggestions",
@@ -124,26 +124,3 @@ class FlowPilotDashboard(BaseModel):
     confidence_breakdown: ConfidenceBreakdown
     knowledge_coverage: KnowledgeCoverage
     psa_metrics: PsaMetrics | None = None
-
-
-class EscalationMetrics(BaseModel):
-    """In-product time-to-first-action metric for the Escalation Mode wedge.
-
-    NOTE: this is the *in-product* metric (post-claim time-to-first-action). The
-    "minutes recovered" sales claim requires a manual baseline measurement of the
-    pre-Escalation-Mode verbal-handoff time. See
-    docs/plans/2026-04-27-escalation-mode-wedge-design.md for the two-metric
-    framing — do not roll this number alone into "minutes recovered."
-    """
-
-    period: str
-    n_handoffs_claimed: int
-    n_handoffs_with_action: int
-    avg_seconds_to_first_action: float | None = None
-    median_seconds_to_first_action: float | None = None
-    p95_seconds_to_first_action: float | None = None
-    metric_definition: str = (
-        "elapsed_seconds(first ai_session_step in session where "
-        "created_at > SessionHandoff.claimed_at) — measures post-claim activity "
-        "lag, NOT verbal-handoff savings. Pair with manual baseline."
-    )
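The `metric_definition` string in the removed `EscalationMetrics` schema describes the measurement as the elapsed time from `claimed_at` to the first step created strictly after it. The sketch below shows one way those three statistics could be derived from raw timestamps. The endpoint's actual query is not part of this diff, so everything here is an assumption: the function names are hypothetical, and the nearest-rank percentile is only one of several reasonable p95 definitions.

```python
from __future__ import annotations

import math
import statistics
from datetime import datetime, timedelta, timezone


def seconds_to_first_action(
    claimed_at: datetime, step_times: list[datetime]
) -> float | None:
    """Seconds from the claim to the first step created strictly after it, else None."""
    later = [t for t in step_times if t > claimed_at]
    if not later:
        return None  # claimed, but no post-claim activity yet
    return (min(later) - claimed_at).total_seconds()


def summarize(
    samples: list[float],
) -> tuple[float | None, float | None, float | None]:
    """Return (avg, median, p95); all None when there are no samples."""
    if not samples:
        return None, None, None
    ordered = sorted(samples)
    rank = math.ceil(0.95 * len(ordered))  # nearest-rank percentile, 1-based
    return statistics.fmean(ordered), statistics.median(ordered), ordered[rank - 1]


claimed = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
steps = [
    claimed - timedelta(minutes=5),  # pre-claim step, ignored
    claimed + timedelta(seconds=30),
    claimed + timedelta(seconds=90),
]
first_lag = seconds_to_first_action(claimed, steps)
stats = summarize([10.0, 20.0, 30.0, 40.0])
```

Returning `None` for an empty sample set matches the schema's `float | None = None` defaults, which keeps "no data" distinct from a genuine zero-second lag.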
@@ -68,6 +68,4 @@ class RoleUpdate(BaseModel):


 class AccountRoleUpdate(BaseModel):
-    # Ownership changes must go through the explicit transfer-ownership flow so
-    # account.owner_id stays consistent with user.account_role.
-    account_role: str = Field(..., pattern="^(admin|engineer|viewer)$")
+    account_role: str = Field(..., pattern="^(owner|admin|engineer|viewer)$")
@@ -300,14 +300,13 @@ To create a fork, append this marker AFTER your [QUESTIONS]/[ACTIONS] markers:
 When you identify a second distinct issue that is clearly separate from the primary topic \
 of this session, suggest creating a spin-off ticket using the [ACTIONS] marker below. \
 Use this sparingly — only when the issue is genuinely independent, not for every tangential mention.
-Use `create_spin_off_ticket` as the command value for this action.

 Format:
 [ACTIONS]
 [
 {
 "label": "Create ticket: <brief issue title>",
-"command": "<spin-off ticket action command>",
+"command": "create_spin_off_ticket",
 "description": "<one sentence description of the separate issue>"
 }
 ]
@@ -4,7 +4,6 @@ Creates handoff snapshots, AI assessments (for escalations), claim workflow,
 and queue queries. Dual-writes to ai_sessions.escalation_package for
 backward compatibility with the existing escalation queue.
 """
-import asyncio
 import logging
 from datetime import datetime, timezone
 from typing import Any
@@ -13,13 +12,9 @@ from uuid import UUID
 from sqlalchemy import select
 from sqlalchemy.ext.asyncio import AsyncSession

-from app.core.config import settings
-from app.core.email import EmailService
-from app.core.escalation_bus import bus as escalation_bus
 from app.models.ai_session import AISession
 from app.models.session_branch import SessionBranch
 from app.models.session_handoff import SessionHandoff
-from app.models.user import User

 logger = logging.getLogger(__name__)

@@ -92,125 +87,6 @@ class HandoffManager:
         await self.db.flush()
         return handoff

-    async def dispatch_escalation_notifications(
-        self, handoff: SessionHandoff
-    ) -> int:
-        """Email engineer-or-admin users in the account about a new escalation.
-
-        Call this AFTER `db.commit()` has succeeded — sending email for a
-        rolled-back handoff is the kind of trust-erosion bug that makes pilot
-        customers stop trusting the tool. Returns the number of recipients
-        successfully emailed (best-effort, not authoritative).
-
-        Failures are logged but never raise: the wedge demo's reliability
-        story is "handoff creation always succeeds; notification is best-effort,"
-        not "handoff creation depends on the email service being up." This is
-        the graceful-degradation regression the eng + codex reviews both
-        flagged as critical.
-
-        Per-channel delivery records (Codex correction on the dead
-        `notification_sent` boolean) are a v1.x story — for now the
-        application logs are the audit trail.
-        """
-        if handoff.intent != "escalate":
-            return 0
-
-        # Publish to the in-memory bus first so connected senior-tech inboxes
-        # see the new card slide in within ~1s of escalate. This path is
-        # fire-and-forget (no IO, just memory) so it can sit ahead of the
-        # email fan-out.
-        try:
-            await escalation_bus.publish(
-                handoff.account_id,
-                {
-                    "type": "handoff_created",
-                    "handoff_id": str(handoff.id),
-                    "session_id": str(handoff.session_id),
-                    "priority": handoff.priority,
-                    "engineer_notes": handoff.engineer_notes or "",
-                    "created_at": handoff.created_at.isoformat()
-                    if handoff.created_at
-                    else None,
-                },
-            )
-        except Exception:
-            logger.exception(
-                "EscalationBus publish failed for handoff %s", handoff.id
-            )
-
-        try:
-            recipients = (
-                await self.db.execute(
-                    select(User).where(
-                        User.account_id == handoff.account_id,
-                        User.id != handoff.handed_off_by,
-                        User.account_role.in_(("owner", "admin", "engineer")),
-                        User.is_active.is_(True),
-                        User.deleted_at.is_(None),
-                    )
-                )
-            ).scalars().all()
-
-            if not recipients:
-                logger.info(
-                    "No notification recipients for handoff %s in account %s",
-                    handoff.id,
-                    handoff.account_id,
-                )
-                return 0
-
-            # Pull session for the email subject. Fall back to a generic title
-            # if the session is gone (e.g. cascade delete mid-dispatch).
-            session_result = await self.db.execute(
-                select(AISession).where(AISession.id == handoff.session_id)
-            )
-            session = session_result.scalar_one_or_none()
-            problem = (
-                session.problem_summary if session and session.problem_summary
-                else "an active session"
-            )
-
-            title = f"New escalation: {problem}"
-            notes = (handoff.engineer_notes or "").strip()
-            body = (
-                "A teammate has escalated a session and is asking for help.\n\n"
-                f"Reason: {notes if notes else 'No reason provided.'}\n"
-                f"Priority: {handoff.priority}"
-            )
-            link_url = (
-                f"{settings.FRONTEND_URL.rstrip('/')}/escalations"
-                if settings.FRONTEND_URL
-                else None
-            )
-
-            results = await asyncio.gather(
-                *[
-                    EmailService.send_notification_email(
-                        to_email=r.email,
-                        title=title,
-                        body=body,
-                        link_url=link_url,
-                    )
-                    for r in recipients
-                ],
-                return_exceptions=True,
-            )
-            sent = sum(1 for r in results if r is True)
-            logger.info(
-                "Escalation notifications dispatched for handoff %s: %d/%d recipients",
-                handoff.id,
-                sent,
-                len(recipients),
-            )
-            return sent
-
-        except Exception:
-            logger.exception(
-                "Escalation notification dispatch failed for handoff %s",
-                handoff.id,
-            )
-            return 0
-
     async def _generate_snapshot(self, session: AISession) -> dict[str, Any]:
         """Generate a snapshot of the session state at handoff time."""
         snapshot: dict[str, Any] = {
@@ -5,7 +5,6 @@ from uuid import UUID

 from sqlalchemy import select
 from sqlalchemy.ext.asyncio import AsyncSession
-from sqlalchemy.orm import selectinload

 from app.models.ai_session import AISession
 from app.models.session_resolution_output import SessionResolutionOutput
@@ -22,9 +21,7 @@ class ResolutionOutputGenerator:

     async def generate_all(self, session_id: UUID) -> list[SessionResolutionOutput]:
         result = await self.db.execute(
-            select(AISession)
-            .options(selectinload(AISession.steps))
-            .where(AISession.id == session_id)
+            select(AISession).where(AISession.id == session_id)
         )
         session = result.scalar_one_or_none()
         if not session:
@@ -360,7 +360,6 @@ async def save_to_library(
     category_id: UUID | None,
     share_with_team: bool,
     user_id: UUID,
-    account_id: UUID,
     team_id: UUID | None,
     script_body: str | None = None,
     parameters_schema: dict | None = None,
@@ -402,7 +401,6 @@ async def save_to_library(
         id=uuid_mod.uuid4(),
         category_id=resolved_category_id,
         created_by=user_id,
-        account_id=account_id,
         team_id=team_id if share_with_team else None,
         name=name,
         slug=slug,
@@ -35,9 +35,6 @@ testpaths = tests
 # Warnings
 filterwarnings =
     error
-    ignore:unclosed <socket\.socket.*:ResourceWarning
-    ignore:unclosed transport .*:ResourceWarning
-    ignore:unclosed event loop .*:ResourceWarning
     ignore::DeprecationWarning
     ignore::PendingDeprecationWarning
     ignore::pluggy.PluggyTeardownRaisedWarning
@@ -4,7 +4,6 @@
 # Testing — pytest-asyncio 0.24+ requires pytest>=8.2
 pytest==8.4.2
 pytest-asyncio==0.24.0
-pytest-xdist==3.6.1
 httpx>=0.27.0
 pytest-cov==5.0.0

@@ -5,7 +5,6 @@ Provides test database setup, client fixtures, and authentication helpers.
 """

 import os
-import asyncio
 from typing import AsyncGenerator
 import pytest
 import sqlalchemy as sa
@@ -35,64 +34,11 @@ settings.REQUIRE_INVITE_CODE = False
 # would silently nuke the dev database. Only DATABASE_TEST_URL is honored,
 # and the safety assertion below refuses to run against a DB whose name
 # doesn't contain "test".
-_BASE_TEST_DATABASE_URL = os.environ.get(
+TEST_DATABASE_URL = os.environ.get(
     "DATABASE_TEST_URL",
     "postgresql+asyncpg://postgres:postgres@localhost:5432/resolutionflow_test",
 )
-
-
-def _worker_db_url(base_url: str) -> str:
-    """Per-worker DB URL for pytest-xdist parallelization.
-
-    pytest-xdist sets PYTEST_XDIST_WORKER to 'gw0', 'gw1', ... per worker
-    process. Each worker needs its own database so the per-test
-    `DROP SCHEMA public CASCADE` doesn't race across workers. Master/serial
-    runs (no xdist) keep the base DB. The base DB is created by the postgres
-    service container; per-worker DBs are CREATE DATABASE-d on first import
-    by `_ensure_worker_db_exists` below.
-    """
-    worker = os.environ.get("PYTEST_XDIST_WORKER")
-    if not worker or worker == "master":
-        return base_url
-    head, tail = base_url.rsplit("/", 1)
-    db_name, _, query = tail.partition("?")
-    suffix = f"?{query}" if query else ""
-    return f"{head}/{db_name}_{worker}{suffix}"
-
-
-def _ensure_worker_db_exists(worker_url: str, base_url: str) -> None:
-    """Create the per-worker DB if it doesn't exist. Runs synchronously at
-    conftest import time (before any async test machinery), using psycopg2
-    against the postgres maintenance DB. No-op when not running under xdist.
-    """
-    if worker_url == base_url:
-        return
-    head, tail = worker_url.rsplit("/", 1)
-    worker_db = tail.partition("?")[0]
-    # Strip the +asyncpg dialect for sync psycopg2 + connect to 'postgres'.
-    sync_head = head.replace("+asyncpg", "")
-    admin_url = f"{sync_head}/postgres"
-    # Lazy import — psycopg2 is a transitive backend dep; not imported at
-    # module top to keep the conftest light when xdist isn't in use.
-    from sqlalchemy import create_engine
-    engine = create_engine(admin_url, isolation_level="AUTOCOMMIT")
-    try:
-        with engine.begin() as conn:
-            exists = conn.execute(
-                sa.text("SELECT 1 FROM pg_database WHERE datname = :n"),
-                {"n": worker_db},
-            ).scalar()
-            if not exists:
-                # Identifier interpolation is safe — worker_db is built from
-                # the trusted base URL + 'gw\d+' worker suffix.
-                conn.execute(sa.text(f'CREATE DATABASE "{worker_db}"'))
-    finally:
-        engine.dispose()
-
-
-TEST_DATABASE_URL = _worker_db_url(_BASE_TEST_DATABASE_URL)
-_ensure_worker_db_exists(TEST_DATABASE_URL, _BASE_TEST_DATABASE_URL)

 # Belt-and-suspenders: refuse to run tests against a DB whose name doesn't
 # contain "test". Parses the last path segment of the URL (everything after
 # the final '/', with query string stripped) so credentials / hosts that
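The URL rewriting done by the removed `_worker_db_url` helper is easiest to follow with concrete inputs. The sketch below restates its string handling as a pure function so it can be run without pytest-xdist. Two things are assumptions for illustration only: `worker_db_url` takes the worker id as an explicit argument instead of reading `PYTEST_XDIST_WORKER`, and `database_name` is a hypothetical helper that mirrors the "last path segment, query string stripped" parsing described in the safety-check comment.

```python
from __future__ import annotations


def worker_db_url(base_url: str, worker: str | None) -> str:
    """Append the xdist worker id to the database name, keeping any query string."""
    if not worker or worker == "master":
        return base_url  # serial run: keep the base database
    head, tail = base_url.rsplit("/", 1)
    db_name, _, query = tail.partition("?")
    suffix = f"?{query}" if query else ""
    return f"{head}/{db_name}_{worker}{suffix}"


def database_name(url: str) -> str:
    """Last path segment of the URL with the query string stripped."""
    return url.rsplit("/", 1)[1].partition("?")[0]


base = "postgresql+asyncpg://postgres:postgres@localhost:5432/resolutionflow_test"
per_worker = worker_db_url(base, "gw1")
with_query = worker_db_url(base + "?ssl=require", "gw0")
```

Because the worker suffix is appended to a name that already contains "test", a per-worker database name derived this way still satisfies a "name must contain test" guard.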
@@ -127,20 +73,6 @@ def pytest_collection_modifyitems(config, items):
     items[:] = selected


-@pytest.hookimpl(trylast=True, hookwrapper=True)
-def pytest_runtest_teardown(item, nextitem):
-    """Close pytest-asyncio's post-test clean loop before warnings collect it."""
-    yield
-    policy = asyncio.get_event_loop_policy()
-    try:
-        loop = policy.get_event_loop()
-    except RuntimeError:
-        return
-    if not loop.is_running() and not loop.is_closed():
-        loop.close()
-        policy.set_event_loop(None)
-
-
 @pytest.fixture
 async def test_db() -> AsyncGenerator[AsyncSession, None]:
     """
@@ -205,7 +137,6 @@ async def test_db() -> AsyncGenerator[AsyncSession, None]:
     # Dispose engine first so all pooled connections are released,
     # then reconnect to perform the schema teardown cleanly.
     await engine.dispose()
-    await asyncio.sleep(0.01)

     # Drop all tables after test (CASCADE for circular FKs)
     teardown_engine = create_async_engine(
@@ -219,7 +150,6 @@ async def test_db() -> AsyncGenerator[AsyncSession, None]:
             await conn.execute(sa.text("CREATE SCHEMA public"))
     finally:
         await teardown_engine.dispose()
-        await asyncio.sleep(0.01)


 @pytest.fixture
@@ -74,25 +74,19 @@ def _mock_ai_provider(text: str, input_tokens: int = 100, output_tokens: int = 2
 @pytest.fixture
 def enable_ai():
     """Temporarily enable AI by setting a fake API key."""
-    original_anthropic = settings.ANTHROPIC_API_KEY
-    original_google = settings.GOOGLE_AI_API_KEY
+    original = settings.ANTHROPIC_API_KEY
     settings.ANTHROPIC_API_KEY = "test-key-fake"
-    settings.GOOGLE_AI_API_KEY = None
     yield
-    settings.ANTHROPIC_API_KEY = original_anthropic
-    settings.GOOGLE_AI_API_KEY = original_google
+    settings.ANTHROPIC_API_KEY = original


 @pytest.fixture
 def disable_ai():
     """Ensure AI is disabled."""
-    original_anthropic = settings.ANTHROPIC_API_KEY
-    original_google = settings.GOOGLE_AI_API_KEY
+    original = settings.ANTHROPIC_API_KEY
     settings.ANTHROPIC_API_KEY = None
-    settings.GOOGLE_AI_API_KEY = None
     yield
-    settings.ANTHROPIC_API_KEY = original_anthropic
-    settings.GOOGLE_AI_API_KEY = original_google
+    settings.ANTHROPIC_API_KEY = original


 # ── Quota endpoint ──
@@ -66,7 +66,6 @@ async def test_create_fork(client: AsyncClient, test_user, auth_headers, test_db

     step = AISessionStep(
         session_id=session.id,
-        account_id=session.account_id,
         step_order=0,
         step_type="question",
         content={"text": "What's the issue?"},
@@ -120,7 +119,7 @@ async def test_switch_branch(client: AsyncClient, test_user, auth_headers, test_
     root = await manager.create_root_branch(session.id)

     step = AISessionStep(
-        session_id=session.id, account_id=session.account_id, step_order=0, step_type="question",
+        session_id=session.id, step_order=0, step_type="question",
         content={"text": "test"}, confidence_at_step=0.5,
     )
     test_db.add(step)
@@ -198,7 +197,7 @@ async def test_get_branch_tree(client: AsyncClient, test_user, auth_headers, tes
     root = await manager.create_root_branch(session.id)

     step = AISessionStep(
-        session_id=session.id, account_id=session.account_id, step_order=0, step_type="question",
+        session_id=session.id, step_order=0, step_type="question",
         content={"text": "test"}, confidence_at_step=0.5,
     )
     test_db.add(step)
@@ -1,106 +0,0 @@
-"""Unit tests for the in-memory escalation pub/sub bus."""
-import asyncio
-from uuid import uuid4
-
-import pytest
-
-from app.core.escalation_bus import EscalationBus
-
-
-@pytest.mark.asyncio
-async def test_publish_with_no_subscribers_returns_zero():
-    bus = EscalationBus()
-    delivered = await bus.publish(uuid4(), {"type": "handoff_created"})
-    assert delivered == 0
-
-
-@pytest.mark.asyncio
-async def test_subscribe_then_publish_delivers_event():
-    bus = EscalationBus()
-    account = uuid4()
-    queue = await bus.subscribe(account)
-    try:
-        delivered = await bus.publish(account, {"type": "handoff_created", "id": "x"})
-        assert delivered == 1
-        event = await asyncio.wait_for(queue.get(), timeout=1.0)
-        assert event == {"type": "handoff_created", "id": "x"}
-    finally:
-        await bus.unsubscribe(account, queue)
-
-
-@pytest.mark.asyncio
-async def test_two_subscribers_same_account_both_receive():
-    bus = EscalationBus()
-    account = uuid4()
-    q1 = await bus.subscribe(account)
-    q2 = await bus.subscribe(account)
-    try:
-        delivered = await bus.publish(account, {"type": "x"})
-        assert delivered == 2
-        e1 = await asyncio.wait_for(q1.get(), timeout=1.0)
-        e2 = await asyncio.wait_for(q2.get(), timeout=1.0)
-        assert e1 == e2 == {"type": "x"}
-    finally:
-        await bus.unsubscribe(account, q1)
-        await bus.unsubscribe(account, q2)
-
-
-@pytest.mark.asyncio
-async def test_subscriber_in_other_account_does_not_receive():
-    """Cross-tenant isolation is the whole point — sanity check it directly."""
-    bus = EscalationBus()
-    account_a = uuid4()
-    account_b = uuid4()
-    q_a = await bus.subscribe(account_a)
-    q_b = await bus.subscribe(account_b)
-    try:
-        delivered = await bus.publish(account_a, {"type": "x"})
-        assert delivered == 1
-
-        e_a = await asyncio.wait_for(q_a.get(), timeout=1.0)
-        assert e_a == {"type": "x"}
-
-        # B's queue must remain empty.
-        with pytest.raises(asyncio.TimeoutError):
-            await asyncio.wait_for(q_b.get(), timeout=0.1)
-    finally:
-        await bus.unsubscribe(account_a, q_a)
-        await bus.unsubscribe(account_b, q_b)
-
-
-@pytest.mark.asyncio
-async def test_unsubscribe_drops_subscriber_count_to_zero():
-    bus = EscalationBus()
-    account = uuid4()
-    q = await bus.subscribe(account)
-    assert bus.subscriber_count(account) == 1
-    await bus.unsubscribe(account, q)
-    assert bus.subscriber_count(account) == 0
-
-
-@pytest.mark.asyncio
-async def test_publish_drops_events_when_subscriber_queue_is_full():
-    """A stuck subscriber must not back-pressure publishers."""
-    bus = EscalationBus()
-    account = uuid4()
-    queue = await bus.subscribe(account)
-    try:
-        # Stuff the queue past capacity (maxsize is 64) without consuming.
-        for _ in range(65):
-            await bus.publish(account, {"type": "x"})
-        # Sanity: queue holds at most maxsize.
-        assert queue.qsize() <= 64
-        # Publishes after capacity didn't raise — they were dropped silently.
-    finally:
-        await bus.unsubscribe(account, queue)
-
-
-@pytest.mark.asyncio
-async def test_unsubscribe_unknown_queue_is_noop():
-    """Defensive: unsubscribe on an account/queue that isn't registered
-    should not raise — finally blocks rely on this."""
-    bus = EscalationBus()
-    account = uuid4()
-    fake_queue: asyncio.Queue = asyncio.Queue()
-    # Should not raise.
-    await bus.unsubscribe(account, fake_queue)
@@ -1,363 +0,0 @@
|
|||||||
"""Tests for GET /analytics/flowpilot/escalations — Escalation Mode wedge metric.
|
|
||||||
|
|
||||||
Covers the in-product time-to-first-action measurement that powers the queue
|
|
||||||
stat-card and the analytics page. The savings claim itself comes from the
|
|
||||||
manual baseline (the Assignment); these tests only cover what the in-product
|
|
||||||
endpoint returns.
|
|
||||||
"""
|
|
||||||
from datetime import datetime, timedelta, timezone
|
|
||||||
from uuid import UUID as PyUUID
|
|
||||||
|
|
||||||
import pytest
|
|
||||||
from httpx import AsyncClient
|
|
||||||
from sqlalchemy import select
|
|
||||||
|
|
||||||
from app.models.ai_session import AISession
|
|
||||||
from app.models.ai_session_step import AISessionStep
|
|
||||||
from app.models.session_handoff import SessionHandoff
|
|
||||||
from app.models.user import User
|
|
||||||
|
|
||||||
|
|
||||||
URL = "/api/v1/analytics/flowpilot/escalations"
|
|
||||||
|
|
||||||
|
|
||||||
# ─── Helpers ──────────────────────────────────────────────────────────────────
|
|
||||||
|
|
||||||
|
|
||||||
async def _make_session(db, *, user_id, account_id) -> AISession:
|
|
||||||
s = AISession(
|
|
||||||
user_id=user_id,
|
|
||||||
account_id=account_id,
|
|
||||||
session_type="guided",
|
|
||||||
intake_type="free_text",
|
|
||||||
intake_content={"text": "test"},
|
|
||||||
status="escalated",
|
|
||||||
confidence_tier="discovery",
|
|
||||||
conversation_messages=[],
|
|
||||||
)
|
|
||||||
db.add(s)
|
|
||||||
await db.flush()
|
|
||||||
return s
|
|
||||||
|
|
||||||
|
|
||||||
async def _make_handoff(
    db,
    *,
    session_id,
    account_id,
    user_id,
    claimed_at: datetime | None,
    claimed_by=None,
) -> SessionHandoff:
    h = SessionHandoff(
        session_id=session_id,
        account_id=account_id,
        handed_off_by=user_id,
        intent="escalate",
        snapshot={"branch_map": "stub"},
        priority="normal",
        claimed_at=claimed_at,
        claimed_by=claimed_by,
    )
    db.add(h)
    await db.flush()
    return h
|
|
||||||
|
|
||||||
|
|
||||||
async def _make_step(db, *, session_id, account_id, created_at: datetime) -> AISessionStep:
    """Insert an ai_session_step row with an explicit created_at.

    SQLAlchemy's default would set created_at to now(); the metric query keys
    off this column so the tests need to control it directly.
    """
    step = AISessionStep(
        session_id=session_id,
        account_id=account_id,
        step_order=1,
        step_type="note",
        content={"text": "first action"},
        confidence_at_step=0.5,
        input_tokens=0,
        output_tokens=0,
        is_fork_point=False,
        was_free_text=False,
        was_skipped=False,
        created_at=created_at,
    )
    db.add(step)
    await db.flush()
    return step
|
|
||||||
|
|
||||||
|
|
||||||
# ─── Tests ────────────────────────────────────────────────────────────────────


@pytest.mark.asyncio
async def test_returns_zero_metrics_when_no_handoffs(
    client: AsyncClient, auth_headers, test_user
):
    """Empty account → n_handoffs_claimed=0, all stats None, 200 OK."""
    response = await client.get(URL, headers=auth_headers)
    assert response.status_code == 200
    body = response.json()
    assert body["period"] == "30d"
    assert body["n_handoffs_claimed"] == 0
    assert body["n_handoffs_with_action"] == 0
    assert body["avg_seconds_to_first_action"] is None
    assert body["median_seconds_to_first_action"] is None
    assert body["p95_seconds_to_first_action"] is None
    # Disclaimer is part of the contract — pilots reading the API should see it.
    assert "manual baseline" in body["metric_definition"].lower()
|
|
||||||
|
|
||||||
|
|
||||||
@pytest.mark.asyncio
async def test_happy_path_single_handoff_with_action(
    client: AsyncClient, auth_headers, test_user, test_db
):
    """One claimed handoff + a step 90s later → avg=median=p95=90.0."""
    user_id = PyUUID(test_user["user_data"]["id"])
    account_id = PyUUID(test_user["user_data"]["account_id"])

    claimed_at = datetime.now(timezone.utc) - timedelta(hours=2)
    first_action_at = claimed_at + timedelta(seconds=90)

    session = await _make_session(test_db, user_id=user_id, account_id=account_id)
    await _make_handoff(
        test_db,
        session_id=session.id,
        account_id=account_id,
        user_id=user_id,
        claimed_at=claimed_at,
        claimed_by=user_id,
    )
    await _make_step(
        test_db,
        session_id=session.id,
        account_id=account_id,
        created_at=first_action_at,
    )
    await test_db.commit()

    response = await client.get(URL, headers=auth_headers)
    assert response.status_code == 200
    body = response.json()
    assert body["n_handoffs_claimed"] == 1
    assert body["n_handoffs_with_action"] == 1
    assert body["avg_seconds_to_first_action"] == 90.0
    assert body["median_seconds_to_first_action"] == 90.0
    assert body["p95_seconds_to_first_action"] == 90.0
|
|
||||||
|
|
||||||
|
|
||||||
@pytest.mark.asyncio
async def test_handoff_claimed_but_no_action(
    client: AsyncClient, auth_headers, test_user, test_db
):
    """Claimed handoff with no post-claim step → counted in n_handoffs_claimed
    but not in n_handoffs_with_action; aggregates remain None."""
    user_id = PyUUID(test_user["user_data"]["id"])
    account_id = PyUUID(test_user["user_data"]["account_id"])
    claimed_at = datetime.now(timezone.utc) - timedelta(minutes=5)

    session = await _make_session(test_db, user_id=user_id, account_id=account_id)
    await _make_handoff(
        test_db,
        session_id=session.id,
        account_id=account_id,
        user_id=user_id,
        claimed_at=claimed_at,
        claimed_by=user_id,
    )
    # Pre-claim step (created_at < claimed_at) — must NOT count.
    await _make_step(
        test_db,
        session_id=session.id,
        account_id=account_id,
        created_at=claimed_at - timedelta(seconds=30),
    )
    await test_db.commit()

    response = await client.get(URL, headers=auth_headers)
    assert response.status_code == 200
    body = response.json()
    assert body["n_handoffs_claimed"] == 1
    assert body["n_handoffs_with_action"] == 0
    assert body["avg_seconds_to_first_action"] is None
|
|
||||||
|
|
||||||
|
|
||||||
@pytest.mark.asyncio
async def test_unclaimed_handoffs_excluded(
    client: AsyncClient, auth_headers, test_user, test_db
):
    """Handoffs with claimed_at IS NULL are excluded entirely."""
    user_id = PyUUID(test_user["user_data"]["id"])
    account_id = PyUUID(test_user["user_data"]["account_id"])

    session = await _make_session(test_db, user_id=user_id, account_id=account_id)
    await _make_handoff(
        test_db,
        session_id=session.id,
        account_id=account_id,
        user_id=user_id,
        claimed_at=None,
    )
    await test_db.commit()

    response = await client.get(URL, headers=auth_headers)
    assert response.status_code == 200
    assert response.json()["n_handoffs_claimed"] == 0
|
|
||||||
|
|
||||||
|
|
||||||
@pytest.mark.asyncio
async def test_period_window_excludes_old_handoffs(
    client: AsyncClient, auth_headers, test_user, test_db
):
    """A handoff claimed >7d ago must not appear in ?period=7d."""
    user_id = PyUUID(test_user["user_data"]["id"])
    account_id = PyUUID(test_user["user_data"]["account_id"])

    old_claimed_at = datetime.now(timezone.utc) - timedelta(days=10)
    session = await _make_session(test_db, user_id=user_id, account_id=account_id)
    await _make_handoff(
        test_db,
        session_id=session.id,
        account_id=account_id,
        user_id=user_id,
        claimed_at=old_claimed_at,
        claimed_by=user_id,
    )
    await _make_step(
        test_db,
        session_id=session.id,
        account_id=account_id,
        created_at=old_claimed_at + timedelta(seconds=60),
    )
    await test_db.commit()

    # 7d window: excluded
    r7 = await client.get(URL, headers=auth_headers, params={"period": "7d"})
    assert r7.status_code == 200
    assert r7.json()["n_handoffs_claimed"] == 0

    # 90d window: included
    r90 = await client.get(URL, headers=auth_headers, params={"period": "90d"})
    assert r90.status_code == 200
    assert r90.json()["n_handoffs_claimed"] == 1
    assert r90.json()["n_handoffs_with_action"] == 1
|
|
||||||
|
|
||||||
|
|
||||||
@pytest.mark.asyncio
async def test_aggregate_stats_for_multiple_handoffs(
    client: AsyncClient, auth_headers, test_user, test_db
):
    """Three handoffs with deltas 30/60/180s → avg=90, median=60, p95≈180."""
    user_id = PyUUID(test_user["user_data"]["id"])
    account_id = PyUUID(test_user["user_data"]["account_id"])

    base = datetime.now(timezone.utc) - timedelta(hours=3)
    deltas = [30, 60, 180]
    for i, delta in enumerate(deltas):
        s = await _make_session(test_db, user_id=user_id, account_id=account_id)
        claimed_at = base + timedelta(minutes=i * 10)
        await _make_handoff(
            test_db,
            session_id=s.id,
            account_id=account_id,
            user_id=user_id,
            claimed_at=claimed_at,
            claimed_by=user_id,
        )
        await _make_step(
            test_db,
            session_id=s.id,
            account_id=account_id,
            created_at=claimed_at + timedelta(seconds=delta),
        )
    await test_db.commit()

    response = await client.get(URL, headers=auth_headers)
    body = response.json()
    assert response.status_code == 200
    assert body["n_handoffs_claimed"] == 3
    assert body["n_handoffs_with_action"] == 3
    assert body["avg_seconds_to_first_action"] == 90.0
    assert body["median_seconds_to_first_action"] == 60.0
    assert body["p95_seconds_to_first_action"] == 180.0
|
|
||||||
|
|
||||||
|
|
||||||
@pytest.mark.asyncio
async def test_account_isolation_requesting_user_only_sees_own_account(
    client: AsyncClient, auth_headers, test_user, test_db
):
    """A handoff in another account must not appear in this user's response.

    Critical: the Phase 4 RLS pattern can fail silently if account_id is wrong.
    This test would catch an account-scoped query that accidentally returned
    cross-tenant rows.
    """
    from app.models.account import Account

    other_account = Account(name="Other MSP", display_code="OTHER001")
    test_db.add(other_account)
    await test_db.flush()

    other_user = User(
        email="other@example.com",
        password_hash="x",
        name="Other Tech",
        role="engineer",
        account_id=other_account.id,
        account_role="owner",
    )
    test_db.add(other_user)
    await test_db.flush()

    s = await _make_session(
        test_db, user_id=other_user.id, account_id=other_account.id
    )
    claimed_at = datetime.now(timezone.utc) - timedelta(hours=1)
    await _make_handoff(
        test_db,
        session_id=s.id,
        account_id=other_account.id,
        user_id=other_user.id,
        claimed_at=claimed_at,
        claimed_by=other_user.id,
    )
    await _make_step(
        test_db,
        session_id=s.id,
        account_id=other_account.id,
        created_at=claimed_at + timedelta(seconds=45),
    )
    await test_db.commit()

    response = await client.get(URL, headers=auth_headers)
    assert response.status_code == 200
    body = response.json()
    # The other account's handoff must NOT leak into this account's response.
    assert body["n_handoffs_claimed"] == 0
    assert body["n_handoffs_with_action"] == 0
|
|
||||||
|
|
||||||
|
|
||||||
@pytest.mark.asyncio
async def test_viewer_role_is_blocked(
    client: AsyncClient, test_user, auth_headers, test_db
):
    """Downgrade the test user to 'viewer' and confirm the endpoint 403s."""
    user_id = PyUUID(test_user["user_data"]["id"])
    user = (
        await test_db.execute(select(User).where(User.id == user_id))
    ).scalar_one()
    user.account_role = "viewer"
    await test_db.commit()

    response = await client.get(URL, headers=auth_headers)
    assert response.status_code == 403
    assert "engineer" in response.json()["detail"].lower()
|
|
||||||
|
|
||||||
|
|
||||||
@pytest.mark.asyncio
async def test_invalid_period_rejected(client: AsyncClient, auth_headers):
    """period=1d is not in {7d,30d,90d} — must 422."""
    response = await client.get(URL, headers=auth_headers, params={"period": "1d"})
    assert response.status_code == 422
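Reviewer sketch: the deleted tests above pin down the metric's arithmetic (deltas of 30/60/180 s give avg 90, median 60, p95 180; pre-claim steps are ignored; a claimed handoff with no later step counts as claimed but not as "with action"). The endpoint's query is not part of this diff, so the helper names below are hypothetical and the nearest-rank percentile is an assumption, chosen only because it reproduces the asserted `180.0`. Treating a step at exactly `claimed_at` as post-claim is likewise a guess.

```python
import math
from datetime import datetime, timedelta, timezone
from statistics import mean, median
from typing import Optional


def seconds_to_first_action(
    claimed_at: datetime, step_times: list[datetime]
) -> Optional[float]:
    """Seconds from claim to the earliest step at or after the claim, else None."""
    post_claim = [t for t in step_times if t >= claimed_at]  # assumption: >= not >
    if not post_claim:
        return None
    return (min(post_claim) - claimed_at).total_seconds()


def nearest_rank_percentile(values: list[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest value covering pct% of the sample."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def summarize(deltas: list[Optional[float]]) -> dict:
    """Aggregate one delta per claimed handoff; None means no post-claim step."""
    with_action = [d for d in deltas if d is not None]
    stats = {
        "n_handoffs_claimed": len(deltas),
        "n_handoffs_with_action": len(with_action),
        "avg_seconds_to_first_action": None,
        "median_seconds_to_first_action": None,
        "p95_seconds_to_first_action": None,
    }
    if with_action:
        stats["avg_seconds_to_first_action"] = mean(with_action)
        stats["median_seconds_to_first_action"] = median(with_action)
        stats["p95_seconds_to_first_action"] = nearest_rank_percentile(with_action, 95)
    return stats


def demo() -> dict:
    base = datetime(2026, 4, 27, 12, 0, tzinfo=timezone.utc)
    deltas = [
        seconds_to_first_action(base, [base + timedelta(seconds=s)])
        for s in (30, 60, 180)
    ]
    return summarize(deltas)
```

A linearly interpolating percentile (for example PostgreSQL's `percentile_cont`) would return 168 for these three values, so the exact `== 180.0` assertion suggests a discrete method; the docstring's "p95≈180" leaves that open.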
|
|
||||||
@@ -1,12 +1,8 @@
|
|||||||
"""Integration tests for HandoffManager service."""
|
"""Integration tests for HandoffManager service."""
|
||||||
from unittest.mock import AsyncMock, patch
|
|
||||||
|
|
||||||
import pytest
|
import pytest
|
||||||
from httpx import AsyncClient
|
from httpx import AsyncClient
|
||||||
|
|
||||||
from app.models.ai_session import AISession
|
from app.models.ai_session import AISession
|
||||||
from app.models.user import User
|
|
||||||
from app.services.handoff_manager import HandoffManager
|
|
||||||
|
|
||||||
|
|
||||||
@pytest.mark.asyncio
|
@pytest.mark.asyncio
|
||||||
@@ -117,259 +113,3 @@ async def test_claim_session(client: AsyncClient, test_user, test_admin, auth_he
|
|||||||
|
|
||||||
await test_db.refresh(session)
|
await test_db.refresh(session)
|
||||||
assert session.status == "active"
|
assert session.status == "active"
|
||||||
|
|
||||||
|
|
||||||
# ─── Notification dispatch ────────────────────────────────────────────────────


@pytest.mark.asyncio
async def test_dispatch_emails_engineer_recipients_in_account(
    client: AsyncClient, test_user, auth_headers, test_db
):
    """dispatch_escalation_notifications emails every engineer/admin in the
    account except the escalator."""
    # Add a second user (engineer role) in the same account.
    teammate = User(
        email="teammate@example.com",
        password_hash="x",
        name="Teammate",
        role="engineer",
        account_id=test_user["user_data"]["account_id"],
        account_role="engineer",
    )
    test_db.add(teammate)
    await test_db.flush()

    # Add a viewer-role user — must NOT receive a notification.
    viewer = User(
        email="viewer@example.com",
        password_hash="x",
        name="Viewer",
        role="engineer",
        account_id=test_user["user_data"]["account_id"],
        account_role="viewer",
    )
    test_db.add(viewer)
    await test_db.flush()

    session = AISession(
        user_id=test_user["user_data"]["id"],
        account_id=test_user["user_data"]["account_id"],
        session_type="guided",
        intake_type="free_text",
        intake_content={"text": "vpn down"},
        problem_summary="VPN won't connect after Win update",
        status="active",
        confidence_tier="discovery",
        conversation_messages=[],
    )
    test_db.add(session)
    await test_db.commit()

    manager = HandoffManager(test_db)
    handoff = await manager.create_handoff(
        session_id=session.id,
        intent="escalate",
        engineer_notes="Stuck on auth handshake",
        user_id=test_user["user_data"]["id"],
    )
    await test_db.commit()

    with patch(
        "app.services.handoff_manager.EmailService.send_notification_email",
        new=AsyncMock(return_value=True),
    ) as send:
        sent = await manager.dispatch_escalation_notifications(handoff)

    assert sent == 1  # only the engineer-role teammate
    recipients = {call.kwargs["to_email"] for call in send.call_args_list}
    assert recipients == {"teammate@example.com"}
    assert viewer.email not in recipients
    assert test_user["email"] not in recipients  # not self-notified

    title = send.call_args_list[0].kwargs["title"]
    assert "VPN won't connect after Win update" in title
|
|
||||||
|
|
||||||
|
|
||||||
@pytest.mark.asyncio
async def test_dispatch_skipped_for_park_intent(
    client: AsyncClient, test_user, auth_headers, test_db
):
    """park-intent handoffs are private (waiting for client logs etc) — no
    team-wide email."""
    session = AISession(
        user_id=test_user["user_data"]["id"],
        account_id=test_user["user_data"]["account_id"],
        session_type="guided",
        intake_type="free_text",
        intake_content={"text": "x"},
        status="active",
        confidence_tier="discovery",
        conversation_messages=[],
    )
    test_db.add(session)
    await test_db.commit()

    manager = HandoffManager(test_db)
    handoff = await manager.create_handoff(
        session_id=session.id,
        intent="park",
        engineer_notes="waiting on customer",
        user_id=test_user["user_data"]["id"],
    )
    await test_db.commit()

    with patch(
        "app.services.handoff_manager.EmailService.send_notification_email",
        new=AsyncMock(return_value=True),
    ) as send:
        sent = await manager.dispatch_escalation_notifications(handoff)

    assert sent == 0
    assert send.call_count == 0
|
|
||||||
|
|
||||||
|
|
||||||
@pytest.mark.asyncio
async def test_dispatch_graceful_degradation_when_email_raises(
    client: AsyncClient, test_user, auth_headers, test_db
):
    """If the email service raises (auth misconfig, network, etc.), dispatch
    must NOT raise. Handoff creation has already committed; emailing is
    best-effort. Codex-flagged regression."""
    teammate = User(
        email="t@example.com",
        password_hash="x",
        name="T",
        role="engineer",
        account_id=test_user["user_data"]["account_id"],
        account_role="engineer",
    )
    test_db.add(teammate)
    await test_db.flush()

    session = AISession(
        user_id=test_user["user_data"]["id"],
        account_id=test_user["user_data"]["account_id"],
        session_type="guided",
        intake_type="free_text",
        intake_content={"text": "x"},
        status="active",
        confidence_tier="discovery",
        conversation_messages=[],
    )
    test_db.add(session)
    await test_db.commit()

    manager = HandoffManager(test_db)
    handoff = await manager.create_handoff(
        session_id=session.id,
        intent="escalate",
        engineer_notes="help",
        user_id=test_user["user_data"]["id"],
    )
    await test_db.commit()

    with patch(
        "app.services.handoff_manager.EmailService.send_notification_email",
        new=AsyncMock(side_effect=RuntimeError("SMTP down")),
    ):
        # Must not raise.
        sent = await manager.dispatch_escalation_notifications(handoff)
        assert sent == 0
|
|
||||||
|
|
||||||
|
|
||||||
@pytest.mark.asyncio
async def test_dispatch_publishes_to_escalation_bus(
    client: AsyncClient, test_user, auth_headers, test_db
):
    """dispatch_escalation_notifications puts an event on the in-memory bus
    so connected SSE subscribers see live arrivals."""
    from app.core.escalation_bus import bus as escalation_bus

    session = AISession(
        user_id=test_user["user_data"]["id"],
        account_id=test_user["user_data"]["account_id"],
        session_type="guided",
        intake_type="free_text",
        intake_content={"text": "x"},
        problem_summary="VPN down",
        status="active",
        confidence_tier="discovery",
        conversation_messages=[],
    )
    test_db.add(session)
    await test_db.commit()

    manager = HandoffManager(test_db)
    handoff = await manager.create_handoff(
        session_id=session.id,
        intent="escalate",
        engineer_notes="please help",
        user_id=test_user["user_data"]["id"],
    )
    await test_db.commit()

    from uuid import UUID as PyUUID
    account_id = PyUUID(test_user["user_data"]["account_id"])

    queue = await escalation_bus.subscribe(account_id)
    try:
        with patch(
            "app.services.handoff_manager.EmailService.send_notification_email",
            new=AsyncMock(return_value=True),
        ):
            await manager.dispatch_escalation_notifications(handoff)

        import asyncio
        event = await asyncio.wait_for(queue.get(), timeout=1.0)
        assert event["type"] == "handoff_created"
        assert event["handoff_id"] == str(handoff.id)
        assert event["session_id"] == str(session.id)
        assert event["priority"] == "normal"
    finally:
        await escalation_bus.unsubscribe(account_id, queue)
|
|
||||||
|
|
||||||
|
|
||||||
@pytest.mark.asyncio
async def test_create_handoff_endpoint_dispatches_on_escalate(
    client: AsyncClient, test_user, auth_headers, test_db
):
    """End-to-end: POST /handoff with intent=escalate triggers
    dispatch_escalation_notifications after commit. Verifies the wiring in
    the endpoint, not just the manager method."""
    teammate = User(
        email="t2@example.com",
        password_hash="x",
        name="T2",
        role="engineer",
        account_id=test_user["user_data"]["account_id"],
        account_role="engineer",
    )
    test_db.add(teammate)
    await test_db.commit()

    session = AISession(
        user_id=test_user["user_data"]["id"],
        account_id=test_user["user_data"]["account_id"],
        session_type="guided",
        intake_type="free_text",
        intake_content={"text": "x"},
        status="active",
        confidence_tier="discovery",
        conversation_messages=[],
    )
    test_db.add(session)
    await test_db.commit()

    with patch(
        "app.services.handoff_manager.EmailService.send_notification_email",
        new=AsyncMock(return_value=True),
    ) as send:
        resp = await client.post(
            f"/api/v1/ai-sessions/{session.id}/handoff",
            headers=auth_headers,
            json={"intent": "escalate", "engineer_notes": "Need help"},
        )
        assert resp.status_code == 201
        assert send.call_count == 1
        assert send.call_args.kwargs["to_email"] == "t2@example.com"
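Reviewer sketch: taken together, the dispatch tests above describe three rules: only `escalate` intents notify, only engineer/admin teammates other than the escalator receive mail, and a failing email service must not propagate. The real `HandoffManager.dispatch_escalation_notifications` is not shown in this diff, so the function below is a hypothetical stand-in; its name, its dict-shaped recipients, and the `send` callable are all assumptions for illustration.

```python
import asyncio
import logging
from typing import Awaitable, Callable

logger = logging.getLogger(__name__)

# Assumption: roles allowed to receive escalation mail, per the test docstrings.
NOTIFIABLE_ROLES = {"engineer", "admin"}


async def dispatch_best_effort(
    recipients: list[dict],
    escalator_email: str,
    intent: str,
    send: Callable[..., Awaitable[bool]],
) -> int:
    """Return how many notifications were sent; never raise on send failure."""
    if intent != "escalate":
        return 0  # e.g. "park" handoffs stay private
    sent = 0
    for person in recipients:
        if person["account_role"] not in NOTIFIABLE_ROLES:
            continue  # viewers are skipped
        if person["email"] == escalator_email:
            continue  # do not notify the person who escalated
        try:
            if await send(to_email=person["email"]):
                sent += 1
        except Exception:
            # Best-effort: the handoff is already committed, so log and move on.
            logger.warning("escalation email failed for %s", person["email"])
    return sent
```

Catching per recipient, rather than around the whole loop, means one bad address does not stop the remaining teammates from being notified.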
|
|
||||||
|
|||||||
@@ -50,7 +50,6 @@ async def _make_session(test_db, user, *, with_psa: bool = False) -> AISession:
|
|||||||
conn = PsaConnection(
|
conn = PsaConnection(
|
||||||
account_id=user["user_data"]["account_id"],
|
account_id=user["user_data"]["account_id"],
|
||||||
provider="connectwise",
|
provider="connectwise",
|
||||||
display_name="Test ConnectWise",
|
|
||||||
site_url="https://fake.cw.local",
|
site_url="https://fake.cw.local",
|
||||||
company_id="TEST",
|
company_id="TEST",
|
||||||
credentials_encrypted=encrypt_credentials({"public_key": "x", "private_key": "y"}),
|
credentials_encrypted=encrypt_credentials({"public_key": "x", "private_key": "y"}),
|
||||||
|
|||||||
@@ -472,20 +472,19 @@ class TestScriptBuilderSlugCollision:
|
|||||||
# Pre-create a template with slug "test-script" to cause collision
|
# Pre-create a template with slug "test-script" to cause collision
|
||||||
user_resp = await client.get("/api/v1/auth/me", headers=auth_headers)
|
user_resp = await client.get("/api/v1/auth/me", headers=auth_headers)
|
||||||
user_id = user_resp.json()["id"]
|
user_id = user_resp.json()["id"]
|
||||||
account_id = user_resp.json()["account_id"]
|
|
||||||
await test_db.execute(
|
await test_db.execute(
|
||||||
sa.text("""
|
sa.text("""
|
||||||
INSERT INTO script_templates
|
INSERT INTO script_templates
|
||||||
(id, category_id, created_by, account_id, name, slug, script_body,
|
(id, category_id, created_by, name, slug, script_body,
|
||||||
parameters_schema, default_values, validation_rules, tags,
|
parameters_schema, default_values, validation_rules, tags,
|
||||||
complexity, is_active, version, usage_count, created_at, updated_at)
|
complexity, is_active, version, usage_count, created_at, updated_at)
|
||||||
VALUES
|
VALUES
|
||||||
(:id, 'a0000000-0000-0000-0000-000000000001'::uuid, :uid, :account_id,
|
(:id, 'a0000000-0000-0000-0000-000000000001'::uuid, :uid,
|
||||||
'Test Script', 'test-script', 'echo hello',
|
'Test Script', 'test-script', 'echo hello',
|
||||||
'{"parameters": []}', '{}', '{}', '["powershell"]',
|
'{"parameters": []}', '{}', '{}', '["powershell"]',
|
||||||
'beginner', true, 1, 0, NOW(), NOW())
|
'beginner', true, 1, 0, NOW(), NOW())
|
||||||
"""),
|
"""),
|
||||||
{"id": str(uuid_mod.uuid4()), "uid": user_id, "account_id": account_id},
|
{"id": str(uuid_mod.uuid4()), "uid": user_id},
|
||||||
)
|
)
|
||||||
await test_db.commit()
|
await test_db.commit()
|
||||||
|
|
||||||
@@ -562,7 +561,6 @@ class TestScriptTemplateFilters:
|
|||||||
"""mine=true returns only templates created by the current user."""
|
"""mine=true returns only templates created by the current user."""
|
||||||
user_resp = await client.get("/api/v1/auth/me", headers=auth_headers)
|
user_resp = await client.get("/api/v1/auth/me", headers=auth_headers)
|
||||||
user_id = user_resp.json()["id"]
|
user_id = user_resp.json()["id"]
|
||||||
account_id = user_resp.json()["account_id"]
|
|
||||||
|
|
||||||
second_resp = await client.get("/api/v1/auth/me", headers=second_user_headers)
|
second_resp = await client.get("/api/v1/auth/me", headers=second_user_headers)
|
||||||
second_user_id = second_resp.json()["id"]
|
second_user_id = second_resp.json()["id"]
|
||||||
@@ -573,32 +571,32 @@ class TestScriptTemplateFilters:
|
|||||||
await test_db.execute(
|
await test_db.execute(
|
||||||
sa.text("""
|
sa.text("""
|
||||||
INSERT INTO script_templates
|
INSERT INTO script_templates
|
||||||
(id, category_id, created_by, account_id, team_id, name, slug, script_body,
|
(id, category_id, created_by, team_id, name, slug, script_body,
|
||||||
parameters_schema, default_values, validation_rules, tags,
|
parameters_schema, default_values, validation_rules, tags,
|
||||||
complexity, is_active, version, usage_count, created_at, updated_at)
|
complexity, is_active, version, usage_count, created_at, updated_at)
|
||||||
VALUES
|
VALUES
|
||||||
(:id, :cat, :uid, :account_id, NULL,
|
(:id, :cat, :uid, NULL,
|
||||||
'My Script', 'my-script', 'echo mine',
|
'My Script', 'my-script', 'echo mine',
|
||||||
'{"parameters": []}', '{}', '{}', '[]',
|
'{"parameters": []}', '{}', '{}', '[]',
|
||||||
'beginner', true, 1, 0, NOW(), NOW())
|
'beginner', true, 1, 0, NOW(), NOW())
|
||||||
"""),
|
"""),
|
||||||
{"id": str(uuid_mod.uuid4()), "cat": cat_id, "uid": user_id, "account_id": account_id},
|
{"id": str(uuid_mod.uuid4()), "cat": cat_id, "uid": user_id},
|
||||||
)
|
)
|
||||||
|
|
||||||
# Create template owned by second user (no team_id, so visible to all)
|
# Create template owned by second user (no team_id, so visible to all)
|
||||||
await test_db.execute(
|
await test_db.execute(
|
||||||
sa.text("""
|
sa.text("""
|
||||||
INSERT INTO script_templates
|
INSERT INTO script_templates
|
||||||
(id, category_id, created_by, account_id, team_id, name, slug, script_body,
|
(id, category_id, created_by, team_id, name, slug, script_body,
|
||||||
parameters_schema, default_values, validation_rules, tags,
|
parameters_schema, default_values, validation_rules, tags,
|
||||||
complexity, is_active, version, usage_count, created_at, updated_at)
|
complexity, is_active, version, usage_count, created_at, updated_at)
|
||||||
VALUES
|
VALUES
|
||||||
(:id, :cat, :uid, :account_id, NULL,
|
(:id, :cat, :uid, NULL,
|
||||||
'Other Script', 'other-script', 'echo other',
|
'Other Script', 'other-script', 'echo other',
|
||||||
'{"parameters": []}', '{}', '{}', '[]',
|
'{"parameters": []}', '{}', '{}', '[]',
|
||||||
'beginner', true, 1, 0, NOW(), NOW())
|
'beginner', true, 1, 0, NOW(), NOW())
|
||||||
"""),
|
"""),
|
||||||
{"id": str(uuid_mod.uuid4()), "cat": cat_id, "uid": second_user_id, "account_id": account_id},
|
{"id": str(uuid_mod.uuid4()), "cat": cat_id, "uid": second_user_id},
|
||||||
)
|
)
|
||||||
await test_db.commit()
|
await test_db.commit()
|
||||||
|
|
||||||
@@ -619,7 +617,6 @@ class TestScriptTemplateFilters:
|
|||||||
"""shared=true returns only templates shared with the user's team."""
|
"""shared=true returns only templates shared with the user's team."""
|
||||||
user_resp = await client.get("/api/v1/auth/me", headers=auth_headers)
|
user_resp = await client.get("/api/v1/auth/me", headers=auth_headers)
|
||||||
user_id = user_resp.json()["id"]
|
user_id = user_resp.json()["id"]
|
||||||
account_id = user_resp.json()["account_id"]
|
|
||||||
|
|
||||||
cat_id = "b0000000-0000-0000-0000-000000000001"
|
cat_id = "b0000000-0000-0000-0000-000000000001"
|
||||||
|
|
||||||
@@ -642,32 +639,32 @@ class TestScriptTemplateFilters:
|
|||||||
await test_db.execute(
|
await test_db.execute(
|
||||||
sa.text("""
|
sa.text("""
|
||||||
INSERT INTO script_templates
|
INSERT INTO script_templates
|
||||||
(id, category_id, created_by, account_id, team_id, name, slug, script_body,
|
(id, category_id, created_by, team_id, name, slug, script_body,
|
||||||
parameters_schema, default_values, validation_rules, tags,
|
parameters_schema, default_values, validation_rules, tags,
|
||||||
complexity, is_active, version, usage_count, created_at, updated_at)
|
complexity, is_active, version, usage_count, created_at, updated_at)
|
||||||
VALUES
|
VALUES
|
||||||
(:id, :cat, :uid, :account_id, :tid,
|
(:id, :cat, :uid, :tid,
|
||||||
'Team Script', 'team-script', 'echo team',
|
'Team Script', 'team-script', 'echo team',
|
||||||
'{"parameters": []}', '{}', '{}', '[]',
|
'{"parameters": []}', '{}', '{}', '[]',
|
||||||
'beginner', true, 1, 0, NOW(), NOW())
|
'beginner', true, 1, 0, NOW(), NOW())
|
||||||
"""),
|
"""),
|
||||||
{"id": str(uuid_mod.uuid4()), "cat": cat_id, "uid": user_id, "account_id": account_id, "tid": team_id},
|
{"id": str(uuid_mod.uuid4()), "cat": cat_id, "uid": user_id, "tid": team_id},
|
||||||
)
|
)
|
||||||
|
|
||||||
# Template NOT shared (no team_id)
|
# Template NOT shared (no team_id)
|
||||||
await test_db.execute(
|
await test_db.execute(
|
||||||
sa.text("""
|
sa.text("""
|
||||||
INSERT INTO script_templates
|
INSERT INTO script_templates
|
||||||
(id, category_id, created_by, account_id, team_id, name, slug, script_body,
|
(id, category_id, created_by, team_id, name, slug, script_body,
|
||||||
parameters_schema, default_values, validation_rules, tags,
|
parameters_schema, default_values, validation_rules, tags,
|
||||||
complexity, is_active, version, usage_count, created_at, updated_at)
|
complexity, is_active, version, usage_count, created_at, updated_at)
|
||||||
VALUES
|
VALUES
|
||||||
(:id, :cat, :uid, :account_id, NULL,
|
(:id, :cat, :uid, NULL,
|
||||||
'Personal Script', 'personal-script', 'echo personal',
|
'Personal Script', 'personal-script', 'echo personal',
|
||||||
'{"parameters": []}', '{}', '{}', '[]',
|
'{"parameters": []}', '{}', '{}', '[]',
|
||||||
'beginner', true, 1, 0, NOW(), NOW())
|
'beginner', true, 1, 0, NOW(), NOW())
|
||||||
"""),
|
"""),
|
||||||
{"id": str(uuid_mod.uuid4()), "cat": cat_id, "uid": user_id, "account_id": account_id},
|
{"id": str(uuid_mod.uuid4()), "cat": cat_id, "uid": user_id},
|
||||||
)
|
)
|
||||||
await test_db.commit()
|
await test_db.commit()
|
||||||
|
|
||||||
|
|||||||
@@ -49,7 +49,7 @@ async def test_create_fork(client: AsyncClient, test_user, auth_headers, test_db
|
|||||||
await test_db.flush()
|
await test_db.flush()
|
||||||
|
|
||||||
step = AISessionStep(
|
step = AISessionStep(
|
||||||
session_id=session.id, account_id=session.account_id, step_order=0, step_type="question",
|
session_id=session.id, step_order=0, step_type="question",
|
||||||
content={"text": "test"}, confidence_at_step=0.5,
|
content={"text": "test"}, confidence_at_step=0.5,
|
||||||
)
|
)
|
||||||
test_db.add(step)
|
test_db.add(step)
|
||||||
@@ -88,7 +88,7 @@ async def test_switch_branch(client: AsyncClient, test_user, auth_headers, test_
|
|||||||
await test_db.flush()
|
await test_db.flush()
|
||||||
|
|
||||||
step = AISessionStep(
|
step = AISessionStep(
|
||||||
session_id=session.id, account_id=session.account_id, step_order=0, step_type="question",
|
session_id=session.id, step_order=0, step_type="question",
|
||||||
content={"text": "test"}, confidence_at_step=0.5,
|
content={"text": "test"}, confidence_at_step=0.5,
|
||||||
)
|
)
|
||||||
test_db.add(step)
|
test_db.add(step)
|
||||||
|
|||||||
@@ -1,12 +1,8 @@
|
|||||||
"""API endpoint tests for session handoffs."""
|
"""API endpoint tests for session handoffs."""
|
||||||
from uuid import UUID as PyUUID
|
|
||||||
|
|
||||||
import pytest
|
import pytest
|
||||||
from httpx import AsyncClient
|
from httpx import AsyncClient
|
||||||
from sqlalchemy import select
|
|
||||||
|
|
||||||
from app.models.ai_session import AISession
|
from app.models.ai_session import AISession
|
||||||
from app.models.user import User
|
|
||||||
|
|
||||||
|
|
||||||
@pytest.mark.asyncio
|
@pytest.mark.asyncio
|
||||||
@@ -62,131 +58,3 @@ async def test_get_queue(client: AsyncClient, test_user, auth_headers, test_db):
|
|||||||
assert resp.status_code == 200
|
assert resp.status_code == 200
|
||||||
data = resp.json()
|
data = resp.json()
|
||||||
assert len(data) >= 1
|
assert len(data) >= 1
|
||||||
|
|
||||||
|
|
||||||
@pytest.mark.asyncio
|
|
||||||
async def test_claim_blocked_for_viewer_role(
|
|
||||||
client: AsyncClient, test_user, auth_headers, test_db
|
|
||||||
):
|
|
||||||
"""POST /handoffs/{id}/claim must 403 for viewer-role users.
|
|
||||||
|
|
||||||
Codex review flagged the missing role gate as wedge-relevant: the
|
|
||||||
race-condition story (two seniors clicking Pick Up simultaneously)
|
|
||||||
requires auth gating for audit integrity. Viewers must not be able
|
|
||||||
to claim escalations.
|
|
||||||
"""
|
|
||||||
# Create a session + handoff as the engineer-role test_user (default = owner).
|
|
||||||
session = AISession(
|
|
||||||
user_id=test_user["user_data"]["id"],
|
|
||||||
account_id=test_user["user_data"]["account_id"],
|
|
||||||
session_type="guided",
|
|
||||||
intake_type="free_text",
|
|
||||||
intake_content={"text": "test"},
|
|
||||||
status="active",
|
|
||||||
confidence_tier="discovery",
|
|
||||||
conversation_messages=[],
|
|
||||||
)
|
|
||||||
test_db.add(session)
|
|
||||||
await test_db.commit()
|
|
||||||
|
|
||||||
create_resp = await client.post(
|
|
||||||
f"/api/v1/ai-sessions/{session.id}/handoff",
|
|
||||||
headers=auth_headers,
|
|
||||||
json={"intent": "escalate", "engineer_notes": "Need help"},
|
|
||||||
)
|
|
||||||
assert create_resp.status_code == 201
|
|
||||||
handoff_id = create_resp.json()["id"]
|
|
||||||
|
|
||||||
# Downgrade the user to viewer.
|
|
||||||
user_id = PyUUID(test_user["user_data"]["id"])
|
|
||||||
user = (
|
|
||||||
await test_db.execute(select(User).where(User.id == user_id))
|
|
||||||
).scalar_one()
|
|
||||||
user.account_role = "viewer"
|
|
||||||
await test_db.commit()
|
|
||||||
|
|
||||||
claim_resp = await client.post(
|
|
||||||
f"/api/v1/ai-sessions/{session.id}/handoffs/{handoff_id}/claim",
|
|
||||||
headers=auth_headers,
|
|
||||||
)
|
|
||||||
assert claim_resp.status_code == 403
|
|
||||||
assert "engineer" in claim_resp.json()["detail"].lower()
|
|
||||||
|
|
||||||
|
|
||||||
@pytest.mark.asyncio
|
|
||||||
async def test_escalations_stream_blocked_for_viewer(
|
|
||||||
client: AsyncClient, test_user, auth_headers, test_db
|
|
||||||
):
|
|
||||||
"""SSE stream is role-gated to engineer-or-admin (matches queue/claim)."""
|
|
||||||
user_id = PyUUID(test_user["user_data"]["id"])
|
|
||||||
user = (
|
|
||||||
await test_db.execute(select(User).where(User.id == user_id))
|
|
||||||
).scalar_one()
|
|
||||||
user.account_role = "viewer"
|
|
||||||
await test_db.commit()
|
|
||||||
|
|
||||||
resp = await client.get(
|
|
||||||
"/api/v1/ai-sessions/escalations/stream", headers=auth_headers
|
|
||||||
)
|
|
||||||
assert resp.status_code == 403
|
|
||||||
|
|
||||||
|
|
||||||
@pytest.mark.asyncio
|
|
||||||
async def test_escalations_stream_returns_sse_content_type(
|
|
||||||
client: AsyncClient, test_user, auth_headers, test_db
|
|
||||||
):
|
|
||||||
"""Engineer/owner can open the SSE stream and gets text/event-stream
|
|
||||||
plus an initial `ready` event. Read just enough bytes to confirm the
|
|
||||||
handshake — the full pub/sub flow is covered by the bus + dispatcher
|
|
||||||
tests separately."""
|
|
||||||
async with client.stream(
|
|
||||||
"GET",
|
|
||||||
"/api/v1/ai-sessions/escalations/stream",
|
|
||||||
headers=auth_headers,
|
|
||||||
) as resp:
|
|
||||||
assert resp.status_code == 200
|
|
||||||
assert resp.headers["content-type"].startswith("text/event-stream")
|
|
||||||
# First chunk must contain the ready event.
|
|
||||||
first = b""
|
|
||||||
async for chunk in resp.aiter_bytes():
|
|
||||||
first += chunk
|
|
||||||
if b"event: ready" in first and b"\n\n" in first:
|
|
||||||
break
|
|
||||||
assert b"event: ready" in first
|
|
||||||
assert b'"account_id"' in first
|
|
||||||
|
|
||||||
|
|
||||||
@pytest.mark.asyncio
|
|
||||||
async def test_claim_allowed_for_engineer_role(
|
|
||||||
client: AsyncClient, test_user, auth_headers, test_db
|
|
||||||
):
|
|
||||||
"""POST /handoffs/{id}/claim succeeds for engineer-or-admin roles."""
|
|
||||||
session = AISession(
|
|
||||||
user_id=test_user["user_data"]["id"],
|
|
||||||
account_id=test_user["user_data"]["account_id"],
|
|
||||||
session_type="guided",
|
|
||||||
intake_type="free_text",
|
|
||||||
intake_content={"text": "test"},
|
|
||||||
status="active",
|
|
||||||
confidence_tier="discovery",
|
|
||||||
conversation_messages=[],
|
|
||||||
)
|
|
||||||
test_db.add(session)
|
|
||||||
await test_db.commit()
|
|
||||||
|
|
||||||
create_resp = await client.post(
|
|
||||||
f"/api/v1/ai-sessions/{session.id}/handoff",
|
|
||||||
headers=auth_headers,
|
|
||||||
json={"intent": "escalate", "engineer_notes": "Need help"},
|
|
||||||
)
|
|
||||||
assert create_resp.status_code == 201
|
|
||||||
handoff_id = create_resp.json()["id"]
|
|
||||||
|
|
||||||
# Default test_user role is "owner", which passes engineer-or-admin.
|
|
||||||
claim_resp = await client.post(
|
|
||||||
f"/api/v1/ai-sessions/{session.id}/handoffs/{handoff_id}/claim",
|
|
||||||
headers=auth_headers,
|
|
||||||
)
|
|
||||||
assert claim_resp.status_code == 200
|
|
||||||
assert claim_resp.json()["claimed_by"] == test_user["user_data"]["id"]
|
|
||||||
assert claim_resp.json()["claimed_at"] is not None
|
|
||||||
|
|||||||
@@ -45,7 +45,6 @@ async def test_edit_output_api(client: AsyncClient, test_user, auth_headers, tes
|
|||||||
|
|
||||||
output = SessionResolutionOutput(
|
output = SessionResolutionOutput(
|
||||||
session_id=session.id,
|
session_id=session.id,
|
||||||
account_id=session.account_id,
|
|
||||||
output_type="psa_ticket_notes",
|
output_type="psa_ticket_notes",
|
||||||
generated_content="Original",
|
generated_content="Original",
|
||||||
status="draft",
|
status="draft",
|
||||||
|
|||||||
@@ -219,7 +219,7 @@ class TestSessionSharing:
|
|||||||
json={"visibility": "public"},
|
json={"visibility": "public"},
|
||||||
headers=other_headers
|
headers=other_headers
|
||||||
)
|
)
|
||||||
assert response.status_code == 404
|
assert response.status_code == 403
|
||||||
|
|
||||||
async def test_share_nonexistent_session(self, client: AsyncClient, auth_headers):
|
async def test_share_nonexistent_session(self, client: AsyncClient, auth_headers):
|
||||||
"""Creating a share for nonexistent session returns 404."""
|
"""Creating a share for nonexistent session returns 404."""
|
||||||
|
|||||||
@@ -213,28 +213,15 @@ async def test_record_decision_persists_and_bumps_state_version(
|
|||||||
title="x",
|
title="x",
|
||||||
description="y",
|
description="y",
|
||||||
confidence_pct=50,
|
confidence_pct=50,
|
||||||
ai_drafted_script="Write-Output 'ok'",
|
|
||||||
)
|
)
|
||||||
test_db.add(fix)
|
test_db.add(fix)
|
||||||
await test_db.commit()
|
await test_db.commit()
|
||||||
|
|
||||||
# The draft_template path calls TemplateExtractionService, which needs an
|
r = await client.post(
|
||||||
# AI provider configured. CI doesn't set ANTHROPIC_API_KEY/GOOGLE_AI_API_KEY,
|
f"/api/v1/ai-sessions/{session.id}/suggested-fixes/{fix.id}/decision",
|
||||||
# and this test isn't exercising the AI integration — patch the extractor
|
headers=auth_headers,
|
||||||
# with a minimal valid response so the rest of the decision flow runs.
|
json={"decision": "draft_template"},
|
||||||
extractor_stub = AsyncMock(return_value={
|
)
|
||||||
"templated_body": "Write-Output 'ok'",
|
|
||||||
"parameters": [],
|
|
||||||
})
|
|
||||||
with patch(
|
|
||||||
"app.api.endpoints.session_suggested_fixes._extract_template_parameters",
|
|
||||||
extractor_stub,
|
|
||||||
):
|
|
||||||
r = await client.post(
|
|
||||||
f"/api/v1/ai-sessions/{session.id}/suggested-fixes/{fix.id}/decision",
|
|
||||||
headers=auth_headers,
|
|
||||||
json={"decision": "draft_template"},
|
|
||||||
)
|
|
||||||
assert r.status_code == 200
|
assert r.status_code == 200
|
||||||
assert r.json()["user_decision"] == "draft_template"
|
assert r.json()["user_decision"] == "draft_template"
|
||||||
|
|
||||||
|
|||||||
@@ -43,7 +43,7 @@ async def _create_account_and_user(db: AsyncSession, prefix: str):
|
|||||||
async def _login(client: AsyncClient, email: str, password: str) -> dict:
|
async def _login(client: AsyncClient, email: str, password: str) -> dict:
|
||||||
"""Log in and return Authorization headers."""
|
"""Log in and return Authorization headers."""
|
||||||
resp = await client.post(
|
resp = await client.post(
|
||||||
"/api/v1/auth/login/json",
|
"/api/v1/auth/login",
|
||||||
json={"email": email, "password": password},
|
json={"email": email, "password": password},
|
||||||
)
|
)
|
||||||
assert resp.status_code == 200, f"Login failed: {resp.text}"
|
assert resp.status_code == 200, f"Login failed: {resp.text}"
|
||||||
@@ -101,11 +101,11 @@ async def test_category_tree_count_scoped_to_account(
|
|||||||
acct_a, user_a, pass_a = await _create_account_and_user(test_db, "cat-a")
|
acct_a, user_a, pass_a = await _create_account_and_user(test_db, "cat-a")
|
||||||
acct_b, user_b, pass_b = await _create_account_and_user(test_db, "cat-b")
|
acct_b, user_b, pass_b = await _create_account_and_user(test_db, "cat-b")
|
||||||
|
|
||||||
# Categories are tenant-scoped; the endpoint must only count account A's trees.
|
# Shared category (account_id=None means global)
|
||||||
category = TreeCategory(
|
category = TreeCategory(
|
||||||
name="Shared Category",
|
name="Shared Category",
|
||||||
slug=f"shared-cat-{uuid.uuid4().hex[:6]}",
|
slug=f"shared-cat-{uuid.uuid4().hex[:6]}",
|
||||||
account_id=acct_a.id,
|
account_id=None,
|
||||||
is_active=True,
|
is_active=True,
|
||||||
)
|
)
|
||||||
test_db.add(category)
|
test_db.add(category)
|
||||||
@@ -270,7 +270,6 @@ async def test_get_session_returns_404_not_403_for_other_user(
|
|||||||
session_b = Session(
|
session_b = Session(
|
||||||
tree_id=tree_b.id,
|
tree_id=tree_b.id,
|
||||||
user_id=user_b.id,
|
user_id=user_b.id,
|
||||||
account_id=acct_b.id,
|
|
||||||
tree_snapshot={"id": "root", "type": "start", "children": []},
|
tree_snapshot={"id": "root", "type": "start", "children": []},
|
||||||
path_taken=[],
|
path_taken=[],
|
||||||
decisions=[],
|
decisions=[],
|
||||||
@@ -385,7 +384,6 @@ async def test_share_revoke_returns_404_not_403_for_other_user(
|
|||||||
session_b = Session(
|
session_b = Session(
|
||||||
tree_id=tree_b.id,
|
tree_id=tree_b.id,
|
||||||
user_id=user_b.id,
|
user_id=user_b.id,
|
||||||
account_id=acct_b.id,
|
|
||||||
tree_snapshot={"id": "root", "type": "start", "children": []},
|
tree_snapshot={"id": "root", "type": "start", "children": []},
|
||||||
path_taken=[],
|
path_taken=[],
|
||||||
decisions=[],
|
decisions=[],
|
||||||
@@ -536,7 +534,6 @@ async def test_maintenance_schedule_returns_404_for_other_team(
|
|||||||
# Create a schedule for that tree
|
# Create a schedule for that tree
|
||||||
schedule_b = MaintenanceSchedule(
|
schedule_b = MaintenanceSchedule(
|
||||||
tree_id=tree_b.id,
|
tree_id=tree_b.id,
|
||||||
account_id=acct_b.id,
|
|
||||||
created_by=user_b.id,
|
created_by=user_b.id,
|
||||||
cron_expression="0 2 * * 0",
|
cron_expression="0 2 * * 0",
|
||||||
timezone="UTC",
|
timezone="UTC",
|
||||||
|
|||||||
@@ -4,7 +4,6 @@ from datetime import datetime, timezone, timedelta
|
|||||||
from httpx import AsyncClient
|
from httpx import AsyncClient
|
||||||
from uuid import uuid4
|
from uuid import uuid4
|
||||||
|
|
||||||
from app.models.account import Account
|
|
||||||
from app.models.tree import Tree
|
from app.models.tree import Tree
|
||||||
from app.models.tree_share import TreeShare
|
from app.models.tree_share import TreeShare
|
||||||
from app.models.user import User
|
from app.models.user import User
|
||||||
@@ -288,17 +287,13 @@ class TestTreeSharing:
|
|||||||
@pytest.mark.asyncio
|
@pytest.mark.asyncio
|
||||||
async def test_migration_defaults_visibility_to_team(test_db):
|
async def test_migration_defaults_visibility_to_team(test_db):
|
||||||
"""Test that existing trees default to 'team' visibility after migration."""
|
"""Test that existing trees default to 'team' visibility after migration."""
|
||||||
account = Account(name="Migration Default Test", display_code=uuid4().hex[:8])
|
|
||||||
test_db.add(account)
|
|
||||||
await test_db.flush()
|
|
||||||
|
|
||||||
# Create a tree without specifying visibility
|
# Create a tree without specifying visibility
|
||||||
tree = Tree(
|
tree = Tree(
|
||||||
name="Old Tree",
|
name="Old Tree",
|
||||||
description="Created before migration",
|
description="Created before migration",
|
||||||
tree_structure={"id": "root", "type": "decision", "question": "Test?", "children": []},
|
tree_structure={"id": "root", "type": "decision", "question": "Test?", "children": []},
|
||||||
author_id=None,
|
author_id=None,
|
||||||
account_id=account.id
|
account_id=None
|
||||||
)
|
)
|
||||||
test_db.add(tree)
|
test_db.add(tree)
|
||||||
await test_db.commit()
|
await test_db.commit()
|
||||||
|
|||||||
@@ -359,7 +359,7 @@ async def test_delete_upload_forbidden_for_non_owner(client, auth_headers, test_
|
|||||||
f"/api/v1/uploads/{upload.id}", headers=other_headers
|
f"/api/v1/uploads/{upload.id}", headers=other_headers
|
||||||
)
|
)
|
||||||
|
|
||||||
assert response.status_code == 404
|
assert response.status_code == 403
|
||||||
|
|
||||||
|
|
||||||
# ---------------------------------------------------------------------------
|
# ---------------------------------------------------------------------------
|
||||||
|
|||||||
@@ -1,494 +0,0 @@
|
|||||||
# Design: ResolutionFlow GTM — Escalation-Mode-First Wedge
|
|
||||||
|
|
||||||
Generated by /office-hours on 2026-04-26
|
|
||||||
Branch: main
|
|
||||||
Repo: chihlasm/resolutionflow
|
|
||||||
Status: APPROVED
|
|
||||||
Mode: Startup
|
|
||||||
|
|
||||||
## Problem Statement
|
|
||||||
|
|
||||||
ResolutionFlow is a multi-tenant SaaS troubleshooting platform for MSPs, currently
|
|
||||||
in Go-to-Market Validation (pre-PMF). The backend is feature-complete (55+ endpoints,
|
|
||||||
100+ tests, FlowPilot telemetry baseline accruing). The product has users but no
|
|
||||||
paying customers.
|
|
||||||
|
|
||||||
The blocker is not engineering completeness. The blocker is the absence of a sharp
|
|
||||||
GTM story tied to a number a buyer can verify. The session reframed the wedge twice
|
|
||||||
before landing on the real one.
|
|
||||||
|
|
||||||
**What ResolutionFlow actually is:** the structuring layer between conversational AI
|
|
||||||
and the way MSP techs work tickets. AI is great at producing answers; it is bad at
|
|
||||||
producing workflow-shaped output. ResolutionFlow gives the tech the AI they already
|
|
||||||
trust (Claude/GPT) but organizes the output into actionable structured steps,
|
|
||||||
records the session, captures customer-specific context, and turns the result into
|
|
||||||
PSA-formatted ticket notes — and optionally a runbook — without the tech writing
|
|
||||||
anything.
|
|
||||||
|
|
||||||
**Positioning line:** "the senior engineer looking over your shoulder."
|
|
||||||
|
|
||||||
## Demand Evidence
|
|
||||||
|
|
||||||
The founder is the first user. Senior Systems Engineer at an MSP, losing ~20
|
|
||||||
hours/week to cross-domain interruptions (systems engineer pulled into networking
|
|
||||||
problems and vice versa). At least 4 interruptions per day, with the time cost
|
|
||||||
concentrated in the gap between AI-conversation output and MSP-ticket workflow.
|
|
||||||
|
|
||||||
This is solving-your-own-problem demand evidence — strongest possible signal at
|
|
||||||
this stage. The 20 hrs/week figure is the founder's own time, not a hypothetical.
|
|
||||||
Every MSP shop with a senior tech and a junior tech has a version of this problem.
|
|
||||||
|
|
||||||
Telemetry signal (Phase 0.5 baseline accruing): captured flows pile up but are not
|
|
||||||
being re-used. This says capture works, retrieval doesn't — which means the
|
|
||||||
"hours-saved-via-re-use" number isn't yet generatable from existing data. The
|
|
||||||
GTM-grade ROI story needs a different metric until re-use lands: minutes recovered
|
|
||||||
per escalation, generated by Approach A below.
|
|
||||||
|
|
||||||
## Status Quo
|
|
||||||
|
|
||||||
MSP techs today resolve tickets via three workarounds:
|
|
||||||
|
|
||||||
1. **AI in a tab.** Junior tech opens Claude or ChatGPT, pastes the problem, gets a
|
|
||||||
wall of prose, parses it into action items in their head, executes, repeats. AI
|
|
||||||
does the diagnostic work. The tech does all the structure-extraction and
|
|
||||||
ticket-note-writing afterward.
|
|
||||||
|
|
||||||
2. **Tribal knowledge.** Junior tech pings senior in Slack. Senior tech is
|
|
||||||
interrupted (4+ times/day per the founder's own data). Context handoff is verbal
|
|
||||||
and lossy.
|
|
||||||
|
|
||||||
3. **Stale runbooks.** Half-maintained Notion / IT Glue / SharePoint pages that
|
|
||||||
nobody trusts because they're 18 months out of date and don't match the current
|
|
||||||
customer environment.
|
|
||||||
|
|
||||||
The cost of these workarounds for the founder personally: ~20 hours per week of
|
|
||||||
senior-tech time lost. For a 5-tech MSP, the equivalent is 1 full FTE worth of
|
|
||||||
senior-engineer hours leaking into context-switching and tab-hopping.
|
|
||||||
|
|
||||||
## Target User & Narrowest Wedge
|
|
||||||
|
|
||||||
**Target user:** Senior Systems Engineer at a small-to-mid MSP (5-20 techs). The
|
|
||||||
founder is exemplar #1. Buying authority is shared between senior tech (champion)
|
|
||||||
and MSP owner (signs the check).
|
|
||||||
|
|
||||||
**Narrowest paid wedge:** Escalation Mode. Single sharp feature. When a junior tech
|
|
||||||
escalates a ticket they were working in FlowPilot, the senior tech opens the ticket
|
|
||||||
and sees the entire structured session state — every step the junior tried, every
|
|
||||||
dead end, every command output — instead of starting with "tell me what you tried"
|
|
||||||
for five minutes.
|
|
||||||
|
|
||||||
Why this is the wedge:
|
|
||||||
|
|
||||||
- **Two metrics, not one** (revised after /codex review 2026-04-27):
|
|
||||||
- **Manual baseline** (the Assignment, weeks 0-2): senior tech stopwatches the
|
|
||||||
next 5 escalations. T1 (first diagnostic action) − T0 (open ticket) under
|
|
||||||
today's verbal-handoff workflow. This is the "what you currently lose" number.
|
|
||||||
- **In-product metric** (telemetry, week 3+): time-to-first-action after claim,
|
|
||||||
derived from `ai_session_step` rows where `created_at > SessionHandoff.claimed_at`
|
|
||||||
AND `user_id = SessionHandoff.claimed_by`. This is the "what it is now with
|
|
||||||
structured handoff" number.
|
|
||||||
- **The savings claim** = manual baseline − in-product metric. Quote both
|
|
||||||
explicitly in pilot conversations. Do NOT roll the in-product number alone
|
|
||||||
into "minutes recovered" — that's an apples-to-oranges miscount Codex caught
|
|
||||||
in the cross-model review.
|
|
||||||
- **Single-feature demo:** a 2-minute Loom shows the magic moment — junior hits
|
|
||||||
escalate, senior window opens with full structured context. No theory required.
|
|
||||||
- **Cross-buyer story:** sells to senior tech (less interruption) AND owner (junior
|
|
||||||
techs resolve faster, take more accounts).
|
|
||||||
- **Hours-saved math is simple:** 4-5 minutes per escalation × 15-30 escalations
|
|
||||||
  per week per senior tech = 1-2.5 hours/week recovered per senior. At $80-150/hr
|
|
||||||
fully-loaded senior tech cost, the tool pays for itself with one customer.
|
|
||||||
|
|
||||||
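The two-metric savings claim and the weekly-hours estimate in the list above can be worked through in a few lines. This is a minimal sketch, not code from the repo: the function names are hypothetical and the sample timings are placeholder values, not pilot data. Multiplying the endpoints of the quoted ranges (4-5 minutes, 15-30 escalations) gives 1.0 to 2.5 hours per week.

```python
from statistics import mean


def minutes_saved_per_escalation(manual_baseline_min, in_product_min):
    """Savings claim: mean manual baseline minus mean in-product metric."""
    return mean(manual_baseline_min) - mean(in_product_min)


def weekly_hours_recovered(minutes_per_escalation, escalations_per_week):
    """Convert per-escalation minutes into hours per week for one senior tech."""
    return minutes_per_escalation * escalations_per_week / 60.0


# Placeholder samples in minutes (hypothetical, NOT pilot data).
manual_baseline = [5.0, 6.0, 4.5, 5.5, 4.0]   # stopwatched T1 - T0, verbal handoff
in_product = [0.5, 1.0, 0.5, 0.75, 0.25]      # first action after claim, from telemetry

saved = minutes_saved_per_escalation(manual_baseline, in_product)
low = weekly_hours_recovered(4, 15)    # low end of both quoted ranges
high = weekly_hours_recovered(5, 30)   # high end of both quoted ranges

print(f"saved per escalation: {saved:.1f} min")
print(f"weekly range from the quoted figures: {low}-{high} h")
```

Keeping the baseline and the in-product number as separate inputs mirrors the rule above: both are quoted, and only their difference is called savings.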
## Constraints
|
|
||||||
|
|
||||||
- **One-founder shop.** Cannot run three concurrent product narratives. Sequence
|
|
||||||
matters more than scope.
|
|
||||||
- **Pre-PMF runway implied.** 4-8 week build cycles before talking to a buyer are
|
|
||||||
expensive. Approach A's 1-2 week timeline is the binding constraint.
|
|
||||||
- **Existing architecture is mostly aligned.** FlowPilot, unified_chat_service,
|
|
||||||
FlowProposal, ConnectWise PSA integration — most of the pieces exist. Risk is
|
|
||||||
positioning and UX, not capability.
|
|
||||||
- **PSA copilot competition is real.** ConnectWise / Autotask / Halo are racing to
|
|
||||||
ship AI features. The wedge has to be sharp because we lose on distribution.
|
|
||||||
|
|
||||||
## Premises
|
|
||||||
|
|
||||||
The five load-bearing claims this design rests on, all confirmed in session:
|
|
||||||
|
|
||||||
1. **Diagnostic AI is commoditized.** ResolutionFlow does not compete on
|
|
||||||
"AI solves the ticket faster." That race is over. ChatGPT/Claude already won.
|
|
||||||
2. **The structuring layer is the wedge.** AI conversational output is too dense
|
|
||||||
and unstructured for active troubleshooting. ResolutionFlow's value is
|
|
||||||
organizing that output into actionable, separable, recorded steps.
|
|
||||||
3. **Escalation context is the killer feature.** "Junior hits escalate, senior gets
|
|
||||||
full structured context in 30 seconds instead of 5 minutes" is the sharpest
|
|
||||||
demoable moment in the entire product surface.
|
|
||||||
4. **First paying customer is bottom-up, prosumer-flavored.** Senior tech at a
|
|
||||||
small MSP, $20-50/seat/month, monthly billing. Owner-targeted enterprise
|
|
||||||
pricing waits until 5+ paying shops establish baseline ROI numbers.
|
|
||||||
5. **Distribution is MSP communities, not paid SaaS ads.** r/msp, MSPGeek, RocketMSP,
|
|
||||||
PSA marketplace listings. The channel matches the buyer.
|
|
||||||
|
|
||||||
## Approaches Considered
|
|
||||||
|
|
||||||
### Approach A: Escalation Mode first (REDUCED SCOPE per /plan-eng-review)
|
|
||||||
|
|
||||||
Lead the GTM with the killer feature. Polish the escalate-with-context handoff:
|
|
||||||
junior tech mid-session hits escalate, senior tech window opens with full
|
|
||||||
structured session state. 2-min demo Loom. Pilot with **3 MSPs** in the founder's
|
|
||||||
network (capped at 3 to preserve build capacity for B). Metric: minutes recovered
|
|
||||||
per escalation.
|
|
||||||
|
|
||||||
**SCOPE REDUCTION (2026-04-27 eng review):** ~80% of Approach A is already built.
|
|
||||||
The original 2-3 week estimate assumed greenfield. Codebase audit confirms:
|
|
||||||
|
|
||||||
| What the doc said "build" | What actually exists |
|
|
||||||
|---|---|
|
|
||||||
| Session-state serialization | `ai_session.escalation_package` (JSONB), `SessionHandoff.snapshot` |
|
|
||||||
| Senior-tech inbox | [EscalationQueuePage.tsx](frontend/src/pages/EscalationQueuePage.tsx) + [EscalationQueue.tsx](frontend/src/components/flowpilot/EscalationQueue.tsx) |
|
|
||||||
| Claim workflow | [handoff_manager.py:123 claim_session()](backend/app/services/handoff_manager.py#L123) |
|
|
||||||
| API surface | [session_handoffs.py](backend/app/api/endpoints/session_handoffs.py) — POST /handoff, /claim, GET queue |
|
|
||||||
| AI assessment for senior | `_generate_ai_assessment()` in handoff_manager |
|
|
||||||
| PSA round-trip | `escalation_package_markdown`, `escalation_package_external_id` |
|
|
||||||
|
|
||||||
**Real engineering scope (~6-9 days):**
|
|
||||||
|
|
||||||
1. **Notification dual-path** (4-5 days). `notification_sent` flag is a dead column —
|
|
||||||
never written. Wire two channels in `handoff_manager.create_handoff`:
|
|
||||||
- **Email** (existing `EmailService.send_notification_email`) — handles offline seniors.
|
|
||||||
- **WebSocket / SSE push** to the EscalationQueue for live demo magic moment.
|
|
||||||
- Set `notification_sent=true` after dispatch confirmation.
|
|
||||||
- Graceful degradation: handoff still created if notification raises (regression test required).
|
|
||||||
|
|
||||||
2. **Hero metric endpoint** (~2 hours). New `GET /api/v1/analytics/escalation-metrics`,
|
|
||||||
account-scoped, role-gated to `require_engineer_or_admin`. Computes
|
|
||||||
*minutes recovered per escalation* by querying:
|
|
||||||
```
|
|
||||||
ai_session_step.created_at (first row by senior_tech_user_id where created_at > SessionHandoff.claimed_at)
|
|
||||||
minus
|
|
||||||
SessionHandoff.claimed_at
|
|
||||||
```
|
|
||||||
Returns a rolling-30-day average per account. No schema change.
|
|
||||||
|
|
||||||
3. **UX polish on EscalationQueue + receiving-engineer view** (2-3 days). Confirm the
|
|
||||||
magic-moment screen lands when senior clicks claim. Add an unread indicator on
|
|
||||||
the queue. Wire optimistic insert when SSE event arrives.
|
|
||||||
|
|
||||||
4. **Loom + landing page copy** (1-2 days). Non-engineering. Outside this plan's scope
|
|
||||||
but required for the GTM in week 3.
|
|
||||||
|
|
||||||
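The metric query in item 2 can be sketched in plain Python over in-memory rows. This is an illustration under stated assumptions, not the shipped endpoint: `Handoff` and `Step` are hypothetical stand-ins for `SessionHandoff` and `ai_session_step`, the `user_id` field on a step is assumed from the plan text, and account scoping and role gating are left out. A real implementation would most likely push the same logic into SQL.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Handoff:  # hypothetical stand-in for SessionHandoff
    session_id: str
    claimed_by: Optional[str]
    claimed_at: Optional[datetime]


@dataclass
class Step:  # hypothetical stand-in for an ai_session_step row
    session_id: str
    user_id: str
    created_at: datetime


def time_to_first_action(handoff: Handoff, steps: List[Step]) -> Optional[timedelta]:
    """First step by the claimer strictly after claimed_at, minus claimed_at."""
    if handoff.claimed_at is None or handoff.claimed_by is None:
        return None  # unclaimed handoffs contribute nothing
    candidates = [
        s.created_at
        for s in steps
        if s.session_id == handoff.session_id
        and s.user_id == handoff.claimed_by
        and s.created_at > handoff.claimed_at
    ]
    if not candidates:
        return None  # claimed, but the senior has not acted yet
    return min(candidates) - handoff.claimed_at


def rolling_average_minutes(
    handoffs: List[Handoff], steps: List[Step], now: datetime, window_days: int = 30
) -> Optional[float]:
    """Average time-to-first-action (minutes) over claims in the last window_days."""
    cutoff = now - timedelta(days=window_days)
    minutes = []
    for h in handoffs:
        if h.claimed_at is None or h.claimed_at < cutoff:
            continue
        delta = time_to_first_action(h, steps)
        if delta is not None:
            minutes.append(delta.total_seconds() / 60.0)
    return sum(minutes) / len(minutes) if minutes else None
```

Returning `None` rather than `0` when there is no qualifying data keeps an empty account from reporting a misleading zero-minute average.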
**Test plan:** 100% coverage of new paths — 13 tests including 4 e2e and 1 regression
|
|
||||||
(graceful-degradation when notification dispatch raises). Test plan artifact at
|
|
||||||
`~/.gstack/projects/chihlasm-resolutionflow/abc-main-eng-review-test-plan-20260427-000000.md`.
|
|
||||||
|
|
||||||
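The graceful-degradation requirement (the handoff is still created when notification dispatch raises) can be illustrated with a small sketch. The `create_handoff` function and the notifier callables below are hypothetical and do not reproduce `handoff_manager.create_handoff`; they only show the shape of the behavior a regression test would pin down.

```python
import logging
from typing import Callable, Dict

logger = logging.getLogger("handoff_sketch")


def create_handoff(session_id: str, notifiers: Dict[str, Callable[[dict], None]]) -> dict:
    """Create the handoff first, then attempt each channel independently."""
    handoff = {"session_id": session_id, "status": "pending", "delivery": {}}
    for channel, send in notifiers.items():
        try:
            send(handoff)
            handoff["delivery"][channel] = "sent"
        except Exception:
            # A failing channel is logged and recorded, never re-raised:
            # the handoff itself must survive notification errors.
            logger.exception("notification failed on channel %s", channel)
            handoff["delivery"][channel] = "failed"
    return handoff


def broken_email(handoff: dict) -> None:
    raise RuntimeError("SMTP unavailable")  # simulated outage


def working_push(handoff: dict) -> None:
    pass  # simulated successful SSE/WebSocket publish


result = create_handoff("session-1", {"email": broken_email, "push": working_push})
print(result["delivery"])
```

Because each channel is wrapped separately, one failing channel does not prevent the other from being attempted.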
**Risk:** Low. Single feature, single metric, architecture-aligned. The dual-path
|
|
||||||
notification is the only mildly novel surface; both halves use existing infra.
|
|
||||||
|
|
||||||
**Reuses:** `services/handoff_manager.py`, `services/escalation_package_generator.py`,
|
|
||||||
`models/session_handoff.py`, `models/ai_session.py`, `services/notification_service.py`,
|
|
||||||
`models/notification_log.py`, EmailService, EscalationQueuePage + EscalationQueue.
|
|
||||||
|
|
||||||
### UI Specifications (locked by /plan-design-review 2026-04-27)
|
|
||||||
|
|
||||||
**Magic-moment screen** (new, after Pick Up click): dedicated handoff-context view that
|
|
||||||
loads BEFORE the regular FlowPilot session view, then dissolves on first senior action.
|
|
||||||
Four sections, single frame:
|
|
||||||
|
|
||||||
1. **Problem summary** (top, 2-3 lines): junior's framing. Bricolage Grotesque h2.
|
|
||||||
2. **What's been tried** (left or middle column): structured list of `dead_ends_flagged[]`
|
|
||||||
and `steps_attempted[]` from `escalation_package` JSONB. Card-flat surface, IBM Plex.
|
|
||||||
3. **AI assessment** (right column): `ai_assessment_data` rendered as 3 fields —
|
|
||||||
`likely_cause`, `suggested_steps[]`, `confidence`. accent-dim badge for confidence.
|
|
||||||
4. **Start here** (primary CTA, electric-blue, ≥44px touch target): opens FlowPilot
|
|
||||||
session at the most-likely-next-step. Senior typing or clicking anywhere triggers
|
|
||||||
200ms fade-out and FlowPilot view fades in. Re-openable via "Show handoff context"
|
|
||||||
ghost button in FlowPilot toolbar.
|
|
||||||
|
|
||||||
**Hero metric ("minutes recovered per escalation"):** lives in TWO places:
|
|
||||||
- **Queue stat-card** (above EscalationQueue list on /escalations): compact, "X.X hrs
|
|
||||||
saved this month" + "click for details" affordance. Refreshes on queue load.
|
|
||||||
- **Dedicated `/analytics/escalations` page** (owner-facing): trend chart (4-week
|
|
||||||
rolling), per-tech breakdown, per-problem-domain segmentation. Engineer-or-admin
|
|
||||||
role-gated.
|
|
||||||
|
|
||||||
**Real-time arrival visual** (when WebSocket pushes a new escalation):
|
|
||||||
- New card slides in from above the list, 200ms ease-out CSS transition.
|
|
||||||
- Browser tab title prefixes with " (1) " / " (N) " when tab is backgrounded; clears
|
|
||||||
on focus.
|
|
||||||
- No sound. MUST respect `prefers-reduced-motion: reduce` (slide-in collapses to
|
|
||||||
instant fade-in).
|
|
||||||
|
|
||||||
**Unread state:** subtle 6px dot in top-right corner of card for escalations the
|
|
||||||
current senior has never opened. Dot fades on first hover or click.
|
|
||||||
|
|
||||||
**Race-condition (two seniors click Pick Up simultaneously):** loser sees a toast
|
|
||||||
"Already claimed by [name] 2s ago" via existing `@/lib/toast`; the card flashes the
|
|
||||||
winner's name in the meta row for 1s, then dissolves from the loser's view via
|
|
||||||
optimistic update + WebSocket reconciliation.
|
|
||||||
|
|
||||||
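The first-claim-wins rule behind the race-condition toast can be sketched as an atomic check-and-set. This is an in-process illustration only: `ClaimRegistry` is a hypothetical name, and the real service would presumably enforce the same rule at the database level rather than with a Python lock.

```python
import threading
from typing import Dict, Tuple


class ClaimRegistry:
    """Hypothetical in-memory model of 'first senior to claim wins'."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._claimed_by: Dict[str, str] = {}

    def claim(self, handoff_id: str, user_id: str) -> Tuple[bool, str]:
        """Return (won, winner_id). Only the first caller for a handoff wins."""
        with self._lock:
            # setdefault stores user_id only if nobody has claimed yet.
            winner = self._claimed_by.setdefault(handoff_id, user_id)
        return winner == user_id, winner


registry = ClaimRegistry()
outcomes: Dict[str, Tuple[bool, str]] = {}


def attempt(user_id: str) -> None:
    outcomes[user_id] = registry.claim("handoff-1", user_id)


threads = [threading.Thread(target=attempt, args=(u,)) for u in ("senior-a", "senior-b")]
for t in threads:
    t.start()
for t in threads:
    t.join()

for user_id, (won, winner) in outcomes.items():
    print(user_id, "claimed" if won else f"already claimed by {winner}")
```

Returning the winner's id to the loser is what lets the UI show who claimed the escalation.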
**Unread state (Codex correction 2026-04-27):** dot indicator clears on **open,
|
|
||||||
claim, or explicit dismiss** — NOT on hover. Hover-to-clear is a bad proxy for
|
|
||||||
acknowledgment because incidental mouse movement creates false clears.
|
|
||||||
|
|
||||||
**Notification routing (Codex finding 2026-04-27):** v1 fans out the email + push
|
|
||||||
to **all engineer-or-admin role users in the same account_id as the SessionHandoff**.
|
|
||||||
No on-call/round-robin logic in v1. If pilots ask for routing, capture as v2 TODO.
|
|
||||||
The first senior to claim wins; everyone else's notification self-resolves on
|
|
||||||
WebSocket reconciliation.
|
|
||||||
|
|
||||||
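The v1 routing rule (fan out to every engineer-or-admin user in the handoff's account) reduces to a simple filter. In this sketch the `User` dataclass and the `ELIGIBLE_ROLES` set are assumptions: the tests in this diff indicate that `owner` passes the engineer-or-admin gate and `viewer` does not, but the full role list is not confirmed here.

```python
from dataclasses import dataclass
from typing import List

# Assumed role set; only "owner" passing and "viewer" failing are
# supported by the tests in this diff.
ELIGIBLE_ROLES = frozenset({"owner", "admin", "engineer"})


@dataclass(frozen=True)
class User:  # hypothetical minimal user record
    id: str
    account_id: str
    account_role: str


def notification_recipients(users: List[User], handoff_account_id: str) -> List[User]:
    """All eligible-role users in the same account as the handoff."""
    return [
        u
        for u in users
        if u.account_id == handoff_account_id and u.account_role in ELIGIBLE_ROLES
    ]


users = [
    User("u1", "acct-a", "owner"),
    User("u2", "acct-a", "engineer"),
    User("u3", "acct-a", "viewer"),    # wrong role: excluded
    User("u4", "acct-b", "engineer"),  # other tenant: excluded
]
recipients = notification_recipients(users, "acct-a")
print([u.id for u in recipients])
```

Filtering on `account_id` first is what keeps the fan-out tenant-isolated; there is deliberately no on-call or round-robin selection.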
**Notification delivery model (Codex correction 2026-04-27):** drop the
|
|
||||||
`notification_sent: bool` flag from v1. Replace with per-channel delivery rows
|
|
||||||
in the `notification_log` table (it already exists; reuse it, don't add a new model)
|
|
||||||
keyed by `(handoff_id, channel, recipient_user_id, status)` where status ∈
|
|
||||||
{queued, sent, failed, suppressed}. This makes partial-success and per-channel
|
|
||||||
retry visible. If the existing `notification_log` schema doesn't match, defer
|
|
||||||
the per-channel persistence to a v2 TODO and v1 logs delivery attempts to the
|
|
||||||
existing telemetry stream instead. Do NOT keep the dead boolean.
|
|
||||||
|
|
||||||
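The per-channel delivery model can be sketched as rows keyed by `(handoff_id, channel, recipient_user_id)` with one of the four statuses. This is a hypothetical in-memory illustration; it does not reflect the actual `notification_log` schema, which the text above says may not match.

```python
from enum import Enum
from typing import Dict, List, Tuple

DeliveryKey = Tuple[str, str, str]  # (handoff_id, channel, recipient_user_id)


class DeliveryStatus(str, Enum):
    QUEUED = "queued"
    SENT = "sent"
    FAILED = "failed"
    SUPPRESSED = "suppressed"


class DeliveryLog:
    """Hypothetical stand-in for per-channel delivery rows."""

    def __init__(self) -> None:
        self._rows: Dict[DeliveryKey, DeliveryStatus] = {}

    def record(self, handoff_id: str, channel: str, recipient: str,
               status: DeliveryStatus) -> None:
        # One row per (handoff, channel, recipient); later writes update status.
        self._rows[(handoff_id, channel, recipient)] = status

    def retryable(self, handoff_id: str) -> List[DeliveryKey]:
        """Rows for this handoff whose last known status is FAILED."""
        return sorted(
            key for key, status in self._rows.items()
            if key[0] == handoff_id and status is DeliveryStatus.FAILED
        )


log = DeliveryLog()
log.record("h1", "email", "u1", DeliveryStatus.QUEUED)
log.record("h1", "email", "u1", DeliveryStatus.FAILED)   # email bounced
log.record("h1", "push", "u1", DeliveryStatus.SENT)      # push succeeded
log.record("h1", "email", "u2", DeliveryStatus.SUPPRESSED)
print(log.retryable("h1"))
```

Unlike a single boolean, this shape shows that the push reached `u1` while the email did not, which is the partial-success visibility the correction asks for.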
**"Start here" CTA (Codex correction 2026-04-27):** opens the FlowPilot session
|
|
||||||
at the **latest known state** (the AI's most recent agent_message + the current
|
|
||||||
pending_task_lane). Surface `ai_assessment_data.suggested_steps[]` as a list of
|
|
||||||
chips below the chat input — clicking a chip prefills the input. Do NOT invent a
|
|
||||||
"jump to most-likely-next-step" capability that doesn't exist in the session model.
|
|
||||||
|
|
||||||
**`/claim` role gate (Codex correction 2026-04-27, IN-SCOPE for v1):** add
|
|
||||||
`require_engineer_or_admin` dep on POST `/handoffs/{id}/claim`. Originally
|
|
||||||
deferred to TODO during eng review; Codex correctly flagged it as wedge-relevant
|
|
||||||
because the race-condition story depends on auth gating. ~30 min change. Removed
|
|
||||||
from TODO.md.
|
|
||||||
|
|
||||||
**A11y requirements (mandatory before pilot ship):**
|
|
||||||
- Keyboard: Tab order through queue cards; Enter on focused card opens it; Pick Up
|
|
||||||
button is a reachable target; Esc closes the handoff-context overlay.
|
|
||||||
- ARIA: `role="region"` + `aria-live="polite"` on the queue list (announces arrivals);
|
|
||||||
`aria-label="N escalations awaiting pickup"` on the heading; the slide-in animation
|
|
||||||
must not announce twice (debounce live-region updates).
|
|
||||||
- Pick Up button: bump from `py-2` to `py-2.5` to clear the 44px touch-target floor.
|
|
||||||
- Color contrast: confidence-badge text on accent-dim background must be ≥4.5:1
|
|
||||||
(verify against DESIGN-SYSTEM.md tokens).
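
The 4.5:1 contrast check can be verified in code with the WCAG 2.x relative-luminance formula. This is a sketch: the function names are hypothetical, and the actual token colors must come from DESIGN-SYSTEM.md (none are assumed here).

```typescript
type Rgb = [number, number, number]

// WCAG 2.x relative luminance for an sRGB color given as 0-255 channels.
function relativeLuminance([r, g, b]: Rgb): number {
  const linearize = (channel: number): number => {
    const s = channel / 255
    return s <= 0.03928 ? s / 12.92 : Math.pow((s + 0.055) / 1.055, 2.4)
  }
  return 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b)
}

// Contrast ratio: (lighter + 0.05) / (darker + 0.05), ranging from 1 to 21.
function contrastRatio(a: Rgb, b: Rgb): number {
  const la = relativeLuminance(a)
  const lb = relativeLuminance(b)
  const lighter = Math.max(la, lb)
  const darker = Math.min(la, lb)
  return (lighter + 0.05) / (darker + 0.05)
}

// True when the pair clears the 4.5:1 floor required for normal-size text.
function meetsTextContrast(text: Rgb, background: Rgb): boolean {
  return contrastRatio(text, background) >= 4.5
}
```

Pass the resolved badge text color and the accent-dim background color as `Rgb` tuples to confirm the requirement before pilot ship.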
|
|
||||||
|
|
||||||
**DS token discipline:** every new piece must use `card-flat`, `accent-dim`/`accent-text`,
|
|
||||||
`text-muted-foreground`, `bg-card`/`bg-elevated`, IBM Plex / Bricolage / JetBrains,
|
|
||||||
explicit `transition` property lists (never `transition: all`). No glass, no blur,
|
|
||||||
no gradient surfaces. Electric-blue accent reserved for interactive elements only.
|
|
||||||
|
|
||||||
**Mobile responsive:** deferred to post-pilot TODO. Pre-PMF wedge target is desktop;
|
|
||||||
MSP techs work on laptops/desktops in shop environments.
|
|
||||||
|
|
||||||
**Deferred to TODO.md (out of scope for v1 wedge):**
|
|
||||||
- Peer-tech escalates colleague's session (currently session-owner-only)
|
|
||||||
- Role gate on POST /claim: no longer deferred; moved in-scope for v1 (see the `/claim` role gate correction above)
|
|
||||||
|
|
||||||
### Approach B: Full Structured Resolution loop (split B1 + B2)
|
|
||||||
|
|
||||||
End-to-end demo: tech opens FlowPilot, structure appears in side panel as AI
|
|
||||||
responds, ticket notes auto-populate at end, optional runbook capture for reusable
|
|
||||||
patterns. Tells the full "senior engineer over your shoulder" story.
|
|
||||||
|
|
||||||
**B1 — Side panel + PSA-formatted ticket notes** (ships first):
|
|
||||||
- Structured side panel that surfaces parsed AI markers as live actionable steps
|
|
||||||
while the conversation runs.
|
|
||||||
- PSA-formatted ticket-notes exporter (ConnectWise first; Autotask/Halo later).
|
|
||||||
- Effort: M (~3 weeks).
|
|
||||||
|
|
||||||
**B2 — Runbook offer-and-save** (gated on pilot demand):
|
|
||||||
- "Save this resolution as a flow?" prompt at session end, with auto-drafted
|
|
||||||
runbook from the structured session state.
|
|
||||||
- Effort: S (~1 week). Don't build until at least 2 pilot customers explicitly
|
|
||||||
ask for it.
|
|
||||||
|
|
||||||
- **Risk:** Medium. The structured-output panel quality is the whole demo. If it
|
|
||||||
looks dumb, the demo dies.
|
|
||||||
- **Reuses:** FlowPilot, unified_chat_service, FlowProposal, ConnectWise PSA
|
|
||||||
integration.
|
|
||||||
|
|
||||||
### Approach C: Senior-Tech Time-Saved Counter
|
|
||||||
|
|
||||||
Continuous measurement layer underneath A and B. Every session contributes an
|
|
||||||
estimated minutes-saved number. Owner-facing dashboard quotes "this month your
|
|
||||||
shop saved N hours of senior-tech time." Sells to MSP owner with verifiable ROI.
|
|
||||||
|
|
||||||
- **Effort:** S (~1 week + ongoing measurement methodology refinement).
|
|
||||||
- **Risk:** Medium-low. Methodology has to be defensible. If numbers look
|
|
||||||
made-up, trust dies fast.
|
|
||||||
- **Reuses:** FlowPilot telemetry, session metadata, account-scoped analytics.
|
|
||||||
|
|
||||||
## Recommended Approach
|
|
||||||
|
|
||||||
**A first (1-2 weeks), then B (3-4 weeks after A ships), with C running underneath
|
|
||||||
both as a continuous backdrop.**
|
|
||||||
|
|
||||||
Sequence rationale:
|
|
||||||
|
|
||||||
- **A is the sharpest possible 2-minute demo.** Single feature, single metric,
|
|
||||||
buyer-verifiable in their own data. Get it in front of 5 MSPs in week 3.
|
|
||||||
- **B is the depth play.** Once Approach A has produced first-pilot signal,
|
|
||||||
Approach B's full structured-resolution loop becomes the "what we ship next" that
|
|
||||||
retains pilots and converts them to paid.
|
|
||||||
- **C compounds across both.** Every session under A or B contributes to the
|
|
||||||
time-saved counter. By week 6 there are real numbers to put in front of an MSP
|
|
||||||
owner — turning a senior-tech-led pilot into an owner-signed contract.
|
|
||||||
|
|
||||||
This sequence is non-negotiable. Building B before A is the classic pre-PMF trap of
|
|
||||||
perfecting product before validating GTM. Building C alone is measurement without a
|
|
||||||
demo to anchor it.
|
|
||||||
|
|
||||||
## Pricing
|
|
||||||
|
|
||||||
**Pilot pricing (first 3-5 customers): $39/seat/month, monthly billing,
|
|
||||||
month-to-month.** Anchored against IT Glue (~$29/tech), Hudu (~$25/tech),
|
|
||||||
Liongard (~$3/endpoint). The premium over IT Glue/Hudu reflects the active-session
|
|
||||||
value (vs. their static-runbook value): roughly 34% above IT Glue and 56% above Hudu at the prices quoted here.
|
|
||||||
|
|
||||||
Customer #6+ pricing is an Open Question (revisit after 3 pilots produce real
|
|
||||||
hours-saved data; price up if the per-seat ROI is over $200/seat/mo).
|
|
||||||
|
|
||||||
## Open Questions
|
|
||||||
|
|
||||||
1. **Free-tier shape.** Should the time-saved counter be free forever as a
|
|
||||||
distribution lever, with paid for the structuring + escalation? Land-and-expand
|
|
||||||
pattern. Decide after 3 pilot conversions.
|
|
||||||
2. **PSA-marketplace timing.** ConnectWise Marketplace listing requires partnership
|
|
||||||
onboarding (~6-week cycle). Submit application week 5; expect listing live by
|
|
||||||
week 11. Don't gate launch on it.
|
|
||||||
3. **Customer #6+ pricing.** Revisit after 3 pilot customers produce verifiable
|
|
||||||
hours-saved numbers.
|
|
||||||
|
|
||||||
## Deferred (YAGNI until 10 paying customers)
|
|
||||||
|
|
||||||
- HIPAA / SOC2 audit positioning. Pre-PMF is too early; revisit when a regulated-
|
|
||||||
vertical MSP asks for it explicitly.
|
|
||||||
- Multi-PSA depth (Autotask, Halo). ConnectWise alone covers ~40% of the SMB MSP
|
|
||||||
market and is sufficient for first 5-10 customers.
|
|
||||||
- Cross-tenant pattern detection. The data-flywheel-across-shops play is at least
|
|
||||||
6 months out; building it before single-shop ROI is proven is premature.
|
|
||||||
|
|
||||||
## Success Criteria (revised for realism)
|
|
||||||
|
|
||||||
- **Week 3:** Approach A shipped. 3 MSPs in active free pilot (cap at 3 to
|
|
||||||
preserve B1 build capacity).
|
|
||||||
- **Weeks 3-6:** Pilot management dominates. B1 build is paused; founder runs
|
|
||||||
pilot calls, captures bug reports, iterates UX. Stripe seat-based billing is
|
|
||||||
set up in week 5.
|
|
||||||
- **Week 6:** First verbal commit from a pilot customer. Verified
|
|
||||||
minutes-recovered-per-escalation number from at least 2 pilots.
|
|
||||||
- **Week 8:** First paid customer (procurement cycles run 4-6 weeks even at small
|
|
||||||
MSPs; 2 weeks from verbal commit to signed contract is realistic). Time-saved
|
|
||||||
counter (Approach C) producing dashboard-quality data.
|
|
||||||
- **Week 11:** B1 (side panel + PSA notes) shipped. 3-5 paying customers. First
|
|
||||||
MSP-owner-led conversation. ConnectWise Marketplace listing live.
|
|
||||||
- **Quarter end:** $5K MRR or 10 paying customers, whichever comes first. Loom
|
|
||||||
demos posted publicly to r/msp and MSPGeek.
|
|
||||||
|
|
||||||
## Distribution Plan (week-by-week cadence)
|
|
||||||
|
|
||||||
- **Week 3:** Escalation Mode demo Loom posted. r/msp launch post.
|
|
||||||
- **Week 4:** MSPGeek Discord AMA scheduled. RocketMSP newsletter pitch sent.
|
|
||||||
- **Week 5:** ConnectWise Marketplace listing application submitted. Stripe
|
|
||||||
billing live for paid conversion.
|
|
||||||
- **Week 6:** First "guest on Inside MSP podcast" outreach. Second r/msp post
|
|
||||||
(case study from a pilot, anonymized).
|
|
||||||
- **Week 7-8:** Pilot conversion calls. First paying customer.
|
|
||||||
- **Week 9-11:** B1 ships. Owner-targeted demo Loom. Second podcast outreach.
|
|
||||||
|
|
||||||
**Founder-led pilot:** The first 3-5 customers come from the founder's existing
|
|
||||||
MSP network. Treat them as design partners; expect to ship feature requests
|
|
||||||
weekly during pilot. Cap at 3 active pilots until B1 ships.
|
|
||||||
|
|
||||||
**Tech audience channels:** r/msp, r/sysadmin, MSPGeek Discord, RocketMSP
|
|
||||||
newsletter, Inside MSP podcast.
|
|
||||||
**Owner audience channels:** ConnectWise Marketplace, MSP-focused Substacks,
|
|
||||||
RIA Vendor Roundup.
|
|
||||||
|
|
||||||
CI/CD: existing Railway auto-deploy via GitHub mirror. No new pipeline needed.
|
|
||||||
|
|
||||||
## Dependencies
|
|
||||||
|
|
||||||
- **Session-state serialization (Approach A blocker).** Schema design + migration
|
|
||||||
is the longest-lead engineering task. 3-5 days budget. Do this first.
|
|
||||||
- **Stripe seat-based billing (week 5 task).** No billing infrastructure exists
|
|
||||||
today. ~3-5 days of work for monthly subscriptions + invoice flow. Block on
|
|
||||||
this before week-8 first-paid milestone.
|
|
||||||
- **ConnectWise PSA integration depth.** Sufficient for ticket-notes auto-export
|
|
||||||
(Approach B1). Autotask and Halo wait until first 5 paying ConnectWise
|
|
||||||
customers.
|
|
||||||
- **Authentication.** Existing JWT + role hierarchy is sufficient for senior-tech
|
|
||||||
inbox view; no new auth work needed.
|
|
||||||
|
|
||||||
## Risks and Kill-Switch
|
|
||||||
|
|
||||||
- **Risk: Session-state serialization design churn.** If the schema needs to
|
|
||||||
change after pilot feedback, every saved session has to migrate. Mitigation:
|
|
||||||
keep schema versioned and forward-compatible from day 1.
|
|
||||||
- **Risk: Pilot-to-paid conversion slower than 4-6 weeks.** MSP procurement is
|
|
||||||
notoriously slow. Mitigation: get verbal commits in writing; price as
|
|
||||||
month-to-month with no annual contract to lower the buying friction.
|
|
||||||
- **Risk: ConnectWise ships an equivalent feature in their 2026.x release.**
|
|
||||||
Mitigation: lead the marketing on "we're independent of your PSA" — works with
|
|
||||||
any PSA, not just ConnectWise. The founder's PSA-agnostic FlowPilot is an
|
|
||||||
asset here.
|
|
||||||
- **Kill-switch criterion:** if 0 of 3 pilots produce a verifiable
|
|
||||||
hours-saved-per-week number above 1.0 by week 8, **revisit the wedge**. The
|
|
||||||
product may need to pivot to deterministic-ops territory (Read 1 from the
|
|
||||||
session) or be repositioned. Don't sink another quarter into the current GTM
|
|
||||||
story without this number.
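
The kill-switch criterion can be written down as a predicate so it is unambiguous at week 8. This is a sketch: `shouldRevisitWedge` is a hypothetical name, and the input is assumed to be one verified hours-saved-per-week figure per pilot (or `null` when no verifiable figure exists).

```typescript
// Returns true when no pilot has a verified hours-saved-per-week number above the threshold.
function shouldRevisitWedge(
  hoursSavedPerWeek: Array<number | null>,
  threshold: number = 1.0,
): boolean {
  const passing = hoursSavedPerWeek.filter(
    (h): h is number => h !== null && h > threshold,
  )
  return passing.length === 0
}
```

Note that a pilot at exactly 1.0 does not count, because the criterion says "above 1.0".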
|
|
||||||
|
|
||||||
## The Assignment
|
|
||||||
|
|
||||||
**This week, before any code:**
|
|
||||||
|
|
||||||
Time-track the next 5 escalations in your shop manually. For each, capture:
|
|
||||||
1. Time the senior tech opens the ticket
|
|
||||||
2. Time the senior tech takes their first diagnostic action (not counting the
|
|
||||||
verbal "tell me what you tried" warm-up)
|
|
||||||
3. The delta — that's the wasted time per escalation today
|
|
||||||
|
|
||||||
Average those 5 numbers. **That's the hero stat in your first sales conversation:**
|
|
||||||
"Senior techs at our shop wasted N minutes per escalation just getting up to
|
|
||||||
speed. We built the thing that takes that to zero."
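
The arithmetic behind the hero stat is a plain mean of the per-escalation deltas. A minimal sketch, assuming timestamps are recorded as minutes from any shared reference point; `TimedEscalation` and `averageWastedMinutes` are hypothetical names.

```typescript
// One manually timed escalation; both values are minutes from a shared reference point.
interface TimedEscalation {
  openedTicketAtMin: number
  firstDiagnosticActionAtMin: number
}

// Mean of (first diagnostic action - ticket opened) across the timed escalations.
function averageWastedMinutes(samples: TimedEscalation[]): number {
  if (samples.length === 0) throw new Error('need at least one timed escalation')
  const total = samples.reduce(
    (sum, s) => sum + (s.firstDiagnosticActionAtMin - s.openedTicketAtMin),
    0,
  )
  return total / samples.length
}
```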
|
|
||||||
|
|
||||||
Don't try to pull this from telemetry — the doc itself notes that retrieval/re-use
|
|
||||||
data isn't queryable yet. Manual stopwatch on the next 5 escalations is the
|
|
||||||
fastest path to a defensible number.
|
|
||||||
|
|
||||||
This is the assignment because it forces the GTM story into the same time-zone as
|
|
||||||
the build, and it's a one-day effort that compounds for every conversation
|
|
||||||
afterward.
|
|
||||||
|
|
||||||
## What I noticed about how you think
|
|
||||||
|
|
||||||
- You contradicted my framing twice in the same session and the second
|
|
||||||
contradiction was sharper than the first. Most founders agree with the
|
|
||||||
diagnostic and walk out with a polished version of what they came in with. You
|
|
||||||
said "I'm just questioning if flows are even the way to go" — and that
|
|
||||||
sentence reset the entire wedge. That's craft.
|
|
||||||
|
|
||||||
- "The senior engineer looking over your shoulder" came out of you spontaneously,
|
|
||||||
not as a prepared pitch. That's the line. Use it. It survives because it's
|
|
||||||
emotional truth (every junior tech has had this, every senior tech has been
|
|
||||||
this), not constructed marketing copy.
|
|
||||||
|
|
||||||
- You're solving your own problem with your own time. 20 hrs/week isn't a
|
|
||||||
hypothetical user pain — it's your Tuesday. Founders who solve their own pain
|
|
||||||
ship sharper products because the feedback loop is instant.
|
|
||||||
|
|
||||||
- The escalation feature emerged from your description, not mine. I was busy
|
|
||||||
cataloging documentation pains. You said "junior to senior escalation? no
|
|
||||||
worries there either" almost as an afterthought. That afterthought is the wedge.
|
|
||||||
Pay attention to which features you describe casually versus which you push hard
|
|
||||||
on — the casual ones are sometimes where the truth lives.
|
|
||||||
|
|
||||||
## GSTACK REVIEW REPORT
|
|
||||||
|
|
||||||
| Review | Trigger | Why | Runs | Status | Findings |
|
|
||||||
|--------|---------|-----|------|--------|----------|
|
|
||||||
| CEO Review | `/plan-ceo-review` | Scope & strategy | 0 | — | not run |
|
|
||||||
| Codex Review | `/codex review` | Independent 2nd opinion | 1 | INFO | 12 findings, 6 applied, 1 partial, 5 rejected |
|
|
||||||
| Eng Review | `/plan-eng-review` | Architecture & tests (required) | 1 | CLEAR (PLAN) | 2 issues, 0 critical gaps, scope reduced |
|
|
||||||
| Design Review | `/plan-design-review` | UI/UX gaps | 1 | CLEAR (FULL) | score 6/10 → 9/10, 8 decisions |
|
|
||||||
| DX Review | `/plan-devex-review` | Developer experience gaps | 0 | — | not run |
|
|
||||||
|
|
||||||
- **CODEX:** 12 findings reviewed. Applied: 2-metric framing (#2), notification routing spec (#3), per-channel delivery model (#4), unread-state fix (#11), Start-here CTA reframe (#9), claim role gate moved in-scope (#8). Rejected: full scope reduction to PSA-brief-only (#6/7/12 — user kept queue UI as demo hero). Partial: scope concern (#5) acknowledged in eng review's email-first/polling-fallback. Misread: #1, #10.
|
|
||||||
- **CROSS-MODEL:** Claude (eng + design reviews) and Codex agree on 6/12 findings. The major disagreement was scope — Codex argued for cutting the queue UI, user rejected. Both agree on metric definition, notification routing, claim auth gating.
|
|
||||||
- **UNRESOLVED:** 0
|
|
||||||
- **VERDICT:** ENG + DESIGN CLEARED, CODEX REVIEWED — ready to implement.
|
|
||||||
@@ -1,33 +0,0 @@
|
|||||||
# Test Plan
|
|
||||||
Generated by /plan-eng-review on 2026-04-27
|
|
||||||
Branch: main
|
|
||||||
Repo: chihlasm/resolutionflow
|
|
||||||
|
|
||||||
## Affected Pages/Routes
|
|
||||||
|
|
||||||
- `/escalations` ([EscalationQueuePage.tsx](frontend/src/pages/EscalationQueuePage.tsx)) — senior-tech inbox view; verify queue list, real-time arrival, click-through
|
|
||||||
- `/pilot/:session_id` (FlowPilotSessionPage) — verify post-claim load shows full escalation context (snapshot, ai_assessment, escalation_package)
|
|
||||||
- `GET /api/v1/analytics/escalation-metrics` (NEW) — verify hero metric calculation, account-scoping, role gate
|
|
||||||
|
|
||||||
## Key Interactions to Verify
|
|
||||||
|
|
||||||
- Junior tech clicks **Escalate** in active FlowPilot session → handoff is created → notification fires → senior sees escalation in queue within 30 seconds
|
|
||||||
- Senior tech clicks **Claim** in queue → session reactivates → senior is redirected into FlowPilot session view → ai_assessment + snapshot are visible
|
|
||||||
- Senior types first message in chat after claim → metric query starts attributing time-to-first-action
|
|
||||||
- MSP owner opens analytics page → "minutes recovered per escalation" widget shows current month's rolling average
|
|
||||||
|
|
||||||
## Edge Cases
|
|
||||||
|
|
||||||
- **Two seniors race to claim** the same handoff → one wins, the other gets an "Already claimed by [name]" message
|
|
||||||
- **Senior is offline** when escalation fires → email arrives via existing `EmailService.send_notification_email`
|
|
||||||
- **WebSocket disconnects mid-session** → frontend reconnects; missed events backfilled by re-fetching the queue
|
|
||||||
- **Notification dispatch raises** (SMTP down, WebSocket fanout fails) → handoff is still created (graceful degradation)
|
|
||||||
- **Senior takes non-chat action first** (e.g., posts directly to PSA) → metric falls back to PSA writeback timestamp or remains null; document the chosen behavior
|
|
||||||
- **Account-scoped multi-tenancy** → senior at MSP A cannot see escalations from MSP B (Phase 4 RLS)
|
|
||||||
- **Role gate on metric endpoint** → only `engineer_or_admin` can hit `/escalation-metrics`
|
|
||||||
|
|
||||||
## Critical Paths
|
|
||||||
|
|
||||||
1. **Magic-moment demo flow** (the entire Loom): junior escalate → senior notification → senior claim → session view → first action recorded → metric updates
|
|
||||||
2. **Email fallback** when senior is offline — must not silently drop
|
|
||||||
3. **Regression: handoff creation succeeds even if notification dispatch raises** — graceful degradation is mandatory
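
The graceful-degradation path in item 3 can be sketched as follows. This is an illustration only: `persist` and `notify` are hypothetical injected callbacks standing in for the real persistence and dispatch code, which this plan does not show.

```typescript
interface CreatedHandoff {
  id: string
  notificationError: string | null
}

// Creates the handoff first, then attempts notification.
// A dispatch failure is recorded on the result and never rethrown.
async function createHandoffWithNotify(
  persist: () => Promise<string>,
  notify: (handoffId: string) => Promise<void>,
): Promise<CreatedHandoff> {
  // A persistence failure still propagates; only dispatch is allowed to fail softly.
  const id = await persist()
  try {
    await notify(id)
    return { id, notificationError: null }
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err)
    return { id, notificationError: message }
  }
}
```

A regression test would pass a `notify` that throws and assert that the returned handoff id is still present.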
|
|
||||||
@@ -1,111 +0,0 @@
|
|||||||
import { expect, test } from '@playwright/test'
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Regression test for the prefill-handoff `currentChatRef` bug.
|
|
||||||
*
|
|
||||||
* Symptom: a chat session created via the dashboard prefill flow
|
|
||||||
* looked fine on the first AI turn, but submitting partial answers
|
|
||||||
* from the task lane silently dropped the AI's follow-up response.
|
|
||||||
* The user saw their answers in the chat, no assistant reply, no
|
|
||||||
* toast.
|
|
||||||
*
|
|
||||||
* Root cause: the prefill effect in `AssistantChatPage` set
|
|
||||||
* `activeChatId` without also updating `currentChatRef.current`, so
|
|
||||||
* the `currentChatRef.current !== sentForChatId` guard in
|
|
||||||
* `handleTaskSubmit` (and `handleSend`) tripped on every subsequent
|
|
||||||
* request and discarded the AI response.
|
|
||||||
*
|
|
||||||
* Strategy: drive the real prefill flow against the real backend, but
|
|
||||||
* intercept the `/chat` endpoint with `page.route` so we get
|
|
||||||
* deterministic question payloads on turn 1 and a deterministic
|
|
||||||
* follow-up on turn 2. The fix is what makes turn 2 visible.
|
|
||||||
*/
|
|
||||||
test.describe('AssistantChatPage — prefill handoff regression', () => {
|
|
||||||
test('AI follow-up renders after submitting partial task lane answers', async ({ page }) => {
|
|
||||||
let chatCallCount = 0
|
|
||||||
|
|
||||||
// Clear any persisted active-chat-id so the page does not auto-resume a
|
|
||||||
// stale session left behind by a sibling spec.
|
|
||||||
await page.addInitScript(() => {
|
|
||||||
try {
|
|
||||||
sessionStorage.removeItem('rf-active-chat-id')
|
|
||||||
sessionStorage.removeItem('rf-tasklane-meta')
|
|
||||||
} catch { /* ignore */ }
|
|
||||||
})
|
|
||||||
|
|
||||||
// Intercept only the chat endpoint. Session creation, listSessions,
|
|
||||||
// facts, suggested-fixes, etc. all hit the real backend so the page
|
|
||||||
// renders normally — only the LLM call is deterministic. The pattern
|
|
||||||
// matches `/ai-sessions/<uuid>/chat` and nothing nested beneath it.
|
|
||||||
await page.route(/\/api\/v1\/ai-sessions\/[^/]+\/chat$/, async (route) => {
|
|
||||||
if (route.request().method() !== 'POST') {
|
|
||||||
await route.fallback()
|
|
||||||
return
|
|
||||||
}
|
|
||||||
chatCallCount += 1
|
|
||||||
if (chatCallCount === 1) {
|
|
||||||
await route.fulfill({
|
|
||||||
status: 200,
|
|
||||||
contentType: 'application/json',
|
|
||||||
body: JSON.stringify({
|
|
||||||
content: 'Initial diagnostic plan. Please answer the questions in the task lane.',
|
|
||||||
suggested_flows: [],
|
|
||||||
fork: null,
|
|
||||||
actions: [],
|
|
||||||
questions: [
|
|
||||||
{ text: 'Has the user recently changed their password?' },
|
|
||||||
{ text: 'Is the lockout happening at a consistent time of day?' },
|
|
||||||
],
|
|
||||||
}),
|
|
||||||
})
|
|
||||||
return
|
|
||||||
}
|
|
||||||
await route.fulfill({
|
|
||||||
status: 200,
|
|
||||||
contentType: 'application/json',
|
|
||||||
body: JSON.stringify({
|
|
||||||
content: 'Got it — based on your answer, here is what to check next.',
|
|
||||||
suggested_flows: [],
|
|
||||||
fork: null,
|
|
||||||
actions: [],
|
|
||||||
questions: [],
|
|
||||||
}),
|
|
||||||
})
|
|
||||||
})
|
|
||||||
|
|
||||||
// Drive the prefill flow exactly the way the dashboard does. The textarea
|
|
||||||
// is keyed by its placeholder copy on QuickStartPage.
|
|
||||||
await page.goto('/')
|
|
||||||
const prefillBox = page.getByPlaceholder(/Describe the issue/i)
|
|
||||||
await expect(prefillBox).toBeVisible({ timeout: 10_000 })
|
|
||||||
await prefillBox.fill('User locked out of AD weekly')
|
|
||||||
await prefillBox.press('Enter')
|
|
||||||
|
|
||||||
// After the prefill submits we land on /pilot and the first stubbed AI
|
|
||||||
// turn surfaces the task-lane question text.
|
|
||||||
await expect(page).toHaveURL(/\/pilot/)
|
|
||||||
await expect(
|
|
||||||
page.getByText('Has the user recently changed their password?'),
|
|
||||||
).toBeVisible({ timeout: 15_000 })
|
|
||||||
|
|
||||||
// Answer the first question. UI flow: click "Answer" to open the
|
|
||||||
// textarea, type, click the inline "Answer" button to mark done.
|
|
||||||
await page.getByRole('button', { name: /^Answer$/ }).first().click()
|
|
||||||
await page.getByPlaceholder('Type your answer...').fill('No, password is months old')
|
|
||||||
await page.getByRole('button', { name: /^Answer$/ }).first().click()
|
|
||||||
|
|
||||||
// Submit the partial response. Pre-fix: the response was silently dropped
|
|
||||||
// here because `currentChatRef.current` still held the mount-time value.
|
|
||||||
await page.getByRole('button', { name: /Send 1 of 2 Responses/ }).click()
|
|
||||||
|
|
||||||
// Bug repro: the assistant message must render. Pre-fix this assertion
|
|
||||||
// fails because `handleTaskSubmit` early-returns at the
|
|
||||||
// `currentChatRef.current !== sentForChatId` guard.
|
|
||||||
await expect(
|
|
||||||
page.getByText('Got it — based on your answer, here is what to check next.'),
|
|
||||||
).toBeVisible({ timeout: 15_000 })
|
|
||||||
|
|
||||||
// Both chat calls must have actually happened.
|
|
||||||
expect(chatCallCount).toBe(2)
|
|
||||||
})
|
|
||||||
})
|
|
||||||
@@ -34,11 +34,7 @@ test.describe('session history smoke tests', () => {
|
|||||||
await page.getByPlaceholder('Search by ticket number...').fill(ticketNumber)
|
await page.getByPlaceholder('Search by ticket number...').fill(ticketNumber)
|
||||||
await page.getByPlaceholder('Search by client name...').fill(clientName)
|
await page.getByPlaceholder('Search by client name...').fill(clientName)
|
||||||
|
|
||||||
const sessionCard = page
|
const sessionCard = page.locator('.bg-card').filter({ hasText: ticketNumber }).filter({ hasText: clientName }).first()
|
||||||
.getByTestId('flow-session-card')
|
|
||||||
.filter({ hasText: ticketNumber })
|
|
||||||
.filter({ hasText: clientName })
|
|
||||||
.first()
|
|
||||||
await expect(sessionCard).toBeVisible()
|
await expect(sessionCard).toBeVisible()
|
||||||
await expect(sessionCard.getByText(tree.name)).toBeVisible()
|
await expect(sessionCard.getByText(tree.name)).toBeVisible()
|
||||||
|
|
||||||
|
|||||||
@@ -24,7 +24,7 @@ test.describe('flow library start-session smoke tests', () => {
|
|||||||
await page.getByPlaceholder('Search flows...').fill(tree.name)
|
await page.getByPlaceholder('Search flows...').fill(tree.name)
|
||||||
await page.getByRole('button', { name: 'Search', exact: true }).click()
|
await page.getByRole('button', { name: 'Search', exact: true }).click()
|
||||||
|
|
||||||
const treeCard = page.getByTestId('tree-card').filter({ hasText: tree.name }).first()
|
const treeCard = page.locator('.bg-card').filter({ hasText: tree.name }).first()
|
||||||
await expect(treeCard).toBeVisible()
|
await expect(treeCard).toBeVisible()
|
||||||
await treeCard.getByRole('button', { name: /^Start(?: Session)?$/ }).click()
|
await treeCard.getByRole('button', { name: /^Start(?: Session)?$/ }).click()
|
||||||
|
|
||||||
|
|||||||
@@ -20,7 +20,7 @@ test.describe('flow library smoke tests', () => {
|
|||||||
await page.getByPlaceholder('Search flows...').fill(tree.name)
|
await page.getByPlaceholder('Search flows...').fill(tree.name)
|
||||||
await page.getByRole('button', { name: 'Search', exact: true }).click()
|
await page.getByRole('button', { name: 'Search', exact: true }).click()
|
||||||
|
|
||||||
await expect(page.getByTestId('tree-card').filter({ hasText: tree.name }).first()).toBeVisible()
|
await expect(page.getByText(tree.name)).toBeVisible()
|
||||||
} finally {
|
} finally {
|
||||||
await disposeApiContext(api)
|
await disposeApiContext(api)
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -28,7 +28,7 @@ test.describe('session resume smoke tests', () => {
|
|||||||
await page.getByRole('button', { name: 'Flow Sessions' }).click()
|
await page.getByRole('button', { name: 'Flow Sessions' }).click()
|
||||||
// Active sub-tab is the default and surfaces in-progress sessions.
|
// Active sub-tab is the default and surfaces in-progress sessions.
|
||||||
|
|
||||||
const resumeCard = page.getByTestId('flow-session-card').filter({ hasText: tree.name }).first()
|
const resumeCard = page.locator('.bg-card').filter({ hasText: tree.name }).first()
|
||||||
await expect(resumeCard).toBeVisible()
|
await expect(resumeCard).toBeVisible()
|
||||||
await resumeCard.getByRole('button', { name: 'Resume' }).first().click()
|
await resumeCard.getByRole('button', { name: 'Resume' }).first().click()
|
||||||
|
|
||||||
|
|||||||
@@ -31,7 +31,7 @@ test.describe('shared session management smoke tests', () => {
|
|||||||
).toBeVisible()
|
).toBeVisible()
|
||||||
await expect(page.getByText(share.share_name || '')).toBeVisible()
|
await expect(page.getByText(share.share_name || '')).toBeVisible()
|
||||||
|
|
||||||
const shareCard = page.getByTestId('share-card').filter({ hasText: share.share_name || '' }).first()
|
const shareCard = page.locator('.bg-card').filter({ hasText: share.share_name || '' }).first()
|
||||||
await shareCard.getByRole('button', { name: 'Revoke' }).click()
|
await shareCard.getByRole('button', { name: 'Revoke' }).click()
|
||||||
|
|
||||||
const confirmDialog = page.getByRole('dialog', { name: 'Revoke Share Link' })
|
const confirmDialog = page.getByRole('dialog', { name: 'Revoke Share Link' })
|
||||||
|
|||||||
@@ -1,12 +1,5 @@
|
|||||||
import apiClient from './client'
|
import apiClient from './client'
|
||||||
import type {
|
import type { FlowPilotDashboard, KnowledgeGapReport, CoverageResponse, FlowQualityResponse, EnhancedPsaMetrics } from '@/types/flowpilot-analytics'
|
||||||
FlowPilotDashboard,
|
|
||||||
KnowledgeGapReport,
|
|
||||||
CoverageResponse,
|
|
||||||
FlowQualityResponse,
|
|
||||||
EnhancedPsaMetrics,
|
|
||||||
EscalationMetrics,
|
|
||||||
} from '@/types/flowpilot-analytics'
|
|
||||||
|
|
||||||
export const flowpilotAnalyticsApi = {
|
export const flowpilotAnalyticsApi = {
|
||||||
async getDashboard(period: string = '30d'): Promise<FlowPilotDashboard> {
|
async getDashboard(period: string = '30d'): Promise<FlowPilotDashboard> {
|
||||||
@@ -43,13 +36,6 @@ export const flowpilotAnalyticsApi = {
|
|||||||
})
|
})
|
||||||
return response.data
|
return response.data
|
||||||
},
|
},
|
||||||
|
|
||||||
async getEscalationMetrics(period: string = '30d'): Promise<EscalationMetrics> {
|
|
||||||
const response = await apiClient.get<EscalationMetrics>('/analytics/flowpilot/escalations', {
|
|
||||||
params: { period },
|
|
||||||
})
|
|
||||||
return response.data
|
|
||||||
},
|
|
||||||
}
|
}
|
||||||
|
|
||||||
export default flowpilotAnalyticsApi
|
export default flowpilotAnalyticsApi
|
||||||
|
|||||||
@@ -1,130 +0,0 @@
|
|||||||
import { useEffect, useState } from 'react'
|
|
||||||
import { Clock, TrendingUp, AlertCircle } from 'lucide-react'
|
|
||||||
import { flowpilotAnalyticsApi } from '@/api'
|
|
||||||
import type { EscalationMetrics } from '@/types/flowpilot-analytics'
|
|
||||||
|
|
||||||
interface EscalationMetricCardProps {
|
|
||||||
period?: string
|
|
||||||
}
|
|
||||||
|
|
||||||
function formatSeconds(s: number | null): string {
|
|
||||||
if (s === null) return '—'
|
|
||||||
if (s < 60) return `${Math.round(s)}s`
|
|
||||||
const mins = s / 60
|
|
||||||
if (mins < 10) return `${mins.toFixed(1)} min`
|
|
||||||
return `${Math.round(mins)} min`
|
|
||||||
}
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Shows the in-product time-to-first-action metric above the EscalationQueue.
|
|
||||||
*
|
|
||||||
- * NOTE: this is the in-product metric only. The "minutes recovered" sales
- * claim requires a manual baseline measurement (see The Assignment in
- * docs/plans/2026-04-27-escalation-mode-wedge-design.md). Frame the number
- * as "time-to-first-action with structured handoff," not "minutes saved."
- */
-export function EscalationMetricCard({ period = '30d' }: EscalationMetricCardProps) {
-  const [metrics, setMetrics] = useState<EscalationMetrics | null>(null)
-  const [error, setError] = useState<string | null>(null)
-  const [isLoading, setIsLoading] = useState(true)
-
-  useEffect(() => {
-    let cancelled = false
-
-    const load = async () => {
-      setIsLoading(true)
-      setError(null)
-      try {
-        const data = await flowpilotAnalyticsApi.getEscalationMetrics(period)
-        if (!cancelled) setMetrics(data)
-      } catch {
-        if (!cancelled) setError('Failed to load metric')
-      } finally {
-        if (!cancelled) setIsLoading(false)
-      }
-    }
-    load()
-    return () => {
-      cancelled = true
-    }
-  }, [period])
-
-  if (isLoading) {
-    return (
-      <div className="card-flat p-4 mb-4 animate-pulse">
-        <div className="h-4 w-32 bg-elevated rounded" />
-      </div>
-    )
-  }
-
-  if (error) {
-    return (
-      <div className="card-flat p-4 mb-4 flex items-center gap-2 text-sm text-muted-foreground">
-        <AlertCircle size={14} />
-        <span>{error}</span>
-      </div>
-    )
-  }
-
-  if (!metrics || metrics.n_handoffs_claimed === 0) {
-    return (
-      <div className="card-flat p-4 mb-4">
-        <p className="text-xs uppercase tracking-wider text-muted-foreground">
-          Time to first action ({period})
-        </p>
-        <p className="mt-1 text-sm text-muted-foreground">
-          No claimed escalations yet. Once your team starts using Pick Up,
-          we'll measure how fast they get into resolution.
-        </p>
-      </div>
-    )
-  }
-
-  const avgLabel = formatSeconds(metrics.avg_seconds_to_first_action)
-  const medianLabel = formatSeconds(metrics.median_seconds_to_first_action)
-  const conversionRate =
-    metrics.n_handoffs_claimed > 0
-      ? Math.round(
-          (metrics.n_handoffs_with_action / metrics.n_handoffs_claimed) * 100,
-        )
-      : 0
-
-  return (
-    <div className="card-flat p-4 mb-4">
-      <div className="flex items-center gap-2 text-xs uppercase tracking-wider text-muted-foreground">
-        <TrendingUp size={12} />
-        <span>Time to first action — last {period}</span>
-      </div>
-
-      <div className="mt-2 flex flex-wrap items-baseline gap-x-6 gap-y-2">
-        <div>
-          <span className="font-heading text-2xl font-bold text-foreground">
-            {avgLabel}
-          </span>
-          <span className="ml-1 text-xs text-muted-foreground">avg</span>
-        </div>
-        <div className="text-sm text-muted-foreground">
-          <span className="font-medium text-foreground">{medianLabel}</span> median
-        </div>
-        <div className="text-sm text-muted-foreground">
-          <span className="font-medium text-foreground">
-            {metrics.n_handoffs_with_action}
-          </span>
-          /{metrics.n_handoffs_claimed} claimed escalations
-          <span className="ml-1 text-muted-foreground/70">
-            ({conversionRate}% reached first action)
-          </span>
-        </div>
-      </div>
-
-      <p className="mt-2 flex items-start gap-1.5 text-[0.6875rem] text-muted-foreground">
-        <Clock size={10} className="mt-0.5 flex-none" />
-        <span>
-          In-product measurement only. The savings claim requires a manual
-          baseline of pre-Escalation-Mode handoff time. See your team's
-          Assignment for the baseline number.
-        </span>
-      </p>
-    </div>
-  )
-}
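The removed card calls `formatSeconds` on values typed `number | null`, but the helper itself is defined outside this excerpt. As a rough illustration only, here is a minimal sketch of what such a formatter and the card's conversion-rate arithmetic might look like. The body of `formatSeconds`, its `'n/a'` fallback, and the standalone `conversionRate` function are assumptions made for this example, not the project's actual code.

```typescript
// Hypothetical sketch: the real formatSeconds lives outside this excerpt.
// Formats a duration in seconds as a short label, tolerating null values
// (the metric fields are typed `number | null`).
function formatSeconds(seconds: number | null): string {
  if (seconds === null || !Number.isFinite(seconds) || seconds < 0) return 'n/a'
  const total = Math.round(seconds)
  if (total < 60) return `${total}s`
  const minutes = Math.floor(total / 60)
  const restSeconds = total % 60
  if (minutes < 60) return restSeconds === 0 ? `${minutes}m` : `${minutes}m ${restSeconds}s`
  const hours = Math.floor(minutes / 60)
  const restMinutes = minutes % 60
  return restMinutes === 0 ? `${hours}h` : `${hours}h ${restMinutes}m`
}

// Same arithmetic as the card: share of claimed handoffs that reached a
// first action, rounded to a whole percent, with a guard against 0 claims.
function conversionRate(claimed: number, withAction: number): number {
  return claimed > 0 ? Math.round((withAction / claimed) * 100) : 0
}

console.log(formatSeconds(90)) // 1m 30s
console.log(conversionRate(8, 6)) // 75
```

Returning a placeholder string for `null` keeps the card renderable when the API reports claimed handoffs but none has reached a first action yet.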
@@ -9,7 +9,6 @@ export { AISessionListItem } from './AISessionListItem'
 export { SessionTicketCard } from './SessionTicketCard'
 export { EscalateModal } from './EscalateModal'
 export { EscalationQueue } from './EscalationQueue'
-export { EscalationMetricCard } from './EscalationMetricCard'
 export { SessionBriefing } from './SessionBriefing'
 export { ProposalCard } from './ProposalCard'
 export { ProposalDetail } from './ProposalDetail'
@@ -34,8 +34,6 @@ export function TreeGridView({
       {trees.map((tree) => (
         <div
           key={tree.id}
-          data-testid="tree-card"
-          data-tree-id={tree.id}
           className="relative bg-card border border-border rounded-2xl p-4 transition-all hover:-translate-y-0.5 hover:border-primary/30 hover:shadow-md sm:p-6"
         >
           <div className="mb-2 flex items-start justify-between gap-2">
@@ -33,8 +33,6 @@ export function TreeListView({
       {trees.map((tree) => (
         <div
           key={tree.id}
-          data-testid="tree-card"
-          data-tree-id={tree.id}
           className="flex items-center gap-4 bg-card border border-border rounded-2xl p-4 transition-all hover:border-primary/30 hover:shadow-xs"
         >
           {/* Left: Name and Description */}
@@ -255,12 +255,6 @@ export default function AssistantChatPage() {
       }
       setChats(prev => [chatItem, ...prev])
       setActiveChatId(session.session_id)
-      // Keep the in-flight guard ref in sync. Without this, currentChatRef
-      // stays at its mount-time value (often a stale id from sessionStorage
-      // or null), so subsequent handleSend / handleTaskSubmit calls bail at
-      // their `currentChatRef.current !== sentForChatId` check and the AI
-      // response is silently dropped.
-      currentChatRef.current = session.session_id
       setMessages([{ role: 'user', content: prefill }])
       setLoading(true)
 
@@ -1,6 +1,6 @@
 import { useState } from 'react'
 import { AlertTriangle } from 'lucide-react'
-import { EscalationQueue, EscalationMetricCard } from '@/components/flowpilot'
+import { EscalationQueue } from '@/components/flowpilot'
 
 export default function EscalationQueuePage() {
   const [count, setCount] = useState<number | null>(null)
@@ -21,8 +21,6 @@ export default function EscalationQueuePage() {
         </div>
       </div>
 
-      <EscalationMetricCard period="30d" />
-
       <EscalationQueue onCountChange={setCount} />
     </div>
   )
@@ -161,12 +161,7 @@ export default function MySharesPage() {
           const isCopied = copiedId === share.id
 
           return (
-            <div
-              key={share.id}
-              data-testid="share-card"
-              data-share-id={share.id}
-              className="bg-card border border-border rounded-xl p-5"
-            >
+            <div key={share.id} className="bg-card border border-border rounded-xl p-5">
               {/* Top row: badge + name */}
               <div className="flex items-center gap-3 mb-3">
                 <span className="inline-flex items-center gap-1.5 text-xs rounded-full px-2 py-0.5 bg-accent text-muted-foreground">
@@ -533,11 +533,7 @@ export default function SessionHistoryPage() {
                 )}
                 style={{ '--stagger-index': i } as React.CSSProperties}
               >
-                <div
-                  data-testid="flow-session-card"
-                  data-session-id={session.id}
-                  className="bg-card border border-border rounded-xl p-4 transition-all hover:border-[var(--color-border-hover)]"
-                >
+                <div className="bg-card border border-border rounded-xl p-4 transition-all hover:border-[var(--color-border-hover)]">
                   <div className="flex flex-col gap-3 sm:flex-row sm:items-start sm:justify-between">
                     <div className="flex-1">
                       <div className="flex flex-wrap items-center gap-2">
@@ -134,16 +134,3 @@ export interface EnhancedPsaMetrics {
   push_funnel: PsaFunnel
   daily_trend: PsaDailyTrend[]
 }
-
-// Escalation Mode wedge metric — in-product time-to-first-action.
-// Pair with a manual baseline measurement for the savings claim.
-// See docs/plans/2026-04-27-escalation-mode-wedge-design.md.
-export interface EscalationMetrics {
-  period: string
-  n_handoffs_claimed: number
-  n_handoffs_with_action: number
-  avg_seconds_to_first_action: number | null
-  median_seconds_to_first_action: number | null
-  p95_seconds_to_first_action: number | null
-  metric_definition: string
-}
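The `EscalationMetrics` shape above carries average, median, and p95 fields that are nullable, which suggests they are undefined when no claimed handoff has reached a first action. The sketch below shows one way such a summary could be derived from raw durations. It is an illustration under stated assumptions: the `summarizeEscalations` function, the nearest-rank percentile method, and the `metric_definition` wording are all hypothetical, and the actual backend computation is not shown in this diff.

```typescript
// Interface copied from the diff above.
interface EscalationMetrics {
  period: string
  n_handoffs_claimed: number
  n_handoffs_with_action: number
  avg_seconds_to_first_action: number | null
  median_seconds_to_first_action: number | null
  p95_seconds_to_first_action: number | null
  metric_definition: string
}

// Hypothetical summarizer: one duration (claim -> first action, in seconds)
// per claimed handoff that reached an action. Not the project's real code.
function summarizeEscalations(
  period: string,
  nClaimed: number,
  secondsToFirstAction: number[],
): EscalationMetrics {
  const sorted = [...secondsToFirstAction].sort((a, b) => a - b)
  const n = sorted.length

  const avg = n === 0 ? null : sorted.reduce((sum, x) => sum + x, 0) / n
  const median =
    n === 0
      ? null
      : n % 2 === 1
        ? sorted[(n - 1) / 2]
        : (sorted[n / 2 - 1] + sorted[n / 2]) / 2
  // Nearest-rank percentile (an assumption; other definitions interpolate).
  const p95 = n === 0 ? null : sorted[Math.min(n - 1, Math.ceil(0.95 * n) - 1)]

  return {
    period,
    n_handoffs_claimed: nClaimed,
    n_handoffs_with_action: n,
    avg_seconds_to_first_action: avg,
    median_seconds_to_first_action: median,
    p95_seconds_to_first_action: p95,
    metric_definition: 'seconds from claim to first action (illustrative wording)',
  }
}

const summary = summarizeEscalations('30d', 5, [40, 10, 30, 20])
console.log(summary.avg_seconds_to_first_action) // 25
console.log(summary.median_seconds_to_first_action) // 25
console.log(summary.p95_seconds_to_first_action) // 40
```

Keeping the three statistics `null` for an empty sample, rather than `0`, lets a consumer distinguish "no data yet" from "instant first action".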