docs(plans): add escalation-mode wedge design + test plan
Captures the GTM thesis, premises, reduced-scope engineering plan, locked UI specs, and embedded review report for the Escalation Mode wedge — output of /office-hours, /plan-eng-review, /plan-design-review, and /codex review. Codex review surfaced two corrections we applied: - two-metric framing (manual baseline vs in-product time-to-first-action) - claim role gate moved in-scope (was deferred TODO) TODO updates: peer-tech escalation + claim role gate captured (the latter then moved in-scope by the codex pass). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
494
docs/plans/2026-04-27-escalation-mode-wedge-design.md
Normal file
494
docs/plans/2026-04-27-escalation-mode-wedge-design.md
Normal file
@@ -0,0 +1,494 @@
|
||||
# Design: ResolutionFlow GTM — Escalation-Mode-First Wedge
|
||||
|
||||
Generated by /office-hours on 2026-04-26
|
||||
Branch: main
|
||||
Repo: chihlasm/resolutionflow
|
||||
Status: APPROVED
|
||||
Mode: Startup
|
||||
|
||||
## Problem Statement
|
||||
|
||||
ResolutionFlow is a multi-tenant SaaS troubleshooting platform for MSPs, currently
|
||||
in Go-to-Market Validation (pre-PMF). The backend is feature-complete (55+ endpoints,
|
||||
100+ tests, FlowPilot telemetry baseline accruing). The product has users but no
|
||||
paying customers.
|
||||
|
||||
The blocker is not engineering completeness. The blocker is the absence of a sharp
|
||||
GTM story tied to a number a buyer can verify. The session reframed the wedge twice
|
||||
before landing on the real one.
|
||||
|
||||
**What ResolutionFlow actually is:** the structuring layer between conversational AI
|
||||
and the way MSP techs work tickets. AI is great at producing answers; it is bad at
|
||||
producing workflow-shaped output. ResolutionFlow gives the tech the AI they already
|
||||
trust (Claude/GPT) but organizes the output into actionable structured steps,
|
||||
records the session, captures customer-specific context, and turns the result into
|
||||
PSA-formatted ticket notes — and optionally a runbook — without the tech writing
|
||||
anything.
|
||||
|
||||
**Positioning line:** "the senior engineer looking over your shoulder."
|
||||
|
||||
## Demand Evidence
|
||||
|
||||
The founder is the first user. Senior Systems Engineer at an MSP, losing ~20
|
||||
hours/week to cross-domain interruptions (systems engineer pulled into networking
|
||||
problems and vice versa). At least 4 interruptions per day, with the time cost
|
||||
concentrated in the gap between AI-conversation output and MSP-ticket workflow.
|
||||
|
||||
This is solving-your-own-problem demand evidence — strongest possible signal at
|
||||
this stage. The 20 hrs/week figure is the founder's own time, not a hypothetical.
|
||||
Every MSP shop with a senior tech and a junior tech has a version of this problem.
|
||||
|
||||
Telemetry signal (Phase 0.5 baseline accruing): captured flows pile up but are not
|
||||
being re-used. This says capture works, retrieval doesn't — which means the
|
||||
"hours-saved-via-re-use" number isn't yet generatable from existing data. The
|
||||
GTM-grade ROI story needs a different metric until re-use lands: minutes recovered
|
||||
per escalation, generated by Approach A below.
|
||||
|
||||
## Status Quo
|
||||
|
||||
MSP techs today resolve tickets via three workarounds:
|
||||
|
||||
1. **AI in a tab.** Junior tech opens Claude or ChatGPT, pastes the problem, gets a
|
||||
wall of prose, parses it into action items in their head, executes, repeats. AI
|
||||
does the diagnostic work. The tech does all the structure-extraction and
|
||||
ticket-note-writing afterward.
|
||||
|
||||
2. **Tribal knowledge.** Junior tech pings senior in Slack. Senior tech is
|
||||
interrupted (4+ times/day per the founder's own data). Context handoff is verbal
|
||||
and lossy.
|
||||
|
||||
3. **Stale runbooks.** Half-maintained Notion / IT Glue / SharePoint pages that
|
||||
nobody trusts because they're 18 months out of date and don't match the current
|
||||
customer environment.
|
||||
|
||||
The cost of these workarounds for the founder personally: ~20 hours per week of
|
||||
senior-tech time lost. For a 5-tech MSP, the equivalent is 1 full FTE worth of
|
||||
senior-engineer hours leaking into context-switching and tab-hopping.
|
||||
|
||||
## Target User & Narrowest Wedge
|
||||
|
||||
**Target user:** Senior Systems Engineer at a small-to-mid MSP (5-20 techs). The
|
||||
founder is exemplar #1. Buying authority is shared between senior tech (champion)
|
||||
and MSP owner (signs the check).
|
||||
|
||||
**Narrowest paid wedge:** Escalation Mode. Single sharp feature. When a junior tech
|
||||
escalates a ticket they were working in FlowPilot, the senior tech opens the ticket
|
||||
and sees the entire structured session state — every step the junior tried, every
|
||||
dead end, every command output — instead of starting with "tell me what you tried"
|
||||
for five minutes.
|
||||
|
||||
Why this is the wedge:
|
||||
|
||||
- **Two metrics, not one** (revised after /codex review 2026-04-27):
|
||||
- **Manual baseline** (the Assignment, weeks 0-2): senior tech stopwatches the
|
||||
next 5 escalations. T1 (first diagnostic action) − T0 (open ticket) under
|
||||
today's verbal-handoff workflow. This is the "what you currently lose" number.
|
||||
- **In-product metric** (telemetry, week 3+): time-to-first-action after claim,
|
||||
derived from `ai_session_step` rows where `created_at > SessionHandoff.claimed_at`
|
||||
AND `user_id = SessionHandoff.claimed_by`. This is the "what it is now with
|
||||
structured handoff" number.
|
||||
- **The savings claim** = manual baseline − in-product metric. Quote both
|
||||
explicitly in pilot conversations. Do NOT roll the in-product number alone
|
||||
into "minutes recovered" — that's an apples-to-oranges miscount Codex caught
|
||||
in the cross-model review.
|
||||
- **Single-feature demo:** a 2-minute Loom shows the magic moment — junior hits
|
||||
escalate, senior window opens with full structured context. No theory required.
|
||||
- **Cross-buyer story:** sells to senior tech (less interruption) AND owner (junior
|
||||
techs resolve faster, take more accounts).
|
||||
- **Hours-saved math is simple:** 4-5 minutes per escalation × 15-30 escalations
|
||||
per week per senior tech = 1-2 hours/week recovered per senior. At $80-150/hr
|
||||
fully-loaded senior tech cost, the tool pays for itself with one customer.
|
||||
|
||||
## Constraints
|
||||
|
||||
- **One-founder shop.** Cannot run three concurrent product narratives. Sequence
|
||||
matters more than scope.
|
||||
- **Pre-PMF runway implied.** 4-8 week build cycles before talking to a buyer are
|
||||
expensive. Approach A's 1-2 week timeline is the binding constraint.
|
||||
- **Existing architecture is mostly aligned.** FlowPilot, unified_chat_service,
|
||||
FlowProposal, ConnectWise PSA integration — most of the pieces exist. Risk is
|
||||
positioning and UX, not capability.
|
||||
- **PSA copilot competition is real.** ConnectWise / Autotask / Halo are racing to
|
||||
ship AI features. The wedge has to be sharp because we lose on distribution.
|
||||
|
||||
## Premises
|
||||
|
||||
The five load-bearing claims this design rests on, all confirmed in session:
|
||||
|
||||
1. **Diagnostic AI is commoditized.** ResolutionFlow does not compete on
|
||||
"AI solves the ticket faster." That race is over. ChatGPT/Claude already won.
|
||||
2. **The structuring layer is the wedge.** AI conversational output is too dense
|
||||
and unstructured for active troubleshooting. ResolutionFlow's value is
|
||||
organizing that output into actionable, separable, recorded steps.
|
||||
3. **Escalation context is the killer feature.** "Junior hits escalate, senior gets
|
||||
full structured context in 30 seconds instead of 5 minutes" is the sharpest
|
||||
demoable moment in the entire product surface.
|
||||
4. **First paying customer is bottom-up, prosumer-flavored.** Senior tech at a
|
||||
small MSP, $20-50/seat/month, monthly billing. Owner-targeted enterprise
|
||||
pricing waits until 5+ paying shops establish baseline ROI numbers.
|
||||
5. **Distribution is MSP communities, not paid SaaS ads.** r/msp, MSPGeek, RocketMSP,
|
||||
PSA marketplace listings. The channel matches the buyer.
|
||||
|
||||
## Approaches Considered
|
||||
|
||||
### Approach A: Escalation Mode first (REDUCED SCOPE per /plan-eng-review)
|
||||
|
||||
Lead the GTM with the killer feature. Polish the escalate-with-context handoff:
|
||||
junior tech mid-session hits escalate, senior tech window opens with full
|
||||
structured session state. 2-min demo Loom. Pilot with **3 MSPs** in the founder's
|
||||
network (capped at 3 to preserve build capacity for B). Metric: minutes recovered
|
||||
per escalation.
|
||||
|
||||
**SCOPE REDUCTION (2026-04-27 eng review):** ~80% of Approach A is already built.
|
||||
The original 2-3 week estimate assumed greenfield. Codebase audit confirms:
|
||||
|
||||
| What the doc said "build" | What actually exists |
|
||||
|---|---|
|
||||
| Session-state serialization | `ai_session.escalation_package` (JSONB), `SessionHandoff.snapshot` |
|
||||
| Senior-tech inbox | [EscalationQueuePage.tsx](frontend/src/pages/EscalationQueuePage.tsx) + [EscalationQueue.tsx](frontend/src/components/flowpilot/EscalationQueue.tsx) |
|
||||
| Claim workflow | [handoff_manager.py:123 claim_session()](backend/app/services/handoff_manager.py#L123) |
|
||||
| API surface | [session_handoffs.py](backend/app/api/endpoints/session_handoffs.py) — POST /handoff, /claim, GET queue |
|
||||
| AI assessment for senior | `_generate_ai_assessment()` in handoff_manager |
|
||||
| PSA round-trip | `escalation_package_markdown`, `escalation_package_external_id` |
|
||||
|
||||
**Real engineering scope (~6-9 days):**
|
||||
|
||||
1. **Notification dual-path** (4-5 days). `notification_sent` flag is a dead column —
|
||||
never written. Wire two channels in `handoff_manager.create_handoff`:
|
||||
- **Email** (existing `EmailService.send_notification_email`) — handles offline seniors.
|
||||
- **WebSocket / SSE push** to the EscalationQueue for live demo magic moment.
|
||||
- Set `notification_sent=true` after dispatch confirmation.
|
||||
- Graceful degradation: handoff still created if notification raises (regression test required).
|
||||
|
||||
2. **Hero metric endpoint** (~2 hours). New `GET /api/v1/analytics/escalation-metrics`,
|
||||
account-scoped, role-gated to `require_engineer_or_admin`. Computes
|
||||
*minutes recovered per escalation* by querying:
|
||||
```
|
||||
ai_session_step.created_at (first row by senior_tech_user_id where created_at > SessionHandoff.claimed_at)
|
||||
minus
|
||||
SessionHandoff.claimed_at
|
||||
```
|
||||
Returns a rolling-30-day average per account. No schema change.
|
||||
|
||||
3. **UX polish on EscalationQueue + receiving-engineer view** (2-3 days). Confirm the
|
||||
magic-moment screen lands when senior clicks claim. Add an unread indicator on
|
||||
the queue. Wire optimistic insert when SSE event arrives.
|
||||
|
||||
4. **Loom + landing page copy** (1-2 days). Non-engineering. Outside this plan's scope
|
||||
but required for the GTM in week 3.
|
||||
|
||||
**Test plan:** 100% coverage of new paths — 13 tests including 4 e2e and 1 regression
|
||||
(graceful-degradation when notification dispatch raises). Test plan artifact at
|
||||
`~/.gstack/projects/chihlasm-resolutionflow/abc-main-eng-review-test-plan-20260427-000000.md`.
|
||||
|
||||
**Risk:** Low. Single feature, single metric, architecture-aligned. The dual-path
|
||||
notification is the only mildly novel surface; both halves use existing infra.
|
||||
|
||||
**Reuses:** `services/handoff_manager.py`, `services/escalation_package_generator.py`,
|
||||
`models/session_handoff.py`, `models/ai_session.py`, `services/notification_service.py`,
|
||||
`models/notification_log.py`, EmailService, EscalationQueuePage + EscalationQueue.
|
||||
|
||||
### UI Specifications (locked by /plan-design-review 2026-04-27)
|
||||
|
||||
**Magic-moment screen** (new, after Pick Up click): dedicated handoff-context view that
|
||||
loads BEFORE the regular FlowPilot session view, then dissolves on first senior action.
|
||||
Four sections, single frame:
|
||||
|
||||
1. **Problem summary** (top, 2-3 lines): junior's framing. Bricolage Grotesque h2.
|
||||
2. **What's been tried** (left or middle column): structured list of `dead_ends_flagged[]`
|
||||
and `steps_attempted[]` from `escalation_package` JSONB. Card-flat surface, IBM Plex.
|
||||
3. **AI assessment** (right column): `ai_assessment_data` rendered as 3 fields —
|
||||
`likely_cause`, `suggested_steps[]`, `confidence`. accent-dim badge for confidence.
|
||||
4. **Start here** (primary CTA, electric-blue, ≥44px touch target): opens FlowPilot
|
||||
session at the most-likely-next-step. Senior typing or clicking anywhere triggers
|
||||
200ms fade-out and FlowPilot view fades in. Re-openable via "Show handoff context"
|
||||
ghost button in FlowPilot toolbar.
|
||||
|
||||
**Hero metric ("minutes recovered per escalation"):** lives in TWO places:
|
||||
- **Queue stat-card** (above EscalationQueue list on /escalations): compact, "X.X hrs
|
||||
saved this month" + "click for details" affordance. Refreshes on queue load.
|
||||
- **Dedicated `/analytics/escalations` page** (owner-facing): trend chart (4-week
|
||||
rolling), per-tech breakdown, per-problem-domain segmentation. Engineer-or-admin
|
||||
role-gated.
|
||||
|
||||
**Real-time arrival visual** (when WebSocket pushes a new escalation):
|
||||
- New card slides in from above the list, 200ms ease-out CSS transition.
|
||||
- Browser tab title prefixes with " (1) " / " (N) " when tab is backgrounded; clears
|
||||
on focus.
|
||||
- No sound. MUST respect `prefers-reduced-motion: reduce` (slide-in collapses to
|
||||
instant fade-in).
|
||||
|
||||
**Unread state:** subtle 6px dot in top-right corner of card for escalations the
|
||||
current senior has never opened. Dot fades on first hover or click.
|
||||
|
||||
**Race-condition (two seniors click Pick Up simultaneously):** loser sees a toast
|
||||
"Already claimed by [name] 2s ago" via existing `@/lib/toast`; the card flashes the
|
||||
winner's name in the meta row for 1s, then dissolves from the loser's view via
|
||||
optimistic update + WebSocket reconciliation.
|
||||
|
||||
**Unread state (Codex correction 2026-04-27):** dot indicator clears on **open,
|
||||
claim, or explicit dismiss** — NOT on hover. Hover-to-clear is a bad proxy for
|
||||
acknowledgment because incidental mouse movement creates false clears.
|
||||
|
||||
**Notification routing (Codex finding 2026-04-27):** v1 fans out the email + push
|
||||
to **all engineer-or-admin role users in the same account_id as the SessionHandoff**.
|
||||
No on-call/round-robin logic in v1. If pilots ask for routing, capture as v2 TODO.
|
||||
The first senior to claim wins; everyone else's notification self-resolves on
|
||||
WebSocket reconciliation.
|
||||
|
||||
**Notification delivery model (Codex correction 2026-04-27):** drop the
|
||||
`notification_sent: bool` flag from v1. Replace with per-channel delivery rows
|
||||
in a new `notification_log` table (already exists — reuse, don't add a new model)
|
||||
keyed by `(handoff_id, channel, recipient_user_id, status)` where status ∈
|
||||
{queued, sent, failed, suppressed}. This makes partial-success and per-channel
|
||||
retry visible. If the existing `notification_log` schema doesn't match, defer
|
||||
the per-channel persistence to a v2 TODO and v1 logs delivery attempts to the
|
||||
existing telemetry stream instead. Do NOT keep the dead boolean.
|
||||
|
||||
**"Start here" CTA (Codex correction 2026-04-27):** opens the FlowPilot session
|
||||
at the **latest known state** (the AI's most recent agent_message + the current
|
||||
pending_task_lane). Surface `ai_assessment_data.suggested_steps[]` as a list of
|
||||
chips below the chat input — clicking a chip prefills the input. Do NOT invent a
|
||||
"jump to most-likely-next-step" capability that doesn't exist in the session model.
|
||||
|
||||
**`/claim` role gate (Codex correction 2026-04-27, IN-SCOPE for v1):** add
|
||||
`require_engineer_or_admin` dep on POST `/handoffs/{id}/claim`. Originally
|
||||
deferred to TODO during eng review; Codex correctly flagged it as wedge-relevant
|
||||
because the race-condition story depends on auth gating. ~30 min change. Removed
|
||||
from TODO.md.
|
||||
|
||||
**A11y requirements (mandatory before pilot ship):**
|
||||
- Keyboard: Tab order through queue cards; Enter on focused card opens it; Pick Up
|
||||
button is a reachable target; Esc closes the handoff-context overlay.
|
||||
- ARIA: `role="region"` + `aria-live="polite"` on the queue list (announces arrivals);
|
||||
`aria-label="N escalations awaiting pickup"` on the heading; the slide-in animation
|
||||
must not announce twice (debounce live-region updates).
|
||||
- Pick Up button: bump from `py-2` to `py-2.5` to clear the 44px touch-target floor.
|
||||
- Color contrast: confidence-badge text on accent-dim background must be ≥4.5:1
|
||||
(verify against DESIGN-SYSTEM.md tokens).
|
||||
|
||||
**DS token discipline:** every new piece must use `card-flat`, `accent-dim`/`accent-text`,
|
||||
`text-muted-foreground`, `bg-card`/`bg-elevated`, IBM Plex / Bricolage / JetBrains,
|
||||
explicit `transition` property lists (never `transition: all`). No glass, no blur,
|
||||
no gradient surfaces. Electric-blue accent reserved for interactive elements only.
|
||||
|
||||
**Mobile responsive:** deferred to post-pilot TODO. Pre-PMF wedge target is desktop;
|
||||
MSP techs work on laptops/desktops in shop environments.
|
||||
|
||||
**Deferred to TODO.md (out of scope for v1 wedge):**
|
||||
- Peer-tech escalates colleague's session (currently session-owner-only)
|
||||
- Role gate on POST /claim (currently any authenticated user in account)
|
||||
|
||||
### Approach B: Full Structured Resolution loop (split B1 + B2)
|
||||
|
||||
End-to-end demo: tech opens FlowPilot, structure appears in side panel as AI
|
||||
responds, ticket notes auto-populate at end, optional runbook capture for reusable
|
||||
patterns. Tells the full "senior engineer over your shoulder" story.
|
||||
|
||||
**B1 — Side panel + PSA-formatted ticket notes** (ships first):
|
||||
- Structured side panel that surfaces parsed AI markers as live actionable steps
|
||||
while the conversation runs.
|
||||
- PSA-formatted ticket-notes exporter (ConnectWise first; Autotask/Halo later).
|
||||
- Effort: M (~3 weeks).
|
||||
|
||||
**B2 — Runbook offer-and-save** (gated on pilot demand):
|
||||
- "Save this resolution as a flow?" prompt at session end, with auto-drafted
|
||||
runbook from the structured session state.
|
||||
- Effort: S (~1 week). Don't build until at least 2 pilot customers explicitly
|
||||
ask for it.
|
||||
|
||||
- **Risk:** Medium. The structured-output panel quality is the whole demo. If it
|
||||
looks dumb, the demo dies.
|
||||
- **Reuses:** FlowPilot, unified_chat_service, FlowProposal, ConnectWise PSA
|
||||
integration.
|
||||
|
||||
### Approach C: Senior-Tech Time-Saved Counter
|
||||
|
||||
Continuous measurement layer underneath A and B. Every session contributes an
|
||||
estimated minutes-saved number. Owner-facing dashboard quotes "this month your
|
||||
shop saved N hours of senior-tech time." Sells to MSP owner with verifiable ROI.
|
||||
|
||||
- **Effort:** S (~1 week + ongoing measurement methodology refinement).
|
||||
- **Risk:** Medium-low. Methodology has to be defensible. If numbers look
|
||||
made-up, trust dies fast.
|
||||
- **Reuses:** FlowPilot telemetry, session metadata, account-scoped analytics.
|
||||
|
||||
## Recommended Approach
|
||||
|
||||
**A first (1-2 weeks), then B (3-4 weeks after A ships), with C running underneath
|
||||
both as a continuous backdrop.**
|
||||
|
||||
Sequence rationale:
|
||||
|
||||
- **A is the sharpest possible 2-minute demo.** Single feature, single metric,
|
||||
buyer-verifiable in their own data. Get it in front of 5 MSPs in week 3.
|
||||
- **B is the depth play.** Once Approach A has produced first-pilot signal,
|
||||
Approach B's full structured-resolution loop becomes the "what we ship next" that
|
||||
retains pilots and converts them to paid.
|
||||
- **C compounds across both.** Every session under A or B contributes to the
|
||||
time-saved counter. By week 6 there are real numbers to put in front of an MSP
|
||||
owner — turning a senior-tech-led pilot into an owner-signed contract.
|
||||
|
||||
This sequence is non-negotiable. Building B before A is the classic pre-PMF trap of
|
||||
perfecting product before validating GTM. Building C alone is measurement without a
|
||||
demo to anchor it.
|
||||
|
||||
## Pricing
|
||||
|
||||
**Pilot pricing (first 3-5 customers): $39/seat/month, monthly billing,
|
||||
month-to-month.** Anchored against IT Glue (~$29/tech), Hudu (~$25/tech),
|
||||
Liongard (~$3/endpoint). The premium over IT Glue/Hudu reflects the active-session
|
||||
value (vs. their static-runbook value) — 30% above the runbook-only category.
|
||||
|
||||
Customer #6+ pricing is an Open Question (revisit after 3 pilots produce real
|
||||
hours-saved data; price up if the per-seat ROI is over $200/seat/mo).
|
||||
|
||||
## Open Questions
|
||||
|
||||
1. **Free-tier shape.** Should the time-saved counter be free forever as a
|
||||
distribution lever, with paid for the structuring + escalation? Land-and-expand
|
||||
pattern. Decide after 3 pilot conversions.
|
||||
2. **PSA-marketplace timing.** ConnectWise Marketplace listing requires partnership
|
||||
onboarding (~6-week cycle). Submit application week 5; expect listing live by
|
||||
week 11. Don't gate launch on it.
|
||||
3. **Customer #6+ pricing.** Revisit after 3 pilot customers produce verifiable
|
||||
hours-saved numbers.
|
||||
|
||||
## Deferred (YAGNI until 10 paying customers)
|
||||
|
||||
- HIPAA / SOC2 audit positioning. Pre-PMF is too early; revisit when a regulated-
|
||||
vertical MSP asks for it explicitly.
|
||||
- Multi-PSA depth (Autotask, Halo). ConnectWise alone covers ~40% of the SMB MSP
|
||||
market and is sufficient for first 5-10 customers.
|
||||
- Cross-tenant pattern detection. The data-flywheel-across-shops play is at least
|
||||
6 months out; building it before single-shop ROI is proven is premature.
|
||||
|
||||
## Success Criteria (revised for realism)
|
||||
|
||||
- **Week 3:** Approach A shipped. 3 MSPs in active free pilot (cap at 3 to
|
||||
preserve B1 build capacity).
|
||||
- **Weeks 3-6:** Pilot management dominates. B1 build is paused; founder runs
|
||||
pilot calls, captures bug reports, iterates UX. Stripe seat-based billing is
|
||||
set up in week 5.
|
||||
- **Week 6:** First verbal commit from a pilot customer. Verified
|
||||
minutes-recovered-per-escalation number from at least 2 pilots.
|
||||
- **Week 8:** First paid customer (procurement cycles run 4-6 weeks even at small
|
||||
MSPs; 2 weeks from verbal commit to signed contract is realistic). Time-saved
|
||||
counter (Approach C) producing dashboard-quality data.
|
||||
- **Week 11:** B1 (side panel + PSA notes) shipped. 3-5 paying customers. First
|
||||
MSP-owner-led conversation. ConnectWise Marketplace listing live.
|
||||
- **Quarter end:** $5K MRR or 10 paying customers, whichever comes first. Loom
|
||||
demos posted publicly to r/msp and MSPGeek.
|
||||
|
||||
## Distribution Plan (week-by-week cadence)
|
||||
|
||||
- **Week 3:** Escalation Mode demo Loom posted. r/msp launch post.
|
||||
- **Week 4:** MSPGeek Discord AMA scheduled. RocketMSP newsletter pitch sent.
|
||||
- **Week 5:** ConnectWise Marketplace listing application submitted. Stripe
|
||||
billing live for paid conversion.
|
||||
- **Week 6:** First "guest on Inside MSP podcast" outreach. Second r/msp post
|
||||
(case study from a pilot, anonymized).
|
||||
- **Week 7-8:** Pilot conversion calls. First paying customer.
|
||||
- **Week 9-11:** B1 ships. Owner-targeted demo Loom. Second podcast outreach.
|
||||
|
||||
**Founder-led pilot:** The first 3-5 customers come from the founder's existing
|
||||
MSP network. Treat them as design partners; expect to ship feature requests
|
||||
weekly during pilot. Cap at 3 active pilots until B1 ships.
|
||||
|
||||
**Tech audience channels:** r/msp, r/sysadmin, MSPGeek Discord, RocketMSP
|
||||
newsletter, Inside MSP podcast.
|
||||
**Owner audience channels:** ConnectWise Marketplace, MSP-focused Substacks,
|
||||
RIA Vendor Roundup.
|
||||
|
||||
CI/CD: existing Railway auto-deploy via GitHub mirror. No new pipeline needed.
|
||||
|
||||
## Dependencies
|
||||
|
||||
- **Session-state serialization (Approach A blocker).** Schema design + migration
|
||||
is the longest-lead engineering task. 3-5 days budget. Do this first.
|
||||
- **Stripe seat-based billing (week 5 task).** No billing infrastructure exists
|
||||
today. ~3-5 days of work for monthly subscriptions + invoice flow. Block on
|
||||
this before week-8 first-paid milestone.
|
||||
- **ConnectWise PSA integration depth.** Sufficient for ticket-notes auto-export
|
||||
(Approach B1). Autotask and Halo wait until first 5 paying ConnectWise
|
||||
customers.
|
||||
- **Authentication.** Existing JWT + role hierarchy is sufficient for senior-tech
|
||||
inbox view; no new auth work needed.
|
||||
|
||||
## Risks and Kill-Switch
|
||||
|
||||
- **Risk: Session-state serialization design churn.** If the schema needs to
|
||||
change after pilot feedback, every saved session has to migrate. Mitigation:
|
||||
keep schema versioned and forward-compatible from day 1.
|
||||
- **Risk: Pilot-to-paid conversion slower than 4-6 weeks.** MSP procurement is
|
||||
notoriously slow. Mitigation: get verbal commits in writing; price as
|
||||
month-to-month with no annual contract to lower the buying friction.
|
||||
- **Risk: ConnectWise ships an equivalent feature in their 2026.x release.**
|
||||
Mitigation: lead the marketing on "we're independent of your PSA" — works with
|
||||
any PSA, not just ConnectWise. The founder's PSA-agnostic FlowPilot is an
|
||||
asset here.
|
||||
- **Kill-switch criterion:** if 0 of 3 pilots produce a verifiable
|
||||
hours-saved-per-week number above 1.0 by week 8, **revisit the wedge**. The
|
||||
product may need to pivot to deterministic-ops territory (Read 1 from the
|
||||
session) or be repositioned. Don't sink another quarter into the current GTM
|
||||
story without this number.
|
||||
|
||||
## The Assignment
|
||||
|
||||
**This week, before any code:**
|
||||
|
||||
Time-track the next 5 escalations in your shop manually. For each, capture:
|
||||
1. Time the senior tech opens the ticket
|
||||
2. Time the senior tech takes their first diagnostic action (not counting the
|
||||
verbal "tell me what you tried" warm-up)
|
||||
3. The delta — that's the wasted time per escalation today
|
||||
|
||||
Average those 5 numbers. **That's the hero stat in your first sales conversation:**
|
||||
"Senior techs at our shop wasted N minutes per escalation just getting up to
|
||||
speed. We built the thing that takes that to zero."
|
||||
|
||||
Don't try to pull this from telemetry — the doc itself notes that retrieval/re-use
|
||||
data isn't queryable yet. Manual stopwatch on the next 5 escalations is the
|
||||
fastest path to a defensible number.
|
||||
|
||||
This is the assignment because it forces the GTM story into the same time-zone as
|
||||
the build, and it's a one-day effort that compounds for every conversation
|
||||
afterward.
|
||||
|
||||
## What I noticed about how you think
|
||||
|
||||
- You contradicted my framing twice in the same session and the second
|
||||
contradiction was sharper than the first. Most founders agree with the
|
||||
diagnostic and walk out with a polished version of what they came in with. You
|
||||
said "I'm just questioning if flows are even the way to go" — and that
|
||||
sentence reset the entire wedge. That's craft.
|
||||
|
||||
- "The senior engineer looking over your shoulder" came out of you spontaneously,
|
||||
not as a prepared pitch. That's the line. Use it. It survives because it's
|
||||
emotional truth (every junior tech has had this, every senior tech has been
|
||||
this), not constructed marketing copy.
|
||||
|
||||
- You're solving your own problem with your own time. 20 hrs/week isn't a
|
||||
hypothetical user pain — it's your Tuesday. Founders who solve their own pain
|
||||
ship sharper products because the feedback loop is instant.
|
||||
|
||||
- The escalation feature emerged from your description, not mine. I was busy
|
||||
cataloging documentation pains. You said "junior to senior escalation? no
|
||||
worries there either" almost as an afterthought. That afterthought is the wedge.
|
||||
Pay attention to which features you describe casually versus which you push hard
|
||||
on — the casual ones are sometimes where the truth lives.
|
||||
|
||||
## GSTACK REVIEW REPORT
|
||||
|
||||
| Review | Trigger | Why | Runs | Status | Findings |
|
||||
|--------|---------|-----|------|--------|----------|
|
||||
| CEO Review | `/plan-ceo-review` | Scope & strategy | 0 | — | not run |
|
||||
| Codex Review | `/codex review` | Independent 2nd opinion | 1 | INFO | 12 findings, 6 applied, 1 partial, 5 rejected |
|
||||
| Eng Review | `/plan-eng-review` | Architecture & tests (required) | 1 | CLEAR (PLAN) | 2 issues, 0 critical gaps, scope reduced |
|
||||
| Design Review | `/plan-design-review` | UI/UX gaps | 1 | CLEAR (FULL) | score 6/10 → 9/10, 8 decisions |
|
||||
| DX Review | `/plan-devex-review` | Developer experience gaps | 0 | — | not run |
|
||||
|
||||
- **CODEX:** 12 findings reviewed. Applied: 2-metric framing (#2), notification routing spec (#3), per-channel delivery model (#4), unread-state fix (#11), Start-here CTA reframe (#9), claim role gate moved in-scope (#8). Rejected: full scope reduction to PSA-brief-only (#6/7/12 — user kept queue UI as demo hero). Partial: scope concern (#5) acknowledged in eng review's email-first/polling-fallback. Misread: #1, #10.
|
||||
- **CROSS-MODEL:** Claude (eng + design reviews) and Codex agree on 6/12 findings. The major disagreement was scope — Codex argued for cutting the queue UI, user rejected. Both agree on metric definition, notification routing, claim auth gating.
|
||||
- **UNRESOLVED:** 0
|
||||
- **VERDICT:** ENG + DESIGN CLEARED, CODEX REVIEWED — ready to implement.
|
||||
33
docs/plans/2026-04-27-escalation-mode-wedge-test-plan.md
Normal file
33
docs/plans/2026-04-27-escalation-mode-wedge-test-plan.md
Normal file
@@ -0,0 +1,33 @@
|
||||
# Test Plan
|
||||
Generated by /plan-eng-review on 2026-04-27
|
||||
Branch: main
|
||||
Repo: chihlasm/resolutionflow
|
||||
|
||||
## Affected Pages/Routes
|
||||
|
||||
- `/escalations` ([EscalationQueuePage.tsx](frontend/src/pages/EscalationQueuePage.tsx)) — senior-tech inbox view; verify queue list, real-time arrival, click-through
|
||||
- `/pilot/:session_id` (FlowPilotSessionPage) — verify post-claim load shows full escalation context (snapshot, ai_assessment, escalation_package)
|
||||
- `GET /api/v1/analytics/escalation-metrics` (NEW) — verify hero metric calculation, account-scoping, role gate
|
||||
|
||||
## Key Interactions to Verify
|
||||
|
||||
- Junior tech clicks **Escalate** in active FlowPilot session → handoff is created → notification fires → senior sees escalation in queue within 30 seconds
|
||||
- Senior tech clicks **Claim** in queue → session reactivates → senior is redirected into FlowPilot session view → ai_assessment + snapshot are visible
|
||||
- Senior types first message in chat after claim → metric query starts attributing time-to-first-action
|
||||
- MSP owner opens analytics page → "minutes recovered per escalation" widget shows current month's rolling average
|
||||
|
||||
## Edge Cases
|
||||
|
||||
- **Two seniors race to claim** the same handoff → one wins, the other gets a "Already claimed by [name]" message
|
||||
- **Senior is offline** when escalation fires → email arrives via existing `EmailService.send_notification_email`
|
||||
- **WebSocket disconnects mid-session** → frontend reconnects; missed events backfilled by re-fetching the queue
|
||||
- **Notification dispatch raises** (SMTP down, WebSocket fanout fails) → handoff is still created (graceful degradation)
|
||||
- **Senior takes non-chat action first** (e.g., posts directly to PSA) → metric falls back to PSA writeback timestamp or remains null; doc the chosen behavior
|
||||
- **Account-scoped multi-tenancy** → senior at MSP A cannot see escalations from MSP B (Phase 4 RLS)
|
||||
- **Role gate on metric endpoint** → only `engineer_or_admin` can hit `/escalation-metrics`
|
||||
|
||||
## Critical Paths
|
||||
|
||||
1. **Magic-moment demo flow** (the entire Loom): junior escalate → senior notification → senior claim → session view → first action recorded → metric updates
|
||||
2. **Email fallback** when senior is offline — must not silently drop
|
||||
3. **Regression: handoff creation succeeds even if notification dispatch raises** — graceful degradation is mandatory
|
||||
Reference in New Issue
Block a user