Files
resolutionflow/CURRENT-STATE.md
Michael Chihlas 1106f79611 docs: add session-expiration-policy decision entry + CURRENT-STATE summary
Ninth and final commit in the session-expiration-policy series.

- .ai/DECISIONS.md: new entry documenting the two-window model
  (3d idle / 14d absolute defaults), per-account override design,
  grandfather strategy, error-detail taxonomy on the wire, and the
  rejected alternatives (idle-only / absolute-only / hard SECRET_KEY
  cutover / Loose preset / reveal-on-Custom UI / modal-stays-open
  for scope=all). Includes consequences and follow-up tickets.
- CURRENT-STATE.md: 'Recently shipped' entry summarizing the 8-commit
  series across backend (migration, claims, enforcement, two
  endpoints) and frontend (page, hook, toast, banner, modal),
  referencing the plan + design-review file.

Pending after this commit: open PR, merge, file the per-user
device-list + super-admin global-ceiling follow-up issues per plan §9.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-13 17:09:09 -04:00

312 lines
22 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Current State
> **Purpose:** Quick-reference file showing exactly where the project stands.
> **For Claude Code:** Read this first to understand what's done and what's next.
> **Last Updated:** May 7, 2026
---
## Active Phase: Go-to-Market Validation (Pre-PMF) — Self-serve cutover (Phase O) in flight
Self-serve signup backend (Phase 1) and frontend (Phase 2) are merged. Cutover (Phase O) is gated on manual ops: live-mode Stripe Dashboard config, Railway prod env vars, internal validation pass against prod test mode, then the public flag flip. Plan: `docs/superpowers/plans/2026-05-06-self-serve-signup-phase-2-frontend-cutover.md`.
---
## Recently shipped (post-0.1.0.0)
- **2026-05-13 — `feat/session-expiration-policy` (open)** Session expiration policy series — 8 commits, fixes the "logged in forever" bug and adds owner-side controls. Migration `b269a1add160` adds `accounts.session_idle_minutes` + `session_absolute_minutes` (NULL = use system default, defaults Strict 3d/14d via `Settings.SESSION_*_MINUTES_DEFAULT`). Refresh-token JWT carries `auth_time` + `idle_max` + `abs_max` claims (seconds) snapshotted at every login entry point (`/auth/login`, `/auth/login/json`, both OAuth callbacks). `/auth/refresh` enforces absolute cap (`now >= auth_time + abs_max` → 401 `session_expired_absolute`), atomic-revoke-then-check prevents replay. Error-detail taxonomy on the wire distinguishes `session_expired_idle` / `session_expired_absolute` / `invalid_refresh_token`. New owner-only `GET/PATCH /accounts/me/security` returns `{idle_minutes, absolute_minutes, effective_*, *_min/max, active_users}` with audit logging on PATCH. `POST /accounts/me/security/revoke-sessions` bulk-revokes refresh tokens for the account (`scope: "all" | "others"`), audited. Frontend: new `/account/security` page (Strict/Standard/Custom presets, active-users list with name + email + last-login-ago, count-aware revoke buttons + confirmation modal), `useAuthSessionExpiry` hook + top-of-app `SessionExpiryToast` (differentiated by idle vs absolute), cyan info-tone banner on `/login?reason=session_expired`. Plan + design review in `docs/plans/2026-05-13-session-expiration-policy.md` (initial 4/10 → 9/10 via `/plan-design-review`). 28 backend tests; tsc clean. Pending: open PR, merge, document follow-up issues (per-user device list, super-admin global ceiling UI).
- **2026-05-07 — PR #164 (open)** Plan taxonomy reconciliation + `INTERNAL_TESTER_EMAILS` allowlist. Marketing surface (PricingPage, Stripe products) used `Starter / Pro / Enterprise` while backend was on `free / pro / team`, leaving `plan_billing` unseeded and `BillingPlan` schema accepting a literal that violated the FK. Migration `4ce3e594cb87`: rename `team``enterprise` in `plan_limits`, add `starter` row (caps interpolated between free and pro: `max_trees=10`, `sessions=75`, `ai=15/mo`), defensive update of any subscriptions on the `team` slug. Code rename across schemas, `Subscription` paid-plan checks, admin endpoints, and frontend `useSubscription`. Resource visibility (`Tree.visibility='team'`, `StepLibrary.visibility='team'`) is a separate domain and intentionally untouched. New `backend/scripts/sync_stripe_plan_ids.py` — idempotent upsert of `plan_billing` rows from Stripe products by exact name match, picks active monthly recurring price, leaves annual fields NULL by design. Test-mode `plan_billing` populated for all 3 tiers in dev. Phase O Task 46 allowlist: `INTERNAL_TESTER_EMAILS` env var (comma-separated) bypasses `SELF_SERVE_ENABLED=false` for specific authenticated users — `Settings.is_self_serve_active_for(email)` centralizes the check; `/config/public` returns `self_serve_enabled=true` for allowlisted authenticated callers; `/auth/register` allows allowlisted emails to register without invite code. New `get_current_user_optional` dep for endpoints that work both anonymous and authed.
- **2026-05-06 — PR #163** Seed test users marked email-verified. Fixed seeded users showing the email verification banner in dev/test, blocking flows that gate on `email_verified=True`. Squash-merged into main as `dad5e1f`.
- **2026-05-06 — PR #162** Self-serve signup Phase 2 (frontend cutover). 18 commits across Tasks 2744 of the Phase 2 plan: backend remainders + frontend billing foundation + auth surfaces (OAuth + accept-invite + verify-email) + welcome wizard + dashboard redesign (TrialPill, NextStepCard, unified checklist) + public surfaces (`/pricing`, `/contact-sales`) + beta-signup deprecation. Single alembic head `c6cbfc534fad` (no new migrations in Phase 2). Squash-merged as `f1be3ab`.
- **2026-05-?? — PR #161** Self-serve signup backend (Phase 1). `plan_billing` sibling table for Stripe + catalog metadata, `sales_leads` and `stripe_events` tables, `complimentary` status with `has_pro_entitlement`, `BillingService.start_trial` wired into `/auth/register`, `/billing/checkout-session`, Stripe webhook handler with idempotency via `stripe_events`, Google + Microsoft OAuth callbacks with `oauth_identities` linking, `require_verified_email_after_grace` + `require_active_subscription` guards, bulk-create + soft-revoke invite endpoints, account-invite email-match enforcement, pilot complimentary backfill, `accounts.team_size_bucket` + `primary_psa` for wizard. Squash-merged as `f918b76`.
- **2026-05-02 — PR #159** In-product User Guides rewrite to Diátaxis how-tos. Replaced 15 feature-dump guides with 43 problem-oriented how-tos grouped under 10 categories. Dropped Maintenance Flows / AI Assistant / Flow Assist Sparkles guides (UI no longer exists). Renamed Step Library → Solutions Library. Authored 14 net-new how-tos for FlowPilot-era surfaces (tasklane keyboard flow, what-we-know, resolve, escalate, record-fix-outcome, post-docs-to-ticket, share-update, pause-and-leave, build-script-from-scratch, open-suggested-flow, pin-a-flow, invite-teammate, etc.). Schema additions: `category`, optional `relatedSlugs`. Browser-verified against engineer + owner login.
- **2026-05-?? — PR #160** Post-PR-159 UI cleanup — sidebar IA + account redesign. Squash-merged as `a8b22cf`.
- **2026-05-01 — PR #158** Session-screen UX impeccable pass + tasklane keyboard flow. Heuristic score 24/40 → 33/40 across five sub-passes (distill, quieter, layout, typeset, polish). Removed duplicate "Suggested checks" chip strip → TaskLane is the single source of truth; added inline `Next steps · N pending` cue on the latest action-bearing AI bubble; consolidated session header to Resolve + Escalate + ⋯ kebab; centered messages column to match composer; dropped all banned decorations (side stripes, gradient surfaces, backdrop blur, accent borderTop) for a single decoration channel per surface; unified 14 text sizes into a 5-step scale. TaskLane keyboard flow: Enter submits + auto-advances, Shift+Enter newline, Esc cancel, focus jumps to Send after the last task. Banner ↔ script-panel are now linked (collapse hides both, any outcome closes both). WhatWeKnow section is collapsible with `sessionStorage` memory + auto-collapse-at-5-facts. Side fix: ParameterizationPreview no longer over-highlights short parameter values (word-boundary check). Two backlog entries logged in `.ai/TODO.md`: ConcludeSessionModal multi-select and `bg-card-hover` Tailwind drift in CommandPalette.
- **2026-05-01 — PR #156** Suggested-fix "Awaiting verification" outcome. Engineers can now park a fix in `applied_pending` (waiting on client power-cycle, AD replication, license sync, etc.) instead of forcing a synchronous worked/didn't/partial verdict. PendingBanner with worked / didn't / update reason / dismiss; nudge "Still checking" records pending with a reason; page-level Resolve auto-patches pending → success before the resolution flow opens; page-level Escalate intercepts pending. Migration `c0f3a4b7e91d` (`pending_reason` column + status CHECK constraint).
- **2026-04-30 — PR #155** Escalation Mode wedge. Magic-moment handoff-context screen for senior pickup, live SSE escalation arrivals, post-claim time-to-first-action metric (`GET /analytics/flowpilot/escalations`), atomic role-gated claim with conflict resolution, queue self-exclusion, chat ownership extended to claimed sessions. The wedge for the first paying-customer push.
---
## What's Complete
### Core Platform
- FastAPI project structure with 35+ API endpoints
- PostgreSQL database with Docker, 75+ Alembic migrations
- User authentication (JWT, register, login, refresh, logout, invite codes)
- Refresh token rotation with JTI-based revocation
- Trees CRUD with full-text search (FTS index)
- Sessions tracking with decisions, outcomes, and variables
- Export API (Markdown, Text, HTML)
- Role-based access control (super_admin, team_admin, engineer, viewer)
- Production-ready logging with correlation IDs
- 100+ integration tests
- Rate limiting on auth endpoints (disabled in DEBUG)
- Audit log table with JSONB details
- Soft delete for trees with cascade cleanup
### Frontend Core
- React 19 + Vite + TypeScript + Tailwind CSS v4 (`@tailwindcss/vite`)
- **Charcoal Design System** — Flat, high-contrast dark theme (Sentry/PostHog-inspired), charcoal palette with sidebar-darkest approach
- **Brand fonts:** Bricolage Grotesque (headings), IBM Plex Sans (body), JetBrains Mono (code)
- Authentication UI (login, register, email verification)
- Tree library/browsing page with grid/list/table views
- Tree navigation interface (session player)
- Session management with history and detail pages
- **Tree Editor** — Form-based with visual preview, Zustand + immer + zundo (undo/redo)
- **Markdown rendering** in session player and node editor
- **Tree Organization** — Categories, tags (autocomplete), user folders (3-level hierarchy), filters
- **RBAC & Permissions** — `usePermissions` hook, ProtectedRoute with role guards
- **Session Scratchpad** — Floating overlay (Ctrl+/), auto-save, markdown preview
- **Admin Panel** — 8 pages (dashboard, users, invite codes, audit logs, plan limits, feature flags, settings, categories)
- **Session Quick Wins** — Timer, keyboard hints, repeat last, auto-recovery, copy step, delete tree
- **Session Outcomes** — Outcome modal on completion, step timing tracking
- **Session Sharing** — Share links, public/account views, MySharesPage
- **Procedural Editor UX** — Section headers, collapsible advanced fields, URL intake, tag input
- **Type-aware Routing** — Centralized `getTreeNavigatePath`/`getTreeEditorPath` helpers
- **Account Management** — Profile settings, delete/leave/transfer, chat retention
- **PostHog Analytics** — Event tracking, user identification, autocapture
### FlowPilot AI System (Phases 1-3 Complete)
**Phase 1 — AI Session Engine:**
- FlowPilotEngine with multi-step guided troubleshooting
- AI copilot panel + standalone assistant chat with RAG
- Confidence-tiered model routing via `settings.get_model_for_action()`
- Intake form with ticket/client fields, session pause/resume
- AI-generated ticket summaries, outcome tracking
**Phase 2 — PSA Integration & Escalation:**
- ConnectWise PSA integration (ticket linking, note posting, member mapping)
- PSA documentation auto-push with retry scheduler
- Session pause/resume, mid-session ticket linking
- Escalation handoff workflow with LLM-enhanced briefing package
- Escalation pickup flow for senior engineers
- PSA settings UI on IntegrationsPage
- In-session script generator
**Phase 3 — Knowledge Flywheel:**
- AI session analysis → automatic flow proposal generation
- FlowProposal model with review queue (approve, edit & publish, dismiss, reject)
- Knowledge gap detection (weak options, high escalation domains)
- FlowPilot analytics dashboard (metrics, confidence tiers, PSA stats, gaps)
- APScheduler batch analysis job with `max_instances=1`
- Auto-reinforcement for sessions matching existing flows
### Phase 4 — Enterprise & Growth Features (All Slices Complete)
**Slice 1 — Public Templates Gallery:**
- Public API endpoints (no auth): gallery listing, flow/script detail, categories, search
- `is_gallery_featured` and `gallery_sort_order` columns on trees and script_templates
- IP-based rate limiting (30/min), tree structure truncated to 3 levels (signup wall)
- Public `/templates` page with hero, search, category filters, responsive card grid
- Detail modal with tree preview or parameter list + signup CTA
- Admin gallery curation page (feature toggle, sort order)
- 25 backend tests
**Slice 2 — Notification System:**
- NotificationConfig, NotificationLog, Notification models + migration
- Multi-channel delivery: in-app, email (Resend), Slack webhooks, Teams webhooks
- Notification service with event routing and fire-and-forget delivery
- APScheduler retry job with exponential backoff (30s, 2m, 10m, max 3 retries)
- 9 API endpoints (config CRUD + in-app notification management)
- Wired into escalation, proposal approval, and knowledge flywheel events
- Frontend: NotificationsPanel (bell icon + dropdown), NotificationSettings UI
**Slice 3 — Session Export (Polish):**
- 5-format export already existed (markdown, text, HTML, PSA, PDF via WeasyPrint)
- Added "Generated with ResolutionFlow" branding footer to all 5 formats
- Fixed PDF template conditional that was hiding branding
- Added spinner for PDF generation loading state
**Slice 4 — Mobile/Responsive:**
- Responsive audit pass across 11 FlowPilot and analytics components
- FlowPilotSession: collapsible mobile sidebar, single-column layout on mobile
- Action bars: full-width stacked buttons on mobile, 44px touch targets
- Modals: full-width slide-up pattern on mobile
- ReviewQueuePage: stacked panels on mobile
- Analytics: single-column chart stack on mobile
**Slice 5 — Enterprise Readiness:**
- Custom branding: logo URL, primary accent color, company name (owner-only)
- CSS variable overrides applied in app shell for accent color
- Branding settings page under Account Settings
- Autotask PSA and Halo PSA stub providers (Coming Soon badges in UI)
- SSO/SAML groundwork: sso_enabled, sso_provider, sso_config columns on Account
- SSO stub service with interface methods
- "Contact us to enable SSO" section in Account Settings
### Phase 5 — Analytics Enhancement (Complete)
- Tabbed analytics page: Overview, Coverage, Flow Quality, PSA
- Coverage heatmap: domain grid with color-coded cells (resolution/escalation/guided rates, flow count)
- Domain-to-flow mapping via category cross-reference
- Flow quality scoring endpoint: quality_score = (success_rate * 0.5) + (guided_rate * 0.3) + (recency * 0.2)
- Flow quality table: sortable, top performers (emerald), needs attention (rose), mini score bars
- Flow usage tracking: usage_count, success_rate, last_matched_at wired into session matching + resolution
- PSA activity logging: psa_activity_logs table, wired into documentation push service
- Enhanced PSA metrics: time entries, hours logged, push success funnel, daily trend chart
- 13 new backend tests for coverage and flow quality endpoints
### Search & Recall + Evidence-Rich Sessions (Complete)
**Evidence:**
- Railway Object Storage (S3-compatible) integration via boto3
- file_uploads model with upload/download/list/delete API endpoints
- RichTextInput component: clipboard paste (Ctrl+V) and drag-and-drop for images
- Wired into FlowPilot intake, free-text responses, and escalation modal
- Evidence included in all 5 export formats (markdown, text, HTML, PSA, PDF)
- 15 backend tests for upload endpoints
**Search:**
- Structured filters on AI sessions: problem_domain, matched_flow, confidence_tier, ticket_id, date range
- Filter bar UI on Session History page (AI Sessions tab)
- PostgreSQL full-text search via generated tsvector column + GIN index on ai_sessions
- Command Palette extended with AI session search results
- Voyage AI semantic embeddings on ai_session_embeddings table (pgvector cosine similarity)
- Similar sessions endpoint: GET /ai-sessions/{id}/similar
- Similar Sessions sidebar component in FlowPilot session view
### Security Hardening (Phases A-D Complete)
- Registration role hardcoded to `engineer`
- HTML export XSS fix (html.escape)
- Secret key validator (rejects default when DEBUG=False)
- Role CHECK constraint on users table
- Tree access check on session start
- Centralized permissions in `permissions.py`
- `is_active` field on User model, enforced in auth
- Admin user management endpoints (6 endpoints)
- Password complexity validation (uppercase, lowercase, digit, min 10 chars)
- Soft delete cascade cleanup (folder/tag junctions)
- SQL wildcard escaping in tag search
- PSA credentials encrypted at rest (Fernet)
### Tenant Isolation (Phases 1-4 Complete)
- PostgreSQL RLS enabled across tenant-scoped tables in phased rollout
- `account_id` propagation completed across core content, sessions, analytics, notifications, shares, and remaining Phase 4 tables
- Global platform tables correctly excluded from tenant RLS where they have no `account_id` (`script_categories`, `platform_steps`, `template_trees`)
- Runtime bootstrap paths updated to use BYPASSRLS/admin sessions where needed (auth/user mutations, startup service account, background jobs, seed scripts)
- Preview Railway backend and frontend deployments green for PR 136 after the Phase 4 fixes
### Copilot-First Dashboard (March 2026)
- Redesigned dashboard as FlowPilot copilot launchpad (ChatGPT-style input)
- Chat-style input with paste images, drag-drop files, attach button, paste logs
- Suggestion chips for common troubleshooting scenarios
- Simplified sidebar: icon rail with Home, History, Flows, Scripts, Data sections
- Amber "New Session" button in sidebar
- Unified Command Palette (Cmd+K) — merged QuickLaunch into omnibar
- "Solutions Library" rename (from "Step Library") site-wide
- Maintenance flows hidden from UI for pilot (backend still supports them)
- Landing page copy rewrite: "Resolve tickets faster. Notes write themselves."
- Spring bounce hover animation on dashboard cards
- Charcoal color palette: sidebar `#10121a`, page `#1a1c23`, cards `#22252e`
### Maintenance Flows (Hidden from UI)
- Batch session launch, saved target lists
- APScheduler scheduling with croniter + pytz
- Backend fully functional; removed from sidebar, create dropdown, and filter tabs for GTM pilot
### Survey System
- Public survey page, admin invite tracking
- Response viewer with CSV export
- Email-to-self, thank-you page
- Admin read/unread/archive/delete management
### Documentation
- CLAUDE.md (comprehensive project context)
- UI-DESIGN-SYSTEM.md, REBRAND-IMPLEMENTATION-GUIDE.md
- ConnectWise API reference docs in `docs/connectwise/`
- Feature specifications through Phase 4
- Phase implementation plans in `docs/plans/`
---
## What's In Progress
- **Self-serve cutover (Phase O):** PR #164 (open) closes the last code blockers — taxonomy reconciliation + `INTERNAL_TESTER_EMAILS` allowlist. After merge, remaining work is purely manual ops: live-mode Stripe Dashboard config, Railway prod env vars, internal validation pass with Andrea Henry + 2-3 external Directors of Onboarding, then `SELF_SERVE_ENABLED=true` flip with frontend redeploy.
- **Stripe live-mode setup:** Test-mode is fully wired (3 products, monthly prices for Starter/Pro, Enterprise sales-led, `plan_billing` seeded via `sync_stripe_plan_ids.py`). Live mode requires manual Dashboard config — same script handles seeding live IDs.
- **GTM Validation:** Shadow & Ship — founder uses product for real MSP tickets daily, then hands logins to 5 colleagues.
- **Solutions Library spec:** Written at `docs/plans/2026-03-23-solutions-library-design.md`, implementation deferred to post-pilot.
---
## What's Next (Priority Order)
### Phase O Cutover (Weeks 0-1)
- Merge PR #164
- Stripe Dashboard live-mode setup (Products + Prices for Starter/Pro, no Prices on Enterprise, Customer Portal config, webhook endpoint with 5 events)
- Railway prod env vars (`sk_live_*`, `whsec_*`, `INTERNAL_TESTER_EMAILS`, prod Google + Microsoft OAuth credentials, `OAUTH_REDIRECT_BASE`)
- Run `sync_stripe_plan_ids.py` against prod backend; verify `plan_billing` has `sk_live_*` price IDs
- Internal validation pass (9 scenarios from Phase O Task 46 plan)
- Email pilots about complimentary status, flip `SELF_SERVE_ENABLED=true` (frontend redeploy required for `VITE_SELF_SERVE_ENABLED`)
- PostHog dashboards + Sentry alert at >1/hour Stripe webhook errors
### Pilot Phase (Weeks 1-2)
- Founder dogfooding: use ResolutionFlow for real MSP tickets daily
- 3 calls with external Directors of Onboarding to validate the documentation-builder thesis (cold pitch, no friendly contacts)
- Collect feedback on copilot-first experience and self-serve onboarding flow
- Fix issues discovered during real usage
### Post-Pilot (Weeks 3-4)
- Solutions Library implementation (saved resolutions + RAG + dedup + confidence scoring)
- Landing page design polish based on pilot feedback
- Dedicated Insights dashboard (strategic metrics for team leads)
### Later (Phase 6+)
- Full Autotask PSA implementation
- Full Halo PSA implementation
- Full SSO/SAML implementation (SAML + OIDC flows)
- PowerShell automation framework
- White-label deployment
- Marketplace for community flow templates
- Native mobile app (React Native or PWA)
---
## Environment Quick Reference
### Start Development
```bash
# Start PostgreSQL (Docker Compose)
docker compose up -d
# Backend (from backend/)
source venv/bin/activate
uvicorn app.main:app --reload
# Frontend (from frontend/)
npm run dev
```
### URLs
- Frontend: http://192.168.0.9:5173
- Backend API: http://192.168.0.9:8000
- API Docs: http://192.168.0.9:8000/api/docs
### Run Tests
```bash
cd backend && pytest --override-ini="addopts="
```
---
## Blockers / Known Issues
| Issue | Workaround | Status |
|-------|------------|--------|
| `analysis_status` has no CheckConstraint | Valid values documented in code comments | Low priority |
| Review queue/analytics pages have no frontend role gate | Backend 403 protects data; UX could show message | Low priority |
| Review queue capped at 50 with no pagination UI | Filters can narrow results | Low priority |