resolutionflow/docs/plans/2026-03-16-stack-priorities-and-playwright-plan.md

Stack Priorities And Playwright Plan

Date: 2026-03-16
Updated: 2026-03-18
Product: ResolutionFlow
Purpose: Turn the recent stack-gap review into a practical, sequenced execution plan

Completion Status

| Item | Status | Notes |
| --- | --- | --- |
| Product analytics (PostHog) | Complete | All 9 events tracked, identifyUser/resetAnalytics wired to auth, PostHogProvider in main.tsx |
| Playwright e2e | Complete | 17 spec files, full CI job, auth storage state, both webServers managed in config |
| Better empty states | Complete | Illustrative empty states rolled out across 8 pages, upgraded EmptyState component with illustration + learn-more support, 2 new guide entries |
| Onboarding checklist | Complete | Backend status/dismiss endpoints, dashboard checklist widget with structured steps |
| Professional exports | Complete | PDF export via WeasyPrint with branded template, supporting data in all export formats, team branding CRUD + UI settings, supporting data capture CRUD + UI |
| Coverage gates in CI | Complete | Backend enforced at 80%, frontend coverage reporting enabled (no gate yet) |
| Security headers | Complete | HSTS, X-Frame-Options, X-Content-Type-Options, Referrer-Policy, Permissions-Policy, CSP report-only |
| Web Vitals / performance budgets | Complete | LCP, INP, CLS, FCP, TTFB reported to PostHog via web-vitals library |
| Search and recall improvements | Not started | |
| Evidence-rich sessions | Not started | |
| Smart PSA / client context | Not started | |
| Queue / worker architecture | Not started | |
| Buyer-facing trust surfaces | Not started | |


Summary

ResolutionFlow already has a credible application stack:

  • React 19 + TypeScript + Vite on the frontend
  • FastAPI + async SQLAlchemy + PostgreSQL on the backend
  • Sentry on both frontend and backend
  • CI with backend coverage plus frontend lint/test/build
  • Strong backend integration test coverage
  • Route-level lazy loading and bundle chunking already in place

The next step is not a stack rewrite.

The biggest gains now come from:

  1. Better product visibility
  2. Better release confidence
  3. Better enterprise trust signals
  4. Better workflow gravity inside the app

Ranked Recommendations

1. Fastest Wins

These are the best short-term upgrades if the goal is to make the product feel more polished and more professional quickly.

1. Product analytics instrumentation

Why: Sentry tells us when the app breaks. It does not tell us what users value, where they stall, or what converts.

Recommended: PostHog

Track first:

  • Account created
  • First successful login
  • First flow viewed
  • First session started
  • First session completed
  • First export generated
  • First AI feature used
  • First PSA integration connected
  • First shared session created

Why this is a fast win: High leverage with low UI churn.
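
One way to keep the "first X" events honest is to type the event names and forward each milestone only once per user. The sketch below is illustrative: the snake_case event names, the `createMilestoneTracker` helper, and the `AnalyticsSink` interface are assumptions, not the names used in the codebase. The sink only needs a `capture(event, properties)` method, which is the shape posthog-js exposes.

```typescript
// Event names mirror the "Track first" list; the snake_case spelling is an assumption.
const FIRST_EVENTS = [
  'account_created',
  'first_login',
  'first_flow_viewed',
  'first_session_started',
  'first_session_completed',
  'first_export_generated',
  'first_ai_feature_used',
  'first_psa_integration_connected',
  'first_shared_session_created',
] as const

type FirstEvent = (typeof FIRST_EVENTS)[number]

// Hypothetical minimal sink: anything with a capture() method fits.
interface AnalyticsSink {
  capture(event: string, properties?: Record<string, unknown>): void
}

// Returns a tracker that forwards each milestone at most once per user.
function createMilestoneTracker(sink: AnalyticsSink) {
  const seen = new Set<string>()
  return function track(
    userId: string,
    event: FirstEvent,
    properties: Record<string, unknown> = {},
  ): boolean {
    const key = `${userId}:${event}`
    if (seen.has(key)) return false // already reported since this tracker was created
    seen.add(key)
    sink.capture(event, { ...properties, user_id: userId })
    return true
  }
}
```

The de-duplication here is in memory only, so it resets on page reload. A durable "first time" guarantee would need a server-side flag or persisted state.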

2. Better empty states and onboarding guidance

Why: Mature apps reduce ambiguity. Empty libraries, empty analytics, and empty integrations pages should guide the next action immediately.

Add first:

  • Empty flow library CTA
  • Empty analytics explanation + “how data appears here”
  • Empty integrations state with benefit-oriented copy
  • New team “starter checklist”

3. Professional exports

Why: Exports are one of the fastest ways for a B2B product to feel premium.

Add first:

  • Client-ready PDF export
  • Logo/header metadata block
  • Cleaner ticket/session summary layout
  • Optional evidence attachment section

4. Coverage gates in CI

Why: Cheap trust signal internally. Prevents quality drift as the codebase expands.

Add first:

  • Fail backend CI if total coverage drops below agreed threshold
  • Publish frontend coverage report
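
If the frontend report later becomes a gate, the check itself is small. The sketch below assumes the coverage run emits an Istanbul-style `coverage-summary.json` (the `json-summary` reporter) and only models the one field it reads. `checkCoverageGate` is a hypothetical helper, and the 80% figure in the usage note is borrowed from the backend gate rather than an agreed frontend threshold.

```typescript
// Subset of an Istanbul json-summary report; only the field used below is modelled.
interface CoverageSummary {
  total: { lines: { pct: number } }
}

interface GateResult {
  ok: boolean
  message: string
}

// Compares total line coverage against a minimum percentage.
function checkCoverageGate(summary: CoverageSummary, minLinesPct: number): GateResult {
  const actual = summary.total.lines.pct
  const ok = actual >= minLinesPct
  return {
    ok,
    message: ok
      ? `line coverage ${actual}% meets the ${minLinesPct}% gate`
      : `line coverage ${actual}% is below the ${minLinesPct}% gate`,
  }
}
```

A CI step could call `checkCoverageGate(summary, 80)` and exit non-zero when `ok` is false.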

2. Best ROI

These are the best medium-term investments if the goal is to improve product quality and roadmap clarity without taking on huge platform risk.

1. Playwright end-to-end coverage

Why: Backend coverage is strong, but frontend confidence is still thinner than the app's complexity now deserves.

High-value flows to cover first:

  • Login
  • Authenticated app shell loads
  • Session history loads
  • Account settings save flow
  • Feedback submission
  • Shared session page access

Why this is high ROI: It catches real regressions users actually feel.

2. Security header hardening

Why: MSP buyers care about security posture. This is both a real protection layer and a professionalism layer.

Add first:

  • Content-Security-Policy
  • Strict-Transport-Security
  • X-Frame-Options or CSP frame-ancestors
  • Referrer-Policy
  • Permissions-Policy
  • Trusted host validation where appropriate
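
A header checklist like this is easy to verify in a test. The following sketch takes a plain name-to-value map of response headers and reports which expected headers are absent. `missingSecurityHeaders` is a hypothetical helper. It also checks X-Content-Type-Options, which appears in the status table but not in the list above, and it accepts a report-only CSP as present, which is an assumption about how strict the check should be during rollout.

```typescript
// Headers that must be present as-is (compared case-insensitively).
const REQUIRED_HEADERS = [
  'strict-transport-security',
  'x-content-type-options',
  'referrer-policy',
  'permissions-policy',
] as const

// Returns the lower-cased names of expected security headers that are absent.
function missingSecurityHeaders(headers: Record<string, string>): string[] {
  const lower = new Map<string, string>()
  for (const [name, value] of Object.entries(headers)) {
    lower.set(name.toLowerCase(), value)
  }

  const missing: string[] = REQUIRED_HEADERS.filter((name) => !lower.has(name))

  // Either an enforced CSP or a report-only CSP counts as "present".
  const hasCsp =
    lower.has('content-security-policy') ||
    lower.has('content-security-policy-report-only')
  if (!hasCsp) missing.push('content-security-policy')

  // Framing protection: X-Frame-Options, or an enforced CSP with frame-ancestors.
  const enforcedCsp = lower.get('content-security-policy') ?? ''
  if (!lower.has('x-frame-options') && !enforcedCsp.includes('frame-ancestors')) {
    missing.push('x-frame-options')
  }
  return missing
}
```

In an e2e or integration test, the map could come from the response to the app's root document, with the assertion being that the returned array is empty.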

3. Web Vitals and performance budgeting

Why: Route splitting is already implemented, so the next step is protecting performance over time.

Track first:

  • LCP
  • INP
  • CLS
  • Initial JS size
  • Editor route chunk size
  • Landing page chunk size
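
For the three Core Web Vitals, a budget check can be expressed as "is the 75th percentile within the published good range". The sketch below assumes samples have already been collected somewhere. The `Sample` shape and the function names are hypothetical, and the thresholds are the commonly published "good" limits (2500 ms LCP, 200 ms INP, 0.1 CLS). Bundle-size budgets are a separate build-time check and are not covered here.

```typescript
type Metric = 'LCP' | 'INP' | 'CLS'

// Upper bounds of the "good" range: milliseconds for LCP and INP, unitless for CLS.
const BUDGETS: Record<Metric, number> = { LCP: 2500, INP: 200, CLS: 0.1 }

interface Sample {
  name: Metric
  value: number
}

// 75th percentile using the nearest-rank method; returns 0 for an empty list.
function p75(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b)
  const index = Math.max(0, Math.ceil(sorted.length * 0.75) - 1)
  return sorted[index] ?? 0
}

// Lists the metrics whose p75 exceeds the budget; metrics without samples are skipped.
function overBudget(samples: Sample[]): Metric[] {
  const metrics: Metric[] = ['LCP', 'INP', 'CLS']
  return metrics.filter((metric) => {
    const values = samples.filter((s) => s.name === metric).map((s) => s.value)
    return values.length > 0 && p75(values) > BUDGETS[metric]
  })
}
```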

4. Search and recall improvements

Why: One of the biggest compounding opportunities is turning ResolutionFlow into team memory, not just a flow runner.

Good first step:

  • Search for similar sessions and prior resolutions by flow, tag, client, or ticket context
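
As a rough illustration of what a first version could look like, the sketch below ranks prior sessions by overlap with the current one. Everything in it is hypothetical: the `SessionSummary` shape, the weights, and the function names are placeholders, and ticket context is left out. A real implementation would more likely live in the backend query layer.

```typescript
interface SessionSummary {
  id: string
  flowId: string
  clientId: string
  tags: string[]
}

// Placeholder weights: same flow counts most, then same client, then each shared tag.
const WEIGHTS = { flow: 3, client: 2, tag: 1 }

function similarityScore(a: SessionSummary, b: SessionSummary): number {
  const sharedTags = a.tags.filter((tag) => b.tags.includes(tag)).length
  return (
    (a.flowId === b.flowId ? WEIGHTS.flow : 0) +
    (a.clientId === b.clientId ? WEIGHTS.client : 0) +
    sharedTags * WEIGHTS.tag
  )
}

// Returns prior sessions with a non-zero score, best match first.
function findSimilarSessions(
  current: SessionSummary,
  history: SessionSummary[],
  limit = 5,
): SessionSummary[] {
  return history
    .filter((session) => session.id !== current.id)
    .map((session) => ({ session, score: similarityScore(current, session) }))
    .filter((ranked) => ranked.score > 0)
    .sort((x, y) => y.score - x.score)
    .slice(0, limit)
    .map((ranked) => ranked.session)
}
```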

3. Biggest Enterprise / Trust Upgrades

These are the moves most likely to change how serious buyers perceive the product.

1. Evidence-rich sessions

Add:

  • Screenshot upload/paste
  • Attachments
  • Command output capture
  • Evidence in exports

Why it matters: MSP work is proof-heavy. Evidence makes the platform feel operationally complete.

2. Smart PSA / client context

Add:

  • Ticket details
  • Client/site context
  • Related recent sessions
  • Asset/configuration context
  • SLA metadata

Why it matters: This is what reduces alt-tabbing and makes ResolutionFlow feel indispensable.

3. Queue / worker architecture

Why: AI tasks, indexing, imports, notifications, and integration syncs will eventually compete with request handling.

Likely candidates:

  • AI generation jobs
  • KB imports
  • Embedding/indexing
  • Webhook fan-out
  • Scheduled maintenance orchestration
  • PDF generation

4. Buyer-facing trust surfaces

Add:

  • Changelog
  • Status page
  • Security page
  • Backup/export promise
  • Clear onboarding docs

Why it matters: Buyers infer maturity from these before they inspect the product deeply.


Do Now

  1. Add product analytics
  2. Add Playwright for core journeys
  3. Add security headers and trust hardening
  4. Improve empty states and professional exports

Do Next

  1. Add smart PSA/client context in sessions
  2. Add evidence-rich sessions and attachments
  3. Add search/recall improvements
  4. Add Web Vitals and performance budgets

Explore After That

  1. Add queue/worker architecture
  2. Expand offline/PWA support for session running
  3. Add deeper RMM context integrations

Playwright Implementation Plan

Goal

Add Playwright in a way that improves confidence quickly without creating a brittle, high-maintenance test suite.

The right strategy is:

  • start with a small smoke suite
  • prefer stable selectors and seeded users
  • avoid highly dynamic AI/editor interactions in phase 1
  • run Chromium first
  • only expand once the suite is reliable in CI

Why Playwright Fits This Stack

ResolutionFlow is a good Playwright candidate because:

  • the frontend is a browser-heavy SPA
  • route transitions and auth flows matter a lot
  • many important regressions are UI integration issues, not backend unit issues
  • CI already exists, so there is a natural place to add an e2e job

Start with the least brittle, highest-signal journeys.

Phase 1 tests

  1. Login smoke test

    • visit /login
    • sign in with seeded test user
    • verify redirect into authenticated app
  2. Authenticated shell loads

    • verify sidebar/nav renders
    • verify key route content appears
  3. Session history page loads

    • navigate to /sessions
    • verify tabs or session history shell renders
  4. Account settings save flow

    • navigate to /account/profile or /account
    • edit a safe field if possible
    • verify success toast/message
  5. Feedback form flow

    • navigate to /feedback
    • submit a simple feedback entry
    • verify success state
  6. Shared session public page

    • only if a reliable fixture exists
    • otherwise defer to phase 2
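
The login smoke test (item 1) can be sketched as a small helper. To stay runnable without installing Playwright, the helper is typed against a local structural subset of Playwright's `Page` and `Locator`; a real `page` object should satisfy it. The field labels, the button name, and the post-login URL pattern are guesses about the UI and would need to match the actual login form.

```typescript
// Structural subset of Playwright's Locator and Page, declared locally so the sketch has no imports.
interface LocatorLike {
  fill(value: string): Promise<void>
  click(): Promise<void>
}

interface PageLike {
  goto(url: string): Promise<unknown>
  getByLabel(text: string): LocatorLike
  getByRole(role: 'button', options: { name: string }): LocatorLike
  waitForURL(url: RegExp): Promise<void>
}

// Assumed UI details: adjust to the real labels and landing route.
const LOGIN_UI = {
  emailLabel: 'Email',
  passwordLabel: 'Password',
  submitName: 'Sign in',
  notLoginUrl: /^(?!.*\/login).*$/, // any URL that does not contain /login
}

async function loginViaUi(page: PageLike, email: string, password: string): Promise<void> {
  await page.goto('/login')
  await page.getByLabel(LOGIN_UI.emailLabel).fill(email)
  await page.getByLabel(LOGIN_UI.passwordLabel).fill(password)
  await page.getByRole('button', { name: LOGIN_UI.submitName }).click()
  // Last step of the smoke test: the app should leave /login once auth succeeds.
  await page.waitForURL(LOGIN_UI.notLoginUrl)
}
```

Inside a spec, this would be called with the `page` fixture and the seeded credentials, followed by an assertion on something only the authenticated shell renders.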

Avoid in Phase 1

  • AI chat assertions
  • Monaco-heavy editor interactions
  • drag-and-drop editor behavior
  • cross-reference graph assertions
  • timing-sensitive maintenance flows

Those are better once the test harness is stable.


Test User Strategy

Use your existing seeded local users from seed_test_users.py.

Existing seeded accounts

  • admin@resolutionflow.example.com
  • pro@resolutionflow.example.com
  • teamadmin@resolutionflow.example.com
  • engineer@resolutionflow.example.com

Shared password

  • TestPass123!

Use teamadmin@resolutionflow.example.com for most authenticated tests.

Why:

  • broad enough permissions
  • less risky than binding all tests to super admin
  • closer to realistic team usage
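
A small fixture module can keep these accounts in one place so specs never hard-code emails. This is a sketch of what something like e2e/fixtures/auth.ts might hold; the role keys and the `credentialsFor` helper are illustrative names, while the emails and password come from the seed data listed above.

```typescript
// Accounts created by seed_test_users.py, keyed by a short role name (keys are illustrative).
const SEEDED_USERS = {
  admin: 'admin@resolutionflow.example.com',
  pro: 'pro@resolutionflow.example.com',
  teamAdmin: 'teamadmin@resolutionflow.example.com',
  engineer: 'engineer@resolutionflow.example.com',
} as const

type SeededRole = keyof typeof SEEDED_USERS

// Shared password for all seeded local test accounts.
const SHARED_PASSWORD = 'TestPass123!'

// Defaults to the team admin, the account recommended for most authenticated tests.
function credentialsFor(role: SeededRole = 'teamAdmin'): { email: string; password: string } {
  return { email: SEEDED_USERS[role], password: SHARED_PASSWORD }
}
```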

How To Implement Playwright

1. Add dependencies

From frontend/:

npm install -D @playwright/test
npx playwright install chromium

Optional later:

npx playwright install

That installs Firefox/WebKit too, but Chromium is the right starting point.


2. Add package scripts

Add these scripts to frontend/package.json:

{
  "scripts": {
    "test:e2e": "playwright test",
    "test:e2e:ui": "playwright test --ui",
    "test:e2e:headed": "playwright test --headed",
    "test:e2e:debug": "playwright test --debug"
  }
}

3. Add Playwright config

Create frontend/playwright.config.ts.

Recommended shape:

import { defineConfig, devices } from '@playwright/test'

export default defineConfig({
  testDir: './e2e',
  fullyParallel: true,
  forbidOnly: !!process.env.CI,
  retries: process.env.CI ? 2 : 0,
  workers: process.env.CI ? 2 : undefined,
  reporter: [['html'], ['list']],
  use: {
    baseURL: process.env.PLAYWRIGHT_BASE_URL || 'http://127.0.0.1:4173',
    trace: 'on-first-retry',
    screenshot: 'only-on-failure',
    video: 'retain-on-failure',
  },
  webServer: {
    command: 'npm run preview -- --host 127.0.0.1 --port 4173',
    port: 4173,
    reuseExistingServer: !process.env.CI,
    timeout: 120_000,
  },
  projects: [
    {
      name: 'chromium',
      use: { ...devices['Desktop Chrome'] },
    },
  ],
})

Why vite preview instead of vite dev

Use the built app in e2e so tests are closer to production behavior and less vulnerable to dev-server quirks.
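
The status table notes that both web servers ended up managed in the Playwright config. Playwright's `webServer` option accepts an array, so one possible shape is sketched below as a plain function whose result would be assigned to `webServer`. The `buildWebServers` name, the local `WebServerEntry` type, and the `../backend` working directory are assumptions; migrations and seeding would still need to run before the backend starts.

```typescript
// Fields of a Playwright webServer entry that this sketch uses.
interface WebServerEntry {
  command: string
  port: number
  cwd?: string
  reuseExistingServer: boolean
  timeout: number
}

// `isCi` would normally be derived from process.env.CI inside playwright.config.ts.
function buildWebServers(isCi: boolean): WebServerEntry[] {
  return [
    {
      // Backend command taken from the CI commands; the working directory is an assumption.
      command: 'uvicorn app.main:app --host 127.0.0.1 --port 8000',
      port: 8000,
      cwd: '../backend',
      reuseExistingServer: !isCi,
      timeout: 120_000,
    },
    {
      // Serves the already-built frontend, matching the single-server config above.
      command: 'npm run preview -- --host 127.0.0.1 --port 4173',
      port: 4173,
      reuseExistingServer: !isCi,
      timeout: 120_000,
    },
  ]
}
```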


4. Add an e2e folder structure

Recommended:

frontend/
  e2e/
    auth.spec.ts
    navigation.spec.ts
    feedback.spec.ts
    fixtures/
      auth.ts
    utils/
      session.ts

Keep helpers small. Avoid building a giant abstraction layer too early.


5. Add a login helper

Because the app stores tokens in localStorage, there are two valid strategies:

Option A: Log in through the UI

Best for the first smoke test.

Pros:

  • verifies the real login flow
  • simple to understand

Cons:

  • slower when repeated across many specs

Option B: Log in through the API and set storage state

Best after the first smoke test works.

Pros:

  • much faster
  • reduces duplicated login steps across tests

Cons:

  • does not itself verify the login form UI

Recommended:

  • keep one UI login spec
  • use API login + saved storage state for the rest

Because the backend already supports POST /api/v1/auth/login/json, Playwright can authenticate directly.

Example shape:

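import { request, expect } from '@playwright/test'

// Logs in as the seeded team admin and returns the parsed login response.
export async function loginViaApi(baseApiUrl: string) {
  // Standalone API context, independent of any browser page.
  const api = await request.newContext()
  const response = await api.post(`${baseApiUrl}/api/v1/auth/login/json`, {
    data: {
      email: 'teamadmin@resolutionflow.example.com',
      password: 'TestPass123!',
    },
  })

  expect(response.ok()).toBeTruthy()
  // Read the body before disposing the context, which releases stored responses.
  const tokens = await response.json()
  await api.dispose()
  return tokens
}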

Then inject access_token and refresh_token into localStorage before page load.
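
One way to do that injection is to build a Playwright storage-state object and let Playwright load it before any page script runs. The sketch below assumes the login response exposes `access_token` and `refresh_token` fields and that the app reads localStorage keys with those same names. `buildStorageState` and the local types are hypothetical.

```typescript
interface AuthTokens {
  access_token: string
  refresh_token: string
}

// JSON layout Playwright reads through its storageState option.
interface StorageState {
  cookies: never[]
  origins: {
    origin: string
    localStorage: { name: string; value: string }[]
  }[]
}

// `appOrigin` must be the frontend origin (where localStorage lives), not the API origin.
function buildStorageState(appOrigin: string, tokens: AuthTokens): StorageState {
  return {
    cookies: [],
    origins: [
      {
        origin: appOrigin,
        localStorage: [
          { name: 'access_token', value: tokens.access_token },
          { name: 'refresh_token', value: tokens.refresh_token },
        ],
      },
    ],
  }
}
```

Serialized with `JSON.stringify` and written to a file during global setup, the result can be referenced from `use.storageState` in the config so every authenticated spec starts already logged in.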


6. Make selectors stable

Playwright is easiest to maintain when the UI exposes stable selectors.

The current UI already has some good aria-label coverage, which is enough for many tests. Where a flow is critical or copy is likely to change, add data-testid.

Good candidates:

  • login form
  • login submit button
  • app sidebar
  • session history page shell
  • feedback form
  • save buttons on important settings pages

Rule of thumb

  • prefer getByRole() and getByLabel() first
  • add data-testid for high-value flows where text is decorative or likely to change

7. Seed data before e2e runs

Phase 1 should not depend on manually-created accounts.

Recommended flow:

  1. start Postgres
  2. run backend migrations
  3. run python -m scripts.seed_test_users
  4. start backend
  5. start frontend preview server
  6. run Playwright

If later tests need trees, add a second seed step for flows:

python -m scripts.seed_trees
python -m scripts.seed_trees_v2
python -m scripts.seed_procedural_flows

For the very first phase, user-seeding alone is enough if tests stay focused on auth, navigation, feedback, and settings.


8. Add initial smoke specs

auth.spec.ts

Covers:

  • login page loads
  • valid login succeeds
  • invalid login shows error

navigation.spec.ts

Covers:

  • authenticated app shell renders
  • /sessions loads
  • /feedback loads
  • /account loads

feedback.spec.ts

Covers:

  • feedback form submit
  • success state visible

Keep these small. One assertion-heavy mega-test is worse than a few short focused tests.


9. Add Playwright to CI

The existing CI workflow lives in .github/workflows/ci.yml. Add a separate e2e job instead of mixing Playwright into the existing frontend unit-test job.

Suggested job steps:

  1. checkout
  2. set up Python
  3. set up Node
  4. start Postgres service
  5. install backend dependencies
  6. install frontend dependencies
  7. run migrations
  8. seed test users
  9. start backend in background
  10. build frontend
  11. install Playwright browser
  12. run Playwright against vite preview
  13. upload Playwright report/artifacts on failure

Important detail

Set VITE_API_URL=http://127.0.0.1:8000 for the frontend build used in CI e2e.


Suggested CI Commands

Backend:

cd backend
alembic upgrade head
python -m scripts.seed_test_users
uvicorn app.main:app --host 127.0.0.1 --port 8000 &

Frontend:

cd frontend
npm ci
npm run build
npx playwright install --with-deps chromium
npm run test:e2e

Use environment variables:

VITE_API_URL=http://127.0.0.1:8000
PLAYWRIGHT_BASE_URL=http://127.0.0.1:4173

Phase 2 Expansion

Once the smoke suite is stable, expand into actual business-critical flows.

Phase 2 candidates

  1. Start and resume a session
  2. Export a session
  3. Create a share link
  4. Open analytics pages
  5. Validate account integrations page behavior

Phase 2.5 candidates

  1. Tree library filters
  2. Fork-a-flow journey
  3. Step library browse/search
  4. Public shared session experience

Phase 3 candidates

  1. Editor workflows
  2. Procedural runner
  3. Drag-and-drop interactions
  4. AI-assisted workflows

Only bring editors and AI into Playwright once the harness is already trustworthy.


Practical Advice For This Repo

Keep Playwright separate from Vitest

Vitest should stay for:

  • small component logic
  • hooks
  • utilities
  • API client logic

Playwright should cover:

  • auth
  • routing
  • critical user journeys
  • integration behavior

Don't try to test everything

You do not need Playwright coverage for every page. Cover the flows that:

  • affect demos
  • affect activation
  • affect trust
  • are expensive to break

Start with one browser

Chromium first.

Only add Firefox/WebKit after the suite is stable and worth the extra runtime.

Prefer reliable fixture creation over brittle UI setup

Use backend seeds and API helpers whenever possible.


Keep the first implementation, ideally a single PR, intentionally small.

Include

  1. @playwright/test dependency
  2. playwright.config.ts
  3. e2e scripts in package.json
  4. one UI login smoke test
  5. one authenticated navigation smoke test
  6. CI e2e job

Do not include yet

  • editor drag-and-drop tests
  • AI flow tests
  • PDF validation
  • multi-browser matrix
  • large helper framework

That first PR should prove the harness works end to end.


Final Recommendation

If only one quality investment gets prioritized right now, it should be:

Playwright + product analytics together

Why:

  • Playwright improves confidence in shipping
  • analytics improves confidence in prioritizing

That combination is one of the cleanest ways to make ResolutionFlow feel more professional both internally and externally.