# PROJECT_CONTEXT.md — ResolutionFlow > SaaS troubleshooting platform for MSPs. Stable architectural truth. Updated only when the repo's shape changes. --- ## Product & naming Canonical product name is **ResolutionFlow**. `patherly` is the legacy internal name — still present in DB name (`patherly` on Railway, `resolutionflow` locally), some Railway service names, and historical paths. Treat as aliases, not canonical. Docker containers are `resolutionflow_*`. **User terminology:** "Flows" (not Trees), "Projects" (not Procedures), "Solutions Library" (not Step Library). Maintenance flows hidden from pilot UI (backend retains them). DB column `tree_type` values unchanged. --- ## SaaS shape Multi-tenant by account. Primary role hierarchy: `super_admin` > `owner` > `engineer` > `viewer` — driven by `is_super_admin` + `account_role`. Never `role=='admin'` — use `is_super_admin`. Separate team-scoped admin gate exists orthogonally to the role hierarchy: `is_team_admin=True` + valid `team_id`, enforced by `require_team_admin`. Backend deps in `app/api/deps.py`: `get_current_active_user`, `require_engineer_or_admin`, `require_admin`, `require_account_owner`, `require_team_admin`. Frontend: `usePermissions()` hook. Central logic in `backend/app/core/permissions.py` + `frontend/src/hooks/usePermissions.ts`. --- ## Status Go-to-Market Validation (pre-PMF). Backend feature-complete (55+ endpoints, 100+ tests). Phase 0.5 FlowPilot telemetry baseline accruing. See [CURRENT-STATE.md](../CURRENT-STATE.md) for live status, [03-DEVELOPMENT-ROADMAP.md](../03-DEVELOPMENT-ROADMAP.md) for phases. --- ## Tech stack - **Backend:** Python 3.11 + FastAPI, SQLAlchemy 2.0 async (asyncpg), Alembic, Pydantic v2, JWT (python-jose + bcrypt, JTI refresh rotation), APScheduler (in-process with FastAPI lifespan). - **Frontend:** React 19 + Vite + TypeScript, Tailwind v4 (CSS-only config in `index.css`), Zustand (immer + zundo), React Router v7, Axios (token-refresh interceptor), Lucide. - **DB:** PostgreSQL 16 (RLS enabled Phase 4, pgvector). --- ## Project structure ``` resolutionflow/ ├── backend/ │ ├── app/ │ │ ├── main.py # FastAPI entry │ │ ├── api/endpoints/ # 50+ routers registered in api/router.py — auth/admin, trees/sessions, AI/chat, scripts, integrations, uploads, accounts, FlowPilot, etc. │ │ ├── api/deps.py # auth deps (incl. require_team_admin) │ │ ├── api/router.py # registration │ │ ├── core/ # config, database, permissions, security, audit, rate_limit │ │ ├── models/ # SQLAlchemy (incl. FlowProposal) │ │ ├── schemas/ # Pydantic │ │ ├── services/psa/ # PSA provider pattern (base, connectwise/, autotask/, halopsa/, cache, encryption, exceptions, registry, ticket_context, types) │ │ ├── services/knowledge_flywheel.py + _scheduler.py │ │ └── services/knowledge_gap_service.py │ ├── alembic/versions/ # 001-070 sequential, then hex hash │ ├── scripts/ # seed_data, seed_trees, seed_test_users │ └── tests/ # pytest integration ├── frontend/ │ ├── src/ │ │ ├── api/ # Axios client + endpoint modules │ │ ├── components/ # common, layout, dashboard, tree-editor, session, procedural, procedural-editor, library, step-library, ui, flowpilot │ │ ├── hooks/ # usePermissions, useSessionTimer, useKeyboardShortcuts │ │ ├── pages/ │ │ ├── store/ # Zustand (auth, treeEditor, proceduralEditor, userPreferences, scriptGeneratorStore) │ │ └── types/ │ └── (Tailwind v4 CSS-only config in src/index.css) ├── docs/plans/archive/ # pre-March 2026 plans ├── docs/connectwise/ # CW API reference + best-practices guides ├── docs/LESSONS-ARCHIVE.md # archived lessons (fixes in code) ├── .ai/ # dual-agent handoff system (see .ai/README.md) ├── CLAUDE.md · AGENTS.md · CURRENT-STATE.md · DESIGN-SYSTEM.md · DEV-ENV.md ``` --- ## Dev commands Full setup in [DEV-ENV.md](../DEV-ENV.md) (host-agnostic, with homelab Proxmox reference topology). Day-to-day: ```bash docker compose -f docker-compose.dev.yml up -d # start stack cd backend && source venv/bin/activate && uvicorn app.main:app --reload cd frontend && npm run dev pytest --override-ini="addopts=" # tests (first time: CREATE DATABASE resolutionflow_test) cd backend && alembic upgrade head # migrate cd backend && alembic revision -m "desc" # manual migration (preferred per Lesson 77) cd backend && alembic revision --autogenerate -m "desc" # picks up drift; review carefully cd frontend && npm run build # stricter than tsc --noEmit — final check cd frontend && npx tsc -b # TS-only check when dist/ has EACCES docker exec -it resolutionflow_postgres psql -U postgres -d resolutionflow python -m scripts.seed_trees # seed (from backend/) ``` **Never pass `--rev-id`** to alembic — let it generate the hex hash. **On hosts without native `python`/`node`/`npm`** (e.g. the code-server LXC), run commands inside the already-running containers instead: ```bash docker exec resolutionflow_backend pytest --override-ini="addopts=" docker exec resolutionflow_backend alembic upgrade head docker exec -w /app resolutionflow_frontend npm run build docker exec -w /app resolutionflow_frontend npx tsc -b ``` --- ## URLs & test users **URLs:** Frontend , backend , API docs . **Test users** (all password `TestPass123!`): `admin@resolutionflow.example.com` (super_admin), `teamadmin@resolutionflow.example.com`, `engineer@resolutionflow.example.com`, `pro@resolutionflow.example.com`. --- ## CI Gitea (`gitea.resolutionflow.com/chihlasm/resolutionflow/actions`). `gh` CLI works for issues/PRs on the GitHub mirror, but not CI runs. --- ## Deployment (Railway) - **Prod:** `resolutionflow.com` (frontend), `api.resolutionflow.com` (backend). - Auto-deploy: Gitea push → GitHub mirror → Railway follows GitHub `main`. - PR environments auto-created; need manual domain generation + `VITE_API_URL` with `https://` prefix. - `ALLOW_RAILWAY_ORIGINS=true` for `*.up.railway.app` CORS. - Shared Variables (Railway project-level) auto-propagate to PR envs — use for secrets like `ANTHROPIC_API_KEY`. - Super admin utility: `backend/make_superadmin_simple.py list|`. --- ## ConnectWise PSA Reference: `docs/connectwise/` — start with `CONNECTWISE-API-REFERENCE.md`, then the `best-practices/` guides. Extracted OpenAPI spec in `connectwise-psa-resolutionflow-reference.json` (670 endpoints, v2025.16); full spec in `connectwise-psa-openapi-full.json`. - **Auth:** API Key (Base64 `companyId+publicKey:privateKey`) + `clientId` header every request. `clientId` is server-side (`CW_CLIENT_ID` in `config.py`) — identifies ResolutionFlow, not per-tenant. Per-connection: `company_id`, `public_key`, `private_key`, `server_url`. - **Architecture:** `services/psa/` provider pattern — `PSAProvider` base, `ConnectWiseProvider` impl, `PsaProviderRegistry` for multi-PSA dispatch. Credentials encrypted at rest via `services/psa/encryption.py` (Fernet). Per-team credentials, never per-user. Endpoints in `api/endpoints/integrations.py`. In-memory TTL cache in `services/psa/cache.py`. - **Integration flows:** session docs → ticket notes (`POST /service/tickets/{id}/notes`, markdown supported); ticket context → FlowPilot; callbacks via `/system/callbacks` with HMAC verification. - **API rules:** pin version via Accept header `application/vnd.connectwise.com+json; version=2025.16`. Paginate ≤1000/page. Dynamic base URL via `/login/companyinfo/{companyId}`. Request minimal permissions (MY, not ALL). --- ## Coding standards - **Python:** type hints everywhere, async/await for DB, Pydantic v2, `DateTime(timezone=True)` always. - **TypeScript:** interfaces for all data, `const` over `let`, functional components + hooks, shared logic in custom hooks. - **Git:** feature branch before committing (`git checkout -b feat/feature-name`). Commit format: `type: description` (feat/fix/refactor/docs/test/chore). Large features: commit per phase with `npm run build` validation. Push to Gitea — auto-mirrors to GitHub (`.gitea/workflows/mirror-to-github.yml`); never push GitHub directly. (Agent-specific `Co-Authored-By` trailers live in CLAUDE.md / AGENTS.md.) **After shipping:** update [CURRENT-STATE.md](../CURRENT-STATE.md) + [03-DEVELOPMENT-ROADMAP.md](../03-DEVELOPMENT-ROADMAP.md), `gh issue close #N` for resolved issues, add lessons only for non-obvious traps (otherwise let the code speak). --- ## Common tasks - **New endpoint:** `endpoints/` → `router.py` → `schemas/` → tests → frontend API client. - **New page:** `pages/` → route in `router.tsx` → nav in `AppLayout.tsx`. - **New public route:** top-level in `router.tsx` alongside `/login`, not inside `ProtectedRoute`. - **New frontend API module:** types in `types/` → export from `types/index.ts` → client in `api/` → export from `api/index.ts`. - **Schema change:** update model → `alembic revision -m "desc"` → review → `alembic upgrade head`. - **New `VITE_*` env var:** add as `ARG` + `ENV` in `frontend/Dockerfile` for Railway builds (Lesson 60 — Railway env vars are runtime-only, Vite bakes at build time). - **Account sub-page:** add route in `router.tsx` under `account` children + add link card in `AccountSettingsPage.tsx` — `AccountLayout` has NO sidebar nav. --- ## Design system **Source of truth: [DESIGN-SYSTEM.md](../DESIGN-SYSTEM.md).** Read before any visual change. - Flat high-contrast dark theme, Sentry/PostHog-inspired. **No** glass, backdrop blur, ambient orbs, gradient surfaces. - Accent **electric blue** (#60a5fa dark / #2563eb light) — ≤5% of UI, interactive elements only. Warning amber (#fbbf24), info cyan (#67e8f9), success green (#34d399), danger red (#f87171). Each with `-dim` at 10% opacity. - Backgrounds: `bg-sidebar` (#0e1016) → `bg-page` (#16181f) → `bg-card` (#1e2028) → `bg-elevated` (#2a2d38). Borders `border-default` / `border-hover`. - Text: `text-heading` → `text-primary` → `text-muted-foreground` → `text-muted`. - Fonts: IBM Plex Sans (body), Bricolage Grotesque (heading, 700 weight for logo), JetBrains Mono (code). - Logo: 30px gradient square (ember orange) + "ResolutionFlow" in Bricolage Grotesque. Assets in `brand-assets/`, `frontend/src/assets/brand/`, `frontend/public/icons/`. - Mockups: `docs/mockups/` (HTML). - **Deprecated — do not use:** glass-card, glass-stat, `bg-gradient-brand`, `backdrop-filter: blur()`, ambient orbs, purple gradients, ember orange as accent, cyan as accent (cyan is info only). --- ## Frontend patterns - **Component basics:** `cn()` from `@/lib/utils`, Lucide icons, `Modal.tsx` for modals (mobile-responsive `items-end sm:items-center` + `max-w-full sm:max-w-lg`). - **Types:** Create in `types/`, export from `types/index.ts`, `import type { T } from '@/types'`. - **Routing:** `getTreeNavigatePath()` / `getTreeEditorPath()` from `@/lib/routing`. Tree editor is `/trees/new`. All dashboard session clicks → `/pilot/:id` regardless of `session_type`. - **Lazy routes:** `lazyWithRetry` from `@/lib/lazyWithRetry.ts`, not `React.lazy` (auto-reload on stale chunks). - **Public pages:** raw `fetch()` with full URL, NOT `apiClient` (which requires auth tokens). - **Toast:** `toast.warning()` not `toast.warn()`. Import from `@/lib/toast` — methods: `success`, `error`, `warning`, `info`. - **Assistant chat:** uses local React `useState`, not Zustand. All three send paths (`handleSend`, `sendPrefill`, `handleResumeNew`) must call `setShowTaskLane(true)` when response has actions/questions. - **Chat backend wiring:** `aiSessionsApi.sendChatMessage` → `/ai-sessions/{id}/chat` → `unified_chat_service.py`. NOT `assistant_chat_service.py` (removed except retention settings). - **FlowPilot:** Actions live in page header (Resolve/Escalate/Share Update + overflow). `useBlocker` for active-session nav guard. "Pause & Leave" auto-pauses. - **AI markers:** `[QUESTIONS]`, `[ACTIONS]`, `[FORK]`, `[DELTA]...[/DELTA]` (editor), `[TREE_UPDATE]` (troubleshooting builder), `[STEPS_UPDATE]` (procedural builder), `[METADATA]`. Parsed in `unified_chat_service.py`; conversation history stores stripped `display_content`. If markers disappear: check system-prompt final reminder + per-user-message `[SYSTEM: ...]` injection in `_call_anthropic_cached()`. - **Image uploads:** paste/attach → Railway S3 via `uploadsApi.upload()` → resized by `storage_service.resize_image_for_vision()` (Pillow, 1568px max, PNG→JPEG) → base64 → Claude multimodal blocks. Max 3/msg. Images NOT stored in history. - **Async select-load-apply:** guard with a ref (pattern in `AssistantChatPage` `currentChatRef`). Update synchronously on every selection change; after every `await`, bail out if `ref.current !== thisId`. - **Editor-Embedded Flow Assist:** `EditorAIPanel` (320px side panel) + `useEditorAI`. Ghost nodes via `_suggestion: true`. Route actions via `settings.get_model_for_action()`. - **Script Builder:** `/script-builder`, chat-style. Backend `ScriptBuilderSession`, `script_builder_service.py`, endpoints `/scripts/builder/`. FlowPilot handoff via `action_type: "open_script_builder"` + `sessionStorage`. - **Intake form field schema:** `variable_name` + `field_type` (NOT `name` / `type`). - **Node field priority** (copilot, summaries): `title` → `question` → `description` → `content` → `label`. - **Procedural sessions auto-start** on page load (no intake/Start screen). Troubleshooting flows DO have a start screen. --- ## Critical lessons > Lessons 1-40 archived to [docs/LESSONS-ARCHIVE.md](../docs/LESSONS-ARCHIVE.md) — fixes baked into the codebase. **Grep the archive when an error message or symptom is unfamiliar, or after two failed attempts at resolving an issue.** Don't pre-load for routine work. ### Backend / data - **APScheduler interval jobs always `max_instances=1`** — without it, overlapping runs reprocess records (TOCTOU). - **`get_db` rolls back on exception** — never remove the `await session.rollback()`, or one failed request poisons the connection with `InFailedSQLTransaction` cascading. - **Startup routines on tenant-isolated tables must use `_admin_session_factory()`, not `get_db()`.** Phase 4 RLS has no `app.current_account_id` set at startup. `get_service_account_id` is safe (reads cached `app.state`). - **Backfill migrations adding `account_id`:** grep ALL `ModelClass(` sites in service code to verify `account_id=` is passed. SQLAlchemy accepts `None` silently — Phase 4 RLS WITH CHECK surfaces the problem at runtime as `InsufficientPrivilegeError: new row violates row-level security policy`. - **`tree_shares.account_id = tree.account_id`**, never `current_user.account_id`. A super_admin sharing another tenant's tree must produce the share in the tree owner's tenant, or it becomes invisible post-RLS. - **Global tables (no `account_id`, never in RLS migrations):** `script_categories`, `platform_steps`, `template_trees`, `plan_feature_defaults`, `accounts`. Scan at class level — one `.py` file can hold multiple classes with different columns (e.g. `ScriptCategory` vs `ScriptTemplate`). - **`ai_sessions.status` is VARCHAR(30)** — fits `requesting_escalation` (23 chars). Migration `f0aad74ea51b` widened from 20. - **PostgreSQL `func.sum(case(...))` returns `Decimal` via asyncpg** — cast to `int()` before Pydantic `dict[str, Any]`. - **Enhancement / branch_addition proposals need `modified_flow_data` via "Edit & Publish"** — backend 400 on direct approve. Only `new_flow` supports direct approve. - **Adding email types:** static async method on `EmailService` in `core/email.py`. Fire-and-forget from endpoints (log errors, don't fail the request). ### AI / FlowPilot - **Anthropic SDK `max_retries=1`** — default of 2 can take 3× the timeout. - **Model tier routing:** `settings.get_model_for_action(action_type)`. Always alias form (`claude-sonnet-4-6`). - **FlowPilot must ask GUI-vs-script before suggesting either** when both are viable — see `FLOWPILOT_SYSTEM_PROMPT` in `flowpilot_engine.py`. - **Telemetry events to grep:** `anthropic.cache` (prompt-cache hit/create), `mcp.turn` (per-turn MCP availability), `mcp.fallback` (MCP silent-retry fired). - **Don't put literal payloads in system prompts.** Bit us twice in one day: a worked `[QUESTIONS]` example with literal "Outlook + jsmith" content, and a full DNS troubleshooting tree, both caused Claude to recite that content on unrelated tickets — the symptom looked like task-lane state leaking across chats. The fix is structural: every output example in a system prompt uses `` syntax (`{"text": ""}`), never literal field values. Real-looking format examples live in few-shot messages (separate file, separate code path), not system prompts. Guardrail: `tests/test_prompt_anti_parrot.py` scans every `*_PROMPT`/`*_SCHEMA`/`*_PROTOCOL`/`*_FORMAT` constant in `app/services/` and `app/core/`; CI fails when a marker block contains a literal JSON value or when a known leaked token (jsmith, DC01, ADSync, Dnscache, etc.) appears anywhere in a prompt. ### Frontend / UI - **Flex height chain:** every ancestor from `app-shell` grid to React Flow canvas needs `flex` + `flex-1` + `min-h-0` or `h-full`. Missing `flex` collapses to 0. Same rule for FlowPilot action bar and any tall scroller. - **React Flow CSS in Tailwind v4:** import in `index.css`, not component JS. Override dark theme via `--xy-*` CSS vars. - **`text-secondary` renders invisible on dark** — Tailwind v4 maps it to `--color-secondary` (a surface color). Use `text-muted-foreground` for readable secondary text. Avoid `text-muted` for body — labels only. - **`bg-accent` is electric blue — never for code/kbd.** Use `bg-white/[0.12] border border-white/[0.06]` for inline code, `bg-white/[0.08]` for kbd. Accent reserved for interactive elements. - **`landing.css` uses self-contained `--lp-*` vars** — never `var(--color-*)` theme tokens (they resolve incorrectly outside the app shell). - **Never `transition: all`** — list properties explicitly, or layout props animate and jank. - **Date range filter end dates:** `setHours(23, 59, 59, 999)` before sending, or the day's items are excluded. For string-based date inputs, append `T23:59:59.999Z`. - **TopBar search:** full bar `hidden sm:block`, icon button `sm:hidden` — both open CommandPalette. - **Hover pop-out cards:** scrim `pointer-events-none`, expanded card has its own click handler at `z-50`, dismiss via `onMouseLeave` on wrapper. Never put handlers on the scrim. - **`tsc -b` in Dockerfile is stricter than `tsc --noEmit`** — enforces `noUnusedLocals` / `noUnusedParameters` as hard errors. Check IDE yellow squiggles before pushing. - **Dashboard prefill auto-submits** via `useEffect` + `prefillHandledRef` guard — no double-enter. - **Global Axios 5xx interceptor fires before component `.catch()`** — fix optional-data endpoints at the source (return `[]` / `{}` on provider failure), not in the component. - **Playwright strict mode:** scope selectors to avoid sidebar/main ambiguity. Use `getByRole('heading', { name })` or `.animate-scale-in` locators, not bare `getByText()`. ### Env / infra - **Node 20.19+ required** (Vite 7). `nvm use 20` or `PATH="$HOME/.nvm/versions/node/v20.19.0/bin:$PATH"`. - **Railway backend service is `patherly`, DB name `railway`.** Public Postgres proxy: `interchange.proxy.rlwy.net:45797`. - **Railway Object Storage bucket `resolutionflow-uploads`.** Env vars `STORAGE_*`. boto3 in `storage_service.py`. Dockerfile needs Pillow + `libjpeg-dev` / `zlib1g-dev`. - **PostHog:** `PostHogProvider` + `posthog.init()` in `main.tsx`. Helpers in `lib/analytics.ts`. Env: `VITE_PUBLIC_POSTHOG_KEY`, `VITE_PUBLIC_POSTHOG_HOST`. `identifyUser()` in `authStore.fetchUser()`, `resetAnalytics()` on logout. - **bun PATH on devserver01:** `BUN_INSTALL="$HOME/.bun"`, `PATH="$BUN_INSTALL/bin:$PATH"`. Playwright Chromium needs `libatk1.0-0 libatk-bridge2.0-0 libcups2 libxkbcommon0 libatspi2.0-0 libxcomposite1 libxdamage1 libxfixes3 libxrandr2 libgbm1 libasound2`. - **Full-stack change:** trace schema → endpoint → API client → hook → store → UI. Don't assume one end proves the other. - **Dev env** — see [DEV-ENV.md](../DEV-ENV.md) for current topology, `REPO_ROOT` requirement when compose runs inside a container, Vite `allowedHosts`, linuxserver.io `group_add` + custom-cont-init.d workaround, `docker compose up` no-op-on-unchanged-hash gotcha. --- ## Quick reference | What | Where | |---|---| | Detailed status | [CURRENT-STATE.md](../CURRENT-STATE.md) | | Roadmap | [03-DEVELOPMENT-ROADMAP.md](../03-DEVELOPMENT-ROADMAP.md) | | Design system | [DESIGN-SYSTEM.md](../DESIGN-SYSTEM.md) | | Dev env | [DEV-ENV.md](../DEV-ENV.md) | | Archived lessons | [docs/LESSONS-ARCHIVE.md](../docs/LESSONS-ARCHIVE.md) | | ConnectWise API | `docs/connectwise/` | | GitHub issues | `gh issue list --state open` | | Local API docs | | | Handoff system | [.ai/README.md](README.md) |