diff --git a/.ai/CURRENT_TASK.md b/.ai/CURRENT_TASK.md new file mode 100644 index 00000000..f70cc983 --- /dev/null +++ b/.ai/CURRENT_TASK.md @@ -0,0 +1,23 @@ +# CURRENT_TASK.md + + + +**Task:** One-sentence goal describing what this task accomplishes. + +**Status:** not-started + +**Definition of Done:** +- [ ] Testable criterion 1 (e.g., "endpoint returns 200 with expected payload shape") +- [ ] Testable criterion 2 (e.g., "frontend displays new field without layout regression") +- [ ] Tests added or updated +- [ ] `npm run build` passes (frontend) / `pytest` passes (backend) + +**Assumptions:** +- What we're treating as given (e.g., "existing auth middleware handles this case") +- Constraints inherited from surrounding work + +**Out of scope:** +- What this task explicitly does NOT cover (prevents scope creep across handoffs) +- Adjacent work that belongs in a separate task + + diff --git a/.ai/DECISIONS.md b/.ai/DECISIONS.md new file mode 100644 index 00000000..da20d881 --- /dev/null +++ b/.ai/DECISIONS.md @@ -0,0 +1,31 @@ +# DECISIONS.md + +> Append-only architectural decision log. Newest entries at the top. +> Entry format: +> +> ``` +> ## YYYY-MM-DD — +> **Context:** why this came up +> **Decision:** what we chose +> **Rejected:** what we didn't choose and why +> **Consequences:** what this means going forward +> ``` + +--- + +## 2026-04-24 — Adopt dual-agent handoff system (`.ai/` + `CLAUDE.md` + `AGENTS.md`) + +**Context:** Claude Code hits session and weekly usage limits. Work stalls when the primary agent is locked out. Needed a structured way for OpenAI Codex to resume where Claude left off without losing architectural truth or drifting across sessions. + +**Decision:** Split the old CLAUDE.md into `.ai/PROJECT_CONTEXT.md` (stable repo truth), agent-specific root files (`CLAUDE.md`, `AGENTS.md`) with a shared protocol block, and a small handoff toolkit (`CURRENT_TASK.md`, `HANDOFF.md`, `TODO.md`, `DECISIONS.md`, `SESSION_LOG.md`, `README.md`). 
Previous CLAUDE.md snapshotted in commit `e110fed` before the migration. + +**Rejected:** +- Single symlinked CLAUDE.md/AGENTS.md — diverges silently, hides agent-specific tooling differences. +- Putting GitNexus/gstack content in AGENTS.md — Codex doesn't have those tools; would mislead the resume agent. +- Keeping the old CLAUDE.md as-is and adding AGENTS.md alongside it — duplicated truth, drift guaranteed. + +**Consequences:** +- First read for either agent: `.ai/PROJECT_CONTEXT.md` + `.ai/CURRENT_TASK.md` + `.ai/HANDOFF.md`. +- Architectural changes in the repo require updating PROJECT_CONTEXT.md, not the root agent files. +- Git trailers differ per agent (`Claude Opus 4.7` vs `Codex`) — preserved in each root file. +- Legacy `SESSION-HANDOFF.md` deleted in the same commit; superseded by `.ai/HANDOFF.md`. diff --git a/.ai/HANDOFF.md b/.ai/HANDOFF.md new file mode 100644 index 00000000..0a900aa0 --- /dev/null +++ b/.ai/HANDOFF.md @@ -0,0 +1,24 @@ + + +# HANDOFF.md + +**Last updated:** 2026-04-24 (America/New_York) — initial setup + +**Active task:** See [CURRENT_TASK.md](CURRENT_TASK.md). (No real task yet — handoff system just initialized.) + +**Branch:** `feat/flowpilot-migration` + +**Where I left off:** +- File: n/a +- Next intended action: first agent to pick up real work should replace `CURRENT_TASK.md` example and start fresh here. + +**Uncommitted state:** +- `backend/pytest.ini`, `backend/requirements-dev.txt`, `backend/tests/conftest.py`, `backend/tests/test_rls_isolation.py` modified (pre-existing, unrelated to handoff-system migration). +- Safe to WIP-commit? Ask Michael — these are mid-flight test isolation fixes from `dab740d`. + +**Immediate next steps:** +1. When picking up real work, replace this file's body with the actual resume point. +2. Keep this file under ~2K tokens. If it grows, move older context into `SESSION_LOG.md`. + +**Open questions / blockers:** +- None. 
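The ~2K-token cap on `HANDOFF.md` can be checked roughly before committing. A minimal sketch, assuming about 4 characters per token (a rule of thumb, not the output of any specific tokenizer); `approx_tokens` and `over_budget` are hypothetical helpers, not part of the repo.

```python
def approx_tokens(text: str) -> int:
    """Estimate token count for a block of text.

    Assumption: ~4 characters per token. Real tokenizers will differ,
    so treat the result as a rough guide only.
    """
    return -(-len(text) // 4)  # ceiling division


def over_budget(text: str, budget: int = 2000) -> bool:
    """True when the estimate exceeds the budget (default ~2K tokens)."""
    return approx_tokens(text) > budget
```

If the check trips, the template's own advice applies: move older context into `SESSION_LOG.md`.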
diff --git a/.ai/PROJECT_CONTEXT.md b/.ai/PROJECT_CONTEXT.md new file mode 100644 index 00000000..578cb065 --- /dev/null +++ b/.ai/PROJECT_CONTEXT.md @@ -0,0 +1,254 @@ +# PROJECT_CONTEXT.md — ResolutionFlow + +> SaaS troubleshooting platform for MSPs. Stable architectural truth. Updated only when the repo's shape changes. + +--- + +## Product & naming + +Canonical product name is **ResolutionFlow**. `patherly` is the legacy internal name — still present in DB name (`patherly` on Railway, `resolutionflow` locally), some Railway service names, and historical paths. Treat as aliases, not canonical. Docker containers are `resolutionflow_*`. + +**User terminology:** "Flows" (not Trees), "Projects" (not Procedures), "Solutions Library" (not Step Library). Maintenance flows hidden from pilot UI (backend retains them). DB column `tree_type` values unchanged. + +--- + +## SaaS shape + +Multi-tenant by account. Roles: `super_admin` > `team_admin` > `engineer` > `viewer`. Team admin = `role='engineer'` + `is_team_admin=True` + valid `team_id`. Never `role=='admin'` — use `is_super_admin`. Backend deps in `app/api/deps.py`: `get_current_active_user`, `require_engineer_or_admin`, `require_admin`. Frontend: `usePermissions()` hook. Central logic in `backend/app/core/permissions.py` + `frontend/src/hooks/usePermissions.ts`. + +--- + +## Status + +Go-to-Market Validation (pre-PMF). Backend feature-complete (55+ endpoints, 100+ tests). Phase 0.5 FlowPilot telemetry baseline accruing. See [CURRENT-STATE.md](../CURRENT-STATE.md) for live status, [03-DEVELOPMENT-ROADMAP.md](../03-DEVELOPMENT-ROADMAP.md) for phases. + +--- + +## Tech stack + +- **Backend:** Python 3.11 + FastAPI, SQLAlchemy 2.0 async (asyncpg), Alembic, Pydantic v2, JWT (python-jose + bcrypt, JTI refresh rotation), APScheduler (in-process with FastAPI lifespan). 
+- **Frontend:** React 19 + Vite + TypeScript, Tailwind v4 (CSS-only config in `index.css`), Zustand (immer + zundo), React Router v7, Axios (token-refresh interceptor), Lucide. +- **DB:** PostgreSQL 16 (RLS enabled Phase 4, pgvector). + +--- + +## Project structure + +``` +resolutionflow/ +├── backend/ +│ ├── app/ +│ │ ├── main.py # FastAPI entry +│ │ ├── api/endpoints/ # auth, trees, sessions, admin, steps, survey, copilot, assistant_chat, integrations, flow_proposals, flowpilot_analytics +│ │ ├── api/deps.py # auth deps (incl. require_team_admin) +│ │ ├── api/router.py # registration +│ │ ├── core/ # config, database, permissions, security, audit, rate_limit +│ │ ├── models/ # SQLAlchemy (incl. FlowProposal) +│ │ ├── schemas/ # Pydantic +│ │ ├── services/psa/ # PSA provider pattern (base, connectwise/, autotask/, halopsa/, cache, encryption, registry, types) +│ │ ├── services/knowledge_flywheel.py + _scheduler.py +│ │ └── services/knowledge_gap_service.py +│ ├── alembic/versions/ # 001-070 sequential, then hex hash +│ ├── scripts/ # seed_data, seed_trees, seed_test_users +│ └── tests/ # pytest integration +├── frontend/ +│ ├── src/ +│ │ ├── api/ # Axios client + endpoint modules +│ │ ├── components/ # common, layout, dashboard, tree-editor, session, procedural, procedural-editor, library, step-library, ui, flowpilot +│ │ ├── hooks/ # usePermissions, useSessionTimer, useKeyboardShortcuts +│ │ ├── pages/ +│ │ ├── store/ # Zustand (auth, treeEditor, proceduralEditor, userPreferences, scriptGeneratorStore) +│ │ └── types/ +│ └── (Tailwind v4 CSS-only config in src/index.css) +├── docs/plans/archive/ # pre-March 2026 plans +├── docs/connectwise/ # CW API reference + best-practices guides +├── docs/LESSONS-ARCHIVE.md # archived lessons (fixes in code) +├── .ai/ # dual-agent handoff system (see .ai/README.md) +├── CLAUDE.md · AGENTS.md · CURRENT-STATE.md · DESIGN-SYSTEM.md · DEV-ENV.md +``` + +--- + +## Dev commands + +Full setup in [DEV-ENV.md](../DEV-ENV.md) 
(host-agnostic, with homelab Proxmox reference topology). Day-to-day: + +```bash +docker compose -f docker-compose.dev.yml up -d # start stack +cd backend && source venv/bin/activate && uvicorn app.main:app --reload +cd frontend && npm run dev +pytest --override-ini="addopts=" # tests (first time: CREATE DATABASE resolutionflow_test) +cd backend && alembic upgrade head # migrate +cd backend && alembic revision -m "desc" # manual migration (preferred per Lesson 77) +cd backend && alembic revision --autogenerate -m "desc" # picks up drift; review carefully +cd frontend && npm run build # stricter than tsc --noEmit — final check +cd frontend && npx tsc -b # TS-only check when dist/ has EACCES +docker exec -it resolutionflow_postgres psql -U postgres -d resolutionflow +python -m scripts.seed_trees # seed (from backend/) +``` + +**Never pass `--rev-id`** to alembic — let it generate the hex hash. + +--- + +## URLs & test users + +**URLs:** Frontend , backend , API docs . + +**Test users** (all password `TestPass123!`): `admin@resolutionflow.example.com` (super_admin), `teamadmin@resolutionflow.example.com`, `engineer@resolutionflow.example.com`, `pro@resolutionflow.example.com`. + +--- + +## CI + +Gitea (`gitea.resolutionflow.com/chihlasm/resolutionflow/actions`). `gh` CLI works for issues/PRs on the GitHub mirror, but not CI runs. + +--- + +## Deployment (Railway) + +- **Prod:** `resolutionflow.com` (frontend), `api.resolutionflow.com` (backend). +- Auto-deploy: Gitea push → GitHub mirror → Railway follows GitHub `main`. +- PR environments auto-created; need manual domain generation + `VITE_API_URL` with `https://` prefix. +- `ALLOW_RAILWAY_ORIGINS=true` for `*.up.railway.app` CORS. +- Shared Variables (Railway project-level) auto-propagate to PR envs — use for secrets like `ANTHROPIC_API_KEY`. +- Super admin utility: `backend/make_superadmin_simple.py list|`. 
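A minimal sketch of how `ALLOW_RAILWAY_ORIGINS=true` could translate into an origin check for PR environments. The helper name, the regex, and the single-label subdomain shape are assumptions for illustration; the repo's actual CORS wiring is not shown in this file. FastAPI's `CORSMiddleware` also accepts an `allow_origin_regex` argument, which may be the simpler route.

```python
from __future__ import annotations

import re

# Assumption: PR environments are served over HTTPS from a single-label
# subdomain of up.railway.app (hypothetical example: rf-pr-12.up.railway.app).
RAILWAY_ORIGIN_RE = re.compile(r"https://[a-z0-9-]+\.up\.railway\.app")


def is_allowed_origin(origin: str, explicit: set[str], allow_railway: bool) -> bool:
    """Allow explicitly listed origins, plus Railway PR origins when enabled."""
    if origin in explicit:
        return True
    # fullmatch keeps look-alike hosts out, e.g. "...up.railway.app.evil.com".
    return allow_railway and RAILWAY_ORIGIN_RE.fullmatch(origin) is not None
```

The `https://` in the pattern mirrors the note above that `VITE_API_URL` needs the `https://` prefix: a plain `http://` origin would not match.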
+ +--- + +## ConnectWise PSA + +Reference: `docs/connectwise/` — start with `CONNECTWISE-API-REFERENCE.md`, then the `best-practices/` guides. Extracted OpenAPI spec in `connectwise-psa-resolutionflow-reference.json` (670 endpoints, v2025.16); full spec in `connectwise-psa-openapi-full.json`. + +- **Auth:** API Key (Base64 `companyId+publicKey:privateKey`) + `clientId` header every request. `clientId` is server-side (`CW_CLIENT_ID` in `config.py`) — identifies ResolutionFlow, not per-tenant. Per-connection: `company_id`, `public_key`, `private_key`, `server_url`. +- **Architecture:** `services/psa/` provider pattern — `PSAProvider` base, `ConnectWiseProvider` impl, `PsaProviderRegistry` for multi-PSA dispatch. Credentials encrypted at rest via `services/psa/encryption.py` (Fernet). Per-team credentials, never per-user. Endpoints in `api/endpoints/integrations.py`. In-memory TTL cache in `services/psa/cache.py`. +- **Integration flows:** session docs → ticket notes (`POST /service/tickets/{id}/notes`, markdown supported); ticket context → FlowPilot; callbacks via `/system/callbacks` with HMAC verification. +- **API rules:** pin version via Accept header `application/vnd.connectwise.com+json; version=2025.16`. Paginate ≤1000/page. Dynamic base URL via `/login/companyinfo/{companyId}`. Request minimal permissions (MY, not ALL). + +--- + +## Coding standards + +- **Python:** type hints everywhere, async/await for DB, Pydantic v2, `DateTime(timezone=True)` always. +- **TypeScript:** interfaces for all data, `const` over `let`, functional components + hooks, shared logic in custom hooks. +- **Git:** feature branch before committing (`git checkout -b feat/feature-name`). Commit format: `type: description` (feat/fix/refactor/docs/test/chore). Large features: commit per phase with `npm run build` validation. Push to Gitea — auto-mirrors to GitHub (`.gitea/workflows/mirror-to-github.yml`); never push GitHub directly. 
(Agent-specific `Co-Authored-By` trailers live in CLAUDE.md / AGENTS.md.) + +**After shipping:** update [CURRENT-STATE.md](../CURRENT-STATE.md) + [03-DEVELOPMENT-ROADMAP.md](../03-DEVELOPMENT-ROADMAP.md), `gh issue close #N` for resolved issues, add lessons only for non-obvious traps (otherwise let the code speak). + +--- + +## Common tasks + +- **New endpoint:** `endpoints/` → `router.py` → `schemas/` → tests → frontend API client. +- **New page:** `pages/` → route in `router.tsx` → nav in `AppLayout.tsx`. +- **New public route:** top-level in `router.tsx` alongside `/login`, not inside `ProtectedRoute`. +- **New frontend API module:** types in `types/` → export from `types/index.ts` → client in `api/` → export from `api/index.ts`. +- **Schema change:** update model → `alembic revision -m "desc"` → review → `alembic upgrade head`. +- **New `VITE_*` env var:** add as `ARG` + `ENV` in `frontend/Dockerfile` for Railway builds (Lesson 60 — Railway env vars are runtime-only, Vite bakes at build time). +- **Account sub-page:** add route in `router.tsx` under `account` children + add link card in `AccountSettingsPage.tsx` — `AccountLayout` has NO sidebar nav. + +--- + +## Design system + +**Source of truth: [DESIGN-SYSTEM.md](../DESIGN-SYSTEM.md).** Read before any visual change. + +- Flat high-contrast dark theme, Sentry/PostHog-inspired. **No** glass, backdrop blur, ambient orbs, gradient surfaces. +- Accent **electric blue** (#60a5fa dark / #2563eb light) — ≤5% of UI, interactive elements only. Warning amber (#fbbf24), info cyan (#67e8f9), success green (#34d399), danger red (#f87171). Each with `-dim` at 10% opacity. +- Backgrounds: `bg-sidebar` (#0e1016) → `bg-page` (#16181f) → `bg-card` (#1e2028) → `bg-elevated` (#2a2d38). Borders `border-default` / `border-hover`. +- Text: `text-heading` → `text-primary` → `text-muted-foreground` → `text-muted`. +- Fonts: IBM Plex Sans (body), Bricolage Grotesque (heading, 700 weight for logo), JetBrains Mono (code). 
+- Logo: 30px gradient square (ember orange) + "ResolutionFlow" in Bricolage Grotesque. Assets in `brand-assets/`, `frontend/src/assets/brand/`, `frontend/public/icons/`. +- Mockups: `docs/mockups/` (HTML). +- **Deprecated — do not use:** glass-card, glass-stat, `bg-gradient-brand`, `backdrop-filter: blur()`, ambient orbs, purple gradients, ember orange as accent, cyan as accent (cyan is info only). + +--- + +## Frontend patterns + +- **Component basics:** `cn()` from `@/lib/utils`, Lucide icons, `Modal.tsx` for modals (mobile-responsive `items-end sm:items-center` + `max-w-full sm:max-w-lg`). +- **Types:** Create in `types/`, export from `types/index.ts`, `import type { T } from '@/types'`. +- **Routing:** `getTreeNavigatePath()` / `getTreeEditorPath()` from `@/lib/routing`. Tree editor is `/trees/new`. All dashboard session clicks → `/pilot/:id` regardless of `session_type`. +- **Lazy routes:** `lazyWithRetry` from `@/lib/lazyWithRetry.ts`, not `React.lazy` (auto-reload on stale chunks). +- **Public pages:** raw `fetch()` with full URL, NOT `apiClient` (which requires auth tokens). +- **Toast:** `toast.warning()` not `toast.warn()`. Import from `@/lib/toast` — methods: `success`, `error`, `warning`, `info`. +- **Assistant chat:** uses local React `useState`, not Zustand. All three send paths (`handleSend`, `sendPrefill`, `handleResumeNew`) must call `setShowTaskLane(true)` when response has actions/questions. +- **Chat backend wiring:** `aiSessionsApi.sendChatMessage` → `/ai-sessions/{id}/chat` → `unified_chat_service.py`. NOT `assistant_chat_service.py` (removed except retention settings). +- **FlowPilot:** Actions live in page header (Resolve/Escalate/Share Update + overflow). `useBlocker` for active-session nav guard. "Pause & Leave" auto-pauses. +- **AI markers:** `[QUESTIONS]`, `[ACTIONS]`, `[FORK]`, `[DELTA]...[/DELTA]` (editor), `[TREE_UPDATE]` (troubleshooting builder), `[STEPS_UPDATE]` (procedural builder), `[METADATA]`. 
Parsed in `unified_chat_service.py`; conversation history stores stripped `display_content`. If markers disappear: check system-prompt final reminder + per-user-message `[SYSTEM: ...]` injection in `_call_anthropic_cached()`. +- **Image uploads:** paste/attach → Railway S3 via `uploadsApi.upload()` → resized by `storage_service.resize_image_for_vision()` (Pillow, 1568px max, PNG→JPEG) → base64 → Claude multimodal blocks. Max 3/msg. Images NOT stored in history. +- **Async select-load-apply:** guard with a ref (pattern in `AssistantChatPage` `currentChatRef`). Update synchronously on every selection change; after every `await`, bail out if `ref.current !== thisId`. +- **Editor-Embedded Flow Assist:** `EditorAIPanel` (320px side panel) + `useEditorAI`. Ghost nodes via `_suggestion: true`. Route actions via `settings.get_model_for_action()`. +- **Script Builder:** `/script-builder`, chat-style. Backend `ScriptBuilderSession`, `script_builder_service.py`, endpoints `/scripts/builder/`. FlowPilot handoff via `action_type: "open_script_builder"` + `sessionStorage`. +- **Intake form field schema:** `variable_name` + `field_type` (NOT `name` / `type`). +- **Node field priority** (copilot, summaries): `title` → `question` → `description` → `content` → `label`. +- **Procedural sessions auto-start** on page load (no intake/Start screen). Troubleshooting flows DO have a start screen. + +--- + +## Critical lessons + +> Lessons 1-40 archived to [docs/LESSONS-ARCHIVE.md](../docs/LESSONS-ARCHIVE.md) — fixes baked into the codebase. **Grep the archive when an error message or symptom is unfamiliar, or after two failed attempts at resolving an issue.** Don't pre-load for routine work. + +### Backend / data + +- **APScheduler interval jobs always `max_instances=1`** — without it, overlapping runs reprocess records (TOCTOU). 
+- **`get_db` rolls back on exception** — never remove the `await session.rollback()`, or one failed request poisons the connection with `InFailedSQLTransaction` cascading. +- **Startup routines on tenant-isolated tables must use `_admin_session_factory()`, not `get_db()`.** Phase 4 RLS has no `app.current_account_id` set at startup. `get_service_account_id` is safe (reads cached `app.state`). +- **Backfill migrations adding `account_id`:** grep ALL `ModelClass(` sites in service code to verify `account_id=` is passed. SQLAlchemy accepts `None` silently — Phase 4 RLS WITH CHECK surfaces the problem at runtime as `InsufficientPrivilegeError: new row violates row-level security policy`. +- **`tree_shares.account_id = tree.account_id`**, never `current_user.account_id`. A super_admin sharing another tenant's tree must produce the share in the tree owner's tenant, or it becomes invisible post-RLS. +- **Global tables (no `account_id`, never in RLS migrations):** `script_categories`, `platform_steps`, `template_trees`, `plan_feature_defaults`, `accounts`. Scan at class level — one `.py` file can hold multiple classes with different columns (e.g. `ScriptCategory` vs `ScriptTemplate`). +- **`ai_sessions.status` is VARCHAR(30)** — fits `requesting_escalation` (21 chars). Migration `f0aad74ea51b` widened from 20. +- **PostgreSQL `func.sum(case(...))` returns `Decimal` via asyncpg** — cast to `int()` before Pydantic `dict[str, Any]`. +- **Enhancement / branch_addition proposals need `modified_flow_data` via "Edit & Publish"** — backend 400 on direct approve. Only `new_flow` supports direct approve. +- **Adding email types:** static async method on `EmailService` in `core/email.py`. Fire-and-forget from endpoints (log errors, don't fail the request). + +### AI / FlowPilot + +- **Anthropic SDK `max_retries=1`** — the SDK default of 2 retries allows 3 attempts, so a hung call can take 3× the timeout. +- **Model tier routing:** `settings.get_model_for_action(action_type)`. Always alias form (`claude-sonnet-4-6`). 
+- **FlowPilot must ask GUI-vs-script before suggesting either** when both are viable — see `FLOWPILOT_SYSTEM_PROMPT` in `flowpilot_engine.py`. +- **Telemetry events to grep:** `anthropic.cache` (prompt-cache hit/create), `mcp.turn` (per-turn MCP availability), `mcp.fallback` (MCP silent-retry fired). +- **Don't put literal payloads in system prompts.** Bit us twice in one day: a worked `[QUESTIONS]` example with literal "Outlook + jsmith" content, and a full DNS troubleshooting tree, both caused Claude to recite that content on unrelated tickets — the symptom looked like task-lane state leaking across chats. The fix is structural: every output example in a system prompt uses `` syntax (`{"text": ""}`), never literal field values. Real-looking format examples live in few-shot messages (separate file, separate code path), not system prompts. Guardrail: `tests/test_prompt_anti_parrot.py` scans every `*_PROMPT`/`*_SCHEMA`/`*_PROTOCOL`/`*_FORMAT` constant in `app/services/` and `app/core/`; CI fails when a marker block contains a literal JSON value or when a known leaked token (jsmith, DC01, ADSync, Dnscache, etc.) appears anywhere in a prompt. + +### Frontend / UI + +- **Flex height chain:** every ancestor from `app-shell` grid to React Flow canvas needs `flex` + `flex-1` + `min-h-0` or `h-full`. Missing `flex` collapses to 0. Same rule for FlowPilot action bar and any tall scroller. +- **React Flow CSS in Tailwind v4:** import in `index.css`, not component JS. Override dark theme via `--xy-*` CSS vars. +- **`text-secondary` renders invisible on dark** — Tailwind v4 maps it to `--color-secondary` (a surface color). Use `text-muted-foreground` for readable secondary text. Avoid `text-muted` for body — labels only. +- **`bg-accent` is electric blue — never for code/kbd.** Use `bg-white/[0.12] border border-white/[0.06]` for inline code, `bg-white/[0.08]` for kbd. Accent reserved for interactive elements. 
+- **`landing.css` uses self-contained `--lp-*` vars** — never `var(--color-*)` theme tokens (they resolve incorrectly outside the app shell). +- **Never `transition: all`** — list properties explicitly, or layout props animate and jank. +- **Date range filter end dates:** `setHours(23, 59, 59, 999)` before sending, or the day's items are excluded. For string-based date inputs, append `T23:59:59.999Z`. +- **TopBar search:** full bar `hidden sm:block`, icon button `sm:hidden` — both open CommandPalette. +- **Hover pop-out cards:** scrim `pointer-events-none`, expanded card has its own click handler at `z-50`, dismiss via `onMouseLeave` on wrapper. Never put handlers on the scrim. +- **`tsc -b` in Dockerfile is stricter than `tsc --noEmit`** — enforces `noUnusedLocals` / `noUnusedParameters` as hard errors. Check IDE yellow squiggles before pushing. +- **Dashboard prefill auto-submits** via `useEffect` + `prefillHandledRef` guard — no double-enter. +- **Global Axios 5xx interceptor fires before component `.catch()`** — fix optional-data endpoints at the source (return `[]` / `{}` on provider failure), not in the component. +- **Playwright strict mode:** scope selectors to avoid sidebar/main ambiguity. Use `getByRole('heading', { name })` or `.animate-scale-in` locators, not bare `getByText()`. + +### Env / infra + +- **Node 20.19+ required** (Vite 7). `nvm use 20` or `PATH="$HOME/.nvm/versions/node/v20.19.0/bin:$PATH"`. +- **Railway backend service is `patherly`, DB name `railway`.** Public Postgres proxy: `interchange.proxy.rlwy.net:45797`. +- **Railway Object Storage bucket `resolutionflow-uploads`.** Env vars `STORAGE_*`. boto3 in `storage_service.py`. Dockerfile needs Pillow + `libjpeg-dev` / `zlib1g-dev`. +- **PostHog:** `PostHogProvider` + `posthog.init()` in `main.tsx`. Helpers in `lib/analytics.ts`. Env: `VITE_PUBLIC_POSTHOG_KEY`, `VITE_PUBLIC_POSTHOG_HOST`. `identifyUser()` in `authStore.fetchUser()`, `resetAnalytics()` on logout. 
+- **bun PATH on devserver01:** `BUN_INSTALL="$HOME/.bun"`, `PATH="$BUN_INSTALL/bin:$PATH"`. Playwright Chromium needs `libatk1.0-0 libatk-bridge2.0-0 libcups2 libxkbcommon0 libatspi2.0-0 libxcomposite1 libxdamage1 libxfixes3 libxrandr2 libgbm1 libasound2`. +- **Full-stack change:** trace schema → endpoint → API client → hook → store → UI. Don't assume one end proves the other. +- **Dev env** — see [DEV-ENV.md](../DEV-ENV.md) for current topology, `REPO_ROOT` requirement when compose runs inside a container, Vite `allowedHosts`, linuxserver.io `group_add` + custom-cont-init.d workaround, `docker compose up` no-op-on-unchanged-hash gotcha. + +--- + +## Quick reference + +| What | Where | +|---|---| +| Detailed status | [CURRENT-STATE.md](../CURRENT-STATE.md) | +| Roadmap | [03-DEVELOPMENT-ROADMAP.md](../03-DEVELOPMENT-ROADMAP.md) | +| Design system | [DESIGN-SYSTEM.md](../DESIGN-SYSTEM.md) | +| Dev env | [DEV-ENV.md](../DEV-ENV.md) | +| Archived lessons | [docs/LESSONS-ARCHIVE.md](../docs/LESSONS-ARCHIVE.md) | +| ConnectWise API | `docs/connectwise/` | +| GitHub issues | `gh issue list --state open` | +| Local API docs | | +| Handoff system | [.ai/README.md](README.md) | diff --git a/.ai/README.md b/.ai/README.md new file mode 100644 index 00000000..49a8b1d7 --- /dev/null +++ b/.ai/README.md @@ -0,0 +1,42 @@ +# .ai/ — dual-agent handoff system + +ResolutionFlow uses two coding agents: **Claude Code** (primary) and **OpenAI Codex** (resume when Claude hits session or weekly limits). This directory holds the shared state that lets either agent start a session with full context. 
+ +## Files + +| File | Holds | Written when | Read when | +|---|---|---|---| +| [PROJECT_CONTEXT.md](PROJECT_CONTEXT.md) | Stable repo truth: stack, structure, SaaS shape, ConnectWise, coding standards, frontend patterns, critical lessons | Only when the repo's shape changes | Every session start | +| [CURRENT_TASK.md](CURRENT_TASK.md) | The single active task: goal, DoD, assumptions, out-of-scope | On task start; status updates during work | Every session start | +| [HANDOFF.md](HANDOFF.md) | Exact resume point: branch, where you left off, next steps, blockers | On session end / context-window limit | Every session start (most important) | +| [TODO.md](TODO.md) | Backlog of work NOT currently active | When deferring or queueing work | Only when `CURRENT_TASK.md` is `complete` | +| [DECISIONS.md](DECISIONS.md) | Append-only architectural decision log | When an architectural choice is made | Skim top entries each session | +| [SESSION_LOG.md](SESSION_LOG.md) | Append-only chronological history | On session end | Only when broader context is needed | + +Agent-specific tooling lives at the repo root: +- [../CLAUDE.md](../CLAUDE.md) — Claude Code's tooling (GitNexus, gstack slash commands, Claude trailer) +- [../AGENTS.md](../AGENTS.md) — OpenAI Codex's tooling (grep/rg fallbacks, Codex trailer) + +Both root files contain an **identical shared-protocol block**. If you edit one, edit the other. + +## The handoff ritual + +At session end (limit hit, task complete, or user stop): update `HANDOFF.md` to reflect the new resume point, update `CURRENT_TASK.md` status if it changed, append to `DECISIONS.md` if you made an architectural call, append a session entry to `SESSION_LOG.md`, and WIP-commit any dirty working tree with `wip(handoff): ` unless told otherwise. Don't push. + +## How to invoke a resume + +Tell the agent: + +> Read CLAUDE.md (or AGENTS.md) and follow its instructions. 
+ +The agent will read its root file, which directs it to `.ai/PROJECT_CONTEXT.md`, `.ai/CURRENT_TASK.md`, and `.ai/HANDOFF.md` before doing anything else. + +## Recovery + +The previous monolithic CLAUDE.md is recoverable via: + +```bash +git show pre-ai-handoff:CLAUDE.md +``` + +(Tag `pre-ai-handoff` on commit `e110fed` — the snapshot taken before this migration.) diff --git a/.ai/SESSION_LOG.md b/.ai/SESSION_LOG.md new file mode 100644 index 00000000..c8c4bcd2 --- /dev/null +++ b/.ai/SESSION_LOG.md @@ -0,0 +1,21 @@ +# SESSION_LOG.md + +> Append-only chronological record. Newest entries at the top. Skim when broader context is needed. +> Entry format: +> +> ``` +> ## YYYY-MM-DD HH:MM +> - What was accomplished +> - What was left for next session +> - Files touched +> ``` + +--- + +## 2026-04-24 — Claude Code — Migrate to dual-agent handoff system + +- Split CLAUDE.md into `.ai/PROJECT_CONTEXT.md` + shared-protocol root files (`CLAUDE.md`, `AGENTS.md`). +- Seeded `CURRENT_TASK.md`, `HANDOFF.md`, `TODO.md`, `DECISIONS.md`, `SESSION_LOG.md`, `README.md`. +- Deleted legacy `SESSION-HANDOFF.md` (superseded). +- Left for next session: first real feature task should replace the seed `CURRENT_TASK.md` and update `HANDOFF.md` with real resume state. +- Files touched: `.ai/*.md` (created), `CLAUDE.md` (rewritten), `AGENTS.md` (created), `SESSION-HANDOFF.md` (deleted). diff --git a/.ai/TODO.md b/.ai/TODO.md new file mode 100644 index 00000000..44656980 --- /dev/null +++ b/.ai/TODO.md @@ -0,0 +1,12 @@ +# TODO.md + +> Backlog of work NOT currently active. Read only when `CURRENT_TASK.md` status is `complete`. 
+> Format: `- [ ] short description — optional link to issue/PR` + +## Up next + +- [ ] (seed entry — replace with real next-up items) + +## Backlog + +- [ ] (seed entry — replace with real backlog items) diff --git a/AGENTS.md b/AGENTS.md new file mode 100644 index 00000000..29db82d4 --- /dev/null +++ b/AGENTS.md @@ -0,0 +1,61 @@ +# AGENTS.md — ResolutionFlow + +You are OpenAI Codex, the resume agent for ResolutionFlow. Claude Code is the primary coding agent; you step in when Claude hits session or weekly limits. + +The first thing to do every session: read [`.ai/PROJECT_CONTEXT.md`](.ai/PROJECT_CONTEXT.md), [`.ai/CURRENT_TASK.md`](.ai/CURRENT_TASK.md), and [`.ai/HANDOFF.md`](.ai/HANDOFF.md). The ritual is spelled out below. + +> The protocol section below is byte-identical to the shared block in CLAUDE.md. If you edit one, edit the other. + +## Shared protocol + +### Startup ritual (every session) + +1. Read `.ai/PROJECT_CONTEXT.md` — architectural truth for this repo. +2. Read `.ai/CURRENT_TASK.md` — what we're actively working on. +3. Read `.ai/HANDOFF.md` — exact resume point. +4. Skim `.ai/DECISIONS.md` for recent entries relevant to the current task. +5. Run `git log --oneline -15` and `git status`. +6. Before taking action, state back in two sentences: the current goal and your proposed next action. + +### Handoff ritual (session end — limit hit, task complete, or user stop) + +1. Update `.ai/HANDOFF.md` to reflect new state. Keep it under ~2K tokens. +2. If `CURRENT_TASK.md` status changed, update it. +3. If you made an architectural decision, append to `.ai/DECISIONS.md`. +4. Append a session entry to `.ai/SESSION_LOG.md`. +5. If working tree is dirty, commit WIP with `wip(handoff): `. Do not push unless explicitly asked. + +### Writing rules for .ai/ files + +- Use model-neutral voice in `HANDOFF.md`, `SESSION_LOG.md`, `DECISIONS.md` ("previous session did X", NOT "Claude did X" or "Codex did X"). 
Exception: `SESSION_LOG.md` entries include an `` field in the header. +- Do not duplicate content between files. `CURRENT_TASK.md` holds the goal, `HANDOFF.md` holds the resume point, `TODO.md` holds the backlog. If unsure where something goes, check `.ai/README.md`. +- Don't invent facts about the repo. If you're uncertain, write `TODO: confirm` and flag it. + +### Project principle + +Prefer correct architecture over minimal diff. Flag "simpler approach" tradeoffs for review before taking them. + +## Codex-specific notes + +### Tooling you do NOT have + +- **No GitNexus tools.** Use `grep -r`, `rg`, `git grep`, or `find` for code search. For blast-radius reasoning, grep call sites manually and read the files. +- **No gstack slash commands** (`/review`, `/ship`, `/qa`, `/browse`, `/investigate`, `/design-review`, `/plan-*`). Run the equivalent work directly: `pytest` for tests, `npm run build` for frontend validation, manual PR description for review flow. +- **No `/codex` second-opinion command.** You are Codex. + +### Git trailer + +Every commit: `Co-Authored-By: Codex ` + +### Model selection + +Handled on OpenAI's side. Do not attempt to set Anthropic model aliases for your own runtime. (The repo's application code still uses Anthropic aliases like `claude-sonnet-4-6` via `settings.get_model_for_action()` — that's runtime config for the product, not your agent.) + +### Reviewing Claude's work + +When you resume from a Claude session, assume some decisions may have been informed by GitNexus queries or gstack commands whose output isn't in the handoff. If a decision looks unverified from the `.ai/` files alone, either: + +- re-verify with `grep`/`rg`/file reads, or +- flag it in `HANDOFF.md` under "Open questions" so Michael or Claude can confirm on the next handoff. + +Do not assume tooling output that isn't written down. 
diff --git a/CLAUDE.md b/CLAUDE.md index 27c8b955..857d80ef 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -1,215 +1,43 @@ # CLAUDE.md — ResolutionFlow -> SaaS troubleshooting platform for MSPs. Last reviewed 2026-04-19. +You are Claude Code, the primary coding agent for ResolutionFlow. OpenAI Codex is the resume agent when you hit session or weekly limits. -**Naming:** Canonical product name is **ResolutionFlow**. `patherly` is the legacy internal name — still present in DB name (`patherly` on Railway, `resolutionflow` locally), some Railway service names, and historical paths. Treat as aliases, not canonical. Docker containers are `resolutionflow_*`. +The first thing to do every session: read [`.ai/PROJECT_CONTEXT.md`](.ai/PROJECT_CONTEXT.md), [`.ai/CURRENT_TASK.md`](.ai/CURRENT_TASK.md), and [`.ai/HANDOFF.md`](.ai/HANDOFF.md). The ritual is spelled out below. -**User terminology:** "Flows" (not Trees), "Projects" (not Procedures), "Solutions Library" (not Step Library). Maintenance flows hidden from pilot UI (backend retains them). DB column `tree_type` values unchanged. +> The protocol section below is byte-identical to the shared block in AGENTS.md. If you edit one, edit the other. -**SaaS shape:** Multi-tenant by account. Roles: `super_admin` > `team_admin` > `engineer` > `viewer`. Team admin = `role='engineer'` + `is_team_admin=True` + valid `team_id`. Never `role=='admin'` — use `is_super_admin`. Backend deps in `app/api/deps.py`: `get_current_active_user`, `require_engineer_or_admin`, `require_admin`. Frontend: `usePermissions()` hook. Central logic in `backend/app/core/permissions.py` + `frontend/src/hooks/usePermissions.ts`. +## Shared protocol -**Status:** Go-to-Market Validation (pre-PMF). Backend feature-complete (55+ endpoints, 100+ tests). Phase 0.5 FlowPilot telemetry baseline accruing. See `CURRENT-STATE.md` for live status, `03-DEVELOPMENT-ROADMAP.md` for phases. 
+### Startup ritual (every session) -**Principle:** Prefer correct architecture over minimal diff. Flag "simpler approach" tradeoffs for review before taking them. +1. Read `.ai/PROJECT_CONTEXT.md` — architectural truth for this repo. +2. Read `.ai/CURRENT_TASK.md` — what we're actively working on. +3. Read `.ai/HANDOFF.md` — exact resume point. +4. Skim `.ai/DECISIONS.md` for recent entries relevant to the current task. +5. Run `git log --oneline -15` and `git status`. +6. Before taking action, state back in two sentences: the current goal and your proposed next action. ---- +### Handoff ritual (session end — limit hit, task complete, or user stop) -## Tech stack +1. Update `.ai/HANDOFF.md` to reflect new state. Keep it under ~2K tokens. +2. If `CURRENT_TASK.md` status changed, update it. +3. If you made an architectural decision, append to `.ai/DECISIONS.md`. +4. Append a session entry to `.ai/SESSION_LOG.md`. +5. If working tree is dirty, commit WIP with `wip(handoff): `. Do not push unless explicitly asked. -- **Backend:** Python 3.11 + FastAPI, SQLAlchemy 2.0 async (asyncpg), Alembic, Pydantic v2, JWT (python-jose + bcrypt, JTI refresh rotation), APScheduler (in-process with FastAPI lifespan). -- **Frontend:** React 19 + Vite + TypeScript, Tailwind v4 (CSS-only config in `index.css`), Zustand (immer + zundo), React Router v7, Axios (token-refresh interceptor), Lucide. -- **DB:** PostgreSQL 16 (RLS enabled Phase 4, pgvector). +### Writing rules for .ai/ files ---- +- Use model-neutral voice in `HANDOFF.md`, `SESSION_LOG.md`, `DECISIONS.md` ("previous session did X", NOT "Claude did X" or "Codex did X"). Exception: `SESSION_LOG.md` entries include an `` field in the header. +- Do not duplicate content between files. `CURRENT_TASK.md` holds the goal, `HANDOFF.md` holds the resume point, `TODO.md` holds the backlog. If unsure where something goes, check `.ai/README.md`. +- Don't invent facts about the repo. If you're uncertain, write `TODO: confirm` and flag it. 
-## Project structure +### Project principle -``` -resolutionflow/ -├── backend/ -│ ├── app/ -│ │ ├── main.py # FastAPI entry -│ │ ├── api/endpoints/ # auth, trees, sessions, admin, steps, survey, copilot, assistant_chat, integrations, flow_proposals, flowpilot_analytics -│ │ ├── api/deps.py # auth deps (incl. require_team_admin) -│ │ ├── api/router.py # registration -│ │ ├── core/ # config, database, permissions, security, audit, rate_limit -│ │ ├── models/ # SQLAlchemy (incl. FlowProposal) -│ │ ├── schemas/ # Pydantic -│ │ ├── services/psa/ # PSA provider pattern (base, connectwise/, autotask/, halopsa/, cache, encryption, registry, types) -│ │ ├── services/knowledge_flywheel.py + _scheduler.py -│ │ └── services/knowledge_gap_service.py -│ ├── alembic/versions/ # 001-070 sequential, then hex hash -│ ├── scripts/ # seed_data, seed_trees, seed_test_users -│ └── tests/ # pytest integration -├── frontend/ -│ ├── src/ -│ │ ├── api/ # Axios client + endpoint modules -│ │ ├── components/ # common, layout, dashboard, tree-editor, session, procedural, procedural-editor, library, step-library, ui, flowpilot -│ │ ├── hooks/ # usePermissions, useSessionTimer, useKeyboardShortcuts -│ │ ├── pages/ -│ │ ├── store/ # Zustand (auth, treeEditor, proceduralEditor, userPreferences, scriptGeneratorStore) -│ │ └── types/ -│ └── (Tailwind v4 CSS-only config in src/index.css) -├── docs/plans/archive/ # pre-March 2026 plans -├── docs/connectwise/ # CW API reference + best-practices guides -├── docs/LESSONS-ARCHIVE.md # archived lessons (fixes in code) -├── CLAUDE.md · CURRENT-STATE.md · DESIGN-SYSTEM.md · DEV-ENV.md -``` +Prefer correct architecture over minimal diff. Flag "simpler approach" tradeoffs for review before taking them. ---- +## Claude-specific tooling -## Design system - -**Source of truth: [DESIGN-SYSTEM.md](DESIGN-SYSTEM.md).** Read before any visual change. - -- Flat high-contrast dark theme, Sentry/PostHog-inspired. 
**No** glass, backdrop blur, ambient orbs, gradient surfaces. -- Accent **electric blue** (#60a5fa dark / #2563eb light) — ≤5% of UI, interactive elements only. Warning amber (#fbbf24), info cyan (#67e8f9), success green (#34d399), danger red (#f87171). Each with `-dim` at 10% opacity. -- Backgrounds: `bg-sidebar` (#0e1016) → `bg-page` (#16181f) → `bg-card` (#1e2028) → `bg-elevated` (#2a2d38). Borders `border-default` / `border-hover`. -- Text: `text-heading` → `text-primary` → `text-muted-foreground` → `text-muted`. -- Fonts: IBM Plex Sans (body), Bricolage Grotesque (heading, 700 weight for logo), JetBrains Mono (code). -- Logo: 30px gradient square (ember orange) + "ResolutionFlow" in Bricolage Grotesque. Assets in `brand-assets/`, `frontend/src/assets/brand/`, `frontend/public/icons/`. -- Mockups: `docs/mockups/` (HTML). -- **Deprecated — do not use:** glass-card, glass-stat, `bg-gradient-brand`, `backdrop-filter: blur()`, ambient orbs, purple gradients, ember orange as accent, cyan as accent (cyan is info only). - ---- - -## ConnectWise PSA - -Reference: `docs/connectwise/` — start with `CONNECTWISE-API-REFERENCE.md`, then the `best-practices/` guides. Extracted OpenAPI spec in `connectwise-psa-resolutionflow-reference.json` (670 endpoints, v2025.16); full spec in `connectwise-psa-openapi-full.json`. - -- **Auth:** API Key (Base64 `companyId+publicKey:privateKey`) + `clientId` header every request. `clientId` is server-side (`CW_CLIENT_ID` in `config.py`) — identifies ResolutionFlow, not per-tenant. Per-connection: `company_id`, `public_key`, `private_key`, `server_url`. -- **Architecture:** `services/psa/` provider pattern — `PSAProvider` base, `ConnectWiseProvider` impl, `PsaProviderRegistry` for multi-PSA dispatch. Credentials encrypted at rest via `services/psa/encryption.py` (Fernet). Per-team credentials, never per-user. Endpoints in `api/endpoints/integrations.py`. In-memory TTL cache in `services/psa/cache.py`. 
-- **Integration flows:** session docs → ticket notes (`POST /service/tickets/{id}/notes`, markdown supported); ticket context → FlowPilot; callbacks via `/system/callbacks` with HMAC verification. -- **API rules:** pin version via Accept header `application/vnd.connectwise.com+json; version=2025.16`. Paginate ≤1000/page. Dynamic base URL via `/login/companyinfo/{companyId}`. Request minimal permissions (MY, not ALL). - ---- - -## Dev commands - -Full setup in [DEV-ENV.md](DEV-ENV.md) (host-agnostic, with homelab Proxmox reference topology). Day-to-day: - -```bash -docker compose -f docker-compose.dev.yml up -d # start stack -cd backend && source venv/bin/activate && uvicorn app.main:app --reload -cd frontend && npm run dev -pytest --override-ini="addopts=" # tests (first time: CREATE DATABASE resolutionflow_test) -cd backend && alembic upgrade head # migrate -cd backend && alembic revision -m "desc" # manual migration (preferred per Lesson 77) -cd backend && alembic revision --autogenerate -m "desc" # picks up drift; review carefully -cd frontend && npm run build # stricter than tsc --noEmit — final check -cd frontend && npx tsc -b # TS-only check when dist/ has EACCES -docker exec -it resolutionflow_postgres psql -U postgres -d resolutionflow -python -m scripts.seed_trees # seed (from backend/) -``` - -**URLs:** Frontend , backend , API docs . - -**Test users** (all password `TestPass123!`): `admin@resolutionflow.example.com` (super_admin), `teamadmin@resolutionflow.example.com`, `engineer@resolutionflow.example.com`, `pro@resolutionflow.example.com`. - -**CI:** Gitea (`gitea.resolutionflow.com/chihlasm/resolutionflow/actions`). `gh` CLI works for issues/PRs on the GitHub mirror, but not CI runs. - -**Never pass `--rev-id`** to alembic — let it generate the hex hash. - ---- - -## Common tasks - -- **New endpoint:** `endpoints/` → `router.py` → `schemas/` → tests → frontend API client. -- **New page:** `pages/` → route in `router.tsx` → nav in `AppLayout.tsx`. 
-- **New public route:** top-level in `router.tsx` alongside `/login`, not inside `ProtectedRoute`. -- **New frontend API module:** types in `types/` → export from `types/index.ts` → client in `api/` → export from `api/index.ts`. -- **Schema change:** update model → `alembic revision -m "desc"` → review → `alembic upgrade head`. -- **New `VITE_*` env var:** add as `ARG` + `ENV` in `frontend/Dockerfile` for Railway builds (Lesson 60 — Railway env vars are runtime-only, Vite bakes at build time). -- **Account sub-page:** add route in `router.tsx` under `account` children + add link card in `AccountSettingsPage.tsx` — `AccountLayout` has NO sidebar nav. - ---- - -## Coding standards - -- **Python:** type hints everywhere, async/await for DB, Pydantic v2, `DateTime(timezone=True)` always. -- **TypeScript:** interfaces for all data, `const` over `let`, functional components + hooks, shared logic in custom hooks. -- **Git:** feature branch before committing (`git checkout -b feat/feature-name`). Format: `type: description` (feat/fix/refactor/docs/test/chore). Always `Co-Authored-By: Claude Opus 4.6 `. Large features: commit per phase with `npm run build` validation. Push to Gitea — auto-mirrors to GitHub (`.gitea/workflows/mirror-to-github.yml`); never push GitHub directly. - -**After shipping:** update `CURRENT-STATE.md` + `03-DEVELOPMENT-ROADMAP.md`, `gh issue close #N` for resolved issues, add lessons here only for non-obvious traps (otherwise let the code speak). - ---- - -## Frontend patterns - -- **Component basics:** `cn()` from `@/lib/utils`, Lucide icons, `Modal.tsx` for modals (mobile-responsive `items-end sm:items-center` + `max-w-full sm:max-w-lg`). -- **Types:** Create in `types/`, export from `types/index.ts`, `import type { T } from '@/types'`. -- **Routing:** `getTreeNavigatePath()` / `getTreeEditorPath()` from `@/lib/routing`. Tree editor is `/trees/new`. All dashboard session clicks → `/pilot/:id` regardless of `session_type`. 
-- **Lazy routes:** `lazyWithRetry` from `@/lib/lazyWithRetry.ts`, not `React.lazy` (auto-reload on stale chunks). -- **Public pages:** raw `fetch()` with full URL, NOT `apiClient` (which requires auth tokens). -- **Toast:** `toast.warning()` not `toast.warn()`. Import from `@/lib/toast` — methods: `success`, `error`, `warning`, `info`. -- **Assistant chat:** uses local React `useState`, not Zustand. All three send paths (`handleSend`, `sendPrefill`, `handleResumeNew`) must call `setShowTaskLane(true)` when response has actions/questions. -- **Chat backend wiring:** `aiSessionsApi.sendChatMessage` → `/ai-sessions/{id}/chat` → `unified_chat_service.py`. NOT `assistant_chat_service.py` (removed except retention settings). -- **FlowPilot:** Actions live in page header (Resolve/Escalate/Share Update + overflow). `useBlocker` for active-session nav guard. "Pause & Leave" auto-pauses. -- **AI markers:** `[QUESTIONS]`, `[ACTIONS]`, `[FORK]`, `[DELTA]...[/DELTA]` (editor), `[TREE_UPDATE]` (troubleshooting builder), `[STEPS_UPDATE]` (procedural builder), `[METADATA]`. Parsed in `unified_chat_service.py`; conversation history stores stripped `display_content`. If markers disappear: check system-prompt final reminder + per-user-message `[SYSTEM: ...]` injection in `_call_anthropic_cached()`. -- **Image uploads:** paste/attach → Railway S3 via `uploadsApi.upload()` → resized by `storage_service.resize_image_for_vision()` (Pillow, 1568px max, PNG→JPEG) → base64 → Claude multimodal blocks. Max 3/msg. Images NOT stored in history. -- **Async select-load-apply:** guard with a ref (pattern in `AssistantChatPage` `currentChatRef`). Update synchronously on every selection change; after every `await`, bail out if `ref.current !== thisId`. -- **Editor-Embedded Flow Assist:** `EditorAIPanel` (320px side panel) + `useEditorAI`. Ghost nodes via `_suggestion: true`. Route actions via `settings.get_model_for_action()`. -- **Script Builder:** `/script-builder`, chat-style. 
Backend `ScriptBuilderSession`, `script_builder_service.py`, endpoints `/scripts/builder/`. FlowPilot handoff via `action_type: "open_script_builder"` + `sessionStorage`. -- **Intake form field schema:** `variable_name` + `field_type` (NOT `name` / `type`). -- **Node field priority** (copilot, summaries): `title` → `question` → `description` → `content` → `label`. -- **Procedural sessions auto-start** on page load (no intake/Start screen). Troubleshooting flows DO have a start screen. - ---- - -## Critical lessons - -> Lessons 1-40 archived to `docs/LESSONS-ARCHIVE.md` — fixes baked into the codebase. **Grep the archive when an error message or symptom is unfamiliar, or after two failed attempts at resolving an issue.** Don't pre-load for routine work. - -### Backend / data - -- **APScheduler interval jobs always `max_instances=1`** — without it, overlapping runs reprocess records (TOCTOU). -- **`get_db` rolls back on exception** — never remove the `await session.rollback()`, or one failed request poisons the connection with `InFailedSQLTransaction` cascading. -- **Startup routines on tenant-isolated tables must use `_admin_session_factory()`, not `get_db()`.** Phase 4 RLS has no `app.current_account_id` set at startup. `get_service_account_id` is safe (reads cached `app.state`). -- **Backfill migrations adding `account_id`:** grep ALL `ModelClass(` sites in service code to verify `account_id=` is passed. SQLAlchemy accepts `None` silently — Phase 4 RLS WITH CHECK surfaces the problem at runtime as `InsufficientPrivilegeError: new row violates row-level security policy`. -- **`tree_shares.account_id = tree.account_id`**, never `current_user.account_id`. A super_admin sharing another tenant's tree must produce the share in the tree owner's tenant, or it becomes invisible post-RLS. -- **Global tables (no `account_id`, never in RLS migrations):** `script_categories`, `platform_steps`, `template_trees`, `plan_feature_defaults`, `accounts`. 
Scan at class level — one `.py` file can hold multiple classes with different columns (e.g. `ScriptCategory` vs `ScriptTemplate`). -- **`ai_sessions.status` is VARCHAR(30)** — fits `requesting_escalation` (23 chars). Migration `f0aad74ea51b` widened from 20. -- **PostgreSQL `func.sum(case(...))` returns `Decimal` via asyncpg** — cast to `int()` before Pydantic `dict[str, Any]`. -- **Enhancement / branch_addition proposals need `modified_flow_data` via "Edit & Publish"** — backend 400 on direct approve. Only `new_flow` supports direct approve. -- **Adding email types:** static async method on `EmailService` in `core/email.py`. Fire-and-forget from endpoints (log errors, don't fail the request). - -### AI / FlowPilot - -- **Anthropic SDK `max_retries=1`** — default of 2 can take 3× the timeout. -- **Model tier routing:** `settings.get_model_for_action(action_type)`. Always alias form (`claude-sonnet-4-6`). -- **FlowPilot must ask GUI-vs-script before suggesting either** when both are viable — see `FLOWPILOT_SYSTEM_PROMPT` in `flowpilot_engine.py`. -- **Telemetry events to grep:** `anthropic.cache` (prompt-cache hit/create), `mcp.turn` (per-turn MCP availability), `mcp.fallback` (MCP silent-retry fired). -- **Don't put literal payloads in system prompts.** Bit us twice in one day: a worked `[QUESTIONS]` example with literal "Outlook + jsmith" content, and a full DNS troubleshooting tree, both caused Claude to recite that content on unrelated tickets — the symptom looked like task-lane state leaking across chats. The fix is structural: every output example in a system prompt uses `` syntax (`{"text": ""}`), never literal field values. Real-looking format examples live in few-shot messages (separate file, separate code path), not system prompts. 
Guardrail: `tests/test_prompt_anti_parrot.py` scans every `*_PROMPT`/`*_SCHEMA`/`*_PROTOCOL`/`*_FORMAT` constant in `app/services/` and `app/core/`; CI fails when a marker block contains a literal JSON value or when a known leaked token (jsmith, DC01, ADSync, Dnscache, etc.) appears anywhere in a prompt. - -### Frontend / UI - -- **Flex height chain:** every ancestor from `app-shell` grid to React Flow canvas needs `flex` + `flex-1` + `min-h-0` or `h-full`. Missing `flex` collapses to 0. Same rule for FlowPilot action bar and any tall scroller. -- **React Flow CSS in Tailwind v4:** import in `index.css`, not component JS. Override dark theme via `--xy-*` CSS vars. -- **`text-secondary` renders invisible on dark** — Tailwind v4 maps it to `--color-secondary` (a surface color). Use `text-muted-foreground` for readable secondary text. Avoid `text-muted` for body — labels only. -- **`bg-accent` is electric blue — never for code/kbd.** Use `bg-white/[0.12] border border-white/[0.06]` for inline code, `bg-white/[0.08]` for kbd. Accent reserved for interactive elements. -- **`landing.css` uses self-contained `--lp-*` vars** — never `var(--color-*)` theme tokens (they resolve incorrectly outside the app shell). -- **Never `transition: all`** — list properties explicitly, or layout props animate and jank. -- **Date range filter end dates:** `setHours(23, 59, 59, 999)` before sending, or the day's items are excluded. For string-based date inputs, append `T23:59:59.999Z`. -- **TopBar search:** full bar `hidden sm:block`, icon button `sm:hidden` — both open CommandPalette. -- **Hover pop-out cards:** scrim `pointer-events-none`, expanded card has its own click handler at `z-50`, dismiss via `onMouseLeave` on wrapper. Never put handlers on the scrim. -- **`tsc -b` in Dockerfile is stricter than `tsc --noEmit`** — enforces `noUnusedLocals` / `noUnusedParameters` as hard errors. Check IDE yellow squiggles before pushing. 
-- **Dashboard prefill auto-submits** via `useEffect` + `prefillHandledRef` guard — no double-enter. -- **Global Axios 5xx interceptor fires before component `.catch()`** — fix optional-data endpoints at the source (return `[]` / `{}` on provider failure), not in the component. -- **Playwright strict mode:** scope selectors to avoid sidebar/main ambiguity. Use `getByRole('heading', { name })` or `.animate-scale-in` locators, not bare `getByText()`. - -### Env / infra - -- **Node 20.19+ required** (Vite 7). `nvm use 20` or `PATH="$HOME/.nvm/versions/node/v20.19.0/bin:$PATH"`. -- **Railway backend service is `patherly`, DB name `railway`.** Public Postgres proxy: `interchange.proxy.rlwy.net:45797`. -- **Railway Object Storage bucket `resolutionflow-uploads`.** Env vars `STORAGE_*`. boto3 in `storage_service.py`. Dockerfile needs Pillow + `libjpeg-dev` / `zlib1g-dev`. -- **PostHog:** `PostHogProvider` + `posthog.init()` in `main.tsx`. Helpers in `lib/analytics.ts`. Env: `VITE_PUBLIC_POSTHOG_KEY`, `VITE_PUBLIC_POSTHOG_HOST`. `identifyUser()` in `authStore.fetchUser()`, `resetAnalytics()` on logout. -- **bun PATH on devserver01:** `BUN_INSTALL="$HOME/.bun"`, `PATH="$BUN_INSTALL/bin:$PATH"`. Playwright Chromium needs `libatk1.0-0 libatk-bridge2.0-0 libcups2 libxkbcommon0 libatspi2.0-0 libxcomposite1 libxdamage1 libxfixes3 libxrandr2 libgbm1 libasound2`. -- **Full-stack change:** trace schema → endpoint → API client → hook → store → UI. Don't assume one end proves the other. -- **Dev env** — see DEV-ENV.md for current topology, `REPO_ROOT` requirement when compose runs inside a container, Vite `allowedHosts`, linuxserver.io `group_add` + custom-cont-init.d workaround, `docker compose up` no-op-on-unchanged-hash gotcha. - ---- - -## GitNexus code intelligence +### GitNexus code intelligence Indexed as `resolutionflow`. Earns its cost on cross-cutting work only. @@ -224,9 +52,7 @@ Indexed as `resolutionflow`. Earns its cost on cross-cutting work only. 
Re-indexes automatically on commit (PostToolUse hook). Manual refresh if stale: `npx gitnexus analyze`. ---- - -## gstack skills +### gstack skills Always use `/browse` for web, never `mcp__claude-in-chrome__*`. Most-used: @@ -238,28 +64,10 @@ Always use `/browse` for web, never `mcp__claude-in-chrome__*`. Most-used: - `/codex` — OpenAI Codex second opinion - `/plan-eng-review` / `/plan-design-review` / `/plan-ceo-review` — plan critiques ---- +### Git trailer -## Deployment (Railway) +Every commit: `Co-Authored-By: Claude Opus 4.7 ` -- **Prod:** `resolutionflow.com` (frontend), `api.resolutionflow.com` (backend). -- Auto-deploy: Gitea push → GitHub mirror → Railway follows GitHub `main`. -- PR environments auto-created; need manual domain generation + `VITE_API_URL` with `https://` prefix. -- `ALLOW_RAILWAY_ORIGINS=true` for `*.up.railway.app` CORS. -- Shared Variables (Railway project-level) auto-propagate to PR envs — use for secrets like `ANTHROPIC_API_KEY`. -- Super admin utility: `backend/make_superadmin_simple.py list|`. +### Model aliases ---- - -## Quick reference - -| What | Where | -|---|---| -| Detailed status | [CURRENT-STATE.md](CURRENT-STATE.md) | -| Roadmap | [03-DEVELOPMENT-ROADMAP.md](03-DEVELOPMENT-ROADMAP.md) | -| Design system | [DESIGN-SYSTEM.md](DESIGN-SYSTEM.md) | -| Dev env | [DEV-ENV.md](DEV-ENV.md) | -| Archived lessons | [docs/LESSONS-ARCHIVE.md](docs/LESSONS-ARCHIVE.md) | -| ConnectWise API | `docs/connectwise/` | -| GitHub issues | `gh issue list --state open` | -| Local API docs | | +Always use alias form (`claude-sonnet-4-6`, `claude-opus-4-6`, etc.) via `settings.get_model_for_action()`. Never hardcode a dated model ID. 
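The model-alias rule above could be checked mechanically along these lines. This is a rough sketch under one stated assumption: a "dated" model ID is taken to be one that ends in an eight-digit date suffix. The helper names and the dated ID in the usage note are hypothetical illustrations, not repo code or confirmed identifiers.

```python
import re

# Assumption: a dated model ID ends in an 8-digit date stamp (-YYYYMMDD),
# while alias forms such as claude-sonnet-4-6 do not.
DATED_SUFFIX = re.compile(r"-\d{8}$")


def is_dated_model_id(model_id: str) -> bool:
    """True when the ID carries a trailing date stamp instead of being an alias."""
    return bool(DATED_SUFFIX.search(model_id))


def find_dated_model_ids(source: str) -> list[str]:
    """Return quoted claude-* string literals in source that look dated."""
    literals = re.findall(r"""["'](claude-[a-z0-9.\-]+)["']""", source)
    return [m for m in literals if is_dated_model_id(m)]
```

For example, `find_dated_model_ids('M = "claude-opus-4-6-20250101"')` would flag the made-up dated literal, while the alias `claude-opus-4-6` passes. Anything flagged should be routed through `settings.get_model_for_action()` instead.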
diff --git a/SESSION-HANDOFF.md b/SESSION-HANDOFF.md deleted file mode 100644 index f536317c..00000000 --- a/SESSION-HANDOFF.md +++ /dev/null @@ -1,70 +0,0 @@ -# Session Handoff — Design System v4 Migration - -> **For the next Claude session:** Read this file completely, internalize the context, then delete it (`rm SESSION-HANDOFF.md`). This is a one-time context transfer. - ---- - -## What Was Done This Session - -### 1. FlowPilot Message Bar + AI Script Builder (MERGED to main) -- PR #118 merged. Always-visible message bar in FlowPilot sessions, AI Script Builder at `/script-builder`, library reorg (My/Team Scripts tabs), FlowPilot-to-Script-Builder handoff, session abandon/close, unified session history. -- Eng review completed: normalized `script_builder_messages` table, typed content helpers, 6 edge case tests. - -### 2. Design System v4 Migration (PR #119, open, branch: `refactor/design-system-v4`) -- Complete frontend redesign from glassmorphism to flat dark theme (Sentry/PostHog-inspired) -- **CSS Foundation:** New color tokens in `index.css`, all via CSS custom properties. Light mode ready (just needs `.light` class values). -- **Icon Rail Sidebar:** 72px rail with 5 grouped icons (Home, Work, Knowledge, Insights, Help). Full-height resizable drawer on hover. Pin-to-expand to 260px. Mobile hamburger overlay. -- **Component Sweep:** ~200 files migrated. All hardcoded hex replaced with semantic Tailwind tokens (bg-card, text-foreground, border-border, etc.). -- **Landing Page:** Flat surfaces, no glow, solid buttons. -- **Interactive Shadows:** Dark-mode-aware — elevated surfaces + faint cyan accent glow (black shadows invisible on dark bg). -- **Stat Cards:** 3px colored left borders. -- **Tab Toggles:** Active state uses `tab-active-shadow` (elevated bg + faint glow). - -### 3. GTM Strategy (from /office-hours) -- Shadow & Ship approach: Michael uses ResolutionFlow on real tickets for 2 weeks, then hands logins to 5 MSP colleagues. 
Key metric: unprompted return. -- Design doc at `~/.gstack/projects/patherly-patherly/` - ---- - -## What Needs To Be Done Next - -### Immediate (Design System v4 polish) -1. **Home icon color fix:** The Home icon in the sidebar shouldn't have a cyan background when not active. Instead, the Home icon itself should always be cyan (brand accent), and only show the `bg-accent-dim` background when the route is actually `/`. Michael specifically requested this. -2. **Visual QA pass:** Michael hasn't done a full page-by-page walkthrough yet. Expect feedback on individual pages once he does. -3. **`font-label` cleanup:** ~10 files still reference `font-label` (deprecated alias for `font-mono`). Each needs inspection — some should be `font-mono`, others `font-sans text-xs`. -4. **Inline `style` attributes:** ~29 instances still use hardcoded hex in inline styles (sidebar, drawer, badges). Should be converted to CSS variable references or Tailwind classes where possible. - -### Before Merging PR #119 -- Run migrations: `docker exec resolutionflow_backend alembic upgrade head` (new tables from the Script Builder PR are on main now) -- Full visual QA with backend running -- Test mobile responsive (hamburger menu) -- Test FlowPilot session with new message bar + action bar positioning - -### Future -- **Light mode toggle:** CSS variables are ready. Need to add `.light` class values in `index.css` + toggle in user settings/account page. -- **Script Builder testing:** The AI Script Builder hasn't been tested end-to-end with the backend running yet. 
- ---- - -## Key Files to Know - -| File | What it does | -|------|-------------| -| `DESIGN-SYSTEM.md` | Single source of truth for all design decisions | -| `frontend/src/index.css` | CSS tokens, component utilities, shadow patterns | -| `frontend/src/components/layout/Sidebar.tsx` | Icon rail + drawer + pinned sidebar | -| `frontend/src/components/layout/AppLayout.tsx` | CSS Grid shell | -| `frontend/src/components/dashboard/StartSessionInput.tsx` | The Guided/Chat toggle | -| `frontend/src/components/dashboard/PerformanceCards.tsx` | Stat cards with colored borders | - -## Key Lessons From This Session - -- The component sweep agents missed `editor-ai/`, `guides/`, `maintenance/`, `scripts/`, `settings/` directories and `text-brand-dark` references. Always do a final grep audit after sweeps. -- `bg-[#hex]` hardcoding defeats the purpose of CSS variables. We had to do a second pass to replace 3,200+ hardcoded values with semantic tokens. -- Black shadows (`rgba(0,0,0,...)`) are invisible on dark backgrounds. Use elevated surfaces + faint accent glow instead. -- The sidebar flyout needed `position: fixed` to escape the CSS Grid cell clipping — `absolute` positioning was hidden behind the main content area. -- Flyout hover timing: individual item `onMouseLeave` was killing the flyout before the mouse reached the drawer. Only the outer wrapper should handle `onMouseLeave`. - ---- - -> **After reading this file:** Save relevant context to your session memory, then run `rm SESSION-HANDOFF.md` and `git add -A && git commit -m "chore: remove session handoff file"`.
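The "final grep audit after sweeps" lesson above might look something like the following. It is a minimal sketch, assuming the audit targets Tailwind arbitrary-value classes with literal hex colours. Only the `bg-[#hex]` form is named in the lessons; the `text-` and `border-` prefixes and the function name are added here as assumptions.

```python
import re

# Matches Tailwind arbitrary-value classes with a literal hex colour, e.g. bg-[#1e2028].
# The text- and border- prefixes are an assumption beyond the bg-[#hex] case in the lessons.
HARDCODED_HEX = re.compile(r"\b(?:bg|text|border)-\[#[0-9a-fA-F]{3,8}\]")


def find_hardcoded_hex(source: str) -> list[tuple[int, str]]:
    """Return (line_number, match) pairs for every hardcoded hex class in source."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in HARDCODED_HEX.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits
```

Running this over each file a sweep claimed to migrate gives a quick list of leftovers. Semantic tokens such as `bg-card` are not matched, so a clean result suggests the sweep reached that file.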