220 Commits

Author SHA1 Message Date
e110fedfe4 chore: snapshot CLAUDE.md before ai-handoff migration 2026-04-24 14:21:21 -04:00
dab740ddf7 fix(tests): isolate test DB from dev DB and plug admin-db override gap
Root cause of the 06:32 AM outage: running 'pytest tests/' inside the
resolutionflow_backend container silently dropped the public schema on
the DEV database. Two layered bugs made this possible; both are fixed.

Bug 1 — env-var lookup in conftest.TEST_DATABASE_URL put DATABASE_URL
(which normally points at the dev/prod DB) ahead of DATABASE_TEST_URL.
When DATABASE_URL is set, pytest used the dev DB as the 'test' DB and
the test_db fixture's DROP SCHEMA public CASCADE wiped it. Fixed:
  - Honor only DATABASE_TEST_URL (or the localhost fallback).
  - Assert at module load that the DB name contains 'test' and refuse
    to run otherwise, so this misconfiguration fails fast instead of
    dropping a non-test schema.
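
A minimal sketch of the Bug 1 guard described above, assuming a helper named resolve_test_database_url and a placeholder localhost fallback URL (both hypothetical; the real conftest code and its fallback value are not shown in this log). It reads only DATABASE_TEST_URL and refuses any database whose name lacks 'test':

```python
import os
from urllib.parse import urlparse

# Hypothetical fallback; the actual localhost default is not shown in the commit.
LOCAL_FALLBACK = "postgresql+asyncpg://postgres:postgres@localhost:5432/resolutionflow_test"


def resolve_test_database_url(env=None):
    """Return the test DB URL, honoring only DATABASE_TEST_URL."""
    env = os.environ if env is None else env
    # DATABASE_URL is deliberately never consulted here.
    url = env.get("DATABASE_TEST_URL") or LOCAL_FALLBACK
    db_name = urlparse(url).path.lstrip("/")
    if "test" not in db_name.lower():
        raise RuntimeError(
            f"Refusing to run tests against database {db_name!r}: "
            "the name must contain 'test'"
        )
    return url
```

Calling the helper at module import time (for example, assigning its result to TEST_DATABASE_URL) would make pytest abort during collection, before any fixture can issue DROP SCHEMA.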

Bug 2 — conftest overrode app.dependency_overrides[get_db] but not
get_admin_db. Endpoints using get_admin_db (register, admin routes)
bypassed the test session and hit the real admin DB. Before Bug 1 was
fixed this was hidden because both engines pointed at the same dev DB.
With isolation in place, register started failing 'Email already
registered' because of stale users in the dev DB. Fixed:
  - Also override get_admin_db to yield the same test session. RLS is
    not enabled in the create_all-managed test schema, so sharing is
    safe.

Also adds DATABASE_TEST_URL=resolutionflow_test to docker-compose.dev.yml
so pytest in the container works out of the box.

Verified: 49/50 Phase 8 + 9 tests pass against resolutionflow_test; the
1 failure is the pre-existing Phase 8 Issue #4
(test_record_decision_persists_and_bumps_state_version).

Refs gitea #145 (will update that issue with this as the primary fix).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 13:14:08 -04:00
24972e8444 fix(pilot): Phase 9 review — partial-outcome notes + per-fix script-builder remount
Addresses docs/FlowAssist_Migration/Issues/phase-9-review-issues.md.

Issue #1 (High): "Applied partially" from the escalation intercept was
silently dropped because the backend requires notes on applied_partial and
the dialog sent none. The error was swallowed and the UI advanced into the
conclude flow as if the outcome had been recorded.
- EscalateInterceptDialog now has a two-step flow: clicking the partial
  choice reveals a notes textarea (autofocused, required non-empty) plus
  Back / "Record partial & escalate" buttons.
- onChoose signature extended to (choice, notes?).
- handleInterceptChoice passes notes to patchOutcome; on failure it
  surfaces a toast and does NOT advance to the conclude modal, so the
  intercept stays open for retry.

Issue #2 (Medium/High): ScriptBuilderTab kept local state across active-fix
changes within the same pilot session, so a stale draft could PATCH against
a newer fix.id. Added key={activeFix.id} on the mount, which forces a clean
remount per fix. The backend get-or-create (keyed on user+ai_session_id)
still returns the same session row, which is the intended resume-on-refresh
semantic, but the messages/editorBuffer/latestScript local state resets.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 11:08:00 -04:00
d386d11af2 docs(pilot): correct Phase 9 migration description
Handoff + migration spec incorrectly claimed Phase 9 added a new
parent_pilot_session_id FK. The implementation reuses the existing
ai_session_id column; the migration only adds the origin discriminator
+ partial unique index. Also: ScriptBuilderTab wraps ScriptBuilderChat
and ScriptBodyEditor (Monaco), not "ScriptBuilderChat in ephemeral
mode" — there is no ephemeral mode on the presentational component.

Applies applied_at call-site specifics: handleScriptDecision stamps
on one_off/draft_template, TemplateMatchPanel stamps on onMarkRun,
Script Builder tab Submit does not stamp.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 06:17:08 -04:00
65a831bf9a docs(pilot): Phase 9 handoff + migration spec update
Marks open items #1 (NoTemplateDialog narrow-lane) and #3 (Tabbed
Script Builder) as resolved. Records the applied_at semantics
correction as shipped. Final Phase 9 row added to the 'What shipped'
table.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 06:14:41 -04:00
faf1d8dd12 fix(pilot): applied_at stamps on run-declaring actions, not Apply click
Per Phase 9 §5. Before: banner Apply click stamped applied_at
regardless of whether the engineer had committed to running anything,
starting the Verifying timer prematurely. After:

- handleApplyFix no longer calls applyFix(). It just routes to the
  right surface (TemplateMatchPanel / InlineNoTemplateDialog / Script
  Builder tab).
- handleScriptDecision stamps applied_at for one_off + draft_template
  (both labels are 'Run now, …' — the click is the declaration).
  build_template does not stamp.
- TemplateMatchPanel's new 'I ran this' button calls applyFix via a
  new onMarkRun prop.
- Script Builder tab Submit does not stamp (a draft is not a run).
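
The stamping rule in the list above can be summarized as a small predicate. This is a language-neutral sketch written in Python (the real call sites are frontend handlers); the action names "mark_run", "banner_apply", and "script_builder_submit" are hypothetical labels for the corresponding UI actions, while one_off, draft_template, and build_template come from the commit text:

```python
# Decisions whose button labels read "Run now, ..." per the commit above.
RUN_DECLARING_DECISIONS = frozenset({"one_off", "draft_template"})


def stamps_applied_at(action: str) -> bool:
    """Return True when a UI action counts as the engineer declaring a run."""
    if action in RUN_DECLARING_DECISIONS:
        return True
    # "mark_run" stands in for TemplateMatchPanel's 'I ran this' button.
    if action == "mark_run":
        return True
    # Banner Apply, build_template, and Script Builder Submit do not stamp.
    return False
```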

No backend change — the /apply endpoint is unchanged. Only call sites
move.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 04:11:56 -04:00
0386fa1fd5 feat(pilot): mount ChatTabStrip + ScriptBuilderTab + InlineNoTemplateDialog
Wires the three new components into AssistantChatPage:
- ChatTabStrip renders when the active fix needs a script drafted.
- ScriptBuilderTab sits alongside chat via display:none toggling so
  chat scroll position + builder state both persist.
- InlineNoTemplateDialog replaces the task-lane bottomSlot render for
  the drafted-script evaluation case; three cards finally fit.
- Banner Apply routing updated: no-draft/no-template → Script Builder
  tab; drafted → InlineNoTemplateDialog; template → unchanged path.

applyFix() call site moves land in the next task.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 04:02:20 -04:00
82db1c78e4 feat(pilot): EscalateInterceptDialog — fourth 'partial' choice
Closes the gap Phase 8 final review flagged. When a fix is in
applied_partial state and the engineer escalates, the intercept no
longer forces them to approximate with didn't-work/worked/never-applied.

AssistantChatPage's handleInterceptChoice (Task 13) already dispatches
to patchOutcome for any FixOutcome value, so no handler change is
needed — the type already supports applied_partial.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 03:04:05 -04:00
f930787200 feat(pilot): TemplateMatchPanel — explicit 'I ran this' action
Generate and Copy alone don't declare a run — the engineer can walk
away after copying. Phase 9 §5 defines an explicit run-declaration
affordance so applied_at only stamps on the engineer's positive
commitment. Wiring from AssistantChatPage lands in Task 13.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 03:02:17 -04:00
5bcb7aa7c3 feat(pilot): InlineNoTemplateDialog — chat-region placement wrapper
Slide-up wrapper around the existing NoTemplateDialog for rendering
in the chat region above the composer (parallel to ProposalBanner).
The chat region's width lets grid-cols-3 finally work as intended.

No change to NoTemplateDialog itself; decision callbacks and card
copy stay identical.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 02:56:35 -04:00
04fbfe3b8f feat(pilot): ScriptBuilderTab controller
Owns the inline Script Builder session lifecycle:
- Get-or-create (origin='pilot_inline', ai_session_id) on mount.
- Renders ScriptBuilderChat in AI mode and CodeModeEditor (Monaco) in
  'Write it myself' mode. Mode toggles via display:none so buffer and
  messages persist across switches.
- Submit → sessionSuggestedFixesApi.patchScript; emits onScriptDrafted
  to parent, which refreshes the fix and hides the tab strip.
- Relays in-progress state to the parent via onProgressChange for the
  ChatTabStrip's indicator dot.

ScriptBuilderChat is untouched (stays presentational). Persistence
semantics live on the controller, not the display component.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 02:55:12 -04:00
f92cbefed9 feat(pilot): ChatTabStrip component — [Chat] [Script Builder ●]
Two-tab strip for the chat region. Parent controls mounting (strip only
appears when the fix needs a script drafted). Indicator dot signals
in-progress draft state. Tab switching via onChange callback; parent
handles display:none toggling so tab contents preserve state.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 02:45:16 -04:00
c9306e40c9 feat(pilot): frontend API client — patchScript + inline createSession
sessionSuggestedFixesApi.patchScript(sessionId, fixId, script, params?)
hits the new PATCH /script endpoint.

scriptBuilder.createSession accepts an optional options bag with
origin + aiSessionId, defaulting to standalone when omitted so legacy
callers stay behavior-preserving.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 02:38:07 -04:00
1c855563ee feat(pilot): PATCH /suggested-fixes/:id/script endpoint
Called by the inline Script Builder tab on Submit. Writes
ai_drafted_script + ai_drafted_parameters to the fix without stamping
applied_at (a draft is not an application — that's §5 of the Phase 9
spec). Bumps state_version so Resolve/Escalate preview bundles
regenerate.

409 on terminal fix status. 404 on wrong session. 422 on empty script.
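
A framework-free sketch of the endpoint's decision logic, under stated assumptions: the Fix and Session dataclasses, the HttpError type, the set of terminal statuses, and the order of the checks are all illustrative and not taken from the real handler.

```python
from dataclasses import dataclass, field
from typing import Optional

# Assumption: these are the statuses the backend treats as terminal.
TERMINAL_STATUSES = {"applied_success", "applied_failed", "dismissed"}


class HttpError(Exception):
    """Stand-in for the web framework's HTTP exception."""

    def __init__(self, status_code, detail):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


@dataclass
class Session:
    id: str
    state_version: int = 0


@dataclass
class Fix:
    id: str
    session_id: str
    status: str = "proposed"
    ai_drafted_script: Optional[str] = None
    ai_drafted_parameters: dict = field(default_factory=dict)
    applied_at: Optional[str] = None


def patch_script(session, fix, script, params=None):
    """Store a drafted script on the fix without declaring a run."""
    if fix.session_id != session.id:
        raise HttpError(404, "fix not found in this session")
    if not script or not script.strip():
        raise HttpError(422, "script must not be empty")
    if fix.status in TERMINAL_STATUSES:
        raise HttpError(409, "fix already has a terminal outcome")
    fix.ai_drafted_script = script
    fix.ai_drafted_parameters = dict(params or {})
    # applied_at is intentionally left untouched: a draft is not a run.
    session.state_version += 1  # lets cached preview bundles regenerate
    return fix
```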

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 02:34:06 -04:00
d4fae87236 feat(pilot): inline Script Builder session — idempotent create + auth + filtered list
POST /script-builder/sessions now supports origin='pilot_inline':
- Requires ai_session_id; validates it against current user ownership.
- Get-or-create: returns existing row for (user, ai_session_id) pair.
- Partial unique index on the DB backs the invariant; races resolve to
  the single winner row.

list_sessions + count_user_sessions default-scope to origin='standalone'
so inline scratch sessions don't pollute the /script-builder dashboard
or count against the 5-session cap.
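
The get-or-create and default-scoping behavior above can be sketched with the standard-library sqlite3 module. This is an illustration only: the real service runs on a different database and ORM, and the helper names, the index name, and the reduced column set are assumptions.

```python
import sqlite3


def open_db():
    """Create an in-memory table with the partial unique index."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE script_builder_sessions (
            id INTEGER PRIMARY KEY,
            user_id INTEGER NOT NULL,
            ai_session_id TEXT,
            origin TEXT NOT NULL
        );
        CREATE UNIQUE INDEX uq_inline_session
            ON script_builder_sessions (user_id, ai_session_id)
            WHERE origin = 'pilot_inline';
        """
    )
    return conn


def get_or_create_inline(conn, user_id, ai_session_id):
    """Insert the inline row, or return the existing row's id on conflict."""
    try:
        with conn:  # commits on success, rolls back on error
            cur = conn.execute(
                "INSERT INTO script_builder_sessions (user_id, ai_session_id, origin) "
                "VALUES (?, ?, 'pilot_inline')",
                (user_id, ai_session_id),
            )
            return cur.lastrowid
    except sqlite3.IntegrityError:
        # Another request won the race; read back the single winner row.
        row = conn.execute(
            "SELECT id FROM script_builder_sessions "
            "WHERE user_id = ? AND ai_session_id = ? AND origin = 'pilot_inline'",
            (user_id, ai_session_id),
        ).fetchone()
        return row[0]


def count_user_sessions(conn, user_id):
    """Count only standalone sessions, so inline rows never hit the cap."""
    return conn.execute(
        "SELECT COUNT(*) FROM script_builder_sessions "
        "WHERE user_id = ? AND origin = 'standalone'",
        (user_id,),
    ).fetchone()[0]
```

Relying on the unique index rather than a SELECT-then-INSERT keeps the operation safe when two mounts race: the loser's insert fails and it reads the winner's row.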

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 02:24:57 -04:00
f2fce27f0d feat(pilot): pydantic schemas for inline origin + script PATCH
- ScriptBuilderCreateRequest gains origin ('standalone' | 'pilot_inline')
  and optional ai_session_id. Handler-side validation (next task) enforces
  pilot_inline ⇒ ai_session_id required + owned by caller.
- SessionSuggestedFixScriptRequest added for the new PATCH /script
  endpoint (Phase 9 Task 6).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 01:53:28 -04:00
93c974466a feat(pilot): script_builder_sessions.origin on SQLAlchemy model
Mirrors the DB column added in the prior migration. App-level default
is 'standalone' so existing callers of ScriptBuilderSession(...) work
without code changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 01:48:22 -04:00
8012668975 feat(pilot): add origin + inline idempotency to script_builder_sessions
Phase 9 prep. Adds:
- origin VARCHAR(20) NOT NULL with CHECK ('standalone' | 'pilot_inline')
- invariant: pilot_inline rows must have ai_session_id
- partial unique index on (user_id, ai_session_id) WHERE origin='pilot_inline'
  — backs get-or-create idempotency for the inline Script Builder tab.
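
A rough illustration of the three constraints listed above, using SQLite through Python's sqlite3 module so the sketch is runnable on its own. The real migration targets the project's production database; the DDL spelling, the index name, and the violates helper here are assumptions.

```python
import sqlite3

DDL = """
CREATE TABLE script_builder_sessions (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    ai_session_id TEXT,
    origin VARCHAR(20) NOT NULL
        CHECK (origin IN ('standalone', 'pilot_inline')),
    -- coherence invariant: inline rows must point at a pilot session
    CHECK (origin <> 'pilot_inline' OR ai_session_id IS NOT NULL)
);
CREATE UNIQUE INDEX uq_sbs_inline
    ON script_builder_sessions (user_id, ai_session_id)
    WHERE origin = 'pilot_inline';
"""

INSERT = (
    "INSERT INTO script_builder_sessions (user_id, ai_session_id, origin) "
    "VALUES (?, ?, ?)"
)


def violates(conn, user_id, ai_session_id, origin):
    """Return True if inserting the row breaks a constraint."""
    try:
        with conn:
            conn.execute(INSERT, (user_id, ai_session_id, origin))
        return False
    except sqlite3.IntegrityError:
        return True
```

Because the unique index is partial, standalone rows may still share a (user_id, ai_session_id) pair; only pilot_inline rows are deduplicated.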

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 00:22:53 -04:00
563bb1aa6f docs(pilot): Phase 9 implementation plan
14-task plan covering:
- DB migration for origin + partial unique index on script_builder_sessions
- Pydantic schemas for inline origin + PATCH /script
- POST /script-builder/sessions idempotent for pilot_inline + auth
- list_sessions + count_user_sessions filtered to standalone
- PATCH /suggested-fixes/:id/script (bumps state_version, no applied_at)
- Frontend API client additions
- ChatTabStrip, ScriptBuilderTab (controller), InlineNoTemplateDialog
- TemplateMatchPanel 'I ran this' action
- EscalateInterceptDialog fourth 'partial' choice
- AssistantChatPage integration + applyFix call-site relocation
- Docs + handoff updates

Paired with the spec at phase-9-script-builder-tab.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 00:03:57 -04:00
1d2d548fc8 docs(pilot): Phase 9 spec — final consistency polish
- Frontend scriptBuilder API client inventory now matches the backend
  schema: createSession accepts BOTH origin and ai_session_id (both
  required together for inline callers, both omitted for standalone).
- 'If template -> unchanged' sharpened: render location is unchanged,
  but run stamping moves into the panel's new 'I ran this' action per
  the §5 apply lifecycle correction.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 23:54:04 -04:00
3ee0101c6d docs(pilot): Phase 9 spec — ownership + schema corrections
- scriptBuilderMode ownership: pinned to ScriptBuilderTab, removed from
  AssistantChatPage's state list. Parent never drives the AI/editor
  toggle; controller owns it and resets naturally on session switch via
  unmount/remount. scriptBuilderHasProgress stays on the page (needed
  for the tab strip indicator dot) and is driven by the controller via
  an onProgressChange callback.
- ScriptBuilderCreateRequest schema: explicitly calls for TWO new
  optional fields (origin + ai_session_id), not just origin. Handler
  enforces: when origin='pilot_inline', ai_session_id is required and
  must pass the current-user ownership check.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 23:49:08 -04:00
861d082ff7 docs(pilot): Phase 9 spec — consistency pass on Apply stamp call sites
Three consistency fixes:
- File inventory (backend + frontend) now names all three apply-stamp
  call sites: handleScriptDecision('one_off' | 'draft_template') plus
  TemplateMatchPanel's 'I ran this' handler. Previously listed only
  'one_off' in two places, contradicting the §5 lifecycle table.
- NoTemplateDialog relocation section no longer claims the decision
  handler is 'unchanged' — it is unchanged EXCEPT for the moved
  apply stamp, which is the point of §5.
- Open deferrals entry on ScriptBuilderChat 'ephemeral mode' removed;
  replaced with the actual new surface (ScriptBuilderTab controller),
  which reuses the existing script-builder prompt unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 23:41:17 -04:00
75b59123e6 docs(pilot): Phase 9 spec — fix Apply semantics + session idempotency
Four review findings addressed:

- High: draft_template 'Run now, templatize after' DOES run the
  script; applied_at table now stamps for both one_off and
  draft_template. Only build_template (no run) skips the stamp.
- Medium: TemplateMatchPanel needs an explicit '✓ I ran this' button.
  Generate/Copy don't commit to running. The new button is the stamp
  moment for template-match fixes.
- Medium: get-or-create for inline script_builder_sessions —
  POST /script-builder/sessions is now idempotent for
  origin='pilot_inline' (returns the existing row for a
  (user, ai_session_id) pair). Backed by a partial unique index:
    UNIQUE (user_id, ai_session_id) WHERE origin = 'pilot_inline'
  so remount doesn't create duplicates and draft continuity is
  preserved.
- Medium: authorization — the create endpoint validates that any
  provided ai_session_id is owned by the current user (same guard
  other pilot endpoints use). Prevents cross-user attachment of
  scratch sessions to arbitrary pilot sessions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 23:34:53 -04:00
fcd224429c docs(pilot): revise Phase 9 spec per review findings
Four findings addressed:

- High: drop proposed parent_pilot_session_id column; reuse the
  existing ai_session_id FK on script_builder_sessions. Add an
  origin + ai_session_id coherence invariant.
- High: don't add a 'mode' prop to ScriptBuilderChat (it's
  presentational). Introduce a ScriptBuilderTab controller that owns
  session lifecycle + submit, renders ScriptBuilderChat unchanged.
- Medium: filter list_sessions / count_user_sessions to origin='standalone'
  so pilot_inline scratch sessions don't pollute the /script-builder
  dashboard or count against the 5-session cap.
- Medium: applied_at is stamped only when the engineer commits to a
  run-action (one_off, TemplateMatchPanel Run), not on banner Apply
  click. Corrects a Phase 8 over-eager stamp that would otherwise
  multiply across three surfaces.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 23:28:53 -04:00
196c003876 docs(pilot): Phase 9 spec — tabbed Script Builder + NoTemplateDialog relocation
Design doc for the FlowPilot migration's remaining open items:
- NoTemplateDialog narrow-lane bug (resolved by moving the dialog to
  the chat region alongside ProposalBanner — three cards fit naturally
  at that width; grid-cols fix no longer needed)
- Tabbed Script Builder inside the chat (new [Chat] [Script Builder ●]
  tab strip; AI chat default with 'Write it myself' Monaco escape hatch)

Plus a Phase 8 cleanup:
- EscalateInterceptDialog fourth 'I applied some of it — partial' choice

All six architecture decisions settled via brainstorming before writing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 23:03:57 -04:00
f2b9476edb docs(pilot): log Issues #1-4 findings for Phase 8 review
Tracks the three code-review issues that were fixed on this branch
(#1 outcome-aware previews, #2 persist Apply, #3 persist proposal
rejection) plus a newly-documented pre-existing test failure (#4 —
decision-endpoint test written in Phase 3 never updated when Phase 5
added the drafted-script validation guard).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 22:18:13 -04:00
70c5da0c75 fix(pilot): persist AI-proposal rejection + clear on outcome write
Issue #3 from phase-8-review-issues.md. 'Not yet' on the AI-confirming
banner was a local-state hide; the proposal re-surfaced on the next
refreshSessionDerived call.

Two-part fix:
- PATCH /outcome now clears ai_outcome_proposal on any terminal action
  (engineer has taken a decision; stale AI proposal is moot).
- New DELETE /ai-sessions/:sid/suggested-fixes/:fid/ai-outcome-proposal
  endpoint for explicit 'Not yet' rejection. Does not touch status
  or state_version — pure UI state.

Frontend handleRejectAIProposal now calls the DELETE and setActiveFix
with the server response.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 22:15:48 -04:00
de2bef3175 fix(pilot): persist Apply — stamp applied_at on click
Issue #2 from phase-8-review-issues.md. Apply was client-side-only via
a bannerApplied flag. Refresh / chat reselect / multi-tab would drop
Verifying state back to Proposed.

- New POST /ai-sessions/{sid}/suggested-fixes/{fid}/apply stamps
  applied_at without changing status (still 'proposed'). Idempotent
  if already stamped; 409 if fix is past proposed (a terminal outcome
  was already recorded).
- Bumps state_version so resolve/escalate preview bundles reflect that
  the fix has entered verifying.
- Frontend handleApplyFix calls the endpoint and uses the returned
  applied_at directly. bannerApplied client flag is removed.
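
A small sketch of the idempotent stamp described above. The Fix and Session dataclasses and the Conflict exception are stand-ins, and bumping state_version only on the first stamp is an assumption (the commit does not say what a repeat call does to the version).

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


class Conflict(Exception):
    """Stands in for an HTTP 409 response."""


@dataclass
class Session:
    state_version: int = 0


@dataclass
class Fix:
    status: str = "proposed"
    applied_at: Optional[datetime] = None


def apply_fix(session, fix, now=None):
    """Stamp applied_at once; status stays 'proposed'."""
    if fix.status != "proposed":
        raise Conflict("an outcome was already recorded for this fix")
    if fix.applied_at is None:
        fix.applied_at = now or datetime.now(timezone.utc)
        session.state_version += 1  # assumption: bump only on first stamp
    return fix
```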

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 22:10:52 -04:00
362c7b1d79 fix(pilot): outcome-aware Resolve/Escalate previews
Issue #1 from phase-8-review-issues.md. Cache invalidation alone isn't
enough — previews were also omitting outcome fields from the LLM bundle,
so a fresh regenerate still couldn't distinguish proposed / failed /
partial / success.

- PATCH /outcome now bumps ai_sessions.state_version (matches
  record_decision's existing pattern).
- Resolution-note + escalation-package bundles now include status,
  applied_at, verified_at, partial_notes, failure_reason on the active fix.
- Generator prompts prescribe outcome-aware phrasing (closure language
  for success; what-we've-tried + next-steps for failed/partial).
- New end-to-end test asserts the regenerated preview reflects the
  recorded outcome, not just that the cache key changed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 22:04:56 -04:00
ec104dc8de docs(pilot): sync Phase 8 handoff with actual implementation
Correct the stale ai_sessions.fix_outcome reference (no such column) —
the real schema adds six columns to session_suggested_fixes. Update
last_commit to reflect the docs-correction tip.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 19:48:54 -04:00
a47ce07326 docs(pilot): fix Phase 8 column + commit-SHA references
Correct the FLOWPILOT-MIGRATION.md stale references to a non-existent
ai_sessions.fix_outcome column — the actual implementation added six
columns to session_suggested_fixes. Also fix a stale first-commit SHA
(6721b84 → cdd8bb0, the former was amended away).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 17:42:51 -04:00
2a54127a54 docs(pilot): Phase 8 fix outcome banner — handoff + migration spec
Marks open item #2 (task-lane crowding / Suggested Fix discoverability)
as resolved by Phase 8. Open items #1 (NoTemplateDialog narrow-lane)
and #3 (Tabbed Script Builder inside chat) remain deferred.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 16:52:07 -04:00
8582d24236 chore(pilot): remove deprecated SuggestedFix task-lane card
Superseded by ProposalBanner (Phase 8). The import was already removed
from AssistantChatPage in the previous commit; this deletes the orphaned
file itself and strips the now-unused suggestedFixSlot prop from
TaskLane's interface and both call sites.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 16:48:42 -04:00
bdb238a274 feat(pilot): mount ProposalBanner + wire implicit signals
Replaces the task-lane SuggestedFix card with the ProposalBanner docked
above the chat composer. Wires:
- Resolve-while-verifying auto-marks applied_success (one-click resolve).
- Escalate-while-verifying opens EscalateInterceptDialog to capture the
  real outcome (default: didn't work) before handoff.
- 3+ post-apply engineer messages trigger the passive Nudge banner.
- AI [FIX_OUTCOME] proposals surface in the AIConfirming state; one-click
  confirm applies the outcome.
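
The passive-nudge trigger in the list above amounts to a threshold check. A sketch in Python for illustration (the real logic lives in the frontend); the message shape with "role" and "created_at" keys is hypothetical, and created_at can be any orderable timestamp:

```python
NUDGE_THRESHOLD = 3  # "3+ post-apply engineer messages"


def should_nudge(messages, applied_at, threshold=NUDGE_THRESHOLD):
    """Return True once enough engineer messages follow the apply stamp."""
    if applied_at is None:
        return False  # nothing has been applied yet, so nothing to verify
    post_apply = [
        m for m in messages
        if m["role"] == "user" and m["created_at"] > applied_at
    ]
    return len(post_apply) >= threshold
```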

Banner state resets on session switch via resetSessionDerivedState.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 16:42:01 -04:00
075b0fc1d8 feat(pilot): EscalateInterceptDialog popover
Anchored above the Escalate button, captures fix outcome before the
engineer hands off the ticket. Defaults to 'didn't work' on Enter
(the common case). Alternatives: 'worked, escalating for another
reason' (preserves success) and 'never actually applied' (dismiss).

Task 11 will wire this to AssistantChatPage's Escalate handler.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 15:48:33 -04:00
217747f46e feat(pilot): banner AI-confirming, Nudge, Collapsed states
Completes ProposalBanner's state machine. AIConfirming (accent-blue)
surfaces the AI's [FIX_OUTCOME] proposal with one-click accept; Nudge
is the compact passive-prompt variant for post-apply chats; Collapsed
is the 28px expand-hint strip.

Adds onSilenceNudge prop so the parent can silence the nudge without
collapsing it (Task 11 wires this). Removes the last three stale
eslint-disable-next-line comments — all sub-components now use props.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 15:39:08 -04:00
7fa1d6a32f feat(pilot): banner Verifying + Partial states
Verifying: amber pulse animation, confidence pill becomes 'Applied Xm ago',
three actions (overflow for Mark partial, Didn't work, It worked). window.prompt
used for the partial notes + failure reason inputs — good-enough v1 pending
an inline composer.

Partial: cyan-toned to signal 'parked, outcome unknown', shows saved notes
inline, Finish it / Didn't work / It worked actions.

Adds pulse-amber to @theme animations alongside slide-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 15:32:02 -04:00
ac67e48500 feat(pilot): ProposalBanner scaffold + Proposed state
New component that will replace the task-lane SuggestedFix card. Docks
above the chat composer with a 320ms slide-up animation. This commit
implements only the Proposed state (Tasks 8 & 9 fill Verifying, Partial,
AI-confirming, Nudge, Collapsed).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 15:25:41 -04:00
cdd29b460e feat(pilot): frontend fix-outcome types + patchOutcome API
Extends SessionSuggestedFix with outcome fields (status, applied_at,
verified_at, partial_notes, failure_reason, ai_outcome_proposal) and
adds a patchOutcome method hitting the new backend endpoint.

FixStatus (5 values) + FixOutcome (4 writable values) mirror the
backend Pydantic types and the DB check constraint.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 15:20:16 -04:00
2cde6673b0 feat(pilot): [FIX_OUTCOME] system prompt instructions
Tells the AI when + how to emit the [FIX_OUTCOME] marker that Task 4's
parser consumes. Placeholder-only per the anti-parrot pattern — no
literal UUIDs, outcomes, or reasons that could leak into unrelated
sessions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 15:17:21 -04:00
c0112f8bee feat(pilot): [FIX_OUTCOME] marker parser + AI outcome proposal
The AI emits [FIX_OUTCOME] when the engineer indicates in chat that a
prior suggested fix worked, didn't work, or was partially applied. The
marker writes to session_suggested_fixes.ai_outcome_proposal (JSONB),
which the frontend surfaces as a "confirm outcome?" banner. The status
column is only updated when the engineer clicks confirm (via PATCH
/outcome endpoint from Task 3).

Placeholder-only system prompt wiring comes in Task 5.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 15:08:43 -04:00
8988dbc885 feat(pilot): PATCH /suggested-fixes/:id/outcome endpoint + tests
Records engineer-reported outcome (applied_success|applied_failed|
applied_partial|dismissed). Enforces transition rules (partial → success/
failed allowed; terminal outcomes return 409) and notes requirements
(applied_partial requires notes).

Sets verified_at on success/failure, stamps applied_at if not already
set (handles the case where the AI [FIX_OUTCOME] marker fires before
the engineer clicks Apply).
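
The transition and stamping rules above can be sketched as one pure function. Several details are assumptions not confirmed by the commit: the OutcomeError type, the 422 code for missing notes or an unknown outcome, rejecting any partial transition other than success or failed, and stamping applied_at only for applied_* outcomes.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

WRITABLE = {"applied_success", "applied_failed", "applied_partial", "dismissed"}
# Allowed targets per current status; any other status is terminal.
ALLOWED = {
    "proposed": WRITABLE,
    "applied_partial": {"applied_success", "applied_failed"},
}


class OutcomeError(Exception):
    def __init__(self, status_code, detail):
        super().__init__(detail)
        self.status_code = status_code


@dataclass
class Fix:
    status: str = "proposed"
    applied_at: Optional[datetime] = None
    verified_at: Optional[datetime] = None
    partial_notes: Optional[str] = None
    failure_reason: Optional[str] = None


def record_outcome(fix, outcome, notes=None, failure_reason=None, now=None):
    """Apply an engineer-reported outcome to the fix."""
    now = now or datetime.now(timezone.utc)
    if outcome not in WRITABLE:
        raise OutcomeError(422, "unknown outcome")
    if outcome not in ALLOWED.get(fix.status, set()):
        raise OutcomeError(409, "transition not allowed from " + fix.status)
    if outcome == "applied_partial" and not (notes and notes.strip()):
        raise OutcomeError(422, "applied_partial requires notes")
    if outcome.startswith("applied_") and fix.applied_at is None:
        fix.applied_at = now  # covers an outcome arriving before any Apply
    if outcome in ("applied_success", "applied_failed"):
        fix.verified_at = now
    if outcome == "applied_partial":
        fix.partial_notes = notes
    if outcome == "applied_failed":
        fix.failure_reason = failure_reason
    fix.status = outcome
    return fix
```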

Also fixes a pre-existing test-infrastructure bug: network_diagram.py used
bare string server_default="'[]'" for JSONB columns, which asyncpg
rejects during test schema creation. Changed to text("'[]'::jsonb") to
match the pattern used by script_template.py.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 14:59:34 -04:00
4a8e3ae954 feat(pilot): pydantic schemas for fix outcome patch
Adds FixStatus literal (5 values matching the DB check constraint),
extends SessionSuggestedFixResponse with outcome fields, and introduces
SessionSuggestedFixOutcomeRequest for the PATCH /outcome endpoint coming
in Task 3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 14:44:39 -04:00
cdd8bb05cc feat(pilot): add outcome tracking columns to session_suggested_fixes
Phase 8 prep for the fix outcome banner. Adds:
- status (proposed|applied_success|applied_failed|applied_partial|dismissed)
- applied_at, verified_at (timestamps)
- partial_notes, failure_reason (engineer-provided context)
- ai_outcome_proposal (JSONB for AI [FIX_OUTCOME] marker payloads)

Backfills status='dismissed' from user_decision='dismissed'. status is
orthogonal to user_decision — outcome (did the fix work?) vs script-path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 14:40:17 -04:00
8879f96fbf fix(pilot): drop sticky section headers in task lane
Each lane section (What we know, Questions, Diagnostic Checks, Suggested
fix) had its own `position: sticky; top: 0` header. As the engineer
scrolled past a section, that section's header would pin until the
section's bottom edge cleared the viewport, producing an "orphaned"
label floating over unrelated content below. Headers now scroll with
their content; in a 340px-wide lane the sticky affordance had negative value.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 16:01:14 -04:00
8a242f5db9 feat(pilot): Phase 7 — polish (loading/empty states, shortcuts, responsive drawer)
- WhatWeKnow shows a "synthesizing" indicator + skeleton pulse while the
  chat cycle is in-flight; task-lane header mirrors the signal with a
  "thinking" pip so engineers know the AI is still working.
- Quiet-state hint when the lane is open (facts exist) but no open
  questions, checks, or active fix — keeps the surface from looking
  "finished" when the AI is about to follow up.
- Keyboard shortcuts: ⌘↵/Ctrl+↵ send in the composer (plain Enter still
  sends), ⌘G toggles the Script Generator panel for the active fix,
  `?` opens a new ShortcutsHelpOverlay listing all bindings. ⌘K palette
  was already wired in TopBar.
- Responsive: below 1200px the task lane collapses to a bottom drawer
  with a backdrop + a floating "Tasks ●" toggle button. TaskLane now
  takes a `variant: 'side' | 'drawer'` prop; drawer variant drops the
  resize handle and uses the shared slide-in-bottom animation.
- Build hygiene: fixed a pre-existing TS error in confirm-post error
  handling (duplicate `response` type keys) and an unused-import warning
  in TemplatizePrompt.

Verified: `npx tsc -b` and `npm run build` both clean against the dev
stack; Vite HMR applied each change without errors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 14:19:44 -04:00
4aaf57adb5 feat(pilot): Phase 6 — post-resolve templatize prompt + draft accept/reject
Closes the loop on the Phase 5 "Run now, templatize after resolve" path.
After a session resolves, drafts queued by the three-option dialog surface
as a modal that lets the engineer review the AI-proposed parameterization
and either save as a reusable team template or skip. A "don't ask again"
toggle writes to account_settings.preferences so the next resolve won't
pop the modal.

Backend:
- /api/v1/draft-templates:
  * GET — list account drafts (pending_only default true; pass false for
    audit view including accepted/rejected)
  * GET /{id} — single draft
  * POST /{id}/accept — promotes to a new script_templates row with
    source_session_id / source_user_id / source_ticket_ref populated
    (drives the Script Library "generated from CW #X · resolved by Y"
    provenance chip). Draft flips to status=accepted,
    promoted_template_id set, resolved_at stamped. 409 on re-accept /
    already-rejected. 400 on unknown category_id.
  * POST /{id}/reject — flips to status=rejected. 409 on re-reject.
- /api/v1/accounts/me/preferences (GET/PATCH) — thin wrapper over
  AccountSettings.get_setting/set_setting. PATCH merges keys into the
  JSONB column, preserving existing keys the client didn't touch.
  Used by the "Don't ask again for this team" checkbox
  (templatize_prompt_enabled=false) and, forward-looking, by
  cw_resolved_status_id / cw_escalated_status_id from Phase 4.
- 13 tests: list filter, accept with/without edited_body, provenance
  copy-through, reject, 409 on re-accept / re-reject, 400 on unknown
  category, prefs round-trip with merge semantics.
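
The accept/reject conflict rules above can be sketched as a small state
machine. This is a minimal illustration, not the endpoint code: `Draft`,
`ConflictError`, `accept`, and `reject` are hypothetical names, and
`ConflictError` stands in for whatever the API maps to HTTP 409. Treating
"reject an already-accepted draft" as a conflict is an assumption; the
message above only states 409 on re-reject.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


class ConflictError(Exception):
    """Hypothetical stand-in for the error the API would map to HTTP 409."""


@dataclass
class Draft:
    status: str = "pending"  # pending | accepted | rejected
    promoted_template_id: Optional[str] = None
    resolved_at: Optional[datetime] = None


def accept(draft: Draft, new_template_id: str) -> Draft:
    # Only a pending draft can be promoted: re-accept and
    # accept-after-reject both conflict.
    if draft.status != "pending":
        raise ConflictError(f"draft is already {draft.status}")
    draft.status = "accepted"
    draft.promoted_template_id = new_template_id
    draft.resolved_at = datetime.now(timezone.utc)
    return draft


def reject(draft: Draft) -> Draft:
    # Assumption: any non-pending draft conflicts, which covers re-reject.
    if draft.status != "pending":
        raise ConflictError(f"draft is already {draft.status}")
    draft.status = "rejected"
    return draft
```

Keeping the check on "is it still pending" rather than on each individual
terminal state means a new status added later conflicts by default.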

Frontend:
- src/components/pilot/script/TemplatizePrompt.tsx — modal showing the
  drafted script with proposed parameters in the Phase 5
  ParameterizationPreview, editable name/category/description, an
  individual-parameter remove button, and the "don't ask again" opt-out.
  Accept posts to /draft-templates/{id}/accept + optionally PATCHes
  preferences. Skip posts /reject.
- src/api/draftTemplates.ts — typed client plus accountPreferencesApi.
- AssistantChatPage: after a successful Resolve (external OR local),
  fetches preferences + pending drafts for the session and queues the
  modal one draft at a time. Escalate does not trigger this flow.
- Sidebar: Scripts nav shows the pending-draft count as a badge. Fetched
  independently of the main sidebar stats so endpoint flakes don't
  break the rest of the sidebar.

Verified live 2026-04-22: seed two drafts → GET sees both pending →
accept draft A (template created, provenance CW #99123 populated) →
reject draft B → pending count drops → PATCH opt-out → GET confirms
persistence.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 02:37:49 -04:00
ddae171a37 fix(pilot): clear messages in resetSessionDerivedState — was leaking across chats
Symptom: sidebar showed "User mjones got locked out … 0 messages" but the
conversation pane was rendering 2 messages from a different chat. The
task lane content matched what was displayed (so the AI was fine post-
prompt-sweep) — the leak was purely UI: messages from the previous chat
stayed on screen until the new chat's getSession returned.

selectChat calls resetSessionDerivedState() and then awaits getSession
before calling setMessages(detail.conversation_messages). Until that
await resolves, the prior chat's messages remain visible, because the
reset did not touch messages. handleNewChat already had an explicit
setMessages([]) call, so it was unaffected; selectChat had none.

Folded setMessages([]) into resetSessionDerivedState so any new chat-
switch entry point gets the wipe for free.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 02:15:39 -04:00
d0ebdef9e8 fix(ai): full-sweep audit — placeholders only in system prompts + CI guardrail
The "AI parrots example content from system prompt" bug bit us twice in
one day across two different prompt sites. Patching individual prompts
is treating the symptom; this commit makes the rule structural.

Audit + sanitize:
- assistant_chat_service.ASSISTANT_SYSTEM_PROMPT — already cleaned in
  prior commits, but the [FORK] schema still had literal "Brief reason"
  / "Short name" / "One sentence" placeholders. Replaced with
  <angle-bracket> placeholders. Anti-parrot rule itself rewritten to
  describe the failure mode abstractly instead of naming "jsmith" so
  the rule no longer trips the guardrail (and so the model doesn't
  see "jsmith" as a token at all).
- ai_chat_service.py — removed three concrete-example offenders:
  "Get-Service ADSync" command literal, the "DC01 server_name" intake
  form payload (in two places), and the inline interview demos using
  "Azure AD Sync failures" / "Exchange Online mailbox migration".
  Replaced with technology-neutral schema descriptions.
- ai_tree_generator_service.BRANCH_DETAIL_SYSTEM_PROMPT — replaced the
  fully-fleshed DNS troubleshooting tree (with literal Dnscache /
  ipconfig / google.com / Start-Service) with a placeholder schema
  showing only ID-linkage shape.
- kb_conversion_service.PROCEDURAL_SYSTEM_PROMPT — replaced the worked
  Server Manager + DC01 example payload with a placeholder schema.

Guardrail (tests/test_prompt_anti_parrot.py):
- Imports every module under app/services/ and app/core/ and walks
  every uppercase string constant ending in _PROMPT, _SCHEMA,
  _PROTOCOL, _FORMAT, or _CONTEXT.
- test 1: known-leaked-token list (jsmith, DC01, ADSync, Dnscache,
  google.com, "Outlook keeps", "Teams drops") must not appear in any
  prompt constant. Add to the list when a new leak shows up in prod —
  the list IS the audit trail.
- test 2: marker blocks ([QUESTIONS], [ACTIONS], [SUGGEST_FIX], etc.)
  must contain placeholders only. Distinguishes JSON keys (followed
  by ':', allowed) from JSON values (followed by ',' / ']' / '}',
  must be <placeholder>); allows pipe-separated enum types
  (text|password|select) and a small set of fixed enum values
  (question, diagnostic_check, decision, action, ...). Verified by
  feeding the test a known-bad block — caught it correctly.
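
A minimal sketch of the first guardrail check (test 1), assuming a
case-sensitive substring match and a single module passed in directly.
The real test imports every module under app/services/ and app/core/;
`prompt_constants`, `find_leaks`, and the fake module here are
hypothetical names used only for illustration.

```python
import types

PROMPT_SUFFIXES = ("_PROMPT", "_SCHEMA", "_PROTOCOL", "_FORMAT", "_CONTEXT")
KNOWN_LEAKED_TOKENS = (
    "jsmith", "DC01", "ADSync", "Dnscache",
    "google.com", "Outlook keeps", "Teams drops",
)


def prompt_constants(module):
    """Yield (name, value) for uppercase string constants with a prompt suffix."""
    for name, value in vars(module).items():
        if (
            name.isupper()
            and name.endswith(PROMPT_SUFFIXES)
            and isinstance(value, str)
        ):
            yield name, value


def find_leaks(module):
    """Return (constant_name, token) pairs where a known-leaked token appears."""
    return [
        (name, token)
        for name, value in prompt_constants(module)
        for token in KNOWN_LEAKED_TOKENS
        if token in value
    ]


# Stand-in module so the sketch runs without the real app package.
fake = types.ModuleType("fake_service")
fake.CLEAN_PROMPT = "Ask one question about <symptom>."
fake.LEAKY_PROMPT = "Reset the password for jsmith on DC01."
fake.NOTES = "jsmith"  # no prompt suffix, so it is not scanned

leaks = find_leaks(fake)
```

In a pytest version, the assertion would be that `find_leaks` returns an
empty list for every scanned module, with the offending pairs shown in
the failure message.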

Documented the rule in CLAUDE.md → AI / FlowPilot lessons, naming
the test as the enforcement point so future contributors know how to
extend it (add to the known-leaked list when a new leak surfaces).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 02:09:30 -04:00
50215b9110 fix(pilot): strip literal example content from system prompt — model was parroting
The system prompt had a "Complete example of a correct first response"
section with a specific Outlook/WiFi/jsmith scenario plus literal JSON
payloads in [QUESTIONS], [ACTIONS], [SUGGEST_FIX], and [PROMOTE]
markers. The model was emitting those literal strings (the same
WiFi/laptop questions, the same "Clear cached credentials" suggested
fix, the same "OWA login confirmed for jsmith" promote) on EVERY
unrelated chat — making the task lane look like it was leaking previous-
session data when in fact the AI was just reciting the prompt examples.

Replaced literal example content with `<placeholder>` schemas. Added an
explicit ANTI-PARROT RULE in the FINAL REMINDER section calling out
that the angle-bracket placeholders show SHAPE, not CONTENT, with
concrete examples of the failure mode (printer ticket → don't ask
about Outlook; user not named jsmith → don't name jsmith).

Same scrub applied to the FORK section's "Outlook AND Teams dropping"
and the worked fork-flow example.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 01:36:29 -04:00
ce7c8ac3d5 fix(pilot): wipe full task-lane state on chat switch + extract palette event
Two fixes from the Phase 5 shakedown:

1. Stale lane data leaking across chats. handleNewChat, sendPrefill, and
   handleResumeNew were each missed when Phase 3/5 added activeFix,
   previewKind, previewData, and scriptPanelOpen — only selectChat reset
   the full set. Result: starting a new chat while a Suggested Fix card
   was active showed the previous session's fix card (and any open
   preview/script panel) until the next backend refresh swept it.
   Consolidated all four entry points behind a single
   resetSessionDerivedState() helper so adding new lane state in future
   phases only requires touching one place.

2. CommandPalette TDZ on cold load. SCRIPTS_INLINE_QUICK_ACTION (line 66)
   referenced PILOT_INLINE_SCRIPT_PATH declared at line 94 — module-level
   evaluation hit the use before the declaration. Browser blanked with
   "Cannot access 'PILOT_INLINE_SCRIPT_PATH' before initialization".
   Moved the path const above its first use; also extracted
   PILOT_INLINE_SCRIPT_EVENT into a tiny @/lib/pilotEvents module so
   AssistantChatPage doesn't import the palette component just to read a
   string — that mixed-export pattern broke Fast Refresh ("consistent
   components exports") and added an unnecessary import edge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 01:30:18 -04:00
fa61376303 feat(pilot): Phase 5 — inline Script Generator integration
Wires the SuggestedFix card to an inline panel that handles both cases:
template-matched fixes open the Script Library generator with parameters
pre-filled from session context; un-matched fixes open the three-option
dialog (one_off / draft_template / build_template). The decision endpoint
records the path choice with side effects: draft_template persists a
draft_templates row via a Sonnet-driven TemplateExtractionService;
build_template returns a redirect to the Script Builder; one_off just
records the choice.

Backend:
- TemplateExtractionService: drafts a parameter schema from a concrete
  rendered script. Conservative by default ("prefer fewer parameters").
  Round-trip-validates that templated_body only references declared
  parameters; missing-key mismatch falls back to the original script
  with no params. LLM/parse failures fall back identically — the
  engineer can still create a draft and refine in the post-resolve
  prompt (Phase 6).
- /suggested-fixes/{fix_id}/decision side effects:
  * one_off → returns rendered_script (engineer's edited version or the
    fix's ai_drafted_script verbatim)
  * draft_template → same + creates draft_templates row with extracted
    params, returns draft_template_id
  * build_template → returns redirect_path=/scripts/builder?from_session=
    &fix= so the frontend can navigate to the builder pre-loaded
- 400 when a non-template fix has no ai_drafted_script (template-matched
  fixes take the dedicated /scripts/generate path, not this endpoint).
- 12 tests: TemplateExtractionService parse + fallback paths, all four
  decision branches, edited_script override, missing-script 400.
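
The round-trip validation and fallback in TemplateExtractionService can
be sketched as below. The `{{name}}` placeholder syntax is an assumption
(the message above does not state the template syntax), and
`validate_extraction` is a hypothetical helper name, not the service's
actual method.

```python
import re
from typing import List, Tuple

# Assumed placeholder syntax: {{parameter_name}}
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def validate_extraction(
    original_script: str,
    templated_body: str,
    declared_params: List[str],
) -> Tuple[str, List[str]]:
    """Return (body, params), falling back to the original script with no
    params when the templated body references an undeclared parameter."""
    referenced = set(PLACEHOLDER.findall(templated_body))
    if not referenced <= set(declared_params):
        # Missing-key mismatch: keep the concrete script, drop all params.
        return original_script, []
    return templated_body, declared_params
```

The same fallback tuple can be returned from the LLM-failure and
parse-failure paths, so callers see one shape regardless of why
extraction did not succeed.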

Frontend:
- src/components/pilot/script/{TemplateMatchPanel, NoTemplateDialog,
  ParameterizationPreview}.tsx — inline panels rendered in the task
  lane's bottom slot when the engineer clicks a SuggestedFix card.
- TemplateMatchPanel: loads template via /scripts/templates/{id},
  pre-fills params from fix.ai_drafted_parameters with cyan "from
  session" tags, generates via existing /scripts/generate (already
  bumps state_version on ai_session_id from Phase 3). 404 falls back
  with a clear message instead of erroring.
- NoTemplateDialog: shows the AI-drafted script with proposed parameter
  values highlighted in amber via ParameterizationPreview; three option
  cards with the middle (draft_template) flagged Recommended; inline
  edit on the script body before deciding.
- SuggestedFix card now clickable: onActivate toggles the inline panel.
- AssistantChatPage: scriptPanelOpen state + handleScriptDecision that
  navigates on build_template and toasts on the other paths. Active fix
  changes auto-close the panel so engineers don't act on stale state.
- Cmd+K → "Open inline Script Generator" palette entry surfaces only on
  /pilot/:id routes; fires a window event the chat page subscribes to.
  No Resolve shortcut added per Section 14 decision (browser ⌘R conflict).

Verified 2026-04-22 against the dev stack:
- one_off / draft_template / build_template all return the right shape
  with real Sonnet TemplateExtractionService for the draft path.
- Conservative extraction confirmed: cmdkey + Restart-Process script
  yielded zero proposed parameters as intended.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 00:15:29 -04:00
8fd2c1bac6 feat(pilot): Phase 4 — Resolve + Escalate PSA writebacks with status verification
Wires the preview popover's Confirm & post action to ConnectWise (and,
via the provider pattern, any future PSA). Adds the parallel Escalate
flow with the handoff-oriented five-section markdown. Sessions without a
linked PSA ticket resolve/escalate locally — markdown stored, status
flipped, nothing posted externally.

Backend:
- EscalationPackageGeneratorService: Sonnet, five sections (Problem /
  What we've confirmed / What we've tried / Current hypothesis /
  Suggested next steps). Shares the preview_cache with a separate KIND
  so Resolve and Escalate previews for the same state coexist.
- PSAWritebackService: post_resolution_note (RESOLUTION note type,
  customer-visible), post_escalation_package (INTERNAL_ANALYSIS,
  handoff for the next engineer only), transition_ticket_status with
  mandatory re-fetch verification. PSAStatusVerificationError surfaces
  loudly when CW silently rejects a status change — the
  ConnectWise anti-pattern CLAUDE.md flags.
- Endpoints:
  * POST /ai-sessions/{id}/escalation-package/preview
  * POST /ai-sessions/{id}/resolution-note/post
  * POST /ai-sessions/{id}/escalation-package/post
  Outcomes: "resolved" / "escalated" with external_id + verified status,
  "resolved_local" / "escalated_local" when no PSA linked.
- Target CW status IDs live in account_settings.preferences
  (cw_resolved_status_id, cw_escalated_status_id). When unset, the post
  proceeds without a status transition — response includes a
  status_transition_skipped_reason rather than silently erroring.
- 7 tests: local-only path, PSA happy path with verified transition,
  status verification failure → 502, skipped transition when
  unconfigured, 409 on already-resolved re-post, escalate parallel path,
  internal-analysis note type enforced.
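
The mandatory re-fetch verification can be sketched as follows. The
names `transition_ticket_status` and `PSAStatusVerificationError` come
from the message above; the synchronous signature, the `TicketProvider`
protocol, and `FakeProvider` are assumptions made so the sketch runs on
its own.

```python
from typing import Dict, Protocol


class PSAStatusVerificationError(Exception):
    """Raised when the re-fetched status does not match the requested one."""


class TicketProvider(Protocol):  # hypothetical provider interface
    def set_status(self, ticket_id: str, status_id: int) -> None: ...
    def get_status(self, ticket_id: str) -> int: ...


def transition_ticket_status(
    provider: TicketProvider, ticket_id: str, target_status_id: int
) -> int:
    provider.set_status(ticket_id, target_status_id)
    # Never trust the write response: read the ticket back and compare.
    actual = provider.get_status(ticket_id)
    if actual != target_status_id:
        raise PSAStatusVerificationError(
            f"ticket {ticket_id}: wanted status {target_status_id}, got {actual}"
        )
    return actual


class FakeProvider:
    """Test double; honors_updates=False mimics a silently rejected change."""

    def __init__(self, honors_updates: bool) -> None:
        self.honors_updates = honors_updates
        self.statuses: Dict[str, int] = {"T-1": 1}

    def set_status(self, ticket_id: str, status_id: int) -> None:
        if self.honors_updates:
            self.statuses[ticket_id] = status_id

    def get_status(self, ticket_id: str) -> int:
        return self.statuses[ticket_id]
```

A caller at the HTTP layer would catch the verification error and turn
it into the 502 mentioned in the test list.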

Frontend:
- ResolutionNotePreview now kind-parameterized ('resolve' | 'escalate')
  with inline edit + Confirm & post. Preview loads from the matching
  backend endpoint; posting calls the matching endpoint; outcome toast
  surfaces the verified CW status or the local-only result.
- AssistantChatPage: previewKind state replaces previewOpen; two toggle
  buttons (Preview Resolve note / Escalate instead) in the lane's bottom
  slot. handleConfirmPost dispatches by kind.

Verified 2026-04-22:
- Local-only Resolve + Escalate round-trip against the dev stack.
- Live Sonnet escalation-package preview; cache hit on repeat call
  with no state change (separate cache kind from resolution-note).
- PSA post + status-verification paths covered by mocked-provider pytest
  cases. Live CW round-trip pending a test CW instance.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 23:54:54 -04:00
7ccf4c602b fix(pilot): reorder Phase 3 useCallbacks to avoid TDZ on render
refreshSessionDerived's dep array referenced refreshActiveFix and
schedulePreviewRefresh before they were declared. The dep array is an
ordinary array literal evaluated synchronously during render, so reading
a const still in its temporal dead zone threw "Cannot access
'refreshActiveFix' before initialization" before a single render
completed. Moved the three leaf helpers above the aggregator.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 23:44:19 -04:00
66e592096c feat(pilot): Phase 3 — Suggested fix tracking + Resolve preview with state_version cache
Adds the AI-proposed resolution path and the inline preview of the
markdown that will be posted to the customer ticket on Resolve. The
preview is keyed on (session_id, ai_sessions.state_version) so back-to-
back fetches against unchanged state hit an in-process cache instead
of paying for a Sonnet call.

Backend:
- preview_cache: in-process LRU keyed on (kind, session_id, state_version).
  No TTL — state_version is the source of truth. Soft-cap 5000 entries.
- unified_chat_service: [SUGGEST_FIX] parser (last-block-wins, JSON
  payload, confidence clamped 0-100), supersession persistence (sets
  superseded_at on prior active row), atomic state_version bump.
- ResolutionNoteGeneratorService: pulls session, facts, active fix, and
  redacted script_generations into a structured input bundle for Sonnet;
  produces the four-section markdown (Problem / What we confirmed /
  Root cause / Resolution). Sensitive script parameters redacted via
  ScriptTemplateEngine.redact_sensitive driven by the template's
  parameters_schema.
- /api/v1/ai-sessions/{id}/suggested-fixes/active — 200 with the active
  fix or 404.
- /api/v1/ai-sessions/{id}/suggested-fixes/{fix_id}/decision — records
  one_off / draft_template / build_template / dismissed; dismiss
  supersedes; bumps state_version. 409 on dismissing an already-
  superseded fix.
- /api/v1/ai-sessions/{id}/resolution-note/preview — generates or returns
  cached markdown; from_cache flag in payload signals cache hit.
- scripts.py POST /generate now bumps state_version on the linked
  ai_session_id when present (third source of preview-cache invalidation
  per Section 5.5).
- ASSISTANT_SYSTEM_PROMPT documents [SUGGEST_FIX] (when to/not to emit,
  format, supersession semantics).
- 12 tests covering the parser (well-formed, last-wins, malformed,
  confidence clamping), supersession + state_version invariant, all
  decision branches, preview cache hit-on-no-change + miss-after-write.
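
A minimal sketch of the [SUGGEST_FIX] parsing rules (last block wins,
JSON payload, confidence clamped to 0-100). The wire format is an
assumption: this sketch expects a JSON object between `[SUGGEST_FIX]`
and a closing `[/SUGGEST_FIX]`, and returns None for malformed input
without falling back to an earlier block. `parse_suggest_fix` is a
hypothetical name.

```python
import json
import re
from typing import Optional

# Assumed delimiters; the real marker syntax may differ.
BLOCK = re.compile(r"\[SUGGEST_FIX\](.*?)\[/SUGGEST_FIX\]", re.DOTALL)


def parse_suggest_fix(text: str) -> Optional[dict]:
    blocks = BLOCK.findall(text)
    if not blocks:
        return None
    try:
        payload = json.loads(blocks[-1])  # last block wins
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    try:
        confidence = int(payload.get("confidence", 0))
    except (TypeError, ValueError):
        confidence = 0
    payload["confidence"] = max(0, min(100, confidence))  # clamp to 0-100
    return payload
```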

Frontend:
- src/components/pilot/sections/SuggestedFix.tsx — amber-accented card
  with confidence badge; dismiss action wired to the decision endpoint.
- src/components/pilot/ResolutionNotePreview.tsx — popover with refresh,
  loading state, cached/fresh indicator, ticket-ref display.
- src/api/sessionSuggestedFixes.ts — typed client; getActive normalizes
  404 to null so callers don't have to special-case.
- TaskLane gains suggestedFixSlot + bottomSlot props (rendered after
  Diagnostic Checks; bottomSlot anchors the Resolve action).
- AssistantChatPage: refreshSessionDerived helper batches fact + fix
  refresh; fact mutations and chat sends both schedule a 500ms-debounced
  preview refresh per the Section 5.5 spec.

Verified end-to-end against the dev stack with a real Sonnet call:
- /active 404 → fact create → preview generates four-section markdown
  grounded only in provided facts → second preview call hits cache
  (from_cache=true, no LLM call) → fact write 2 → cache miss, regenerates.
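
The cache behavior verified above (hit on unchanged state, miss after a
write bumps state_version) can be sketched with an in-process LRU. The
key shape and the 5000-entry soft cap come from the Backend notes; the
`PreviewCache` class and its method names are hypothetical.

```python
from collections import OrderedDict
from typing import Optional, Tuple

Key = Tuple[str, str, int]  # (kind, session_id, state_version)


class PreviewCache:
    def __init__(self, max_entries: int = 5000) -> None:
        self._max = max_entries
        self._entries: "OrderedDict[Key, str]" = OrderedDict()

    def get(self, kind: str, session_id: str, state_version: int) -> Optional[str]:
        key = (kind, session_id, state_version)
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, kind: str, session_id: str, state_version: int, markdown: str) -> None:
        key = (kind, session_id, state_version)
        self._entries[key] = markdown
        self._entries.move_to_end(key)
        while len(self._entries) > self._max:
            self._entries.popitem(last=False)  # evict least recently used
```

No TTL is needed in this shape: a write that bumps state_version changes
the key, so the stale entry is simply never read again and ages out
through LRU eviction.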

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 21:45:52 -04:00
625dba7548 feat(pilot): Phase 2 — What we know (facts) with stable task-lane IDs
Adds the load-bearing structural feature of the FlowPilot migration: a
"What we know" panel that holds confirmed facts for a session, fed by AI
[PROMOTE] markers and engineer-added notes. Facts feed the resolution
note preview (Phase 3) and survive across turns via stable UUIDs assigned
to pending_task_lane items.

Backend:
- FactSynthesisService: create/update/soft-delete facts with atomic
  state_version bumps; LLM-backed synthesize_from_question/check on the
  fact_synthesis (Haiku) action tier per Section 6.6.
- /api/v1/ai-sessions/{id}/facts CRUD + /facts/promote (proposed_text or
  via synthesis). PATCH returns 403 for question/diagnostic_check facts
  (edit the source item instead, Section 7.3).
- unified_chat_service: [PROMOTE] marker parser (JSON-block per Section
  8.1 spec drift note), stable-UUID assignment for pending_task_lane
  questions/actions preserved by exact text/label match across turns.
- ASSISTANT_SYSTEM_PROMPT: documents [PROMOTE] format, when to/not to
  emit, hallucination guardrails, source_ref handling.
- 17 tests covering parser, stable IDs, service validation, CRUD,
  editability rule, both promote modes, 422 null-synthesis path,
  state_version invariant.
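
Stable-UUID assignment by exact text match can be sketched as below. The
item shape (a dict with a `text` field and an `id` field) and the
function name `assign_stable_ids` are assumptions for illustration; the
real pending_task_lane items may carry a label or other fields.

```python
import uuid
from typing import Dict, List


def assign_stable_ids(
    previous: List[Dict[str, str]],
    incoming: List[Dict[str, str]],
    key: str = "text",
) -> List[Dict[str, str]]:
    """Reuse the prior turn's id when an item's text matches exactly;
    otherwise mint a fresh UUID."""
    known = {item[key]: item["id"] for item in previous if "id" in item}
    result = []
    for item in incoming:
        item_id = known.get(item[key]) or str(uuid.uuid4())
        result.append({**item, "id": item_id})
    return result
```

Because the match is exact, a reworded question gets a new id, which is
the conservative choice when facts point at items through source_ref.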

Frontend:
- src/components/pilot/sections/{WhatWeKnow,WhatWeKnowItem,AddNoteButton}
  — green-gradient section above Questions, dashed-circle check, inline
  edit/delete gated by the server's editable flag.
- TaskLane gains a whatWeKnowSlot prop (existing assistant/ folder kept
  per the doc's "rename is opportunistic" guidance).
- AssistantChatPage fetches facts on selectChat and refetches after each
  chat send (so [PROMOTE]-synthesized facts appear immediately); auto-
  opens the lane when facts exist.

Verification: end-to-end smoke against the local docker stack confirms
all five endpoints (list/create/patch/delete/promote) plus the 403
editability rule. pytest suite verifies the same with mocked LLM. Live
[PROMOTE] flow remains untested until used in the UI — the marker shape
is covered by parser tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 21:13:44 -04:00
19cfd71995 chore(flowpilot-migration): remove migration handoff note after verification
Gate 1 complete on Proxmox dev host (docker-01):
- Alembic at f07010f17b01 (single head); downgrade/upgrade roundtrip clean.
- Phase 0 prompt-cache verified: direct provider probe shows
  cache_create=5398 → cache_read=5398 across two calls; chat path emitted
  two anthropic.cache events 55s apart on a real FlowPilot session.
- Frontend npm run build clean (57.63s, no TS errors, no stale
  FlowPilotSessionPage imports).
- /assistant/:id → /pilot/:id redirect fires correctly and session detail
  loads (GET /api/v1/ai-sessions/<id> 200); the blank-until-click UX
  issue will be tracked separately.
- Dashboard session-tile dispatcher routes to /pilot/:id.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 01:21:08 -04:00
3b55697c77 dev-env(proxmox): switch compose to direct-port exposure; document homelab topology
- docker-compose.dev.yml: drop Traefik/dev.resolutionflow.com labels, expose
  backend:8000 and frontend:5173 directly; swap relative bind mounts for
  ${REPO_ROOT}/... so compose works when driven from inside a code-server
  container with the host Docker socket mounted; default POSTGRES_PORT to
  5433 host-side; add explicit uvicorn/npm run dev commands; add
  ENABLE_MCP_MICROSOFT_LEARN and docker-01/Tailscale CORS origins.
- frontend/vite.config.ts: replace dev.resolutionflow.com with
  allowedHosts=['docker-01', '.ts.net', 'localhost'] for direct-port access
  over the private network.
- DEV-ENV.md: add Section 11 reference topology for the homelab Proxmox +
  code-server Option B setup, plus troubleshooting entries for the
  REPO_ROOT-empty-mount trap and the Vite allowedHosts rejection.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 00:18:31 -04:00
851966966d docs(claude-md): compact CLAUDE.md for 2026-04-19 baseline
Trim from 570 → 264 lines. Archived lessons and fixes-in-code remain in
docs/LESSONS-ARCHIVE.md; CLAUDE.md now only carries what a fresh session
can't derive from the repo state.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 00:18:15 -04:00
66968e4c59 docs(flowpilot-migration): add ephemeral migration handoff note
Self-contained status snapshot for picking up Phase 0 + Phase 1 work
after the Proxmox dev-environment move. Lists what is done, what is
owed (the Gate 1 verification checklist), known drift, and the
recommended order of operations after the move.

Explicitly ephemeral — the doc instructs the reader to delete it once
Gate 1 verification has passed. Durable dev-env setup lives in
DEV-ENV.md; this file covers only the "where is the work right now"
handoff for this specific migration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 22:31:16 +00:00
b0622f5511 docs(dev-env): rewrite DEV-ENV.md for host-agnostic setup
The previous version was tightly coupled to the Hostinger VPS at
46.202.92.250 — hardcoded IP, Traefik/Let's-Encrypt assumption,
specific Docker-volume paths. Rewriting ahead of the Proxmox migration
so a fresh clone on any Linux host (LXC, VM, bare metal, VPS) can
stand up a working dev environment without pre-baked assumptions about
topology.

Structural changes:

- Introduces Option A (all-in-one host) / Option B (Docker Compose) /
  Option C (split services) topology choice up front, so readers
  commit to one shape before touching commands.
- Adds a "per-host configuration" template the reader fills in once
  (DEV_HOST, POSTGRES_PORT, SECRET_KEY, API keys), referenced by name
  throughout the rest of the doc. No more hardcoded IPs.
- Adds an explicit verification section (Section 6) with concrete
  expected outcomes: alembic head, reversibility, prompt-cache hit,
  frontend build, /assistant→/pilot redirect, dispatcher routing, CORS.
- References the Phase 0 TODO(phase0-verify) in ai_provider.py and
  the expected alembic head (f07010f17b01) as of the current branch.
- Adds a troubleshooting section pulling in CLAUDE.md lessons that
  bite people repeatedly: stale Vite env vars, RLS policy violations,
  EACCES on dist/, multi-head alembic state, invisible cache misses.
- Documents the structured log events the backend emits
  (anthropic.cache, mcp.turn, mcp.fallback) so readers know what to
  grep for during verification.

Deliberately excluded:
- Production deployment (lives in CLAUDE.md Deployment section).
- Reverse-proxy configuration (whatever the reader prefers).
- code-server install specifics (Docker vs LXC vs native is reader's
  choice; once running, this doc applies).
- Proxmox-specific instructions — the doc is host-agnostic so it
  survives the next migration as well.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 22:31:03 +00:00
f3c3ee5b57 feat(pilot): unify AI troubleshooting surface at /pilot, redirect /assistant (Phase 1)
Collapses the pre-existing dual-surface setup (AssistantChatPage at /assistant,
FlowPilotSessionPage at /pilot) into a single chat-primary surface per
architectural claim #1 of FLOWPILOT-MIGRATION.md.

Router changes (frontend/src/router.tsx):
- /pilot and /pilot/:sessionId now render AssistantChatPage.
- /assistant redirects permanently to /pilot via <Navigate replace>.
- /assistant/:sessionId redirects to /pilot/:sessionId preserving the ID
  via an AssistantSessionRedirect helper that reads the param.
- FlowPilotSessionPage is no longer imported or mounted. Per the
  beta-history-disposable decision, the file stays on disk for reference
  but is unreachable; delete once nothing else in the tree imports it.

Dispatcher de-branching — previously these sites routed by session_type
(chat -> /assistant, otherwise -> /pilot). All now unconditionally go to
/pilot/:id since session_type is no longer used for frontend routing:
- components/dashboard/ActiveFlowPilotSessions.tsx
- components/dashboard/RecentFlowPilotSessions.tsx
- components/flowpilot/AISessionListItem.tsx
  (keeps isChat for icon selection, but linkTo is unconditional)

User-facing label + navigation updates:
- components/layout/CommandPalette.tsx: "AI Assistant" palette entry
  becomes "FlowPilot" pointing to /pilot; the sparkles quick-action also
  routes to /pilot.
- components/dashboard/StartSessionInput.tsx: both navigate() call sites
  now go to /pilot instead of /assistant.
- lib/routePrefetch.ts: prefetch entry for AssistantChatPage keyed to
  /pilot (the real surface) rather than /assistant (now redirect-only).

Preserved intentionally (not user-facing routes):
- Backend /assistant/retention API path and the assistantChatApi module
  name — those are internal API and module identifiers, not SPA routes.
- src/components/assistant/* and src/types/assistant-chat — TypeScript
  module paths, not routes.
- Sidebar.tsx — no top-level AI entry existed to rename; /pilot is
  already in the History group's matchPaths. Whether FlowPilot deserves
  its own rail entry is a future UX decision, not Phase 1 scope.
- FlowPilotAnalyticsPage at /analytics/flowpilot — analytics for the
  unified product, not guided-only, per the agreed Q16 interpretation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 18:48:00 +00:00
b49772f1a1 feat(models): Phase 1 SQLAlchemy models — SessionFact, SessionSuggestedFix, DraftTemplate, AccountSettings
Backs the schema added in 210d310 with SQLAlchemy 2.0 models.

- SessionFact: "What we know" facts with polymorphic source_ref pointing
  at task-lane item UUIDs inside ai_sessions.pending_task_lane (not a FK
  per Section 4.2).
- SessionSuggestedFix: AI-proposed resolutions with supersession tracking
  and the full user_decision state machine.
- DraftTemplate: post-resolve templatization queue with promotion to
  script_templates.
- AccountSettings: per-account JSONB preferences grab-bag with async
  classmethod helpers — get_setting(db, account_id, key, default) reads
  without creating, set_setting(db, account_id, key, value) upserts via
  Postgres ON CONFLICT + jsonb `||` merge so existing keys are preserved.
  Lazy row creation matches the Phase 1 design.
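
The merge semantics of the jsonb `||` upsert can be mirrored in plain
Python. This is an in-memory analogue only, not the SQLAlchemy helper:
`merge_preferences` is a hypothetical name, and `get_setting` here takes
the already-loaded preferences dict (or None for a missing row) rather
than a database session.

```python
from typing import Any, Dict, Optional


def merge_preferences(
    existing: Optional[Dict[str, Any]], patch: Dict[str, Any]
) -> Dict[str, Any]:
    """Shallow merge: keys in patch win, untouched keys are preserved.
    Like jsonb `||`, nested objects are replaced whole, not merged."""
    return {**(existing or {}), **patch}


def get_setting(
    preferences: Optional[Dict[str, Any]], key: str, default: Any = None
) -> Any:
    """Read without creating: a missing row (None) yields the default."""
    if preferences is None:
        return default
    return preferences.get(key, default)
```

The shallow behavior is worth keeping in mind if a preference ever
becomes a nested object: writing one inner key would replace the whole
nested value.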

Column additions on existing models to mirror the migration:
- AISession: resolution_note_* / escalation_package_* / state_version
  (the preview-cache-invalidation counter consumed by Phase 3).
- ScriptTemplate: source_session_id / source_user_id / source_ticket_ref
  (provenance for templates promoted from DraftTemplate).

All four new models registered in app.models.__init__ and __all__.
TYPE_CHECKING-guarded relationship imports throughout, matching the
repo's existing model style.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 18:35:00 +00:00
210d310fb2 feat(db): Phase 1 schema — session_facts, suggested_fixes, draft_templates, account_settings
Adds the backing store for the FlowPilot unified session surface, per
the FLOWPILOT-MIGRATION.md Phase 1 deliverable. Descends from production
head 074 (add_network_diagrams_table).

New tables (all tenant-scoped, all RLS-enabled + forced):
- session_facts — "What we know" facts. source_ref is a polymorphic
  pointer to a task-lane item inside ai_sessions.pending_task_lane
  (no DB-level FK; integrity enforced at service layer per Section 4.2
  of the design doc). Soft-delete via deleted_at; active-facts partial
  index excludes deleted rows.
- session_suggested_fixes — AI-proposed resolutions. One active per
  session at a time (supersession tracked via superseded_at; partial
  index on (session_id) WHERE superseded_at IS NULL powers the
  "find active fix" query).
- draft_templates — scripts pending post-resolve templatization.
  Partial index on (account_id) WHERE status='pending' supports the
  "N scripts ready to review" Script Library badge.
- account_settings — new per-account table with JSONB preferences
  grab-bag. Rows created lazily on first write; get_setting returns
  default when no row exists.

Column additions on ai_sessions:
- resolution_note_markdown / posted_at / external_id
- escalation_package_markdown / posted_at / external_id
- state_version (INTEGER NOT NULL DEFAULT 0) — incremented atomically
  by any write that invalidates the resolution note preview cache
  per Section 5.5. Phase 3 consumes this.

Column additions on script_templates:
- source_session_id, source_user_id, source_ticket_ref — powers the
  "generated from CW #X · resolved by Y · used N times" provenance
  chip in the Script Library.

RLS pattern matches the repo convention (074 / network_diagrams is the
nearest template): ENABLE + FORCE, USING + WITH CHECK on
`account_id = app.current_account_id`. Downgrade is reversible —
drops in the inverse order of creation so FK dependencies unwind.

No runtime verification from code-server; migration apply + downgrade
will be verified on the new dev environment per the standing deferral.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 18:14:26 +00:00
92fadfb90a docs(flowpilot-migration): integrate Codex plan review + Phase 0 audit findings
Significant rewrite of FLOWPILOT-MIGRATION.md after post-Codex plan review
and the Phase 0 in-flight audit. Archives the pre-rewrite version as
FLOWPILOT-MIGRATION-v1.md and keeps the Codex review under
CODEX-FlowAssist-Migration-PLAN.md for traceability.

Substantive changes that affect implementation:

- Section 0.1 adds a spec-drift note listing corrections integrated into
  this revision (API namespace, task-lane item UUIDs, account_settings
  creation, missing /tickets/ai-parse endpoint).
- Section 2 adds "Task lane item ID" terminology — stable UUID assigned
  to items inside ai_sessions.pending_task_lane so session_facts.source_ref
  has something reliable to point to.
- Section 4.1 adds ai_sessions.state_version (INTEGER NOT NULL DEFAULT 0)
  and escalation_package_external_id. state_version drives preview cache
  invalidation; incremented atomically on writes to facts / suggested
  fixes / script_generations.
- Section 4.6 creates account_settings as a new table with JSONB
  preferences column, lazy row creation, and a promotion rule for when a
  setting should graduate to a typed column.
- Section 5 namespaces all session-scoped routes under
  /api/v1/ai-sessions/{id}/... to match the existing codebase pattern.
- Section 5.5 documents the preview caching strategy (state_version
  keyed, 500ms client debounce, Redis planned).
- Section 6.6 adds per-service MCP capability flags alongside the model
  tier flags.
- Section 7.1 makes the /assistant -> /pilot redirect include the
  session-deep-link path and preserve the session ID.
- Section 8.2 adds supersession semantics for [SUGGEST_FIX] markers.
- Section 9 Phase 1 now explicitly includes account_settings and
  state_version; Phase 3 uses state_version-keyed caching; Phase 5
  mentions MCP inheritance via chat_call_cached wrapper.
- Section 11 adds a dedicated test plan (migrations, backend, frontend,
  manual QA).
- Section 14 captures the eight planning decisions made during the
  Phase 0 conversation so they are traceable.

No code changes in this commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 17:05:04 +00:00
3f0a132058 refactor(ai): rename _call_anthropic_cached → chat_call_cached; extract cache plumbing (Phase 0.4)
Renames the chat caller to a name that signals its actual purpose, and
factors the reusable cached-system-block + cached-history + cache-usage-log
primitives out to app.core.ai_provider so they can be shared with the
provider-generic path without pulling MCP/beta/images into the abstract
interface.

Helpers added to ai_provider.py:
- `build_anthropic_chat_messages(history, new_message, images, format_reminder)`
  — owns: copy history, apply cache_control to last history message,
  append format reminder to new message, render images as multimodal blocks.
  Anthropic-shaped by design; do not call from Gemini paths.
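
The four responsibilities listed for the helper can be sketched as follows. This is not the code in ai_provider.py: the function name, the `(media_type, base64_data)` image tuples, and the assumption that history entries are non-empty `{"role", "content"}` dicts are illustrative only.

```python
import copy
from typing import Any, Optional


def build_chat_messages_sketch(
    history: list[dict[str, Any]],
    new_message: str,
    images: Optional[list[tuple[str, str]]] = None,  # (media_type, base64 data)
    format_reminder: str = "",
) -> list[dict[str, Any]]:
    # 1. Copy history so the caller's list is never mutated.
    messages = copy.deepcopy(history)

    # 2. Mark the last history message as the cache breakpoint.
    if messages:
        last = messages[-1]
        if isinstance(last["content"], str):
            last["content"] = [{"type": "text", "text": last["content"]}]
        last["content"][-1]["cache_control"] = {"type": "ephemeral"}

    # 3. Append the format reminder to the new user message.
    text = f"{new_message}\n\n{format_reminder}" if format_reminder else new_message

    # 4. Render images as multimodal blocks ahead of the text block.
    content: list[dict[str, Any]] = [
        {"type": "image",
         "source": {"type": "base64", "media_type": media_type, "data": data}}
        for media_type, data in (images or [])
    ]
    content.append({"type": "text", "text": text})
    messages.append({"role": "user", "content": content})
    return messages
```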

chat_call_cached keeps exactly the concerns that are unique to the one
MCP/beta/multimodal chat caller:
- Anthropic beta endpoint invocation
- Microsoft Learn MCP server wiring (ENABLE_MCP_MICROSOFT_LEARN)
- Retry-without-MCP fallback
- Format-reminder content string (declared as module constant)
- Phase 0.5 telemetry (mcp.turn, mcp.fallback)

Documents in the module docstring AND at the function site that this is
the ONE MCP/beta chat caller and should not become the general provider
path. MCP/beta/images are features of exactly one optional Anthropic beta
endpoint; routing them through AnthropicProvider would leak a provider-
specific concern into the abstract interface that also serves Gemini.

Behavior change: chat_call_cached now reuses the singleton AnthropicProvider
HTTP client via `_get_anthropic_client(...)` instead of instantiating a new
`anthropic.AsyncAnthropic(...)` per call. Matches the provider's own pattern
and avoids burning connections per-turn. No user-visible difference.

No runtime verification from code-server. TODO(phase0-verify) in
ai_provider.py tracks the cache-hit verification owed on the new dev env.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 17:03:09 +00:00
da93ae55c3 feat(ai): opt-in structured-system-block caching for one-shot generators (Phase 0.3)
Wraps each static system prompt in a single-block list so Phase 0.1's
AnthropicProvider applies cache_control: ephemeral automatically (policy α,
first block gets marked when no caller-authored cache_control is present).

Call sites:
- ai_tree_generator.scaffold_branches: SCAFFOLD_SYSTEM_PROMPT (~1k tokens)
- ai_tree_generator.generate_branch_detail: BRANCH_DETAIL_SYSTEM_PROMPT
  (~2.5k tokens with few-shot example); retries inside the same function
  re-read the cached block instead of paying full input cost on each attempt
- kb_conversion.convert_document: TROUBLESHOOTING or PROCEDURAL prompt
  (each caches independently by text content)
- ai_fix.generate_fixes: FIX_SYSTEM_PROMPT on first attempt + corrective retry
- script_builder.send_message: SYSTEM_PROMPT_TEMPLATE (per-session language
  substitution — same-language sessions share cache entries)

Each edit includes an inline comment explaining why the block is cacheable
(stable-constant, retry-reuse, per-language variant) so a future dev can
see the intent at the cache_control marker site.

script_builder history caching deliberately deferred — per Phase 0.1
decision (option i), AnthropicProvider does not automatically cache the
message list. If script_builder's growing 20-message history turns out
to be a visible cost driver via the anthropic.cache telemetry, route
that caller through the 0.4 chat wrapper which handles history caching.

No runtime verification from code-server; cache-hit behavior will be
confirmed against the new dev environment when it's up, per the inline
TODO(phase0-verify) in ai_provider.py.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 16:29:45 +00:00
56fd440b16 docs(flowpilot-migration): flag Phase 0.2 as pending-endpoint; target not yet built
The /tickets/ai-parse endpoint named in Phase 0.2 does not exist in the
codebase (verified: zero matches for ai-parse/ai_parse across endpoints,
services, models, and all branches/commit messages). integrations.py:557
is get_ticket_statuses — a CW passthrough with no AI call.

Adding a block-quoted note under the 0.2 deliverable that flags the
drift, records the cached-system-block pattern to apply when the endpoint
is built, and instructs the next editor to remove the note once applied.
No implementation change this commit — guidance only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 16:24:33 +00:00
b3be66652e feat(ai): structured-system-block caching in AnthropicProvider (Phase 0.1)
Widens AIProvider.generate_json / generate_text / generate_text_stream
signatures to accept `system_prompt: str | list[SystemBlock]`:

- `str` (the existing call shape): passes through uncached, unchanged
  behavior. Every existing caller stays on the uncached path — no silent
  behavior change.
- `list[SystemBlock]`: enables Anthropic prompt caching via structured
  system blocks. Caller-authored `cache_control` is honored verbatim
  (policy α); if no block carries it, the provider applies
  `cache_control: {"type": "ephemeral"}` to the first block only.
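
A minimal sketch of the policy described above, assuming a `SystemBlock` is a plain dict shaped like an Anthropic text block. The function name `apply_cache_policy` is hypothetical and stands in for whatever the provider does internally.

```python
from typing import Any, Union

SystemBlock = dict[str, Any]  # assumed shape: {"type": "text", "text": "..."}


def apply_cache_policy(
    system_prompt: Union[str, list[SystemBlock]],
) -> Union[str, list[SystemBlock]]:
    # str: the existing call shape passes through untouched and uncached.
    if isinstance(system_prompt, str):
        return system_prompt

    # list: shallow-copy each block so the caller's dicts are not mutated.
    blocks = [dict(block) for block in system_prompt]

    # Caller-authored cache_control wins; otherwise mark the first block only.
    if blocks and not any("cache_control" in block for block in blocks):
        blocks[0]["cache_control"] = {"type": "ephemeral"}
    return blocks
```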

Gemini ignores cache_control and concatenates list entries into one
system string — the widened signature is strictly additive on that path.

Adds `anthropic.cache` structured-log telemetry: on every Anthropic
response (streaming included, via `stream.get_final_message()`), logs
`cache_read_input_tokens` and `cache_creation_input_tokens`. Telemetry
failure in streaming is swallowed so the user-facing stream never breaks.

Verification deferred: cannot run from code-server (no Python, no DB,
no dev env). TODO(phase0-verify) left inline in the module docstring.
First verification task on the new dev environment is to hit any
FlowPilot endpoint twice within 5 minutes and confirm the second call
shows cache_read_input_tokens > 0 in the `anthropic.cache` log event.
If verification fails, that's a debug task on the new env — not a
blocker for continuing Phase 0.2/0.3/0.4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 16:17:12 +00:00
0fbc1e0a57 feat(telemetry): add MCP per-turn structured-log telemetry (Phase 0.5)
Emits structured `mcp.turn` log events on every Anthropic-path chat turn,
capturing whether MCP was wired in (mcp_available), whether the model
actually invoked an MCP tool (mcp_invoked), which tool names fired,
and whether the silent retry-without-MCP fallback was triggered.
Adds a separate `mcp.fallback` event with error type/message for
fallback occurrences.
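
A per-turn event of this kind could be emitted with the standard library alone, roughly as sketched below. The field names `mcp_available` and `mcp_invoked` come from the description above; the logger name, the `mcp_tools` and `mcp_fallback` keys, and both function names are assumptions.

```python
import logging
from typing import Any

logger = logging.getLogger("mcp.telemetry")  # logger name is an assumption


def mcp_turn_fields(
    mcp_available: bool, tool_names: list[str], fallback_triggered: bool
) -> dict[str, Any]:
    """Collect the structured fields for one mcp.turn event."""
    return {
        "event": "mcp.turn",
        "mcp_available": mcp_available,
        # Invoked means the model actually called at least one MCP tool.
        "mcp_invoked": bool(tool_names),
        "mcp_tools": sorted(tool_names),
        "mcp_fallback": fallback_triggered,
    }


def log_mcp_turn(
    mcp_available: bool, tool_names: list[str], fallback_triggered: bool
) -> None:
    # `extra` attaches the fields to the LogRecord for a structured formatter.
    fields = mcp_turn_fields(mcp_available, tool_names, fallback_triggered)
    logger.info("mcp.turn", extra=fields)
```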

Establishes baseline data for deciding whether MCP investment is earning
its keep before Phase 2+ expands the product footprint. Scope: the one
MCP-using code path (`_call_anthropic_cached`) — not a general
instrumentation layer.

No new dependencies, no schema changes, no behavior change. Standard
library `logging` is the sink; PostHog is not wired on the backend.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 15:57:13 +00:00
46291f30b9 docs: add FlowPilot migration design doc and mockups
Brings the locked FlowPilot migration design onto the branch that will
implement it. Includes the annotated target UI mockups (primary session
view + three Script Generator integration states) and the superseded
FLOWPILOT-AND-RESOLUTIONASSIST.md for historical reference.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 15:22:39 +00:00
f0ccf313a4 docs: add lessons 110-111 (RLS backfill audit, axios interceptor pattern)
Some checks failed
CI / backend (push) Failing after 15m45s
CI / frontend (push) Failing after 47s
CI / e2e (push) Has been skipped
Mirror to GitHub / mirror (push) Successful in 3s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 12:50:43 +00:00
0d9babb986 fix(rls): add account_id to AISessionStep creations, fix boards toast
Some checks failed
CI / backend (push) Failing after 16m37s
CI / frontend (push) Failing after 45s
CI / e2e (push) Has been skipped
Mirror to GitHub / mirror (push) Successful in 3s
- flowpilot_engine: pass account_id at all 5 AISessionStep instantiation
  sites (_create_step_from_parsed x3, briefing step, status update step).
  Phase 4 RLS blocked every INSERT with NULL account_id — this broke all
  new FlowPilot sessions since the Phase 4 migration was applied.
- integrations: list_boards returns [] on PSAError instead of 502, stopping
  the spurious 'Server error' toast on dashboard load (boards are optional).
- client.ts: 5xx global toast now shows backend detail when available.
- useFlowPilotSession: startSession extracts backend detail for error state;
  suppresses duplicate toast for 5xx (global interceptor already handles it).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 04:41:14 +00:00
567985402f fix(psa): use board/id in (...) for multi-board filter per CW docs
Some checks failed
CI / frontend (push) Has been cancelled
CI / e2e (push) Has been cancelled
CI / backend (push) Has been cancelled
Mirror to GitHub / mirror (push) Successful in 2s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 03:54:05 +00:00
08a4c6600d fix(psa): use resources contains identifier for my tickets filter
Some checks failed
CI / frontend (push) Has been cancelled
CI / e2e (push) Has been cancelled
CI / backend (push) Has been cancelled
Mirror to GitHub / mirror (push) Successful in 3s
CW resources field is a plain string of member identifiers (login names),
not a navigable object. resources/member/id was invalid syntax causing 403.

Now resolves the CW member identifier from the cached member list and
uses: resources contains '{identifier}' which is the correct condition.
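
As a rough sketch, the condition string for this filter could be assembled as below. The `resources contains '...'` and `board/id in (...)` forms are taken from the commit messages in this log; the function name, joining clauses with `and`, and the absence of quote escaping are assumptions.

```python
from typing import Optional


def my_tickets_conditions(
    identifier: str, board_ids: Optional[list[int]] = None
) -> str:
    """Build a conditions string for the "my tickets" query (hypothetical helper).

    Note: the identifier is interpolated as-is; escaping is not handled here.
    """
    clauses = [f"resources contains '{identifier}'"]
    if board_ids:
        ids = ",".join(str(board_id) for board_id in board_ids)
        clauses.append(f"board/id in ({ids})")
    return " and ".join(clauses)
```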

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 03:53:26 +00:00
29fa48e71b fix(psa): revert to resources/member/id for my tickets filter
Some checks failed
CI / backend (push) Has started running
CI / frontend (push) Has been cancelled
CI / e2e (push) Has been cancelled
Mirror to GitHub / mirror (push) Has been cancelled
Requires CW API member security role to have All scope on Service Tickets.
owner/id was incorrect for workflows using resources-based assignment.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 03:48:10 +00:00
908a867986 fix(psa): use owner/id instead of resources/member/id for my tickets filter
Some checks failed
CI / frontend (push) Has been cancelled
CI / e2e (push) Has been cancelled
CI / backend (push) Has been cancelled
Mirror to GitHub / mirror (push) Has been cancelled
resources/member/id requires All scope on Service Tickets security role.
owner/id (primary assignee) works with standard Mine scope.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 03:43:34 +00:00
346576a730 feat(psa): ticket queue dashboard with board selector and session auto-start
Some checks failed
CI / frontend (push) Has been cancelled
CI / e2e (push) Has been cancelled
CI / backend (push) Has been cancelled
Mirror to GitHub / mirror (push) Successful in 2s
- Add PSABoard type + list_boards() to CW provider (cached 1h)
- Extend search_tickets with assigned_to_me, unassigned, board_ids, page, page_size
- New GET /integrations/psa/boards endpoint
- New TicketQueue dashboard component: My Tickets / Unassigned tabs,
  multi-select board filter, Load more pagination, Start Session per ticket
- Add TicketQueue to QuickStartPage after active sessions
- FlowPilotSessionPage auto-starts with ticket context when navigated
  from TicketQueue (psaTicketId + psaTicket in location.state)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 03:20:45 +00:00
b18072e24b fix(psa): set account_id on PsaMemberMapping in save and auto-match
Some checks failed
CI / frontend (push) Has been cancelled
CI / e2e (push) Has been cancelled
CI / backend (push) Has been cancelled
Mirror to GitHub / mirror (push) Successful in 2s
2026-04-15 02:59:49 +00:00
e0f44e2985 fix(ci): connect to postgres service by hostname, not localhost
Some checks failed
CI / backend (push) Failing after 16m41s
CI / frontend (push) Failing after 56s
CI / e2e (push) Has been skipped
Mirror to GitHub / mirror (push) Successful in 2s
2026-04-15 01:52:03 +00:00
adfbb39297 fix(ci): use --break-system-packages for pip on Ubuntu 24.04
Some checks failed
Mirror to GitHub / mirror (push) Successful in 2s
CI / backend (push) Failing after 50s
CI / frontend (push) Failing after 42s
CI / e2e (push) Has been skipped
2026-04-15 01:49:58 +00:00
6bae205a8c chore: trigger CI
Some checks failed
CI / backend (push) Failing after 12s
CI / frontend (push) Failing after 1m6s
CI / e2e (push) Has been skipped
Mirror to GitHub / mirror (push) Successful in 3s
2026-04-15 01:48:17 +00:00
ee2b2c2399 feat(ci): port CI workflow from GitHub Actions to Gitea
Some checks failed
Mirror to GitHub / mirror (push) Successful in 3s
CI / backend (push) Failing after 35s
CI / frontend (push) Failing after 32s
CI / e2e (push) Has been skipped
2026-04-14 23:33:12 +00:00
37bc47b75b chore: add runner probe workflow
All checks were successful
Mirror to GitHub / mirror (push) Successful in 3s
2026-04-14 23:27:30 +00:00
c8bdd0014e Update GitHub mirror workflow
All checks were successful
Mirror to GitHub / mirror (push) Successful in 3s
2026-04-14 22:50:53 +00:00
2a2b770405 Update GitHub mirror workflow
Some checks failed
Mirror to GitHub / mirror (push) Failing after 3s
2026-04-14 22:49:20 +00:00
d6d0e9f3c1 Add GitHub mirror workflow
Some checks failed
Mirror to GitHub / mirror (push) Failing after 1s
2026-04-14 22:43:09 +00:00
ab4bf3b32f Add GitHub mirror workflow
Some checks failed
Mirror to GitHub / mirror (push) Failing after 42s
2026-04-14 22:31:37 +00:00
chihlasm
d3c93cd006 feat(admin): allow setting owner when creating an account
2026-04-14 17:27:02 -04:00
chihlasm
4037a5213e fix(admin): use EmailStr for owner_email validation in AdminAccountCreate
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 21:25:03 +00:00
chihlasm
0ed5977fee feat(admin): allow setting owner when creating an account
Adds optional owner_email field to the Create Account modal. Superadmin
can specify an existing user's email to assign as account owner at
creation time. Backend 404s with a clear message if the email is unknown.
Error detail now surfaces to the toast instead of a generic message.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 14:30:23 +00:00
chihlasm
c5b8229ef6 fix(admin): allow owner and admin account roles in user creation and role management
Four places were hardcoded to engineer|viewer only:
- AccountRoleUpdate schema (user.py) — blocked PUT /admin/users/{id}/account-role at the API level
- AdminUserCreate schema (admin.py) — blocked creating users with owner/admin role
- AccountDetailPage role dropdowns (create form + inline member role changer)
- AccountsPage create user role dropdown

Now all four accept the full set: owner, admin, engineer, viewer.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 13:24:17 +00:00
chihlasm
eba50e1f95 docs(claude-md): trim GitNexus section to selective-use guidance
Remove mandatory "MUST run before every edit" rules — they add overhead
without value for additive/isolated changes. Keep the tools table and
use-it-when-it-matters guidance.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 12:58:40 +00:00
chihlasm
8eb814283d fix(psa): fix time entry AttributeError and show all users in member mapping
- Fix create_time_entry() using self._client instead of self.client
- GET /member-mappings now returns all active account users, not just mapped
  ones — allows manual assignment when auto-match by email doesn't work
- PsaMemberMappingResponse mapping fields are now Optional (id, external_member_id,
  external_member_name, matched_by) to represent unmapped users
- Frontend MemberMappingTab skips null external_member_id when building
  localMappings, and derives user list from all returned entries
- Add docs/connectwise-psa-testing-checklist.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 06:09:01 +00:00
chihlasm
b433b232dc polish(network): visual refinements across node, edge, and panel components
- DeviceNode: flat bg-card (no surface gradient), darker icon plate inset,
  correct text-muted token for category label
- GroupNode: label pill gets bg-card/90 background so it reads against canvas
- ConnectionEdge: label now has border + bg-card so it doesn't float invisible
- BaseHandle: tightened to 12px with accent-toned border
- NodeStatusIndicator: glow reduced to 0.15 opacity (design system compliant)
- ContextMenu: Ungroup now uses Ungroup icon instead of BoxSelect
- DeviceToolbar: group type icons coloured with semantic palette
- PropertiesPanel: empty state gets icon tile + cleaner copy hierarchy
- DiagramEditor: shortcut ? button repositioned above MiniMap, accent hover
- NetworkDiagrams list: card thumbnail placeholder uses dot-grid pattern,
  card menu gets icons and divider before destructive action

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 05:35:25 +00:00
chihlasm
015df1fe5f fix(network): consolidate import buttons, redesign empty state, add shortcut overlay
- Import/Export button in editor header: removed standalone Import button, moved
  draw.io import into Export/Import dropdown with labelled sections; fixes
  conceptual trap where Import implied operating on the current diagram
- List page: replaced two identical Upload-icon Import buttons with a single
  dropdown (Import JSON / Import draw.io) with format descriptions
- Empty state: replaced icon-in-box with a horizontal card featuring a static
  SVG topology preview, MSP-specific value prop, and dual CTAs
- Keyboard shortcuts: new KeyboardShortcutsOverlay component (4-group grid),
  triggered by ? key or the ? button pinned to the canvas bottom-right corner;
  wired into useCanvasShortcuts hook
- Fixed Share2 → FileOutput icon for draw.io export (Share2 = send to someone,
  FileOutput = export file format)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 04:49:25 +00:00
chihlasm
cf9c258f9e fix(network): surface connect tool and middle-pan 2026-04-14 03:41:21 +00:00
chihlasm
c063952f12 feat(network): add connect tool and middle-pan 2026-04-14 03:28:07 +00:00
chihlasm
36721eb5af feat(network): improve connector editing 2026-04-14 02:56:28 +00:00
chihlasm
3cd4084f78 refactor(network): simplify diagram node visuals 2026-04-14 02:42:47 +00:00
chihlasm
ed763d1cea chore(network): remove asset style lab 2026-04-14 02:29:26 +00:00
chihlasm
c37e216e0b feat(network): add asset style lab mockups 2026-04-14 02:10:48 +00:00
chihlasm
91cc9a4170 feat(network): draw.io XML import
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 01:30:22 +00:00
chihlasm
2a4220b496 feat(network): draw.io XML export
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 01:25:49 +00:00
chihlasm
c8f571db39 feat(network): thumbnail generation on save, shown on list page
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 01:22:51 +00:00
chihlasm
7efa22454d feat(network): improve PDF export with print stylesheet
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 01:20:28 +00:00
chihlasm
05421fc65c feat(network): add SVG export
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 01:19:19 +00:00
chihlasm
dfcad531e2 fix(network): context menu on groups + group/ungroup in properties panel
Context menu fix:
- Group nodes pass pointer events through to children in React Flow, so
  right-clicking a group fires onPaneContextMenu instead of onNodeContextMenu
- handlePaneContextMenu now checks for selected nodes and shows the node
  context menu (with align/group options) when any nodes are selected

Properties panel multi-select:
- Add Group section with type dropdown (Subnet, VLAN, Site, DMZ, Custom)
- "Group into [Type]" button creates a group of the chosen type
- Ungroup button appears when a group node is in the selection
- useDiagramCommands.groupSelection now accepts a groupType param and
  uses it as the label and color key for the new group node

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 00:55:34 +00:00
chihlasm
684fb07e47 feat(network): add pointer/hand mode toggle to diagram toolbar
- Header shows MousePointer2 (select) and Hand (pan) toggle buttons
- Select mode: drag on canvas draws a selection box (selectionOnDrag)
- Pan mode: drag on canvas pans the viewport (panOnDrag)
- Space held in either mode temporarily switches to pan (panActivationKeyCode)
- Keyboard shortcuts: V = select mode, H = pan mode
- Cursor changes to grab/grabbing in pan mode

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 00:38:51 +00:00
chihlasm
4a12c9b37d fix(network): persist group node type, size, and child parentId on save/load
Backend DiagramNode schema was missing nodeType, style, and parentId fields —
Pydantic stripped them on save, so group nodes lost their identity on reload
and re-appeared as small device icons.

- Backend: add nodeType, style (NodeStyle), parentId to DiagramNode schema
- Frontend: serialize parentId for device nodes inside groups
- Frontend: restore parentId + extent:'parent' on both deserializer paths (setNodes + history init)
- Frontend: add parentId to DiagramNode interface

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 23:49:26 +00:00
chihlasm
e41d7bd960 fix(network): align resize border with node visual boundary
NodeResizer handles positioned at RF wrapper size, but NodeTooltip and
NodeStatusIndicator wrappers had no size constraints, causing BaseNode
(w-full h-full) to shrink to content size instead of filling the wrapper.

Add w-full h-full to NodeTooltip, NodeTooltipTrigger, and
NodeStatusIndicator so the full height chain is maintained.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 21:27:58 +00:00
chihlasm
f2c3bd7a9b fix(network): normalize z-order to 1..N after bring-to-front/send-to-back
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 20:17:44 +00:00
chihlasm
9786c6b1fb feat(network): add inline label editing on DeviceNode (double-click)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 20:17:41 +00:00
chihlasm
4529955f7d feat(network): add orthogonal edge routing option
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 20:17:33 +00:00
chihlasm
b7b0d41f92 feat(network): add group/ungroup commands with bounding box calculation
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 20:14:26 +00:00
chihlasm
a4512dcf90 feat(network): add GroupNode component with resize, inline label, and group type colors
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 20:13:03 +00:00
chihlasm
764db79060 feat(network): add alignment toolbar to PropertiesPanel for multi-select
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 20:11:12 +00:00
chihlasm
f90e2c956f feat(network): add align/distribute/group sections to context menu
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 20:09:32 +00:00
chihlasm
bdaea68dd3 feat(network): add useDiagramCommands — alignment and distribution command layer
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 20:08:37 +00:00
chihlasm
02c19a7580 feat(network): add undo/redo shortcuts (Ctrl+Z/Y) and arrow key nudging
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 20:06:33 +00:00
chihlasm
a392d24101 feat(network): add undo/redo buttons to DiagramHeader
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 20:04:58 +00:00
chihlasm
b9c9bb548d fix(network): force re-render on undo/redo so canUndo/canRedo stay accurate
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 20:03:35 +00:00
chihlasm
662df2907d feat(network): add undo/redo snapshot history stack to DiagramEditor
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 20:01:21 +00:00
chihlasm
b9547e6ce1 docs: add network diagrams Phase 2 implementation plan
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 18:23:23 +00:00
chihlasm
760e0f77f8 docs: add network diagram draw.io-style implementation plan
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 18:16:54 +00:00
chihlasm
a71f082e25 feat: extract admin account management rework from PR 124 (#138)
* feat: reorganize admin panel around accounts

* feat: expand admin customer account controls

* feat: add admin account detail management

* fix: remove unused admin account icon import

* refactor: design critique fixes for account pages

- Admin accounts: replace dense card grid with compact DataTable
- Account settings: remove redundant hero card, stat grid, header pills
- Fix bg-accent (orange) misuse on decorative elements across 7 files
- Add ConfirmButton for destructive actions (deactivate, remove member)
- Replace single-field modals with inline editing (plan, trial)
- Add contextual help: display code tooltip, improved empty states
- Non-owner aside explanation for hidden owner-only sections
- Admin sidebar: group 11 items into 5 labeled sections
- Rename UsersPage.tsx → AccountsPage.tsx to match route
- Fix border radius consistency, hide zero-count badges

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use get_admin_db for all new admin account endpoints

All admin endpoints query across tenants without a tenant context.
get_db (app-role, subject to RLS) was never imported and would crash
at runtime — replace all 6 occurrences with get_admin_db (BYPASSRLS).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 04:44:51 -04:00
chihlasm
abd79bc763 feat: extract network map builder from PR 124 (#137)
* feat: add device_types table with system seed data

Creates DeviceType SQLAlchemy model and migration 073 that provisions the
device_types table with 28 system-seeded device types across 7 categories
(network, compute, storage, cloud, endpoint, infrastructure, security).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add network_diagrams table

Create NetworkDiagram SQLAlchemy model with JSONB nodes/edges, team-scoped with client/asset metadata, and Alembic migration 074.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add Pydantic schemas for device types and network diagrams

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add device types CRUD router

Adds GET/POST/PUT/DELETE endpoints at /device-types with team-scoped access. System types are read-only; custom types are scoped to the creating team.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add AI generation service for network diagrams

Adds network_diagram_ai_service.py with generate_diagram() function that
calls the AI provider to convert plain-English network descriptions into
structured DiagramNode/DiagramEdge data. Registers the action in
ACTION_MODEL_MAP as a standard-tier route.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add network diagrams CRUD + AI generate + export/import router

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add TypeScript types for network diagrams

Adds all interfaces for network diagrams and device types including
DiagramNode, DiagramEdge, DeviceProperties, NetworkDiagramResponse,
AI generate request/response, import/export shapes, and list item types.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add frontend API clients for device types and network diagrams

Adds deviceTypesApi (list, create, update, remove) and networkDiagramsApi
(list, get, create, update, archive, duplicate, exportJson, importJson,
aiGenerate, listClients) following the existing apiClient module pattern.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add device registry, DeviceNode, ConnectionEdge for React Flow

Creates the React Flow building blocks for the network diagram editor:
device type registry with icon/color mappings, DeviceNode component with
status indicators and connection handles, ConnectionEdge with per-type
styling, and nodeTypes/edgeTypes registration maps.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add DeviceToolbar panel with search, categories, drag-drop, custom type creation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add PropertiesPanel for node and edge property editing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add AIAssistPanel with replace and merge modes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add NetworkCanvas wrapper and DiagramHeader components

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add DiagramEditor page assembling all panels with auto-save and AI generation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add Network Diagrams list page with search, client filter, import

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add Network Maps to sidebar navigation and router

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: resolve TypeScript errors in DeviceToolbar and DiagramEditor

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: resolve stale selection bug in network diagram PropertiesPanel

Selection state now stores IDs and derives objects from live arrays,
so edits in PropertiesPanel inputs reflect immediately.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add React Flow UI foundation components for network diagrams

BaseNode (structured node shell with header/content/footer slots),
BaseHandle (styled connection handle), LabeledHandle (handle with
port label), NodeStatusIndicator (status border effect),
NodeTooltip (hover details via NodeToolbar).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add LabeledGroupNode and AnimatedSvgEdge components

GroupNode for subnet/VLAN/site grouping with positioned label badge.
AnimatedSvgEdge for traffic flow visualization with animated SVG
shape along edge path. Both registered in type maps.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: DeviceNode uses BaseNode, BaseHandle, StatusIndicator, Tooltip

Replaces hand-rolled node layout with composable React Flow UI
components. Status is now a border effect instead of a dot.
Hover tooltip shows hostname, IP, vendor, role, notes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add grouping toolbar items and traffic flow toggle

DeviceToolbar gets Subnet/VLAN/Site/DMZ grouping section with
drag-drop. PropertiesPanel gets Show Traffic toggle that switches
edges between connection and animated types. DiagramEditor handles
both device and group node drops.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address code review findings for React Flow UI integration

- Use screenToFlowPosition() for drop coordinates (fixes zoom/pan bug)
- Remove duplicate selection border from DeviceNode (BaseNode handles it)
- Add w-full to GroupNode for proper container sizing
- Remove unused 'selected' destructuring from DeviceNode

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add ISP icon to network diagram device registry

Globe icon with accent color, under cloud category.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: improve drag-and-drop feel in network diagram editor

Grip icons on draggable toolbar items, press effect on drag start,
dashed border overlay with 'Drop to add' text when dragging over canvas.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add ContextMenu component for network diagram editor

Charcoal-styled context menu with action factories for node
and canvas variants. Viewport-clamped positioning, auto-dismiss
on click outside, escape, or scroll.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add useCanvasShortcuts hook for copy/paste/duplicate

Keyboard shortcuts with preventDefault and input guard.
Clipboard stores nodes with relative positions and edge indices.
Paste computes canvas center via screenToFlowPosition.
Duplicate offsets +30px. Supports both device and group nodes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: wire context menu and keyboard shortcuts into diagram editor

Right-click context menus for nodes (copy/duplicate/delete) and
canvas (paste/select-all/fit-view). Right-click selects the node
per spec. serializeNodes now handles group nodes correctly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: context menu dismisses on pane click, ISP in toolbar

Context menu now closes when clicking anywhere on the canvas via
onPaneClick prop. ISP device added as built-in toolbar item under
Internet section so it's always available without a database entry.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: backend code review fixes for network diagrams

- Replace legacy Optional imports with modern str | None syntax
- Type JSONB columns as Mapped[list[dict[str, Any]]]
- Escape SQL LIKE wildcards (%, _) in diagram search
- Type DiagramNode.position as Position(x, y) Pydantic model
- Wrap AI response parsing in KeyError handler for clean 422 errors
- Remove unused Optional/TYPE_CHECKING imports from schemas/models
- Extract _get_available_slugs helper to DRY duplicate queries

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
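
The LIKE-wildcard escaping mentioned in the list above could look roughly like the following. This is a minimal sketch, not the code from the commit: `escape_like` and `contains_pattern` are hypothetical names, and the backslash escape character is an assumption that the query would also need to declare (for example via an `ESCAPE` clause).

```python
def escape_like(term: str, escape_char: str = "\\") -> str:
    """Escape LIKE wildcards so user input is matched literally."""
    # Escape the escape character first, otherwise the backslashes added
    # for % and _ below would themselves be doubled.
    return (
        term.replace(escape_char, escape_char * 2)
        .replace("%", escape_char + "%")
        .replace("_", escape_char + "_")
    )


def contains_pattern(term: str) -> str:
    """Build a 'contains' pattern; only the outer % signs act as wildcards."""
    return f"%{escape_like(term)}%"
```

Without the escaping, a search for `50%_off` would treat `%` and `_` as wildcards and match far more diagram names than intended.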

* fix: network diagram editor UX — straight edges, snap-to-grid, ISP in Cloud, group resize

- Straight edges: replace SmoothStepEdge with BaseEdge + getStraightPath so
  connections draw direct diagonal lines instead of orthogonal bent paths
- Snap-to-grid: add snapToGrid/snapGrid=[20,20] to NetworkCanvas so nodes
  align consistently when dragged
- ISP in Cloud: remove standalone "Internet" sidebar section, inject ISP into
  the Cloud category loop with search support and correct item count
- Group node resize: add NodeResizer to GroupNode (subnet/VLAN/site/DMZ),
  handles visible when selected; dimensions saved/restored correctly on
  reload (also fixes group node load bug where type was always 'device')
- DiagramNode type: add nodeType and style optional fields

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: network diagram team_id guard + multi-style edge routing

Backend:
- Guard create_diagram with 422 if current_user.team_id is None (prevents
  NOT NULL constraint crash for accounts not yet assigned to a team)
- Add routing field to DiagramEdge schema (straight/curved/step)

Frontend:
- ConnectionEdge now supports straight (default), curved (bezier), and
  step (smooth-step) routing per-edge via routing field in edge data
- PropertiesPanel Connection section gets a Line Style toggle:
  Straight | Curved | Step buttons, active state highlights in accent
- handleEdgeUpdate and serializeEdges now propagate the routing field
- DiagramEdge type gets optional routing field

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: network diagrams UX overhaul — icons, empty canvas, properties panel

- Colorize: semantic category colors for all device types (network=blue,
  security=orange, compute=emerald, endpoint=amber, storage=violet,
  cloud=cyan, infra=steel); better icons (Router, ShieldAlert, Boxes,
  Package, Gauge, PlugZap, Video, Radio); MiniMap uses category colors
- Onboard: centered AI generate prompt on empty canvas with 5 MSP-specific
  example chips, ⌘↵ shortcut, spinner; AIAssistPanel only shown with nodes
- Arrange: properties panel — status badge grid at top, fields grouped into
  Network (IP/Subnet/VLAN) and Hardware (Hostname/Vendor/Model/Role) sections
- Delight: segmented topology color bar on listing cards; backend returns
  category_counts via single extra query on list endpoint
- Harden: real PNG export via html-to-image + getNodesBounds/getViewportForBounds
- Polish: ChevronDown replaces unicode ▾, click-outside for client filter,
  consistent spinner in empty prompt

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: drop changelog noise from network extraction

* fix: align network map builder with account isolation

* feat: add manual create option for network maps

* feat: make manual network map creation easier to discover

* fix(network-maps): address design critique — harden, normalize, clarify, polish

- Archive: two-step inline confirm in card dropdown menu
- Delete Device/Edge: two-step inline confirm in PropertiesPanel footer
- Context menu Delete: floating confirm bar instead of immediate deletion
- AI Generate New: two-step confirm when replacing existing diagram nodes
- DiagramHeader: show 'Unsaved changes' in amber when isDirty and not saving
- deviceRegistry: SECURITY_COLOR #f97316 → #f87171 (deprecated ember orange removed)
- CanvasEmptyPrompt: remove backdrop-blur (design system violation)
- CanvasEmptyPrompt: remove redundant 'Skip AI' bottom button (duplicate of Build manually card)
- CanvasEmptyPrompt: rounded-xl/rounded-2xl → rounded-lg, border-2 → border
- Topology bar: h-1 → h-2 + native tooltip with category breakdown
- AIAssistPanel: replace pulse-dot loading with spinner (consistent with rest of feature)
- ContextMenu: add shadow-lg (consistent with other dropdowns)
- DeviceNode tooltip: Position.Bottom → Position.Top (avoids canvas-edge clipping)
- CanvasEmptyPrompt: raise ⌘↵ hint from /50 opacity to full text-muted-foreground

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(network-maps): bring to front / send to back layering for nodes

Three entry points for z-index control:
- Right-click context menu: Bring to Front / Send to Back with ] / [ shortcuts, separated by dividers from copy/delete groups
- Properties panel: Layer row with Bring Front + Send Back buttons, tooltip shows keyboard shortcut
- Keyboard: ] brings selected node(s) to front, [ sends to back (skips when input focused)

Context menu also gains divider support (dividerBefore flag) for visual grouping.
Layering handlers use max/min zIndex across all nodes so repeated presses always stack correctly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: swap switch icon from Layers → Network (Lucide)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: icon size picker (S/M/L) on device nodes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: drag-to-resize device nodes + BrickWallFire for firewall

- NodeResizer on DeviceNode (same pattern as group nodes); icon scales
  proportionally with node width, clamped 16–60px
- Removes S/M/L static picker — resize is now direct manipulation
- firewall: ShieldAlert → BrickWallFire

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: trigger Railway rebuild

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: add missing hero_001.jpg to git (was untracked, broke Railway deploy)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: ShieldAlert still referenced in CATEGORY_DEFAULTS after icon swap

Swapping the firewall icon to BrickWallFire removed ShieldAlert from the
imports but left it referenced in CATEGORY_DEFAULTS, which crashed at runtime
and left the device toolbar empty.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(network): proportional node resize with locked aspect ratio

Nodes grew into rectangles because NodeResizer had no aspect ratio
constraint, minWidth != minHeight, and icon/text only scaled from width.

- DeviceNode: add keepAspectRatio + equal minWidth/minHeight (80×80),
  maxWidth/maxHeight (280×280), scale icon and label/IP font sizes from
  Math.min(width, height) so all content grows uniformly
- DiagramEditor: set explicit 120×120 style on dropped device nodes so
  React Flow has a definite starting size for aspect ratio calculation
- DiagramEditor: persist device node style (width/height) in
  serializeNodes and restore it on load so size survives save/reload

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(lint): suppress ESLint errors in network diagram components

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 02:38:01 -04:00
Claude
af5ceea7f9 docs: update CHANGELOG with Phase 4 tenant isolation details
- Added Phase 4 RLS enforcement on all 31 remaining tables (#136)
- Documented BYPASSRLS session pattern and admin session factory
- Listed Phase 4 fixes for auth deps, background jobs, and seed scripts

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-04-12 10:44:07 +00:00
chihlasm
f54d7ecd78 docs: update current state after Phase 4 merge 2026-04-12 04:35:30 +00:00
chihlasm
46593ba8ca Merge PR #136: feat: tenant isolation Phase 4 — RLS on all remaining tables 2026-04-12 04:35:01 +00:00
chihlasm
52553d62d2 fix(tests): update expectations for RLS-correct behavior
- test_rls_isolation: add pytestmark for module-scoped event loop to fix
  "Future attached to a different loop" with pytest-asyncio 0.23 + asyncpg
  module-scoped fixtures
- test_admin_categories_global: global categories use PLATFORM_ACCOUNT_ID
  not NULL; update stale assertion
- test_permissions_account: with RLS, cross-tenant tree access returns 404
  (invisible) not 403 (forbidden) — update to match actual behavior

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 03:48:30 +00:00
chihlasm
a48660700a fix: background jobs and lifespan must use BYPASSRLS sessions
All code that runs outside a request context (APScheduler jobs,
lifespan startup) has no app.current_account_id set, so the
app-role session returns 0 rows from every RLS-protected table.

Changed to _admin_session_factory (BYPASSRLS) in:
- knowledge_flywheel_scheduler.py — queries ai_sessions
- psa_retry_scheduler.py — queries psa_post_log
- retention_cleanup.py — queries assistant_chats
- scheduler.py (_fire_maintenance_schedule, _cleanup_expired_ai_conversations)
- main.py (archive_stale_ai_sessions, _process_notification_retries,
  load_all_schedules at startup)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 03:44:23 +00:00
chihlasm
3ff886363c fix: use BYPASSRLS session for all auth deps and user-mutation endpoints
Phase 4 enabled RLS on the users table. All code paths that touch users
(or other RLS-protected tables) before require_tenant_context sets
app.current_account_id must use get_admin_db (BYPASSRLS):

- deps.py: get_current_user and get_current_active_user → get_admin_db
- auth.py: all endpoints → get_admin_db (login, register, refresh, etc.
  run before tenant context exists; mutation endpoints also need session
  consistency since current_user is in the admin session)
- accounts.py: transfer_ownership, leave_account, delete_account
  → get_admin_db (these mutate current_user directly)
- onboarding.py: dismiss_onboarding → get_admin_db (same reason)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 03:25:18 +00:00
chihlasm
501442e5f0 fix: seed_test_users must use ADMIN_DATABASE_URL after Phase 4 RLS on users
RLS is now enabled on the users table. The seed script was using the
app-role connection (DATABASE_URL) which has no tenant context at seed
time — all SELECTs return 0 rows and INSERTs are blocked by FORCE RLS.

Falls back to DATABASE_URL if ADMIN_DATABASE_URL is not set (local dev
without roles configured).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 03:12:46 +00:00
chihlasm
6f53ec06f5 docs: add lessons 107-109 — RLS startup, global tables, tree_shares account_id
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 02:58:12 +00:00
chihlasm
ec322f7cdf fix: bootstrap service account with BYPASSRLS session 2026-04-12 02:44:36 +00:00
chihlasm
f9248aeaa8 fix: remove platform_steps and template_trees from Phase 4 RLS
Both tables have no account_id column — they are globally readable
by all authenticated users and must not have RLS policies.

Also removes the corresponding test cases that assumed these tables
had account_id-based policies.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 01:48:50 +00:00
chihlasm
c6da4ebee5 fix: remove script_categories from Phase 4 RLS — no account_id column
script_categories is a global lookup table (shared across all tenants).
The account_id column belongs to ScriptTemplate in the same model file,
not ScriptCategory. The Python scan matched the file, not the class.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 01:32:42 +00:00
chihlasm
64f004a62c feat: tenant isolation Phase 4 — RLS on 31 remaining tables + script_builder fix
Enable RLS on all remaining tenant-scoped tables (31 tables):

Standard policy (tenant sees own rows):
  users, account_invites, account_limit_overrides, account_feature_overrides,
  subscriptions, ai_chat_sessions, ai_conversations, ai_session_steps,
  ai_session_embeddings, ai_suggestions, ai_usage, assistant_chats,
  attachments, copilot_conversations, feedback, file_uploads, fork_points,
  kb_imports, notifications, notification_configs, notification_logs,
  psa_activity_logs, psa_member_mappings, script_builder_sessions,
  script_categories, session_ratings, tree_embeddings, user_folders,
  user_pinned_trees

Platform-visibility policy (own rows OR PLATFORM_ACCOUNT_ID):
  platform_steps, template_trees

Intentionally skipped:
  accounts (IS the root table, no account_id column)
  plan_feature_defaults (platform config, no account_id column)

Also fixes script_builder_service.create_session() which was missing
account_id= on ScriptBuilderSession construction, causing 500s on all
script builder endpoints (pre-existing CI failure).

Adds Phase 4 RLS isolation tests covering: users, script_builder_sessions,
ai_session_steps, notifications, platform_steps, template_trees.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 01:25:28 +00:00
Claude
ba36e37dab docs: update CHANGELOG with Tenant Isolation Phase 2 and Phase 3 details
- Document Phase 2: PostgreSQL RLS on 11 session tables, account_id NOT NULL enforcement, Alembic migration support
- Document Phase 3: RLS on audit_logs and tree_shares, cross-tenant session access for public shares, complete account_id propagation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-11 10:43:10 +00:00
chihlasm
9e6965512b Merge pull request #135 from resolutionflow/feat/tenant-isolation-phase-3
feat: tenant isolation Phase 3 — audit_logs, tree_shares, remaining RLS
2026-04-11 04:28:47 -04:00
chihlasm
893b8a5008 fix: tree_shares.account_id must come from tree owner, not the actor
- trees.py: change account_id=current_user.account_id →
  account_id=tree.account_id so super-admin cross-account shares land in
  the tree's tenant where RLS will see them.

- migration a05e1a1bea7c: fix backfill to join tree_shares → trees instead
  of tree_shares → users(created_by). Same logic: historical shares belong
  to the tree's tenant.

- test_tree_sharing.py: add test_share_account_id_matches_tree_not_actor
  to assert share.account_id == tree.account_id after POST /share; also
  add missing account_id to all direct TreeShare(...) constructors in
  existing tests.

- test_phase1_migrations.py: remove team_id= from TargetList constructor
  (column dropped in Phase 3).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-11 07:02:35 +00:00
chihlasm
e05472615b feat: tenant isolation Phase 3 — audit_logs, tree_shares, remaining RLS
P3-A: Add account_id to audit_logs model + migration (backfill via user_id →
  users.account_id). log_audit() gains optional account_id param with fallback
  SELECT to avoid churn across 40 call sites.

P3-B: Add account_id to tree_shares model + migration (backfill via created_by
  → users.account_id). TreeShare constructor updated in trees.py.

P3-C: Enable RLS on 6 remaining tables: step_ratings, step_usage_log,
  target_lists, session_shares, audit_logs, tree_shares.

P3-D: Drop team_id from target_lists — endpoint, schema, and model now use
  account_id as the sole isolation key.

P3-E: Append Phase 3 RLS isolation tests for all 6 tables.

test_target_lists.py: fix cross-account test to use Account model (not Team)
and set account_id on new User.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-11 07:02:35 +00:00
chihlasm
00fdd663bc Merge pull request #134 from resolutionflow/feat/tenant-isolation-phase-2
feat: Phase 2 tenant isolation — RLS on 11 session tables
2026-04-11 03:02:25 -04:00
chihlasm
8cf58add22 fix: use valid confidence_tier value in RLS test ai_sessions INSERT
'medium' is not a valid value for ck_ai_sessions_confidence_tier.
Valid values are 'guided' | 'exploring' | 'discovery'.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-11 05:28:52 +00:00
chihlasm
6c231ef1c6 fix: use started_at (not created_at) in RLS test session INSERT
sessions table has started_at as the timestamp column, not created_at.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-11 04:53:35 +00:00
chihlasm
758cd61621 fix: propagate account_id through all write paths missing NOT NULL coverage
Service layer (production code):
- branch_manager: set account_id on SessionBranch (root + fork) and ForkPoint
  from session.account_id; load session in create_fork for this purpose
- handoff_manager: set account_id on SessionHandoff from session.account_id
- ai_suggestions endpoint: set account_id on AISuggestion from current_user
- steps endpoint (/feedback): set account_id on StepRating from current_user
- ratings endpoint: set account_id on StepRating from current_user

Test infrastructure:
- conftest.py: seed PLATFORM_ACCOUNT_ID (00000000-...-0001) account after
  Base.metadata.create_all so global categories and gallery items have a valid FK
- test_rls_isolation: add _ensure_rls_schema fixture that runs
  'alembic upgrade head' before module tests — previous function-scoped
  test_db fixtures drop the schema, leaving the RLS tests with no tables
- test_branding: create Account before User in helper functions
- test_admin_gallery: set account_id=PLATFORM_ACCOUNT_ID on Tree/ScriptTemplate
- test_public_templates: set account_id=PLATFORM_ACCOUNT_ID on Tree,
  ScriptTemplate, TreeCategory
- test_resolution_outputs: set account_id=session.account_id on
  SessionResolutionOutput
- test_analytics_phase5: set account_id on PsaPostLog
- test_draft_trees: replace account_id=None with PLATFORM_ACCOUNT_ID in
  migration default test (NOT NULL now enforced)
- test_maintenance_schedules: set account_id on other_tree
- test_save_session_as_tree: set account_id on all 5 Session() constructors

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-11 04:24:36 +00:00
chihlasm
b9fcdd5d73 fix: use DATABASE_URL_SYNC (Railway reference var) as primary Alembic URL
DATABASE_URL_SYNC is now set as a Railway reference variable pointing to
${{pgvector.DATABASE_URL}}, which resolves to the correct postgres superuser
credentials per environment (production, PR preview, fresh DBs). This handles
the bootstrap case where resolutionflow_admin doesn't exist yet.

Falls back to ADMIN_DATABASE_URL (sync-converted) for local dev only.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-11 03:42:07 +00:00
chihlasm
4273ed0e5c fix: use Railway native PG env vars for Alembic migrations
Prior approach (ADMIN_DATABASE_URL first) broke PR preview environments: fresh
Railway PostgreSQL instances have no resolutionflow_admin role yet, so the admin
URL fails before the create_db_roles migration can run (bootstrap deadlock).

New priority order in _alembic_sync_url():
1. PGHOST/PGUSER/PGPASSWORD/PGDATABASE — Railway auto-links these from the
   PostgreSQL service per-environment, giving correct superuser creds for every
   env including fresh PR preview DBs where no custom roles exist yet.
2. ADMIN_DATABASE_URL (resolutionflow_admin, BYPASSRLS, asyncpg→sync) — local
   dev and stable envs where the role already exists.
3. DATABASE_URL_SYNC — legacy fallback.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-11 03:35:04 +00:00
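
The priority order described in this commit can be sketched as below. This is an illustration under assumptions, not the actual `_alembic_sync_url()`: the function name `resolve_alembic_sync_url`, the `env` parameter, the optional `PGPORT` lookup, and the `postgresql+psycopg2://` target prefix are all hypothetical choices made for the example.

```python
import os
from collections.abc import Mapping
from urllib.parse import quote


def resolve_alembic_sync_url(env: Mapping[str, str] = os.environ) -> str:
    """Pick a sync database URL for migrations using a three-step priority."""
    # 1. Native PG* variables: usable even on a fresh database where no
    #    custom roles exist yet.
    pg_keys = ("PGHOST", "PGUSER", "PGPASSWORD", "PGDATABASE")
    if all(env.get(key) for key in pg_keys):
        port = env.get("PGPORT", "5432")  # assumption: PGPORT is optional
        user = quote(env["PGUSER"], safe="")
        password = quote(env["PGPASSWORD"], safe="")
        return (
            f"postgresql+psycopg2://{user}:{password}"
            f"@{env['PGHOST']}:{port}/{env['PGDATABASE']}"
        )

    # 2. Admin URL: swap the async driver prefix for a sync one.
    admin_url = env.get("ADMIN_DATABASE_URL")
    if admin_url:
        return admin_url.replace(
            "postgresql+asyncpg://", "postgresql+psycopg2://", 1
        )

    # 3. Legacy fallback.
    sync_url = env.get("DATABASE_URL_SYNC")
    if sync_url:
        return sync_url

    raise RuntimeError("No database URL configured for Alembic")
```

Passing the environment as a mapping keeps the resolution order easy to test without touching real environment variables.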
chihlasm
0107d2d896 fix: use resolutionflow_admin for Alembic migrations (avoid postgres superuser)
DATABASE_URL_SYNC uses the postgres superuser whose password is unavailable
in Railway after Phase 1 switched runtime to the app role. resolutionflow_admin
(BYPASSRLS) is the correct role for migrations. Derive a psycopg2 sync URL from
ADMIN_DATABASE_URL; fall back to DATABASE_URL_SYNC for local dev environments
where ADMIN_DATABASE_URL is not set separately.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-11 03:23:32 +00:00
chihlasm
79ae34108a fix: add Alembic migrations step + RLS env vars to CI
- Run alembic upgrade head before tests so DB roles and RLS policies exist
- Set TEST_DB_NAME=resolutionflow_test so test_rls_isolation.py connects to
  the correct database (was defaulting to patherly_test which doesn't exist in CI)
- Set DB_APP_ROLE_PASSWORD so create_db_roles migration creates resolutionflow_app
  with a known password that the RLS isolation tests can connect with

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 19:55:10 +00:00
chihlasm
bd29f590a2 fix: set account_id on all Session constructors; fix 3 ESLint errors in CI
Backend: start_session, prepare_session, and batch_launch_sessions were all
missing account_id=current_user.account_id, so the Phase 1 NOT NULL constraint
made them return 500 in the test suite (the test_ratings.py fixture couldn't
create sessions).

Frontend ESLint:
- TaskLane.tsx: suppress react-refresh/only-export-components for clearTaskState
- TeamSummary.tsx: init loading from isAccountOwner to avoid sync setState in effect
- ScriptBodyEditor.tsx: move lastValueRef.current assignment into useEffect

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 14:41:42 +00:00
chihlasm
ce4cfc3240 fix: set account_id on PsaPostLog in psa_post_to_ticket (missed third write path); fix get_admin_db docstring
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 07:12:45 +00:00
chihlasm
82ee177d9b fix: harden Phase 2 RLS tests — try/finally cleanup, assert guards, seed B-data for isolation checks
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 07:07:26 +00:00
chihlasm
ed8de92c52 test: add Phase 2 RLS isolation tests for 11 session tables (incl. step_library visibility regression)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 07:00:09 +00:00
chihlasm
5bd331ca92 fix: clarify step_library RLS comment; remove unused sqlalchemy import
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 06:57:41 +00:00
chihlasm
87fac02e9b feat: migration — enable RLS on 11 Phase 2 session tables (tenant-only + step_library visibility policy)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 06:55:25 +00:00
chihlasm
4f4bc435da docs: broaden admin_database docstring to cover non-admin BYPASSRLS use cases
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 06:51:53 +00:00
chihlasm
ac2b193909 fix: use get_admin_db in access_share to handle cross-tenant session reads (public shares)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 06:50:00 +00:00
chihlasm
b641ac6c55 fix: set account_id on session_supporting_data, session_resolution_outputs, maintenance_schedules, psa_post_log constructors
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 06:44:17 +00:00
chihlasm
8292e6ec65 fix: handle non-default, no-team trees in global content migration
Migration 019 only backfills trees with team_id IS NOT NULL.
Migration 3a40fe11b427 only covered is_default=TRUE trees.
Trees with team_id=NULL and is_default=FALSE (e.g. inactive test trees,
pre-team-system content) fell through both passes and triggered the NULL
guard.

Add two new UPDATE steps after the is_default pass:
1. Assign remaining trees to their author's account (if author has one)
2. Final fallback to PLATFORM_ACCOUNT_ID for any still-NULL rows

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 05:21:26 +00:00
chihlasm
20bd428d83 Merge pull request #133 from resolutionflow/feat/tenant-isolation-phase-1
feat: Phase 1 tenant isolation — add account_id to all tenant tables
2026-04-10 00:57:53 -04:00
chihlasm
b9da0e7107 chore: resolve merge conflicts with main
- deps.py: keep require_tenant_context + require_admin_db (RLS deps);
  drop unused get_tenant_context stub from Phase 0
- categories.py: keep both PLATFORM_ACCOUNT_ID and tenant_filter imports
  (body uses both)
- tenant-isolation spec: keep main's resolved TargetList/teams audit answers

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 04:57:39 +00:00
chihlasm
8f044849d4 fix: get_tree returns 404 (not 403) for inaccessible trees — don't leak resource existence
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 04:17:31 +00:00
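
The idea behind returning 404 instead of 403 can be shown with a deliberately simplified sketch. The `Tree` dataclass and `get_tree_status` function are hypothetical, and the real access check may consider more than the owning account.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Tree:
    id: str
    account_id: str


def get_tree_status(tree: Optional[Tree], requester_account_id: str) -> int:
    """Return the HTTP status a tree lookup would produce."""
    # A missing tree and a tree owned by another tenant produce the same
    # response, so a caller cannot probe for the existence of foreign IDs.
    if tree is None or tree.account_id != requester_account_id:
        return 404
    return 200
```

A 403 for the second case would confirm that the ID exists in some other tenant, which is exactly the signal the fix removes.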
chihlasm
14304be383 fix: correct RLS test fixtures — tree_structure NOT NULL, tree_tags schema, session-scoped set_config
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 04:15:41 +00:00
chihlasm
a5c5eb6cc3 fix: convert DATABASE_URL_SYNC from property to overridable field for Alembic superuser URL
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 04:03:32 +00:00
chihlasm
c4f919f3a5 feat: migration — enable RLS on trees, tags, categories, psa_connections, flow_proposals
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 04:02:10 +00:00
chihlasm
8de6ee7aa4 feat: migration — create resolutionflow_app and resolutionflow_admin DB roles
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 03:59:28 +00:00
chihlasm
83ad2e0661 feat: migrate admin endpoints to get_admin_db (BYPASSRLS) before RLS switch
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 03:57:18 +00:00
chihlasm
ce4056c6b9 test: add failing RLS isolation tests (green after Task 10 migration + Task 11 URL switch)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 03:54:42 +00:00
chihlasm
9d60b9a244 feat: apply require_tenant_context to all user-facing routers
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 03:52:52 +00:00
chihlasm
df9ecf2d29 feat: add require_tenant_context and require_admin_db dependencies
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 03:50:59 +00:00
chihlasm
b0e5f12897 feat: register RLS transaction-begin listener on app engine at startup
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 03:49:49 +00:00
chihlasm
b4f8694f6b feat: add tenant_context module — ContextVar, transaction listener, tenant_filter
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 03:48:34 +00:00
chihlasm
6f1becf21f feat: add admin_engine and get_admin_db for BYPASSRLS admin endpoints
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 03:46:29 +00:00
chihlasm
acbfb3fb37 feat: add ADMIN_DATABASE_URL setting with fallback to DATABASE_URL
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-10 03:45:52 +00:00
chihlasm
a394a1d464 fix: replace account_id=None with PLATFORM_ACCOUNT_ID for global content
After migration 174f442795b7 enforces NOT NULL on account_id, all
platform/global content must use the sentinel platform account instead
of NULL. Three categories of fixes:

1. trees.py: is_default trees now get PLATFORM_ACCOUNT_ID (not None)
2. admin_categories.py: global category CRUD now uses PLATFORM_ACCOUNT_ID
3. categories.py, tags.py, step_categories.py: creation endpoints coerce
   None → PLATFORM_ACCOUNT_ID; IS NULL filter queries updated to
   == PLATFORM_ACCOUNT_ID (IS NULL queries returned empty after migration
   backfilled all global rows to the platform account)

Defines PLATFORM_ACCOUNT_ID constant in app/core/service_account.py.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 18:35:52 +00:00
chihlasm
d2ebc4f182 fix: correct tree tags subquery in template_trees migration
The INSERT into template_trees incorrectly referenced `tags` as a column
on the `trees` table. Tags are a relationship via the `tree_tag_assignments`
join table — there is no direct column. Migration was failing with:

  UndefinedColumn: column "tags" does not exist ... FROM trees

Fixed by replacing COALESCE(tags, '[]') with a correlated subquery that
aggregates tag names from tree_tag_assignments → tree_tags.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 17:30:05 +00:00
chihlasm
8bcf08ae06 fix: persist account ownership for script templates and generations 2026-04-09 17:18:38 +00:00
Claude
85575839f2 docs: update CHANGELOG with tenant isolation Phase 0 and security fixes
- Add Tenant Isolation Phase 0 (#132) — app-layer filtering, cross-tenant audit, UUID isolation
- Document CRITICAL copilot tree query isolation fix (#131)
- Add AI session search, analytics, category, PSA retry, and task lane fixes
- Note 404 (not 403) responses for cross-tenant access to avoid confirming resource existence

https://claude.ai/code/session_014EUBLi2jHrnzJupcetmdwV
2026-04-09 10:41:21 +00:00
chihlasm
478205c208 fix: platform account fallback for script_templates seeded without team/user
Migration 057 inserts 6 AD script templates with NULL team_id and NULL
created_by. Neither backfill path (created_by→users, team_id→team admin)
could attribute them to an account, causing the verify check to fail.

Fix: pre-create the platform sentinel account (ON CONFLICT DO NOTHING,
safe since 3a40fe11b427 also creates it idempotently) and add a final
fallback UPDATE assigning any remaining NULL script_templates to it.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-09 06:41:00 +00:00
chihlasm
0f33feb6d6 fix: use correlated subquery in psa_post_log backfill to avoid invalid FROM-clause reference
PostgreSQL UPDATE...FROM does not allow the updated table to be
referenced inside the FROM clause's JOIN conditions. Replace the
LEFT JOIN psa_connections with a correlated subquery.
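
A small sketch of the correlated-subquery backfill, using SQLite so it runs without a PostgreSQL server. The column names (`psa_connection_id`, `account_id`) are assumptions, and the fallback to the posting user mentioned elsewhere in the log is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE psa_connections (id INTEGER PRIMARY KEY, account_id TEXT);
    CREATE TABLE psa_post_log (
        id INTEGER PRIMARY KEY,
        psa_connection_id INTEGER,
        account_id TEXT
    );
    INSERT INTO psa_connections VALUES (1, 'acct-a');
    INSERT INTO psa_post_log VALUES (100, 1, NULL), (101, NULL, NULL);
""")

# The subquery references the updated table's own row directly, so no
# FROM/JOIN clause on the UPDATE is needed.
conn.execute("""
    UPDATE psa_post_log
       SET account_id = (
           SELECT c.account_id
             FROM psa_connections c
            WHERE c.id = psa_post_log.psa_connection_id
       )
     WHERE account_id IS NULL
""")

backfilled = dict(conn.execute("SELECT id, account_id FROM psa_post_log"))
```

Rows with no matching connection stay NULL, which is why a later fallback or verification step is still needed.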

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-09 06:31:17 +00:00
chihlasm
034b858fc9 fix: add depends_on 067 to cc214c63aa30 to fix fresh-DB migration order
session_resolution_outputs is created in migration 067 (sequential branch
from 064). On fresh databases, Alembic could run cc214c63aa30 before 067,
causing "table does not exist" errors. depends_on ensures 067 always runs
first regardless of branch traversal order.
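
The ordering effect can be illustrated with a toy dependency graph. This is not Alembic's own resolver: the short identifier "067" and the direct parent edges from 064 are simplifications, and the real chains contain more revisions.

```python
from graphlib import TopologicalSorter


def upgrade_order(depends_on_067: bool) -> list[str]:
    """Return one valid upgrade order for a simplified revision graph."""
    # Each key maps to the set of revisions that must run before it.
    graph = {
        "067": {"064"},
        "cc214c63aa30": {"064"},
    }
    if depends_on_067:
        # Models depends_on: an extra edge that is not a parent link.
        graph["cc214c63aa30"].add("067")
    return list(TopologicalSorter(graph).static_order())


order = upgrade_order(depends_on_067=True)
```

Without the extra edge, either branch may legally come first, which matches the fresh-database failure described above.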

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-09 06:20:00 +00:00
chihlasm
b937cb41e4 fix: merge Phase 1 account_id chain with main head to resolve multiple-heads error
Combines the Phase 1 tenant isolation chain (064 → ... → 174f442795b7)
with the main sequential chain (064 → ... → 070) into a single Alembic
head (a9f3b2c1d4e5) so `alembic upgrade head` in the Dockerfile works
without ambiguity.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-09 06:14:04 +00:00
chihlasm
0d475c71ed fix: correct Phase 1 down_revision — chain from 064 not b8d2f4a6c091
b8d2f4a6c091 was NOT the production head. The true head was 064
(064_normalize_script_builder_messages) via the chain:
b8d2f4a6c091 → f0aad74ea51b → 062 → 063 → 064

This caused 'multiple head revisions' on Railway deployment.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 06:04:10 +00:00
chihlasm
417fa562ce fix: Task 9 migration — include tags in template_trees INSERT
The tags column was accidentally omitted from the is_default tree copy.
Now uses COALESCE(tags, '[]'::jsonb) to preserve source tree tags.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 05:34:59 +00:00
chihlasm
42937b24a4 feat: Phase 1 Group 9 — enforce NOT NULL on all account_id columns
All previously-nullable account_id columns are now NOT NULL.
tree_embeddings and feedback backfilled before constraint applied.
Global content assigned to platform sentinel account (00000000-...-0001)
in preceding migration.

Tables updated: users, trees, tree_categories, tree_tags,
step_categories, step_library, tree_embeddings, feedback

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 05:34:32 +00:00
chihlasm
b4b8c67d3b feat: Phase 1 Group 10 — create global content tables and platform account
Creates template_trees and platform_steps (no account_id, no RLS).
Migrates is_default=TRUE trees and public steps into them.
Creates sentinel platform account (00000000-...-0001) for global
tree_categories, tree_tags, step_categories, step_library, and
is_default trees — clearing all NULL account_id rows in those tables
as prerequisite for Group 9 SET NOT NULL.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 05:31:33 +00:00
chihlasm
d24da77604 feat: Phase 1 Group 8 — add account_id to target_lists (keep team_id)
Zero rows in production — this is a schema-only migration in practice.
team_id kept for app code compatibility. Drop deferred to later cleanup.
Backfill: team_id → team admin user → account_id; fallback: created_by.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 05:25:24 +00:00
chihlasm
857e782d14 feat: Phase 1 Group 7 — add account_id to script tables (keep team_id)
team_id is kept in all three tables — drop deferred until app code
is fully migrated off team_id references.

Tables: script_builder_sessions, script_templates, script_generations
Backfill: user_id/created_by → users.account_id

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 05:23:35 +00:00
chihlasm
086c4580f1 feat: Phase 1 Group 6 — add account_id to maintenance_schedules
Primary backfill: tree_id → trees.account_id
Fallback: created_by → users.account_id (for is_default tree rows)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 05:20:56 +00:00
chihlasm
0d69474128 feat: Phase 1 Group 5 — add account_id to PSA and notification tables
psa_post_log: backfill via psa_connection, fallback to posted_by user
psa_member_mappings: backfill via psa_connection
notification_logs: backfill via notification_config

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 05:19:12 +00:00
chihlasm
b5fdb488b3 feat: Phase 1 Group 4 — add account_id to user_folders and user_pinned_trees
Backfill: user_id → users.account_id

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 05:16:50 +00:00
chihlasm
de5ecf4fb2 feat: Phase 1 Group 3 — add account_id to step_ratings and step_usage_log
Backfill from rater/user's account_id (not the step's account_id).
This is an explicit design decision — step rating data is attributed
to the account that performed the rating.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 05:15:10 +00:00
chihlasm
2779a41b94 feat: Phase 1 Group 2 — add account_id to AI branching tables
Tables: session_branches, session_handoffs, fork_points,
        ai_session_steps, ai_suggestions
Backfill: session_id → ai_sessions.account_id (all except
ai_suggestions which uses user_id → users.account_id)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 05:12:18 +00:00
chihlasm
4666c4f6d2 feat: Phase 1 Group 1 — add account_id to core session tables
Migration sequence: add nullable → backfill via user_id/ai_session chain
→ verify zero NULLs → SET NOT NULL → CREATE INDEX.

Tables: sessions, attachments, session_supporting_data,
        session_resolution_outputs
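
The five-step sequence above can be sketched as a helper that emits the statements in order. The `UUID` column type, the index name, and the example backfill join are assumptions; only the step order comes from the commit message.

```python
def account_id_migration_sql(table: str, backfill_sql: str) -> list[str]:
    """Return the SQL statements for one table, in execution order."""
    return [
        # 1. Add the column as nullable so existing rows stay valid.
        f"ALTER TABLE {table} ADD COLUMN account_id UUID",
        # 2. Backfill from an owning row (table-specific).
        backfill_sql,
        # 3. Verification query: the migration should abort if this is > 0.
        f"SELECT COUNT(*) FROM {table} WHERE account_id IS NULL",
        # 4. Only now is the constraint safe to apply.
        f"ALTER TABLE {table} ALTER COLUMN account_id SET NOT NULL",
        # 5. Index the new column for tenant-scoped queries.
        f"CREATE INDEX ix_{table}_account_id ON {table} (account_id)",
    ]


statements = account_id_migration_sql(
    "attachments",
    # Hypothetical backfill; the real join path is not shown in the log.
    "UPDATE attachments SET account_id = "
    "(SELECT u.account_id FROM users u WHERE u.id = attachments.user_id)",
)
```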

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 05:09:14 +00:00
chihlasm
2837c6e4cf docs: add Phase 1 tenant isolation schema migrations implementation plan
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 04:58:24 +00:00
chihlasm
b3dba57bc5 feat: tenant isolation Phase 0 — app-layer filters, UUID audit, CI gate (#132)
* docs: add tenant data isolation design spec

Complete architecture plan for multi-tenant data isolation across
all layers (PostgreSQL RLS, application-layer filtering, schema
migration, testing strategy, and phased rollout checklist).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: add background job isolation policy to tenant isolation spec

Documents policy for all 5 existing background jobs:
- Knowledge Flywheel and PSA Retry flagged for account_id threading
- Chat Retention already follows correct pattern (model for others)
- Maintenance Schedule Firing needs account_id in queries + Session creation
- AI Conversation Expiry approved as cross-tenant with justification

Adds approved cross-tenant query registry and Phase 2 checklist items.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: add tenant isolation Phase 0 implementation plan

8 tasks covering: CRITICAL copilot hotfix, tenant_filter() helper,
get_tenant_context dependency, analytics/category/AI session gap fixes,
full UUID endpoint audit, TargetList dead code audit, teams orphan
check, and CI grep check for missing tenant filters.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add tenant_filter() helper and get_tenant_context dependency

tenant_filter(model, account_id) is the canonical app-layer tenant
scoping expression. Every query on a tenant table must use it.
build_tree_access_filter and build_step_visibility_filter updated
to call tenant_filter() internally for the account_id match.

get_tenant_context is a FastAPI dependency that returns account_id
or raises 403 if the user has no account — prevents raw access to
current_user.account_id and centralises the null check.
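
A plain-Python stand-in for the two helpers, to show the intended division of labour. The real `tenant_filter` takes a model and presumably returns a query expression, and the real dependency is wired into FastAPI; the `HTTPError` class, the `User` dataclass, and the SQL-fragment return value here are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


class HTTPError(Exception):
    """Stand-in for a framework HTTP exception (hypothetical)."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code


@dataclass
class User:
    id: str
    account_id: Optional[str]


def get_tenant_context(current_user: User) -> str:
    """Return the caller's account_id, or raise 403 if there is none."""
    if current_user.account_id is None:
        raise HTTPError(403, "User has no account")
    return current_user.account_id


def tenant_filter(table: str, account_id: str) -> tuple[str, tuple[str, ...]]:
    """Return a parameterised WHERE fragment scoping `table` to one account."""
    return f"{table}.account_id = ?", (account_id,)


clause, params = tenant_filter("trees", get_tenant_context(User("u1", "acct-a")))
```

Keeping the null check in one place means endpoint code never needs to read `current_user.account_id` directly.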

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: scope analytics/flows/{tree_id} to requesting account

Any authenticated user could read flow analytics (session counts,
completion rates, CSAT) for any tree UUID. Now returns 404 if the
tree doesn't belong to the requesting account.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: scope category tree_count to requesting account

tree_count on GET /categories/{id} was including trees from all
accounts, leaking cross-tenant row counts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: restrict AI session search to current user only

Search endpoint used OR(user_id, account_id), exposing other users'
problem_summary and problem_domain within the same account. Sessions
are user-scoped only — cross-user access requires explicit escalation
or sharing. List and search endpoints now behave consistently.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: add ownership check and 404 responses to ai-sessions endpoints

Cross-tenant isolation audit found:
- retry-psa-push had NO ownership check (CRITICAL) — any user could retry any session's PSA push
- save_task_lane used db.get() without ownership filter, returned 403 revealing existence
- get_session returned 403 instead of 404 for unauthorized access
- stream_documentation returned 403 instead of 404

All now use query-level user_id filtering and return 404 to avoid revealing existence.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: return 404 instead of 403 for cross-tenant session access

All session endpoints (get, update, complete, scratchpad, variables, export,
ticket-link) now return 404 instead of 403 when a user tries to access
another user's session. This prevents confirming existence of resources
across tenant boundaries.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: return 404 instead of 403 for cross-tenant tree access

get_tree and update_tree now return 404 when a user cannot access a tree
(private tree from another account). Prevents confirming resource existence
across tenant boundaries.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: return 404 instead of 403 for cross-tenant step access

get_step_or_404 now returns 404 when can_view_step or can_edit_step fails,
preventing confirmation of step existence across tenant boundaries.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: return 404 instead of 403 for cross-tenant upload access

get_upload_url and delete_upload now return 404 when the upload belongs to
a different account/user, preventing resource existence confirmation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: return 404 instead of 403 for cross-tenant share access

revoke_share and create_share now return 404 when the caller is not the
owner, preventing resource existence confirmation across users.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: return 404 instead of 403 for cross-team tree access in maintenance schedules

_get_tree_or_403 now returns 404 when the user's team does not match,
preventing confirmation of tree existence across teams.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: return 404 instead of 403 for cross-account tag access

get_tag now returns 404 for account-specific tags that belong to another
account, preventing resource existence confirmation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: return 404 instead of 403 for cross-account step category access

get_step_category now returns 404 for account-specific categories that
belong to another account, preventing resource existence confirmation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* test: add cross-tenant isolation tests for Task 6 UUID audit

Tests cover:
- Tree GET/PUT returns 404 for cross-account access
- Session GET returns 404 for cross-user access
- AI session GET returns 404 for cross-user access
- AI session retry-psa-push requires ownership
- Upload URL returns 404 for cross-account access
- Share revoke returns 404 for cross-user access

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: return 404 (not 403) for get_documentation cross-user access; add missing Task 6 tests

get_documentation was revealing session existence via 403. Added pre-check
query filtering by session_id AND user_id before calling the engine.

Also add cross-tenant isolation tests for steps, tags, step_categories,
and maintenance_schedules endpoints fixed in Task 6 (TDD was skipped).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: address Task 6 quality review — rename helper, restore 403 for intra-account, add docs test

- Rename _get_tree_or_403 → _get_tree_or_404 in maintenance_schedules.py
  (function now raises 404, old name was misleading)
- Restore HTTP 403 for intra-account permission failures in update_tree:
  same-account users who can see a tree but can't edit it got 404 (wrong);
  only cross-account lookups should return 404 to avoid confirming existence
- Apply same 403/404 distinction to update_tree_visibility
- Add test: get_documentation must return 404 for cross-user session access
- Add comment documenting owner-only design for documentation endpoints
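
The 403/404 distinction restored above can be sketched as a single decision function. The `Tree` dataclass and the owner-only edit rule are assumptions made for illustration; the real permission checks are not shown in the log.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Tree:
    account_id: str
    owner_id: str


def update_tree_status(tree: Optional[Tree], account_id: str, user_id: str) -> int:
    """Return the HTTP status an update attempt should produce."""
    # Missing and cross-account trees look identical to the caller,
    # so existence is never confirmed across tenant boundaries.
    if tree is None or tree.account_id != account_id:
        return 404
    # Same account: the caller can already see the tree, so an honest
    # "forbidden" leaks nothing.
    if tree.owner_id != user_id:
        return 403
    return 200


tree = Tree(account_id="acct-a", owner_id="alice")
```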

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: Task 7+8 — TargetList audit, CI tenant-filter grep check

Task 7: TargetList dead code audit
- Found active code references in 12+ files across backend and frontend
  (full CRUD API + frontend page + MaintenanceScheduleSection + BatchLaunchModal)
- Decision: migrate to account_id in Phase 1 (cannot drop)
- DB row count not available from code-server — must verify from VPS SSH
  before Phase 1 migration
- Teams orphan check query documented; must run from VPS SSH before Phase 1
- Results documented in spec Section 9

Task 8: CI tenant-filter enforcement check (warn mode)
- Create backend/scripts/check_tenant_filters.py
  Scans endpoint and service files for select() on tenant tables without
  tenant_filter/account_id/user_id in surrounding context. Currently
  reports 109 warnings (Phase 1 backlog). Exits 0 (warn mode).
- Add a "Check tenant filter enforcement" step to the backend CI job.
  Once the Phase 1 backlog clears, add the --fail flag to make the check blocking.
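
A rough sketch of how such a grep-style check might work. This is not the contents of check_tenant_filters.py: the model names, the five-line context window, and the function names are assumptions.

```python
import re

TENANT_TABLES = ("Tree", "Session")  # hypothetical model names
SCOPE_MARKERS = ("tenant_filter", "account_id", "user_id")
CONTEXT_LINES = 5  # assumed size of the "surrounding context" window

SELECT_PATTERN = re.compile(r"select\(\s*(%s)\b" % "|".join(TENANT_TABLES))


def find_unscoped_selects(source: str) -> list[int]:
    """Return 1-based line numbers of select() calls with no scope marker nearby."""
    lines = source.splitlines()
    warnings = []
    for i, line in enumerate(lines):
        if not SELECT_PATTERN.search(line):
            continue
        window = "\n".join(lines[max(0, i - CONTEXT_LINES): i + CONTEXT_LINES + 1])
        if not any(marker in window for marker in SCOPE_MARKERS):
            warnings.append(i + 1)
    return warnings


def exit_code(warnings: list[int], fail: bool) -> int:
    """Warn mode always exits 0; with fail=True any warning becomes an error."""
    return 1 if (fail and warnings) else 0


unsafe = "stmt = select(Tree).where(Tree.id == tree_id)"
safe = "stmt = select(Tree).where(tenant_filter(Tree, account_id))"
```

A textual check like this produces false positives and negatives, which fits running it in warn mode first.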

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: record Phase 0 audit results — 0 orphaned teams, 0 target_list rows

Both checks confirmed 2026-04-09 from production DB.
Phase 1 migration is safe to proceed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 00:42:19 -04:00
chihlasm
29a9573d6e fix: CRITICAL — scope copilot tree query to current account (#131)
* docs: add tenant data isolation design spec

Complete architecture plan for multi-tenant data isolation across
all layers (PostgreSQL RLS, application-layer filtering, schema
migration, testing strategy, and phased rollout checklist).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: add background job isolation policy to tenant isolation spec

Documents policy for all 5 existing background jobs:
- Knowledge Flywheel and PSA Retry flagged for account_id threading
- Chat Retention already follows correct pattern (model for others)
- Maintenance Schedule Firing needs account_id in queries + Session creation
- AI Conversation Expiry approved as cross-tenant with justification

Adds approved cross-tenant query registry and Phase 2 checklist items.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs: add tenant isolation Phase 0 implementation plan

8 tasks covering: CRITICAL copilot hotfix, tenant_filter() helper,
get_tenant_context dependency, analytics/category/AI session gap fixes,
full UUID endpoint audit, TargetList dead code audit, teams orphan
check, and CI grep check for missing tenant filters.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: CRITICAL — scope copilot tree query to current account

A user who knew another account's tree UUID could start a copilot
conversation, causing the tree's full node structure, names, and
descriptions to be sent to the AI as part of the system prompt.

Fix: add account_id (or is_default / visibility='public') filter to
the tree SELECT in copilot_service.start_conversation(). Returns 404
for inaccessible trees. Test added in test_tenant_isolation_p0.py.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 00:41:30 -04:00
chihlasm
56775eca04 docs: add tenant isolation Phase 0 implementation plan
8 tasks covering: CRITICAL copilot hotfix, tenant_filter() helper,
get_tenant_context dependency, analytics/category/AI session gap fixes,
full UUID endpoint audit, TargetList dead code audit, teams orphan
check, and CI grep check for missing tenant filters.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 03:02:19 +00:00
chihlasm
82bb7967d8 docs: add background job isolation policy to tenant isolation spec
Documents policy for all 5 existing background jobs:
- Knowledge Flywheel and PSA Retry flagged for account_id threading
- Chat Retention already follows correct pattern (model for others)
- Maintenance Schedule Firing needs account_id in queries + Session creation
- AI Conversation Expiry approved as cross-tenant with justification

Adds approved cross-tenant query registry and Phase 2 checklist items.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 02:38:55 +00:00
chihlasm
a7dff9e143 docs: add tenant data isolation design spec
Complete architecture plan for multi-tenant data isolation across
all layers (PostgreSQL RLS, application-layer filtering, schema
migration, testing strategy, and phased rollout checklist).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 02:24:38 +00:00
Claude
ba0680ce06 docs: update CHANGELOG with image support, header actions, and design token normalization
- Added image support in Assistant Chat with S3 upload and vision integration
- Moved session lifecycle actions to header bar in AssistantChatPage
- Normalized design system tokens across FlowPilot, AssistantChat, ScriptBuilder
- Fixed 'sorry something went wrong' errors and image display in chat
- Fixed Task Lane stale data and chat ref invalidation race conditions

https://claude.ai/code/session_01LGJSDQqPi3sPWjC6vh9Uyj
2026-04-08 10:40:44 +00:00
chihlasm
290f2be2fd fix: resolve "sorry something went wrong" errors and show images in chat
Three fixes from beta tester session feedback:

1. MCP error handling (backend/app/services/assistant_chat_service.py)
   - The MCP Microsoft Learn integration was catching only BadRequestError.
     Any other error type (APIStatusError, APIConnectionError, timeout) from
     the external MCP server propagated as a 502, causing the generic error.
   - Now catches all Exception types when MCP is active and retries without
     MCP using the stable client.messages.create endpoint.

2. Frontend error UX (frontend/src/pages/AssistantChatPage.tsx)
   - catch {} was silently swallowing all errors and inserting a generic
     assistant message. Now: differentiates 429 (rate limit) vs 502/503
     (AI unavailable), removes the optimistic user message on failure,
     restores the failed message to the input so users can retry without
     retyping, and logs errors to console for debugging.

3. Image attachments visible in chat (frontend/src/components/assistant/ChatMessage.tsx)
   - Uploaded images were sent to the AI correctly but never shown in the
     chat thread. Now captures preview URLs before clearing pendingUploads
     and renders thumbnails above the user bubble, clickable to full size.
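
The MCP fallback in fix 1 can be sketched as follows. The two callables are hypothetical placeholders for the MCP-enabled request and the plain request; the real service code is not shown in the log.

```python
from typing import Callable


def send_with_mcp_fallback(
    call_with_mcp: Callable[[], str],
    call_without_mcp: Callable[[], str],
    mcp_active: bool,
) -> str:
    """Try the MCP-enabled call first; on any error, retry once without MCP."""
    if not mcp_active:
        return call_without_mcp()
    try:
        return call_with_mcp()
    except Exception:
        # Any failure from the external MCP server (status, connection,
        # timeout) falls back instead of surfacing as a 502.
        return call_without_mcp()


def failing_mcp_call() -> str:
    raise TimeoutError("MCP server did not respond")


reply = send_with_mcp_fallback(failing_mcp_call, lambda: "plain reply", mcp_active=True)
```

Errors raised by the fallback call itself still propagate, so genuine outages are not hidden.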

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-07 13:09:16 +00:00
chihlasm
e8e12cc7e5 fix: move session lifecycle actions to header bar in AssistantChatPage
- Add persistent session header with title, status badge, Resolve,
  Escalate, and Update Ticket/Share Update buttons — mirrors
  FlowPilotSessionPage pattern exactly
- Update Ticket label when psa_ticket_id present, Share Update otherwise
- Full mobile support via ⋯ overflow menu (Resolve, Escalate, Update, Pause)
- Strip _(not yet completed)_ markers from stored conversation_messages
  in unified_chat_service to prevent stale task lane items from prior
  turns leaking into new sessions via the AI's re-include instruction
- Add currentChatRef guard to handleResumeNew (was missing unlike handleSend)
- Remove Update/Conclude from chatbar — toolbar is now input utilities only

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-07 06:31:24 +00:00
chihlasm
bf45322c46 Merge pull request #126 from resolutionflow/refactor/dashboard-design-critique
refactor: normalize FlowPilot/Assistant/ScriptBuilder to design system tokens
2026-04-06 20:23:50 -04:00
Michael Chihlas
f45b045943 refactor: resolve merge conflicts — combine main improvements with token normalization
- .gitignore: keep both graphify-out/ entries and main's .gitnexus entry
- ScriptCodeBlock/ScriptPreviewModal: take main's border-border and text-accent-text
  for filename labels; use neutral ghost style for Save button in ScriptCodeBlock;
  use bg-accent (normalized from bg-primary) for Save button in ScriptPreviewModal

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-06 20:23:36 -04:00
Michael Chihlas
cef853d7ea refactor: normalize FlowPilot/Assistant/ScriptBuilder to design system tokens
Replace hardcoded Tailwind color utilities with semantic CSS variable tokens
across 31 files in the FlowPilot, Assistant Chat, and Script Builder feature
communities — the areas graphify identified as design-system-free.

- text-blue-400 → text-accent, bg-blue-500/10 → bg-accent-dim, border-blue-500/20 → border-accent/20
- text-amber-400 → text-warning, bg-amber-400/10 → bg-warning-dim, border-l-amber-500 → border-l-warning
- text-rose-400/500 → text-danger, bg-rose-500/10 → bg-danger-dim
- text-emerald-400 → text-success, bg-emerald-500/10 → bg-success-dim, border-l-emerald-500 → border-l-success
- bg-white/[0.08] → bg-elevated (opacity hack → semantic surface token)
- bg-gradient-to-r from-blue-500 to-blue-400 → bg-accent (no gradient surfaces)
- bg-[#60a5fa] → bg-accent (hard-coded hex removed)

Also adds graphify-out/ to .gitignore.

Theme resilience: accent color has changed twice in 5 weeks. Semantic tokens
mean the next change is a 1-line edit in index.css, not 110 grep-and-replace.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-06 20:20:07 -04:00
chihlasm
87cf874199 fix: invalidate currentChatRef before await in handleNewChat and handleResumeNew
The previous fix (990f044) moved state clears before the createChatSession
await but left currentChatRef.current pointing at the old session during the
entire network call. Any in-flight handleSend/handleTaskSubmit for the old
session would pass the guard (oldId === oldId) and re-apply stale task lane
data to the new empty session.

Setting currentChatRef.current = null before the await ensures in-flight
handlers from the previous session see a mismatch and bail — matching the
same pattern already used correctly in selectChat.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-06 20:56:10 +00:00
chihlasm
2b53315cc9 Merge pull request #125 from resolutionflow/fix/task-lane-partial-submit
fix: resolve task lane stale state, partial submit, and closure bugs
2026-04-06 16:31:41 -04:00
chihlasm
1811889ed9 chore: update docs and redesign landing page hero
- CLAUDE.md: correct Docker container names, update migration format
  docs (hash IDs now default), fix Node path in Lesson 63, update
  design system values to electric blue accent, add retracted lessons
  note, add GitNexus section
- .gitignore: add .gitnexus
- Landing page: replace animated chat preview with ticket-comparison
  hero layout; remove backdrop-filter from scrolled nav (aligns with
  design system); clean up removed chat animation CSS

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-06 20:17:47 +00:00
chihlasm
990f04489f fix: prevent TaskLane showing stale data when starting new chat
Three race conditions in AssistantChatPage:

1. handleNewChat cleared showTaskLane/activeQuestions/activeActions
   AFTER the createChatSession await — old lane was visible during
   the network call. Moved clears before the await.

2. handleResumeNew never cleared old TaskLane state at all. Added
   upfront clears before the first await.

3. handleSend and handleTaskSubmit had no stale-session guard. If
   the user switched chats while sendChatMessage was in flight, the
   response would set showTaskLane on the wrong session. Added
   sentForChatId snapshot + currentChatRef guard (same pattern
   already used in selectChat).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-06 20:17:39 +00:00
chihlasm
ba815d3ee5 Merge remote-tracking branch 'origin/main' into fix/task-lane-partial-submit 2026-04-06 20:14:45 +00:00
chihlasm
8bd395a0c7 fix: resolve task lane stale state, partial submit, and closure bugs
- Import and call clearTaskState before updating questions/actions in
  handleSend and handleTaskSubmit so new AI tasks always replace stale
  sessionStorage cache instead of being overridden by it
- Include pending (not yet completed) tasks in the AI message on partial
  submit so the AI knows which tasks were left unanswered
- Fix stale closure in TaskLane saveTaskLane useEffect — use refs for
  questions/actions so the debounced backend save always uses current values
- Add responses field to pending_task_lane TypeScript type, removing the
  unsafe double-cast in selectChat
- Instruct the AI to re-surface incomplete tasks unless ≥75% confident
  the information is no longer needed

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-06 16:53:48 +00:00
Claude
7198c165b2 docs: update CHANGELOG with session documentation overhaul and client communications
Added entries for:
- Session documentation overhaul with reformatted PSA notes, decimal hour display,
  and follow-up recommendations
- Client communication improvements with request_info audience type
- PSA documentation formatting enhancements
- Status update generation improvements
- Option label resolution fix

https://claude.ai/code/session_01GpyJYk4F3eGiJXwsgycChK
2026-04-06 10:35:01 +00:00
chihlasm
58fe3574bf docs: resolve all contract decisions from codex readiness review
Addresses every Red and Yellow item from the codex review:
- Canonical handoff: ResolutionOutputGenerator is the source of truth
- AI vs manual authority: manual edits win, AI never overwrites
- evidence_items: full-list replacement, frontend is merge authority
- TaskLane persistence: lifted into hook, StepsPanel is presentation-only
- Quick replies: immediate-send, full-stack contract change
- issue_category + asset_name: free text in v1
- Adds 5 implementation guardrails and Phase 2 gate for triage extraction
- Execution order updated to 37 steps with persistence extraction step

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 15:41:43 +00:00
chihlasm
63a84be921 docs: merge codex insights into claude super plan
Adds key architectural choices summary, assumptions section,
sidebar visual demotion (F9), message click-to-expand in compact
log, and backend-first rationale from the codex plan.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 15:41:43 +00:00
chihlasm
75971d8b97 docs: add MSP assistant harness super plan (claude synthesis)
Merges MSP_Assistant_Harness_Implementation_Plan.docx with the
brainstorming design spec into a single executable plan. Resolves
all open questions from the original docx, expands scope to include
backend changes, and adds a 35-step phased execution order.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 15:41:43 +00:00
chihlasm
7998dd237d docs: add MSP assistant harness cockpit design spec
Design spec for evolving /assistant into a live triage cockpit.
Covers layout decisions (stacked zones, drag-resizable split),
incident header (labelled fields, AI-inferred + editable),
work zone (steps checklist + FlowPilot Asks + What We Know),
conclude modal redesign, and all required backend changes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 15:41:43 +00:00
chihlasm
f4143e52a1 feat: overhaul session documentation, PSA notes, and client communications
- Reformat PSA resolution/escalation notes: clean single-line header,
  steps with engineer responses inline, remove duplicate timing blocks,
  remove AI confidence section, add follow-up recommendations
- Standardize time display to decimal hours (e.g. 0.25 hrs) across all
  note formatters and status update context
- Add follow_up_recommendations to SessionDocumentation schema and
  surface in SessionDocView; extracted from resolution suggestion steps
- Add _build_what_we_know() helper: uses session.evidence_items when
  cockpit branch merges, falls back to deriving findings from steps
- Fix option label lookup in generate_status_update (was passing raw
  machine values to AI instead of human-readable labels)
- Add 'What We Know' section to status update ticket notes prompt
- Improve _build_session_context in resolution_output_generator to
  include intake text and full step details instead of truncated chat
- Add request_info audience type: client-facing information request
  that skips the length step and generates a numbered question list
- Improve client_update and email_draft prompts with per-context
  guidance (status/resolution/escalation) and fix escalation subject
  line from 'Specialist Review' to 'Specialist Assistance'
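
The decimal-hours display could look like the sketch below. The function name and the two-decimal rounding are assumptions; only the "0.25 hrs" example comes from the commit message.

```python
def format_decimal_hours(minutes: float) -> str:
    """Format a duration in minutes as decimal hours, e.g. 15 -> '0.25 hrs'."""
    return f"{minutes / 60:.2f} hrs"


quarter_hour = format_decimal_hours(15)
```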

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 15:18:31 +00:00
328 changed files with 41183 additions and 10970 deletions

154
.gitea/workflows/ci.yml Normal file

@@ -0,0 +1,154 @@
name: CI
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
jobs:
  backend:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: pgvector/pgvector:pg16
        env:
          POSTGRES_USER: postgres
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: resolutionflow_test
        ports:
          - 5432:5432
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
    env:
      DATABASE_URL: postgresql+asyncpg://postgres:postgres@postgres:5432/resolutionflow_test
      DATABASE_URL_SYNC: postgresql://postgres:postgres@postgres:5432/resolutionflow_test
      SECRET_KEY: ci-test-secret-key-not-for-production
      DEBUG: "true"
      APP_NAME: ResolutionFlow
      TEST_DB_NAME: resolutionflow_test
      DB_APP_ROLE_PASSWORD: app_secret_ci
    steps:
      - uses: actions/checkout@v4
      - name: Install dependencies
        run: pip install --break-system-packages -r backend/requirements.txt -r backend/requirements-dev.txt
      - name: Run Alembic migrations
        run: cd backend && alembic upgrade head
      - name: Check tenant filter enforcement
        run: cd backend && python scripts/check_tenant_filters.py
      - name: Run tests with coverage
        run: cd backend && python -m pytest --override-ini="addopts=" --cov=app --cov-report=term-missing --cov-report=json:coverage.json --cov-fail-under=50
      - name: Display coverage summary
        if: always()
        run: |
          cd backend
          python -c "
          import json
          with open('coverage.json') as f:
              data = json.load(f)
          total = data['totals']['percent_covered_display']
          print(f'Total coverage: {total}%')
          print()
          print('Module coverage:')
          for fname, fdata in sorted(data['files'].items()):
              pct = fdata['summary']['percent_covered_display']
              if float(pct) < 80:
                  print(f' WARNING {fname}: {pct}%')
              else:
                  print(f' OK {fname}: {pct}%')
          "
  frontend:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install dependencies
        run: cd frontend && npm ci
      - name: Lint
        run: cd frontend && npm run lint
      - name: Test with coverage
        run: cd frontend && npm run test:coverage
      - name: Build
        run: cd frontend && NODE_OPTIONS="--max-old-space-size=4096" npm run build
      - name: Upload build artifact
        uses: actions/upload-artifact@v4
        with:
          name: frontend-dist
          path: frontend/dist
          retention-days: 1
  e2e:
    needs: [frontend]
    runs-on: ubuntu-latest
    services:
      postgres:
        image: pgvector/pgvector:pg16
        env:
          POSTGRES_USER: postgres
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: resolutionflow_test
        ports:
          - 5432:5432
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
    env:
      PLAYWRIGHT_DATABASE_URL: postgresql+asyncpg://postgres:postgres@postgres:5432/resolutionflow_test
      PLAYWRIGHT_DATABASE_URL_SYNC: postgresql://postgres:postgres@postgres:5432/resolutionflow_test
      PLAYWRIGHT_API_ORIGIN: http://127.0.0.1:8000
      PLAYWRIGHT_BASE_URL: http://127.0.0.1:4173
      PLAYWRIGHT_SECRET_KEY: ci-playwright-secret-key
      PLAYWRIGHT_TEST_EMAIL: teamadmin@resolutionflow.example.com
      PLAYWRIGHT_TEST_PASSWORD: TestPass123!
    steps:
      - uses: actions/checkout@v4
      - name: Install backend dependencies
        run: pip install --break-system-packages -r backend/requirements.txt -r backend/requirements-dev.txt
      - name: Install frontend dependencies
        run: cd frontend && npm ci
      - name: Download frontend build
        uses: actions/download-artifact@v4
        with:
          name: frontend-dist
          path: frontend/dist
      - name: Install Playwright browser
        run: cd frontend && npx playwright install --with-deps chromium
      - name: Run Playwright smoke tests
        run: cd frontend && npm run test:e2e
      - name: Upload Playwright report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: playwright-report
          path: |
            frontend/playwright-report
            frontend/test-results
          if-no-files-found: ignore


@@ -0,0 +1,19 @@
name: Mirror to GitHub
on:
  push:
    branches:
      - '**'
jobs:
  mirror:
    runs-on: ubuntu-latest
    steps:
      - name: Push to GitHub
        run: |
          cd /tmp
          git clone --mirror https://gitea.resolutionflow.com/chihlasm/resolutionflow.git repo
          cd repo
          git remote add github https://x-access-token:${{ secrets.GH_MIRROR_TOKEN }}@github.com/${{ secrets.GH_MIRROR_REPO }}
          git push github --all --force
          git push github --tags --force


@@ -0,0 +1,43 @@
name: Runner Probe
on:
  workflow_dispatch:
jobs:
  probe:
    runs-on: ubuntu-latest
    steps:
      - name: Runner labels and OS
        run: |
          echo "=== OS ==="
          uname -a
          cat /etc/os-release 2>/dev/null || true
      - name: Python versions
        run: |
          echo "=== Python ==="
          which python3 && python3 --version || echo "python3 not found"
          which python && python --version || echo "python not found"
          ls /usr/bin/python* 2>/dev/null || true
      - name: Node versions
        run: |
          echo "=== Node ==="
          which node && node --version || echo "node not found"
          which npm && npm --version || echo "npm not found"
          ls /usr/bin/node* 2>/dev/null || true
          ls ~/.nvm/versions/node/ 2>/dev/null || echo "no nvm versions"
      - name: Docker
        run: |
          echo "=== Docker ==="
          which docker && docker --version || echo "docker not found"
          docker info 2>/dev/null | grep -E "Server Version|Operating System" || true
      - name: User and home
        run: |
          echo "=== User ==="
          whoami
          echo "HOME=$HOME"
          echo "PATH=$PATH"


@@ -31,6 +31,8 @@ jobs:
      SECRET_KEY: ci-test-secret-key-not-for-production
      DEBUG: "true"
      APP_NAME: ResolutionFlow
      TEST_DB_NAME: resolutionflow_test
      DB_APP_ROLE_PASSWORD: app_secret_ci
    steps:
      - uses: actions/checkout@v5
@@ -47,6 +49,14 @@ jobs:
      - name: Install dependencies
        run: pip install -r backend/requirements.txt -r backend/requirements-dev.txt
      - name: Run Alembic migrations
        run: cd backend && alembic upgrade head
      - name: Check tenant filter enforcement
        run: cd backend && python scripts/check_tenant_filters.py
        # Warn mode only (exits 0). Switch to --fail after Phase 1 backlog clears.
        # See: docs/superpowers/specs/2026-04-09-tenant-data-isolation-design.md Section 3f
      - name: Run tests with coverage
        run: cd backend && python -m pytest --override-ini="addopts=" --cov=app --cov-report=term-missing --cov-report=json:coverage.json --cov-fail-under=50
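
The tenant filter step above runs in warn mode and is meant to switch to `--fail` later. The contents of `scripts/check_tenant_filters.py` are not shown in this diff, so the following is only a minimal sketch of how a warn/fail gate of that kind could behave. The heuristic (flag any `select(` line that does not mention `tenant_filter(`) and the names `find_unfiltered_queries` and `exit_code` are assumptions made for illustration.

```python
import re

# Hypothetical heuristic: a line that builds a query with select(...) but
# never mentions tenant_filter( is reported. The real script may differ.
QUERY_PATTERN = re.compile(r"\bselect\(")
FILTER_MARKER = "tenant_filter("


def find_unfiltered_queries(source: str) -> list[int]:
    """Return 1-based line numbers of query lines lacking the filter marker."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if QUERY_PATTERN.search(line) and FILTER_MARKER not in line:
            findings.append(lineno)
    return findings


def exit_code(findings: list[int], fail: bool) -> int:
    """Warn mode always exits 0; fail mode exits 1 when anything was found."""
    return 1 if (fail and findings) else 0
```

In warn mode the findings would only be printed, so the CI job stays green while the backlog is cleared; flipping `fail` to true turns the same findings into a blocking failure.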

5
.gitignore vendored

@@ -233,3 +233,8 @@ package.json
package-lock.json
.worktrees/
.gstack/
.gitnexus
# graphify knowledge graph outputs
graphify-out/
.graphify_python


@@ -4,37 +4,47 @@ All notable changes to ResolutionFlow are documented here.
## [Unreleased]
## [2026-04-04] Network Diagram Editor UX Improvements
### Added
- Snap-to-grid (20px) on Network Diagram canvas — nodes align consistently when dragged
- NodeResizer on group nodes (subnet/VLAN/site/DMZ) — select a group and drag its handles to resize
- Group node dimensions now saved to and restored from the backend on reload
### Fixed
- Connection edges now render as straight lines instead of orthogonal bent paths
- ISP device now appears inside the Cloud category in the sidebar instead of a standalone "Internet" section; respects search and item count
- Group nodes now restore correctly as `type: 'group'` on diagram load (previously loaded as `type: 'device'`, breaking group display after save)
---
### Added
- Tree Templates + Import/Export marketplace (#66)
- Recurring Issue Detection — client-specific pattern alerts (#60)
- Step Feedback Flag — "This Step is Wrong" reporting (#58)
- **Tenant Isolation Phase 0** — multi-tenant data isolation (#132) with app-layer filtering helpers (`tenant_filter()`, `get_tenant_context`), cross-tenant access audit (analytics, categories, AI sessions, trees), UUID endpoint isolation with 404 responses for unauthorized access, ownership checks on all sensitive operations, and CI grep gate for missing tenant filters
- **Tenant Isolation Phase 2** — PostgreSQL Row Level Security (RLS) on 11 session-related tables (ai_sessions, session_steps, session_tags, etc.), account_id NOT NULL enforcement on all write paths, Alembic migrations with dual-env support (Railway native vars + explicit DATABASE_URL_SYNC), RLS test coverage with cross-account isolation verification, migration CI/CD integration
- **Tenant Isolation Phase 3** — RLS on audit_logs and tree_shares tables, cross-tenant session access for public shares (via get_admin_db), complete account_id propagation across PSA integration write paths, final RLS policy enforcement
- **Tenant Isolation Phase 4** (#136) — RLS enforcement on all 31 remaining tables (users, trees, teams, integrations, scripts, categories, templates, surveys, etc.), BYPASSRLS session pattern for auth deps and background jobs, admin session factory for startup routines (service accounts, seed data), global table exclusions (platform_steps, template_trees, script_categories, accounts), RLS tests with complete cross-tenant isolation verification, proper tree_shares ownership checks using tree owner's account_id
- **Script Library default view** — "All Scripts" tab now displays all accessible scripts (team + library)
- **Session documentation overhaul** — reformatted PSA resolution/escalation notes with cleaner headers, inline engineer responses, decimal hour display (0.25 hrs), follow-up recommendations, and improved "What We Know" section from evidence items
- **Client communication improvements** — new `request_info` audience type for client-facing information requests, improved status update and email draft prompts with per-context guidance
- **Image support in Assistant Chat** — paste/attach images in chat input, uploaded to S3, resized for vision model, displayed in conversation history
### Changed
- **Edit Procedure page** — layout overhaul and color system refinements for better visual hierarchy
- **Flows sidebar navigation** — collapsed to reduce visual noise; session recovery removed from library view
- **Account settings page** — audit fixes for improved consistency and usability
- **PSA documentation formatting** — removed duplicate timing blocks and AI confidence sections; added client-facing communication context guidance
- **Status update generation** — fixed option label lookup to use human-readable labels instead of machine values
- **Assistant Chat session actions** — moved Pause/Resume/Close actions from action bar to page header for consistency with FlowPilot
- **Design system token normalization** — unified FlowPilot, AssistantChat, and ScriptBuilder components to use consistent design tokens
- **Tenant data boundaries** — all session and tree endpoints now return 404 (not 403) for cross-tenant access attempts to avoid confirming resource existence
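
The 404-instead-of-403 rule can be illustrated with a small, framework-free sketch. It assumes a hypothetical in-memory `trees` mapping and a stand-in `NotFound` exception; the real endpoints use FastAPI and SQLAlchemy, which are not reproduced here. The point is that a missing resource and a resource owned by another account take the same code path.

```python
from dataclasses import dataclass


class NotFound(Exception):
    """Stand-in for an HTTP 404 response (hypothetical, for illustration)."""

    status_code = 404


@dataclass(frozen=True)
class Tree:
    id: str
    account_id: str


def get_tree_for_account(trees: dict[str, Tree], tree_id: str, account_id: str) -> Tree:
    """Return the tree only if it belongs to the requesting account."""
    tree = trees.get(tree_id)
    # Missing and cross-tenant are deliberately indistinguishable to the caller,
    # so a response never confirms that another tenant's resource exists.
    if tree is None or tree.account_id != account_id:
        raise NotFound(f"Tree {tree_id} not found")
    return tree
```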
### Fixed
- **CRITICAL: Copilot tree query isolation** (#131) — user could access any tree UUID if known, exposing full tree structure to AI. Now scoped to current account with 404 for inaccessible trees.
- **AI session search isolation** — search endpoint leaked other users' sessions via OR(user_id, account_id). Now restricted to current user only.
- **Analytics endpoint isolation** — GET `/analytics/flows/{tree_id}` exposed session counts for any tree UUID. Now returns 404 if tree doesn't belong to requesting account.
- **Category tree counts** — cross-tenant row count leakage via tree_count field in GET `/categories/{id}`. Now scoped to requesting account.
- **PSA retry ownership check** — retry-psa-push had no ownership validation (CRITICAL). Now validates user ownership before allowing retry.
- **Task Lane save operation** — invalid task_lane_item UUIDs returned 403 revealing existence. Now returns 404 and uses query-level filtering.
- **Phase 4 RLS enforcement** — fixed auth deps, user-mutation endpoints, background jobs, and lifespan routines to use BYPASSRLS sessions for reading/writing tenant-isolated tables; fixed seed scripts to use ADMIN_DATABASE_URL; bootstrap service account now initializes correctly with proper BYPASSRLS context
- Dark text rendering on blue accent step-number badges across all flow types
- Script Library tab ownership filter now preserved across category and search changes
- Race conditions in script builder session creation and slug generation
- Stale async results in Assistant Chat (selectChat) no longer clobber new session task lane
- Sentry DSN hardcoded fallback removed — now uses environment variable only
- Option label resolution in status update context generation
- "Sorry something went wrong" errors in chat when rendering unsupported message types
- Task Lane stale data when creating new chat or resuming from concluded session
- Chat ref invalidation race condition between handleNewChat and async data loads
- Images now properly display in chat message history instead of blank placeholders
---

665
CLAUDE.md

@@ -1,520 +1,265 @@
# CLAUDE.md - Patherly / ResolutionFlow Project Context
# CLAUDE.md ResolutionFlow
> **Last Updated:** March 27, 2026
> SaaS troubleshooting platform for MSPs. Last reviewed 2026-04-19.
**Naming:** Canonical product name is **ResolutionFlow**. `patherly` is the legacy internal name — still present in DB name (`patherly` on Railway, `resolutionflow` locally), some Railway service names, and historical paths. Treat as aliases, not canonical. Docker containers are `resolutionflow_*`.
**User terminology:** "Flows" (not Trees), "Projects" (not Procedures), "Solutions Library" (not Step Library). Maintenance flows hidden from pilot UI (backend retains them). DB column `tree_type` values unchanged.
**SaaS shape:** Multi-tenant by account. Roles: `super_admin` > `team_admin` > `engineer` > `viewer`. Team admin = `role='engineer'` + `is_team_admin=True` + valid `team_id`. Never `role=='admin'` — use `is_super_admin`. Backend deps in `app/api/deps.py`: `get_current_active_user`, `require_engineer_or_admin`, `require_admin`. Frontend: `usePermissions()` hook. Central logic in `backend/app/core/permissions.py` + `frontend/src/hooks/usePermissions.ts`.
**Status:** Go-to-Market Validation (pre-PMF). Backend feature-complete (55+ endpoints, 100+ tests). Phase 0.5 FlowPilot telemetry baseline accruing. See `CURRENT-STATE.md` for live status, `03-DEVELOPMENT-ROADMAP.md` for phases.
**Principle:** Prefer correct architecture over minimal diff. Flag "simpler approach" tradeoffs for review before taking them.
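
The role rules in the "SaaS shape" paragraph can be sketched as follows. This is not the code in `backend/app/core/permissions.py`; the `User` dataclass and the helper names `has_team_admin_rights` and `effective_role` are assumptions used only to show how the three conditions for a team admin combine and why `is_super_admin` is checked rather than a `role` string.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class User:
    # Hypothetical shape; field names follow the paragraph above.
    role: str
    is_super_admin: bool = False
    is_team_admin: bool = False
    team_id: Optional[str] = None


def has_team_admin_rights(user: User) -> bool:
    """Team admin = role 'engineer' + is_team_admin flag + a valid team_id."""
    return user.role == "engineer" and user.is_team_admin and user.team_id is not None


def effective_role(user: User) -> str:
    """Resolve the highest applicable role without comparing role to 'admin'."""
    if user.is_super_admin:
        return "super_admin"
    if has_team_admin_rights(user):
        return "team_admin"
    return user.role
```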
---
## Project Overview
## Tech stack
**Patherly** (user-facing brand: **ResolutionFlow**) is a **SaaS product for MSP professionals**. It provides troubleshooting decision trees that guide engineers through proven troubleshooting paths, capture decisions and notes, and generate professional ticket documentation.
**Target Market:** MSP companies — IT service providers managing infrastructure and support for multiple clients.
**SaaS Context:** Multi-tenant design — teams represent MSP companies, trees shared within teams, tiered access (super_admin, team_admin, engineer, viewer).
### Branding
| Context | Name Used |
|---------|-----------|
| Repository / directory / database / Docker | `patherly` / `patherly_postgres` |
| Backend, frontend UI, production URLs | **ResolutionFlow** |
- **Design system:** [DESIGN-SYSTEM.md](DESIGN-SYSTEM.md) — THE source of truth for all design decisions
- **Design aesthetic:** Flat, high-contrast dark theme (Sentry/PostHog-inspired). No glass morphism, no gradients on surfaces, no ambient effects. Light mode planned.
- **Accent color:** Electric blue (#60a5fa dark / #2563eb light). Used sparingly — ≤5% of the UI. Warning is amber (#fbbf24), info is cyan (#67e8f9).
- **Fonts:** IBM Plex Sans (`font-sans`, body), Bricolage Grotesque (`font-heading`, headings), JetBrains Mono (`font-mono`, code) — loaded via Google Fonts
- **Logo:** 30px gradient square (ember orange) + "ResolutionFlow" in Bricolage Grotesque 700
- **Layout:** Icon rail sidebar (72px default) with hover flyout panels. Pinnable to full 260px sidebar. See [DESIGN-SYSTEM.md](DESIGN-SYSTEM.md)
- **Brand assets:** `brand-assets/` (source SVGs), `frontend/src/assets/brand/` (app assets), `frontend/public/icons/` (favicon)
- **Terminology:** User-facing label is "Flows" (not "Trees"). Procedural flows are called "Projects" in the UI. Step Library is called "Solutions Library" in the UI. Maintenance flows are hidden from UI for pilot (backend still supports them). `tree_type` column values unchanged in DB.
- **Reference mockups:** `docs/mockups/` (HTML files, open in browser)
**Component styling:** See Design System section below and [DESIGN-SYSTEM.md](DESIGN-SYSTEM.md). All colors via CSS variables. Use "Flows" not "Trees" in user-facing text; use "Projects" not "Procedures" for procedural flows.
## Implementation Principles
- Prefer correct architecture over minimal diff
- If two approaches exist, implement the one that scales, not the one that's faster to write
- Flag any "simpler approach" tradeoffs for product owner review before proceeding
- **Backend:** Python 3.11 + FastAPI, SQLAlchemy 2.0 async (asyncpg), Alembic, Pydantic v2, JWT (python-jose + bcrypt, JTI refresh rotation), APScheduler (in-process with FastAPI lifespan).
- **Frontend:** React 19 + Vite + TypeScript, Tailwind v4 (CSS-only config in `index.css`), Zustand (immer + zundo), React Router v7, Axios (token-refresh interceptor), Lucide.
- **DB:** PostgreSQL 16 (RLS enabled Phase 4, pgvector).
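
As a rough illustration of what JTI-based refresh rotation means, here is a minimal in-memory sketch. It does not use python-jose and is not the project's implementation: the `RefreshTokenStore` class is hypothetical, and a real system would persist JTIs and embed them in signed tokens. It only shows the core idea that each refresh token identifier can be redeemed once.

```python
import uuid


class RefreshTokenStore:
    """Tracks which refresh-token IDs (JTIs) are still redeemable."""

    def __init__(self) -> None:
        self._active: set[str] = set()

    def issue(self) -> str:
        """Create and remember a new JTI."""
        jti = uuid.uuid4().hex
        self._active.add(jti)
        return jti

    def rotate(self, presented_jti: str) -> str:
        """Invalidate the presented JTI and hand back a fresh one."""
        if presented_jti not in self._active:
            # Unknown or already-used JTI: treat as a replay attempt.
            raise PermissionError("refresh token already used or unknown")
        self._active.remove(presented_jti)
        return self.issue()
```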
---
## Current State
- **Phase:** Go-to-Market Validation (Pre-PMF)
- **Backend:** Complete (55+ API endpoints, 100+ integration tests)
- **Frontend:** Core features complete, Tree Editor functional
- **Database:** PostgreSQL with Docker, 98 migrations
- **Detailed status:** [CURRENT-STATE.md](CURRENT-STATE.md)
### What's In Progress
- GTM validation: Shadow & Ship — founder dogfooding for 2 weeks, then 5 colleague pilot
- Solutions Library spec written (`docs/plans/2026-03-23-solutions-library-design.md`), implementation post-pilot
- Remaining open issues: #66 Templates + Import/Export, #60 Recurring Issue Detection, #58 Step Feedback Flag
---
## Tech Stack
### Backend
- **Framework:** Python FastAPI
- **Database:** PostgreSQL 16 (async via SQLAlchemy 2.0 + asyncpg)
- **Migrations:** Alembic
- **Auth:** JWT (python-jose) + bcrypt, refresh token rotation (JTI-based)
- **Validation:** Pydantic v2
- **Scheduling:** APScheduler 3.x (async, in-process with FastAPI lifespan) + croniter + pytz
### Frontend
- **Framework:** React 19 + Vite + TypeScript
- **Styling:** Tailwind CSS v4 (`@tailwindcss/vite` plugin, CSS-only config in `index.css`) — flat dark theme with ember orange accent (see [DESIGN-SYSTEM.md](DESIGN-SYSTEM.md))
- **State:** Zustand (with immer + zundo for undo/redo)
- **Routing:** React Router v7
- **API Client:** Axios with token refresh interceptor
- **Icons:** Lucide React
---
## Key Project Structure
## Project structure
```
patherly/
resolutionflow/
├── backend/
│ ├── app/
│ │ ├── main.py # FastAPI entry point
│ │ ├── api/endpoints/ # Route handlers (auth, trees, sessions, admin, steps, survey, copilot, assistant_chat, integrations)
│ │ │ ├── flow_proposals.py # Knowledge Flywheel review queue CRUD
│ │ │ └── flowpilot_analytics.py # FlowPilot dashboard metrics
│ │ ├── api/deps.py # Auth dependencies (includes require_team_admin)
│ │ ├── api/router.py # Route registration
│ │ ├── core/ # config, database, permissions, security, audit, rate_limit
│ │ ├── models/ # SQLAlchemy models (includes FlowProposal)
│ │ ├── schemas/ # Pydantic schemas
│ │ └── services/psa/ # PSA provider abstraction (base, connectwise/, autotask/, halopsa/, cache, encryption, registry, types)
│ ├── services/knowledge_flywheel.py # AI session analysis → flow proposals
│ ├── services/knowledge_flywheel_scheduler.py # APScheduler job for batch analysis
│ └── services/knowledge_gap_service.py # Weak options & escalation signal detection
│ ├── alembic/ # Database migrations (001-029+)
│ ├── scripts/ # seed_data.py, seed_trees.py
│ └── tests/ # pytest integration tests
│ │ ├── main.py # FastAPI entry
│ │ ├── api/endpoints/ # auth, trees, sessions, admin, steps, survey, copilot, assistant_chat, integrations, flow_proposals, flowpilot_analytics
│ │ ├── api/deps.py # auth deps (incl. require_team_admin)
│ │ ├── api/router.py # registration
│ │ ├── core/ # config, database, permissions, security, audit, rate_limit
│ │ ├── models/ # SQLAlchemy (incl. FlowProposal)
│ │ ├── schemas/ # Pydantic
│ │ ├── services/psa/ # PSA provider pattern (base, connectwise/, autotask/, halopsa/, cache, encryption, registry, types)
│ │ ├── services/knowledge_flywheel.py + _scheduler.py
│ │ └── services/knowledge_gap_service.py
│ ├── alembic/versions/ # 001-070 sequential, then hex hash
│ ├── scripts/ # seed_data, seed_trees, seed_test_users
│ └── tests/ # pytest integration
├── frontend/
│ ├── src/
│ │ ├── api/ # Axios client + endpoint modules
│ │ ├── components/ # common, layout, dashboard, tree-editor, session, procedural, procedural-editor, library, step-library, ui, flowpilot
│ │ ├── hooks/ # usePermissions, useSessionTimer, useKeyboardShortcuts
│ │ ├── pages/ # All page components
│ │ ├── store/ # Zustand stores (auth, treeEditor, proceduralEditor, userPreferences, scriptGeneratorStore)
│ │ └── types/ # TypeScript interfaces
│ └── (Tailwind v4: CSS-only config in src/index.css)
├── docs/plans/archive/ # Archived design/impl docs (pre-March 2026)
├── CLAUDE.md # This file
├── CURRENT-STATE.md # Detailed feature status
├── LESSONS-LEARNED.md # (Deprecated — consolidated into CLAUDE.md)
└── docs/plans/ # Design docs & implementation plans
│ │ ├── api/ # Axios client + endpoint modules
│ │ ├── components/ # common, layout, dashboard, tree-editor, session, procedural, procedural-editor, library, step-library, ui, flowpilot
│ │ ├── hooks/ # usePermissions, useSessionTimer, useKeyboardShortcuts
│ │ ├── pages/
│ │ ├── store/ # Zustand (auth, treeEditor, proceduralEditor, userPreferences, scriptGeneratorStore)
│ │ └── types/
│ └── (Tailwind v4 CSS-only config in src/index.css)
├── docs/plans/archive/ # pre-March 2026 plans
├── docs/connectwise/ # CW API reference + best-practices guides
├── docs/LESSONS-ARCHIVE.md # archived lessons (fixes in code)
├── CLAUDE.md · CURRENT-STATE.md · DESIGN-SYSTEM.md · DEV-ENV.md
```
---
## Environment Variables
## Design system
### Backend (`backend/.env`)
**Source of truth: [DESIGN-SYSTEM.md](DESIGN-SYSTEM.md).** Read before any visual change.
- Flat high-contrast dark theme, Sentry/PostHog-inspired. **No** glass, backdrop blur, ambient orbs, gradient surfaces.
- Accent **electric blue** (#60a5fa dark / #2563eb light) — ≤5% of UI, interactive elements only. Warning amber (#fbbf24), info cyan (#67e8f9), success green (#34d399), danger red (#f87171). Each with `-dim` at 10% opacity.
- Backgrounds: `bg-sidebar` (#0e1016) → `bg-page` (#16181f) → `bg-card` (#1e2028) → `bg-elevated` (#2a2d38). Borders `border-default` / `border-hover`.
- Text: `text-heading` → `text-primary` → `text-muted-foreground` → `text-muted`.
- Fonts: IBM Plex Sans (body), Bricolage Grotesque (heading, 700 weight for logo), JetBrains Mono (code).
- Logo: 30px gradient square (ember orange) + "ResolutionFlow" in Bricolage Grotesque. Assets in `brand-assets/`, `frontend/src/assets/brand/`, `frontend/public/icons/`.
- Mockups: `docs/mockups/` (HTML).
- **Deprecated — do not use:** glass-card, glass-stat, `bg-gradient-brand`, `backdrop-filter: blur()`, ambient orbs, purple gradients, ember orange as accent, cyan as accent (cyan is info only).
---
## ConnectWise PSA
Reference: `docs/connectwise/` — start with `CONNECTWISE-API-REFERENCE.md`, then the `best-practices/` guides. Extracted OpenAPI spec in `connectwise-psa-resolutionflow-reference.json` (670 endpoints, v2025.16); full spec in `connectwise-psa-openapi-full.json`.
- **Auth:** API Key (Base64 `companyId+publicKey:privateKey`) + `clientId` header every request. `clientId` is server-side (`CW_CLIENT_ID` in `config.py`) — identifies ResolutionFlow, not per-tenant. Per-connection: `company_id`, `public_key`, `private_key`, `server_url`.
- **Architecture:** `services/psa/` provider pattern — `PSAProvider` base, `ConnectWiseProvider` impl, `PsaProviderRegistry` for multi-PSA dispatch. Credentials encrypted at rest via `services/psa/encryption.py` (Fernet). Per-team credentials, never per-user. Endpoints in `api/endpoints/integrations.py`. In-memory TTL cache in `services/psa/cache.py`.
- **Integration flows:** session docs → ticket notes (`POST /service/tickets/{id}/notes`, markdown supported); ticket context → FlowPilot; callbacks via `/system/callbacks` with HMAC verification.
- **API rules:** pin version via Accept header `application/vnd.connectwise.com+json; version=2025.16`. Paginate ≤1000/page. Dynamic base URL via `/login/companyinfo/{companyId}`. Request minimal permissions (MY, not ALL).
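
The auth and versioning rules above can be combined into one header-building helper. This is a sketch rather than the code in `services/psa/`: the function name `build_connectwise_headers` is hypothetical, and the use of the `Basic` scheme for the Base64 credential is stated here as an assumption about how the API key is sent.

```python
import base64


def build_connectwise_headers(
    company_id: str,
    public_key: str,
    private_key: str,
    client_id: str,
    api_version: str = "2025.16",
) -> dict[str, str]:
    """Build per-request headers for a ConnectWise PSA call."""
    # Credential format from the notes above: companyId+publicKey:privateKey
    raw = f"{company_id}+{public_key}:{private_key}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {
        "Authorization": f"Basic {token}",  # assumed scheme
        "clientId": client_id,  # identifies the app, not the tenant
        "Accept": f"application/vnd.connectwise.com+json; version={api_version}",
    }
```

The per-connection values (`company_id`, `public_key`, `private_key`) would come from the decrypted team credentials, while `client_id` would come from server-side config.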
---
## Dev commands
Full setup in [DEV-ENV.md](DEV-ENV.md) (host-agnostic, with homelab Proxmox reference topology). Day-to-day:
```bash
APP_NAME=ResolutionFlow
DEBUG=true
DATABASE_URL=postgresql+asyncpg://postgres:postgres@localhost:5432/patherly
DATABASE_URL_SYNC=postgresql://postgres:postgres@localhost:5432/patherly
SECRET_KEY=<openssl rand -hex 32>
ACCESS_TOKEN_EXPIRE_MINUTES=5
REFRESH_TOKEN_EXPIRE_DAYS=7
REQUIRE_INVITE_CODE=true
docker compose -f docker-compose.dev.yml up -d # start stack
cd backend && source venv/bin/activate && uvicorn app.main:app --reload
cd frontend && npm run dev
pytest --override-ini="addopts=" # tests (first time: CREATE DATABASE resolutionflow_test)
cd backend && alembic upgrade head # migrate
cd backend && alembic revision -m "desc" # manual migration (preferred per Lesson 77)
cd backend && alembic revision --autogenerate -m "desc" # picks up drift; review carefully
cd frontend && npm run build # stricter than tsc --noEmit — final check
cd frontend && npx tsc -b # TS-only check when dist/ has EACCES
docker exec -it resolutionflow_postgres psql -U postgres -d resolutionflow
python -m scripts.seed_trees # seed (from backend/)
```
### Frontend (`frontend/.env.local` - optional)
**URLs:** Frontend <http://localhost:5173>, backend <http://localhost:8000>, API docs <http://localhost:8000/api/docs>.
```bash
VITE_API_URL=http://localhost:8000
```
**Test users** (all password `TestPass123!`): `admin@resolutionflow.example.com` (super_admin), `teamadmin@resolutionflow.example.com`, `engineer@resolutionflow.example.com`, `pro@resolutionflow.example.com`.
**CI:** Gitea (`gitea.resolutionflow.com/chihlasm/resolutionflow/actions`). `gh` CLI works for issues/PRs on the GitHub mirror, but not CI runs.
**Never pass `--rev-id`** to alembic — let it generate the hex hash.
---
## ConnectWise PSA Integration
## Common tasks
ResolutionFlow integrates with ConnectWise PSA (formerly Manage) as the primary PSA integration. All ConnectWise API reference materials live in `docs/connectwise/`.
### Best Practices Documentation
Official ConnectWise developer guides live in `docs/connectwise/best-practices/`. Read these BEFORE implementing any CW API integration code:
- `PSA-API-Requests.md` — HTTP methods, response codes, condition query syntax, PATCH format, URL encoding, partial responses, custom fields. READ FIRST.
- `PSA-Callbacks.md` — Callback type/level matrix, retry behavior, URL parameter gotcha, HMAC signature verification.
- `PSA-Pagination.md` — Navigable vs Forward-Only pagination, Link headers, while-loop pattern.
- `PSA-Service-Tickets.md` — Ticket field philosophy, recommended field mappings.
- `PSA-Versioning.md` — Pin API version via Accept header. Use `application/vnd.connectwise.com+json; version=2025.16`.
- `PSA-Cloud-URL-Formatting.md` — Dynamic base URL construction via `/login/companyinfo/{companyId}`.
- `Bundled-Requests.md` — Batch multiple API calls into one request via `/system/bundles`.
- `PSA-Markdown.md` — Ticket notes support markdown. Format session documentation output accordingly.
- `PSA-Company-Synchronization.md` — Filter companies by Status/Type for mapping UI.
- `PSA-Data-Protection.md` — Security role model, request minimal permissions (MY not ALL).
### Reference Files (read in this order)
1. `docs/connectwise/CONNECTWISE-API-REFERENCE.md` — Read FIRST. Quick reference covering auth patterns, tiered endpoint map, key field mappings, and integration architecture flows.
2. `docs/connectwise/connectwise-psa-resolutionflow-reference.json` — Extracted OpenAPI 3.0.1 spec (v2025.16) with only the 670 endpoints and 342 schemas relevant to ResolutionFlow. Use for exact field types, request/response shapes, and parameter details.
3. `docs/connectwise/connectwise-psa-openapi-full.json` — Complete ConnectWise PSA OpenAPI spec (1838 endpoints, 842 schemas). Only consult if you need an endpoint outside the extracted subset.
### Integration Architecture
- **Session → Ticket Notes:** Post auto-generated session documentation to ConnectWise tickets as internal analysis notes via `POST /service/tickets/{id}/notes`
- **Ticket Context → Session Runner:** Pull ticket details, company info, and attached configurations to give FlowPilot AI real-world context
- **Callbacks:** Register webhooks via `/system/callbacks` for real-time ticket event notifications to suggest relevant Flows
### Key Implementation Rules
- Auth: API Key auth (Base64 of `companyId+publicKey:privateKey`) + `clientId` header on every request
- `clientId` is server-side config (`CW_CLIENT_ID` in `config.py`) — identifies the ResolutionFlow app, NOT per-tenant. Per-connection credentials: `company_id`, `public_key`, `private_key`, `server_url`
- All PSA integration code in `services/psa/` — provider pattern with `PSAProvider` abstract base class, `ConnectWiseProvider` implementation, `PsaProviderRegistry` for multi-PSA dispatch
- PSA endpoints in `api/endpoints/integrations.py` — connection CRUD, ticket ops, member mapping
- Credentials encrypted at rest via `services/psa/encryption.py` (Fernet)
- Each MSP tenant provides their own CW credentials — ResolutionFlow stores these per-team, never per-user
- Design for the Autotask integration following the same service layer pattern (future PSA)
- In-memory TTL cache in `services/psa/cache.py` for board/status/priority lookups
- Respect CW API: paginate with max 1000 per page, handle retries gracefully
- **New endpoint:** `endpoints/` → `router.py` → `schemas/` → tests → frontend API client.
- **New page:** `pages/` → route in `router.tsx` → nav in `AppLayout.tsx`.
- **New public route:** top-level in `router.tsx` alongside `/login`, not inside `ProtectedRoute`.
- **New frontend API module:** types in `types/` → export from `types/index.ts` → client in `api/` → export from `api/index.ts`.
- **Schema change:** update model → `alembic revision -m "desc"` → review → `alembic upgrade head`.
- **New `VITE_*` env var:** add as `ARG` + `ENV` in `frontend/Dockerfile` for Railway builds (Lesson 60 — Railway env vars are runtime-only, Vite bakes at build time).
- **Account sub-page:** add route in `router.tsx` under `account` children + add link card in `AccountSettingsPage.tsx`; `AccountLayout` has NO sidebar nav.
---
## Development Commands
## Coding standards
```powershell
# Start PostgreSQL
docker start patherly_postgres
- **Python:** type hints everywhere, async/await for DB, Pydantic v2, `DateTime(timezone=True)` always.
- **TypeScript:** interfaces for all data, `const` over `let`, functional components + hooks, shared logic in custom hooks.
- **Git:** feature branch before committing (`git checkout -b feat/feature-name`). Format: `type: description` (feat/fix/refactor/docs/test/chore). Always `Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>`. Large features: commit per phase with `npm run build` validation. Push to Gitea — auto-mirrors to GitHub (`.gitea/workflows/mirror-to-github.yml`); never push GitHub directly.
# Backend (from backend/)
source venv/bin/activate # Linux/Mac
# .\venv\Scripts\Activate # Windows
uvicorn app.main:app --reload
# Frontend (from frontend/)
npm run dev
# Run tests (from backend/)
pytest --override-ini="addopts="
# First time only: create test database
docker exec -it patherly_postgres psql -U postgres -c "CREATE DATABASE patherly_test;"
# Frontend build (IMPORTANT: stricter than tsc --noEmit — always use as final check)
cd frontend && npm run build
# Database migrations
cd backend && alembic upgrade head
alembic revision --autogenerate -m "Description" --rev-id=NNN # NNN = next sequential number
# IMPORTANT: Migrations use sequential 3-digit IDs (001, 002, ..., 068, 069).
# Check the latest: ls backend/alembic/versions/ | grep -E '^[0-9]{3}_' | sort | tail -1
# The revision ID and filename prefix MUST match (e.g., revision="068", file=068_description.py).
# down_revision MUST point to the previous sequential number. Never use hex hash IDs for new migrations.
# Access PostgreSQL
docker exec -it patherly_postgres psql -U postgres -d patherly
# Seed data
cd backend && pip install httpx && python -m scripts.seed_trees
# CI/CD debugging
gh run list --limit 5 # Recent CI runs
gh run view <id> --log-failed # Failed job logs
gh run view <id> --json jobs --jq '.jobs[] | {name: .name, conclusion: .conclusion}'
# NEVER use `gh run watch` — it holds context open and burns tokens while waiting
```
### URLs
- Frontend: <http://localhost:5173>
- Backend API: <http://localhost:8000>
- API Docs: <http://localhost:8000/api/docs>
### Test Users (seeded via `scripts/seed_test_users.py`)
- All share password: `TestPass123!`
- `admin@resolutionflow.example.com` (super_admin), `teamadmin@resolutionflow.example.com` (team_admin), `engineer@resolutionflow.example.com` (engineer), `pro@resolutionflow.example.com` (solo pro)
**After shipping:** update `CURRENT-STATE.md` + `03-DEVELOPMENT-ROADMAP.md`, `gh issue close #N` for resolved issues, add lessons here only for non-obvious traps (otherwise let the code speak).
---
## Critical Lessons Learned
> Lessons 1-40 archived to `docs/LESSONS-ARCHIVE.md` — fixes are baked into the codebase. Consult if you hit a regression.
### Active Lessons (41+)
**41. Assistant chat uses local React state, not Zustand:** `AssistantChatPage.tsx` uses `useState` for `chats`, `messages`, `input`, `loading`. No store.
**42. Public pages use raw `fetch()`, not `apiClient`:** Survey, shared sessions, and no-auth pages use `fetch()` with full URL. `apiClient` requires auth tokens.
**43. Adding new email types:** Add static async method to `EmailService` in `core/email.py`. Fire-and-forget from endpoints (log errors, don't fail).
**44. AI Chat Builder is flow-type-aware:** `ai_chat_service.py` dispatches by `flow_type`. Troubleshooting: `[TREE_UPDATE]` markers. Procedural: `[STEPS_UPDATE]` markers. Both support `[METADATA]`.
**45. Intake form field schema:** Uses `variable_name` and `field_type` (NOT `name` and `type`).
**46. `CreateFlowDropdown` uses `AIPromptDialog`:** Opens prompt modal, starts AI session, generates flow, navigates to editor with `{ state: { aiPanelOpen: true, sessionId } }`.
**47. Editor-Embedded Flow Assist:** `EditorAIPanel` (320px side panel) + `useEditorAI` hook. Ghost nodes use `_suggestion: true` flag. Actions route to model tiers via `settings.get_model_for_action()`. Delta responses use `[DELTA]...[/DELTA]` markers.
**48. Tree orphan validation uses dynamic root ID:** Orphan check compares against `state.treeStructure?.id` (NOT hardcoded `'root'`).
**49. Full-stack features — verify both ends:** Check the full data flow: schema → endpoint → API client → hook → store → UI.
**50. Anthropic SDK retry:** Set `max_retries=1` to fail fast. Default `max_retries=2` can take 3× timeout.
**51. AI model tier routing:** Use `settings.get_model_for_action(action_type)`. Model IDs: use alias form (`claude-sonnet-4-6`).
**52. Mobile scroll-to-top:** Use `ref.current.scrollIntoView()`, not `window.scrollTo()`. Trigger via `useEffect`.
**53. Flex height chain:** Every ancestor must be a flex container for `flex-1` to work. Missing `flex` class collapses React Flow to 0 height.
**54. React Flow CSS in Tailwind v4:** Import in `index.css`, not component JS. Override dark theme using `--xy-*` CSS custom properties.
**55. App shell height chain:** Every wrapper between `.main-content` and canvas needs `flex` + `flex-1` + `min-h-0` or `h-full`.
**56. Railway backend service name is `patherly`:** Production DB name is `railway`. Public Postgres proxy: `interchange.proxy.rlwy.net:45797`.
**57. Node field priority:** `title` → `question` → `description` → `content` → `label`. See `copilot_service.py`.
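A minimal sketch of the lookup order in lesson 57, for illustration only. The function name is hypothetical, and skipping blank strings is an assumption; the real logic lives in `copilot_service.py` and may differ.

```python
from typing import Any, Optional

# Priority order as listed in lesson 57.
FIELD_PRIORITY = ("title", "question", "description", "content", "label")


def node_display_text(node: dict[str, Any]) -> Optional[str]:
    """Return the first non-blank field in priority order, or None."""
    for field in FIELD_PRIORITY:
        value = node.get(field)
        # Assumption: empty or whitespace-only strings fall through to the next field.
        if isinstance(value, str) and value.strip():
            return value.strip()
    return None
```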
**58. `scriptGeneratorStore.generate()` optional param:** Always wrap: `onClick={() => generate()}`, never `onClick={generate}`.
**59. ConnectWise `clientId` is server-side config:** Set in `config.py` as `CW_CLIENT_ID`. Per-connection: `company_id`, `public_key`, `private_key`, `server_url`.
**60. Dockerfile build args for Vite env vars:** Any new `VITE_*` or `VITE_PUBLIC_*` env var must be added as `ARG` + `ENV` in `frontend/Dockerfile` for Railway deploys. Railway env vars are runtime-only unless explicitly passed through as Docker build args. Without this, `import.meta.env.VITE_*` resolves to `undefined` in production builds.
**61. Procedural sessions auto-start on page load:** `ProceduralNavigationPage` calls `startSession()` immediately in `loadTree()` — there is no intake form screen or "Start" button. Variables are filled inline during execution. Troubleshooting flows DO have a start screen with ticket/client fields. Don't write tests or UI that assume a Start button on procedural flows.
**62. Playwright strict mode — scope selectors to avoid ambiguity:** Step titles appear in both the sidebar checklist and main content heading. Use `getByRole('heading', { name })` for the main content, or scope with `page.locator('.animate-scale-in')` for command palette items. `getByText()` frequently matches multiple elements due to the sidebar + main content layout.
**63. Node 20 required for frontend builds:** Vite 7+ requires Node 20.19+. The system Node may be v18; use nvm: `export NVM_DIR="$HOME/.nvm" && source "$NVM_DIR/nvm.sh" && nvm use 20`. For direct binary access without nvm sourcing: `PATH="/home/michaelchihlas/.nvm/versions/node/v20.19.0/bin:$PATH"`.
**64. PostHog product analytics:** Initialized via `PostHogProvider` in `main.tsx` with explicit `posthog.init()` + `client` prop pattern. Event helpers in `lib/analytics.ts` — use `analytics.eventName(props)` to track. `identifyUser()` called in `authStore.fetchUser()`, `resetAnalytics()` on logout. Env vars: `VITE_PUBLIC_POSTHOG_KEY`, `VITE_PUBLIC_POSTHOG_HOST`. Autocapture enabled.
**65. Local Docker Compose uses `resolutionflow` database on port 5433:** Container name is `resolutionflow_postgres`, database is `resolutionflow` (not `patherly`), port mapped to `5433` (not `5432`). The `POSTGRES_PORT` env var controls this. Playwright config defaults must match: `postgresql+asyncpg://postgres:postgres@127.0.0.1:5433/resolutionflow`.
**66. Dev environment runs on Hostinger VPS (46.202.92.250), not localhost:** Code-server runs in Docker on a VPS (previously devserver01/192.168.0.9). Frontend/backend are accessed via `46.202.92.250`, not `localhost`. CORS must include the VPS IP in `CORS_ORIGINS` and `FRONTEND_URL`. Frontend `.env` must set `VITE_API_URL` to the VPS backend URL. See [DEV-ENV.md](DEV-ENV.md) for full setup, Docker config, networking, and known issues.
**67. Tree editor route is `/trees/new`:** NOT `/editor/new`. Check `router.tsx` line 156 for the canonical path. Use `getTreeEditorPath()` from `@/lib/routing` when navigating programmatically.
**68. APScheduler jobs need `max_instances=1`:** Without it, overlapping scheduler runs can process the same records twice (TOCTOU race). Always set `max_instances=1` on interval jobs in `main.py`.
**69. PostgreSQL `func.sum(case(...))` returns `Decimal` via asyncpg:** Cast to `int()` before storing in Pydantic `dict[str, Any]` fields, or JSON serialization may produce unexpected types.
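A minimal sketch of the cast described in lesson 69. The helper name and the sample row are hypothetical, and the standard-library `json` module stands in for the real serialization path, which may behave differently under Pydantic.

```python
import json
from decimal import Decimal
from typing import Any


def normalize_counts(row: dict[str, Any]) -> dict[str, Any]:
    """Cast whole-number Decimal values to int; leave everything else untouched."""
    result: dict[str, Any] = {}
    for key, value in row.items():
        if (
            isinstance(value, Decimal)
            and value.is_finite()
            and value == value.to_integral_value()
        ):
            result[key] = int(value)
        else:
            result[key] = value
    return result


# Hypothetical aggregate row, shaped like a SUM(CASE ...) result.
row = {"resolved": Decimal("3"), "escalated": Decimal("0"), "team": "ops"}
serialized = json.dumps(normalize_counts(row))
```

With the standard-library encoder, `json.dumps(row)` on the uncast row raises `TypeError`, which is one concrete form the "unexpected types" problem can take.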
**70. Toast library uses `toast.warning()` not `toast.warn()`:** Import from `@/lib/toast`. Methods: `success`, `error`, `warning`, `info`. See `frontend/src/lib/toast.ts`.
**71. Enhancement/branch_addition proposals cannot be directly approved:** Backend returns 400 — they require `modified_flow_data` via "Edit & Publish" flow. Only `new_flow` proposals support direct approve.
**72. `ai_sessions.status` column is `VARCHAR(30)`:** Must fit `requesting_escalation` (21 chars, which is why the old 20-char column was too narrow). If adding new status values, verify length. Migration `f0aad74ea51b` widened from 20→30.
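The length check in lesson 72 can be sketched as a small guard, for example in a unit test. The helper name is hypothetical; the two status values are the ones mentioned in this document.

```python
STATUS_COLUMN_LENGTH = 30  # ai_sessions.status is VARCHAR(30) per lesson 72


def too_long_statuses(statuses: list[str], limit: int = STATUS_COLUMN_LENGTH) -> list[str]:
    """Return every status value that would not fit in a column of `limit` chars."""
    return [status for status in statuses if len(status) > limit]


known = ["abandoned", "requesting_escalation"]
fits_now = too_long_statuses(known)            # checked against the current width
fit_before = too_long_statuses(known, limit=20)  # checked against the old width
```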
**73. `get_db` rolls back on exception:** The dependency does `await session.rollback()` on error to prevent `InFailedSQLTransaction` cascade. Never remove this — without it, one failed request poisons subsequent requests on the same connection.
**74. FlowPilot action bar height chain:** The action bar (Resolve/Escalate/Pause) requires every ancestor from `app-shell` grid down to have proper flex constraints. Key fix: `ViewTransitionOutlet` wrapper needs `flex flex-col`. If action bar disappears, check height chain with DevTools `getBoundingClientRect()` walk.
**75. Dashboard prefill auto-submits:** `StartSessionInput` navigates to `/pilot` or `/assistant` with `{ state: { prefill } }`. `FlowPilotSessionPage` auto-submits via `useEffect` + `prefillHandledRef` guard — no double-enter. `AssistantChatPage` does the same pattern.
**76. Active session navigation guard:** `FlowPilotSessionPage` uses `useBlocker` (same as `TreeEditorPage`) to intercept navigation during active sessions. "Pause & Leave" auto-pauses before proceeding.
**77. Prefer manual Alembic migrations for targeted changes:** `alembic revision --autogenerate` picks up drift from all tables. For single-column fixes, use `alembic revision -m "desc"` and write `op.alter_column()` manually.
**78. Landing page subtitle is "AI-Powered Troubleshooting for MSPs":** Not "Decision Tree Platform". This tagline appears on login, register, and the HTML `<title>`. The old "Decision Tree Platform" was internal jargon misaligned with user-facing branding.
**79. Custom modals must be mobile-responsive:** Use `items-end sm:items-center` (bottom-sheet on mobile, centered on desktop) and `max-w-full sm:max-w-lg` (full-width on mobile). The shared `Modal.tsx` does this correctly — custom modal implementations must follow the same pattern. See `PrepareSessionModal.tsx` for the fix pattern.
**80. TopBar search collapses to icon on mobile:** Full search bar (`hidden sm:block`) shows on desktop; magnifying glass icon button (`sm:hidden`) shows on mobile (<640px). Both open the same CommandPalette. Don't add `w-full` search bar without the mobile icon fallback.
**81. Never use `transition: all` in landing.css:** Specify exact properties: `transition: background 0.3s, border-color 0.3s, box-shadow 0.3s, transform 0.3s, opacity 0.3s`. `transition: all` animates layout properties and causes jank.
**82. `bun` requires PATH setup on devserver01:** `export BUN_INSTALL="$HOME/.bun" && export PATH="$BUN_INSTALL/bin:$PATH"`. The gstack browse binary and Playwright need this. Chromium system deps: `libatk1.0-0 libatk-bridge2.0-0 libcups2 libxkbcommon0 libatspi2.0-0 libxcomposite1 libxdamage1 libxfixes3 libxrandr2 libgbm1 libasound2`.
**83. FlowPilot ActionBar is `position: fixed; bottom: 0`:** Any UI element placed in normal document flow below the session content will be hidden behind it. New fixed-position elements (like the message bar) must use `bottom: 68px` (action bar height) and the same `left: var(--sidebar-w)` pattern. The conversation column uses `pb-32` for clearance.
**84. AI session `abandoned` status is fully wired:** `POST /ai-sessions/{id}/abandon` sets status to `abandoned` with optional `reason` param. Frontend: `aiSessionsApi.abandonSession()`, `useFlowPilotSession().abandonSession()`, "Close" button in `FlowPilotActionBar`. Redirects to `/sessions` after closing.
**85. Date range filter end dates must use end-of-day:** `toDate.toISOString()` sends midnight (start of day), excluding items created later that day. Always set `toDate.setHours(23, 59, 59, 999)` before sending. For string-based date inputs (AI sessions), append `T23:59:59.999Z`. See `SessionHistoryPage.tsx`.
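A Python sketch of the string-based case from lesson 85, showing why the midnight bound drops same-day items. The function names are hypothetical and the filter is simplified to a single comparison.

```python
from datetime import date, datetime, timezone


def end_of_day_iso(date_str: str) -> str:
    """Turn 'YYYY-MM-DD' into an inclusive end-of-day UTC timestamp string."""
    day = date.fromisoformat(date_str)  # raises ValueError on malformed input
    return f"{day.isoformat()}T23:59:59.999Z"


def in_range(created_at: datetime, to_date: str) -> bool:
    """True if created_at falls on or before the end of to_date (UTC)."""
    upper = datetime.fromisoformat(end_of_day_iso(to_date).replace("Z", "+00:00"))
    return created_at <= upper


# An item created mid-afternoon on the last day of the range.
created = datetime(2026, 4, 24, 15, 30, tzinfo=timezone.utc)
midnight_bound = datetime(2026, 4, 24, tzinfo=timezone.utc)
```

Comparing `created` against `midnight_bound` excludes the item, while `in_range(created, "2026-04-24")` includes it.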
**86. Script Builder system:** AI-powered script generation at `/script-builder`. Chat-style interface generates PowerShell/Bash/Python scripts from natural language. Backend: `ScriptBuilderSession` model, `script_builder_service.py`, endpoints at `/scripts/builder/`. Frontend: `ScriptBuilderPage`, `ScriptCodeBlock`, `ScriptPreviewModal`, `SaveToLibraryDialog`. FlowPilot can hand off to Script Builder via `action_type: "open_script_builder"` with `sessionStorage` context passing.
**87. FlowPilot must ask GUI vs script preference:** When a task can be done via GUI or script (e.g., creating AD users), FlowPilot must ask the engineer which approach they prefer BEFORE suggesting either. Never assume the user wants a script. See `FLOWPILOT_SYSTEM_PROMPT` rules in `flowpilot_engine.py`.
**88. Charcoal palette — sidebar-darkest approach:** Sidebar `#0e1016`, page `#16181f`, cards `#1e2028`, borders `#2a2e3a`. This gives more contrast range than true-dark. All colors via CSS variables in `index.css` `@theme` block. Accent is electric blue (#60a5fa), not orange or cyan.
**92. `tsc -b` in Dockerfile is stricter than `npx tsc --noEmit`:** The production build (`tsc -b && vite build`) enforces `noUnusedLocals` and `noUnusedParameters` as hard errors. After any refactor that moves logic between components or removes features, trace every import and destructured prop to remove orphans. IDE warnings (yellow squiggles) flag these — check them before pushing.
**93. FlowPilot actions live in the page header, not a bottom bar:** `FlowPilotSessionPage` renders Resolve/Escalate/Share Update in the header bar. Desktop: inline buttons + `⋯` overflow (Pause/Close). Mobile: single `⋯` menu. The bottom only has the message input. `FlowPilotActionBar` component still exists but is no longer used in the main session flow.
**94. Frontend chat uses unified_chat_service, not assistant_chat_service:** `AssistantChatPage` calls `/ai-sessions/{id}/chat` → `unified_chat_service.py`. The old `assistant_chat_service` endpoints were removed (only retention settings remain at `/assistant/retention`). When tracing chat features, start from `aiSessionsApi.sendChatMessage` → `ai_sessions.py` → `unified_chat_service.py`. Never wire chat features into `assistant_chat.py`.
**95. Image upload → AI vision pipeline:** Paste/attach images → upload to Railway S3 bucket via `uploadsApi.upload()` → send `upload_ids` with chat message → backend fetches from S3 via `storage_service.download_file()` → resized via `storage_service.resize_image_for_vision()` (Pillow, 1568px max, PNG→JPEG) → base64-encoded → sent as Claude multimodal content blocks. Max 3 images/message. Images are NOT stored in conversation history (text-only). Vision helpers live in `storage_service.py`.
**96. `bg-accent` is ember orange — never use for code/kbd elements:** In Tailwind v4, `bg-accent` maps to `--color-accent: #f97316`. Use `bg-code` for code blocks, `bg-white/[0.12] border border-white/[0.06]` for inline code/badges, `bg-white/[0.08]` for kbd shortcuts. Orange is reserved for interactive elements only (buttons, active nav, links).
**97. Railway Object Storage (S3 bucket) is provisioned:** Bucket `resolutionflow-uploads` on Railway canvas. Variables: `STORAGE_ENDPOINT`, `STORAGE_ACCESS_KEY`, `STORAGE_SECRET_KEY`, `STORAGE_BUCKET_NAME`, `STORAGE_REGION` — mapped via variable references on the `patherly` backend service. Accessed via boto3 in `storage_service.py`. Pillow (`Pillow>=10.0.0`) + `libjpeg-dev`/`zlib1g-dev` in Dockerfile for image resize.
**98. `lazyWithRetry` for stale chunk errors:** All lazy-loaded routes use `lazyWithRetry` from `@/lib/lazyWithRetry.ts` instead of `React.lazy`. Auto-reloads the page on chunk load failures (stale deploys). Uses sessionStorage debounce (10s) to prevent loops. When adding new lazy routes, use `lazyWithRetry`, not `lazy`.
**99. Tailwind v4 `text-secondary` renders invisible on dark backgrounds:** `text-secondary` maps to `--color-secondary: #2e3140` (a dark surface color), NOT `--color-text-secondary`. For readable secondary text, use `text-muted-foreground` (`#848b9b`). Also avoid `text-muted` (`#4f5666`) for body text — it's for labels only. This applies to ALL new components.
**100. Hover pop-out card pattern:** For cards that expand on hover "in front of everything": use `pointer-events-none` on the scrim (`fixed inset-0 z-40 bg-black/30`), absolute-position the expanded card at `z-50` with its own `onClick` handler, and dismiss via `onMouseLeave` on the wrapper div. Never put interactive event handlers on the scrim — it blocks clicks on sibling elements.
**101. AI marker format compliance:** The AI assistant uses `[QUESTIONS]`, `[ACTIONS]`, and `[FORK]` markers in responses. Parsed by `unified_chat_service.py` (`_parse_*_marker` functions), returned as structured data in the API response. System prompt in `assistant_chat_service.py` has a final reminder section, and each user message gets an invisible `[SYSTEM: ...]` reminder appended in `_call_anthropic_cached()`. If markers stop appearing: check conversation history stores `display_content` (stripped), verify system prompt final reminder exists, check user message reminder injection is active.
**102. TaskLane activation must happen in ALL chat response paths:** `AssistantChatPage.tsx` has three code paths calling `sendChatMessage`: `handleSend` (regular messages), `sendPrefill` (dashboard handoff), `handleResumeNew` (resume from concluded session). ALL three must check `response.actions`/`response.questions` and call `setShowTaskLane(true)`. Missing this in any path causes TaskLane to not appear on first message.
**103. Docker not available in code-server container:** The dev environment runs code-server inside Docker on the VPS. The `docker` CLI is not available inside the code-server container. To query the database, use the VPS SSH session: `docker exec resolutionflow_postgres psql -U postgres -d resolutionflow -t -c "SQL"`. Python is also not available in the container.
**104. `landing.css` uses self-contained `--lp-*` color variables:** The landing page defines its own color palette at the top of `landing.css` (`--lp-bg`, `--lp-accent`, `--lp-text-*`, etc.). Never use `var(--color-*)` theme tokens in `landing.css` — they may resolve incorrectly outside the app shell context. Extend the `--lp-*` palette for any new landing page colors.
**105. `npm run build` fails with `EACCES: permission denied` on `dist/` in code-server:** This is a filesystem permission issue in the Docker environment, not a TypeScript error — the TS compilation completes successfully. Use `npx tsc -b` to verify TypeScript cleanly without needing to write to `dist/`.
**106. Guard async "select item → load data → apply state" flows with a ref:** When a component lets the user switch between items (chat sessions, flows, scripts) and loads data asynchronously on each switch, the load for item A can complete *after* the user has already switched to item B — overwriting B's state with A's stale data. Fix pattern: keep a `currentSelectionRef = useRef(initialId)` and update it synchronously whenever the selection changes (in every creation/switch path). After every `await`, bail out if `currentSelectionRef.current !== thisItemId`. See `AssistantChatPage.tsx` `selectChat` for the reference implementation (`currentChatRef`).
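The guard in lesson 106 is a React `useRef` pattern, but the underlying idea is language-independent. The following is an `asyncio` analog in Python, offered only as an illustration of the control flow; the class, its attributes, and the simulated delays are all hypothetical, and the reference implementation remains `selectChat` in `AssistantChatPage.tsx`.

```python
import asyncio
from typing import Optional


class ChatSelector:
    """Keeps only the result of the most recent selection."""

    def __init__(self) -> None:
        self.current_id: Optional[str] = None  # plays the role of currentSelectionRef
        self.messages: list[str] = []

    async def select(self, chat_id: str, delay: float) -> None:
        self.current_id = chat_id               # update synchronously, before any await
        loaded = await self._load(chat_id, delay)
        if self.current_id != chat_id:          # selection changed while we were waiting
            return                              # drop the stale result
        self.messages = loaded

    async def _load(self, chat_id: str, delay: float) -> list[str]:
        await asyncio.sleep(delay)              # stand-in for a network request
        return [f"message from {chat_id}"]


async def demo() -> list[str]:
    selector = ChatSelector()
    # "A" is selected first but loads slowly; "B" is selected second and loads fast.
    slow = asyncio.create_task(selector.select("A", 0.05))
    await asyncio.sleep(0)                      # let the first select reach its await
    fast = asyncio.create_task(selector.select("B", 0.01))
    await asyncio.gather(slow, fast)
    return selector.messages
```

Without the `current_id` comparison, the slow load for "A" would finish last and overwrite the state that belongs to "B".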
## RBAC & Permissions
- **Role hierarchy:** super_admin > team_admin > engineer > viewer
- **Team Admin:** `role='engineer'` + `is_team_admin=True` + valid `team_id`
- **Backend deps:** `get_current_active_user(user, db)` (any active + auto-downgrades expired trials), `require_engineer_or_admin` (blocks viewers), `require_admin` (super admin only)
- **Never use** `role == "admin"` — use `is_super_admin` instead
- **Frontend:** `usePermissions()` hook for all permission checks
- **Centralized:** `backend/app/core/permissions.py`, `frontend/src/hooks/usePermissions.ts`
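The hierarchy and the Team Admin rule above can be sketched as follows. This is an illustration under stated assumptions, not the contents of `backend/app/core/permissions.py`: the `User` dataclass, the `is_super_admin` field, and both function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Rank order from the role hierarchy bullet: higher number means more privilege.
RANK = {"viewer": 0, "engineer": 1, "team_admin": 2, "super_admin": 3}


@dataclass
class User:
    role: str                       # stored role, e.g. "engineer" or "viewer"
    is_super_admin: bool = False    # assumption: exposed as a boolean flag
    is_team_admin: bool = False
    team_id: Optional[int] = None


def effective_role(user: User) -> str:
    """Derive the hierarchy level; team admin is a flag on an engineer, not a stored role."""
    if user.is_super_admin:
        return "super_admin"
    if user.role == "engineer" and user.is_team_admin and user.team_id is not None:
        return "team_admin"
    return user.role


def has_at_least(user: User, minimum: str) -> bool:
    """True if the user's effective role is at or above `minimum`."""
    return RANK[effective_role(user)] >= RANK[minimum]
```

Note that a user with `is_team_admin=True` but no `team_id` resolves to plain `engineer` here, matching the "valid `team_id`" requirement.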
## Frontend patterns
- **Component basics:** `cn()` from `@/lib/utils`, Lucide icons, `Modal.tsx` for modals (mobile-responsive `items-end sm:items-center` + `max-w-full sm:max-w-lg`).
- **Types:** Create in `types/`, export from `types/index.ts`, `import type { T } from '@/types'`.
- **Routing:** `getTreeNavigatePath()` / `getTreeEditorPath()` from `@/lib/routing`. Tree editor is `/trees/new`. All dashboard session clicks → `/pilot/:id` regardless of `session_type`.
- **Lazy routes:** `lazyWithRetry` from `@/lib/lazyWithRetry.ts`, not `React.lazy` (auto-reload on stale chunks).
- **Public pages:** raw `fetch()` with full URL, NOT `apiClient` (which requires auth tokens).
- **Toast:** `toast.warning()` not `toast.warn()`. Import from `@/lib/toast` — methods: `success`, `error`, `warning`, `info`.
- **Assistant chat:** uses local React `useState`, not Zustand. All three send paths (`handleSend`, `sendPrefill`, `handleResumeNew`) must call `setShowTaskLane(true)` when response has actions/questions.
- **Chat backend wiring:** `aiSessionsApi.sendChatMessage` → `/ai-sessions/{id}/chat` → `unified_chat_service.py`. NOT `assistant_chat_service.py` (removed except retention settings).
- **FlowPilot:** Actions live in page header (Resolve/Escalate/Share Update + overflow). `useBlocker` for active-session nav guard. "Pause & Leave" auto-pauses.
- **AI markers:** `[QUESTIONS]`, `[ACTIONS]`, `[FORK]`, `[DELTA]...[/DELTA]` (editor), `[TREE_UPDATE]` (troubleshooting builder), `[STEPS_UPDATE]` (procedural builder), `[METADATA]`. Parsed in `unified_chat_service.py`; conversation history stores stripped `display_content`. If markers disappear: check system-prompt final reminder + per-user-message `[SYSTEM: ...]` injection in `_call_anthropic_cached()`.
- **Image uploads:** paste/attach → Railway S3 via `uploadsApi.upload()` → resized by `storage_service.resize_image_for_vision()` (Pillow, 1568px max, PNG→JPEG) → base64 → Claude multimodal blocks. Max 3/msg. Images NOT stored in history.
- **Async select-load-apply:** guard with a ref (pattern in `AssistantChatPage` `currentChatRef`). Update synchronously on every selection change; after every `await`, bail out if `ref.current !== thisId`.
- **Editor-Embedded Flow Assist:** `EditorAIPanel` (320px side panel) + `useEditorAI`. Ghost nodes via `_suggestion: true`. Route actions via `settings.get_model_for_action()`.
- **Script Builder:** `/script-builder`, chat-style. Backend `ScriptBuilderSession`, `script_builder_service.py`, endpoints `/scripts/builder/`. FlowPilot handoff via `action_type: "open_script_builder"` + `sessionStorage`.
- **Intake form field schema:** `variable_name` + `field_type` (NOT `name` / `type`).
- **Node field priority** (copilot, summaries): `title` → `question` → `description` → `content` → `label`.
- **Procedural sessions auto-start** on page load (no intake/Start screen). Troubleshooting flows DO have a start screen.
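For the AI markers bullet above, the split between structured data and stripped `display_content` can be sketched like this. It uses the `[DELTA]...[/DELTA]` delimiters named in this document; the JSON payload shape, the function name, and the fallback on malformed payloads are assumptions, not the behavior of the real `_parse_*_marker` functions.

```python
import json
import re
from typing import Any, Optional

# Non-greedy match so only the text between the two delimiters is captured.
_DELTA_BLOCK = re.compile(r"\[DELTA\](.*?)\[/DELTA\]", re.DOTALL)


def split_delta(raw: str) -> tuple[str, Optional[Any]]:
    """Return (display_content, parsed_delta).

    display_content is the response with the marker block removed;
    parsed_delta is the decoded payload, or None if absent or malformed.
    """
    match = _DELTA_BLOCK.search(raw)
    if match is None:
        return raw.strip(), None
    try:
        payload = json.loads(match.group(1))
    except json.JSONDecodeError:
        payload = None  # assumption: keep the text, drop the unreadable data
    display = _DELTA_BLOCK.sub("", raw).strip()
    return display, payload
```

Storing only the stripped text in conversation history is what the bullet refers to when it says history keeps `display_content`.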
---
## Design System
## Critical lessons
**Source of truth:** [DESIGN-SYSTEM.md](DESIGN-SYSTEM.md). Always read this before making visual or UI decisions.
> Lessons 1-40 archived to `docs/LESSONS-ARCHIVE.md`; fixes are baked into the codebase. **Grep the archive when an error message or symptom is unfamiliar, or after two failed attempts at resolving an issue.** Don't pre-load for routine work.
- **Theme:** Flat, high-contrast dark theme (Sentry/PostHog-inspired). No glass morphism, no backdrop blur, no ambient orbs, no gradient backgrounds on surfaces. Light mode planned.
- **Backgrounds:** `bg-page` (`#1a1c23`), `bg-sidebar` (`#10121a`), `bg-card` (`#22252e`), `bg-elevated` (`#2e3140`)
- **Cards:** `bg-card` with 1px `border-default` (`#2e3240`), 8px radius. No shadows, no blur, no gradients. Hover: `border-hover` (`#3d4252`)
- **Buttons:** Primary: solid `accent` (#f97316), white text, 5px radius. Ghost: transparent + 1px border, hover `bg-elevated`
- **Inputs:** `bg-input` (`#282b35`) with 1px `border-default`, 5px radius. Focus: `border-color: accent` + `box-shadow: 0 0 0 2px accent-dim`
- **Text:** `text-heading` (`#f0f2f5`) → `text-primary` (`#e2e5eb`) → `text-muted-foreground` (`#848b9b`) → `text-muted` (`#4f5666`). NEVER use `text-secondary` — in Tailwind v4 it maps to a surface color (#2e3140), not a text color.
- **Borders:** `border-default` (`#2e3240`), `border-hover` (`#3d4252`)
- **Functional colors:** `#34d399` (success), `#eab308` (warning), `#f87171` (danger) — each with `-dim` variant at 10% opacity
- **Accent:** Ember orange `#f97316` — used sparingly (≤5% of UI). `accent-dim` = `rgba(249,115,22,0.10)`, `accent-text` = `#fdba74`
- **Deprecated:** Do NOT use `glass-card`, `glass-stat`, `bg-gradient-brand`, `text-gradient-brand`, `backdrop-filter: blur()`, ambient orbs, purple gradients, or cyan accent (`#22d3ee`)
### Backend / data
- **APScheduler interval jobs always `max_instances=1`** — without it, overlapping runs reprocess records (TOCTOU).
- **`get_db` rolls back on exception** — never remove the `await session.rollback()`, or one failed request poisons the connection with `InFailedSQLTransaction` cascading.
- **Startup routines on tenant-isolated tables must use `_admin_session_factory()`, not `get_db()`.** Phase 4 RLS has no `app.current_account_id` set at startup. `get_service_account_id` is safe (reads cached `app.state`).
- **Backfill migrations adding `account_id`:** grep ALL `ModelClass(` sites in service code to verify `account_id=` is passed. SQLAlchemy accepts `None` silently — Phase 4 RLS WITH CHECK surfaces the problem at runtime as `InsufficientPrivilegeError: new row violates row-level security policy`.
- **`tree_shares.account_id = tree.account_id`**, never `current_user.account_id`. A super_admin sharing another tenant's tree must produce the share in the tree owner's tenant, or it becomes invisible post-RLS.
- **Global tables (no `account_id`, never in RLS migrations):** `script_categories`, `platform_steps`, `template_trees`, `plan_feature_defaults`, `accounts`. Scan at class level — one `.py` file can hold multiple classes with different columns (e.g. `ScriptCategory` vs `ScriptTemplate`).
- **`ai_sessions.status` is VARCHAR(30):** fits `requesting_escalation` (21 chars). Migration `f0aad74ea51b` widened from 20.
- **PostgreSQL `func.sum(case(...))` returns `Decimal` via asyncpg** — cast to `int()` before Pydantic `dict[str, Any]`.
- **Enhancement / branch_addition proposals need `modified_flow_data` via "Edit & Publish"** — backend 400 on direct approve. Only `new_flow` supports direct approve.
- **Adding email types:** static async method on `EmailService` in `core/email.py`. Fire-and-forget from endpoints (log errors, don't fail the request).
### AI / FlowPilot
- **Anthropic SDK `max_retries=1`** — default of 2 can take 3× the timeout.
- **Model tier routing:** `settings.get_model_for_action(action_type)`. Always alias form (`claude-sonnet-4-6`).
- **FlowPilot must ask GUI-vs-script before suggesting either** when both are viable — see `FLOWPILOT_SYSTEM_PROMPT` in `flowpilot_engine.py`.
- **Telemetry events to grep:** `anthropic.cache` (prompt-cache hit/create), `mcp.turn` (per-turn MCP availability), `mcp.fallback` (MCP silent-retry fired).
- **Don't put literal payloads in system prompts.** Bit us twice in one day: a worked `[QUESTIONS]` example with literal "Outlook + jsmith" content, and a full DNS troubleshooting tree, both caused Claude to recite that content on unrelated tickets — the symptom looked like task-lane state leaking across chats. The fix is structural: every output example in a system prompt uses `<placeholder>` syntax (`{"text": "<one short, specific question>"}`), never literal field values. Real-looking format examples live in few-shot messages (separate file, separate code path), not system prompts. Guardrail: `tests/test_prompt_anti_parrot.py` scans every `*_PROMPT`/`*_SCHEMA`/`*_PROTOCOL`/`*_FORMAT` constant in `app/services/` and `app/core/`; CI fails when a marker block contains a literal JSON value or when a known leaked token (jsmith, DC01, ADSync, Dnscache, etc.) appears anywhere in a prompt.
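The guardrail in the last bullet can be approximated in a few lines. This sketch is not `tests/test_prompt_anti_parrot.py`: the function name and regexes are hypothetical, it checks every JSON-style string value rather than only values inside marker blocks, and the token list contains just the examples named above.

```python
import re

# Tokens named in the bullet above; the real list may be longer.
LEAKED_TOKENS = ("jsmith", "DC01", "ADSync", "Dnscache")

# Finds JSON-style string values: "key": "value".
_JSON_STRING_VALUE = re.compile(r'"[^"]+"\s*:\s*"([^"]*)"')
# A value counts as a placeholder only if it is entirely wrapped in <...>.
_PLACEHOLDER = re.compile(r"^<[^<>]+>$")


def prompt_violations(prompt: str) -> list[str]:
    """Return a description of each anti-parrot violation found in a prompt."""
    problems: list[str] = []
    lowered = prompt.lower()
    for token in LEAKED_TOKENS:
        if token.lower() in lowered:
            problems.append(f"leaked token: {token}")
    for value in _JSON_STRING_VALUE.findall(prompt):
        if not _PLACEHOLDER.match(value):
            problems.append(f"literal JSON value: {value!r}")
    return problems
```

A CI test built on this would iterate over the prompt constants and fail when any of them returns a non-empty list.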
### Frontend / UI
- **Flex height chain:** every ancestor from `app-shell` grid to React Flow canvas needs `flex` + `flex-1` + `min-h-0` or `h-full`. Missing `flex` collapses to 0. Same rule for FlowPilot action bar and any tall scroller.
- **React Flow CSS in Tailwind v4:** import in `index.css`, not component JS. Override dark theme via `--xy-*` CSS vars.
- **`text-secondary` renders invisible on dark** — Tailwind v4 maps it to `--color-secondary` (a surface color). Use `text-muted-foreground` for readable secondary text. Avoid `text-muted` for body — labels only.
- **`bg-accent` is electric blue — never for code/kbd.** Use `bg-white/[0.12] border border-white/[0.06]` for inline code, `bg-white/[0.08]` for kbd. Accent reserved for interactive elements.
- **`landing.css` uses self-contained `--lp-*` vars** — never `var(--color-*)` theme tokens (they resolve incorrectly outside the app shell).
- **Never `transition: all`** — list properties explicitly, or layout props animate and jank.
- **Date range filter end dates:** `setHours(23, 59, 59, 999)` before sending, or the day's items are excluded. For string-based date inputs, append `T23:59:59.999Z`.
- **TopBar search:** full bar `hidden sm:block`, icon button `sm:hidden` — both open CommandPalette.
- **Hover pop-out cards:** scrim `pointer-events-none`, expanded card has its own click handler at `z-50`, dismiss via `onMouseLeave` on wrapper. Never put handlers on the scrim.
- **`tsc -b` in Dockerfile is stricter than `tsc --noEmit`** — enforces `noUnusedLocals` / `noUnusedParameters` as hard errors. Check IDE yellow squiggles before pushing.
- **Dashboard prefill auto-submits** via `useEffect` + `prefillHandledRef` guard — no double-enter.
- **Global Axios 5xx interceptor fires before component `.catch()`** — fix optional-data endpoints at the source (return `[]` / `{}` on provider failure), not in the component.
- **Playwright strict mode:** scope selectors to avoid sidebar/main ambiguity. Use `getByRole('heading', { name })` or `.animate-scale-in` locators, not bare `getByText()`.
### Env / infra
- **Node 20.19+ required** (Vite 7). `nvm use 20` or `PATH="$HOME/.nvm/versions/node/v20.19.0/bin:$PATH"`.
- **Railway backend service is `patherly`, DB name `railway`.** Public Postgres proxy: `interchange.proxy.rlwy.net:45797`.
- **Railway Object Storage bucket `resolutionflow-uploads`.** Env vars `STORAGE_*`. boto3 in `storage_service.py`. Dockerfile needs Pillow + `libjpeg-dev` / `zlib1g-dev`.
- **PostHog:** `PostHogProvider` + `posthog.init()` in `main.tsx`. Helpers in `lib/analytics.ts`. Env: `VITE_PUBLIC_POSTHOG_KEY`, `VITE_PUBLIC_POSTHOG_HOST`. `identifyUser()` in `authStore.fetchUser()`, `resetAnalytics()` on logout.
- **bun PATH on devserver01:** `BUN_INSTALL="$HOME/.bun"`, `PATH="$BUN_INSTALL/bin:$PATH"`. Playwright Chromium needs `libatk1.0-0 libatk-bridge2.0-0 libcups2 libxkbcommon0 libatspi2.0-0 libxcomposite1 libxdamage1 libxfixes3 libxrandr2 libgbm1 libasound2`.
- **Full-stack change:** trace schema → endpoint → API client → hook → store → UI. Don't assume one end proves the other.
- **Dev env** — see DEV-ENV.md for current topology, `REPO_ROOT` requirement when compose runs inside a container, Vite `allowedHosts`, linuxserver.io `group_add` + custom-cont-init.d workaround, `docker compose up` no-op-on-unchanged-hash gotcha.
---
## Frontend Patterns
## GitNexus code intelligence
- **Component guidelines:** Use `cn()` from `@/lib/utils`, Lucide icons (wrap in `<span>` for title), modals with fixed header/footer
- **Type organization:** Create in `types/`, export from `types/index.ts`, import with `import type { T } from '@/types'`
- **Scratchpad overlay:** `position: fixed`, `onOpenChange` callback for parent padding adjustment, `right-2` positioning
- **Custom step flow:** `CustomStepModal` → `PostStepActionModal` → `ContinuationModal` → custom step view. Key state: `pendingStep`, `pendingContinuationNodeId`, `customBranchMode`, `branchOriginNodeId`. Use `findCustomStep()` not `findNode()` for custom step UUIDs.
- **Session sharing:** `ShareSessionModal` manages share links, `SharedSessionPage` renders public/account views. Helper utils in `lib/sessionShare.ts`. Share URLs use `/shared/sessions/:token`.
- **Procedural navigation:** `ProceduralNavigationPage` handles intake forms, step-by-step execution, and resume via `location.state.sessionId`. Uses `StepChecklist`, `StepDetail`, `ProgressBar`, `CompletionSummary` components.
- **Routing helper:** Use `getTreeNavigatePath()` and `getTreeEditorPath()` from `@/lib/routing` for all tree/session navigation.
- **Account section layout:** `AccountLayout` has NO sidebar nav. Account sub-pages (categories, target-lists) are reached via link cards on `AccountSettingsPage.tsx`. New account pages: add route in `router.tsx` under `account` children + add a link card in `AccountSettingsPage`.
- **Dashboard cockpit:** `QuickStartPage` is the copilot-first launchpad. Greeting + "What are you troubleshooting?" + ChatGPT-style `StartSessionInput` (auto-growing textarea, paste images, drag-drop files, attach button, paste logs, suggestion chips). Below: `PendingEscalations`, `ActiveFlowPilotSessions`, `RecentFlowPilotSessions`. Collapsible "Dashboard" section for `PerformanceCards`, `KnowledgeBaseCards`, `TeamSummary`.
- **Sidebar sections:** Amber "New Session" button → Home → RESOLVE (History) → KNOWLEDGE (Flows with Solutions Library sub-item, Scripts) → INSIGHTS (Data). Footer: Account, Pin/Unpin. No help/guides/feedback in sidebar — accessible via TopBar.
Indexed as `resolutionflow`. Earns its cost on cross-cutting work only.
| Tool | When |
|---|---|
| `gitnexus_query({query})` | Find code by concept when you don't know where to look |
| `gitnexus_context({name})` | Callers/callees of a symbol before touching it |
| `gitnexus_impact({target, direction})` | Blast radius before editing shared symbols |
| `gitnexus_rename({symbol_name, new_name, dry_run: true})` | Safe multi-file rename |
**Use for:** core shared symbols (`flowpilot_engine`, `unified_chat_service`, auth middleware, `get_db`, shared hooks), cross-file renames, unfamiliar bug traces, refactor safety. **Skip for:** new endpoints, isolated fixes, changes you can read in one file.
Re-indexes automatically on commit (PostToolUse hook). Manual refresh if stale: `npx gitnexus analyze`.
---
## Common Tasks
## gstack skills
- **New endpoint:** Create in `endpoints/` → add to `router.py` → schema in `schemas/` → tests → frontend API client
- **New page:** Create in `pages/` → add route in `router.tsx` → nav link in `AppLayout.tsx`
- **New public route (no auth):** Add at top level in `router.tsx` alongside `/login`, `/register` — NOT inside the `ProtectedRoute`/`AppLayout` children.
- **Schema change:** Update model → `alembic revision --autogenerate -m "desc" --rev-id=NNN` (NNN = next sequential number, e.g., 068 → 069) → review → `alembic upgrade head`
- **New frontend API module:** Types in `types/` → export from `types/index.ts` → client in `api/` → export from `api/index.ts`
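
The "next sequential number" rule for `--rev-id` in the schema-change task can be sketched as a small helper. This is a minimal illustration, not part of the repo: `next_rev_id` is a hypothetical name, and it assumes you have already collected the existing revision ids as strings (for example from the migration file names).

```python
import re


def next_rev_id(existing_ids: list[str], width: int = 3) -> str:
    """Return the next zero-padded sequential revision id.

    Ids that are not purely numeric (for example hash-style ids) are ignored.
    """
    numeric = [int(rev) for rev in existing_ids if re.fullmatch(r"[0-9]+", rev)]
    if not numeric:
        raise ValueError("no numeric revision ids found")
    return str(max(numeric) + 1).zfill(width)


# Hypothetical input: two sequential ids plus one hash-style id.
print(next_rev_id(["067", "068", "f07010f17b01"]))  # prints 069
```

The result is the value to pass as `--rev-id=NNN` in the `alembic revision --autogenerate` command above.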
Always use `/browse` for web, never `mcp__claude-in-chrome__*`. Most-used:
---
## Coding Standards
### Python
- Type hints everywhere, async/await for DB, Pydantic for validation, `DateTime(timezone=True)` always
### TypeScript
- Interfaces for all data, `const` over `let`, functional components + hooks, reusable logic in custom hooks
### Git
- Format: `type: description` (feat, fix, refactor, docs, test, chore)
- Always include `Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>`
- Always create feature branch BEFORE committing: `git checkout -b feat/feature-name`
- Large features: commit per phase with `npm run build` validation
### After Completing Work
When a feature, fix, or significant piece of work is finished and merged/committed:
1. **Update `CURRENT-STATE.md`** — move completed items, update "In Progress" and "What's Next" sections
2. **Update `03-DEVELOPMENT-ROADMAP.md`** — check off completed work, update phase status
3. **Close related GitHub Issues** — use `gh issue close #N` for any issues resolved by the work
4. **Update `CLAUDE.md`** if the work introduced new patterns, lessons learned, or changed project structure
---
## gstack (Browser & Workflow Skills)
**Web browsing:** Always use the `/browse` skill from gstack for all web browsing needs. Never use `mcp__claude-in-chrome__*` tools.
**Available skills:**
| Skill | Purpose |
|-------|---------|
| `/office-hours` | Brainstorm new ideas (YC-style office hours) |
| `/plan-ceo-review` | CEO/founder-mode plan review (scope, ambition) |
| `/plan-eng-review` | Engineering plan review (architecture, edge cases) |
| `/plan-design-review` | Design plan review (UI/UX critique) |
| `/design-consultation` | Create a design system / DESIGN.md |
| `/review` | Pre-landing PR code review |
| `/ship` | Ship workflow (tests, review, PR creation) |
| `/browse` | Headless browser for QA testing and site dogfooding |
| `/qa` | Systematic QA testing + auto-fix bugs found |
| `/qa-only` | QA report only (no fixes) |
| `/design-review` | Visual QA — find and fix design inconsistencies |
| `/setup-browser-cookies` | Import cookies from real browser for authenticated testing |
| `/retro` | Weekly engineering retrospective |
| `/investigate` | Systematic debugging with root cause analysis |
| `/document-release` | Post-ship documentation updates |
| `/codex` | Second opinion via OpenAI Codex CLI |
| `/careful` | Safety guardrails for destructive commands |
| `/freeze` | Restrict edits to a specific directory |
| `/guard` | Full safety mode (careful + freeze) |
| `/unfreeze` | Remove edit restrictions |
| `/gstack-upgrade` | Upgrade gstack to latest version |
- `/review` — pre-land PR review
- `/ship` — tests + review + PR creation
- `/browse` + `/qa` / `/qa-only` — headless browser testing (setup: Lesson 82)
- `/design-review` — visual QA
- `/investigate` — systematic debug with root cause
- `/codex` — OpenAI Codex second opinion
- `/plan-eng-review` / `/plan-design-review` / `/plan-ceo-review` — plan critiques
---
## Deployment (Railway)
- **Production:** `resolutionflow.com` (frontend), `api.resolutionflow.com` (backend)
- Auto-deploys on push to `main`
- PR environments auto-created (need manual domain generation in Railway dashboard)
- PR envs need `VITE_API_URL` set with `https://` prefix on frontend service
- `ALLOW_RAILWAY_ORIGINS=true` enables CORS for `*.up.railway.app`
- Shared Variables (project-level in Railway dashboard) auto-propagate to all environments including PR envs — use for secrets like `ANTHROPIC_API_KEY`
- Super admin utility: `backend/make_superadmin_simple.py list|<email>`
- **Prod:** `resolutionflow.com` (frontend), `api.resolutionflow.com` (backend).
- Auto-deploy: Gitea push → GitHub mirror → Railway follows GitHub `main`.
- PR environments auto-created; need manual domain generation + `VITE_API_URL` with `https://` prefix.
- `ALLOW_RAILWAY_ORIGINS=true` for `*.up.railway.app` CORS.
- Shared Variables (Railway project-level) auto-propagate to PR envs — use for secrets like `ANTHROPIC_API_KEY`.
- Super admin utility: `backend/make_superadmin_simple.py list|<email>`.
---
## Future Roadmap
- **Phase 3:** PSA integrations (ConnectWise in progress), file attachments, client context, analytics
- **Phase 4:** Additional PSA integrations (Autotask/Kaseya), PowerShell automation, enterprise SSO
---
## Quick Reference
## Quick reference
| What | Where |
|------|-------|
| API Docs | <http://localhost:8000/api/docs> |
| Detailed Status | [CURRENT-STATE.md](CURRENT-STATE.md) |
| Development Roadmap | [03-DEVELOPMENT-ROADMAP.md](03-DEVELOPMENT-ROADMAP.md) |
| GitHub Issues | `gh issue list --state open` |
| Bugs & Fixes | CLAUDE.md → Critical Lessons Learned section |
| Design System | [DESIGN-SYSTEM.md](DESIGN-SYSTEM.md) |
| Dev Environment | [DEV-ENV.md](DEV-ENV.md) — 46.202.92.250 setup, Docker, CORS, networking |
|---|---|
| Detailed status | [CURRENT-STATE.md](CURRENT-STATE.md) |
| Roadmap | [03-DEVELOPMENT-ROADMAP.md](03-DEVELOPMENT-ROADMAP.md) |
| Design system | [DESIGN-SYSTEM.md](DESIGN-SYSTEM.md) |
| Dev env | [DEV-ENV.md](DEV-ENV.md) |
| Archived lessons | [docs/LESSONS-ARCHIVE.md](docs/LESSONS-ARCHIVE.md) |
| ConnectWise API | `docs/connectwise/` |
| GitHub issues | `gh issue list --state open` |
| Local API docs | <http://localhost:8000/api/docs> |


@@ -2,7 +2,7 @@
> **Purpose:** Quick-reference file showing exactly where the project stands.
> **For Claude Code:** Read this first to understand what's done and what's next.
> **Last Updated:** April 4, 2026 (evening)
> **Last Updated:** April 12, 2026
---
@@ -13,8 +13,8 @@
## What's Complete
### Core Platform
- FastAPI project structure with 55+ API endpoints
- PostgreSQL database with Docker, 100+ Alembic migrations
- FastAPI project structure with 35+ API endpoints
- PostgreSQL database with Docker, 75+ Alembic migrations
- User authentication (JWT, register, login, refresh, logout, invite codes)
- Refresh token rotation with JTI-based revocation
- Trees CRUD with full-text search (FTS index)
@@ -29,7 +29,7 @@
### Frontend Core
- React 19 + Vite + TypeScript + Tailwind CSS v4 (`@tailwindcss/vite`)
- **Charcoal Design System v6** — Flat, high-contrast dark theme (Sentry/PostHog-inspired), charcoal palette; accent color is electric blue (#60a5fa), replacing ember orange
- **Charcoal Design System** — Flat, high-contrast dark theme (Sentry/PostHog-inspired), charcoal palette with sidebar-darkest approach
- **Brand fonts:** Bricolage Grotesque (headings), IBM Plex Sans (body), JetBrains Mono (code)
- Authentication UI (login, register, email verification)
- Tree library/browsing page with grid/list/table views
@@ -130,36 +130,6 @@
- Enhanced PSA metrics: time entries, hours logged, push success funnel, daily trend chart
- 13 new backend tests for coverage and flow quality endpoints
### Conversational Branching (Complete)
- SessionBranch, ForkPoint, SessionHandoff, SessionResolutionOutput models + migration (4 tables, 13 columns)
- BranchManager service, BranchAwarePromptBuilder, HandoffManager service with integration tests
- Branch API endpoints: `session_branches.py`, `session_handoffs.py`, `session_resolutions.py`
- Integrated into `unified_chat_service.py` and AI session step creation
- Frontend: BranchNode, ForkCard, BranchMap, BranchRevivalCard, BranchTransitionBar, HandoffModal, ResolutionOutputPanel components
- Wired into FlowPilotSession and `useFlowPilotSession` hook
### Script Library Enhancements (Complete)
- ParameterizeAndSavePanel replaces SaveToLibraryDialog — accepts `script_body` and `parameters_schema` in save flow
- "New from Script" button on ScriptLibraryPage for one-click script creation from template
- Default tab is "All Scripts" (previously filtered to owned scripts)
- Ownership filter state preserved across category and search changes
- Backend: `save-to-library` endpoint accepts `script_body` + `parameters_schema`
### AI Vision Support (Complete)
- Image uploads (paste/drag-drop) wired into AI assistant chat via `upload_ids`
- Server-side image resize before sending to Claude (Pillow, 1568px max, PNG→JPEG)
- `storage_service.resize_image_for_vision()` handles vision pipeline
- Images are NOT stored in conversation history (text-only history)
### Mid-Session Status Updates (Complete)
- AI assistant can generate `status_update` steps (step_type added to CHECK constraint)
- Status update generation wired into `unified_chat_service.py`
- Frontend renders status update cards in session view
### Search & Recall + Evidence-Rich Sessions (Complete)
**Evidence:**
@@ -193,7 +163,14 @@
- SQL wildcard escaping in tag search
- PSA credentials encrypted at rest (Fernet)
### Copilot-First Dashboard (March-April 2026)
### Tenant Isolation (Phases 1-4 Complete)
- PostgreSQL RLS enabled across tenant-scoped tables in phased rollout
- `account_id` propagation completed across core content, sessions, analytics, notifications, shares, and remaining Phase 4 tables
- Global platform tables correctly excluded from tenant RLS where they have no `account_id` (`script_categories`, `platform_steps`, `template_trees`)
- Runtime bootstrap paths updated to use BYPASSRLS/admin sessions where needed (auth/user mutations, startup service account, background jobs, seed scripts)
- Preview Railway backend and frontend deployments green for PR 136 after the Phase 4 fixes
### Copilot-First Dashboard (March 2026)
- Redesigned dashboard as FlowPilot copilot launchpad (ChatGPT-style input)
- Chat-style input with paste images, drag-drop files, attach button, paste logs
@@ -203,33 +180,9 @@
- Unified Command Palette (Cmd+K) — merged QuickLaunch into omnibar
- "Solutions Library" rename (from "Step Library") site-wide
- Maintenance flows hidden from UI for pilot (backend still supports them)
- Charcoal color palette: sidebar `#0e1016`, page `#16181f`, cards `#1e2028`
- **Landing page redesign** — scroll-driven reveal animations, live chat animation, FAQ section, improved trust signals; copy: "Resolve tickets faster. Notes write themselves."
- **Session History redesign** — tabbed layout with Load More pagination
- **Edit Procedure page** — layout and color system overhaul
- **TaskLane UX** improvements in assistant chat; persistence across page reload
- TaskLane answers persist in sessionStorage; correct behavior on all three chat paths (send, prefill, resume)
- **Action bar consolidation** — Deduplicated actions across FlowPilot/Cockpit headers and chat toolbars; chat toolbar now only has input tools (Attach, Paste Logs, Tasks)
- **ViewToggle redesigned** as persistent tab bar with bottom-border active indicator and ARIA attributes (FlowPilot/Cockpit switcher)
- **Standardized action naming** across all session pages: Resolve (emerald), Update (blue), Close (rose), Pause (muted)
- **ConcludeSessionModal copy refresh** — Forward-facing action verbs, "Close & Generate" CTA, consistent outcome labels
- Deleted unused FlowPilotActionBar component (227 lines dead code)
### Network Diagrams (In Progress)
- Network diagram editor with React Flow (@xyflow/react v12) canvas
- Device node system: 27 device types across 7 categories (network, compute, storage, cloud, endpoint, infrastructure, security)
- Custom device type creation via DeviceToolbar
- Connection edges with 6 types (ethernet, fiber, wifi, vpn, vlan, wan) — color-coded, dashed for wireless/VPN
- Properties panel for editing device and connection details
- AI-assisted diagram generation (describe network → auto-layout)
- Auto-save every 30 seconds, manual save, JSON export
- **React Flow UI Components** — Cherry-picked and Charcoal-restyled: BaseNode (structured header/content/footer slots), BaseHandle (styled connection handles), LabeledHandle (named port labels), NodeStatusIndicator (status border effect: emerald/red/yellow), NodeTooltip (hover details via NodeToolbar), LabeledGroupNode (subnet/VLAN/site/DMZ containers), AnimatedSvgEdge (traffic flow visualization)
- Grouping category in toolbar: Subnet, VLAN, Site, DMZ drag-drop to canvas
- Traffic flow toggle on edges (switches between static and animated)
- Context menu with copy/paste/duplicate/select all shortcuts
- Drop position uses `screenToFlowPosition()` for correct placement at any zoom/pan level
- **Bug fix:** PropertiesPanel inputs now work — selection uses IDs instead of stale object snapshots
- Landing page copy rewrite: "Resolve tickets faster. Notes write themselves."
- Spring bounce hover animation on dashboard cards
- Charcoal color palette: sidebar `#10121a`, page `#1a1c23`, cards `#22252e`
### Maintenance Flows (Hidden from UI)
@@ -289,22 +242,21 @@
### Start Development
```bash
# Start PostgreSQL (Docker — container name resolutionflow_postgres, port 5433, DB resolutionflow)
docker start resolutionflow_postgres
# Start PostgreSQL (Docker Compose)
docker compose up -d
# Backend (from backend/)
source venv/bin/activate
uvicorn app.main:app --reload
# Frontend (from frontend/, requires Node 20)
# Frontend (from frontend/)
npm run dev
```
### URLs
- Frontend: http://46.202.92.250:5173 (or https via Traefik reverse proxy)
- Backend API: http://46.202.92.250:8000
- API Docs: http://46.202.92.250:8000/api/docs
- Dev env runs on Hostinger VPS (46.202.92.250) with Traefik + HTTPS; see [DEV-ENV.md](DEV-ENV.md)
- Frontend: http://192.168.0.9:5173
- Backend API: http://192.168.0.9:8000
- API Docs: http://192.168.0.9:8000/api/docs
### Run Tests
```bash


@@ -1,262 +1,671 @@
# ResolutionFlow Dev Environment Setup & Operations Guide
## Server Overview
> **Scope:** Stand up a working ResolutionFlow dev environment from scratch on any Linux host (VPS, on-prem Proxmox LXC/VM, bare metal). Self-contained: you should not need another doc to get the dev stack running.
> **Last rewritten:** April 2026, post-Hostinger-VPS deprecation, ahead of Proxmox migration.
> **Audience:** You (returning to the project), a teammate, or a fresh Claude Code session.
- **Provider:** Hostinger KVM VPS (srv1522117)
- **IP Address:** 46.202.92.250
- **OS:** Ubuntu 24.04 LTS
- **CPU:** 2 vCPU cores
- **RAM:** 8GB
- **Disk:** 100GB NVMe SSD
- **Swap:** 4GB (`/swapfile`, swappiness=10)
If you're picking up mid-migration and need to know what code state is on the current branch, read `docs/FlowAssist_Migration/MIGRATION-HANDOFF.md` first.
## Architecture
---
All services run as Docker containers on the host, managed via SSH or from the VS Code Server integrated terminal.
## 1. What this project needs, regardless of host
```
Host (root@srv1522117)
├── Traefik → reverse proxy + auto SSL (Let's Encrypt)
├── VS Code Server → browser IDE at https://code.resolutionflow.com
└── ResolutionFlow Stack
├── resolutionflow_frontend → Vite/React on port 5173
├── resolutionflow_backend → FastAPI/Uvicorn on port 8000
└── resolutionflow_postgres → PostgreSQL 16 + pgvector on port 5432
```
These are non-negotiable. If your host can't provide them, fix that before anything else.
## Access URLs
| Component | Required version | Notes |
|---|---|---|
| **Linux** | any mainstream distro | Ubuntu 22.04+ / Debian 12+ tested; Alpine fine for containers |
| **Python** | 3.11+ | Backend and migrations |
| **Node.js** | 20.19+ | Vite 7 fails on older versions — CLAUDE.md Lesson 63 |
| **PostgreSQL** | 16 | `gen_random_uuid()` + `jsonb` + RLS are all leaned on |
| **Docker + Docker Compose** | recent | Only if you are running Postgres and/or backend as containers |
| **Git** | recent | |
| Service | URL |
Optional but recommended:
| Tool | Why |
|---|---|
| VS Code Server | https://code.resolutionflow.com |
| Frontend (dev) | http://46.202.92.250:5173 |
| Backend API | http://46.202.92.250:8000 |
| API Docs | http://46.202.92.250:8000/docs |
| **code-server** | Browser-based VS Code; how this project has historically been edited |
| **`gh` CLI** | The repo is mirrored from Gitea to GitHub; `gh` reads issues and PRs there |
| **bun** | Required for the gstack `/browse` + `/qa` skills (CLAUDE.md Lesson 82) |
| **`npx gitnexus analyze`** | Code-graph for Phase 2+ work that touches `unified_chat_service` |
| **Claude Code CLI** | If you want to run Claude Code locally on the host |
## Docker Layout
---
## 2. Architectural shape
The project is three services plus your editor. Keep these facts in mind regardless of topology:
```
/docker/
├── traefik/
├── docker-compose.yml → Traefik reverse proxy
└── .env → ACME_EMAIL for Let's Encrypt
└── vscode/
├── docker-compose.yml → VS Code Server
└── .env → CODE_PASSWORD
Your browser
├─► code-server (editor, optional — usually port 8080 or behind TLS)
├─► frontend (Vite) (dev server, port 5173)
└─► backend (FastAPI) (dev server, port 8000)
└─► PostgreSQL (port 5432)
```
Project lives inside the VS Code Server Docker volume:
**The frontend calls the backend by URL at runtime.** The frontend does not proxy through the backend. Whatever URL your browser uses to reach the backend is what `VITE_API_URL` must be set to, **baked in at build time**. Changing `VITE_API_URL` requires rebuilding the frontend.
**The backend calls the database by URL at runtime.** The URL depends on where Postgres is relative to the backend — Docker service name if both are in the same compose network, `localhost` if Postgres is native on the same host, or a DNS name if they're in separate containers/VMs.
**CORS is configured explicitly.** The backend's `CORS_ORIGINS` list must include every origin your browser will use to reach the frontend. A missing origin shows up as failed preflight requests.
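
The three facts above (frontend-to-backend URL, backend-to-database URL, explicit CORS origins) can be sketched as one derivation from a handful of host values. This is an illustration only: `HostConfig` and `derive_env` are hypothetical names, the defaults are placeholders, and `FRONTEND_URL`, `DATABASE_URL`, and `DATABASE_URL_SYNC` follow the names the backend env file uses elsewhere in this guide.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class HostConfig:
    dev_host: str                  # hostname or IP the browser uses
    scheme: str = "http"
    frontend_port: int = 5173
    backend_port: int = 8000
    db_host: str = "localhost"     # service name or DNS name in other topologies
    db_port: int = 5432
    db_name: str = "resolutionflow"
    db_user: str = "postgres"
    db_password: str = "postgres"  # placeholder; special characters would need URL-encoding


def derive_env(cfg: HostConfig) -> dict[str, str]:
    """Derive the URL-shaped settings from one host description."""
    frontend_url = f"{cfg.scheme}://{cfg.dev_host}:{cfg.frontend_port}"
    origins = ["http://localhost:5173", "http://127.0.0.1:5173", frontend_url]
    db = f"{cfg.db_user}:{cfg.db_password}@{cfg.db_host}:{cfg.db_port}/{cfg.db_name}"
    return {
        # What the browser uses to reach the backend; baked into the frontend build.
        "VITE_API_URL": f"{cfg.scheme}://{cfg.dev_host}:{cfg.backend_port}",
        "FRONTEND_URL": frontend_url,
        # Every origin the browser may use to load the frontend, de-duplicated.
        "CORS_ORIGINS": json.dumps(list(dict.fromkeys(origins))),
        "DATABASE_URL": f"postgresql+asyncpg://{db}",
        "DATABASE_URL_SYNC": f"postgresql://{db}",
    }


for key, value in derive_env(HostConfig(dev_host="10.0.0.42")).items():
    print(f"{key}={value}")
```

Deriving all of the values from one place makes it harder for `VITE_API_URL` and `CORS_ORIGINS` to drift apart when the host name or scheme changes.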
---
## 3. Topology choices — pick one before you start
The project is agnostic to topology, but each shape has different setup steps.
### Option A — all-in-one LXC/VM/host (simplest)
Postgres, backend, and frontend all run on one Linux host. code-server runs on the same host or a sibling. No Docker required. Best for a single-developer Proxmox LXC.
### Option B — Docker Compose on one host
Postgres, backend, and frontend run as Docker containers on one host. code-server runs outside the compose network (on the host or in another container). This is how the old Hostinger VPS was configured. Best if you want reproducible container images.
### Option C — split services across containers/VMs
Postgres in one container/VM, backend and frontend in another, code-server in a third. Most complex; requires explicit networking between them. Use only if you have a specific reason.
**Pick one and stick with it for the entire setup.** Mixing Options A and B halfway through is where setup runs off the rails.
---
## 4. Per-host configuration
These values are specific to your host. Fill them in once and reference them by name throughout the rest of the doc.
```
/var/lib/docker/volumes/vscode_vscode-data/_data/resolutionflow/
DEV_HOST = <hostname or IP your browser uses, e.g. dev.internal, 10.0.0.42>
DEV_HOST_SCHEME = <http or https; http is fine for internal dev, https if behind a TLS proxy>
FRONTEND_PORT = 5173
BACKEND_PORT = 8000
POSTGRES_PORT = 5433 # host-side port. 5433 is the recommended default on any shared host to avoid collision with a host-level Postgres. The container's internal port stays 5432.
POSTGRES_DB_NAME = resolutionflow
POSTGRES_USER = postgres
POSTGRES_PASSWORD = <local-dev-password; anything, this is not prod>
SECRET_KEY = <openssl rand -hex 32 — generate fresh per host, do not reuse>
ANTHROPIC_API_KEY = <from https://console.anthropic.com>
GOOGLE_AI_API_KEY = <optional, only if using Gemini as a fallback>
```
## VS Code Server
Store these somewhere you can copy from during setup. Do not commit them.
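
If `openssl` is not available on the host, a fresh `SECRET_KEY` of the same shape (32 random bytes, hex-encoded) can be produced with Python's standard library. This is a minimal sketch; `generate_secret_key` is a hypothetical helper name, not something the repo provides.

```python
import secrets


def generate_secret_key(num_bytes: int = 32) -> str:
    """Return num_bytes of cryptographically strong randomness as a hex string."""
    return secrets.token_hex(num_bytes)


# Paste the printed line into the env file by hand; do not commit it.
print(f"SECRET_KEY={generate_secret_key()}")
```

As with the `openssl` command, generate a new value on each host rather than copying one between environments.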
- **Container user:** `coder` (UID 1000)
- **Home directory:** `/home/coder`
- **Project location:** `/home/coder/resolutionflow`
- **Host volume path:** `/var/lib/docker/volumes/vscode_vscode-data/_data`
- **Access URL:** `https://code.resolutionflow.com`
- **HTTPS:** Auto-provisioned via Traefik + Let's Encrypt
> **Naming note:** the canonical database name is `resolutionflow`. If you see `patherly` in a config file, that's drift from an earlier rename and is being swept in a separate commit — use `resolutionflow`. CLAUDE.md tracks the live-code files that still reference `patherly`.
### Compose File Location
`/docker/vscode/docker-compose.yml`
---
## Traefik
## 5. Setup procedure
Handles reverse proxying and automatic SSL for all services. HTTP automatically redirects to HTTPS.
Run these in order. Stop at the first failure and investigate.
### Adding A New Service Behind Traefik
Add these labels to any new Docker service:
```yaml
labels:
- "traefik.enable=true"
- "traefik.http.routers.<n>.rule=Host(`subdomain.resolutionflow.com`)"
- "traefik.http.routers.<n>.entrypoints=websecure"
- "traefik.http.routers.<n>.tls.certresolver=letsencrypt"
- "traefik.http.services.<n>.loadbalancer.server.port=<port>"
```
Also create an A record in DNS pointing the subdomain to `46.202.92.250`.
## ResolutionFlow Dev Stack
### Important: No Docker Inside VS Code Container
The VS Code Server container does NOT have Docker. All `docker compose` commands must be run via SSH as root on the host.
### Environment Files
| File | Purpose |
|---|---|
| `.env` | Root — Docker Compose interpolation (`SECRET_KEY`, `ANTHROPIC_API_KEY`, `GOOGLE_AI_API_KEY`, `POSTGRES_PORT`) |
| `backend/.env` | Backend source of truth — all FastAPI settings, API keys, DB URLs, CORS |
| `frontend/.env` | Frontend — `VITE_API_URL` pointing to backend |
### Critical Remote Access Config
**`frontend/.env`:**
```
VITE_API_URL=http://46.202.92.250:8000
```
**`backend/.env`:**
```
CORS_ORIGINS=["http://localhost:3000","http://localhost:5173","http://127.0.0.1:3000","http://127.0.0.1:5173","http://46.202.92.250:5173","http://46.202.92.250:3000","https://resolutionflow.com","https://www.resolutionflow.com"]
FRONTEND_URL=http://46.202.92.250:5173
DATABASE_URL=postgresql+asyncpg://postgres:postgres@db:5432/resolutionflow
DATABASE_URL_SYNC=postgresql://postgres:postgres@db:5432/resolutionflow
```
Note: `DATABASE_URL` uses `@db:5432` (Docker service name), not `@localhost`.
**`docker-compose.dev.yml`:**
```yaml
- VITE_API_URL=http://46.202.92.250:8000
```
### Starting the Dev Environment
SSH into host as root:
### 5.1 Install system dependencies
```bash
cd /var/lib/docker/volumes/vscode_vscode-data/_data/resolutionflow
docker compose -f docker-compose.dev.yml up -d
# Ubuntu / Debian
sudo apt update && sudo apt install -y \
git curl build-essential \
python3.11 python3.11-venv python3-pip \
postgresql-client # client tools (psql) only, not the server; the server is installed in 5.3 when running Postgres natively
# Node 20 via nvm (survives container rebuilds if stored in a volume)
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | bash
export NVM_DIR="$HOME/.nvm" && source "$NVM_DIR/nvm.sh"
nvm install 20
nvm alias default 20
```
### Running Migrations (Fresh Database)
For Option B (Docker Compose), also:
```bash
cd /var/lib/docker/volumes/vscode_vscode-data/_data/resolutionflow
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER # log out and back in for this to take effect
```
### 5.2 Clone the repo
```bash
git clone https://gitea.resolutionflow.com/chihlasm/resolutionflow.git
# or the GitHub mirror:
# git clone https://github.com/chihlasm/resolutionflow.git
cd resolutionflow
# Check out the working branch if you're continuing mid-migration.
git fetch origin
git checkout feat/flowpilot-migration
```
### 5.3 Start PostgreSQL
**Option A (native Postgres on the host):**
```bash
sudo apt install -y postgresql-16
sudo -u postgres psql -c "CREATE DATABASE resolutionflow;"
sudo -u postgres psql -c "ALTER USER postgres PASSWORD 'postgres';"
# Adjust pg_hba.conf if you need non-local connections.
```
**Option B (Postgres via Docker Compose):** The repo has a `docker-compose.dev.yml` at the root. Check its Postgres service for the container name, port mapping, and volume. The local compose defaults use container name `resolutionflow_postgres`, database `resolutionflow`, and host-side port `5433` (mapped to the container's internal `5432`); see CLAUDE.md Lesson 65. The host-side `5433` is the recommended default on any shared host because it leaves `5432` free for a host-level Postgres if you ever need one. The compose file also defines explicit `command:` directives on both `backend` and `frontend` to force `--host 0.0.0.0`, and expects the caller to pass `REPO_ROOT` (see 5.4) for bind-mount resolution. Confirm what the compose file actually says on your branch before trusting these values.
```bash
docker compose -f docker-compose.dev.yml up -d db
docker compose -f docker-compose.dev.yml logs db # wait for "ready to accept connections"
```
**Verify:**
```bash
# From the host (Option A) or the backend container/LXC (Option B):
psql -h <db-host> -p <POSTGRES_PORT> -U postgres -d resolutionflow -c "SELECT now();"
```
### 5.4 Write the `.env` files
The repo expects three env files. Create each one:
**`backend/.env`** — backend source of truth:
```bash
APP_NAME=ResolutionFlow
DEBUG=true
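# DB URLs: `<db-host>` is `localhost` for Option A, the Docker service name
# (e.g. `db`) for Option B, or the DB container/VM hostname for Option C.
# Option B caveat: inside the compose network use the container's internal
# port 5432 (e.g. `db:5432`), not the host-side POSTGRES_PORT mapping.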
DATABASE_URL=postgresql+asyncpg://postgres:postgres@<db-host>:<POSTGRES_PORT>/resolutionflow
DATABASE_URL_SYNC=postgresql://postgres:postgres@<db-host>:<POSTGRES_PORT>/resolutionflow
# Auth
SECRET_KEY=<SECRET_KEY>
ACCESS_TOKEN_EXPIRE_MINUTES=5
REFRESH_TOKEN_EXPIRE_DAYS=7
REQUIRE_INVITE_CODE=true
# AI providers
AI_PROVIDER=anthropic
ANTHROPIC_API_KEY=<ANTHROPIC_API_KEY>
GOOGLE_AI_API_KEY=<GOOGLE_AI_API_KEY or leave unset>
# FlowPilot MCP telemetry — leave on so the Phase 0.5 baseline data keeps accruing
ENABLE_MCP_MICROSOFT_LEARN=true
# CORS + frontend URL
FRONTEND_URL=<DEV_HOST_SCHEME>://<DEV_HOST>:<FRONTEND_PORT>
CORS_ORIGINS=["http://localhost:5173","http://127.0.0.1:5173","<DEV_HOST_SCHEME>://<DEV_HOST>:<FRONTEND_PORT>"]
```
**`frontend/.env.local`** — frontend build-time config:
```bash
VITE_API_URL=<DEV_HOST_SCHEME>://<DEV_HOST>:<BACKEND_PORT>
```
Optional PostHog (CLAUDE.md Lesson 64 — enables product analytics locally):
```bash
VITE_PUBLIC_POSTHOG_KEY=<from PostHog project settings>
VITE_PUBLIC_POSTHOG_HOST=https://us.i.posthog.com
```
**Repo root `.env`** — only needed for Option B (Docker Compose interpolation):
```bash
SECRET_KEY=<SECRET_KEY>
ANTHROPIC_API_KEY=<ANTHROPIC_API_KEY>
GOOGLE_AI_API_KEY=<GOOGLE_AI_API_KEY or leave unset>
POSTGRES_PORT=<POSTGRES_PORT>
# Absolute host-side path to the repo root. REQUIRED whenever docker-compose is
# invoked from inside a container (e.g. a code-server container with the host
# Docker socket mounted in). Without it, the bind mounts in
# docker-compose.dev.yml (`${REPO_ROOT}/backend:/app`, `${REPO_ROOT}/frontend:/app`)
# resolve against the CLI's CWD — a path the host daemon cannot see — and
# Docker silently creates empty directories there instead of mounting the code.
# If you run docker compose directly on the host shell, you can set this to `.`
# or the absolute path of the repo; being explicit is safer either way.
REPO_ROOT=/absolute/path/to/resolutionflow
```
> **Never commit any `.env` file.** The `.gitignore` already covers this.
### 5.5 Run the backend setup
**Option A (native):**
```bash
cd backend
python3.11 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
# Migrate the DB to head.
alembic upgrade head
```
**Option B (Docker):**
```bash
docker compose -f docker-compose.dev.yml up -d backend
docker compose -f docker-compose.dev.yml run --rm backend alembic upgrade head
```
### Seeding Test Users
**Expected alembic head** (as of `feat/flowpilot-migration`): `f07010f17b01`. If `alembic current` shows anything else after `upgrade head`, something has gone wrong — stop and investigate.
### 5.6 Seed test users
```bash
# Option A
cd backend && source venv/bin/activate
python -m scripts.seed_test_users
# Option B
docker exec resolutionflow_backend python -m scripts.seed_test_users
```
Test accounts (password: `TestPass123!`):
Test users (all share password `TestPass123!`):
| Email | Role | Plan |
|---|---|---|
| admin@resolutionflow.example.com | Owner | Team |
| pro@resolutionflow.example.com | Owner | Pro |
| teamadmin@resolutionflow.example.com | Owner | Team |
| engineer@resolutionflow.example.com | Engineer | Shared |
| Email | Role |
|---|---|
| `admin@resolutionflow.example.com` | super admin |
| `teamadmin@resolutionflow.example.com` | team admin |
| `engineer@resolutionflow.example.com` | engineer |
| `pro@resolutionflow.example.com` | solo pro |
### Rebuilding After Config Changes
### 5.7 Run the backend
**Option A:**
```bash
cd backend && source venv/bin/activate
uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
```
**Option B:** Already running from `docker compose up -d backend`. Tail logs:
```bash
docker compose -f docker-compose.dev.yml logs -f backend
```
**Verify:** `curl <DEV_HOST_SCHEME>://<DEV_HOST>:<BACKEND_PORT>/api/docs` — OpenAPI docs page loads.
### 5.8 Run the frontend
**Option A:**
```bash
cd frontend
npm install
npm run dev -- --host 0.0.0.0 --port 5173
```
**Option B:**
**Frontend** (Vite bakes env vars at build time — requires rebuild):
```bash
cd /var/lib/docker/volumes/vscode_vscode-data/_data/resolutionflow
docker compose -f docker-compose.dev.yml up -d --build frontend
```
**Backend** (restart only):
**Verify:** Open `<DEV_HOST_SCHEME>://<DEV_HOST>:<FRONTEND_PORT>` in your browser. Log in with one of the test users. Navigate to `/pilot` — the FlowPilot session page should render.
---
## 6. Verification — proof the env actually works
Run these after setup. Every item has a concrete expected outcome.
### 6.1 Database schema is at the right version
```bash
# Option A
cd backend && source venv/bin/activate && alembic current
# Option B
docker compose -f docker-compose.dev.yml run --rm backend alembic current
```
Expected: `f07010f17b01 (head)` on the `feat/flowpilot-migration` branch. On `main`, expected: `074 (head)`.
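
The check above can be scripted rather than eyeballed. The sketch below assumes the revision line printed by `alembic current` has the form `<revision> (head)`, as in the expected values quoted here; `is_at_expected_head` is a hypothetical helper, and the sample text stands in for real command output.

```python
import re


def is_at_expected_head(alembic_current_output: str, expected: str) -> bool:
    """True when some output line reads exactly '<expected> (head)'."""
    pattern = re.compile(rf"^{re.escape(expected)} \(head\)$")
    return any(
        pattern.match(line.strip())
        for line in alembic_current_output.splitlines()
    )


# Sample text shaped like captured output; the INFO line is illustrative only.
sample = "INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.\nf07010f17b01 (head)\n"
print(is_at_expected_head(sample, "f07010f17b01"))  # prints True
print(is_at_expected_head(sample, "074"))           # prints False
```

To use it for real, capture the command's output (for example with `subprocess.run(..., capture_output=True, text=True)`) and pass it in with the revision expected for your branch.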
### 6.2 Alembic reversibility
```bash
alembic downgrade -1 # should complete cleanly
alembic upgrade head # should return to f07010f17b01
```
If either step fails, the migration has a bug and Phase 2 cannot start.
### 6.3 Prompt-cache hit verification (the deferred Phase 0 TODO)
The module docstring of `backend/app/core/ai_provider.py` has a `TODO(phase0-verify)` note describing this. Procedure:
1. Confirm `AI_PROVIDER=anthropic` and `ANTHROPIC_API_KEY` is set in `backend/.env`.
2. Start the backend with log level INFO or lower.
3. In the UI, open `/pilot` and send a chat message. Wait a few seconds for the response.
4. Send a second chat message in the same session, within 5 minutes of the first.
5. In backend logs, grep for lines containing `anthropic.cache`:
```bash
# Option A
grep 'anthropic.cache' <log-path>
# Option B
docker compose -f docker-compose.dev.yml logs backend | grep 'anthropic.cache'
```
6. Expected: two `anthropic.cache` log events. First has `cache_creation_input_tokens > 0`. Second has `cache_read_input_tokens > 0`.
7. If the second shows zero reads, inspect the prompt prefix for silent invalidators (timestamps, unsorted JSON keys, varying tool list ordering). Fix before proceeding with any Phase 2 work.
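
Steps 5 and 6 can be automated once the log text is captured. The sketch below is an assumption-heavy illustration: the real format of the `anthropic.cache` log lines is not specified here, so it assumes each such line carries the two counters as `name=value` or `"name": value` pairs. `cache_events` and `cache_hit_confirmed` are hypothetical names.

```python
import re

# Assumed field layout: name=123 or "name": 123. Adjust to the real log format.
FIELD = re.compile(
    r"(cache_creation_input_tokens|cache_read_input_tokens)[\"']?\s*[=:]\s*(\d+)"
)


def cache_events(log_text: str) -> list[dict[str, int]]:
    """Extract the token counters from every line that mentions anthropic.cache."""
    events = []
    for line in log_text.splitlines():
        if "anthropic.cache" not in line:
            continue
        events.append({name: int(value) for name, value in FIELD.findall(line)})
    return events


def cache_hit_confirmed(log_text: str) -> bool:
    """First event must create cache entries; second must read from the cache."""
    events = cache_events(log_text)
    if len(events) < 2:
        return False
    first, second = events[0], events[1]
    return (
        first.get("cache_creation_input_tokens", 0) > 0
        and second.get("cache_read_input_tokens", 0) > 0
    )


# Made-up sample lines in the assumed format.
sample = (
    "INFO anthropic.cache cache_creation_input_tokens=2048 cache_read_input_tokens=0\n"
    "INFO anthropic.cache cache_creation_input_tokens=0 cache_read_input_tokens=2048\n"
)
print(cache_hit_confirmed(sample))  # prints True
```

A `False` result corresponds to the failure case in step 7 and points at the same follow-up: look for silent invalidators in the prompt prefix.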
### 6.4 Frontend build is TypeScript-clean
```bash
cd frontend
npx tsc -b # no errors
npm run build # no errors
```
CLAUDE.md Lesson 105 notes that `npm run build` may fail with an `EACCES` on `dist/` inside code-server — that is a Docker filesystem permission issue, not a real build error. Use `npx tsc -b` to verify TypeScript cleanliness in that case.
### 6.5 `/assistant` → `/pilot` redirect
Open `<DEV_HOST_SCHEME>://<DEV_HOST>:<FRONTEND_PORT>/assistant/<some-real-session-id>` in the browser. Expected: URL changes to `/pilot/<that-id>`; the FlowPilot session page renders. Bare `/assistant` redirects to bare `/pilot`.
### 6.6 Dispatcher de-branching
Navigate to the dashboard. Click a session in `ActiveFlowPilotSessions` or `RecentFlowPilotSessions`. Expected: routes to `/pilot/:id` regardless of the session's `session_type` value. (Check the browser URL bar.)
### 6.7 CORS
Open the browser DevTools Network tab, navigate to any backend-hitting page. Expected: no CORS errors. If you see "blocked by CORS policy," the missing origin needs adding to `backend/.env`'s `CORS_ORIGINS`.
---
## 7. Runbook
Day-to-day commands after setup is complete.
### Restart services
```bash
# Option A
# backend — Ctrl-C and re-run uvicorn
# frontend — Ctrl-C and re-run npm run dev
# Option B
docker compose -f docker-compose.dev.yml restart backend
docker compose -f docker-compose.dev.yml up -d --build frontend # rebuild required if VITE_* changed
docker compose -f docker-compose.dev.yml down && docker compose -f docker-compose.dev.yml up -d # full restart
```
**Full restart:**
```bash
docker compose -f docker-compose.dev.yml down
docker compose -f docker-compose.dev.yml up -d
```
## Installed Tools (Inside VS Code Server Container)
Installed in `/home/coder` — persists via Docker volume:
- **nvm** — Node version manager
- **Node.js 20.x** — via nvm, default alias set
- **npm** — latest
- **GitHub CLI (gh)** — authenticated via personal access token
- **Claude Code CLI** — `@anthropic-ai/claude-code` (global npm)
### Permanent Tool Installs
Tools installed via `apt` inside the container do NOT survive container rebuilds. To add permanently, modify the VS Code Server Docker image and rebuild.
Temporary (session only):
```bash
sudo apt update && sudo apt install -y <tool>
```
## SSH Access
### Apply a new migration
```bash
ssh root@46.202.92.250
# Option A
cd backend && source venv/bin/activate && alembic upgrade head
# Option B
docker compose -f docker-compose.dev.yml run --rm backend alembic upgrade head
```
Key auth configured via `~/.ssh/authorized_keys` on host.
### Create a new migration
## Useful Commands
### Check all running containers
```bash
docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"
# Option A
cd backend && source venv/bin/activate
alembic revision -m "short description" # manual, preferred per CLAUDE.md Lesson 77
# OR
alembic revision --autogenerate -m "description" # pulls in drift; review carefully
```
### View container logs
Never pass `--rev-id` — let Alembic generate the hex hash.
### Inspect the database
```bash
docker logs <container_name> --tail 30 -f
# Option A (native Postgres)
psql -h localhost -p 5432 -U postgres -d resolutionflow
# Option B (Docker)
docker exec -it resolutionflow_postgres psql -U postgres -d resolutionflow
```
### Restart VS Code Server
### Run tests
```bash
cd /docker/vscode && docker compose restart
# Option A
cd backend && source venv/bin/activate
pytest --override-ini="addopts="
# Option B
docker compose -f docker-compose.dev.yml run --rm backend pytest --override-ini="addopts="
```
### Restart Traefik
First time only, create the test database:
```bash
cd /docker/traefik && docker compose restart
# Option A
sudo -u postgres psql -c "CREATE DATABASE resolutionflow_test;"
# Option B
docker exec -it resolutionflow_postgres psql -U postgres -c "CREATE DATABASE resolutionflow_test;"
```
### Restart dev stack
### View backend logs
```bash
cd /var/lib/docker/volumes/vscode_vscode-data/_data/resolutionflow
docker compose -f docker-compose.dev.yml down
docker compose -f docker-compose.dev.yml up -d
# Option A: wherever you ran uvicorn
# Option B
docker compose -f docker-compose.dev.yml logs -f --tail=100 backend
```
### Check swap
Structured events to grep for:
- `anthropic.cache` — prompt-cache hit/creation telemetry (Phase 0.1)
- `mcp.turn` — per-turn MCP availability/invocation (Phase 0.5)
- `mcp.fallback` — MCP silent-retry fallback fired (Phase 0.5)
---
## 8. Troubleshooting
### CORS errors in the browser
The backend did not accept the origin your browser used. Check `backend/.env`'s `CORS_ORIGINS` — it must include the exact scheme + host + port the browser sent. Restart the backend after editing.
### `VITE_API_URL` points at the wrong place
The frontend was built with a stale value. Rebuild the frontend. Option B: `docker compose up -d --build frontend`. Option A: restart `npm run dev`.
### `alembic upgrade head` fails with "target database is not up to date"
Your DB migration chain is out of sync with the code. On a dev box, the safe recovery is to drop the DB and re-migrate from scratch:
```bash
free -h && swapon --show
# Option A
sudo -u postgres psql -c "DROP DATABASE resolutionflow;" -c "CREATE DATABASE resolutionflow;"
cd backend && source venv/bin/activate && alembic upgrade head
# Option B
docker exec resolutionflow_postgres psql -U postgres -c "DROP DATABASE resolutionflow;" -c "CREATE DATABASE resolutionflow;"
docker compose -f docker-compose.dev.yml run --rm backend alembic upgrade head
```
### Check disk
Only do this on a dev box — it destroys all local data.
### `alembic heads` shows more than one head
This should only happen on a local branch that has diverged from `origin/main`; production `main` has a single head. If it happens on a fresh clone, one of the migration files in your checkout has the wrong `down_revision`. Inspect each file's `down_revision` and reconnect the chain.
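To see why a wrong `down_revision` produces a second head, it helps to model the chain as a mapping from each revision to its parent. The sketch below is an illustration only: it ignores merge revisions (where `down_revision` is a tuple), and `local_fix` is a made-up revision ID used to show the failure.

```python
from typing import Dict, Optional, Set


def find_heads(down_revisions: Dict[str, Optional[str]]) -> Set[str]:
    """Return revisions that no other revision names as its down_revision."""
    referenced = {parent for parent in down_revisions.values() if parent is not None}
    return set(down_revisions) - referenced


# A fragment of the real chain: 172ad76d7d20 revises 04f013768235.
chain = {
    "04f013768235": "a05e1a1bea7c",
    "172ad76d7d20": "04f013768235",
}

# A hypothetical local migration that points at the wrong parent.
diverged = dict(chain, local_fix="04f013768235")
```

`find_heads(chain)` yields a single head, while `find_heads(diverged)` yields two, because both `172ad76d7d20` and `local_fix` claim the same parent and neither is anyone's parent. Repointing `local_fix` at `172ad76d7d20` restores a single head.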
### Frontend build fails with "EACCES: permission denied" on `dist/`
Filesystem permission issue inside the code-server container (CLAUDE.md Lesson 105). TypeScript compilation itself completes — use `npx tsc -b` to verify cleanliness without needing to write to `dist/`.
### Backend/frontend containers start but `/app` is empty (no code mounted)
Almost always a `REPO_ROOT` problem. `docker-compose.dev.yml` uses `${REPO_ROOT}/backend:/app` and `${REPO_ROOT}/frontend:/app` bind mounts. If `REPO_ROOT` is unset, or set to a path that doesn't exist *on the Docker host* (not inside the code-server container), Docker silently creates an empty directory at that path and mounts it — the containers come up but have no source code. Symptom: backend returns import errors, or frontend serves a default Vite page. Fix: set `REPO_ROOT` in the repo-root `.env` to the absolute host-side path to the repo, then `docker compose down && docker compose up -d`. See 5.4 for the full note. This matters specifically when `docker compose` is invoked from inside a container (e.g. code-server with the host Docker socket mounted) — the CLI's CWD is container-local but the daemon resolves paths against the host filesystem.
### Frontend shows "Blocked request. This host is not allowed" in the browser
Vite 5+ ships DNS-rebinding protection that rejects any `Host:` header not in `server.allowedHosts`. The browser's hostname must be in that list. Edit `frontend/vite.config.ts` — the `server.allowedHosts` array should include every hostname you reach the dev server from (e.g. `'docker-01'`, `'localhost'`, `.ts.net` as a wildcard for Tailscale MagicDNS). Restart the Vite dev server (for Option B: `docker compose restart frontend`). This is unrelated to CORS — Vite blocks the request before any app code runs.
### `docker` command not found inside code-server
If your code-server is itself inside a container, Docker is probably not exposed to it. CLAUDE.md Lesson 103 was written for this case on the old VPS. On Proxmox, the fix depends on topology — either SSH to the host to run Docker commands, or mount the host's Docker socket into the code-server container.
### Backend returns 500 with `InsufficientPrivilegeError: new row violates row-level security policy`
RLS is enabled on a table your code wrote to without the right `account_id`. CLAUDE.md Lessons 107, 108, 110 cover this family of bugs. The fix is always at the service layer: make sure every model creation passes `account_id=` explicitly, and that startup routines that touch tenant-isolated tables use `_admin_session_factory()` rather than `get_db()`.
### Anthropic cache reads are zero on the second turn
Something in the cached prefix is changing between turns. Inspect the system-block list and the first N history messages for timestamps, `datetime.now()`, unsorted dict keys in JSON prompts, or varying tool-list order. The `anthropic.cache` telemetry shows exactly how many tokens were read vs created — use it to narrow down the invalidator.
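One way to hunt for the invalidator is to serialize the prefix deterministically and log a fingerprint of it on every turn. The sketch below is an assumption about how that could look, not a description of `ai_provider.py`; `canonical_prefix` and `prefix_fingerprint` are hypothetical names, and the tool dicts are assumed to have a `name` key.

```python
import hashlib
import json


def canonical_prefix(system_blocks, tools):
    """Serialize the cached prefix so identical inputs always give identical text."""
    payload = {
        "system": list(system_blocks),
        # Sort tools by name so registration order cannot change the prefix.
        "tools": sorted(tools, key=lambda tool: tool["name"]),
    }
    # sort_keys fixes dict key order; fixed separators avoid whitespace drift.
    return json.dumps(payload, sort_keys=True, separators=(",", ":"))


def prefix_fingerprint(system_blocks, tools):
    """Short hash to log per turn; a change between turns means the prefix changed."""
    text = canonical_prefix(system_blocks, tools)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]
```

If the fingerprint differs between the first and second turn, diff the two canonical strings to find the changing field. A timestamp embedded in a system block will show up immediately.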
---
## 9. Security posture for dev environments
This doc is about dev, not production. But:
- Never commit `.env` files. The `.gitignore` covers this.
- `SECRET_KEY` should be generated per-host, not reused across environments.
- `ANTHROPIC_API_KEY` is billable — rotate if leaked into logs or chat.
- Postgres on a dev host should not be exposed to the internet. Bind it to `127.0.0.1` or to a private network interface only.
- If you expose the frontend or backend publicly (for teammates to test against), put it behind TLS with a real certificate. Do not let dev credentials travel over plain HTTP on the public internet.
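For the per-host `SECRET_KEY`, any cryptographically secure random string will do. A minimal sketch using Python's standard `secrets` module; the function name and the 64-byte default are illustrative choices, not project requirements.

```python
import secrets


def generate_secret_key(num_bytes: int = 64) -> str:
    """Return a URL-safe random string suitable for a per-host SECRET_KEY."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Print once and paste into this host's .env; never reuse it elsewhere.
    print(generate_secret_key())
```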
---
## 10. What's not in this doc
- **Production deployment.** This is a dev-env doc. Production lives on Railway — see `CLAUDE.md`'s Deployment section.
- **How to set up Traefik or any particular reverse proxy.** Whichever proxy you use is your choice; the dev stack just needs something that routes `<host>:5173` and `<host>:8000` to the right services. **Direct port exposure over a private network** (Tailscale, WireGuard, a VPN, or a LAN behind a firewall) is a fully supported option for dev and is what the homelab reference topology in Section 11 uses — no reverse proxy, no TLS, just `http://<host>:5173` and `http://<host>:8000` reachable only from the private network. That's a perfectly reasonable choice; it's just not the only one.
- **How to configure code-server itself.** Install it however you prefer (native, Docker, LXC); point it at the repo, and the rest of this doc applies.
- **Where to host the Proxmox instance.** Up to you.
If something in this doc turns out to be wrong on your host, fix the doc. This is a living document — the whole point of rewriting it from the Hostinger-specific version was to make it survive host changes.
---
## 11. Reference topology: homelab Proxmox + code-server (Option B)
This section documents the first concrete host instantiation since the April 2026 host-agnostic rewrite. It's a worked example, not the canonical topology; Section 3's Option A/B/C framing still stands. If your setup looks different, follow Sections 1-10 and ignore this appendix.
### 11.1 Host
- **Hypervisor:** Proxmox (homelab).
- **VM:** `docker-01`, Debian 13, running Docker Engine + Docker Compose natively.
- **Tailscale IP:** `100.64.78.44`. MagicDNS hostname: `docker-01` (and the full `.ts.net` FQDN).
- **code-server:** runs on the same VM in its own container, with the host's Docker socket mounted in so it can drive `docker compose`. Its workspace bind-mounts the repo at `/opt/docker/code-server/workspace/resolutionflow`.
This is a concrete instance of Option B from Section 3: Postgres, backend, and frontend all run as containers from `docker-compose.dev.yml`; the editor lives outside that compose network.
### 11.2 Access pattern — direct port over Tailscale, no reverse proxy
The browser reaches the dev stack directly:
- Frontend: `http://docker-01:5173`
- Backend: `http://docker-01:8000`
- Backend API docs: `http://docker-01:8000/api/docs`
There is **no Caddy, no Traefik, no nginx, no TLS, no basic auth** in front of either service. The tailnet provides the wire encryption and access control — only devices on the tailnet can resolve `docker-01` or reach `100.64.78.44`, and Tailscale ACLs decide which of those devices are allowed to connect.
Why this choice:
- **Zero routing config to maintain.** There is no proxy rulebook to keep in sync with new services. Add a container, expose a port, you're done.
- **Backend-to-backend services stay private.** Redis, Celery workers, the planned ConnectWise proxy, the MCP server — none of them need to be reachable from the browser, so none of them need proxy rules. They stay inside the `resolutionflow` Docker network and talk by service name. The proxy would only ever have carried frontend and backend traffic, so the proxy's value was small relative to its maintenance cost.
- **Debuggability.** `curl http://docker-01:8000/api/docs` from any tailnet device works without auth headers, TLS handshakes, or DNS shenanigans.
Tradeoff: **this only works because every client device is on the tailnet.** If someone needed to test from a non-tailnet device, they'd either join the tailnet or we'd need to front the stack with a proxy. For the current single-developer setup, the tailnet-only assumption holds.
### 11.3 Per-host config values (as actually configured on `docker-01`)
Plugging these into Section 4's template:
```
DEV_HOST = docker-01
DEV_HOST_SCHEME = http
FRONTEND_PORT = 5173
BACKEND_PORT = 8000
POSTGRES_PORT = 5433 # host-side; container-internal stays 5432
POSTGRES_DB_NAME = resolutionflow
POSTGRES_USER = postgres
POSTGRES_PASSWORD = postgres # local-dev only
SECRET_KEY = <generated per host; do not reuse>
ANTHROPIC_API_KEY = <from console.anthropic.com>
GOOGLE_AI_API_KEY = <unset; Anthropic is sole provider in dev>
```
And the repo-root `.env` that `docker-compose.dev.yml` interpolates from:
```bash
df -h
SECRET_KEY=<redacted>
ANTHROPIC_API_KEY=<redacted>
POSTGRES_PORT=5433
REPO_ROOT=/opt/docker/code-server/workspace/resolutionflow
```
### Check memory + container usage
### 11.4 Why `REPO_ROOT` is non-optional on this host
code-server runs inside a container. When you open a terminal in code-server and run `docker compose -f docker-compose.dev.yml up -d`, the Docker CLI talks to the *host* daemon via the mounted socket — but the CWD it reports (`/config/workspace/resolutionflow`) is a path that only exists inside the code-server container. The host daemon has never heard of it.
Relative bind mounts like `./backend:/app` therefore resolve against a path the host can't see, and Docker silently creates empty directories there rather than erroring out. The containers come up, but `/app` is empty.
`docker-compose.dev.yml` sidesteps this by using `${REPO_ROOT}/backend:/app` and `${REPO_ROOT}/frontend:/app`. `REPO_ROOT` must be set to the absolute path **on the host** (`/opt/docker/code-server/workspace/resolutionflow`), not the path inside the code-server container. Same contents, different mount point, different name.
If you ever run `docker compose` directly from a host shell (SSH'd into `docker-01`), set `REPO_ROOT` to `.` or the absolute host path. Being explicit is always safe; leaving it unset is the failure mode.
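Because the failure is silent, a quick pre-flight check can save time. The sketch below is a hypothetical helper, not part of the repo. It only gives a meaningful answer when run in a shell that sees the same filesystem as the Docker daemon (for example, SSH'd into `docker-01`), since that is the filesystem the bind mounts resolve against.

```python
from pathlib import Path


def check_repo_root(value):
    """Return a list of problems with a REPO_ROOT value; empty means it looks usable."""
    if not value:
        return ["REPO_ROOT is unset or empty"]
    root = Path(value)
    problems = []
    # docker-compose.dev.yml mounts these two subdirectories at /app.
    for subdir in ("backend", "frontend"):
        if not (root / subdir).is_dir():
            problems.append(f"{root / subdir} does not exist on this filesystem")
    return problems
```

Calling `check_repo_root(os.environ.get("REPO_ROOT"))` before `docker compose up` turns an empty `/app` into an explicit error message.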
### 11.5 Vite `server.allowedHosts` — required for `docker-01` to resolve
Vite 5+ rejects any `Host:` header not in `server.allowedHosts` (DNS-rebinding protection). `frontend/vite.config.ts` has:
```ts
server: {
host: '0.0.0.0',
allowedHosts: ['docker-01', '.ts.net', 'localhost'],
...
}
```
- `docker-01` — the MagicDNS short name the browser uses day-to-day.
- `.ts.net` — wildcard for the full Tailscale MagicDNS FQDN, in case anyone uses it.
- `localhost` — for the "am I serving anything at all" smoke-test from inside the container.
If you move this setup to a different host, add that host's hostname to `allowedHosts` or the browser will see "Blocked request. This host is not allowed." See Section 8's troubleshooting entry for the full symptom/fix.
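The matching rule is easy to get wrong, so here is a rough model of it. This is an approximation of the documented behaviour (exact match, or a leading dot meaning "this domain and its subdomains"), not Vite's actual implementation. It ignores IPv6 literals and any hosts Vite allows by default, and the `.ts.net` FQDN in the usage note is a made-up example.

```python
def host_allowed(host_header, allowed_hosts):
    """Approximate check of a Host header against a server.allowedHosts list."""
    # Drop an optional ":port" suffix (IPv6 literals are not handled here).
    hostname = host_header.rsplit(":", 1)[0] if ":" in host_header else host_header
    hostname = hostname.lower()
    for entry in allowed_hosts:
        entry = entry.lower()
        if entry.startswith("."):
            # ".ts.net" covers "ts.net" itself and every subdomain of it.
            if hostname == entry[1:] or hostname.endswith(entry):
                return True
        elif hostname == entry:
            return True
    return False


ALLOWED = ["docker-01", ".ts.net", "localhost"]
```

With this list, `docker-01:5173` and a name such as `docker-01.example-tailnet.ts.net` are accepted, while a hostname that is not listed gets the "Blocked request" page.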
### 11.6 CORS origins on this host
The `backend` service's `CORS_ORIGINS` environment variable is pinned in the compose file to:
```
["http://localhost:5173","http://127.0.0.1:5173","http://docker-01:5173","http://100.64.78.44:5173"]
```
The last two are what make browser calls from tailnet clients work — they cover both MagicDNS (`docker-01`) and the raw Tailscale IP. If you add a new hostname to reach the frontend from, also add the matching origin here and restart the backend.
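An origin is the scheme, host, and port together, and a mismatch in any one of them is enough to fail the check. The sketch below illustrates that with the list above. It assumes the backend compares origins as exact strings, and the helper names are hypothetical.

```python
from urllib.parse import urlsplit

# The origins pinned in the compose file, as listed above.
CORS_ORIGINS = [
    "http://localhost:5173",
    "http://127.0.0.1:5173",
    "http://docker-01:5173",
    "http://100.64.78.44:5173",
]


def origin_of(url):
    """Reduce a page URL to the scheme://host:port origin a browser would send."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def origin_allowed(url, allowed=CORS_ORIGINS):
    """True if the page at this URL may call the backend under the pinned list."""
    return origin_of(url) in allowed
```

A page served from `http://docker-01:5173/pilot/abc` passes, while the same host over `https` or on another port does not, which is why each new hostname needs its own entry.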
### 11.7 Compose file shape (as of this writing)
`docker-compose.dev.yml` has been through a round of cleanup for this topology. Specifics worth knowing if you're comparing against older revisions of the file:
- **No Traefik labels.** They were removed — nothing in this topology uses Traefik.
- **No Hostinger-VPS-era origins** in `CORS_ORIGINS`.
- `Dockerfile.dev` for both `backend` and `frontend` is still the build source — this didn't change.
- Explicit `command:` directives on both `backend` (`uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload`) and `frontend` (`npm run dev -- --host 0.0.0.0 --port 5173`) — this guarantees `--host 0.0.0.0` regardless of what's baked into the image, so the services listen on all interfaces and are reachable from outside the container.
- `REPO_ROOT` is interpolated into both service volume mounts (see 11.4).
If you're adapting the file for a different host, the things most likely to need editing are `REPO_ROOT` (see 11.4), `CORS_ORIGINS` (see 11.6), `FRONTEND_URL`, `VITE_API_URL`, and `POSTGRES_PORT` if you want something other than `5433`.
### 11.8 End-to-end sanity check for this topology
From any device on the tailnet:
```bash
free -h && docker stats --no-stream
# Backend reachable
curl -sSf http://docker-01:8000/api/docs >/dev/null && echo OK
# Frontend reachable
curl -sSf http://docker-01:5173 >/dev/null && echo OK
# Alembic head matches the branch expectation
docker exec resolutionflow_backend alembic current
# expect f07010f17b01 on feat/flowpilot-migration, 074 on main
# Postgres is alive inside the compose network
docker exec resolutionflow_postgres psql -U postgres -d resolutionflow -c "SELECT now();"
```
## DNS Records (resolutionflow.com)
| Type | Name | Value | Purpose |
|---|---|---|---|
| A | code | 46.202.92.250 | VS Code Server |
## Security Notes
- UFW is inactive — Traefik and Docker manage port exposure
- All public-facing services run through Traefik with valid HTTPS certs
- PostgreSQL port 5432 is exposed on all interfaces — restrict if needed in production
- Rotate API keys (Anthropic, Voyage) if ever exposed in logs or chat
- Never commit `.env` files to Git
## VS Code Server Browser Tips
- **Command Palette:** `F1`
- **Terminal:** Ctrl+`
- **Rename file:** `F2`
- **Go to definition:** `F12`
- **Find references:** `Shift+F12`
- **Context Menu:** `Alt + Right Click`
If all four checks in 11.8 pass, the dev environment is live end to end.


@@ -29,13 +29,37 @@ from app.models.session_branch import SessionBranch # noqa: F401
from app.models.fork_point import ForkPoint # noqa: F401
from app.models.session_handoff import SessionHandoff # noqa: F401
from app.models.session_resolution_output import SessionResolutionOutput # noqa: F401
from app.core.config import settings


def _alembic_sync_url() -> str:
    """Return a psycopg2-compatible sync URL for Alembic.

    Priority order:
    1. DATABASE_URL_SYNC: in Railway this is set as a reference variable
       (${{pgvector.DATABASE_URL}}) that resolves to the correct postgres
       superuser credentials for the current environment (production, PR preview,
       etc.). This always works even on fresh databases before any custom roles
       have been created, because it uses the postgres superuser.
    2. ADMIN_DATABASE_URL (resolutionflow_admin, BYPASSRLS) converted to a sync
       driver: fallback for local dev where DATABASE_URL_SYNC may not be set.
    """
    if settings.DATABASE_URL_SYNC:
        return settings.DATABASE_URL_SYNC
    admin_url = settings.ADMIN_DATABASE_URL
    if admin_url and "+asyncpg" in admin_url:
        return admin_url.replace("postgresql+asyncpg://", "postgresql://")
    return settings.DATABASE_URL_SYNC

# this is the Alembic Config object
config = context.config
# Override sqlalchemy.url with the sync version for migrations
config.set_main_option("sqlalchemy.url", settings.DATABASE_URL_SYNC)
config.set_main_option("sqlalchemy.url", _alembic_sync_url())
# Interpret the config file for Python logging.
if config.config_file_name is not None:
@@ -86,7 +110,7 @@ def run_migrations_online() -> None:
from sqlalchemy import create_engine
connectable = create_engine(
settings.DATABASE_URL_SYNC,
_alembic_sync_url(),
poolclass=pool.NullPool,
)


@@ -0,0 +1,59 @@
"""Enable RLS on Phase 3 tables.

Tables covered:
  - step_ratings (account_id NOT NULL since migration 7167e9374b0c)
  - step_usage_log (account_id NOT NULL since migration 7167e9374b0c)
  - target_lists (account_id NOT NULL since migration 2c6aabd89bc6)
  - session_shares (account_id NOT NULL since session_share model)
  - audit_logs (account_id NOT NULL since migration 2a9056eddd90)
  - tree_shares (account_id NOT NULL since migration a05e1a1bea7c)

All use a standard intra-tenant isolation policy.
Token-based access to session_shares and tree_shares goes through
endpoints that use get_admin_db (BYPASSRLS), so a strict tenant
policy here is correct.

Revision ID: 04f013768235
Revises: a05e1a1bea7c
Create Date: 2026-04-11 00:00:00.000000
"""
from typing import Sequence, Union

from alembic import op

revision: str = '04f013768235'
down_revision: Union[str, None] = 'a05e1a1bea7c'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None

_CURRENT_ACCOUNT = (
    "COALESCE(NULLIF(current_setting('app.current_account_id', TRUE), ''), "
    "'00000000-0000-0000-0000-000000000000')::uuid"
)
_STANDARD_USING = f"account_id = {_CURRENT_ACCOUNT}"

_PHASE3_TABLES = [
    "step_ratings",
    "step_usage_log",
    "target_lists",
    "session_shares",
    "audit_logs",
    "tree_shares",
]


def upgrade() -> None:
    for table in _PHASE3_TABLES:
        op.execute(f"ALTER TABLE {table} ENABLE ROW LEVEL SECURITY")
        op.execute(f"ALTER TABLE {table} FORCE ROW LEVEL SECURITY")
        op.execute(f"""
            CREATE POLICY tenant_isolation ON {table}
            USING ({_STANDARD_USING})
        """)


def downgrade() -> None:
    for table in _PHASE3_TABLES:
        op.execute(f"DROP POLICY IF EXISTS tenant_isolation ON {table}")
        op.execute(f"ALTER TABLE {table} DISABLE ROW LEVEL SECURITY")
        op.execute(f"ALTER TABLE {table} NO FORCE ROW LEVEL SECURITY")


@@ -1,31 +0,0 @@
"""add triage fields to ai_sessions for cockpit harness

Revision ID: 071
Revises: 070
Create Date: 2026-04-01
"""
from alembic import op
import sqlalchemy as sa
from sqlalchemy.dialects.postgresql import JSONB

revision = "071"
down_revision = "070"
branch_labels = None
depends_on = None


def upgrade() -> None:
    op.add_column("ai_sessions", sa.Column("client_name", sa.String(255), nullable=True))
    op.add_column("ai_sessions", sa.Column("asset_name", sa.String(255), nullable=True))
    op.add_column("ai_sessions", sa.Column("issue_category", sa.String(100), nullable=True))
    op.add_column("ai_sessions", sa.Column("triage_hypothesis", sa.Text(), nullable=True))
    op.add_column("ai_sessions", sa.Column("evidence_items", JSONB(), nullable=True))


def downgrade() -> None:
    op.drop_column("ai_sessions", "evidence_items")
    op.drop_column("ai_sessions", "triage_hypothesis")
    op.drop_column("ai_sessions", "issue_category")
    op.drop_column("ai_sessions", "asset_name")
    op.drop_column("ai_sessions", "client_name")


@@ -1,61 +0,0 @@
"""Seed flowpilot_cockpit feature flag with plan defaults.

Revision ID: 072
Revises: 071
Create Date: 2026-04-02
"""
from alembic import op
import sqlalchemy as sa

revision = "072"
down_revision = "071"
branch_labels = None
depends_on = None


def upgrade() -> None:
    # Insert the feature flag
    op.execute(
        sa.text(
            "INSERT INTO feature_flags (id, flag_key, display_name, description) "
            "VALUES (gen_random_uuid(), 'flowpilot_cockpit', 'FlowPilot Cockpit', "
            "'Access to the FlowPilot Cockpit triage view') "
            "ON CONFLICT (flag_key) DO NOTHING"
        )
    )
    # Set plan defaults: disabled for free, enabled for pro and team
    op.execute(
        sa.text(
            "INSERT INTO plan_feature_defaults (id, plan, flag_id, enabled) "
            "SELECT gen_random_uuid(), 'free', id, false FROM feature_flags WHERE flag_key = 'flowpilot_cockpit' "
            "ON CONFLICT (plan, flag_id) DO NOTHING"
        )
    )
    op.execute(
        sa.text(
            "INSERT INTO plan_feature_defaults (id, plan, flag_id, enabled) "
            "SELECT gen_random_uuid(), 'pro', id, true FROM feature_flags WHERE flag_key = 'flowpilot_cockpit' "
            "ON CONFLICT (plan, flag_id) DO NOTHING"
        )
    )
    op.execute(
        sa.text(
            "INSERT INTO plan_feature_defaults (id, plan, flag_id, enabled) "
            "SELECT gen_random_uuid(), 'team', id, true FROM feature_flags WHERE flag_key = 'flowpilot_cockpit' "
            "ON CONFLICT (plan, flag_id) DO NOTHING"
        )
    )


def downgrade() -> None:
    op.execute(
        sa.text(
            "DELETE FROM plan_feature_defaults WHERE flag_id IN "
            "(SELECT id FROM feature_flags WHERE flag_key = 'flowpilot_cockpit')"
        )
    )
    op.execute(
        sa.text("DELETE FROM feature_flags WHERE flag_key = 'flowpilot_cockpit'")
    )


@@ -1,8 +1,8 @@
"""Add device_types table with system seed data.
"""Add account-scoped device_types table with platform seed data.
Revision ID: 073
Revises: 072
Create Date: 2026-04-04
Revises: b3c7e9f2a1d8
Create Date: 2026-04-12
"""
from alembic import op
import sqlalchemy as sa
@@ -11,10 +11,18 @@ import uuid
revision = "073"
down_revision = "072"
down_revision = "b3c7e9f2a1d8"
branch_labels = None
depends_on = None
_PLATFORM_UUID = "00000000-0000-0000-0000-000000000001"
_CURRENT_ACCOUNT = (
"COALESCE("
"NULLIF(current_setting('app.current_account_id', TRUE), ''), "
"'00000000-0000-0000-0000-000000000000'"
")::uuid"
)
SYSTEM_DEVICE_TYPES = [
("router", "Router", "network", 0),
("switch", "Switch", "network", 1),
@@ -55,16 +63,13 @@ def upgrade() -> None:
sa.Column("label", sa.String(100), nullable=False),
sa.Column("category", sa.String(50), nullable=False),
sa.Column("is_system", sa.Boolean(), nullable=False, server_default=sa.text("false")),
sa.Column("team_id", UUID(as_uuid=True), sa.ForeignKey("teams.id", ondelete="CASCADE"), nullable=True),
sa.Column("account_id", UUID(as_uuid=True), sa.ForeignKey("accounts.id", ondelete="CASCADE"), nullable=False),
sa.Column("sort_order", sa.Integer(), nullable=False, server_default=sa.text("0")),
sa.Column("created_at", sa.DateTime(timezone=True), server_default=sa.text("now()")),
)
op.execute(
"ALTER TABLE device_types ADD CONSTRAINT uq_device_types_slug_team "
"UNIQUE NULLS NOT DISTINCT (slug, team_id)"
)
op.create_index("idx_device_types_team", "device_types", ["team_id"])
op.create_unique_constraint("uq_device_types_slug_account", "device_types", ["slug", "account_id"])
op.create_index("ix_device_types_account_id", "device_types", ["account_id"])
device_types_table = sa.table(
"device_types",
@@ -73,7 +78,7 @@ def upgrade() -> None:
sa.column("label", sa.String),
sa.column("category", sa.String),
sa.column("is_system", sa.Boolean),
sa.column("team_id", UUID(as_uuid=True)),
sa.column("account_id", UUID(as_uuid=True)),
sa.column("sort_order", sa.Integer),
)
@@ -84,12 +89,44 @@ def upgrade() -> None:
"label": label,
"category": category,
"is_system": True,
"team_id": None,
"account_id": uuid.UUID(_PLATFORM_UUID),
"sort_order": sort_order,
}
for slug, label, category, sort_order in SYSTEM_DEVICE_TYPES
])
op.execute("ALTER TABLE device_types ENABLE ROW LEVEL SECURITY")
op.execute("ALTER TABLE device_types FORCE ROW LEVEL SECURITY")
op.execute(f"""
CREATE POLICY device_types_select ON device_types
FOR SELECT
USING (
account_id = {_CURRENT_ACCOUNT}
OR account_id = '{_PLATFORM_UUID}'::uuid
)
""")
op.execute(f"""
CREATE POLICY device_types_insert ON device_types
FOR INSERT
WITH CHECK (account_id = {_CURRENT_ACCOUNT})
""")
op.execute(f"""
CREATE POLICY device_types_update ON device_types
FOR UPDATE
USING (account_id = {_CURRENT_ACCOUNT})
WITH CHECK (account_id = {_CURRENT_ACCOUNT})
""")
op.execute(f"""
CREATE POLICY device_types_delete ON device_types
FOR DELETE
USING (account_id = {_CURRENT_ACCOUNT})
""")
def downgrade() -> None:
op.execute("DROP POLICY IF EXISTS device_types_delete ON device_types")
op.execute("DROP POLICY IF EXISTS device_types_update ON device_types")
op.execute("DROP POLICY IF EXISTS device_types_insert ON device_types")
op.execute("DROP POLICY IF EXISTS device_types_select ON device_types")
op.execute("ALTER TABLE device_types DISABLE ROW LEVEL SECURITY")
op.drop_table("device_types")


@@ -2,7 +2,7 @@
Revision ID: 074
Revises: 073
Create Date: 2026-04-04
Create Date: 2026-04-12
"""
from alembic import op
import sqlalchemy as sa
@@ -14,12 +14,19 @@ down_revision = "073"
branch_labels = None
depends_on = None
_CURRENT_ACCOUNT = (
"COALESCE("
"NULLIF(current_setting('app.current_account_id', TRUE), ''), "
"'00000000-0000-0000-0000-000000000000'"
")::uuid"
)
def upgrade() -> None:
op.create_table(
"network_diagrams",
sa.Column("id", UUID(as_uuid=True), primary_key=True, server_default=sa.text("gen_random_uuid()")),
sa.Column("team_id", UUID(as_uuid=True), sa.ForeignKey("teams.id", ondelete="CASCADE"), nullable=False),
sa.Column("account_id", UUID(as_uuid=True), sa.ForeignKey("accounts.id", ondelete="CASCADE"), nullable=False),
sa.Column("name", sa.String(255), nullable=False),
sa.Column("client_name", sa.String(255), nullable=True),
sa.Column("asset_name", sa.String(255), nullable=True),
@@ -33,9 +40,18 @@ def upgrade() -> None:
sa.Column("updated_at", sa.DateTime(timezone=True), server_default=sa.text("now()")),
)
op.create_index("idx_network_diagrams_team", "network_diagrams", ["team_id"])
op.create_index("idx_network_diagrams_client", "network_diagrams", ["team_id", "client_name"])
op.create_index("ix_network_diagrams_account_id", "network_diagrams", ["account_id"])
op.create_index("idx_network_diagrams_account_client", "network_diagrams", ["account_id", "client_name"])
op.execute("ALTER TABLE network_diagrams ENABLE ROW LEVEL SECURITY")
op.execute("ALTER TABLE network_diagrams FORCE ROW LEVEL SECURITY")
op.execute(f"""
CREATE POLICY tenant_isolation ON network_diagrams
USING (account_id = {_CURRENT_ACCOUNT})
WITH CHECK (account_id = {_CURRENT_ACCOUNT})
""")
def downgrade() -> None:
op.execute("DROP POLICY IF EXISTS tenant_isolation ON network_diagrams")
op.execute("ALTER TABLE network_diagrams DISABLE ROW LEVEL SECURITY")
op.drop_table("network_diagrams")


@@ -0,0 +1,102 @@
"""create_db_roles
Revision ID: 0b470d9e6cf1
Revises: a9f3b2c1d4e5
Create Date: 2026-04-10 03:58:10.207919
"""
import os
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
from sqlalchemy import text
# revision identifiers, used by Alembic.
revision: str = '0b470d9e6cf1'
down_revision: Union[str, None] = 'a9f3b2c1d4e5'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# Passwords from env vars. For local dev, defaults are sufficient.
# For production (Railway), set DB_APP_ROLE_PASSWORD and
# DB_ADMIN_ROLE_PASSWORD as environment variables before running migrations.
# Passwords must not contain single quotes.
app_pw = os.environ.get("DB_APP_ROLE_PASSWORD", "app_secret_change_me")
admin_pw = os.environ.get("DB_ADMIN_ROLE_PASSWORD", "admin_secret_change_me")
# Fetch the current database name dynamically — avoids hardcoding
# (the DB is named 'resolutionflow' in dev, potentially different elsewhere).
conn = op.get_bind()
db_name = conn.execute(text("SELECT current_database()")).scalar()
# ── Application role ────────────────────────────────────────────────────
# Subject to RLS. Used by FastAPI at runtime via DATABASE_URL.
op.execute(f"""
DO $$
BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname = 'resolutionflow_app') THEN
CREATE ROLE resolutionflow_app LOGIN PASSWORD '{app_pw}';
ELSE
ALTER ROLE resolutionflow_app LOGIN PASSWORD '{app_pw}';
END IF;
END $$
""")
op.execute(f"GRANT CONNECT ON DATABASE {db_name} TO resolutionflow_app")
op.execute("GRANT USAGE ON SCHEMA public TO resolutionflow_app")
op.execute(
"GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public "
"TO resolutionflow_app"
)
op.execute(
"GRANT USAGE, SELECT ON ALL SEQUENCES IN SCHEMA public TO resolutionflow_app"
)
# Ensure future tables automatically get the same permissions
op.execute(
"ALTER DEFAULT PRIVILEGES IN SCHEMA public "
"GRANT SELECT, INSERT, UPDATE, DELETE ON TABLES TO resolutionflow_app"
)
op.execute(
"ALTER DEFAULT PRIVILEGES IN SCHEMA public "
"GRANT USAGE, SELECT ON SEQUENCES TO resolutionflow_app"
)
# ── Admin role ──────────────────────────────────────────────────────────
# BYPASSRLS. Used by Alembic (DATABASE_URL_SYNC) and /admin/* endpoints
# (ADMIN_DATABASE_URL) after Task 11.
op.execute(f"""
DO $$
BEGIN
IF NOT EXISTS (SELECT 1 FROM pg_roles WHERE rolname = 'resolutionflow_admin') THEN
CREATE ROLE resolutionflow_admin LOGIN PASSWORD '{admin_pw}';
ELSE
ALTER ROLE resolutionflow_admin LOGIN PASSWORD '{admin_pw}';
END IF;
END $$
""")
op.execute("GRANT resolutionflow_app TO resolutionflow_admin")
op.execute("ALTER ROLE resolutionflow_admin BYPASSRLS")
op.execute(f"GRANT CONNECT ON DATABASE {db_name} TO resolutionflow_admin")
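def downgrade() -> None:
    conn = op.get_bind()
    db_name = conn.execute(text("SELECT current_database()")).scalar()
    # Reverse the ALTER DEFAULT PRIVILEGES grants first. PostgreSQL refuses to
    # drop a role that is still referenced by default-privilege entries. Run
    # this as the same role that ran upgrade(), because default privileges are
    # recorded per granting role.
    op.execute(
        "ALTER DEFAULT PRIVILEGES IN SCHEMA public "
        "REVOKE SELECT, INSERT, UPDATE, DELETE ON TABLES FROM resolutionflow_app"
    )
    op.execute(
        "ALTER DEFAULT PRIVILEGES IN SCHEMA public "
        "REVOKE USAGE, SELECT ON SEQUENCES FROM resolutionflow_app"
    )
    op.execute(
        "REVOKE ALL ON ALL TABLES IN SCHEMA public FROM resolutionflow_app"
    )
    op.execute(
        "REVOKE ALL ON ALL SEQUENCES IN SCHEMA public FROM resolutionflow_app"
    )
    # The schema-level USAGE grant also blocks DROP ROLE until it is revoked.
    op.execute("REVOKE USAGE ON SCHEMA public FROM resolutionflow_app")
    op.execute(
        f"REVOKE CONNECT ON DATABASE {db_name} FROM resolutionflow_app"
    )
    op.execute(
        f"REVOKE CONNECT ON DATABASE {db_name} FROM resolutionflow_admin"
    )
    op.execute("DROP ROLE IF EXISTS resolutionflow_admin")
    op.execute("DROP ROLE IF EXISTS resolutionflow_app")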
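The create_db_roles migration interpolates the role passwords and the database name directly into SQL text, which is why it has to forbid single quotes in passwords. A minimal sketch of how that restriction could be lifted is shown here. It assumes PostgreSQL's default `standard_conforming_strings=on` (so only single quotes need doubling) and assumes the statement is issued as a plain `ALTER ROLE` rather than inside the `DO $$ ... $$` block. The helper names `quote_literal`, `quote_ident`, and `role_password_sql` are hypothetical and not part of this commit.

```python
def quote_literal(value: str) -> str:
    """Return value as a single-quoted SQL string literal.

    Assumption: standard_conforming_strings=on, so backslashes are literal
    and doubling single quotes is the only escaping required.
    """
    if "\x00" in value:
        raise ValueError("NUL bytes cannot appear in a PostgreSQL string")
    return "'" + value.replace("'", "''") + "'"


def quote_ident(name: str) -> str:
    """Return name as a double-quoted SQL identifier."""
    if "\x00" in name:
        raise ValueError("NUL bytes cannot appear in a PostgreSQL identifier")
    return '"' + name.replace('"', '""') + '"'


def role_password_sql(role: str, password: str) -> str:
    """Build an ALTER ROLE statement with the role and password quoted."""
    return (
        f"ALTER ROLE {quote_ident(role)} "
        f"LOGIN PASSWORD {quote_literal(password)}"
    )
```

The same `quote_ident` helper would also cover the `GRANT CONNECT ON DATABASE {db_name}` statements if a database name ever contained characters that need quoting, such as a hyphen.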

@@ -0,0 +1,32 @@
"""Drop team_id from target_lists.
account_id (NOT NULL) is now the tenant isolation key; team_id is redundant.
All reads/writes use account_id via RLS + application filter.
Revision ID: 172ad76d7d20
Revises: 04f013768235
Create Date: 2026-04-11 00:00:00.000000
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
revision: str = '172ad76d7d20'
down_revision: Union[str, None] = '04f013768235'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
op.drop_index('ix_target_lists_team_id', table_name='target_lists', if_exists=True)
op.drop_constraint('target_lists_team_id_fkey', 'target_lists', type_='foreignkey')
op.drop_column('target_lists', 'team_id')
def downgrade() -> None:
op.add_column('target_lists', sa.Column('team_id', sa.UUID(), nullable=True))
op.create_foreign_key(
'target_lists_team_id_fkey', 'target_lists', 'teams',
['team_id'], ['id'], ondelete='CASCADE',
)
op.create_index('ix_target_lists_team_id', 'target_lists', ['team_id'])

@@ -0,0 +1,86 @@
"""set NOT NULL on all previously-nullable account_id columns
Revision ID: 174f442795b7
Revises: 3a40fe11b427
Create Date: 2026-04-09 00:00:00.000000
Every table in this migration previously had a nullable account_id column.
Task 9 (create_global_content_tables) cleared all NULL rows.
This migration enforces the NOT NULL constraint.
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
revision: str = '174f442795b7'
down_revision: Union[str, None] = '3a40fe11b427'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# tree_embeddings: backfill from trees (must happen before SET NOT NULL)
op.execute("""
UPDATE tree_embeddings te
SET account_id = t.account_id
FROM trees t
WHERE te.tree_id = t.id
AND te.account_id IS NULL
""")
# feedback: backfill from users
op.execute("""
UPDATE feedback f
SET account_id = u.account_id
FROM users u
WHERE f.user_id = u.id
AND f.account_id IS NULL
""")
# Verify ALL tables before touching any SET NOT NULL
tables_with_account_id = [
'users', 'trees', 'tree_categories', 'tree_tags',
'step_categories', 'step_library', 'tree_embeddings', 'feedback',
]
for table in tables_with_account_id:
result = op.get_bind().execute(
sa.text(f"SELECT COUNT(*) FROM {table} WHERE account_id IS NULL")
)
count = result.scalar()
if count > 0:
raise RuntimeError(
f"ROLLBACK: {count} NULL account_id rows in {table}. "
"Run Task 9 (create_global_content_tables) first, or "
"manually backfill/delete orphaned rows."
)
# SET NOT NULL on all
for table in tables_with_account_id:
op.alter_column(table, 'account_id', nullable=False)
# Create indexes where they don't already exist
new_indexes = [
('tree_embeddings', 'ix_tree_embeddings_account_id'),
('feedback', 'ix_feedback_account_id'),
]
for table, index_name in new_indexes:
result = op.get_bind().execute(sa.text(
f"SELECT 1 FROM pg_indexes WHERE tablename='{table}' AND indexname='{index_name}'"
))
if not result.fetchone():
op.create_index(index_name, table, ['account_id'])
def downgrade() -> None:
    # Revert to nullable
    for table in ('users', 'trees', 'tree_categories', 'tree_tags',
                  'step_categories', 'step_library', 'tree_embeddings', 'feedback'):
        op.alter_column(table, 'account_id', nullable=True)
    # PostgreSQL aborts the whole transaction when a statement fails, so a
    # try/except around drop_index cannot safely skip a missing index.
    # if_exists=True emits DROP INDEX IF EXISTS instead.
    for table, index_name in (
        ('tree_embeddings', 'ix_tree_embeddings_account_id'),
        ('feedback', 'ix_feedback_account_id'),
    ):
        op.drop_index(index_name, table_name=table, if_exists=True)
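The migrations in this commit all follow the same backfill, verify, then constrain sequence. The sketch below reproduces the backfill and the verification guard against an in-memory SQLite database so it can run without PostgreSQL. SQLite is only a stand-in here, the backfill uses a correlated subquery instead of `UPDATE ... FROM`, and the helper names `count_null_accounts` and `verify_no_null_accounts` are hypothetical.

```python
import sqlite3


def count_null_accounts(conn: sqlite3.Connection, table: str) -> int:
    # Table names cannot be bound as parameters, so only pass trusted,
    # hard-coded names here (as the migrations do).
    row = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE account_id IS NULL"
    ).fetchone()
    return row[0]


def verify_no_null_accounts(conn: sqlite3.Connection, tables) -> None:
    # Check every table before any constraint is applied.
    for table in tables:
        count = count_null_accounts(conn, table)
        if count > 0:
            raise RuntimeError(f"{count} NULL account_id rows in {table}")


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, account_id TEXT);
        CREATE TABLE feedback (
            id INTEGER PRIMARY KEY, user_id INTEGER, account_id TEXT
        );
        INSERT INTO users VALUES (1, 'acct-a'), (2, 'acct-b');
        INSERT INTO feedback VALUES (10, 1, NULL), (11, 2, NULL), (12, 99, NULL);
    """)
    # Backfill: derive account_id from the owning user.
    conn.execute("""
        UPDATE feedback
        SET account_id = (
            SELECT u.account_id FROM users u WHERE u.id = feedback.user_id
        )
        WHERE account_id IS NULL
    """)
    # Row 12 points at a user that does not exist, so it stays NULL.
    orphans = count_null_accounts(conn, "feedback")
    try:
        verify_no_null_accounts(conn, ["feedback"])
        guard_raised = False
    except RuntimeError:
        guard_raised = True
    # Manual resolution of the orphan, after which the guard passes.
    conn.execute("DELETE FROM feedback WHERE account_id IS NULL")
    verify_no_null_accounts(conn, ["feedback"])
    remaining = count_null_accounts(conn, "feedback")
    conn.close()
    return orphans, guard_raised, remaining
```

Running the guard over every table before the first `SET NOT NULL` means a failure is reported before any constraint has been changed.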

@@ -0,0 +1,51 @@
"""Add account_id to audit_logs and backfill via user_id.
Revision ID: 2a9056eddd90
Revises: 70a5dd746e83
Create Date: 2026-04-11 00:00:00.000000
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
revision: str = '2a9056eddd90'
down_revision: Union[str, None] = '70a5dd746e83'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
op.add_column('audit_logs', sa.Column('account_id', sa.UUID(), nullable=True))
op.create_foreign_key(
'fk_audit_logs_account_id', 'audit_logs', 'accounts',
['account_id'], ['id'], ondelete='CASCADE',
)
# Backfill: derive from the acting user's account
op.execute("""
UPDATE audit_logs al
SET account_id = u.account_id
FROM users u
WHERE al.user_id = u.id
AND u.account_id IS NOT NULL
AND al.account_id IS NULL
""")
result = op.get_bind().execute(
sa.text("SELECT COUNT(*) FROM audit_logs WHERE account_id IS NULL")
)
count = result.scalar()
if count > 0:
raise RuntimeError(
f"ROLLBACK: {count} audit_logs rows have NULL account_id after backfill. "
"All audit log entries must have an associated user with an account."
)
op.alter_column('audit_logs', 'account_id', nullable=False)
op.create_index('ix_audit_logs_account_id', 'audit_logs', ['account_id'])
def downgrade() -> None:
op.drop_index('ix_audit_logs_account_id', table_name='audit_logs')
op.drop_constraint('fk_audit_logs_account_id', 'audit_logs', type_='foreignkey')
op.drop_column('audit_logs', 'account_id')

@@ -0,0 +1,62 @@
"""add account_id to target_lists (keep team_id)
Revision ID: 2c6aabd89bc6
Revises: 78fc200abac1
Create Date: 2026-04-09 00:00:00.000000
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
revision: str = '2c6aabd89bc6'
down_revision: Union[str, None] = '78fc200abac1'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
op.add_column('target_lists', sa.Column('account_id', sa.UUID(), nullable=True))
op.create_foreign_key(
'fk_target_lists_account_id', 'target_lists', 'accounts',
['account_id'], ['id'], ondelete='CASCADE',
)
# Primary: team_id → team admin user → account_id
op.execute("""
UPDATE target_lists tl
SET account_id = u.account_id
FROM users u
WHERE u.team_id = tl.team_id
AND u.is_team_admin = TRUE
AND u.account_id IS NOT NULL
AND tl.account_id IS NULL
""")
# Fallback: created_by → users.account_id
op.execute("""
UPDATE target_lists tl
SET account_id = u.account_id
FROM users u
WHERE tl.created_by = u.id
AND u.account_id IS NOT NULL
AND tl.account_id IS NULL
""")
result = op.get_bind().execute(
sa.text("SELECT COUNT(*) FROM target_lists WHERE account_id IS NULL")
)
count = result.scalar()
if count > 0:
raise RuntimeError(
f"ROLLBACK: {count} target_lists rows have NULL account_id. "
"No team admin found for these teams. Resolve before re-running."
)
op.alter_column('target_lists', 'account_id', nullable=False)
op.create_index('ix_target_lists_account_id', 'target_lists', ['account_id'])
def downgrade() -> None:
op.drop_index('ix_target_lists_account_id', table_name='target_lists')
op.drop_constraint('fk_target_lists_account_id', 'target_lists', type_='foreignkey')
op.drop_column('target_lists', 'account_id')

@@ -0,0 +1,175 @@
"""create template_trees and platform_steps global content tables
Revision ID: 3a40fe11b427
Revises: 2c6aabd89bc6
Create Date: 2026-04-09 00:00:00.000000
These tables hold platform-owned content that is readable by all
authenticated users. No account_id. No RLS. Ever.
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
from sqlalchemy.dialects.postgresql import UUID, JSONB
revision: str = '3a40fe11b427'
down_revision: Union[str, None] = '2c6aabd89bc6'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# ── Create template_trees ─────────────────────────────────────────────────
op.create_table(
'template_trees',
sa.Column('id', UUID(), primary_key=True),
sa.Column('name', sa.String(255), nullable=False),
sa.Column('description', sa.Text(), nullable=True),
sa.Column('category', sa.String(100), nullable=True),
sa.Column('tree_type', sa.String(20), nullable=False),
sa.Column('tree_structure', JSONB(), nullable=False),
sa.Column('tags', JSONB(), nullable=False, server_default='[]'),
sa.Column('is_active', sa.Boolean(), nullable=False, server_default='true'),
sa.Column('created_at', sa.DateTime(timezone=True), nullable=False),
sa.Column('updated_at', sa.DateTime(timezone=True), nullable=False),
sa.Column('source_tree_id', UUID(), sa.ForeignKey('trees.id', ondelete='SET NULL'), nullable=True),
)
op.create_index('ix_template_trees_tree_type', 'template_trees', ['tree_type'])
# ── Create platform_steps ────────────────────────────────────────────────
op.create_table(
'platform_steps',
sa.Column('id', UUID(), primary_key=True),
sa.Column('title', sa.String(255), nullable=False),
sa.Column('step_type', sa.String(50), nullable=False),
sa.Column('content', JSONB(), nullable=False),
sa.Column('is_active', sa.Boolean(), nullable=False, server_default='true'),
sa.Column('created_at', sa.DateTime(timezone=True), nullable=False),
sa.Column('updated_at', sa.DateTime(timezone=True), nullable=False),
sa.Column('source_step_id', UUID(), sa.ForeignKey('step_library.id', ondelete='SET NULL'), nullable=True),
)
op.create_index('ix_platform_steps_step_type', 'platform_steps', ['step_type'])
# ── Copy is_default=TRUE trees → template_trees ─────────────────────────
# Note: trees.tags is a relationship via tree_tags join table — no direct column.
# Aggregate tag names via a correlated subquery.
op.execute("""
INSERT INTO template_trees
(id, name, description, category, tree_type, tree_structure,
tags, is_active, created_at, updated_at, source_tree_id)
SELECT
gen_random_uuid(), t.name, t.description, t.category, t.tree_type,
t.tree_structure,
COALESCE(
(SELECT jsonb_agg(tt.name ORDER BY tt.name)
FROM tree_tag_assignments ta
JOIN tree_tags tt ON tt.id = ta.tag_id
WHERE ta.tree_id = t.id),
'[]'::jsonb
),
t.is_active,
COALESCE(t.created_at, NOW()), COALESCE(t.updated_at, NOW()), t.id
FROM trees t
WHERE t.is_default = TRUE
""")
# ── Copy visibility='public' steps → platform_steps ─────────────────────
op.execute("""
INSERT INTO platform_steps
(id, title, step_type, content, is_active, created_at, updated_at, source_step_id)
SELECT
gen_random_uuid(), title, step_type, content, is_active,
COALESCE(created_at, NOW()), COALESCE(updated_at, NOW()), id
FROM step_library
WHERE visibility = 'public'
""")
# ── Create platform sentinel account ─────────────────────────────────────
op.execute("""
INSERT INTO accounts (id, name, display_code, created_at, updated_at)
VALUES (
'00000000-0000-0000-0000-000000000001',
'ResolutionFlow Platform',
'PLATFORM',
NOW(),
NOW()
)
ON CONFLICT (id) DO NOTHING
""")
# ── Assign is_default trees to platform account ──────────────────────────
op.execute("""
UPDATE trees
SET account_id = '00000000-0000-0000-0000-000000000001'
WHERE is_default = TRUE
AND account_id IS NULL
""")
# ── Assign remaining trees to their author's account ─────────────────────
# Handles trees with no team_id that aren't is_default (e.g. inactive test
# trees, trees created before the team system existed).
op.execute("""
UPDATE trees
SET account_id = u.account_id
FROM users u
WHERE trees.author_id = u.id
AND trees.account_id IS NULL
AND u.account_id IS NOT NULL
""")
# ── Final fallback: any still-NULL trees go to platform account ───────────
# Covers trees whose author has no account (seeded content, system rows).
op.execute("""
UPDATE trees
SET account_id = '00000000-0000-0000-0000-000000000001'
WHERE account_id IS NULL
""")
# ── Assign global categories/tags/steps to platform account ─────────────
op.execute("""
UPDATE tree_categories
SET account_id = '00000000-0000-0000-0000-000000000001'
WHERE account_id IS NULL
""")
op.execute("""
UPDATE tree_tags
SET account_id = '00000000-0000-0000-0000-000000000001'
WHERE account_id IS NULL
""")
op.execute("""
UPDATE step_categories
SET account_id = '00000000-0000-0000-0000-000000000001'
WHERE account_id IS NULL
""")
op.execute("""
UPDATE step_library
SET account_id = '00000000-0000-0000-0000-000000000001'
WHERE account_id IS NULL
""")
# ── Verify zero NULLs in all 5 tables ───────────────────────────────────
for table in ('trees', 'tree_categories', 'tree_tags', 'step_categories', 'step_library'):
result = op.get_bind().execute(
sa.text(f"SELECT COUNT(*) FROM {table} WHERE account_id IS NULL")
)
count = result.scalar()
if count > 0:
raise RuntimeError(
f"ROLLBACK: {count} NULL account_id rows remain in {table} "
"after platform account assignment. Investigate before re-running."
)
def downgrade() -> None:
platform_id = '00000000-0000-0000-0000-000000000001'
for table in ('trees', 'tree_categories', 'tree_tags', 'step_categories', 'step_library'):
op.execute(f"UPDATE {table} SET account_id = NULL WHERE account_id = '{platform_id}'")
op.execute(f"DELETE FROM accounts WHERE id = '{platform_id}'")
op.drop_index('ix_platform_steps_step_type', table_name='platform_steps')
op.drop_index('ix_template_trees_tree_type', table_name='template_trees')
op.drop_table('platform_steps')
op.drop_table('template_trees')
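The three UPDATE statements that assign `trees.account_id` in the global content tables migration form a precedence chain, and their order matters. As a rough illustration, the same precedence can be written as one pure function. The function name `resolve_tree_account` and its parameters are hypothetical. Only the sentinel UUID comes from the migration.

```python
from typing import Optional

# Sentinel account inserted by the migration for platform-owned content.
PLATFORM_ACCOUNT_ID = "00000000-0000-0000-0000-000000000001"


def resolve_tree_account(
    current_account_id: Optional[str],
    is_default: bool,
    author_account_id: Optional[str],
) -> str:
    """Mirror the order of the three UPDATEs on the trees table."""
    # Every UPDATE is guarded by "account_id IS NULL", so an existing
    # value is never overwritten.
    if current_account_id is not None:
        return current_account_id
    # 1. Default trees go to the platform account, even if the author
    #    has an account of their own.
    if is_default:
        return PLATFORM_ACCOUNT_ID
    # 2. Remaining trees inherit the author's account when there is one.
    if author_account_id is not None:
        return author_account_id
    # 3. Final fallback: the platform account.
    return PLATFORM_ACCOUNT_ID
```

Because the default-tree assignment runs first, a default tree whose author belongs to a tenant still ends up under the platform account.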

@@ -0,0 +1,77 @@
"""add account_id to AI branching tables
Revision ID: 478c159e5654
Revises: cc214c63aa30
Create Date: 2026-04-09 00:00:00.000000
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
revision: str = '478c159e5654'
down_revision: Union[str, None] = 'cc214c63aa30'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
ai_tables = ('session_branches', 'session_handoffs', 'fork_points', 'ai_session_steps')
# Step 1: ADD COLUMN (nullable)
for table in ai_tables:
op.add_column(table, sa.Column('account_id', sa.UUID(), nullable=True))
op.create_foreign_key(
f'fk_{table}_account_id', table, 'accounts',
['account_id'], ['id'], ondelete='CASCADE',
)
op.add_column('ai_suggestions', sa.Column('account_id', sa.UUID(), nullable=True))
op.create_foreign_key(
'fk_ai_suggestions_account_id', 'ai_suggestions', 'accounts',
['account_id'], ['id'], ondelete='CASCADE',
)
# Step 2: BACKFILL
for table in ai_tables:
op.execute(f"""
UPDATE {table} t
SET account_id = ai.account_id
FROM ai_sessions ai
WHERE t.session_id = ai.id
AND t.account_id IS NULL
""")
op.execute("""
UPDATE ai_suggestions s
SET account_id = u.account_id
FROM users u
WHERE s.user_id = u.id
AND s.account_id IS NULL
""")
# Step 3: VERIFY zero NULLs
for table in ai_tables + ('ai_suggestions',):
result = op.get_bind().execute(
sa.text(f"SELECT COUNT(*) FROM {table} WHERE account_id IS NULL")
)
count = result.scalar()
if count > 0:
raise RuntimeError(
f"ROLLBACK: {count} NULL account_id rows in {table}."
)
# Step 4: SET NOT NULL
for table in ai_tables + ('ai_suggestions',):
op.alter_column(table, 'account_id', nullable=False)
# Step 5: CREATE INDEX
for table in ai_tables + ('ai_suggestions',):
op.create_index(f'ix_{table}_account_id', table, ['account_id'])
def downgrade() -> None:
for table in ('session_branches', 'session_handoffs', 'fork_points',
'ai_session_steps', 'ai_suggestions'):
op.drop_index(f'ix_{table}_account_id', table_name=table)
op.drop_constraint(f'fk_{table}_account_id', table, type_='foreignkey')
op.drop_column(table, 'account_id')

@@ -0,0 +1,74 @@
"""add fix outcome tracking columns to session_suggested_fixes
Adds: status, applied_at, verified_at, partial_notes, failure_reason,
ai_outcome_proposal.
status is the outcome dimension (did the fix work?), orthogonal to the
existing user_decision column (which script-path the engineer took).
Revision ID: 6492ec8d2d5b
Revises: f07010f17b01
Create Date: 2026-04-23 18:32:38.609719
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
from sqlalchemy.dialects import postgresql
# revision identifiers, used by Alembic.
revision: str = '6492ec8d2d5b'
down_revision: Union[str, None] = 'f07010f17b01'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
op.add_column(
"session_suggested_fixes",
sa.Column("status", sa.String(length=20), nullable=False, server_default=sa.text("'proposed'")),
)
op.add_column(
"session_suggested_fixes",
sa.Column("applied_at", sa.DateTime(timezone=True), nullable=True),
)
op.add_column(
"session_suggested_fixes",
sa.Column("verified_at", sa.DateTime(timezone=True), nullable=True),
)
op.add_column(
"session_suggested_fixes",
sa.Column("partial_notes", sa.Text(), nullable=True),
)
op.add_column(
"session_suggested_fixes",
sa.Column("failure_reason", sa.Text(), nullable=True),
)
op.add_column(
"session_suggested_fixes",
sa.Column("ai_outcome_proposal", postgresql.JSONB(), nullable=True),
)
# Backfill before constraint creation so dismissed rows satisfy the new CHECK.
op.execute(
"UPDATE session_suggested_fixes "
"SET status = 'dismissed' "
"WHERE user_decision = 'dismissed'"
)
op.create_check_constraint(
"ck_session_suggested_fixes_status",
"session_suggested_fixes",
"status IN ('proposed', 'applied_success', 'applied_failed', 'applied_partial', 'dismissed')",
)
op.alter_column("session_suggested_fixes", "status", server_default=None)
def downgrade() -> None:
op.drop_constraint("ck_session_suggested_fixes_status", "session_suggested_fixes", type_="check")
op.drop_column("session_suggested_fixes", "ai_outcome_proposal")
op.drop_column("session_suggested_fixes", "failure_reason")
op.drop_column("session_suggested_fixes", "partial_notes")
op.drop_column("session_suggested_fixes", "verified_at")
op.drop_column("session_suggested_fixes", "applied_at")
op.drop_column("session_suggested_fixes", "status")

@@ -0,0 +1,90 @@
"""Enable RLS on Phase 2 session and supporting tables.
10 tables use a standard tenant-only policy.
step_library uses a visibility-aware policy — public steps visible to all tenants.
NOTE: session_messages does not exist in this codebase (removed from plan).
script_generations is the correct table name (not script_template_generations).
sessions and ai_sessions are two separate tables, both in scope.
Prerequisites:
- Phase 1 migration must have run (resolutionflow_app role exists, Phase 1 tables have RLS)
- NOT NULL write-path bugs fixed (P2-A commit b641ac6)
- shares.py cross-tenant session fix deployed (P2-B commit ac2b193)
Revision ID: 70a5dd746e83
Revises: c5f48b9890f9
Create Date: 2026-04-10 06:54:49.431817
"""
from typing import Sequence, Union
from alembic import op
# revision identifiers, used by Alembic.
revision: str = '70a5dd746e83'
down_revision: Union[str, None] = 'c5f48b9890f9'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
_NULL_UUID = "00000000-0000-0000-0000-000000000000"
_CURRENT_ACCOUNT = (
f"COALESCE(NULLIF(current_setting('app.current_account_id', TRUE), ''), "
f"'{_NULL_UUID}')::uuid"
)
# Standard tenant-only policy — account_id must match the current tenant.
# When no tenant context is set, COALESCE returns the nil UUID so zero rows
# are visible (fail-closed).
_STANDARD_USING = f"account_id = {_CURRENT_ACCOUNT}"
# Visibility-aware policy for step_library — public steps (visibility='public')
# must be visible to ALL tenants regardless of account_id. This covers the
# visibility='public' arm of build_step_visibility_filter() in app/core/filters.py.
# The created_by arm (private steps visible to their author) is covered
# transitively: private steps share account_id with their creator, so the
# account_id match handles it. This relies on account_id NOT NULL on step_library.
_STEP_LIBRARY_USING = f"account_id = {_CURRENT_ACCOUNT} OR visibility = 'public'"
# Standard tables: strict tenant isolation, no cross-tenant visibility.
_STANDARD_TABLES = [
"sessions",
"ai_sessions",
"session_branches",
"session_supporting_data",
"session_resolution_outputs",
"session_handoffs",
"script_templates",
"script_generations",
"maintenance_schedules",
"psa_post_log",
]
def upgrade() -> None:
# ── Standard tenant-isolation tables ────────────────────────────────────
for table in _STANDARD_TABLES:
op.execute(f"ALTER TABLE {table} ENABLE ROW LEVEL SECURITY")
op.execute(f"ALTER TABLE {table} FORCE ROW LEVEL SECURITY")
op.execute(f"""
CREATE POLICY tenant_isolation ON {table}
USING ({_STANDARD_USING})
""")
# ── step_library ────────────────────────────────────────────────────────
# Public steps (visibility='public') must be readable by all tenants so
# the Solutions Library browsing experience works without tenant context.
# Private/team steps remain tenant-scoped.
op.execute("ALTER TABLE step_library ENABLE ROW LEVEL SECURITY")
op.execute("ALTER TABLE step_library FORCE ROW LEVEL SECURITY")
op.execute(f"""
CREATE POLICY tenant_isolation ON step_library
USING ({_STEP_LIBRARY_USING})
""")
def downgrade() -> None:
for table in _STANDARD_TABLES + ["step_library"]:
op.execute(f"DROP POLICY IF EXISTS tenant_isolation ON {table}")
op.execute(f"ALTER TABLE {table} DISABLE ROW LEVEL SECURITY")
op.execute(f"ALTER TABLE {table} NO FORCE ROW LEVEL SECURITY")
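The fail-closed behaviour of the two policy expressions in the Phase 2 RLS migration can be modelled in plain Python to make the edge cases explicit. This is only a sketch of the SQL semantics, not code from the repository: `current_account`, `standard_visible`, and `step_library_visible` are hypothetical names, and the `setting` argument stands in for the value of `current_setting('app.current_account_id', TRUE)`, which is NULL when the setting was never defined and can be an empty string otherwise.

```python
import uuid
from typing import Optional

# Nil UUID used by the policies when no tenant context is set.
NULL_UUID = uuid.UUID("00000000-0000-0000-0000-000000000000")


def current_account(setting: Optional[str]) -> uuid.UUID:
    """Model COALESCE(NULLIF(setting, ''), nil_uuid)::uuid."""
    if setting is None or setting == "":
        return NULL_UUID
    return uuid.UUID(setting)


def standard_visible(row_account_id: uuid.UUID, setting: Optional[str]) -> bool:
    """Standard policy: the row's account must match the current tenant."""
    return row_account_id == current_account(setting)


def step_library_visible(
    row_account_id: uuid.UUID, visibility: str, setting: Optional[str]
) -> bool:
    """step_library policy: tenant match OR a public step."""
    return row_account_id == current_account(setting) or visibility == "public"
```

With no tenant context, the comparison is against the nil UUID, which no real account uses, so standard tables expose nothing while public steps remain readable.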

@@ -0,0 +1,46 @@
"""add account_id to step_ratings and step_usage_log
Revision ID: 7167e9374b0c
Revises: 478c159e5654
Create Date: 2026-04-09 00:00:00.000000
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
revision: str = '7167e9374b0c'
down_revision: Union[str, None] = '478c159e5654'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
for table in ('step_ratings', 'step_usage_log'):
op.add_column(table, sa.Column('account_id', sa.UUID(), nullable=True))
op.create_foreign_key(
f'fk_{table}_account_id', table, 'accounts',
['account_id'], ['id'], ondelete='CASCADE',
)
# Backfill: from the RATER/LOGGER user's account (not the step's account)
op.execute(f"""
UPDATE {table} t
SET account_id = u.account_id
FROM users u
WHERE t.user_id = u.id
AND t.account_id IS NULL
""")
result = op.get_bind().execute(
sa.text(f"SELECT COUNT(*) FROM {table} WHERE account_id IS NULL")
)
count = result.scalar()
if count > 0:
raise RuntimeError(f"ROLLBACK: {count} NULL account_id rows in {table}.")
op.alter_column(table, 'account_id', nullable=False)
op.create_index(f'ix_{table}_account_id', table, ['account_id'])
def downgrade() -> None:
for table in ('step_ratings', 'step_usage_log'):
op.drop_index(f'ix_{table}_account_id', table_name=table)
op.drop_constraint(f'fk_{table}_account_id', table, type_='foreignkey')
op.drop_column(table, 'account_id')

@@ -0,0 +1,70 @@
"""add origin discriminator + inline idempotency to script_builder_sessions
Adds:
- origin VARCHAR(20) NOT NULL DEFAULT 'standalone' with CHECK enum
- invariant: pilot_inline rows must have ai_session_id
- partial unique index: one pilot_inline session per (user, pilot session)
Revision ID: 71efd2102f49
Revises: 6492ec8d2d5b
Create Date: 2026-04-24 04:22:10.819809
"""
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision: str = '71efd2102f49'
down_revision = '6492ec8d2d5b'
branch_labels = None
depends_on = None
def upgrade() -> None:
op.add_column(
"script_builder_sessions",
sa.Column(
"origin",
sa.String(length=20),
nullable=False,
server_default=sa.text("'standalone'"),
),
)
op.create_check_constraint(
"ck_script_builder_sessions_origin",
"script_builder_sessions",
"origin IN ('standalone', 'pilot_inline')",
)
op.create_check_constraint(
"ck_script_builder_sessions_origin_ai_session",
"script_builder_sessions",
"origin <> 'pilot_inline' OR ai_session_id IS NOT NULL",
)
op.create_index(
"ux_script_builder_sessions_pilot_inline",
"script_builder_sessions",
["user_id", "ai_session_id"],
unique=True,
postgresql_where=sa.text("origin = 'pilot_inline'"),
)
# Drop the server_default — app code owns the default via model default.
op.alter_column("script_builder_sessions", "origin", server_default=None)
def downgrade() -> None:
op.drop_index(
"ux_script_builder_sessions_pilot_inline",
table_name="script_builder_sessions",
)
op.drop_constraint(
"ck_script_builder_sessions_origin_ai_session",
"script_builder_sessions",
type_="check",
)
op.drop_constraint(
"ck_script_builder_sessions_origin",
"script_builder_sessions",
type_="check",
)
op.drop_column("script_builder_sessions", "origin")
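The two CHECK constraints and the partial unique index on script_builder_sessions interact in a way that is easy to test in isolation. The sketch below recreates them in an in-memory SQLite database, which also supports CHECK constraints and partial indexes. SQLite is a stand-in for PostgreSQL, the table is reduced to the relevant columns with integer ids, and `try_insert` is a hypothetical helper.

```python
import sqlite3

SCHEMA = """
CREATE TABLE script_builder_sessions (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    ai_session_id INTEGER,
    origin TEXT NOT NULL DEFAULT 'standalone',
    CHECK (origin IN ('standalone', 'pilot_inline')),
    CHECK (origin <> 'pilot_inline' OR ai_session_id IS NOT NULL)
);
CREATE UNIQUE INDEX ux_script_builder_sessions_pilot_inline
    ON script_builder_sessions (user_id, ai_session_id)
    WHERE origin = 'pilot_inline';
"""


def try_insert(conn, user_id, ai_session_id, origin) -> bool:
    """Insert one row; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO script_builder_sessions "
                "(user_id, ai_session_id, origin) VALUES (?, ?, ?)",
                (user_id, ai_session_id, origin),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    results = [
        try_insert(conn, 1, 7, "standalone"),      # accepted
        try_insert(conn, 1, 7, "standalone"),      # accepted: not in the partial index
        try_insert(conn, 1, 7, "pilot_inline"),    # accepted: first inline session
        try_insert(conn, 1, 7, "pilot_inline"),    # rejected: duplicate inline session
        try_insert(conn, 1, None, "pilot_inline"), # rejected: inline needs ai_session_id
        try_insert(conn, 1, 8, "bogus"),           # rejected: origin not in the enum
    ]
    conn.close()
    return results
```

Because the index is restricted to `origin = 'pilot_inline'`, standalone sessions can repeat the same `(user_id, ai_session_id)` pair freely, while inline sessions get at-most-one semantics per user and pilot session.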

@@ -0,0 +1,103 @@
"""add account_id to script_builder_sessions, script_templates, script_generations
Revision ID: 78fc200abac1
Revises: 7f136778f5a8
Create Date: 2026-04-09 00:00:00.000000
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
revision: str = '78fc200abac1'
down_revision: Union[str, None] = '7f136778f5a8'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
PLATFORM_ACCOUNT_ID = '00000000-0000-0000-0000-000000000001'
def upgrade() -> None:
# Ensure the platform sentinel account exists before any fallback assignments.
# Migration 3a40fe11b427 also inserts this with ON CONFLICT DO NOTHING — safe.
op.execute(f"""
INSERT INTO accounts (id, name, display_code, created_at, updated_at)
VALUES (
'{PLATFORM_ACCOUNT_ID}',
'ResolutionFlow Platform',
'PLATFORM',
NOW(),
NOW()
)
ON CONFLICT (id) DO NOTHING
""")
for table in ('script_builder_sessions', 'script_templates', 'script_generations'):
op.add_column(table, sa.Column('account_id', sa.UUID(), nullable=True))
op.create_foreign_key(
f'fk_{table}_account_id', table, 'accounts',
['account_id'], ['id'], ondelete='CASCADE',
)
# script_builder_sessions: user_id → users.account_id
op.execute("""
UPDATE script_builder_sessions sbs
SET account_id = u.account_id
FROM users u
WHERE sbs.user_id = u.id
AND sbs.account_id IS NULL
""")
# script_templates: created_by → users.account_id (nullable created_by)
op.execute("""
UPDATE script_templates st
SET account_id = u.account_id
FROM users u
WHERE st.created_by = u.id
AND st.account_id IS NULL
""")
# Fallback: team_id → team admin user
op.execute("""
UPDATE script_templates st
SET account_id = u.account_id
FROM users u
WHERE u.team_id = st.team_id
AND u.is_team_admin = TRUE
AND u.account_id IS NOT NULL
AND st.account_id IS NULL
""")
# Final fallback: platform-seeded templates with NULL team_id AND NULL created_by
# (e.g. the 6 AD templates inserted by migration 057) → platform sentinel account
op.execute(f"""
UPDATE script_templates
SET account_id = '{PLATFORM_ACCOUNT_ID}'
WHERE account_id IS NULL
""")
# script_generations: user_id → users.account_id
op.execute("""
UPDATE script_generations sg
SET account_id = u.account_id
FROM users u
WHERE sg.user_id = u.id
AND sg.account_id IS NULL
""")
# VERIFY
for table in ('script_builder_sessions', 'script_templates', 'script_generations'):
result = op.get_bind().execute(
sa.text(f"SELECT COUNT(*) FROM {table} WHERE account_id IS NULL")
)
count = result.scalar()
if count > 0:
raise RuntimeError(f"ROLLBACK: {count} NULL account_id rows in {table}.")
for table in ('script_builder_sessions', 'script_templates', 'script_generations'):
op.alter_column(table, 'account_id', nullable=False)
op.create_index(f'ix_{table}_account_id', table, ['account_id'])
def downgrade() -> None:
for table in ('script_builder_sessions', 'script_templates', 'script_generations'):
op.drop_index(f'ix_{table}_account_id', table_name=table)
op.drop_constraint(f'fk_{table}_account_id', table, type_='foreignkey')
op.drop_column(table, 'account_id')

@@ -0,0 +1,62 @@
"""add account_id to maintenance_schedules
Revision ID: 7f136778f5a8
Revises: 8aac5b372402
Create Date: 2026-04-09 00:00:00.000000
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
revision: str = '7f136778f5a8'
down_revision: Union[str, None] = '8aac5b372402'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
op.add_column('maintenance_schedules',
sa.Column('account_id', sa.UUID(), nullable=True))
op.create_foreign_key(
'fk_maintenance_schedules_account_id', 'maintenance_schedules', 'accounts',
['account_id'], ['id'], ondelete='CASCADE',
)
# Primary: tree_id → trees.account_id (only where tree.account_id is NOT NULL)
op.execute("""
UPDATE maintenance_schedules ms
SET account_id = t.account_id
FROM trees t
WHERE ms.tree_id = t.id
AND t.account_id IS NOT NULL
AND ms.account_id IS NULL
""")
# Fallback: created_by → users.account_id (for is_default trees with NULL account_id)
op.execute("""
UPDATE maintenance_schedules ms
SET account_id = u.account_id
FROM users u
WHERE ms.created_by = u.id
AND u.account_id IS NOT NULL
AND ms.account_id IS NULL
""")
result = op.get_bind().execute(
sa.text("SELECT COUNT(*) FROM maintenance_schedules WHERE account_id IS NULL")
)
count = result.scalar()
if count > 0:
raise RuntimeError(
f"ROLLBACK: {count} maintenance_schedules rows have NULL account_id. "
"Check if created_by is NULL — those rows need manual resolution."
)
op.alter_column('maintenance_schedules', 'account_id', nullable=False)
op.create_index('ix_maintenance_schedules_account_id', 'maintenance_schedules', ['account_id'])
def downgrade() -> None:
op.drop_index('ix_maintenance_schedules_account_id', table_name='maintenance_schedules')
op.drop_constraint('fk_maintenance_schedules_account_id', 'maintenance_schedules', type_='foreignkey')
op.drop_column('maintenance_schedules', 'account_id')

@@ -0,0 +1,81 @@
"""add account_id to PSA and notification tables
Revision ID: 8aac5b372402
Revises: a1d2a84b9abb
Create Date: 2026-04-09 00:00:00.000000
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
revision: str = '8aac5b372402'
down_revision: Union[str, None] = 'a1d2a84b9abb'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
# Step 1: ADD COLUMN
for table in ('psa_post_log', 'psa_member_mappings', 'notification_logs'):
op.add_column(table, sa.Column('account_id', sa.UUID(), nullable=True))
op.create_foreign_key(
f'fk_{table}_account_id', table, 'accounts',
['account_id'], ['id'], ondelete='CASCADE',
)
# Step 2: BACKFILL
# psa_post_log: prefer psa_connection → fallback to posted_by user
# Note: cannot reference the updated table (ppl) inside the FROM clause JOIN,
# so use a correlated subquery for psa_connections lookup instead.
op.execute("""
UPDATE psa_post_log ppl
SET account_id = COALESCE(
(SELECT account_id FROM psa_connections WHERE id = ppl.psa_connection_id),
u.account_id
)
FROM users u
WHERE ppl.posted_by = u.id
AND ppl.account_id IS NULL
""")
# psa_member_mappings: via psa_connection
op.execute("""
UPDATE psa_member_mappings pmm
SET account_id = pc.account_id
FROM psa_connections pc
WHERE pmm.psa_connection_id = pc.id
AND pmm.account_id IS NULL
""")
# notification_logs: via notification_config
op.execute("""
UPDATE notification_logs nl
SET account_id = nc.account_id
FROM notification_configs nc
WHERE nl.notification_config_id = nc.id
AND nl.account_id IS NULL
""")
# Step 3: VERIFY
for table in ('psa_post_log', 'psa_member_mappings', 'notification_logs'):
result = op.get_bind().execute(
sa.text(f"SELECT COUNT(*) FROM {table} WHERE account_id IS NULL")
)
count = result.scalar()
if count > 0:
raise RuntimeError(f"ROLLBACK: {count} NULL account_id rows in {table}.")
# Step 4: SET NOT NULL
for table in ('psa_post_log', 'psa_member_mappings', 'notification_logs'):
op.alter_column(table, 'account_id', nullable=False)
# Step 5: CREATE INDEX
for table in ('psa_post_log', 'psa_member_mappings', 'notification_logs'):
op.create_index(f'ix_{table}_account_id', table, ['account_id'])
def downgrade() -> None:
for table in ('psa_post_log', 'psa_member_mappings', 'notification_logs'):
op.drop_index(f'ix_{table}_account_id', table_name=table)
op.drop_constraint(f'fk_{table}_account_id', table, type_='foreignkey')
op.drop_column(table, 'account_id')
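The add-nullable, backfill, verify sequence used by these migrations can be exercised outside Alembic. The following is a minimal sketch, assuming an in-memory SQLite database stands in for PostgreSQL. The helper names `make_db` and `backfill_and_verify` and the sample rows are hypothetical, and the final SET NOT NULL step is omitted because SQLite cannot alter a column's nullability in place.

```python
import sqlite3


def make_db(orphan: bool) -> sqlite3.Connection:
    """Build a tiny stand-in schema; `orphan` adds a mapping with no parent connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE psa_connections (id INTEGER PRIMARY KEY, account_id TEXT)")
    conn.execute(
        "CREATE TABLE psa_member_mappings (id INTEGER PRIMARY KEY, psa_connection_id INTEGER)"
    )
    conn.execute("INSERT INTO psa_connections VALUES (1, 'acct-a')")
    conn.execute("INSERT INTO psa_member_mappings VALUES (1, 1)")
    if orphan:
        conn.execute("INSERT INTO psa_member_mappings VALUES (2, 99)")
    return conn


def backfill_and_verify(conn: sqlite3.Connection) -> int:
    """Backfill account_id from the parent connection, then fail if any NULLs remain."""
    # Step 1: add the column as nullable so existing rows stay valid.
    conn.execute("ALTER TABLE psa_member_mappings ADD COLUMN account_id TEXT")
    # Step 2: backfill through the parent row (correlated subquery).
    conn.execute(
        """
        UPDATE psa_member_mappings
        SET account_id = (
            SELECT pc.account_id FROM psa_connections pc
            WHERE pc.id = psa_member_mappings.psa_connection_id
        )
        WHERE account_id IS NULL
        """
    )
    # Step 3: verify. Raising here stops the run before NOT NULL would be applied.
    remaining = conn.execute(
        "SELECT COUNT(*) FROM psa_member_mappings WHERE account_id IS NULL"
    ).fetchone()[0]
    if remaining > 0:
        raise RuntimeError(f"ROLLBACK: {remaining} NULL account_id rows remain.")
    return remaining
```

Calling `backfill_and_verify(make_db(orphan=True))` raises, which is the behavior the VERIFY step relies on: a row whose parent lookup fails keeps a NULL `account_id` and blocks the constraint change rather than being silently assigned to the wrong tenant.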

@@ -0,0 +1,57 @@
"""Add account_id to tree_shares and backfill via tree owner's account.
The share belongs to the tree's tenant, not the actor who created it.
A super admin in account A can share a tree owned by account B; that share
must land in account B so account B's RLS filter sees it.
Revision ID: a05e1a1bea7c
Revises: 2a9056eddd90
Create Date: 2026-04-11 00:00:00.000000
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
revision: str = 'a05e1a1bea7c'
down_revision: Union[str, None] = '2a9056eddd90'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
op.add_column('tree_shares', sa.Column('account_id', sa.UUID(), nullable=True))
op.create_foreign_key(
'fk_tree_shares_account_id', 'tree_shares', 'accounts',
['account_id'], ['id'], ondelete='CASCADE',
)
# Backfill: derive from the tree's account, not the creator's account.
# A share lives in the same tenant as its tree so that the tree owner's
# RLS context covers their own shares regardless of who created them.
op.execute("""
UPDATE tree_shares ts
SET account_id = t.account_id
FROM trees t
WHERE ts.tree_id = t.id
AND t.account_id IS NOT NULL
AND ts.account_id IS NULL
""")
result = op.get_bind().execute(
sa.text("SELECT COUNT(*) FROM tree_shares WHERE account_id IS NULL")
)
count = result.scalar()
if count > 0:
raise RuntimeError(
f"ROLLBACK: {count} tree_shares rows have NULL account_id after backfill. "
"Every share must reference a tree whose account_id is set."
)
op.alter_column('tree_shares', 'account_id', nullable=False)
op.create_index('ix_tree_shares_account_id', 'tree_shares', ['account_id'])
def downgrade() -> None:
op.drop_index('ix_tree_shares_account_id', table_name='tree_shares')
op.drop_constraint('fk_tree_shares_account_id', 'tree_shares', type_='foreignkey')
op.drop_column('tree_shares', 'account_id')

@@ -0,0 +1,45 @@
"""add account_id to user personalization tables
Revision ID: a1d2a84b9abb
Revises: 7167e9374b0c
Create Date: 2026-04-09 00:00:00.000000
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
revision: str = 'a1d2a84b9abb'
down_revision: Union[str, None] = '7167e9374b0c'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
    for table in ('user_folders', 'user_pinned_trees'):
        op.add_column(table, sa.Column('account_id', sa.UUID(), nullable=True))
        op.create_foreign_key(
            f'fk_{table}_account_id', table, 'accounts',
            ['account_id'], ['id'], ondelete='CASCADE',
        )
        op.execute(f"""
            UPDATE {table} t
            SET account_id = u.account_id
            FROM users u
            WHERE t.user_id = u.id
              AND t.account_id IS NULL
        """)
        result = op.get_bind().execute(
            sa.text(f"SELECT COUNT(*) FROM {table} WHERE account_id IS NULL")
        )
        count = result.scalar()
        if count > 0:
            raise RuntimeError(f"ROLLBACK: {count} NULL account_id rows in {table}.")
        op.alter_column(table, 'account_id', nullable=False)
        op.create_index(f'ix_{table}_account_id', table, ['account_id'])


def downgrade() -> None:
    for table in ('user_folders', 'user_pinned_trees'):
        op.drop_index(f'ix_{table}_account_id', table_name=table)
        op.drop_constraint(f'fk_{table}_account_id', table, type_='foreignkey')
        op.drop_column(table, 'account_id')

@@ -0,0 +1,24 @@
"""merge Phase 1 tenant isolation chain with main head
Revision ID: a9f3b2c1d4e5
Revises: 070, 174f442795b7
Create Date: 2026-04-09 00:00:00.000000
Merge migration: consolidates the Phase 1 account_id chain (cc214c63aa30 → … → 174f442795b7)
with the main sequential chain (… → 070) into a single head so that
`alembic upgrade head` works without ambiguity.
"""
from typing import Sequence, Union
revision: str = 'a9f3b2c1d4e5'
down_revision: Union[str, tuple] = ('070', '174f442795b7')
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
def upgrade() -> None:
pass
def downgrade() -> None:
pass
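The reason an empty merge revision is needed can be sketched with a small head-finding function. This is a rough illustration, not Alembic's own algorithm: `find_heads` is a hypothetical helper, and the graph is truncated to the three revisions named in the docstring above, with the two branch tips treated as roots.

```python
from typing import Dict, Set, Tuple


def find_heads(graph: Dict[str, Tuple[str, ...]]) -> Set[str]:
    """Return every revision that no other revision lists as a parent."""
    parents = {parent for downs in graph.values() for parent in downs}
    return set(graph) - parents


# Truncated, illustrative graph: revision -> tuple of down_revisions.
before_merge: Dict[str, Tuple[str, ...]] = {
    "070": (),
    "174f442795b7": (),
}

# The merge revision lists both branch tips as parents and changes no schema.
after_merge = dict(before_merge)
after_merge["a9f3b2c1d4e5"] = ("070", "174f442795b7")
```

With two heads, a bare `alembic upgrade head` is ambiguous. Once the merge revision is present, only `a9f3b2c1d4e5` remains as a head.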

@@ -0,0 +1,85 @@
"""Enable RLS on Phase 4 tables — all remaining tenant-scoped tables.
All tables in this migration already have account_id NOT NULL (enforced by
earlier migrations). This migration adds ENABLE ROW LEVEL SECURITY,
FORCE ROW LEVEL SECURITY, and the appropriate tenant isolation policy to each.
Policy variants used:
- Standard: account_id = current_setting('app.current_account_id')::uuid
- Platform: standard OR account_id = PLATFORM_ACCOUNT_ID
(for global content tables readable by all tenants)
Skipped intentionally:
- accounts — IS the root table; no account_id column
- plan_feature_defaults — platform config; no account_id column
- script_categories — global lookup table; no account_id column
- platform_steps — global content; no account_id column (readable by all)
- template_trees — global content; no account_id column (readable by all)
Revision ID: b3c7e9f2a1d8
Revises: 172ad76d7d20
Create Date: 2026-04-12
"""
from typing import Union
from alembic import op
revision: str = "b3c7e9f2a1d8"
down_revision: Union[str, None] = "172ad76d7d20"
branch_labels = None
depends_on = None
# Standard policy — tenant sees only own rows.
_STANDARD_TABLES = [
"users",
"account_invites",
"account_limit_overrides",
"account_feature_overrides",
"subscriptions",
"ai_chat_sessions",
"ai_conversations",
"ai_session_steps",
"ai_session_embeddings",
"ai_suggestions",
"ai_usage",
"assistant_chats",
"attachments",
"copilot_conversations",
"feedback",
"file_uploads",
"fork_points",
"kb_imports",
"notifications",
"notification_configs",
"notification_logs",
"psa_activity_logs",
"psa_member_mappings",
"script_builder_sessions",
"session_ratings",
"tree_embeddings",
"user_folders",
"user_pinned_trees",
]
_POLICY_EXPR = (
"account_id = COALESCE("
"NULLIF(current_setting('app.current_account_id', TRUE), ''), "
"'00000000-0000-0000-0000-000000000000'"
")::uuid"
)
def upgrade() -> None:
for table in _STANDARD_TABLES:
op.execute(f"ALTER TABLE {table} ENABLE ROW LEVEL SECURITY")
op.execute(f"ALTER TABLE {table} FORCE ROW LEVEL SECURITY")
op.execute(f"""
CREATE POLICY tenant_isolation ON {table}
USING ({_POLICY_EXPR})
""")
def downgrade() -> None:
for table in _STANDARD_TABLES:
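op.execute(f"DROP POLICY IF EXISTS tenant_isolation ON {table}")
# FORCE is a separate table flag from ENABLE; clear it as well so the
# downgrade fully reverses upgrade(), matching the Phase 1 downgrade.
op.execute(f"ALTER TABLE {table} NO FORCE ROW LEVEL SECURITY")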
op.execute(f"ALTER TABLE {table} DISABLE ROW LEVEL SECURITY")
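The fallback in `_POLICY_EXPR` can be modeled in plain Python to show why a request with no tenant context sees no rows. This is a sketch under stated assumptions: `effective_account_id` and `row_visible` are hypothetical names, and `None` stands for the SQL NULL that `current_setting(name, TRUE)` returns when the setting was never defined.

```python
import uuid
from typing import Optional

# The all-zero UUID used as the fallback in the policy expression.
NULL_UUID = uuid.UUID("00000000-0000-0000-0000-000000000000")


def effective_account_id(setting: Optional[str]) -> uuid.UUID:
    """Model COALESCE(NULLIF(current_setting(..., TRUE), ''), null_uuid)::uuid."""
    # NULLIF maps an empty string to NULL; COALESCE then substitutes the fallback.
    if setting is None or setting == "":
        return NULL_UUID
    return uuid.UUID(setting)


def row_visible(row_account_id: uuid.UUID, setting: Optional[str]) -> bool:
    """Standard policy: a row passes only when its account matches the context."""
    return row_account_id == effective_account_id(setting)
```

Because no real account is expected to use the all-zero UUID, an unset or empty setting compares unequal to every row, so the policy fails closed instead of raising a cast error on `''::uuid`.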

@@ -0,0 +1,108 @@
"""enable_rls_phase1
Revision ID: c5f48b9890f9
Revises: 0b470d9e6cf1
Create Date: 2026-04-10 04:01:13.043321
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
# revision identifiers, used by Alembic.
revision: str = 'c5f48b9890f9'
down_revision: Union[str, None] = '0b470d9e6cf1'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = None
_NULL_UUID = "00000000-0000-0000-0000-000000000000"
_PLATFORM_UUID = "00000000-0000-0000-0000-000000000001"
_CURRENT_ACCOUNT = (
f"COALESCE(NULLIF(current_setting('app.current_account_id', TRUE), ''), "
f"'{_NULL_UUID}')::uuid"
)
def upgrade() -> None:
# ── trees ───────────────────────────────────────────────────────────────
# Extended policy mirrors can_access_tree() in app/core/permissions.py.
# Tenant sees: own rows, platform rows, any default tree, any public tree,
# any gallery-featured tree.
# is_gallery_featured = TRUE is included because /public/templates is a
# no-auth endpoint — no tenant context is set, so gallery trees must pass
# RLS on their own flag rather than relying on account_id or is_public.
# Private/team trees from other accounts are hidden.
op.execute("ALTER TABLE trees ENABLE ROW LEVEL SECURITY")
op.execute("ALTER TABLE trees FORCE ROW LEVEL SECURITY")
op.execute(f"""
CREATE POLICY tenant_isolation ON trees
USING (
account_id = {_CURRENT_ACCOUNT}
OR account_id = '{_PLATFORM_UUID}'::uuid
OR is_default = TRUE
OR is_public = TRUE
OR is_gallery_featured = TRUE
)
""")
# ── tree_tags ────────────────────────────────────────────────────────────
# Own account + platform tags (global tags visible to all tenants).
op.execute("ALTER TABLE tree_tags ENABLE ROW LEVEL SECURITY")
op.execute("ALTER TABLE tree_tags FORCE ROW LEVEL SECURITY")
op.execute(f"""
CREATE POLICY tenant_isolation ON tree_tags
USING (
account_id = {_CURRENT_ACCOUNT}
OR account_id = '{_PLATFORM_UUID}'::uuid
)
""")
# ── tree_categories ──────────────────────────────────────────────────────
op.execute("ALTER TABLE tree_categories ENABLE ROW LEVEL SECURITY")
op.execute("ALTER TABLE tree_categories FORCE ROW LEVEL SECURITY")
op.execute(f"""
CREATE POLICY tenant_isolation ON tree_categories
USING (
account_id = {_CURRENT_ACCOUNT}
OR account_id = '{_PLATFORM_UUID}'::uuid
)
""")
# ── step_categories ──────────────────────────────────────────────────────
op.execute("ALTER TABLE step_categories ENABLE ROW LEVEL SECURITY")
op.execute("ALTER TABLE step_categories FORCE ROW LEVEL SECURITY")
op.execute(f"""
CREATE POLICY tenant_isolation ON step_categories
USING (
account_id = {_CURRENT_ACCOUNT}
OR account_id = '{_PLATFORM_UUID}'::uuid
)
""")
# ── psa_connections ──────────────────────────────────────────────────────
# Tenant-only — PSA credentials must never cross tenant boundaries.
op.execute("ALTER TABLE psa_connections ENABLE ROW LEVEL SECURITY")
op.execute("ALTER TABLE psa_connections FORCE ROW LEVEL SECURITY")
op.execute(f"""
CREATE POLICY tenant_isolation ON psa_connections
USING (account_id = {_CURRENT_ACCOUNT})
""")
# ── flow_proposals ────────────────────────────────────────────────────────
# Tenant-only.
op.execute("ALTER TABLE flow_proposals ENABLE ROW LEVEL SECURITY")
op.execute("ALTER TABLE flow_proposals FORCE ROW LEVEL SECURITY")
op.execute(f"""
CREATE POLICY tenant_isolation ON flow_proposals
USING (account_id = {_CURRENT_ACCOUNT})
""")
def downgrade() -> None:
for table in ["trees", "tree_tags", "tree_categories", "step_categories",
"psa_connections", "flow_proposals"]:
op.execute(f"DROP POLICY IF EXISTS tenant_isolation ON {table}")
op.execute(f"ALTER TABLE {table} DISABLE ROW LEVEL SECURITY")
op.execute(f"ALTER TABLE {table} NO FORCE ROW LEVEL SECURITY")
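The extended `trees` policy is easier to reason about as a predicate. The sketch below mirrors the five OR branches of the policy. `TreeRow` and `tree_visible` are hypothetical names introduced only for illustration, and the default `current_account` models a request with no tenant context set.

```python
from dataclasses import dataclass

NULL_ACCOUNT = "00000000-0000-0000-0000-000000000000"
PLATFORM_ACCOUNT = "00000000-0000-0000-0000-000000000001"


@dataclass
class TreeRow:
    account_id: str
    is_default: bool = False
    is_public: bool = False
    is_gallery_featured: bool = False


def tree_visible(row: TreeRow, current_account: str = NULL_ACCOUNT) -> bool:
    """Mirror the USING clause: own rows, platform rows, or any flagged tree."""
    return (
        row.account_id == current_account
        or row.account_id == PLATFORM_ACCOUNT
        or row.is_default
        or row.is_public
        or row.is_gallery_featured
    )
```

A gallery-featured tree passes even when `current_account` is left at its default, which is the case the comment about `/public/templates` describes. An unflagged tree from another account does not.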

@@ -0,0 +1,95 @@
"""add account_id to core session tables
Revision ID: cc214c63aa30
Revises: b8d2f4a6c091
Create Date: 2026-04-09 00:00:00.000000
"""
from typing import Sequence, Union
from alembic import op
import sqlalchemy as sa
revision: str = 'cc214c63aa30'
down_revision: Union[str, None] = '064'
branch_labels: Union[str, Sequence[str], None] = None
depends_on: Union[str, Sequence[str], None] = ('067',)
def upgrade() -> None:
# ── Step 1: ADD COLUMN (nullable) ────────────────────────────────────────
for table in ('sessions', 'attachments', 'session_supporting_data',
'session_resolution_outputs'):
op.add_column(table, sa.Column('account_id', sa.UUID(), nullable=True))
op.create_foreign_key(
f'fk_{table}_account_id',
table, 'accounts',
['account_id'], ['id'],
ondelete='CASCADE',
)
# ── Step 2: BACKFILL ─────────────────────────────────────────────────────
# sessions: direct join to users
op.execute("""
UPDATE sessions s
SET account_id = u.account_id
FROM users u
WHERE s.user_id = u.id
AND s.account_id IS NULL
""")
# attachments: chain through sessions (now backfilled above)
op.execute("""
UPDATE attachments a
SET account_id = s.account_id
FROM sessions s
WHERE a.session_id = s.id
AND a.account_id IS NULL
""")
# session_supporting_data: same chain
op.execute("""
UPDATE session_supporting_data sd
SET account_id = s.account_id
FROM sessions s
WHERE sd.session_id = s.id
AND sd.account_id IS NULL
""")
# session_resolution_outputs: FK is to ai_sessions, not sessions
op.execute("""
UPDATE session_resolution_outputs sro
SET account_id = ai.account_id
FROM ai_sessions ai
WHERE sro.session_id = ai.id
AND sro.account_id IS NULL
""")
# ── Step 3: VERIFY zero NULLs — raises if any remain ────────────────────
for table in ('sessions', 'attachments', 'session_supporting_data',
'session_resolution_outputs'):
result = op.get_bind().execute(
sa.text(f"SELECT COUNT(*) FROM {table} WHERE account_id IS NULL")
)
count = result.scalar()
if count > 0:
raise RuntimeError(
f"ROLLBACK: {count} NULL account_id rows remain in {table}. "
f"Fix the backfill before re-running."
)
# ── Step 4: SET NOT NULL ─────────────────────────────────────────────────
for table in ('sessions', 'attachments', 'session_supporting_data',
'session_resolution_outputs'):
op.alter_column(table, 'account_id', nullable=False)
# ── Step 5: CREATE INDEX ─────────────────────────────────────────────────
for table in ('sessions', 'attachments', 'session_supporting_data',
'session_resolution_outputs'):
op.create_index(f'ix_{table}_account_id', table, ['account_id'])
def downgrade() -> None:
for table in ('sessions', 'attachments', 'session_supporting_data',
'session_resolution_outputs'):
op.drop_index(f'ix_{table}_account_id', table_name=table)
op.drop_constraint(f'fk_{table}_account_id', table, type_='foreignkey')
op.drop_column(table, 'account_id')

@@ -0,0 +1,404 @@
"""FlowPilot migration Phase 1 — schema for the unified session surface.
Revision ID: f07010f17b01
Revises: 074
Create Date: 2026-04-17
Creates the backing store for the FlowPilot unified session surface:
- `session_facts` — "What we know" facts, keyed to a session, with a polymorphic
`source_ref` pointing at a task-lane item inside `ai_sessions.pending_task_lane`
(no DB-level FK; integrity enforced at the service layer per the design doc).
- `session_suggested_fixes` — AI-proposed resolution paths. Only one active
(`superseded_at IS NULL`) per session at a time.
- `draft_templates` — scripts pending post-resolve templatization
(Option 2 in the three-option dialog).
- `account_settings` — new per-account key/value settings table with a JSONB
`preferences` grab-bag. Rows are created lazily on first write.
- Column additions to `ai_sessions` — resolution/escalation markdown + external IDs,
plus `state_version` (incremented by any write that invalidates the resolution
note preview cache).
- Column additions to `script_templates` — provenance fields for templates
promoted from draft_templates.
All four new tenant-scoped tables have RLS enabled + forced with a
`tenant_isolation` policy matching the repo pattern (USING + WITH CHECK on
`account_id = app.current_account_id`). Downgrade is reversible: drops in the
inverse order of creation.
Chained from `074` (add_network_diagrams_table) per the single-head state of
production; the other local heads on feat/flowpilot-migration are branch
artifacts not present in production.
"""
from alembic import op
import sqlalchemy as sa
from sqlalchemy.dialects.postgresql import UUID, JSONB
revision = "f07010f17b01"
down_revision = "074"
branch_labels = None
depends_on = None
_CURRENT_ACCOUNT = (
"COALESCE("
"NULLIF(current_setting('app.current_account_id', TRUE), ''), "
"'00000000-0000-0000-0000-000000000000'"
")::uuid"
)
def upgrade() -> None:
# ── ai_sessions: resolution / escalation columns + state_version ───────
op.add_column(
"ai_sessions",
sa.Column("resolution_note_markdown", sa.Text(), nullable=True),
)
op.add_column(
"ai_sessions",
sa.Column("resolution_note_posted_at", sa.DateTime(timezone=True), nullable=True),
)
op.add_column(
"ai_sessions",
sa.Column("resolution_note_external_id", sa.String(128), nullable=True),
)
op.add_column(
"ai_sessions",
sa.Column("escalation_package_markdown", sa.Text(), nullable=True),
)
op.add_column(
"ai_sessions",
sa.Column("escalation_package_posted_at", sa.DateTime(timezone=True), nullable=True),
)
op.add_column(
"ai_sessions",
sa.Column("escalation_package_external_id", sa.String(128), nullable=True),
)
op.add_column(
"ai_sessions",
sa.Column(
"state_version",
sa.Integer(),
nullable=False,
server_default=sa.text("0"),
),
)
# ── script_templates: provenance for post-resolve promotion ────────────
op.add_column(
"script_templates",
sa.Column(
"source_session_id",
UUID(as_uuid=True),
sa.ForeignKey("ai_sessions.id"),
nullable=True,
),
)
op.add_column(
"script_templates",
sa.Column(
"source_user_id",
UUID(as_uuid=True),
sa.ForeignKey("users.id"),
nullable=True,
),
)
op.add_column(
"script_templates",
sa.Column("source_ticket_ref", sa.String(64), nullable=True),
)
# ── session_facts ──────────────────────────────────────────────────────
op.create_table(
"session_facts",
sa.Column(
"id",
UUID(as_uuid=True),
primary_key=True,
server_default=sa.text("gen_random_uuid()"),
),
sa.Column(
"session_id",
UUID(as_uuid=True),
sa.ForeignKey("ai_sessions.id", ondelete="CASCADE"),
nullable=False,
),
sa.Column(
"account_id",
UUID(as_uuid=True),
sa.ForeignKey("accounts.id"),
nullable=False,
),
sa.Column("text", sa.Text(), nullable=False),
sa.Column("source_type", sa.String(32), nullable=False),
# `source_ref` is a polymorphic pointer to a task-lane item inside
# ai_sessions.pending_task_lane JSON, NOT a FK to any table.
# Integrity enforced at the service layer per Section 4.2 of the
# migration design doc.
sa.Column("source_ref", UUID(as_uuid=True), nullable=True),
sa.Column("source_summary", sa.Text(), nullable=True),
sa.Column(
"created_by",
UUID(as_uuid=True),
sa.ForeignKey("users.id"),
nullable=False,
),
sa.Column(
"created_at",
sa.DateTime(timezone=True),
nullable=False,
server_default=sa.text("now()"),
),
sa.Column(
"updated_at",
sa.DateTime(timezone=True),
nullable=False,
server_default=sa.text("now()"),
),
sa.Column("deleted_at", sa.DateTime(timezone=True), nullable=True),
sa.CheckConstraint(
"source_type IN ('question', 'diagnostic_check', 'user_note', 'ai_synthesis')",
name="ck_session_facts_source_type",
),
)
# Active-facts-per-session; partial index excludes soft-deleted rows.
op.create_index(
"idx_session_facts_session",
"session_facts",
["session_id"],
postgresql_where=sa.text("deleted_at IS NULL"),
)
op.create_index(
"idx_session_facts_account",
"session_facts",
["account_id"],
)
op.execute("ALTER TABLE session_facts ENABLE ROW LEVEL SECURITY")
op.execute("ALTER TABLE session_facts FORCE ROW LEVEL SECURITY")
op.execute(f"""
CREATE POLICY tenant_isolation ON session_facts
USING (account_id = {_CURRENT_ACCOUNT})
WITH CHECK (account_id = {_CURRENT_ACCOUNT})
""")
# ── session_suggested_fixes ────────────────────────────────────────────
op.create_table(
"session_suggested_fixes",
sa.Column(
"id",
UUID(as_uuid=True),
primary_key=True,
server_default=sa.text("gen_random_uuid()"),
),
sa.Column(
"session_id",
UUID(as_uuid=True),
sa.ForeignKey("ai_sessions.id", ondelete="CASCADE"),
nullable=False,
),
sa.Column(
"account_id",
UUID(as_uuid=True),
sa.ForeignKey("accounts.id"),
nullable=False,
),
sa.Column("title", sa.String(200), nullable=False),
sa.Column("description", sa.Text(), nullable=False),
sa.Column("confidence_pct", sa.Integer(), nullable=False),
sa.Column(
"script_template_id",
UUID(as_uuid=True),
sa.ForeignKey("script_templates.id"),
nullable=True,
),
sa.Column("ai_drafted_script", sa.Text(), nullable=True),
sa.Column("ai_drafted_parameters", JSONB(), nullable=True),
sa.Column("user_decision", sa.String(32), nullable=True),
sa.Column("superseded_at", sa.DateTime(timezone=True), nullable=True),
sa.Column(
"created_at",
sa.DateTime(timezone=True),
nullable=False,
server_default=sa.text("now()"),
),
sa.CheckConstraint(
"confidence_pct BETWEEN 0 AND 100",
name="ck_session_suggested_fixes_confidence_pct",
),
sa.CheckConstraint(
"user_decision IS NULL OR user_decision IN ("
"'one_off', 'draft_template', 'build_template', 'dismissed')",
name="ck_session_suggested_fixes_user_decision",
),
)
# Only-one-active-per-session is enforced by service-layer supersession;
# this partial index serves the "find active fix" query.
op.create_index(
"idx_session_suggested_fixes_session_active",
"session_suggested_fixes",
["session_id"],
postgresql_where=sa.text("superseded_at IS NULL"),
)
op.execute("ALTER TABLE session_suggested_fixes ENABLE ROW LEVEL SECURITY")
op.execute("ALTER TABLE session_suggested_fixes FORCE ROW LEVEL SECURITY")
op.execute(f"""
CREATE POLICY tenant_isolation ON session_suggested_fixes
USING (account_id = {_CURRENT_ACCOUNT})
WITH CHECK (account_id = {_CURRENT_ACCOUNT})
""")
# ── draft_templates ────────────────────────────────────────────────────
op.create_table(
"draft_templates",
sa.Column(
"id",
UUID(as_uuid=True),
primary_key=True,
server_default=sa.text("gen_random_uuid()"),
),
sa.Column(
"account_id",
UUID(as_uuid=True),
sa.ForeignKey("accounts.id"),
nullable=False,
),
sa.Column(
"source_session_id",
UUID(as_uuid=True),
sa.ForeignKey("ai_sessions.id"),
nullable=False,
),
sa.Column(
"source_user_id",
UUID(as_uuid=True),
sa.ForeignKey("users.id"),
nullable=False,
),
sa.Column("script_body", sa.Text(), nullable=False),
sa.Column("proposed_parameters", JSONB(), nullable=False),
sa.Column("proposed_name", sa.String(200), nullable=True),
sa.Column(
"proposed_category_id",
UUID(as_uuid=True),
sa.ForeignKey("script_categories.id"),
nullable=True,
),
sa.Column(
"status",
sa.String(32),
nullable=False,
server_default=sa.text("'pending'"),
),
sa.Column("resolved_at", sa.DateTime(timezone=True), nullable=True),
sa.Column(
"promoted_template_id",
UUID(as_uuid=True),
sa.ForeignKey("script_templates.id"),
nullable=True,
),
sa.Column(
"created_at",
sa.DateTime(timezone=True),
nullable=False,
server_default=sa.text("now()"),
),
sa.CheckConstraint(
"status IN ('pending', 'accepted', 'rejected')",
name="ck_draft_templates_status",
),
)
# Supports the Script Library "N scripts ready to review" badge.
op.create_index(
"idx_draft_templates_account_pending",
"draft_templates",
["account_id"],
postgresql_where=sa.text("status = 'pending'"),
)
op.execute("ALTER TABLE draft_templates ENABLE ROW LEVEL SECURITY")
op.execute("ALTER TABLE draft_templates FORCE ROW LEVEL SECURITY")
op.execute(f"""
CREATE POLICY tenant_isolation ON draft_templates
USING (account_id = {_CURRENT_ACCOUNT})
WITH CHECK (account_id = {_CURRENT_ACCOUNT})
""")
# ── account_settings ───────────────────────────────────────────────────
# One row per account, created lazily on first write. The `preferences`
# JSONB is a grab-bag for simple settings (e.g. templatize_prompt_enabled).
# Settings graduate to typed columns via future migrations when they meet
# the promotion criteria in Section 4.6 of the design doc (hot path /
# validation / joins).
op.create_table(
"account_settings",
sa.Column(
"account_id",
UUID(as_uuid=True),
sa.ForeignKey("accounts.id", ondelete="CASCADE"),
primary_key=True,
),
sa.Column(
"preferences",
JSONB(),
nullable=False,
server_default=sa.text("'{}'::jsonb"),
),
sa.Column(
"created_at",
sa.DateTime(timezone=True),
nullable=False,
server_default=sa.text("now()"),
),
sa.Column(
"updated_at",
sa.DateTime(timezone=True),
nullable=False,
server_default=sa.text("now()"),
),
)
op.execute("ALTER TABLE account_settings ENABLE ROW LEVEL SECURITY")
op.execute("ALTER TABLE account_settings FORCE ROW LEVEL SECURITY")
op.execute(f"""
CREATE POLICY tenant_isolation ON account_settings
USING (account_id = {_CURRENT_ACCOUNT})
WITH CHECK (account_id = {_CURRENT_ACCOUNT})
""")
def downgrade() -> None:
# Drop in reverse order so FK dependencies unwind cleanly.
op.execute("DROP POLICY IF EXISTS tenant_isolation ON account_settings")
op.execute("ALTER TABLE account_settings DISABLE ROW LEVEL SECURITY")
op.drop_table("account_settings")
op.execute("DROP POLICY IF EXISTS tenant_isolation ON draft_templates")
op.execute("ALTER TABLE draft_templates DISABLE ROW LEVEL SECURITY")
op.drop_index("idx_draft_templates_account_pending", table_name="draft_templates")
op.drop_table("draft_templates")
op.execute("DROP POLICY IF EXISTS tenant_isolation ON session_suggested_fixes")
op.execute("ALTER TABLE session_suggested_fixes DISABLE ROW LEVEL SECURITY")
op.drop_index(
"idx_session_suggested_fixes_session_active",
table_name="session_suggested_fixes",
)
op.drop_table("session_suggested_fixes")
op.execute("DROP POLICY IF EXISTS tenant_isolation ON session_facts")
op.execute("ALTER TABLE session_facts DISABLE ROW LEVEL SECURITY")
op.drop_index("idx_session_facts_account", table_name="session_facts")
op.drop_index("idx_session_facts_session", table_name="session_facts")
op.drop_table("session_facts")
op.drop_column("script_templates", "source_ticket_ref")
op.drop_column("script_templates", "source_user_id")
op.drop_column("script_templates", "source_session_id")
op.drop_column("ai_sessions", "state_version")
op.drop_column("ai_sessions", "escalation_package_external_id")
op.drop_column("ai_sessions", "escalation_package_posted_at")
op.drop_column("ai_sessions", "escalation_package_markdown")
op.drop_column("ai_sessions", "resolution_note_external_id")
op.drop_column("ai_sessions", "resolution_note_posted_at")
op.drop_column("ai_sessions", "resolution_note_markdown")
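The migration notes that "only one active fix per session" is enforced by service-layer supersession rather than by a database constraint. The service code is not part of this diff, so the following is only a guess at the shape of that logic, using an in-memory list in place of the `session_suggested_fixes` table. `SuggestedFix`, `FixStore`, `propose`, and `active` are all hypothetical names.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class SuggestedFix:
    session_id: str
    title: str
    superseded_at: Optional[datetime] = None


@dataclass
class FixStore:
    rows: List[SuggestedFix] = field(default_factory=list)

    def propose(self, session_id: str, title: str) -> SuggestedFix:
        """Supersede any active fix for the session, then add the new one."""
        now = datetime.now(timezone.utc)
        for row in self.rows:
            if row.session_id == session_id and row.superseded_at is None:
                row.superseded_at = now
        fix = SuggestedFix(session_id=session_id, title=title)
        self.rows.append(fix)
        return fix

    def active(self, session_id: str) -> List[SuggestedFix]:
        """Equivalent of the partial-index query: superseded_at IS NULL."""
        return [
            r for r in self.rows
            if r.session_id == session_id and r.superseded_at is None
        ]
```

In a real service the supersede-then-insert pair would presumably run in one transaction. If the invariant needed to hold at the database level, a unique partial index on `session_id` could serve instead of the plain partial index, but the migration above deliberately does not do that.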

@@ -10,6 +10,8 @@ from app.core.database import get_db
from app.core.security import decode_token
from app.models.user import User
from app.models.plan_limits import PlanLimits
from app.core.tenant_context import set_current_account_id, clear_current_account_id
from app.core.admin_database import get_admin_db # noqa: F401 — re-exported for use in endpoints
# Routes that are allowed even when must_change_password is True
_PASSWORD_CHANGE_ALLOWLIST = {
@@ -22,10 +24,14 @@ oauth2_scheme = OAuth2PasswordBearer(tokenUrl="/api/v1/auth/login")
async def get_current_user(
-db: Annotated[AsyncSession, Depends(get_db)],
+db: Annotated[AsyncSession, Depends(get_admin_db)],
token: Annotated[str, Depends(oauth2_scheme)]
) -> User:
-"""Get current authenticated user from JWT token."""
+"""Get current authenticated user from JWT token.
+Must use get_admin_db (BYPASSRLS): this dep runs before require_tenant_context
+sets app.current_account_id, so the users table RLS would block the lookup.
+"""
credentials_exception = HTTPException(
status_code=status.HTTP_401_UNAUTHORIZED,
detail="Could not validate credentials",
@@ -75,10 +81,14 @@ async def get_refresh_token_payload(
async def get_current_active_user(
request: Request,
current_user: Annotated[User, Depends(get_current_user)],
-db: Annotated[AsyncSession, Depends(get_db)],
+db: Annotated[AsyncSession, Depends(get_admin_db)],
) -> User:
"""Ensure user is active (not disabled). Auto-downgrades expired trials.
-Enforces must_change_password - blocks all routes except allowlist."""
+Enforces must_change_password - blocks all routes except allowlist.
+Uses get_admin_db: runs before require_tenant_context sets the ContextVar,
+so tenant-scoped tables (subscriptions) would return 0 rows via app role.
+"""
if not current_user.is_active:
raise HTTPException(
status_code=status.HTTP_403_FORBIDDEN,
@@ -190,3 +200,44 @@ async def get_plan_limits_for_user(
"""Get plan limits for the current user's account."""
from app.core.subscriptions import get_user_plan_limits
return await get_user_plan_limits(current_user.account_id, db)
async def require_tenant_context(
current_user: Annotated[User, Depends(get_current_active_user)],
):
"""Set per-request tenant context for RLS.
Raises 403 if the authenticated user has no account_id — never falls back
to PLATFORM_ACCOUNT_ID (that would grant platform-scope access to a
malformed account).
Sets the ContextVar that the SQLAlchemy transaction-begin listener reads to
issue set_config('app.current_account_id', …, true) on every transaction.
Applied to every user-facing router. NOT applied to /admin/* routers or
public endpoints (auth, shared, webhooks).
"""
if current_user.account_id is None:
raise HTTPException(
status_code=status.HTTP_403_FORBIDDEN,
detail="User account required",
)
token = set_current_account_id(current_user.account_id)
try:
yield
finally:
clear_current_account_id(token)
async def require_admin_db(
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)],
) -> AsyncSession:
"""Return a BYPASSRLS admin DB session after verifying super_admin role.
Use on /admin/* endpoints that query RLS-protected tables. Replaces
Depends(get_db) on the db parameter of those endpoints.
The current_user dep is still declared separately on the endpoint if
the user object is needed in the handler.
"""
return db
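The set/yield/clear shape of `require_tenant_context` follows the standard `contextvars` token pattern. The module `app.core.tenant_context` is not shown in this diff, so the sketch below assumes a plausible implementation: the variable `_current_account_id`, the helper bodies, and the `tenant_context` and `demo` functions are hypothetical stand-ins, and `PermissionError` replaces the HTTP 403.

```python
import asyncio
import contextvars
import uuid
from typing import AsyncIterator, Optional

# Assumed module-level ContextVar; None means "no tenant context".
_current_account_id = contextvars.ContextVar("current_account_id", default=None)


def set_current_account_id(account_id: uuid.UUID) -> contextvars.Token:
    """Bind the account for the current context and return a reset token."""
    return _current_account_id.set(account_id)


def clear_current_account_id(token: contextvars.Token) -> None:
    """Restore whatever value was bound before the matching set()."""
    _current_account_id.reset(token)


async def tenant_context(account_id: Optional[uuid.UUID]) -> AsyncIterator[None]:
    """Generator-style dependency: refuse missing accounts, always clean up."""
    if account_id is None:
        raise PermissionError("User account required")
    token = set_current_account_id(account_id)
    try:
        yield
    finally:
        clear_current_account_id(token)


async def demo(account_id: uuid.UUID) -> tuple:
    """Drive the generator by hand and report the value inside and after."""
    gen = tenant_context(account_id)
    await gen.__anext__()              # runs up to the yield
    inside = _current_account_id.get()
    try:
        await gen.__anext__()          # runs the finally block
    except StopAsyncIteration:
        pass
    return inside, _current_account_id.get()
```

Resetting with the token, rather than setting the variable back to `None`, restores the previous binding, so nested or reused contexts do not leak one request's tenant into another.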

@@ -9,12 +9,14 @@ from sqlalchemy import select
from pydantic import BaseModel
from app.core.database import get_db
from app.core.admin_database import get_admin_db
from app.core.subscriptions import get_account_subscription, get_plan_limits, get_account_usage
from app.core.audit import log_audit
from app.models.refresh_token import RefreshToken
from app.core.email import EmailService
from app.models.account import Account
from app.models.account_invite import AccountInvite
from app.models.account_settings import AccountSettings
from app.models.subscription import Subscription
from app.models.user import User
from app.schemas.account import AccountResponse, AccountUpdate, AccountInviteCreate, AccountInviteResponse, TransferOwnershipRequest
@@ -148,7 +150,7 @@ async def update_member_role(
@router.post("/me/transfer-ownership", response_model=AccountResponse)
async def transfer_ownership(
data: TransferOwnershipRequest,
-db: Annotated[AsyncSession, Depends(get_db)],
+db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_account_owner)]
):
"""Transfer account ownership to another member (owner only)."""
@@ -377,7 +379,7 @@ async def list_invites(
@router.post("/me/leave")
async def leave_account(
-db: Annotated[AsyncSession, Depends(get_db)],
+db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(get_current_active_user)]
):
"""Leave the current account (non-owners only). Creates a personal account."""
@@ -423,7 +425,7 @@ class DeleteAccountRequest(BaseModel):
@router.delete("/me")
async def delete_account(
data: DeleteAccountRequest,
-db: Annotated[AsyncSession, Depends(get_db)],
+db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_account_owner)]
):
"""Delete the current account and soft-delete the user (owner only, no other members)."""
@@ -558,3 +560,65 @@ async def get_sso_status(
sso_enabled=account.sso_enabled,
sso_provider=account.sso_provider,
)
# ─── Account Preferences (FlowPilot Phase 6) ──────────────────────────────────
#
# Preferences live in `account_settings.preferences` as a JSONB grab-bag
# (per FLOWPILOT-MIGRATION.md Section 4.6). Rows are lazily created on first
# write. Any engineer-role user can read + update preferences because the
# keys stored here (templatize_prompt_enabled, cw_resolved_status_id, etc.)
# are team-level toggles rather than account-owner-gated admin settings.
class AccountPreferencesResponse(BaseModel):
preferences: dict
class AccountPreferencesUpdate(BaseModel):
    """Merge-style update: each key in `preferences` overwrites that key in
    the stored JSONB; other keys are preserved. Send an empty `preferences`
    object to no-op (the field is required, so the body cannot be omitted).
    """
    preferences: dict
@router.get("/me/preferences", response_model=AccountPreferencesResponse)
async def get_my_preferences(
db: Annotated[AsyncSession, Depends(get_db)],
current_user: Annotated[User, Depends(get_current_active_user)],
):
"""Return the current account's preferences JSONB (empty dict if no row)."""
result = await db.execute(
select(AccountSettings.preferences).where(
AccountSettings.account_id == current_user.account_id
)
)
prefs = result.scalar_one_or_none() or {}
return AccountPreferencesResponse(preferences=prefs)
@router.patch("/me/preferences", response_model=AccountPreferencesResponse)
async def update_my_preferences(
data: AccountPreferencesUpdate,
db: Annotated[AsyncSession, Depends(get_db)],
current_user: Annotated[User, Depends(get_current_active_user)],
):
"""Upsert preference keys. Existing keys not present in the payload are kept.
Example: posting `{"preferences": {"templatize_prompt_enabled": false}}`
from the post-resolve "Don't ask me again for this team" checkbox sets
just that key without clobbering any other preferences.
"""
for key, value in data.preferences.items():
await AccountSettings.set_setting(db, current_user.account_id, key, value)
await db.commit()
# Return the merged state so the client doesn't need a second GET.
result = await db.execute(
select(AccountSettings.preferences).where(
AccountSettings.account_id == current_user.account_id
)
)
prefs = result.scalar_one_or_none() or {}
return AccountPreferencesResponse(preferences=prefs)
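The merge-style semantics described in the `AccountPreferencesUpdate` docstring above can be sketched in isolation. This is a minimal illustration, not the project's code: `merge_preferences` is a hypothetical helper, and the real `AccountSettings.set_setting` (not shown in this diff) may behave differently, for example around nested values.

```python
from typing import Optional


def merge_preferences(stored: Optional[dict], patch: dict) -> dict:
    """Return a new dict where each key in `patch` overwrites the same key
    in `stored`, and keys absent from `patch` are preserved.

    Hypothetical helper for illustration only. The merge is shallow:
    a nested dict in `patch` replaces the stored value wholesale.
    """
    merged = dict(stored or {})  # a missing row behaves like an empty dict
    merged.update(patch)
    return merged


# Example: toggling one key leaves the others untouched.
current = {"templatize_prompt_enabled": True, "cw_resolved_status_id": 7}
updated = merge_preferences(current, {"templatize_prompt_enabled": False})
```

Because the sketch copies `stored` before updating, the caller's original dict is left unmodified, which makes the helper easy to test without a database.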


@@ -8,7 +8,7 @@ from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy import select, func, or_
from sqlalchemy.orm import selectinload, aliased
from app.core.database import get_db
from app.core.admin_database import get_admin_db
from app.core.audit import log_audit
from app.core.config import settings
from app.core.security import get_password_hash, generate_temp_password, create_password_reset_token, decode_token, hash_token
@@ -57,7 +57,7 @@ router = APIRouter(prefix="/admin", tags=["admin"])
@router.get("/users", response_model=AdminUserListResponse)
async def list_users(
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)],
page: Optional[int] = Query(None, ge=1),
size: Optional[int] = Query(None, ge=1, le=100),
@@ -153,7 +153,7 @@ async def list_users(
@router.get("/accounts", response_model=AdminAccountListResponse)
async def list_accounts(
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)],
page: int = Query(1, ge=1),
size: int = Query(12, ge=1, le=100),
@@ -427,14 +427,23 @@ async def _get_account_detail_payload(
@router.post("/accounts", response_model=AdminAccountDetailResponse, status_code=status.HTTP_201_CREATED)
async def create_account(
data: AdminAccountCreate,
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)],
):
"""Create a new account without requiring an initial user."""
owner_id = None
if data.owner_email:
result = await db.execute(select(User).where(User.email == data.owner_email.strip()))
owner = result.scalar_one_or_none()
if not owner:
raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail=f"No user found with email '{data.owner_email}'")
owner_id = owner.id
display_code = await _generate_unique_display_code(db)
new_account = Account(
name=data.name.strip(),
display_code=display_code,
owner_id=owner_id,
)
db.add(new_account)
await db.flush()
@@ -448,7 +457,7 @@ async def create_account(
await log_audit(
db, current_user.id, "account.create_admin", "account", new_account.id,
{"name": new_account.name, "plan": data.plan},
{"name": new_account.name, "plan": data.plan, "owner_email": data.owner_email},
)
await db.commit()
return await _get_account_detail_payload(new_account.id, db)
@@ -457,7 +466,7 @@ async def create_account(
@router.get("/accounts/{account_id}", response_model=AdminAccountDetailResponse)
async def get_account_detail(
account_id: UUID,
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)],
include_archived: bool = Query(False),
):
@@ -469,7 +478,7 @@ async def get_account_detail(
async def update_account(
account_id: UUID,
data: AdminAccountUpdate,
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)],
):
"""Update account settings from the admin panel."""
@@ -491,7 +500,7 @@ async def update_account(
@router.post("/users", response_model=AdminUserCreateResponse, status_code=status.HTTP_201_CREATED)
async def create_user(
data: AdminUserCreate,
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)],
):
"""Create a new user with a temporary password (super admin only).
@@ -616,7 +625,7 @@ async def create_user(
@router.get("/users/{user_id}", response_model=UserDetailResponse)
async def get_user(
user_id: UUID,
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)]
):
"""Get enriched user details (super admin only)."""
@@ -734,7 +743,7 @@ async def get_user(
async def update_user_role(
user_id: UUID,
role_data: RoleUpdate,
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)]
):
"""Change user role (super admin only)."""
@@ -766,7 +775,7 @@ async def update_user_role(
async def update_account_role(
user_id: UUID,
data: AccountRoleUpdate,
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)]
):
"""Change a user's account role (super admin only)."""
@@ -792,7 +801,7 @@ async def update_account_role(
async def update_super_admin_status(
user_id: UUID,
data: dict,
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)]
):
"""Promote or demote a user to/from super admin (super admin only)."""
@@ -831,7 +840,7 @@ async def update_super_admin_status(
@router.put("/users/{user_id}/deactivate", response_model=UserResponse)
async def deactivate_user(
user_id: UUID,
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)]
):
"""Deactivate a user account (super admin only)."""
@@ -860,7 +869,7 @@ async def deactivate_user(
@router.put("/users/{user_id}/activate", response_model=UserResponse)
async def activate_user(
user_id: UUID,
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)]
):
"""Reactivate a user account (super admin only)."""
@@ -884,7 +893,7 @@ async def activate_user(
async def move_user_account(
user_id: UUID,
data: MoveUserAccount,
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)],
):
"""Move a user to a different account (super admin only)."""
@@ -959,7 +968,7 @@ async def _get_account_subscription(account_id: UUID, db: AsyncSession) -> tuple
async def update_user_plan(
user_id: UUID,
data: SubscriptionPlanUpdate,
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)],
):
"""Change a user's subscription plan (super admin only)."""
@@ -978,7 +987,7 @@ async def update_user_plan(
async def update_account_plan(
account_id: UUID,
data: SubscriptionPlanUpdate,
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)],
):
"""Change an account subscription plan (super admin only)."""
@@ -1003,7 +1012,7 @@ async def update_account_plan(
async def extend_user_trial(
user_id: UUID,
data: ExtendTrialRequest,
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)],
):
"""Extend or start a trial for a user's subscription (super admin only)."""
@@ -1033,7 +1042,7 @@ async def extend_user_trial(
async def extend_account_trial(
account_id: UUID,
data: ExtendTrialRequest,
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)],
):
"""Extend or start a trial for an account subscription (super admin only)."""
@@ -1070,7 +1079,7 @@ async def extend_account_trial(
async def admin_reset_password(
user_id: UUID,
data: AdminPasswordReset,
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)],
):
"""Admin-triggered password reset (super admin only).
@@ -1141,7 +1150,7 @@ async def admin_reset_password(
@router.put("/users/{user_id}/archive", response_model=UserResponse)
async def archive_user(
user_id: UUID,
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)],
):
"""Archive (soft delete) a user (super admin only)."""
@@ -1176,7 +1185,7 @@ async def archive_user(
@router.put("/users/{user_id}/restore", response_model=UserResponse)
async def restore_user(
user_id: UUID,
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)],
):
"""Restore an archived user (super admin only)."""
@@ -1201,7 +1210,7 @@ async def restore_user(
@router.get("/users/{user_id}/hard-delete-check", response_model=HardDeleteCheckResponse)
async def hard_delete_check(
user_id: UUID,
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)],
):
"""Check if a user can be hard-deleted (super admin only). Returns blockers."""
@@ -1274,7 +1283,7 @@ async def hard_delete_check(
@router.delete("/users/{user_id}/hard-delete", status_code=status.HTTP_204_NO_CONTENT)
async def hard_delete_user(
user_id: UUID,
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)],
):
"""Permanently delete a user (super admin only). User must be archived first."""
@@ -1334,7 +1343,7 @@ async def hard_delete_user(
@router.post("/invites", status_code=status.HTTP_201_CREATED)
async def admin_create_invite(
data: dict,
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)],
):
"""Quick-invite a user to an account (super admin only).


@@ -4,25 +4,26 @@ from fastapi import APIRouter, Depends, HTTPException, status
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy import select, func
from app.core.database import get_db
from app.core.admin_database import get_admin_db
from app.core.audit import log_audit
from app.models.user import User
from app.models.category import TreeCategory
from app.models.tree import Tree
from app.schemas.admin import GlobalCategoryCreate, GlobalCategoryUpdate, GlobalCategoryResponse
from app.api.deps import require_admin
from app.core.service_account import PLATFORM_ACCOUNT_ID
router = APIRouter(prefix="/admin/categories", tags=["admin-categories"])
@router.get("/global", response_model=list[GlobalCategoryResponse])
async def list_global_categories(
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)],
):
"""List all global categories (account_id IS NULL)."""
result = await db.execute(
select(TreeCategory).where(TreeCategory.account_id.is_(None)).order_by(TreeCategory.name)
select(TreeCategory).where(TreeCategory.account_id == PLATFORM_ACCOUNT_ID).order_by(TreeCategory.name)
)
categories = result.scalars().all()
@@ -45,36 +46,36 @@ async def list_global_categories(
@router.post("/global", response_model=GlobalCategoryResponse, status_code=status.HTTP_201_CREATED)
async def create_global_category(
data: GlobalCategoryCreate,
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)],
):
"""Create a global category."""
# Check slug uniqueness for global categories
existing = await db.execute(
select(TreeCategory).where(TreeCategory.slug == data.slug, TreeCategory.account_id.is_(None))
select(TreeCategory).where(TreeCategory.slug == data.slug, TreeCategory.account_id == PLATFORM_ACCOUNT_ID)
)
if existing.scalar_one_or_none():
raise HTTPException(status_code=status.HTTP_409_CONFLICT, detail="Global category with this slug already exists")
category = TreeCategory(name=data.name, slug=data.slug, account_id=None)
category = TreeCategory(name=data.name, slug=data.slug, account_id=PLATFORM_ACCOUNT_ID)
db.add(category)
await log_audit(db, current_user.id, "global_category.create", "category", details={"name": data.name})
await db.commit()
await db.refresh(category)
return GlobalCategoryResponse(id=category.id, name=category.name, slug=category.slug, account_id=None, tree_count=0)
return GlobalCategoryResponse(id=category.id, name=category.name, slug=category.slug, account_id=PLATFORM_ACCOUNT_ID, tree_count=0)
@router.put("/global/{category_id}", response_model=GlobalCategoryResponse)
async def update_global_category(
category_id: UUID,
data: GlobalCategoryUpdate,
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)],
):
"""Update a global category."""
result = await db.execute(
select(TreeCategory).where(TreeCategory.id == category_id, TreeCategory.account_id.is_(None))
select(TreeCategory).where(TreeCategory.id == category_id, TreeCategory.account_id == PLATFORM_ACCOUNT_ID)
)
category = result.scalar_one_or_none()
if not category:
@@ -86,7 +87,7 @@ async def update_global_category(
# Check slug uniqueness
existing = await db.execute(
select(TreeCategory).where(
TreeCategory.slug == data.slug, TreeCategory.account_id.is_(None), TreeCategory.id != category_id
TreeCategory.slug == data.slug, TreeCategory.account_id == PLATFORM_ACCOUNT_ID, TreeCategory.id != category_id
)
)
if existing.scalar_one_or_none():
@@ -103,19 +104,19 @@ async def update_global_category(
return GlobalCategoryResponse(
id=category.id, name=category.name, slug=category.slug,
account_id=None, tree_count=tree_count,
account_id=PLATFORM_ACCOUNT_ID, tree_count=tree_count,
)
@router.delete("/global/{category_id}", status_code=status.HTTP_204_NO_CONTENT)
async def delete_global_category(
category_id: UUID,
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)],
):
"""Delete (archive) a global category."""
result = await db.execute(
select(TreeCategory).where(TreeCategory.id == category_id, TreeCategory.account_id.is_(None))
select(TreeCategory).where(TreeCategory.id == category_id, TreeCategory.account_id == PLATFORM_ACCOUNT_ID)
)
category = result.scalar_one_or_none()
if not category:


@@ -3,7 +3,7 @@ from fastapi import APIRouter, Depends
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy import select, func
from app.core.database import get_db
from app.core.admin_database import get_admin_db
from app.models.user import User
from app.models.subscription import Subscription
from app.models.tree import Tree
@@ -16,7 +16,7 @@ router = APIRouter(prefix="/admin/dashboard", tags=["admin-dashboard"])
@router.get("/metrics", response_model=DashboardMetrics)
async def get_dashboard_metrics(
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)],
):
"""Get platform overview metrics."""
@@ -45,7 +45,7 @@ async def get_dashboard_metrics(
@router.get("/activity", response_model=list[ActivityEntry])
async def get_dashboard_activity(
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)],
):
"""Get recent audit log entries for activity feed."""


@@ -12,7 +12,7 @@ from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession
from app.api.deps import require_admin
from app.core.database import get_db
from app.core.admin_database import get_admin_db
from app.models.script_template import ScriptTemplate
from app.models.tree import Tree
from app.models.user import User
@@ -66,7 +66,7 @@ def _script_summary(script: ScriptTemplate) -> dict:
@router.get("/featured")
async def list_featured(
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)],
):
"""List all featured flows and scripts (super admin only)."""
@@ -92,7 +92,7 @@ async def list_featured(
@router.get("/items")
async def list_all_items(
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)],
):
"""List ALL flows and scripts with their gallery status (super admin only)."""
@@ -119,7 +119,7 @@ async def list_all_items(
async def toggle_flow_featured(
flow_id: UUID,
body: FeatureToggle,
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)],
):
"""Toggle is_gallery_featured on a flow (super admin only)."""
@@ -138,7 +138,7 @@ async def toggle_flow_featured(
async def update_flow_sort_order(
flow_id: UUID,
body: SortOrderUpdate,
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)],
):
"""Update gallery_sort_order on a flow (super admin only)."""
@@ -157,7 +157,7 @@ async def update_flow_sort_order(
async def toggle_script_featured(
script_id: UUID,
body: FeatureToggle,
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)],
):
"""Toggle is_gallery_featured on a script (super admin only)."""
@@ -176,7 +176,7 @@ async def toggle_script_featured(
async def update_script_sort_order(
script_id: UUID,
body: SortOrderUpdate,
db: Annotated[AsyncSession, Depends(get_db)],
db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(require_admin)],
):
"""Update gallery_sort_order on a script (super admin only)."""


@@ -49,8 +49,6 @@ from app.schemas.ai_session import (
ChatMessageRequest,
ChatMessageResponse,
SaveTaskLaneRequest,
TriagePatchRequest,
TriagePatchResponse,
)
from app.services import flowpilot_engine
from app.services import unified_chat_service
@@ -122,11 +120,6 @@ def _build_session_detail(session: AISession) -> AISessionDetail:
pending_task_lane=session.pending_task_lane,
is_branching=getattr(session, 'is_branching', False),
active_branch_id=str(session.active_branch_id) if getattr(session, 'active_branch_id', None) else None,
client_name=getattr(session, 'client_name', None),
asset_name=getattr(session, 'asset_name', None),
issue_category=getattr(session, 'issue_category', None),
triage_hypothesis=getattr(session, 'triage_hypothesis', None),
evidence_items=getattr(session, 'evidence_items', None),
)
@@ -308,7 +301,7 @@ async def send_chat_message(
message = f"{message}\n\n[Attached document content]\n{doc_context}"
try:
ai_content, suggested_flows, session, fork_metadata, actions_data, questions_data, triage_update_data = await unified_chat_service.send_chat_message(
ai_content, suggested_flows, session, fork_metadata, actions_data, questions_data = await unified_chat_service.send_chat_message(
session_id=session_id,
user_id=user_id,
account_id=account_id,
@@ -353,7 +346,6 @@ async def send_chat_message(
fork=fork_metadata,
actions=actions_data,
questions=questions_data,
triage_update=triage_update_data,
)
@@ -450,12 +442,7 @@ async def resolve_session(
try:
from app.services.resolution_output_generator import ResolutionOutputGenerator
gen = ResolutionOutputGenerator(db)
await gen.generate_all(
session_id,
root_cause=data.root_cause,
steps_taken=data.steps_taken,
recommendations=data.recommendations,
)
await gen.generate_all(session_id)
except Exception:
logger.exception(f"Failed to generate resolution outputs for session {session_id}")
@@ -532,11 +519,15 @@ async def save_task_lane(
_: None = Depends(require_engineer_or_admin),
):
"""Save the current task lane state including user's in-progress responses."""
session = await db.get(AISession, session_id)
result = await db.execute(
select(AISession).where(
AISession.id == session_id,
AISession.user_id == current_user.id,
)
)
session = result.scalar_one_or_none()
if not session:
raise HTTPException(status_code=404, detail="Session not found")
if session.user_id != current_user.id:
raise HTTPException(status_code=403, detail="Not your session")
payload = {
"questions": [q.model_dump() for q in body.questions],
@@ -553,122 +544,6 @@ async def save_task_lane(
await db.commit()
# ── Triage Metadata ──
@router.patch("/{session_id}/triage", response_model=TriagePatchResponse)
@limiter.limit("30/minute")
async def update_triage(
request: Request,
session_id: UUID,
body: TriagePatchRequest,
current_user: Annotated[User, Depends(get_current_active_user)],
db: Annotated[AsyncSession, Depends(get_db)],
_: None = Depends(require_engineer_or_admin),
):
"""Update triage metadata on a session (incident header fields)."""
session = await db.get(AISession, session_id)
if not session:
raise HTTPException(status_code=404, detail="Session not found")
if session.user_id != current_user.id:
raise HTTPException(status_code=403, detail="Not your session")
patch_data = body.model_dump(exclude_unset=True)
for field, value in patch_data.items():
setattr(session, field, value)
await db.commit()
await db.refresh(session)
return TriagePatchResponse(
id=session.id,
client_name=session.client_name,
asset_name=session.asset_name,
issue_category=session.issue_category,
triage_hypothesis=session.triage_hypothesis,
evidence_items=session.evidence_items,
)
# ── Handoff Draft ──
@router.post("/{session_id}/handoff-draft")
@limiter.limit("10/minute")
async def handoff_draft(
request: Request,
session_id: UUID,
current_user: Annotated[User, Depends(get_current_active_user)],
db: Annotated[AsyncSession, Depends(get_db)],
_: None = Depends(require_engineer_or_admin),
):
"""Stream a structured handoff draft for the conclude modal."""
from fastapi.responses import StreamingResponse
from app.services.assistant_chat_service import _call_ai
session = await db.get(AISession, session_id)
if not session:
raise HTTPException(status_code=404, detail="Session not found")
if session.user_id != current_user.id:
raise HTTPException(status_code=403, detail="Not your session")
# Build context from session data
context_parts = [
f"Problem: {session.problem_summary or 'Unknown'}",
f"Domain: {session.problem_domain or 'Unknown'}",
f"Client: {session.client_name or 'Unknown'}",
f"Asset: {session.asset_name or 'Unknown'}",
f"Hypothesis: {session.triage_hypothesis or 'None'}",
]
if session.evidence_items:
context_parts.append("\nEvidence collected:")
for item in session.evidence_items:
status_icon = {"confirmed": "", "ruled_out": "", "pending": "?"}.get(item.get("status", ""), "?")
context_parts.append(f" {status_icon} {item.get('text', '')}")
# Include task lane steps if available
if session.pending_task_lane:
actions = session.pending_task_lane.get("actions", [])
if actions:
context_parts.append("\nSteps taken:")
for a in actions:
context_parts.append(f" - {a.get('label', '')}")
# Include last 20 conversation messages
msgs = session.conversation_messages or []
if msgs:
context_parts.append("\nRecent conversation:")
for msg in msgs[-20:]:
role = msg.get("role", "unknown")
content = msg.get("content", "")[:300]
context_parts.append(f" [{role}]: {content}")
context = "\n".join(context_parts)
prompt = (
"Generate a structured handoff summary for this troubleshooting session.\n"
"Return ONLY valid JSON with exactly these four fields:\n"
'{"root_cause": "...", "resolution": "...", "steps_taken": ["step1", "step2"], "recommendations": "..."}\n\n'
f"Session context:\n{context}"
)
async def generate():
try:
content, _, _ = await _call_ai(
system_base="You are a concise technical documentation assistant for MSP teams. Return only JSON.",
rag_context="",
history=[],
new_message=prompt,
max_tokens=1024,
)
yield f"data: {content}\n\n"
except Exception as e:
logger.exception(f"Handoff draft generation failed for session {session_id}")
import json
yield f"data: {json.dumps({'error': str(e)})}\n\n"
return StreamingResponse(generate(), media_type="text/event-stream")
# ── Resume ──
@router.post("/{session_id}/resume", status_code=204)
@@ -891,13 +766,13 @@ async def search_sessions(
limit: int = Query(5, ge=1, le=20),
):
"""Search AI sessions by content using full-text search. Used by Command Palette."""
# Sessions are user-scoped. The list endpoint uses user_id only;
# search must be consistent. Cross-user access requires explicit
# escalation or session sharing — not ambient account membership.
result = await db.execute(
select(AISession)
.where(
or_(
AISession.user_id == current_user.id,
AISession.account_id == current_user.account_id,
),
AISession.user_id == current_user.id,
text("ai_sessions.search_vector @@ plainto_tsquery('english', :q)"),
)
.params(q=q)
@@ -1030,7 +905,7 @@ async def get_session(
pkg = session.escalation_package or {}
is_handler = pkg.get("picked_up_by") == str(current_user.id)
if session.user_id != current_user.id and session.escalated_to_id != current_user.id and not is_handler:
raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail="Not authorized")
raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Session not found")
return _build_session_detail(session)
@@ -1046,6 +921,18 @@ async def get_documentation(
db: Annotated[AsyncSession, Depends(get_db)],
):
"""Get auto-generated documentation for a session."""
# Verify session ownership — owner only. Documentation endpoints require direct
# ownership; escalated_to_id / picked_up_by handlers use get_session (read-only).
# This is consistent with stream_documentation which has the same owner-only check.
result = await db.execute(
select(AISession).where(
AISession.id == session_id,
AISession.user_id == current_user.id,
)
)
if not result.scalar_one_or_none():
raise HTTPException(status_code=404, detail="Session not found")
try:
return await flowpilot_engine.get_session_documentation(
session_id=session_id,
@@ -1071,13 +958,14 @@ async def stream_documentation(
# Verify session ownership
result = await db.execute(
select(AISession).where(AISession.id == session_id)
select(AISession).where(
AISession.id == session_id,
AISession.user_id == current_user.id,
)
)
session = result.scalar_one_or_none()
if not session:
raise HTTPException(status_code=404, detail="Session not found")
if session.user_id != current_user.id:
raise HTTPException(status_code=403, detail="Not authorized")
async def event_generator():
try:
@@ -1172,6 +1060,19 @@ async def retry_psa_push_endpoint(
"""Manually retry a failed PSA documentation push."""
from app.models.psa_post_log import PsaPostLog
# Verify the session belongs to the current user
session_result = await db.execute(
select(AISession).where(
AISession.id == session_id,
AISession.user_id == current_user.id,
)
)
if not session_result.scalar_one_or_none():
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail="Session not found",
)
# Find the latest failed push log for this session
result = await db.execute(
select(PsaPostLog)
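Several hunks in this file replace "load by id, then return 403 on an owner mismatch" with a single owner-scoped lookup that returns 404 in both cases. The sketch below shows that pattern with an in-memory list instead of a database. `SessionRow`, `SessionNotFound`, and `get_owned_session` are hypothetical names for illustration, and the rationale (a caller cannot tell whether another user's session exists) is an inference, not something the diff states.

```python
from dataclasses import dataclass
from typing import Iterable
from uuid import UUID, uuid4


class SessionNotFound(Exception):
    """Raised for a missing session and for a session owned by someone else."""


@dataclass
class SessionRow:
    id: UUID
    user_id: UUID


def get_owned_session(rows: Iterable[SessionRow], session_id: UUID, user_id: UUID) -> SessionRow:
    """Return the row only if both the id and the owner match.

    Mirrors a WHERE clause that filters on id AND user_id, so a non-owner
    gets the same "not found" outcome as a request for a nonexistent id.
    """
    for row in rows:
        if row.id == session_id and row.user_id == user_id:
            return row
    raise SessionNotFound("Session not found")


# Example: the owner can fetch the row; anyone else sees "not found".
owner_id, other_id = uuid4(), uuid4()
rows = [SessionRow(id=uuid4(), user_id=owner_id)]
owned = get_owned_session(rows, rows[0].id, owner_id)
```

Folding the ownership check into the lookup also removes the window in which handler code holds another user's row before the check runs.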


@@ -43,6 +43,7 @@ async def create_suggestion(
suggestion = AISuggestion(
tree_id=data.tree_id,
user_id=current_user.id,
account_id=current_user.account_id,
session_id=data.session_id,
action_type=data.action_type,
target_node_id=data.target_node_id,


@@ -7,6 +7,7 @@ from sqlalchemy.ext.asyncio import AsyncSession
from app.core.database import get_db
from app.api.deps import get_current_active_user
from app.core.filters import tenant_filter
from app.models import User, Session, Tree, SessionRating
from app.schemas.analytics import (
TeamAnalyticsResponse, PersonalAnalyticsResponse, FlowAnalyticsResponse,
@@ -290,8 +291,13 @@ async def get_flow_analytics(
current_user: User = Depends(get_current_active_user),
):
"""Analytics for a specific flow."""
# Verify tree exists
result = await db.execute(select(Tree).where(Tree.id == tree_id))
# Verify tree exists and belongs to the requesting user's account.
result = await db.execute(
select(Tree).where(
Tree.id == tree_id,
tenant_filter(Tree, current_user.account_id),
)
)
tree = result.scalar_one_or_none()
if not tree:
raise HTTPException(status_code=404, detail="Flow not found")


@@ -1,6 +1,5 @@
import secrets
import string
import uuid
from datetime import datetime, timezone, timedelta
from typing import Annotated
from fastapi import APIRouter, Depends, HTTPException, status, Request
@@ -9,7 +8,7 @@ from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy import select, update as sa_update
from app.core.config import settings
from app.core.settings_manager import SettingsManager
from app.core.database import get_db
from app.core.admin_database import get_admin_db
from app.core.rate_limit import limiter
from app.core.security import (
verify_password,
@@ -27,7 +26,6 @@ from app.models.refresh_token import RefreshToken
from app.models.account import Account
from app.models.subscription import Subscription
from app.models.account_invite import AccountInvite
from app.models.feature_flag import FeatureFlag, PlanFeatureDefault, AccountFeatureOverride
from app.schemas.user import UserCreate, UserResponse, UserLogin, UserUpdate
from app.schemas.token import Token
from app.schemas.auth_password import (
@@ -69,7 +67,7 @@ def _generate_display_code() -> str:
async def register(
request: Request,
user_data: UserCreate,
db: Annotated[AsyncSession, Depends(get_db)]
db: Annotated[AsyncSession, Depends(get_admin_db)]
):
"""Register a new user.
@@ -234,7 +232,7 @@ async def register(
async def login(
request: Request,
form_data: Annotated[OAuth2PasswordRequestForm, Depends()],
db: Annotated[AsyncSession, Depends(get_db)]
db: Annotated[AsyncSession, Depends(get_admin_db)]
):
"""Login and get access token."""
# Find user by email
@@ -272,7 +270,7 @@ async def login(
async def login_json(
request: Request,
credentials: UserLogin,
db: Annotated[AsyncSession, Depends(get_db)]
db: Annotated[AsyncSession, Depends(get_admin_db)]
):
"""Login with JSON body (alternative to form data)."""
result = await db.execute(select(User).where(User.email == credentials.email))
@@ -306,7 +304,7 @@ async def login_json(
async def refresh_token(
request: Request,
payload: Annotated[dict, Depends(get_refresh_token_payload)],
db: Annotated[AsyncSession, Depends(get_db)]
db: Annotated[AsyncSession, Depends(get_admin_db)]
):
"""Refresh access token using refresh token (rotation: old token is revoked)."""
user_id = payload.get("sub")
@@ -370,7 +368,7 @@ async def get_me(
async def update_me(
data: UserUpdate,
current_user: Annotated[User, Depends(get_current_active_user)],
db: Annotated[AsyncSession, Depends(get_db)]
db: Annotated[AsyncSession, Depends(get_admin_db)]
):
"""Update current user's profile (name, email)."""
update_fields = data.model_fields_set - {"current_password"}
@@ -417,7 +415,7 @@ async def update_me(
@router.post("/logout")
async def logout(
payload: Annotated[dict, Depends(get_refresh_token_payload)],
db: Annotated[AsyncSession, Depends(get_db)]
db: Annotated[AsyncSession, Depends(get_admin_db)]
):
"""Logout user by revoking the refresh token."""
jti = payload.get("jti")
@@ -440,7 +438,7 @@ async def change_password(
request: Request,
data: ChangePasswordRequest,
current_user: Annotated[User, Depends(get_current_active_user)],
db: Annotated[AsyncSession, Depends(get_db)]
db: Annotated[AsyncSession, Depends(get_admin_db)]
):
"""Change the current user's password."""
if not verify_password(data.current_password, current_user.password_hash):
@@ -480,7 +478,7 @@ async def change_password(
async def forgot_password(
request: Request,
data: ForgotPasswordRequest,
db: Annotated[AsyncSession, Depends(get_db)]
db: Annotated[AsyncSession, Depends(get_admin_db)]
):
"""Request a password reset email. Always returns success (anti-enumeration)."""
result = await db.execute(select(User).where(User.email == data.email))
@@ -515,7 +513,7 @@ async def forgot_password(
@router.post("/password/verify-reset-token", response_model=VerifyResetTokenResponse)
async def verify_reset_token(
data: VerifyResetTokenRequest,
db: Annotated[AsyncSession, Depends(get_db)]
db: Annotated[AsyncSession, Depends(get_admin_db)]
):
"""Verify a password reset token is valid."""
payload = decode_token(data.token)
@@ -546,7 +544,7 @@ async def verify_reset_token(
async def reset_password(
request: Request,
data: ResetPasswordRequest,
db: Annotated[AsyncSession, Depends(get_db)]
db: Annotated[AsyncSession, Depends(get_admin_db)]
):
"""Reset password using a valid reset token."""
payload = decode_token(data.token)
@@ -613,7 +611,7 @@ async def reset_password(
@router.get("/email/verification-status")
async def get_verification_status(
db: Annotated[AsyncSession, Depends(get_db)]
db: Annotated[AsyncSession, Depends(get_admin_db)]
):
"""Check if email verification is enabled on the platform."""
enabled = await SettingsManager.get("email_verification_enabled", db, default=True)
@@ -625,7 +623,7 @@ async def get_verification_status(
async def send_verification_email(
request: Request,
current_user: Annotated[User, Depends(get_current_active_user)],
db: Annotated[AsyncSession, Depends(get_db)]
db: Annotated[AsyncSession, Depends(get_admin_db)]
):
"""Send an email verification link to the current user."""
verification_enabled = await SettingsManager.get("email_verification_enabled", db, default=True)
@@ -664,7 +662,7 @@ async def send_verification_email(
@router.post("/email/verify")
async def verify_email(
data: dict,
db: Annotated[AsyncSession, Depends(get_db)]
db: Annotated[AsyncSession, Depends(get_admin_db)]
):
"""Verify an email using a token. Public endpoint."""
token = data.get("token")
@@ -720,59 +718,3 @@ async def verify_email(
await db.commit()
return {"message": "Email verified successfully"}
@router.get("/me/feature-flags", response_model=dict[str, bool])
async def get_my_feature_flags(
current_user: Annotated[User, Depends(get_current_active_user)],
db: Annotated[AsyncSession, Depends(get_db)],
) -> dict[str, bool]:
"""Resolve feature flags for the current user's account and plan."""
plan = "free"
if current_user.account_id:
sub_result = await db.execute(
select(Subscription).where(
Subscription.account_id == current_user.account_id,
Subscription.status.in_(["active", "trialing"]),
)
)
sub = sub_result.scalar_one_or_none()
if sub:
plan = sub.plan
flags_result = await db.execute(select(FeatureFlag))
flags = flags_result.scalars().all()
if not flags:
return {}
flag_ids = [f.id for f in flags]
defaults_result = await db.execute(
select(PlanFeatureDefault).where(
PlanFeatureDefault.flag_id.in_(flag_ids),
PlanFeatureDefault.plan == plan,
)
)
plan_defaults = {d.flag_id: d.enabled for d in defaults_result.scalars().all()}
overrides: dict[uuid.UUID, bool] = {}
if current_user.account_id:
overrides_result = await db.execute(
select(AccountFeatureOverride).where(
AccountFeatureOverride.flag_id.in_(flag_ids),
AccountFeatureOverride.account_id == current_user.account_id,
)
)
overrides = {o.flag_id: o.enabled for o in overrides_result.scalars().all()}
resolved = {}
for flag in flags:
if flag.id in overrides:
resolved[flag.flag_key] = overrides[flag.id]
elif flag.id in plan_defaults:
resolved[flag.flag_key] = plan_defaults[flag.id]
else:
resolved[flag.flag_key] = False
return resolved
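
Review note: the precedence applied by `get_my_feature_flags` in the hunk above (account override first, then plan default, then `False`) can be sketched as a pure function. This is a minimal sketch with no database access; `resolve_flags` and the flag keys are hypothetical names used only for illustration.

```python
from uuid import UUID, uuid4


def resolve_flags(
    flags: dict[UUID, str],
    plan_defaults: dict[UUID, bool],
    overrides: dict[UUID, bool],
) -> dict[str, bool]:
    """Resolve each flag: account override, then plan default, then False."""
    resolved: dict[str, bool] = {}
    for flag_id, flag_key in flags.items():
        if flag_id in overrides:
            # An account-level override always wins.
            resolved[flag_key] = overrides[flag_id]
        elif flag_id in plan_defaults:
            # Otherwise fall back to the default for the account's plan.
            resolved[flag_key] = plan_defaults[flag_id]
        else:
            # Flags with no override and no plan default are off.
            resolved[flag_key] = False
    return resolved


# Illustrative data only; these flag keys are made up.
beta, export, sso = uuid4(), uuid4(), uuid4()
example = resolve_flags(
    flags={beta: "beta_ui", export: "csv_export", sso: "sso"},
    plan_defaults={beta: True, export: True},
    overrides={export: False},
)
```

Here `csv_export` is enabled by the plan but disabled by the account override, and `sso` has neither entry, so both resolve to `False`.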


@@ -12,6 +12,8 @@ from app.models.user import User
from app.schemas.category import CategoryCreate, CategoryUpdate, CategoryResponse, CategoryListResponse
from app.api.deps import get_current_active_user
from app.core.permissions import can_manage_category, can_create_category
from app.core.service_account import PLATFORM_ACCOUNT_ID
from app.core.filters import tenant_filter
router = APIRouter(prefix="/categories", tags=["categories"])
@@ -47,13 +49,13 @@ async def list_categories(
elif current_user.account_id:
query = query.where(
or_(
TreeCategory.account_id.is_(None), # Global
TreeCategory.account_id == PLATFORM_ACCOUNT_ID, # Global
TreeCategory.account_id == current_user.account_id # User's account
)
)
else:
# User has no account, only show global categories
query = query.where(TreeCategory.account_id.is_(None))
query = query.where(TreeCategory.account_id == PLATFORM_ACCOUNT_ID)
query = query.order_by(TreeCategory.display_order, TreeCategory.name)
@@ -108,10 +110,12 @@ async def get_category(
detail="You don't have access to this category"
)
# Get tree count
# Get tree count — scoped to the requesting account so cross-account
# trees in shared categories are not counted.
count_query = select(func.count(Tree.id)).where(
Tree.category_id == category.id,
Tree.is_active == True
Tree.is_active == True,
tenant_filter(Tree, current_user.account_id),
)
count_result = await db.execute(count_query)
tree_count = count_result.scalar() or 0
@@ -173,7 +177,7 @@ async def create_category(
name=category_data.name,
slug=slug,
description=category_data.description,
account_id=category_data.account_id,
account_id=category_data.account_id if category_data.account_id is not None else PLATFORM_ACCOUNT_ID,
display_order=max_order + 1,
created_by=current_user.id
)


@@ -15,6 +15,7 @@ from app.schemas.device_type import (
DeviceTypeUpdate,
DeviceTypeResponse,
)
from app.core.service_account import PLATFORM_ACCOUNT_ID
router = APIRouter(prefix="/device-types", tags=["device-types"])
@@ -28,8 +29,8 @@ async def list_device_types(
select(DeviceType)
.where(
or_(
DeviceType.is_system.is_(True),
DeviceType.team_id == current_user.team_id,
DeviceType.account_id == PLATFORM_ACCOUNT_ID,
DeviceType.account_id == current_user.account_id,
)
)
.order_by(DeviceType.category, DeviceType.sort_order, DeviceType.label)
@@ -48,16 +49,16 @@ async def create_device_type(
existing = await db.execute(
select(DeviceType).where(
DeviceType.slug == data.slug,
DeviceType.team_id == current_user.team_id,
DeviceType.account_id == current_user.account_id,
)
)
if existing.scalar_one_or_none():
raise HTTPException(status_code=409, detail=f"Device type '{data.slug}' already exists for your team")
raise HTTPException(status_code=409, detail=f"Device type '{data.slug}' already exists for your account")
system_existing = await db.execute(
select(DeviceType).where(
DeviceType.slug == data.slug,
DeviceType.is_system.is_(True),
DeviceType.account_id == PLATFORM_ACCOUNT_ID,
)
)
if system_existing.scalar_one_or_none():
@@ -68,7 +69,7 @@ async def create_device_type(
label=data.label,
category=data.category,
is_system=False,
team_id=current_user.team_id,
account_id=current_user.account_id,
sort_order=data.sort_order,
)
db.add(device_type)
@@ -89,7 +90,7 @@ async def update_device_type(
raise HTTPException(status_code=404, detail="Device type not found")
if device_type.is_system:
raise HTTPException(status_code=403, detail="Cannot modify system device types")
if device_type.team_id != current_user.team_id:
if device_type.account_id != current_user.account_id:
raise HTTPException(status_code=404, detail="Device type not found")
update_data = data.model_dump(exclude_unset=True)
@@ -112,7 +113,7 @@ async def delete_device_type(
raise HTTPException(status_code=404, detail="Device type not found")
if device_type.is_system:
raise HTTPException(status_code=403, detail="Cannot delete system device types")
if device_type.team_id != current_user.team_id:
if device_type.account_id != current_user.account_id:
raise HTTPException(status_code=404, detail="Device type not found")
await db.delete(device_type)


@@ -0,0 +1,221 @@
"""Draft template endpoints — Phase 6 post-resolve templatization flow.
Engineers who picked "Run now, templatize after resolve" on the three-option
dialog (Phase 5) generate a `draft_templates` row at decision time. After
the session resolves, the TemplatizePrompt component lets them either:
- Accept → promotes the draft to a real `script_templates` row
- Reject → marks the draft rejected, no library entry created
The Script Library sidebar uses the list endpoint to surface a
"X drafts ready to review" badge for the account.
See FLOWPILOT-MIGRATION.md Section 5.3.
"""
import logging
import re
from datetime import datetime, timezone
from typing import Annotated
from uuid import UUID
from fastapi import APIRouter, Depends, HTTPException, status
from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession
from app.api.deps import get_current_active_user, get_db, require_engineer_or_admin
from app.models.ai_session import AISession
from app.models.draft_template import DraftTemplate
from app.models.script_template import ScriptCategory, ScriptTemplate
from app.models.user import User
from app.schemas.draft_template import (
DraftTemplateAcceptRequest,
DraftTemplateAcceptResponse,
DraftTemplateListResponse,
DraftTemplateRejectResponse,
DraftTemplateResponse,
)
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/draft-templates", tags=["draft-templates"])
def _slugify(name: str) -> str:
"""Same slug rule as scripts.create_template — lowercase, kebab-case, ASCII."""
return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
# ── List ─────────────────────────────────────────────────────────────────
@router.get("", response_model=DraftTemplateListResponse)
async def list_drafts(
current_user: Annotated[User, Depends(get_current_active_user)],
db: Annotated[AsyncSession, Depends(get_db)],
_: None = Depends(require_engineer_or_admin),
pending_only: bool = True,
) -> DraftTemplateListResponse:
"""List drafts for the current user's account.
Defaults to pending-only — that's what the Script Library badge counts
and what the post-resolve TemplatizePrompt iterates over. Pass
`pending_only=false` to include accepted/rejected for an audit view.
"""
stmt = select(DraftTemplate).order_by(DraftTemplate.created_at.desc())
if pending_only:
stmt = stmt.where(DraftTemplate.status == "pending")
result = await db.execute(stmt)
drafts = list(result.scalars().all())
return DraftTemplateListResponse(
drafts=[DraftTemplateResponse.model_validate(d) for d in drafts]
)
# ── Get one ──────────────────────────────────────────────────────────────
@router.get("/{draft_id}", response_model=DraftTemplateResponse)
async def get_draft(
draft_id: UUID,
current_user: Annotated[User, Depends(get_current_active_user)],
db: Annotated[AsyncSession, Depends(get_db)],
_: None = Depends(require_engineer_or_admin),
) -> DraftTemplateResponse:
draft = await _load_draft_or_404(db, draft_id)
return DraftTemplateResponse.model_validate(draft)
# ── Accept ───────────────────────────────────────────────────────────────
@router.post(
"/{draft_id}/accept",
response_model=DraftTemplateAcceptResponse,
status_code=201,
)
async def accept_draft(
draft_id: UUID,
body: DraftTemplateAcceptRequest,
current_user: Annotated[User, Depends(get_current_active_user)],
db: Annotated[AsyncSession, Depends(get_db)],
_: None = Depends(require_engineer_or_admin),
) -> DraftTemplateAcceptResponse:
"""Promote a draft to a real `script_templates` row.
Provenance fields (`source_session_id`, `source_user_id`,
`source_ticket_ref`) are copied so the Script Library can render the
"generated from CW #X · resolved by Y · used N times" chip.
On success: draft.status='accepted', draft.promoted_template_id set,
draft.resolved_at set. The new template is owned by the engineer's team
(matches scripts.create_template's behavior).
Returns 409 if the draft is already accepted/rejected.
"""
draft = await _load_draft_or_404(db, draft_id)
if draft.status != "pending":
raise HTTPException(
status_code=status.HTTP_409_CONFLICT,
detail=f"Draft is already {draft.status}",
)
# Validate the category exists and belongs to (or is global for) this account.
cat_result = await db.execute(
select(ScriptCategory).where(
ScriptCategory.id == body.category_id,
ScriptCategory.is_active == True, # noqa: E712
)
)
if cat_result.scalar_one_or_none() is None:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail="category_id does not reference an active script category",
)
# Look up source-session ticket ref for the provenance chip. RLS makes
# cross-account ai_session lookup impossible — the draft must belong to
# the same account as the requesting user.
source_session = (
await db.execute(
select(AISession).where(AISession.id == draft.source_session_id)
)
).scalar_one_or_none()
source_ticket_ref = (
f"CW #{source_session.psa_ticket_id}"
if source_session and source_session.psa_ticket_id
else None
)
slug = _slugify(body.name)
template = ScriptTemplate(
category_id=body.category_id,
team_id=current_user.team_id,
account_id=current_user.account_id,
created_by=current_user.id,
name=body.name,
slug=slug,
description=body.description,
script_body=body.edited_body or draft.script_body,
parameters_schema=body.parameters_schema,
# FlowPilot provenance — drives the Script Library chip.
source_session_id=draft.source_session_id,
source_user_id=draft.source_user_id,
source_ticket_ref=source_ticket_ref,
)
db.add(template)
await db.flush() # populate template.id
draft.status = "accepted"
draft.promoted_template_id = template.id
draft.resolved_at = datetime.now(timezone.utc)
await db.commit()
await db.refresh(template)
return DraftTemplateAcceptResponse(
draft_id=draft.id,
promoted_template_id=template.id,
template_slug=template.slug,
)
# ── Reject ───────────────────────────────────────────────────────────────
@router.post("/{draft_id}/reject", response_model=DraftTemplateRejectResponse)
async def reject_draft(
draft_id: UUID,
current_user: Annotated[User, Depends(get_current_active_user)],
db: Annotated[AsyncSession, Depends(get_db)],
_: None = Depends(require_engineer_or_admin),
) -> DraftTemplateRejectResponse:
"""Mark a draft rejected.
No template is created. The row stays for audit (so a team admin can see
the engineer reviewed and explicitly declined). Returns 409 on a draft
that's already accepted/rejected.
"""
draft = await _load_draft_or_404(db, draft_id)
if draft.status != "pending":
raise HTTPException(
status_code=status.HTTP_409_CONFLICT,
detail=f"Draft is already {draft.status}",
)
draft.status = "rejected"
draft.resolved_at = datetime.now(timezone.utc)
await db.commit()
return DraftTemplateRejectResponse(draft_id=draft.id, status="rejected")
# ── Helpers ─────────────────────────────────────────────────────────────
async def _load_draft_or_404(
db: AsyncSession, draft_id: UUID
) -> DraftTemplate:
"""RLS-scoped draft load. 404 covers missing + cross-tenant."""
result = await db.execute(
select(DraftTemplate).where(DraftTemplate.id == draft_id)
)
draft = result.scalar_one_or_none()
if draft is None:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail="Draft template not found",
)
return draft
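
Review note: two rules in the new draft-template module are easy to check in isolation: the slug rule in `_slugify` and the "only pending drafts can be resolved" guard shared by accept and reject. The sketch below reuses the regular expression from the diff; `DraftAlreadyResolved` and `resolve_draft` are hypothetical stand-ins for the HTTP 409 path and are not part of the commit.

```python
import re


def slugify(name: str) -> str:
    """Same expression as _slugify in the diff: lowercase, kebab-case, ASCII."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


class DraftAlreadyResolved(Exception):
    """Stand-in for the HTTP 409 raised by the endpoints."""


def resolve_draft(current_status: str, action: str) -> str:
    """Return the new status, or raise if the draft is no longer pending."""
    if current_status != "pending":
        raise DraftAlreadyResolved(f"Draft is already {current_status}")
    if action == "accept":
        return "accepted"
    if action == "reject":
        return "rejected"
    raise ValueError(f"Unknown action: {action}")
```

Because the pattern keeps only `a-z` and `0-9`, a name made up entirely of other characters slugifies to an empty string; whether that case needs handling depends on constraints on `script_templates.slug` that this diff does not show.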


@@ -27,6 +27,7 @@ from app.schemas.psa_connection import (
PsaMemberMappingSaveRequest,
PsaMemberResponse,
AutoMatchResult,
PSABoardResponse,
)
from app.core.config import settings
from app.services.psa.encryption import (
@@ -345,26 +346,103 @@ async def update_flowpilot_settings(
# ── ticket / status / company endpoints ──────────────────────────
@router.get("/tickets/search", response_model=list[PSATicketSearchResult])
async def search_tickets(
@router.get("/boards", response_model=list[PSABoardResponse])
async def list_boards(
current_user: Annotated[User, Depends(require_engineer_or_admin)],
db: Annotated[AsyncSession, Depends(get_db)],
query: str = "",
board_id: int | None = None,
status_id: int | None = None,
include_closed: bool = False,
):
"""Search ConnectWise tickets."""
"""List PSA service boards."""
if not current_user.account_id:
raise HTTPException(status_code=400, detail="User has no account")
from app.services.psa.registry import get_provider_for_account
from app.services.psa.exceptions import PSAError
try:
provider = await get_provider_for_account(current_user.account_id, db)
boards = await provider.list_boards()
return [PSABoardResponse(id=b.id, name=b.name) for b in boards]
except PSAError:
# Boards are optional UI chrome — degrade gracefully rather than surfacing a toast
return []
@router.get("/tickets/search", response_model=list[PSATicketSearchResult])
async def search_tickets(
current_user: Annotated[User, Depends(require_engineer_or_admin)],
db: Annotated[AsyncSession, Depends(get_db)],
query: str = "",
board_id: int | None = None,
status_id: int | None = None,
include_closed: bool = False,
assigned_to_me: bool = False,
unassigned: bool = False,
board_ids: str = "",
page: int = 1,
page_size: int = 10,
):
"""Search ConnectWise tickets."""
if not current_user.account_id:
raise HTTPException(status_code=400, detail="User has no account")
from app.services.psa.registry import get_provider_for_account
from app.services.psa.exceptions import PSAError
# Resolve assigned_to_me → member_identifier (CW login name for resources contains filter)
member_identifier: str | None = None
if assigned_to_me:
conn_result = await db.execute(
select(PsaConnection).where(
PsaConnection.account_id == current_user.account_id,
PsaConnection.is_active.is_(True),
)
)
conn = conn_result.scalar_one_or_none()
if conn:
mapping_result = await db.execute(
select(PsaMemberMapping).where(
PsaMemberMapping.psa_connection_id == conn.id,
PsaMemberMapping.user_id == current_user.id,
)
)
mapping = mapping_result.scalar_one_or_none()
if not mapping:
# No mapping for this user — return empty list
return []
from app.services.psa.registry import get_provider_for_account as _get_provider
from app.services.psa.exceptions import PSAError as _PSAError
try:
_provider = await _get_provider(current_user.account_id, db)
cw_members = await _provider.list_members()
matched = next((m for m in cw_members if m.id == mapping.external_member_id), None)
if matched:
member_identifier = matched.identifier
else:
return []
except _PSAError:
return []
# Parse comma-separated board_ids
parsed_board_ids: list[int] = []
if board_ids:
try:
parsed_board_ids = [int(bid.strip()) for bid in board_ids.split(",") if bid.strip()]
except ValueError:
raise HTTPException(status_code=400, detail="board_ids must be comma-separated integers")
try:
provider = await get_provider_for_account(current_user.account_id, db)
tickets = await provider.search_tickets(
query, board_id=board_id, status_id=status_id, include_closed=include_closed
query,
board_id=board_id,
status_id=status_id,
include_closed=include_closed,
member_identifier=member_identifier,
unassigned=unassigned,
board_ids=parsed_board_ids,
page=page,
page_size=page_size,
)
return [
PSATicketSearchResult(
@@ -517,31 +595,37 @@ async def get_member_mappings(
current_user: Annotated[User, Depends(require_account_owner)],
db: Annotated[AsyncSession, Depends(get_db)],
):
"""Get all member mappings for the account."""
"""Get all account users with their PSA member mappings (unmapped users included)."""
conn = await _get_account_connection(current_user.account_id, db)
if not conn:
return []
result = await db.execute(
# Fetch all active account users
users_result = await db.execute(
select(User).where(User.account_id == current_user.account_id, User.is_active.is_(True))
)
users = users_result.scalars().all()
# Fetch all existing mappings keyed by user_id for O(1) lookup
mappings_result = await db.execute(
select(PsaMemberMapping).where(PsaMemberMapping.psa_connection_id == conn.id)
)
mappings = result.scalars().all()
mapping_by_user: dict[str, PsaMemberMapping] = {
str(m.user_id): m for m in mappings_result.scalars().all()
}
response = []
for m in mappings:
user_result = await db.execute(select(User).where(User.id == m.user_id))
user = user_result.scalar_one_or_none()
if user:
response.append(PsaMemberMappingResponse(
id=str(m.id),
user_id=str(m.user_id),
user_email=user.email,
user_name=user.name,
external_member_id=m.external_member_id,
external_member_name=m.external_member_name,
matched_by=m.matched_by,
))
return response
return [
PsaMemberMappingResponse(
id=str(m.id) if (m := mapping_by_user.get(str(user.id))) else None,
user_id=str(user.id),
user_email=user.email,
user_name=user.name,
external_member_id=m.external_member_id if m else None,
external_member_name=m.external_member_name if m else None,
matched_by=m.matched_by if m else None,
)
for user in users
]
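
Review note: the rewritten `get_member_mappings` replaces one user query per mapping with two queries and a dictionary lookup, and it now includes users without a mapping. A minimal in-memory sketch of that shape is below; the `Mapping` dataclass and `rows_for_users` are hypothetical and carry only the fields needed for the illustration.

```python
from dataclasses import dataclass


@dataclass
class Mapping:
    user_id: str
    external_member_id: str


def rows_for_users(user_ids: list[str], mappings: list[Mapping]) -> list[dict]:
    """One row per user; mapping fields are None when the user is unmapped."""
    by_user = {m.user_id: m for m in mappings}  # O(1) lookup per user
    rows = []
    for uid in user_ids:
        m = by_user.get(uid)
        rows.append(
            {
                "user_id": uid,
                "external_member_id": m.external_member_id if m else None,
            }
        )
    return rows


rows = rows_for_users(["u1", "u2"], [Mapping("u1", "cw-42")])
```

The sketch binds `m` in an explicit loop. The comprehension in the diff binds it with `:=` inside the first keyword argument, which works because Python evaluates call arguments left to right, but it does depend on `id=` staying first.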
@router.post("/member-mappings", response_model=list[PsaMemberMappingResponse])
@@ -564,6 +648,7 @@ async def save_member_mappings(
for m in mappings:
mapping = PsaMemberMapping(
psa_connection_id=conn.id,
account_id=current_user.account_id,
user_id=UUID(m.user_id),
external_member_id=m.external_member_id,
external_member_name=m.external_member_name,
@@ -624,6 +709,7 @@ async def auto_match_members(
if not existing.scalar_one_or_none():
mapping = PsaMemberMapping(
psa_connection_id=conn.id,
account_id=current_user.account_id,
user_id=user.id,
external_member_id=cw_member.id,
external_member_name=cw_member.name,
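
Review note: the `board_ids` handling added to `search_tickets` in this file takes a comma-separated string and rejects non-integer entries. A minimal standalone sketch of the same parsing follows; `parse_board_ids` is a hypothetical helper, and it raises `ValueError` where the endpoint raises an HTTP 400.

```python
def parse_board_ids(raw: str) -> list[int]:
    """Parse '1, 2,3' into [1, 2, 3]; empty segments are skipped."""
    if not raw:
        return []
    try:
        return [int(part.strip()) for part in raw.split(",") if part.strip()]
    except ValueError as exc:
        # The endpoint maps this case to a 400 response.
        raise ValueError("board_ids must be comma-separated integers") from exc
```

Skipping empty segments means a trailing comma or a doubled comma is tolerated rather than rejected, which matches the comprehension in the diff.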


@@ -29,8 +29,8 @@ def _compute_next_run(cron_expression: str, tz_name: str) -> datetime:
return cron.get_next(datetime).astimezone(timezone.utc)
async def _get_tree_or_403(tree_id: UUID, current_user: User, db: AsyncSession) -> "Tree":
"""Fetch tree and verify the current user's team owns it."""
async def _get_tree_or_404(tree_id: UUID, current_user: User, db: AsyncSession) -> "Tree":
"""Fetch tree and verify the current user's team owns it. Raises 404 if not found or access denied."""
result = await db.execute(select(Tree).where(Tree.id == tree_id))
tree = result.scalar_one_or_none()
if not tree:
@@ -38,7 +38,7 @@ async def _get_tree_or_403(tree_id: UUID, current_user: User, db: AsyncSession)
# Super admins can access any tree; regular users must be on the same team
if not getattr(current_user, 'is_super_admin', False):
if tree.team_id != current_user.team_id:
raise HTTPException(status_code=403, detail="Access denied")
raise HTTPException(status_code=404, detail="Tree not found")
return tree
@@ -51,7 +51,7 @@ async def create_schedule(
):
"""Create a cron schedule for a maintenance flow. One per flow."""
# Verify user's team owns the tree
tree = await _get_tree_or_403(data.tree_id, current_user, db)
tree = await _get_tree_or_404(data.tree_id, current_user, db)
if tree.tree_type != "maintenance":
raise HTTPException(status_code=400, detail="Schedules are only supported for maintenance flows")
@@ -69,6 +69,7 @@ async def create_schedule(
schedule = MaintenanceSchedule(
tree_id=data.tree_id,
account_id=current_user.account_id,
created_by=current_user.id,
cron_expression=data.cron_expression,
timezone=data.timezone,
@@ -94,7 +95,7 @@ async def get_schedule_for_tree(
):
"""Get the schedule for a specific maintenance flow."""
# Verify user's team owns the tree before returning schedule data
await _get_tree_or_403(tree_id, current_user, db)
await _get_tree_or_404(tree_id, current_user, db)
result = await db.execute(
select(MaintenanceSchedule).where(MaintenanceSchedule.tree_id == tree_id)
@@ -122,7 +123,7 @@ async def update_schedule(
raise HTTPException(status_code=404, detail="Schedule not found")
# Verify user's team owns the tree this schedule belongs to
await _get_tree_or_403(schedule.tree_id, current_user, db)
await _get_tree_or_404(schedule.tree_id, current_user, db)
update_fields = data.model_fields_set
was_active = schedule.is_active
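
Review note: renaming `_get_tree_or_403` to `_get_tree_or_404` makes a cross-team lookup indistinguishable from a missing tree, so the response no longer confirms that a given tree ID exists. A minimal sketch of that behavior with in-memory data is below; `NotFound`, the `Tree` dataclass, and the dictionary store are hypothetical stand-ins for `HTTPException(404)`, the ORM model, and the database.

```python
from dataclasses import dataclass


class NotFound(Exception):
    """Stand-in for HTTPException(status_code=404)."""


@dataclass
class Tree:
    id: int
    team_id: int


def get_tree_or_404(
    trees: dict[int, Tree],
    tree_id: int,
    user_team_id: int,
    is_super_admin: bool = False,
) -> Tree:
    tree = trees.get(tree_id)
    if tree is None:
        raise NotFound("Tree not found")
    # Same exception and message as the missing case, so callers on another
    # team cannot tell whether the ID exists.
    if not is_super_admin and tree.team_id != user_team_id:
        raise NotFound("Tree not found")
    return tree
```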


@@ -1,10 +1,12 @@
"""Network diagrams API endpoints."""
import base64
import logging
from datetime import datetime, timezone
from typing import Annotated
from uuid import UUID
from fastapi import APIRouter, Depends, HTTPException, Query
from pydantic import BaseModel
from sqlalchemy import select, or_
from sqlalchemy.ext.asyncio import AsyncSession
@@ -13,6 +15,7 @@ from app.api.deps import get_current_active_user
from app.models.user import User
from app.models.device_type import DeviceType
from app.models.network_diagram import NetworkDiagram
from app.core.service_account import PLATFORM_ACCOUNT_ID
from app.schemas.network_diagram import (
NetworkDiagramCreate,
NetworkDiagramUpdate,
@@ -26,7 +29,7 @@ from app.schemas.network_diagram import (
DiagramNode,
DiagramEdge,
)
from app.services import network_diagram_ai_service
from app.services import network_diagram_ai_service, storage_service
# Maps system device-type slugs to their category — mirrors frontend deviceRegistry.ts
_SLUG_CATEGORY: dict[str, str] = {
@@ -49,11 +52,11 @@ router = APIRouter(prefix="/network-diagrams", tags=["network-diagrams"])
async def _get_diagram_or_404(
diagram_id: UUID,
team_id: UUID,
account_id: UUID,
db: AsyncSession,
) -> NetworkDiagram:
diagram = await db.get(NetworkDiagram, diagram_id)
if not diagram or diagram.team_id != team_id or diagram.is_archived:
if not diagram or diagram.account_id != account_id or diagram.is_archived:
raise HTTPException(status_code=404, detail="Diagram not found")
return diagram
@@ -82,15 +85,19 @@ def _diagram_to_list_item(
description=diagram.description,
node_count=len(nodes),
category_counts=category_counts,
thumbnail_url=diagram.thumbnail_url,
created_by=diagram.created_by,
created_at=diagram.created_at,
updated_at=diagram.updated_at,
)
async def _get_available_slugs(team_id: UUID, db: AsyncSession) -> set[str]:
async def _get_available_slugs(account_id: UUID, db: AsyncSession) -> set[str]:
stmt = select(DeviceType.slug).where(
or_(DeviceType.is_system.is_(True), DeviceType.team_id == team_id)
or_(
DeviceType.account_id == PLATFORM_ACCOUNT_ID,
DeviceType.account_id == account_id,
)
)
result = await db.execute(stmt)
return {row[0] for row in result.all()}
@@ -104,7 +111,7 @@ async def list_client_names(
stmt = (
select(NetworkDiagram.client_name)
.where(
NetworkDiagram.team_id == current_user.team_id,
NetworkDiagram.account_id == current_user.account_id,
NetworkDiagram.is_archived.is_(False),
NetworkDiagram.client_name.isnot(None),
NetworkDiagram.client_name != "",
@@ -126,7 +133,7 @@ async def list_diagrams(
stmt = (
select(NetworkDiagram)
.where(
NetworkDiagram.team_id == current_user.team_id,
NetworkDiagram.account_id == current_user.account_id,
NetworkDiagram.is_archived.is_(False),
)
.order_by(NetworkDiagram.updated_at.desc())
@@ -148,7 +155,7 @@ async def list_diagrams(
# Single query for custom device types so category_counts is accurate
dt_stmt = select(DeviceType.slug, DeviceType.category).where(
DeviceType.is_system.is_(False),
DeviceType.team_id == current_user.team_id,
DeviceType.account_id == current_user.account_id,
)
dt_result = await db.execute(dt_stmt)
custom_slug_category = {row[0]: row[1] for row in dt_result.all()}
@@ -164,13 +171,8 @@ async def create_diagram(
db: Annotated[AsyncSession, Depends(get_db)],
current_user: Annotated[User, Depends(get_current_active_user)],
) -> NetworkDiagramResponse:
if current_user.team_id is None:
raise HTTPException(
status_code=422,
detail="Network Diagrams require a team account. Assign your account to a team first.",
)
diagram = NetworkDiagram(
team_id=current_user.team_id,
account_id=current_user.account_id,
name=data.name,
client_name=data.client_name,
asset_name=data.asset_name,
@@ -191,7 +193,7 @@ async def get_diagram(
db: Annotated[AsyncSession, Depends(get_db)],
current_user: Annotated[User, Depends(get_current_active_user)],
) -> NetworkDiagramResponse:
diagram = await _get_diagram_or_404(diagram_id, current_user.team_id, db)
diagram = await _get_diagram_or_404(diagram_id, current_user.account_id, db)
return _diagram_to_response(diagram)
@@ -202,7 +204,7 @@ async def update_diagram(
db: Annotated[AsyncSession, Depends(get_db)],
current_user: Annotated[User, Depends(get_current_active_user)],
) -> NetworkDiagramResponse:
diagram = await _get_diagram_or_404(diagram_id, current_user.team_id, db)
diagram = await _get_diagram_or_404(diagram_id, current_user.account_id, db)
update_data = data.model_dump(exclude_unset=True)
if "nodes" in update_data and update_data["nodes"] is not None:
@@ -225,7 +227,7 @@ async def archive_diagram(
db: Annotated[AsyncSession, Depends(get_db)],
current_user: Annotated[User, Depends(get_current_active_user)],
) -> None:
diagram = await _get_diagram_or_404(diagram_id, current_user.team_id, db)
diagram = await _get_diagram_or_404(diagram_id, current_user.account_id, db)
diagram.is_archived = True
diagram.updated_at = datetime.now(timezone.utc)
await db.commit()
@@ -237,9 +239,9 @@ async def duplicate_diagram(
db: Annotated[AsyncSession, Depends(get_db)],
current_user: Annotated[User, Depends(get_current_active_user)],
) -> NetworkDiagramResponse:
source = await _get_diagram_or_404(diagram_id, current_user.team_id, db)
source = await _get_diagram_or_404(diagram_id, current_user.account_id, db)
copy = NetworkDiagram(
team_id=current_user.team_id,
account_id=current_user.account_id,
name=f"Copy of {source.name}",
client_name=source.client_name,
asset_name=source.asset_name,
@@ -260,7 +262,7 @@ async def export_diagram(
db: Annotated[AsyncSession, Depends(get_db)],
current_user: Annotated[User, Depends(get_current_active_user)],
) -> DiagramExportResponse:
diagram = await _get_diagram_or_404(diagram_id, current_user.team_id, db)
diagram = await _get_diagram_or_404(diagram_id, current_user.account_id, db)
nodes = [DiagramNode(**n) for n in (diagram.nodes or [])]
edges = [DiagramEdge(**e) for e in (diagram.edges or [])]
return DiagramExportResponse(
@@ -280,7 +282,7 @@ async def import_diagram(
db: Annotated[AsyncSession, Depends(get_db)],
current_user: Annotated[User, Depends(get_current_active_user)],
) -> DiagramImportResponse:
available_slugs = await _get_available_slugs(current_user.team_id, db)
available_slugs = await _get_available_slugs(current_user.account_id, db)
warnings: list[str] = []
for node in data.nodes:
@@ -288,7 +290,7 @@ async def import_diagram(
warnings.append(f"Unknown device type '{node.type}' — will render with default icon")
diagram = NetworkDiagram(
team_id=current_user.team_id,
account_id=current_user.account_id,
name=data.name,
client_name=data.client_name,
description=data.description,
@@ -306,13 +308,41 @@ async def import_diagram(
)
class ThumbnailUploadRequest(BaseModel):
data_url: str # base64 PNG data URL: "data:image/png;base64,..."
@router.post("/{diagram_id}/thumbnail", status_code=204)
async def upload_thumbnail(
diagram_id: UUID,
body: ThumbnailUploadRequest,
db: Annotated[AsyncSession, Depends(get_db)],
current_user: Annotated[User, Depends(get_current_active_user)],
) -> None:
diagram = await _get_diagram_or_404(diagram_id, current_user.account_id, db)
try:
header, encoded = body.data_url.split(",", 1)
except ValueError:
raise HTTPException(status_code=422, detail="Invalid data URL format")
image_bytes = base64.b64decode(encoded)
storage_key = await storage_service.upload_file(
file_data=image_bytes,
filename=f"thumbnail-{diagram_id}.png",
content_type="image/png",
account_id=str(current_user.account_id),
)
presigned_url = storage_service.get_presigned_url(storage_key)
diagram.thumbnail_url = presigned_url
await db.commit()
@router.post("/ai-generate", response_model=AIGenerateResponse)
async def ai_generate_diagram(
data: AIGenerateRequest,
db: Annotated[AsyncSession, Depends(get_db)],
current_user: Annotated[User, Depends(get_current_active_user)],
) -> AIGenerateResponse:
-available_slugs_set = await _get_available_slugs(current_user.team_id, db)
+available_slugs_set = await _get_available_slugs(current_user.account_id, db)
available_slugs = list(available_slugs_set)
existing_node_ids: list[str] | None = None
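
A review note on `upload_thumbnail` in the hunk above: the handler splits the data URL but never inspects `header`, and `base64.b64decode` raises `binascii.Error` on malformed input, which would likely surface as a 500 rather than the 422 used for a bad split. A minimal sketch of stricter parsing, assuming a hypothetical helper `parse_png_data_url` (not part of this commit) whose `ValueError` the endpoint could translate into a 422:

```python
import base64
import binascii

# Assumption: only base64-encoded PNG data URLs are accepted, matching the
# comment on ThumbnailUploadRequest.data_url.
PNG_PREFIX = "data:image/png;base64"


def parse_png_data_url(data_url: str) -> bytes:
    """Return the decoded bytes of a base64 PNG data URL or raise ValueError."""
    header, sep, encoded = data_url.partition(",")
    if not sep or header != PNG_PREFIX:
        raise ValueError("Invalid data URL format")
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently discarding them.
        return base64.b64decode(encoded, validate=True)
    except binascii.Error as exc:
        raise ValueError("Invalid base64 payload") from exc
```

This only checks the declared media type and the encoding; it does not verify that the decoded bytes are actually a PNG image.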


@@ -8,6 +8,7 @@ from sqlalchemy.ext.asyncio import AsyncSession
from app.api.deps import get_current_active_user
from app.core.database import get_db
from app.core.admin_database import get_admin_db
from app.models.assistant_chat import AssistantChat
from app.models.psa_connection import PsaConnection
from app.models.session import Session
@@ -98,7 +99,7 @@ async def get_onboarding_status(
@router.post("/onboarding-status/dismiss", response_model=OnboardingStatus)
async def dismiss_onboarding(
-db: Annotated[AsyncSession, Depends(get_db)],
+db: Annotated[AsyncSession, Depends(get_admin_db)],
current_user: Annotated[User, Depends(get_current_active_user)],
) -> OnboardingStatus:
"""Dismiss the onboarding checklist for the current user."""


@@ -91,6 +91,7 @@ async def submit_step_feedback(
new_rating = StepRating(
step_id=step_id,
user_id=current_user.id,
account_id=current_user.account_id,
session_id=session_uuid,
was_helpful=data.was_helpful,
# rating is nullable now — thumbs-only mode


@@ -3,12 +3,14 @@ from typing import Annotated
from uuid import UUID
from fastapi import APIRouter, Depends, HTTPException, Request
-from sqlalchemy import text
+from sqlalchemy import select, text
from sqlalchemy.exc import IntegrityError
from sqlalchemy.ext.asyncio import AsyncSession
from app.core.database import get_db
from app.core.rate_limit import limiter
from app.api.deps import get_current_active_user
from app.models.ai_session import AISession
from app.models.user import User
from app.models.script_builder_session import ScriptBuilderSession
from app.schemas.script_builder import (
@@ -67,15 +69,85 @@ async def create_session(
db: Annotated[AsyncSession, Depends(get_db)],
current_user: Annotated[User, Depends(get_current_active_user)],
) -> ScriptBuilderSessionDetail:
-"""Start a new Script Builder session."""
+"""Start a new Script Builder session.
+When origin='pilot_inline', behaves as get-or-create: the same row is
+returned on repeated calls with the same (user, ai_session_id) pair.
+Inline sessions are excluded from the session cap and the list endpoint.
+"""
# Phase 9: inline origin validation + authorization
if data.origin == "pilot_inline":
if data.ai_session_id is None:
raise HTTPException(
status_code=400,
detail="ai_session_id is required when origin='pilot_inline'",
)
# Ownership check: the pilot session must belong to the current user.
ai_session = await db.scalar(
select(AISession).where(
AISession.id == data.ai_session_id,
AISession.user_id == current_user.id,
)
)
if ai_session is None:
raise HTTPException(
status_code=404,
detail="Session not found",
)
# Idempotent get-or-create: if a pilot_inline row already exists for
# this (user, ai_session_id) pair, return it without creating a duplicate.
existing = await db.scalar(
select(ScriptBuilderSession).where(
ScriptBuilderSession.user_id == current_user.id,
ScriptBuilderSession.ai_session_id == data.ai_session_id,
ScriptBuilderSession.origin == "pilot_inline",
)
)
if existing is not None:
# Re-fetch with message_records loaded
session = await script_builder_service.get_session(db, existing.id, current_user.id)
return _session_to_detail(session)
# Create the inline session — wrap in IntegrityError catch for races.
try:
session = await script_builder_service.create_session(
db=db,
user_id=current_user.id,
account_id=current_user.account_id,
team_id=current_user.team_id,
language=data.language,
origin=data.origin,
ai_session_id=data.ai_session_id,
)
await db.commit()
except IntegrityError:
await db.rollback()
# Race: another request won the unique index — re-read the winner row.
existing = await db.scalar(
select(ScriptBuilderSession).where(
ScriptBuilderSession.user_id == current_user.id,
ScriptBuilderSession.ai_session_id == data.ai_session_id,
ScriptBuilderSession.origin == "pilot_inline",
)
)
if existing is None:
raise
session = existing
# Re-fetch with message_records loaded
session = await script_builder_service.get_session(db, session.id, current_user.id)
return _session_to_detail(session)
# ── Standalone session ──────────────────────────────────────────────────
# Acquire per-user advisory lock so concurrent create requests are serialized.
# Without this, two simultaneous requests both read count < limit and both
# insert, exceeding MAX_SESSIONS_PER_USER.
user_lock_key = hash(str(current_user.id)) % (2**62)
await db.execute(text("SELECT pg_advisory_xact_lock(:key)"), {"key": user_lock_key})
-# Enforce max concurrent sessions
-count = await script_builder_service.count_user_sessions(db, current_user.id)
+# Enforce max concurrent sessions (inline sessions excluded from cap)
+count = await script_builder_service.count_user_sessions(db, current_user.id, include_inline=False)
if count >= MAX_SESSIONS_PER_USER:
raise HTTPException(
status_code=400,
@@ -85,8 +157,11 @@ async def create_session(
session = await script_builder_service.create_session(
db=db,
user_id=current_user.id,
account_id=current_user.account_id,
team_id=current_user.team_id,
language=data.language,
origin=data.origin,
ai_session_id=data.ai_session_id,
)
await db.commit()
# Re-fetch with message_records loaded
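
One caveat on the advisory lock in `create_session`: Python salts `str` hashes per process unless `PYTHONHASHSEED` is fixed, so `hash(str(current_user.id))` can yield a different key in each worker process, and requests for the same user handled by different workers would then take different locks. A minimal sketch of a process-independent key, assuming a hypothetical helper `advisory_lock_key` (not in this commit) and keeping the same `2**62` bound so the value fits the signed 64-bit argument of `pg_advisory_xact_lock`:

```python
import hashlib
import uuid


def advisory_lock_key(user_id: uuid.UUID) -> int:
    """Derive a deterministic advisory-lock key from a user UUID.

    Hypothetical helper: SHA-256 is used only as a stable mixing function,
    not for any security property.
    """
    digest = hashlib.sha256(user_id.bytes).digest()
    # First 8 bytes as an unsigned integer, reduced into [0, 2**62).
    return int.from_bytes(digest[:8], "big") % (2**62)
```

The result could be passed as the `:key` parameter in place of `user_lock_key`; distinct users may still collide on a key, which only costs some extra serialization.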


@@ -5,7 +5,7 @@ import re
from fastapi import APIRouter, Depends, HTTPException, Query, status
from sqlalchemy.ext.asyncio import AsyncSession
-from sqlalchemy import select, func, or_, literal
+from sqlalchemy import select, func, or_, literal, update as sa_update
from app.core.database import get_db
from app.api.deps import get_current_active_user
@@ -197,6 +197,7 @@ async def create_template(
template = ScriptTemplate(
category_id=data.category_id,
team_id=current_user.team_id,
account_id=current_user.account_id,
created_by=current_user.id,
name=data.name,
slug=slug,
@@ -364,6 +365,7 @@ async def generate_script(
generation = ScriptGeneration(
template_id=template.id,
user_id=current_user.id,
account_id=current_user.account_id,
team_id=current_user.team_id,
session_id=data.session_id,
ai_session_id=data.ai_session_id,
@@ -372,6 +374,20 @@ async def generate_script(
)
db.add(generation)
template.usage_count += 1
# FlowPilot Phase 3: bump the linked AI session's state_version so the
# resolution-note preview cache invalidates. One-off scripts run outside
# any FlowPilot session — in that case the UPDATE matches zero rows.
if data.ai_session_id is not None:
# Local import: scripts endpoint stays independent of AI-session
# imports for non-AI generation paths.
from app.models.ai_session import AISession
await db.execute(
sa_update(AISession)
.where(AISession.id == data.ai_session_id)
.values(state_version=AISession.state_version + 1)
)
await db.commit()
await db.refresh(generation)


@@ -0,0 +1,315 @@
"""Session fact endpoints — the "What we know" CRUD surface for a FlowPilot session.
All routes are sub-resources of `/ai-sessions/{session_id}`. Tenant isolation is
enforced by RLS on `session_facts.account_id`; a user from another account
literally cannot see or write facts for this session.
Editability rule (per FLOWPILOT-MIGRATION.md Section 7.3):
- `user_note` and `ai_synthesis` facts are editable at the card level.
- `question` and `diagnostic_check` facts are read-only at the card level —
edit the source question/check instead. PATCH returns 403 for those.
Fact promotion writes always bump `ai_sessions.state_version` so the
resolution-note preview cache invalidates (Section 5.5).
"""
import logging
from datetime import datetime, timezone
from typing import Annotated
from uuid import UUID
from fastapi import APIRouter, Depends, HTTPException, status
from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession
from app.api.deps import get_current_active_user, get_db, require_engineer_or_admin
from app.models.ai_session import AISession
from app.models.session_fact import SessionFact
from app.models.user import User
from app.schemas.session_fact import (
SessionFactCreateRequest,
SessionFactListResponse,
SessionFactPromoteRequest,
SessionFactResponse,
SessionFactUpdateRequest,
)
from app.services.fact_synthesis_service import (
FactSynthesisService,
list_facts_for_session,
)
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/ai-sessions/{session_id}", tags=["session-facts"])
# Source types whose facts can be edited at the card level (Section 7.3).
_EDITABLE_SOURCE_TYPES = frozenset({"user_note", "ai_synthesis"})
def _to_response(fact: SessionFact) -> SessionFactResponse:
"""Wrap an ORM SessionFact in the response model with the editable flag."""
return SessionFactResponse(
id=fact.id,
session_id=fact.session_id,
text=fact.text,
source_type=fact.source_type, # type: ignore[arg-type]
source_ref=fact.source_ref,
source_summary=fact.source_summary,
created_by=fact.created_by,
created_at=fact.created_at,
updated_at=fact.updated_at,
editable=fact.source_type in _EDITABLE_SOURCE_TYPES,
)
async def _load_session_or_404(db: AsyncSession, session_id: UUID) -> AISession:
"""Load the session via RLS-scoped SELECT. Returns 404 if missing/cross-tenant.
Tenant isolation: RLS on `ai_sessions` filters by current account, so a
cross-tenant access returns no rows and we 404 (rather than 403, which
would leak the row's existence).
"""
result = await db.execute(select(AISession).where(AISession.id == session_id))
session = result.scalar_one_or_none()
if session is None:
raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Session not found")
return session
async def _load_fact_or_404(
db: AsyncSession, session_id: UUID, fact_id: UUID
) -> SessionFact:
"""Load a non-deleted fact for the session. 404 if missing or already deleted."""
result = await db.execute(
select(SessionFact).where(
SessionFact.id == fact_id,
SessionFact.session_id == session_id,
SessionFact.deleted_at.is_(None),
)
)
fact = result.scalar_one_or_none()
if fact is None:
raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Fact not found")
return fact
# ── List ──
@router.get("/facts", response_model=SessionFactListResponse)
async def list_facts(
session_id: UUID,
current_user: Annotated[User, Depends(get_current_active_user)],
db: Annotated[AsyncSession, Depends(get_db)],
_: None = Depends(require_engineer_or_admin),
) -> SessionFactListResponse:
"""List facts for a session, oldest first."""
await _load_session_or_404(db, session_id)
facts = await list_facts_for_session(db, session_id)
return SessionFactListResponse(facts=[_to_response(f) for f in facts])
# ── Create (manual user note) ──
@router.post("/facts", response_model=SessionFactResponse, status_code=201)
async def create_fact(
session_id: UUID,
body: SessionFactCreateRequest,
current_user: Annotated[User, Depends(get_current_active_user)],
db: Annotated[AsyncSession, Depends(get_db)],
_: None = Depends(require_engineer_or_admin),
) -> SessionFactResponse:
"""Create a manual fact (the "+ Add a note" UI affordance).
Always recorded as `source_type=user_note`. Source-typed creation goes
through `/facts/promote` so the originating item ID is captured.
"""
session = await _load_session_or_404(db, session_id)
service = FactSynthesisService(db)
try:
fact = await service.create_fact(
session_id=session.id,
account_id=session.account_id,
user_id=current_user.id,
source_type="user_note",
text=body.text,
summary=body.summary,
source_ref=None,
)
except ValueError as e:
raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail=str(e))
await db.commit()
await db.refresh(fact)
return _to_response(fact)
# ── Update ──
@router.patch("/facts/{fact_id}", response_model=SessionFactResponse)
async def update_fact(
session_id: UUID,
fact_id: UUID,
body: SessionFactUpdateRequest,
current_user: Annotated[User, Depends(get_current_active_user)],
db: Annotated[AsyncSession, Depends(get_db)],
_: None = Depends(require_engineer_or_admin),
) -> SessionFactResponse:
"""Edit fact text or summary.
Returns 403 for `question` and `diagnostic_check`-sourced facts: the
source item is the canonical input, so editing the fact card would
desync the two. Engineers edit the source instead.
"""
fact = await _load_fact_or_404(db, session_id, fact_id)
if fact.source_type not in _EDITABLE_SOURCE_TYPES:
raise HTTPException(
status_code=status.HTTP_403_FORBIDDEN,
detail=(
f"Facts sourced from {fact.source_type!r} are read-only at the "
"card level. Edit the originating question or diagnostic check instead."
),
)
if body.text is None and body.summary is None:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail="At least one of `text` or `summary` must be provided",
)
service = FactSynthesisService(db)
try:
fact = await service.update_fact(fact, text=body.text, summary=body.summary)
except ValueError as e:
raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail=str(e))
await db.commit()
await db.refresh(fact)
return _to_response(fact)
# ── Soft delete ──
@router.delete("/facts/{fact_id}", status_code=204)
async def delete_fact(
session_id: UUID,
fact_id: UUID,
current_user: Annotated[User, Depends(get_current_active_user)],
db: Annotated[AsyncSession, Depends(get_db)],
_: None = Depends(require_engineer_or_admin),
) -> None:
"""Soft-delete a fact. All source types are deletable.
Soft delete (rather than hard) preserves provenance for audit and lets
accidental deletes be recovered if needed. The `editable` flag does NOT
control deletion — even read-only facts can be removed when the
underlying question/check turned out to be wrong.
"""
fact = await _load_fact_or_404(db, session_id, fact_id)
service = FactSynthesisService(db)
await service.soft_delete_fact(fact)
await db.commit()
# ── Promote (AI marker + engineer-driven) ──
@router.post("/facts/promote", response_model=SessionFactResponse, status_code=201)
async def promote_fact(
session_id: UUID,
body: SessionFactPromoteRequest,
current_user: Annotated[User, Depends(get_current_active_user)],
db: Annotated[AsyncSession, Depends(get_db)],
_: None = Depends(require_engineer_or_admin),
) -> SessionFactResponse:
"""Convert a question answer / check result into a fact.
Two modes:
- `proposed_text` provided → persisted as-is.
- `raw_input` provided → server drafts text/summary via FactSynthesisService.
Exactly one of the two must be set. The engineer-facing UI typically uses
`proposed_text` after letting the engineer review/edit a draft.
"""
if (body.proposed_text is None) == (body.raw_input is None):
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail="Exactly one of `proposed_text` or `raw_input` must be provided",
)
if body.source_type == "ai_synthesis" and body.source_ref is not None:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail="`source_ref` must be null for source_type=ai_synthesis",
)
session = await _load_session_or_404(db, session_id)
service = FactSynthesisService(db)
text = body.proposed_text
summary = body.proposed_summary
if text is None:
# Synthesize via LLM. Caller must hint which task-lane item the input
# came from so we can shape the prompt appropriately.
raw = body.raw_input or ""
if body.source_type == "question":
draft = await service.synthesize_from_question(
question_text=_lookup_task_lane_text(session, body.source_ref, "questions"),
raw_answer=raw,
)
elif body.source_type == "diagnostic_check":
draft = await service.synthesize_from_check(
check_label=_lookup_task_lane_text(session, body.source_ref, "actions"),
check_output=raw,
)
else:
# ai_synthesis with raw_input: the raw input IS the synthesis.
# Re-run through the question synthesizer with an empty question
# so the conservative prompt still applies.
draft = await service.synthesize_from_question(
question_text="(none — synthesizing from engineer summary)",
raw_answer=raw,
)
text = draft["text"]
summary = summary or draft["summary"]
if not text:
raise HTTPException(
status_code=status.HTTP_422_UNPROCESSABLE_ENTITY,
detail=(
"Synthesizer found no substantive fact in the input. "
"Edit the input or supply `proposed_text` directly."
),
)
try:
fact = await service.create_fact(
session_id=session.id,
account_id=session.account_id,
user_id=current_user.id,
source_type=body.source_type,
text=text,
summary=summary,
source_ref=body.source_ref,
)
except ValueError as e:
raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail=str(e))
await db.commit()
await db.refresh(fact)
return _to_response(fact)
def _lookup_task_lane_text(
session: AISession, source_ref: UUID | None, list_key: str
) -> str:
"""Find the originating question text / action label from pending_task_lane.
Falls back to an empty string if the source item is no longer in
the lane (e.g., the AI dropped it from a later turn). The synthesizer is
forgiving — an empty/generic question still produces a useful fact when
the engineer's answer is substantive on its own.
"""
if source_ref is None:
return ""
lane = session.pending_task_lane or {}
items = lane.get(list_key) or []
sref = str(source_ref)
for item in items:
if isinstance(item, dict) and str(item.get("id")) == sref:
return str(item.get("text") or item.get("label") or "")
return ""
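
The `promote_fact` validation above relies on comparing two `is None` results: the comparison is true when both fields are missing or both are present, which are exactly the two invalid cases. A small standalone sketch of that check, using a hypothetical function name and plain arguments instead of the request model:

```python
from typing import Optional


def exactly_one_provided(proposed_text: Optional[str], raw_input: Optional[str]) -> bool:
    """True when exactly one of the two values is not None."""
    # (a is None) == (b is None) holds for both-missing and both-present,
    # so its negation is the "exactly one" (XOR) condition.
    return (proposed_text is None) != (raw_input is None)
```

Note that an empty string counts as provided under this rule, since only `None` is treated as absent.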


@@ -0,0 +1,759 @@
"""Suggested-fix + resolution-note / escalation-package preview-and-post endpoints.
Phase 3: active suggested fix lookup + decision recording, resolution-note
preview with state_version cache.
Phase 4: resolution-note POST (writeback to PSA + mark resolved), escalation
package preview + POST (writeback + mark escalated). Local-only path when
the session has no linked PSA ticket: markdown is stored on the session and
the status flipped, no external call.
Per FLOWPILOT-MIGRATION.md Sections 5.2 + 5.4.
"""
import logging
from datetime import datetime, timezone
from typing import Annotated
from uuid import UUID
from fastapi import APIRouter, Depends, HTTPException, status
from sqlalchemy import select, update
from sqlalchemy.ext.asyncio import AsyncSession
from app.api.deps import get_current_active_user, get_db, require_engineer_or_admin
from app.models.ai_session import AISession
from app.models.session_suggested_fix import SessionSuggestedFix
from app.models.user import User
from app.schemas.session_suggested_fix import (
EscalationPackagePostRequest,
ResolutionNotePostRequest,
ResolutionNotePreviewResponse,
ResolutionPostResponse,
SessionSuggestedFixDecisionRequest,
SessionSuggestedFixDecisionResponse,
SessionSuggestedFixOutcomeRequest,
SessionSuggestedFixResponse,
SessionSuggestedFixScriptRequest,
)
from app.models.draft_template import DraftTemplate
from app.models.session_fact import SessionFact
from app.services.escalation_package_generator import EscalationPackageGeneratorService
from app.services.preview_cache import preview_cache
from app.services.psa_writeback_service import (
PSAStatusVerificationError,
PSAWritebackService,
)
from app.services.resolution_note_generator import ResolutionNoteGeneratorService
from app.services.template_extraction_service import extract_parameters as _extract_template_parameters
logger = logging.getLogger(__name__)
router = APIRouter(prefix="/ai-sessions/{session_id}", tags=["session-suggested-fixes"])
async def _load_session_or_404(db: AsyncSession, session_id: UUID) -> AISession:
"""RLS-scoped session load. 404 covers both missing and cross-tenant."""
result = await db.execute(select(AISession).where(AISession.id == session_id))
session = result.scalar_one_or_none()
if session is None:
raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Session not found")
return session
# ── Suggested fix: active ──────────────────────────────────────────────────
@router.get(
"/suggested-fixes/active",
response_model=SessionSuggestedFixResponse,
)
async def get_active_suggested_fix(
session_id: UUID,
current_user: Annotated[User, Depends(get_current_active_user)],
db: Annotated[AsyncSession, Depends(get_db)],
_: None = Depends(require_engineer_or_admin),
) -> SessionSuggestedFixResponse:
"""Return the current active suggested fix (`superseded_at IS NULL`) or 404.
A session has at most one active fix. Multiple historical rows persist
for audit, but only the most-recent un-superseded one is returned here.
"""
await _load_session_or_404(db, session_id)
result = await db.execute(
select(SessionSuggestedFix)
.where(
SessionSuggestedFix.session_id == session_id,
SessionSuggestedFix.superseded_at.is_(None),
)
.order_by(SessionSuggestedFix.created_at.desc())
)
fix = result.scalars().first()
if fix is None:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND,
detail="No active suggested fix for this session",
)
return SessionSuggestedFixResponse.model_validate(fix)
# ── Suggested fix: decision ────────────────────────────────────────────────
@router.post(
"/suggested-fixes/{fix_id}/decision",
response_model=SessionSuggestedFixDecisionResponse,
)
async def record_decision(
session_id: UUID,
fix_id: UUID,
body: SessionSuggestedFixDecisionRequest,
current_user: Annotated[User, Depends(get_current_active_user)],
db: Annotated[AsyncSession, Depends(get_db)],
_: None = Depends(require_engineer_or_admin),
) -> SessionSuggestedFixDecisionResponse:
"""Record the engineer's path choice on a suggested fix.
Phase 3 recorded the choice and (for `dismissed`) superseded the fix.
Phase 5 adds side effects: one_off / draft_template return the rendered
script; draft_template also creates a `draft_templates` row via the
TemplateExtractionService; build_template returns a redirect to the
Script Builder.
"""
session_obj = await _load_session_or_404(db, session_id)
result = await db.execute(
select(SessionSuggestedFix).where(
SessionSuggestedFix.id == fix_id,
SessionSuggestedFix.session_id == session_id,
)
)
fix = result.scalar_one_or_none()
if fix is None:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND, detail="Suggested fix not found"
)
# Once a fix has been superseded we still record the engineer's
# decision (it's a historical signal — "engineer dismissed the
# interim hypothesis"), but `dismissed` on a superseded row would
# be redundant noise.
if fix.superseded_at is not None and body.decision == "dismissed":
raise HTTPException(
status_code=status.HTTP_409_CONFLICT,
detail="This fix is already superseded by a newer suggestion",
)
fix.user_decision = body.decision
if body.decision == "dismissed" and fix.superseded_at is None:
fix.superseded_at = datetime.now(timezone.utc)
# Engineer's choice changes the bundle the resolution-note preview sees,
# so bump state_version too.
await db.execute(
update(AISession)
.where(AISession.id == session_id)
.values(state_version=AISession.state_version + 1)
)
rendered_script: str | None = None
draft_template_id: UUID | None = None
redirect_path: str | None = None
# Phase 5 side effects. All three non-dismiss paths assume the fix has
# either a script_template_id (template match — use the dedicated
# /scripts/generate endpoint from the frontend, not this one) or an
# ai_drafted_script (custom script — this is the entry point).
if body.decision in ("one_off", "draft_template", "build_template"):
drafted = body.edited_script or fix.ai_drafted_script
if not drafted:
# Template-matched fixes take the regular /scripts/generate path.
# If a fix somehow reaches here without a drafted script AND
# without a template, that's a client-side wiring bug.
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail=(
"Suggested fix has no ai_drafted_script — use "
"/api/v1/scripts/generate for template-matched fixes."
),
)
rendered_script = drafted.strip()
if body.decision == "draft_template":
# TemplateExtractionService proposes the parameterization. Runs
# under the same transaction so a failure rolls back the decision.
session_ctx = await _summarize_session_for_extraction(db, session_id)
extraction = await _extract_template_parameters(
script_body=rendered_script or "",
session_context=session_ctx,
ticket_context=None, # ticket context wiring lands in Phase 5 polish
)
draft = DraftTemplate(
account_id=session_obj.account_id,
source_session_id=session_obj.id,
source_user_id=current_user.id,
script_body=extraction["templated_body"] or (rendered_script or ""),
proposed_parameters={"parameters": extraction["parameters"]},
proposed_name=fix.title[:200] if fix.title else None,
status="pending",
)
db.add(draft)
await db.flush()
draft_template_id = draft.id
if body.decision == "build_template":
# Frontend navigates to the Script Builder preloaded with the
# drafted body. The builder wires the full parameterization flow;
# we hand it a scratch-pad query string, not persistent state.
redirect_path = (
f"/scripts/builder?from_session={session_obj.id}&fix={fix.id}"
)
await db.commit()
await db.refresh(fix)
return SessionSuggestedFixDecisionResponse(
id=fix.id,
user_decision=fix.user_decision, # type: ignore[arg-type]
rendered_script=rendered_script,
draft_template_id=draft_template_id,
redirect_path=redirect_path,
)
# ── Suggested fix: apply (stamp applied_at) ──────────────────────────────
@router.post(
"/suggested-fixes/{fix_id}/apply",
response_model=SessionSuggestedFixResponse,
)
async def apply_suggested_fix(
session_id: UUID,
fix_id: UUID,
current_user: Annotated[User, Depends(get_current_active_user)],
db: Annotated[AsyncSession, Depends(get_db)],
_: None = Depends(require_engineer_or_admin),
) -> SessionSuggestedFixResponse:
"""Stamp applied_at when the engineer clicks Apply in the ProposalBanner.
This does NOT change status (fix remains 'proposed'). Status only flips
when the engineer records an outcome via PATCH /outcome.
Rules:
- Fix must be in 'proposed' status; any other status → 409.
- Idempotent: if applied_at is already set, returns 200 with the unchanged row.
- Bumps ai_sessions.state_version so resolve/escalate preview generators
know the fix has entered the verifying phase.
"""
await _load_session_or_404(db, session_id)
result = await db.execute(
select(SessionSuggestedFix).where(
SessionSuggestedFix.id == fix_id,
SessionSuggestedFix.session_id == session_id,
)
)
fix = result.scalar_one_or_none()
if fix is None:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND, detail="Suggested fix not found"
)
if fix.status != "proposed":
raise HTTPException(
status_code=status.HTTP_409_CONFLICT,
detail=f"Apply is only valid from 'proposed'; fix is already '{fix.status}'",
)
# Idempotent: already stamped → return as-is without bumping state_version again.
if fix.applied_at is not None:
return SessionSuggestedFixResponse.model_validate(fix)
fix.applied_at = datetime.now(timezone.utc)
# Bump state_version so preview generators see the verifying-phase signal.
await db.execute(
update(AISession)
.where(AISession.id == session_id)
.values(state_version=AISession.state_version + 1)
)
await db.commit()
await db.refresh(fix)
return SessionSuggestedFixResponse.model_validate(fix)
# ── Suggested fix: outcome ────────────────────────────────────────────────
@router.patch(
"/suggested-fixes/{fix_id}/outcome",
response_model=SessionSuggestedFixResponse,
)
async def patch_suggested_fix_outcome(
session_id: UUID,
fix_id: UUID,
body: SessionSuggestedFixOutcomeRequest,
current_user: Annotated[User, Depends(get_current_active_user)],
db: Annotated[AsyncSession, Depends(get_db)],
_: None = Depends(require_engineer_or_admin),
) -> SessionSuggestedFixResponse:
"""Record the engineer's outcome for an applied fix.
See `SessionSuggestedFixOutcomeRequest` for transition rules.
"""
await _load_session_or_404(db, session_id)
now = datetime.now(timezone.utc)
result = await db.execute(
select(SessionSuggestedFix).where(
SessionSuggestedFix.id == fix_id,
SessionSuggestedFix.session_id == session_id,
)
)
fix = result.scalar_one_or_none()
if fix is None:
raise HTTPException(
status_code=status.HTTP_404_NOT_FOUND, detail="Suggested fix not found"
)
if body.outcome == "applied_partial" and not (body.notes and body.notes.strip()):
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST,
detail="notes are required when outcome is applied_partial",
)
TERMINAL = {"applied_success", "applied_failed", "dismissed"}
if fix.status in TERMINAL:
raise HTTPException(
status_code=status.HTTP_409_CONFLICT,
detail=f"Fix is already in terminal status {fix.status!r}",
)
fix.status = body.outcome
if body.outcome == "applied_partial":
fix.partial_notes = (body.notes or "").strip() or None
elif body.outcome == "applied_failed":
fix.failure_reason = (body.notes or "").strip() or None
fix.verified_at = now
elif body.outcome == "applied_success":
fix.verified_at = now
# dismissed: no timestamp/notes stamping
if fix.applied_at is None and body.outcome != "dismissed":
fix.applied_at = now
# Clear any pending AI outcome proposal — engineer has taken a terminal action.
fix.ai_outcome_proposal = None
# Outcome changes the bundle that resolution-note/escalation-package
# previews see, so bump state_version inside the same transaction —
# mirrors the pattern in record_decision above.
await db.execute(
update(AISession)
.where(AISession.id == session_id)
.values(state_version=AISession.state_version + 1)
)
await db.commit()
await db.refresh(fix)
return SessionSuggestedFixResponse.model_validate(fix)
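
The transition rules enforced in `patch_suggested_fix_outcome` can be summarized as: `applied_partial` requires non-blank notes, and terminal statuses reject any further outcome. A compact sketch of just that validation, with a hypothetical function name and string error codes standing in for the HTTP responses:

```python
from typing import Optional

TERMINAL = frozenset({"applied_success", "applied_failed", "dismissed"})


def check_outcome(current_status: str, outcome: str, notes: Optional[str]) -> Optional[str]:
    """Return None when the transition is allowed, else a short error code."""
    # Same order as the handler: notes check first, then terminal check.
    if outcome == "applied_partial" and not (notes and notes.strip()):
        return "notes_required"   # handler responds 400
    if current_status in TERMINAL:
        return "terminal_status"  # handler responds 409
    return None
```

Under these rules `applied_partial` is not terminal, so a partially applied fix can still move to `applied_success` or `applied_failed` later.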
# ── Suggested fix: attach drafted script ─────────────────────────────────────
@router.patch(
"/suggested-fixes/{fix_id}/script",
response_model=SessionSuggestedFixResponse,
)
async def patch_suggested_fix_script(
session_id: UUID,
fix_id: UUID,
body: SessionSuggestedFixScriptRequest,
current_user: Annotated[User, Depends(get_current_active_user)],
db: Annotated[AsyncSession, Depends(get_db)],
_: None = Depends(require_engineer_or_admin),
) -> SessionSuggestedFixResponse:
"""Attach an engineer-drafted script to a suggested fix.
Called by the inline Script Builder tab on Submit. Does NOT stamp
applied_at — a draft is not an application. Bumps state_version so
the Resolve/Escalate preview bundles regenerate.
"""
await _load_session_or_404(db, session_id)
fix = await db.scalar(
select(SessionSuggestedFix).where(
SessionSuggestedFix.id == fix_id,
SessionSuggestedFix.session_id == session_id,
)
)
if fix is None:
raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Suggested fix not found")
TERMINAL = {"applied_success", "applied_failed", "dismissed"}
if fix.status in TERMINAL:
raise HTTPException(
status_code=status.HTTP_409_CONFLICT,
detail=f"Fix is already in terminal status {fix.status!r}",
)
fix.ai_drafted_script = body.ai_drafted_script
fix.ai_drafted_parameters = body.ai_drafted_parameters
# Bump state_version on the parent session — previews cached by
# (session_id, state_version) must regenerate to reflect the new draft.
await db.execute(
update(AISession)
.where(AISession.id == session_id)
.values(state_version=AISession.state_version + 1)
)
await db.commit()
await db.refresh(fix)
return SessionSuggestedFixResponse.model_validate(fix)
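
Several handlers above bump `ai_sessions.state_version` so that cached previews stop matching. The real `preview_cache` service is not shown in this diff; the following is only an illustrative in-memory sketch with hypothetical names (`VersionedPreviewCache`, `get_or_build`) of why keying on `(kind, session_id, state_version)` makes a version bump act as invalidation:

```python
from typing import Callable, Dict, Tuple
from uuid import UUID, uuid4

CacheKey = Tuple[str, UUID, int]


class VersionedPreviewCache:
    """Toy cache keyed by (kind, session_id, state_version)."""

    def __init__(self) -> None:
        self._entries: Dict[CacheKey, str] = {}

    def get_or_build(
        self,
        kind: str,
        session_id: UUID,
        state_version: int,
        build: Callable[[], str],
    ) -> str:
        key = (kind, session_id, state_version)
        if key not in self._entries:
            # Only runs on a miss, e.g. the first request after a bump.
            self._entries[key] = build()
        return self._entries[key]
```

Entries for older versions are never read again in this sketch but also never evicted, so a real cache would presumably need a TTL or size bound.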

# ── Suggested fix: clear AI outcome proposal ("Not yet") ─────────────────────
@router.delete(
    "/suggested-fixes/{fix_id}/ai-outcome-proposal",
    response_model=SessionSuggestedFixResponse,
)
async def clear_ai_outcome_proposal(
    session_id: UUID,
    fix_id: UUID,
    current_user: Annotated[User, Depends(get_current_active_user)],
    db: Annotated[AsyncSession, Depends(get_db)],
    _: None = Depends(require_engineer_or_admin),
) -> SessionSuggestedFixResponse:
    """Explicitly dismiss the AI-proposed outcome banner ("Not yet").

    Clears `ai_outcome_proposal` without touching status or state_version
    (this is pure UI state, not outcome data). Idempotent: returns 200 even
    when the field is already null. After this call the banner will not
    re-surface on the next refreshSessionDerived unless the AI emits a new
    proposal.
    """
    await _load_session_or_404(db, session_id)
    result = await db.execute(
        select(SessionSuggestedFix).where(
            SessionSuggestedFix.id == fix_id,
            SessionSuggestedFix.session_id == session_id,
        )
    )
    fix = result.scalar_one_or_none()
    if fix is None:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND, detail="Suggested fix not found"
        )
    fix.ai_outcome_proposal = None
    await db.commit()
    await db.refresh(fix)
    return SessionSuggestedFixResponse.model_validate(fix)

async def _summarize_session_for_extraction(
    db: AsyncSession, session_id: UUID,
) -> str:
    """Compact fact list for TemplateExtractionService context.

    We don't send the full chat transcript — the extractor only needs enough
    signal to decide which values in the script are session-specific (and
    therefore worth parameterizing).
    """
    result = await db.execute(
        select(SessionFact)
        .where(
            SessionFact.session_id == session_id,
            SessionFact.deleted_at.is_(None),
        )
        .order_by(SessionFact.created_at.asc())
    )
    facts = list(result.scalars().all())
    if not facts:
        return ""
    lines = [f"- {f.text}" for f in facts]
    return "\n".join(lines)

# ── Resolution note preview ────────────────────────────────────────────────
@router.post(
    "/resolution-note/preview",
    response_model=ResolutionNotePreviewResponse,
)
async def resolution_note_preview(
    session_id: UUID,
    current_user: Annotated[User, Depends(get_current_active_user)],
    db: Annotated[AsyncSession, Depends(get_db)],
    _: None = Depends(require_engineer_or_admin),
) -> ResolutionNotePreviewResponse:
    """Generate (or return cached) draft markdown for the Resolve note.

    Cache key: `(resolution_note, session_id, state_version)`. State_version is
    bumped by every fact / suggested-fix / script-generation write, so two
    consecutive calls with no intervening writes return the same cached
    payload (and won't pay for a Sonnet call).

    Posted to PSA in Phase 4. Until then, this endpoint is read-only.
    """
    await _load_session_or_404(db, session_id)
    gen = ResolutionNoteGeneratorService(db)
    try:
        payload = await gen.generate_or_get_cached(session_id)
    except ValueError as e:
        raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail=str(e))
    except Exception as e:
        logger.exception("Resolution note preview failed for session %s", session_id)
        raise HTTPException(
            status_code=status.HTTP_502_BAD_GATEWAY,
            detail=f"Resolution-note generator error ({type(e).__name__})",
        )
    return ResolutionNotePreviewResponse(**payload)

# ── Phase 4: escalation-package preview ────────────────────────────────────
@router.post(
    "/escalation-package/preview",
    response_model=ResolutionNotePreviewResponse,
)
async def escalation_package_preview(
    session_id: UUID,
    current_user: Annotated[User, Depends(get_current_active_user)],
    db: Annotated[AsyncSession, Depends(get_db)],
    _: None = Depends(require_engineer_or_admin),
) -> ResolutionNotePreviewResponse:
    """Generate (or return cached) draft markdown for the Escalate handoff package.

    Same caching story as the resolution-note preview: keyed on
    `(session_id, state_version)`. Separate cache kind so a Resolve preview
    and an Escalate preview for the same state can coexist.
    """
    await _load_session_or_404(db, session_id)
    gen = EscalationPackageGeneratorService(db)
    try:
        payload = await gen.generate_or_get_cached(session_id)
    except ValueError as e:
        raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail=str(e))
    except Exception as e:
        logger.exception("Escalation package preview failed for session %s", session_id)
        raise HTTPException(
            status_code=status.HTTP_502_BAD_GATEWAY,
            detail=f"Escalation-package generator error ({type(e).__name__})",
        )
    return ResolutionNotePreviewResponse(**payload)

# ── Phase 4: Resolve & post ────────────────────────────────────────────────
@router.post(
    "/resolution-note/post",
    response_model=ResolutionPostResponse,
)
async def post_resolution_note(
    session_id: UUID,
    body: ResolutionNotePostRequest,
    current_user: Annotated[User, Depends(get_current_active_user)],
    db: Annotated[AsyncSession, Depends(get_db)],
    _: None = Depends(require_engineer_or_admin),
) -> ResolutionPostResponse:
    """Commit the engineer-edited resolution note and close the session.

    Three outcomes:
    - **External post + status verified** — session.status='resolved',
      markdown + external_id + posted_at persisted, CW status flipped to
      the configured Resolved status ID and re-fetch-verified.
    - **External post only** — markdown posted, but no cw_resolved_status_id
      configured → session.status='resolved', `status_transition_skipped_reason`
      explains the skip. Not an error — posting the note is meaningful.
    - **Local-only** — session has no linked PSA ticket → markdown stored on
      `resolution_note_markdown`, session.status='resolved', outcome =
      'resolved_local'. No external call.

    Status verification failure raises 502: the engineer intended to close
    the ticket but we cannot confirm it actually closed. Surfacing silent
    success would be a footgun.
    """
    session_obj = await _load_session_or_404(db, session_id)
    if session_obj.status not in ("active", "paused", "requesting_escalation", "escalated"):
        # Already-resolved sessions shouldn't be re-posted; caller should
        # query first. escalated→resolved is allowed (engineer revised course).
        raise HTTPException(
            status_code=status.HTTP_409_CONFLICT,
            detail=f"Session is already {session_obj.status}",
        )
    service = PSAWritebackService(db)
    summary = (body.resolution_summary or body.markdown.strip().splitlines()[0])[:500]
    # Local-only path — no PSA ticket linked, nothing to post.
    if not session_obj.psa_ticket_id or not session_obj.psa_connection_id:
        session_obj.resolution_note_markdown = body.markdown.strip()
        session_obj.status = "resolved"
        session_obj.resolved_at = datetime.now(timezone.utc)
        session_obj.resolution_summary = summary
        await db.commit()
        return ResolutionPostResponse(
            outcome="resolved_local",
            session_status=session_obj.status,
        )
    try:
        posted = await service.post_resolution_note(session_obj, body.markdown)
    except Exception as e:
        logger.exception("post_resolution_note failed for session %s", session_id)
        await db.rollback()
        raise HTTPException(
            status_code=status.HTTP_502_BAD_GATEWAY,
            detail=f"PSA post failed ({type(e).__name__})",
        )
    # Attempt the status transition if configured; failed verification is
    # surfaced loudly (status_code 502) per the ConnectWise anti-silent-
    # success principle. Not configured → skip with a reason, not an error.
    target_status_id = await service.resolved_status_id_for_account(session_obj.account_id)
    verified_status_id: int | None = None
    verified_status_name: str | None = None
    skipped_reason: str | None = None
    if target_status_id is None:
        skipped_reason = (
            "No cw_resolved_status_id configured in account_settings.preferences — "
            "note posted, status unchanged."
        )
    else:
        try:
            result = await service.transition_ticket_status(session_obj, target_status_id)
            verified_status_id = result["verified_status_id"]
            verified_status_name = result["verified_status_name"]
        except PSAStatusVerificationError as e:
            logger.error("Status verification failed for session %s: %s", session_id, e)
            # Note was already posted — roll that partial side effect back in
            # the session record (the CW note itself can't be un-posted).
            await db.rollback()
            raise HTTPException(
                status_code=status.HTTP_502_BAD_GATEWAY,
                detail=str(e),
            )
        except Exception as e:
            logger.exception("Status transition failed for session %s", session_id)
            await db.rollback()
            raise HTTPException(
                status_code=status.HTTP_502_BAD_GATEWAY,
                detail=f"PSA status transition error ({type(e).__name__})",
            )
    session_obj.status = "resolved"
    session_obj.resolved_at = datetime.now(timezone.utc)
    session_obj.resolution_summary = summary
    await db.commit()
    return ResolutionPostResponse(
        outcome="resolved",
        session_status=session_obj.status,
        external_id=posted["external_id"],
        posted_at=posted["posted_at"],
        verified_status_id=verified_status_id,
        verified_status_name=verified_status_name,
        status_transition_skipped_reason=skipped_reason,
    )

# ── Phase 4: Escalate & post ──────────────────────────────────────────────
@router.post(
    "/escalation-package/post",
    response_model=ResolutionPostResponse,
)
async def post_escalation_package(
    session_id: UUID,
    body: EscalationPackagePostRequest,
    current_user: Annotated[User, Depends(get_current_active_user)],
    db: Annotated[AsyncSession, Depends(get_db)],
    _: None = Depends(require_engineer_or_admin),
) -> ResolutionPostResponse:
    """Commit the engineer-edited escalation package and mark the session escalated.

    Structure mirrors post_resolution_note:
    - Local-only when no PSA ticket: markdown stored, session.status='escalated'.
    - PSA post: internal-analysis note (handoff is for the next engineer,
      not the customer), optional status transition via cw_escalated_status_id,
      re-fetch verified.
    """
    session_obj = await _load_session_or_404(db, session_id)
    if session_obj.status not in ("active", "paused", "resolved"):
        # resolved→escalated is allowed (engineer realized they need help
        # after closing); escalated→escalated would be a no-op, block it.
        raise HTTPException(
            status_code=status.HTTP_409_CONFLICT,
            detail=f"Session is already {session_obj.status}",
        )
    service = PSAWritebackService(db)
    # Cap the reason at 500 chars whichever source it comes from, matching
    # how post_resolution_note truncates its summary.
    reason = (body.escalation_reason or body.markdown.strip().splitlines()[0])[:500]
    if not session_obj.psa_ticket_id or not session_obj.psa_connection_id:
        session_obj.escalation_package_markdown = body.markdown.strip()
        session_obj.status = "escalated"
        session_obj.escalation_reason = reason
        await db.commit()
        return ResolutionPostResponse(
            outcome="escalated_local",
            session_status=session_obj.status,
        )
    try:
        posted = await service.post_escalation_package(session_obj, body.markdown)
    except Exception as e:
        logger.exception("post_escalation_package failed for session %s", session_id)
        await db.rollback()
        raise HTTPException(
            status_code=status.HTTP_502_BAD_GATEWAY,
            detail=f"PSA post failed ({type(e).__name__})",
        )
    target_status_id = await service.escalated_status_id_for_account(session_obj.account_id)
    verified_status_id: int | None = None
    verified_status_name: str | None = None
    skipped_reason: str | None = None
    if target_status_id is None:
        skipped_reason = (
            "No cw_escalated_status_id configured — package posted, status unchanged."
        )
    else:
        try:
            result = await service.transition_ticket_status(session_obj, target_status_id)
            verified_status_id = result["verified_status_id"]
            verified_status_name = result["verified_status_name"]
        except PSAStatusVerificationError as e:
            logger.error("Status verification failed for session %s: %s", session_id, e)
            await db.rollback()
            raise HTTPException(status_code=status.HTTP_502_BAD_GATEWAY, detail=str(e))
        except Exception as e:
            logger.exception("Status transition failed for session %s", session_id)
            await db.rollback()
            raise HTTPException(
                status_code=status.HTTP_502_BAD_GATEWAY,
                detail=f"PSA status transition error ({type(e).__name__})",
            )
    session_obj.status = "escalated"
    session_obj.escalation_reason = reason
    await db.commit()
    return ResolutionPostResponse(
        outcome="escalated",
        session_status=session_obj.status,
        external_id=posted["external_id"],
        posted_at=posted["posted_at"],
        verified_status_id=verified_status_id,
        verified_status_name=verified_status_name,
        status_transition_skipped_reason=skipped_reason,
    )

# ── Helper used by tests ───────────────────────────────────────────────────
def _clear_preview_cache_for_tests() -> None:
    """Reset the singleton cache between tests."""
    preview_cache._store.clear()  # noqa: SLF001 — test-only access
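
The preview endpoints above describe a cache keyed on the cache kind, the session ID, and `state_version`, so any write that bumps `state_version` makes the old entry unreachable. A minimal sketch of that idea follows, assuming a plain in-memory dict. The `PreviewCache` class and `get_or_build` method are hypothetical names for illustration; only the `preview_cache._store` attribute is visible in this commit, and the real service may differ.

```python
from typing import Callable, Dict, Tuple
from uuid import UUID, uuid4

# (kind, session_id, state_version): the key shape described in the docstrings.
CacheKey = Tuple[str, UUID, int]


class PreviewCache:
    """Hypothetical in-memory cache; not the project's real implementation."""

    def __init__(self) -> None:
        self._store: Dict[CacheKey, dict] = {}

    def get_or_build(
        self,
        kind: str,
        session_id: UUID,
        state_version: int,
        build: Callable[[], dict],
    ) -> dict:
        key = (kind, session_id, state_version)
        if key not in self._store:
            # Only pay for generation when this exact state was never rendered.
            self._store[key] = build()
        return self._store[key]


cache = PreviewCache()
sid = uuid4()
calls = []


def build_note() -> dict:
    # Stand-in for the expensive generator call.
    calls.append("built")
    return {"markdown": f"draft #{len(calls)}"}


first = cache.get_or_build("resolution_note", sid, 1, build_note)
second = cache.get_or_build("resolution_note", sid, 1, build_note)  # cache hit
third = cache.get_or_build("resolution_note", sid, 2, build_note)  # version bumped
escalate = cache.get_or_build("escalation_package", sid, 2, build_note)  # other kind
```

Because the version is part of the key, nothing has to be evicted explicitly on write: stale entries are simply never looked up again, at the cost of holding them in memory until the store is cleared.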


@@ -143,8 +143,8 @@ async def get_session(
     if session.user_id != current_user.id and session.assigned_to_id != current_user.id:
         raise HTTPException(
-            status_code=status.HTTP_403_FORBIDDEN,
-            detail="You don't have access to this session"
+            status_code=status.HTTP_404_NOT_FOUND,
+            detail="Session not found"
         )
     return session
@@ -196,6 +196,7 @@ async def start_session(
     new_session = Session(
         tree_id=tree.id,
         user_id=current_user.id,
+        account_id=current_user.account_id,
         tree_snapshot=tree_snapshot,
         path_taken=[],
         decisions=[],
@@ -234,8 +235,8 @@ async def update_session(
     if session.user_id != current_user.id and session.assigned_to_id != current_user.id:
         raise HTTPException(
-            status_code=status.HTTP_403_FORBIDDEN,
-            detail="You don't have access to this session"
+            status_code=status.HTTP_404_NOT_FOUND,
+            detail="Session not found"
         )
     if session.completed_at:
@@ -281,8 +282,8 @@ async def complete_session(
     if session.user_id != current_user.id and session.assigned_to_id != current_user.id:
         raise HTTPException(
-            status_code=status.HTTP_403_FORBIDDEN,
-            detail="You don't have access to this session"
+            status_code=status.HTTP_404_NOT_FOUND,
+            detail="Session not found"
         )
     if session.completed_at:
@@ -319,8 +320,8 @@ async def update_scratchpad(
     if session.user_id != current_user.id and session.assigned_to_id != current_user.id:
         raise HTTPException(
-            status_code=status.HTTP_403_FORBIDDEN,
-            detail="You don't have access to this session"
+            status_code=status.HTTP_404_NOT_FOUND,
+            detail="Session not found"
         )
     session.scratchpad = data.scratchpad
@@ -348,8 +349,8 @@ async def update_session_variables(
     if session.user_id != current_user.id and session.assigned_to_id != current_user.id:
         raise HTTPException(
-            status_code=status.HTTP_403_FORBIDDEN,
-            detail="You don't have access to this session"
+            status_code=status.HTTP_404_NOT_FOUND,
+            detail="Session not found"
         )
     if session.completed_at:
@@ -387,8 +388,8 @@ async def export_session(
     if session.user_id != current_user.id and session.assigned_to_id != current_user.id:
         raise HTTPException(
-            status_code=status.HTTP_403_FORBIDDEN,
-            detail="You don't have access to this session"
+            status_code=status.HTTP_404_NOT_FOUND,
+            detail="Session not found"
         )
     # PDF export — separate path with binary response
@@ -693,6 +694,7 @@ async def prepare_session(
     new_session = Session(
         tree_id=tree.id,
         user_id=data.assigned_to_id or current_user.id,
+        account_id=current_user.account_id,
         tree_snapshot=tree_snapshot,
         path_taken=[],
         decisions=[],
@@ -770,6 +772,7 @@ async def batch_launch_sessions(
         session = Session(
             tree_id=tree.id,
             user_id=current_user.id,
+            account_id=current_user.account_id,
             tree_snapshot=tree_snapshot,
             path_taken=[],
             decisions=[],
@@ -830,8 +833,8 @@ async def link_ticket(
     if session.user_id != current_user.id and session.assigned_to_id != current_user.id:
         if not current_user.is_super_admin:
             raise HTTPException(
-                status_code=status.HTTP_403_FORBIDDEN,
-                detail="You don't have access to this session",
+                status_code=status.HTTP_404_NOT_FOUND,
+                detail="Session not found",
             )
     # Unlink
@@ -1102,6 +1105,7 @@ async def psa_post_to_ticket(
     # Log to audit trail
     log_entry = PsaPostLog(
         session_id=session.id,
+        account_id=session.account_id,
         psa_connection_id=psa_connection.id if psa_connection else None,
         ticket_id=session.psa_ticket_id,
         note_type=data.note_type,


@@ -9,6 +9,7 @@ from sqlalchemy.orm import joinedload
 from sqlalchemy.exc import IntegrityError
 from app.core.database import get_db
+from app.core.admin_database import get_admin_db
 from app.models.session import Session
 from app.models.session_share import SessionShare, SessionShareView
 from app.models.user import User
@@ -72,8 +73,8 @@ async def create_share(
     if session.user_id != current_user.id and not current_user.is_super_admin:
         raise HTTPException(
-            status_code=status.HTTP_403_FORBIDDEN,
-            detail="Only the session owner can create share links"
+            status_code=status.HTTP_404_NOT_FOUND,
+            detail="Session not found"
         )
     # Require account_id for account-scoped shares
@@ -170,8 +171,8 @@ async def revoke_share(
     if share.created_by != current_user.id and not current_user.is_super_admin:
         raise HTTPException(
-            status_code=status.HTTP_403_FORBIDDEN,
-            detail="Only the share creator can revoke it"
+            status_code=status.HTTP_404_NOT_FOUND,
+            detail="Share not found"
         )
     share.is_active = False
@@ -210,7 +211,7 @@ async def _get_optional_user(request: Request, db: AsyncSession) -> Optional[Use
 async def access_share(
     share_token: str,
     request: Request,
-    db: Annotated[AsyncSession, Depends(get_db)],
+    db: Annotated[AsyncSession, Depends(get_admin_db)],
 ):
     """Access a shared session via share token.


@@ -16,6 +16,7 @@ from app.schemas.step_category import (
 )
 from app.api.deps import get_current_active_user
 from app.core.permissions import can_manage_step_category, can_create_step_category
+from app.core.service_account import PLATFORM_ACCOUNT_ID
 router = APIRouter(prefix="/step-categories", tags=["step-categories"])
@@ -44,13 +45,13 @@ async def list_step_categories(
     elif current_user.account_id:
         query = query.where(
             or_(
-                StepCategory.account_id.is_(None), # Global
+                StepCategory.account_id == PLATFORM_ACCOUNT_ID, # Global
                 StepCategory.account_id == current_user.account_id # User's account
             )
         )
     else:
         # User has no account, only show global categories
-        query = query.where(StepCategory.account_id.is_(None))
+        query = query.where(StepCategory.account_id == PLATFORM_ACCOUNT_ID)
     query = query.order_by(StepCategory.display_order, StepCategory.name)
@@ -94,8 +95,8 @@ async def get_step_category(
     # Check access: global categories visible to all, account categories only to account members
     if category.account_id and category.account_id != current_user.account_id and not current_user.is_super_admin:
         raise HTTPException(
-            status_code=status.HTTP_403_FORBIDDEN,
-            detail="You don't have access to this step category"
+            status_code=status.HTTP_404_NOT_FOUND,
+            detail="Step category not found"
         )
     return StepCategoryResponse(
@@ -155,7 +156,7 @@ async def create_step_category(
         name=category_data.name,
         slug=slug,
         description=category_data.description,
-        account_id=category_data.account_id,
+        account_id=category_data.account_id if category_data.account_id is not None else PLATFORM_ACCOUNT_ID,
         display_order=max_order + 1,
         created_by=current_user.id
     )


@@ -47,10 +47,10 @@ async def get_step_or_404(
         raise HTTPException(status_code=404, detail="Step not found")
     if check_view and not can_view_step(current_user, step):
-        raise HTTPException(status_code=403, detail="Not authorized to view this step")
+        raise HTTPException(status_code=404, detail="Step not found")
     if check_edit and not can_edit_step(current_user, step):
-        raise HTTPException(status_code=403, detail="Not authorized to modify this step")
+        raise HTTPException(status_code=404, detail="Step not found")
     return step
@@ -460,6 +460,7 @@ async def rate_step(
     rating = StepRating(
         step_id=step_id,
         user_id=current_user.id,
+        account_id=current_user.account_id,
         rating=rating_data.rating,
         was_helpful=rating_data.was_helpful,
         review_text=rating_data.review_text,


@@ -103,6 +103,7 @@ async def create_supporting_data(
     item = SessionSupportingData(
         session_id=session_id,
+        account_id=session.account_id,
         label=data.label,
         data_type=data.data_type,
         content=data.content,


@@ -12,6 +12,7 @@ from app.models.user import User
 from app.schemas.tag import TagCreate, TagResponse, TagListResponse, TagAssignment
 from app.api.deps import get_current_active_user
 from app.core.permissions import can_manage_tree_tags, can_create_tag
+from app.core.service_account import PLATFORM_ACCOUNT_ID
 router = APIRouter(prefix="/tags", tags=["tags"])
@@ -33,13 +34,13 @@ async def list_tags(
     if include_account and current_user.account_id:
         query = query.where(
             or_(
-                TreeTag.account_id.is_(None), # Global
+                TreeTag.account_id == PLATFORM_ACCOUNT_ID, # Global
                 TreeTag.account_id == current_user.account_id # User's account
             )
         )
     else:
         # Only show global tags
-        query = query.where(TreeTag.account_id.is_(None))
+        query = query.where(TreeTag.account_id == PLATFORM_ACCOUNT_ID)
     query = query.order_by(TreeTag.usage_count.desc(), TreeTag.name)
@@ -71,12 +72,12 @@ async def search_tags(
     if include_account and current_user.account_id:
         query = query.where(
             or_(
-                TreeTag.account_id.is_(None),
+                TreeTag.account_id == PLATFORM_ACCOUNT_ID,
                 TreeTag.account_id == current_user.account_id
             )
         )
     else:
-        query = query.where(TreeTag.account_id.is_(None))
+        query = query.where(TreeTag.account_id == PLATFORM_ACCOUNT_ID)
     query = query.order_by(TreeTag.usage_count.desc(), TreeTag.name).limit(limit)
@@ -105,8 +106,8 @@ async def get_tag(
     # Check access: global tags visible to all, account tags only to account members
     if tag.account_id and tag.account_id != current_user.account_id and not current_user.is_super_admin:
         raise HTTPException(
-            status_code=status.HTTP_403_FORBIDDEN,
-            detail="You don't have access to this tag"
+            status_code=status.HTTP_404_NOT_FOUND,
+            detail="Tag not found"
         )
     return TagResponse.model_validate(tag)
@@ -147,7 +148,7 @@ async def create_tag(
     new_tag = TreeTag(
         name=tag_data.name,
         slug=slug,
-        account_id=tag_data.account_id,
+        account_id=tag_data.account_id if tag_data.account_id is not None else PLATFORM_ACCOUNT_ID,
         created_by=current_user.id
     )
     db.add(new_tag)
@@ -206,7 +207,7 @@ async def add_tags_to_tree(
         tag_query = select(TreeTag).where(
             TreeTag.slug == slug,
             or_(
-                TreeTag.account_id.is_(None), # Global tag
+                TreeTag.account_id == PLATFORM_ACCOUNT_ID, # Global tag
                 TreeTag.account_id == tag_account_id # Account tag
             )
         )
@@ -340,7 +341,7 @@ async def replace_tree_tags(
         tag_query = select(TreeTag).where(
             TreeTag.slug == slug,
             or_(
-                TreeTag.account_id.is_(None),
+                TreeTag.account_id == PLATFORM_ACCOUNT_ID,
                 TreeTag.account_id == tag_account_id
             )
         )
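
The tag and step-category hunks above replace "NULL `account_id` means global" with an explicit platform account. A rough sketch of the resulting visibility rule, written in plain Python rather than SQLAlchemy so it runs on its own, might look like the following. The UUID value, the `Tag` dataclass, and `visible_tags` are illustrative assumptions; the real constant is imported from `app.core.service_account` and its value is not shown in this diff.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional
from uuid import UUID, uuid4

# Hypothetical placeholder value; the real PLATFORM_ACCOUNT_ID is defined
# elsewhere in the project and is not visible here.
PLATFORM_ACCOUNT_ID = UUID("00000000-0000-0000-0000-000000000001")


@dataclass(frozen=True)
class Tag:
    name: str
    account_id: UUID  # never None: global rows carry the platform account


def visible_tags(tags: Iterable[Tag], user_account_id: Optional[UUID]) -> List[Tag]:
    """Return global tags plus the caller's own account tags."""
    allowed = {PLATFORM_ACCOUNT_ID}
    if user_account_id is not None:
        allowed.add(user_account_id)
    return [t for t in tags if t.account_id in allowed]


acct_a, acct_b = uuid4(), uuid4()
all_tags = [
    Tag("networking", PLATFORM_ACCOUNT_ID),
    Tag("acme-vpn", acct_a),
    Tag("globex-printers", acct_b),
]

names_for_a = [t.name for t in visible_tags(all_tags, acct_a)]
names_for_anonymous = [t.name for t in visible_tags(all_tags, None)]
```

One consequence of the sentinel approach is that every row has a non-null tenant, so any access check that tests "has an `account_id` at all" no longer distinguishes global rows and would need to compare against the platform account instead.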


@@ -18,12 +18,10 @@ async def list_target_lists(
     current_user: Annotated[User, Depends(get_current_active_user)],
     db: Annotated[AsyncSession, Depends(get_db)],
 ):
-    """List all target lists for the current user's team."""
-    if not current_user.team_id:
-        return []
+    """List all target lists for the current user's account."""
     result = await db.execute(
         select(TargetList)
-        .where(TargetList.team_id == current_user.team_id)
+        .where(TargetList.account_id == current_user.account_id)
         .order_by(TargetList.name)
     )
     return result.scalars().all()
@@ -36,11 +34,9 @@ async def create_target_list(
     db: Annotated[AsyncSession, Depends(get_db)],
     _: None = Depends(require_engineer_or_admin),
 ):
-    """Create a new target list for the current team."""
-    if not current_user.team_id:
-        raise HTTPException(status_code=400, detail="User must belong to a team")
+    """Create a new target list for the current account."""
     target_list = TargetList(
-        team_id=current_user.team_id,
+        account_id=current_user.account_id,
         created_by=current_user.id,
         name=data.name,
         description=data.description,
@@ -61,7 +57,7 @@ async def get_target_list(
     result = await db.execute(
         select(TargetList).where(
             TargetList.id == list_id,
-            TargetList.team_id == current_user.team_id,
+            TargetList.account_id == current_user.account_id,
         )
     )
     target_list = result.scalar_one_or_none()
@@ -81,7 +77,7 @@ async def update_target_list(
     result = await db.execute(
         select(TargetList).where(
             TargetList.id == list_id,
-            TargetList.team_id == current_user.team_id,
+            TargetList.account_id == current_user.account_id,
         )
     )
     target_list = result.scalar_one_or_none()
@@ -91,7 +87,7 @@ async def update_target_list(
     if "name" in update_fields and data.name is not None:
         target_list.name = data.name
     if "description" in update_fields:
-        target_list.description = data.description # allow setting to None
+        target_list.description = data.description
     if "targets" in update_fields and data.targets is not None:
         target_list.targets = [t.model_dump() for t in data.targets]
     await db.commit()
@@ -109,7 +105,7 @@ async def delete_target_list(
     result = await db.execute(
         select(TargetList).where(
             TargetList.id == list_id,
-            TargetList.team_id == current_user.team_id,
+            TargetList.account_id == current_user.account_id,
         )
     )
     target_list = result.scalar_one_or_none()


@@ -29,6 +29,7 @@ from app.core.subscriptions import check_tree_limit, get_account_subscription, g
 from app.core.audit import log_audit
 from app.core.config import settings
 from app.core.tree_validation import can_publish_tree
+from app.core.service_account import PLATFORM_ACCOUNT_ID
 from app.core.step_sync import sync_steps_from_tree, deactivate_synced_steps_for_tree
 from app.services.rag_service import index_tree as rag_index_tree
@@ -391,9 +392,10 @@ async def get_tree(
         )
     if not tree.is_active or not can_access_tree(current_user, tree):
+        # Always 404, never 403. A 403 confirms the resource exists.
         raise HTTPException(
-            status_code=status.HTTP_403_FORBIDDEN,
-            detail="You don't have access to this tree"
+            status_code=status.HTTP_404_NOT_FOUND,
+            detail="Tree not found"
         )
     return build_full_tree_response(tree)
@@ -470,7 +472,7 @@ async def create_tree(
         tree_structure=tree_data.tree_structure,
         intake_form=intake_form_data,
         author_id=service_account_id if is_default else current_user.id,
-        account_id=None if is_default else current_user.account_id,
+        account_id=PLATFORM_ACCOUNT_ID if is_default else current_user.account_id,
         is_public=True if is_default else tree_data.is_public, # Default trees are always public
         is_default=is_default,
         status=tree_data.status
@@ -610,9 +612,17 @@ async def update_tree(
         )
     if not can_edit_tree(current_user, tree):
+        # If the user can see this tree (same account, team visibility), give a 403 with
+        # a clear message — returning 404 here would be confusing since GET returns 200.
+        # For truly inaccessible trees (cross-account), return 404 to avoid confirming existence.
+        if can_access_tree(current_user, tree):
+            raise HTTPException(
+                status_code=status.HTTP_403_FORBIDDEN,
+                detail="You do not have permission to edit this flow"
+            )
         raise HTTPException(
-            status_code=status.HTTP_403_FORBIDDEN,
-            detail="You can only edit your own trees"
+            status_code=status.HTTP_404_NOT_FOUND,
+            detail="Tree not found"
         )
     # Extract tags for separate handling
@@ -1038,6 +1048,7 @@ async def create_tree_share(
     # Create share
     tree_share = TreeShare(
         tree_id=tree.id,
+        account_id=tree.account_id, # share belongs to the tree's tenant, not the actor
         share_token=share_token,
         created_by=current_user.id,
         allow_forking=share_data.allow_forking,
@@ -1144,9 +1155,17 @@ async def update_tree_visibility(
         )
     if not can_edit_tree(current_user, tree):
+        # If the user can see this tree (same account, team visibility), give a 403 with
+        # a clear message — returning 404 here would be confusing since GET returns 200.
+        # For truly inaccessible trees (cross-account), return 404 to avoid confirming existence.
+        if can_access_tree(current_user, tree):
+            raise HTTPException(
+                status_code=status.HTTP_403_FORBIDDEN,
+                detail="You do not have permission to edit this flow"
+            )
         raise HTTPException(
-            status_code=status.HTTP_403_FORBIDDEN,
-            detail="You can only edit your own trees"
+            status_code=status.HTTP_404_NOT_FOUND,
+            detail="Tree not found"
         )
     # Update visibility
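
The `update_tree` and `update_tree_visibility` hunks above apply a two-step rule: a caller who can already see the tree gets a 403 on a denied edit, while a caller who cannot see it at all gets a 404 so the response does not confirm the tree exists. A small standalone sketch of that decision follows. The function name `edit_denial_status` is hypothetical, and the two booleans stand in for the project's `can_access_tree` and `can_edit_tree` results.

```python
from http import HTTPStatus
from typing import Optional


def edit_denial_status(can_access: bool, can_edit: bool) -> Optional[HTTPStatus]:
    """Pick the status for an edit attempt, or None when the edit is allowed."""
    if can_edit:
        return None
    if can_access:
        # The caller can already GET this resource, so a 404 would be
        # confusing and a 403 leaks nothing new.
        return HTTPStatus.FORBIDDEN
    # Cross-tenant or otherwise invisible: do not confirm existence.
    return HTTPStatus.NOT_FOUND


same_account_viewer = edit_denial_status(can_access=True, can_edit=False)
cross_account_user = edit_denial_status(can_access=False, can_edit=False)
owner = edit_denial_status(can_access=True, can_edit=True)
```

The read paths in this commit (`get_tree`, `get_session`, `get_tag`, and so on) use only the last branch, since a denied read never involves a resource the caller can otherwise see.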


@@ -255,9 +255,9 @@ async def get_upload_url(
     if upload is None:
         raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Upload not found")
-    # Verify the upload belongs to the user's account
+    # Verify the upload belongs to the user's account — 404 to avoid revealing existence
     if upload.account_id != current_user.account_id and not current_user.is_super_admin:
-        raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail="Access denied")
+        raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Upload not found")
     url = storage_service.get_presigned_url(upload.storage_key)
     return {"url": url}
@@ -311,9 +311,9 @@ async def delete_upload(
     if upload is None:
         raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Upload not found")
-    # Verify ownership
+    # Verify ownership — 404 to avoid revealing existence
     if upload.uploaded_by != current_user.id and not current_user.is_super_admin:
-        raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail="Access denied")
+        raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Upload not found")
     # Delete from S3
     await storage_service.delete_file(upload.storage_key)


@@ -1,53 +1,94 @@
from fastapi import APIRouter
from app.api.endpoints import auth, trees, sessions, sidebar, invite, categories, tags, folders, step_categories, steps, admin, accounts, webhooks, shares, shared, tree_markdown
from app.api.endpoints import admin_dashboard, admin_audit, admin_plan_limits, admin_feature_flags, admin_settings, admin_categories
from app.api.endpoints import ratings, analytics
from app.api.endpoints import target_lists
from app.api.endpoints import maintenance_schedules
from app.api.endpoints import feedback
from app.api.endpoints import ai_builder
from app.api.endpoints import ai_fix
from app.api.endpoints import ai_chat
from app.api.endpoints import copilot
from app.api.endpoints import assistant_chat
from app.api.endpoints import survey
from app.api.endpoints import admin_survey
from app.api.endpoints import tree_transfer
from app.api.endpoints import ai_suggestions
from app.api.endpoints import kb_accelerator
from app.api.endpoints import beta_signup
from app.api.endpoints import scripts
from app.api.endpoints import integrations
from app.api.endpoints import onboarding
from app.api.endpoints import branding
from app.api.endpoints import supporting_data
from app.api.endpoints import ai_sessions
from app.api.endpoints import flow_proposals
from app.api.endpoints import flowpilot_analytics
from app.api.endpoints import notifications
from app.api.endpoints import public_templates
from app.api.endpoints import admin_gallery
from app.api.endpoints import uploads
from app.api.endpoints import script_builder
from app.api.endpoints import beta_feedback
from app.api.endpoints import session_branches
from app.api.endpoints import session_handoffs
from app.api.endpoints import session_resolutions
from app.api.endpoints import device_types
from app.api.endpoints import network_diagrams
from fastapi import APIRouter, Depends
from app.api.deps import require_tenant_context
from app.api.endpoints import (
admin,
admin_audit,
admin_categories,
admin_dashboard,
admin_feature_flags,
admin_gallery,
admin_plan_limits,
admin_settings,
admin_survey,
ai_builder,
ai_chat,
ai_fix,
ai_sessions,
ai_suggestions,
analytics,
assistant_chat,
auth,
beta_feedback,
beta_signup,
branding,
categories,
copilot,
device_types,
draft_templates,
feedback,
flow_proposals,
flowpilot_analytics,
folders,
integrations,
invite,
kb_accelerator,
maintenance_schedules,
network_diagrams,
notifications,
onboarding,
public_templates,
ratings,
scripts,
script_builder,
session_branches,
session_facts,
session_handoffs,
session_resolutions,
session_suggested_fixes,
sessions,
shared,
shares,
sidebar,
step_categories,
steps,
supporting_data,
survey,
tags,
target_lists,
tree_markdown,
tree_transfer,
trees,
uploads,
webhooks,
accounts,
)
api_router = APIRouter()
# ---------------------------------------------------------------------------
# Public / unauthenticated endpoints — no tenant context
#
# Note: auth.router contains both public endpoints (register, login,
# forgot-password, reset-password, email/verify) and authenticated endpoints
# (GET/PATCH /me, logout, change-password, email/send-verification).
# The authenticated auth endpoints only query the `users` table, which is
# excluded from Phase 1 RLS. They work correctly without tenant context
# in Phase 1. This will need revisiting in Phase 2 when `users` gets RLS.
# ---------------------------------------------------------------------------
api_router.include_router(auth.router)
api_router.include_router(trees.router)
api_router.include_router(sidebar.router)
api_router.include_router(sessions.router)
api_router.include_router(invite.router)
api_router.include_router(categories.router)
api_router.include_router(tags.router)
api_router.include_router(folders.router)
api_router.include_router(step_categories.router)
api_router.include_router(steps.router)
api_router.include_router(shared.router) # Public share links (no auth)
api_router.include_router(beta_signup.router)
api_router.include_router(webhooks.router) # Stripe webhook receiver
api_router.include_router(public_templates.router) # Public gallery (no auth, rate-limited)
# ---------------------------------------------------------------------------
# Admin endpoints — super_admin only
# The admin, admin_categories, admin_gallery, and admin_dashboard routers query
# Phase 1 RLS tables and MUST use get_admin_db (migrated in Task 8). The
# remaining admin routers (admin_audit, admin_plan_limits, admin_feature_flags,
# admin_settings, admin_survey) are safe until Phase 2 extends RLS.
# ---------------------------------------------------------------------------
api_router.include_router(admin.router)
api_router.include_router(admin_dashboard.router)
api_router.include_router(admin_audit.router)
@@ -55,44 +96,60 @@ api_router.include_router(admin_plan_limits.router)
api_router.include_router(admin_feature_flags.router)
api_router.include_router(admin_settings.router)
api_router.include_router(admin_categories.router)
api_router.include_router(accounts.router)
api_router.include_router(webhooks.router)
api_router.include_router(shares.router)
api_router.include_router(shared.router) # Public endpoints (no auth)
api_router.include_router(tree_markdown.router)
api_router.include_router(ratings.router)
api_router.include_router(analytics.router)
api_router.include_router(target_lists.router)
api_router.include_router(maintenance_schedules.router)
api_router.include_router(feedback.router)
api_router.include_router(ai_builder.router)
api_router.include_router(ai_fix.router)
api_router.include_router(ai_chat.router)
api_router.include_router(copilot.router)
api_router.include_router(assistant_chat.router)
api_router.include_router(survey.router)
api_router.include_router(admin_survey.router)
api_router.include_router(tree_transfer.router)
api_router.include_router(ai_suggestions.router)
api_router.include_router(kb_accelerator.router)
api_router.include_router(beta_signup.router)
api_router.include_router(scripts.router)
api_router.include_router(integrations.router)
api_router.include_router(onboarding.router)
api_router.include_router(branding.router)
api_router.include_router(supporting_data.router)
api_router.include_router(network_diagrams.router) # Must be before ai_sessions to avoid /{diagram_id} conflict
api_router.include_router(session_handoffs.queue_router) # Must be before ai_sessions to avoid /{session_id} conflict
api_router.include_router(session_resolutions.router) # Must be before ai_sessions to avoid /{session_id} conflict
api_router.include_router(ai_sessions.router)
api_router.include_router(flow_proposals.router)
api_router.include_router(flowpilot_analytics.router)
api_router.include_router(notifications.router)
api_router.include_router(public_templates.router)
api_router.include_router(admin_gallery.router)
api_router.include_router(uploads.router)
api_router.include_router(script_builder.router)
api_router.include_router(beta_feedback.router)
api_router.include_router(session_branches.router)
api_router.include_router(session_handoffs.router)
api_router.include_router(device_types.router)
# ---------------------------------------------------------------------------
# User-facing endpoints — tenant context required
# ---------------------------------------------------------------------------
_tenant_deps = [Depends(require_tenant_context)]
api_router.include_router(trees.router, dependencies=_tenant_deps)
api_router.include_router(sidebar.router, dependencies=_tenant_deps)
api_router.include_router(sessions.router, dependencies=_tenant_deps)
api_router.include_router(invite.router, dependencies=_tenant_deps)
api_router.include_router(categories.router, dependencies=_tenant_deps)
api_router.include_router(tags.router, dependencies=_tenant_deps)
api_router.include_router(folders.router, dependencies=_tenant_deps)
api_router.include_router(step_categories.router, dependencies=_tenant_deps)
api_router.include_router(steps.router, dependencies=_tenant_deps)
api_router.include_router(accounts.router, dependencies=_tenant_deps)
api_router.include_router(shares.router, dependencies=_tenant_deps)
api_router.include_router(tree_markdown.router, dependencies=_tenant_deps)
api_router.include_router(ratings.router, dependencies=_tenant_deps)
api_router.include_router(analytics.router, dependencies=_tenant_deps)
api_router.include_router(target_lists.router, dependencies=_tenant_deps)
api_router.include_router(maintenance_schedules.router, dependencies=_tenant_deps)
api_router.include_router(feedback.router, dependencies=_tenant_deps)
api_router.include_router(ai_builder.router, dependencies=_tenant_deps)
api_router.include_router(ai_fix.router, dependencies=_tenant_deps)
api_router.include_router(ai_chat.router, dependencies=_tenant_deps)
api_router.include_router(copilot.router, dependencies=_tenant_deps)
api_router.include_router(assistant_chat.router, dependencies=_tenant_deps)
api_router.include_router(survey.router, dependencies=_tenant_deps)
api_router.include_router(tree_transfer.router, dependencies=_tenant_deps)
api_router.include_router(ai_suggestions.router, dependencies=_tenant_deps)
api_router.include_router(kb_accelerator.router, dependencies=_tenant_deps)
api_router.include_router(scripts.router, dependencies=_tenant_deps)
api_router.include_router(integrations.router, dependencies=_tenant_deps)
api_router.include_router(onboarding.router, dependencies=_tenant_deps)
api_router.include_router(branding.router, dependencies=_tenant_deps)
api_router.include_router(supporting_data.router, dependencies=_tenant_deps)
api_router.include_router(network_diagrams.router, dependencies=_tenant_deps)
# session_handoffs queue router must come before ai_sessions to avoid conflict
api_router.include_router(session_handoffs.queue_router, dependencies=_tenant_deps)
api_router.include_router(session_resolutions.router, dependencies=_tenant_deps)
# session_facts mounts under /ai-sessions/{id}/facts — register before ai_sessions
# so the {session_id}/facts subpaths take precedence over any future generic catchalls.
api_router.include_router(session_facts.router, dependencies=_tenant_deps)
api_router.include_router(session_suggested_fixes.router, dependencies=_tenant_deps)
api_router.include_router(draft_templates.router, dependencies=_tenant_deps)
api_router.include_router(ai_sessions.router, dependencies=_tenant_deps)
api_router.include_router(flow_proposals.router, dependencies=_tenant_deps)
api_router.include_router(flowpilot_analytics.router, dependencies=_tenant_deps)
api_router.include_router(notifications.router, dependencies=_tenant_deps)
api_router.include_router(uploads.router, dependencies=_tenant_deps)
api_router.include_router(script_builder.router, dependencies=_tenant_deps)
api_router.include_router(beta_feedback.router, dependencies=_tenant_deps)
api_router.include_router(session_branches.router, dependencies=_tenant_deps)
api_router.include_router(session_handoffs.router, dependencies=_tenant_deps)
api_router.include_router(device_types.router, dependencies=_tenant_deps)
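
The ordering comments above ("must come before ai_sessions to avoid /{session_id} conflict") rely on routes being tried in registration order, with the first full match winning. The sketch below models that behaviour with the standard library only. It is a simplified stand-in, not FastAPI's actual matcher, and the paths and handler names are hypothetical.

```python
from __future__ import annotations

import re


def compile_path(path: str) -> re.Pattern[str]:
    """Turn '/ai-sessions/{session_id}' into a regex; each {param} matches one segment."""
    pattern = re.sub(r"\{(\w+)\}", r"(?P<\1>[^/]+)", path)
    return re.compile(f"^{pattern}$")


def first_match(routes: list[tuple[str, str]], url: str) -> str | None:
    """Return the handler of the first registered route whose pattern matches."""
    for path, handler in routes:
        if compile_path(path).match(url):
            return handler
    return None


# Hypothetical paths and handler names, for illustration only.
specific_first = [
    ("/ai-sessions/queue", "handoff_queue"),
    ("/ai-sessions/{session_id}", "get_session"),
]
generic_first = list(reversed(specific_first))

queue_when_specific_first = first_match(specific_first, "/ai-sessions/queue")
queue_when_generic_first = first_match(generic_first, "/ai-sessions/queue")
```

With the specific route registered first, `/ai-sessions/queue` reaches its own handler. With the generic route first, the same URL is captured as `session_id="queue"`, which is the conflict the comments warn about.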


@@ -0,0 +1,38 @@
# backend/app/core/admin_database.py
"""
Admin database engine — connects as resolutionflow_admin (BYPASSRLS).
Use ONLY where explicit application-level access control makes database-layer
tenant filtering unnecessary: /admin/* endpoints, internal tooling, and public
endpoints that enforce their own authorization before returning data (e.g.
share access via opaque token + visibility check).
"""
from collections.abc import AsyncGenerator
from sqlalchemy.ext.asyncio import AsyncSession, async_sessionmaker, create_async_engine
from app.core.config import settings
admin_engine = create_async_engine(
settings.ADMIN_DATABASE_URL,
echo=settings.DEBUG,
future=True,
)
_admin_session_factory = async_sessionmaker(
admin_engine,
class_=AsyncSession,
expire_on_commit=False,
)
async def get_admin_db() -> AsyncGenerator[AsyncSession, None]:
"""Yield an admin DB session (BYPASSRLS). See module docstring for approved use cases."""
async with _admin_session_factory() as session:
try:
yield session
except Exception:
await session.rollback()
raise
finally:
await session.close()
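
A minimal sketch of how the rollback-then-close pattern in `get_admin_db` behaves when the consumer of the generator fails. It assumes a hypothetical `FakeSession` that records calls instead of talking to a database, and it drives the generator by hand, which is roughly what a framework with yield-based dependencies does.

```python
import asyncio
from collections.abc import AsyncGenerator


class FakeSession:
    """Hypothetical stand-in for AsyncSession; records calls, touches no database."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    async def rollback(self) -> None:
        self.calls.append("rollback")

    async def close(self) -> None:
        self.calls.append("close")


async def get_fake_db(session: FakeSession) -> AsyncGenerator[FakeSession, None]:
    # Same try/except/finally shape as get_admin_db, minus the session factory.
    try:
        yield session
    except Exception:
        await session.rollback()
        raise
    finally:
        await session.close()


async def run(fail: bool) -> list[str]:
    session = FakeSession()
    gen = get_fake_db(session)
    await gen.__anext__()  # advance to the yield, as a request handler would
    try:
        if fail:
            # Simulate the handler raising while it holds the session.
            await gen.athrow(RuntimeError("handler failed"))
        else:
            await gen.aclose()
    except RuntimeError:
        pass
    return session.calls


failed_calls = asyncio.run(run(fail=True))
clean_calls = asyncio.run(run(fail=False))
```

On failure the session is rolled back and then closed; on a clean exit only `close` runs, so committing remains the caller's job.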


@@ -40,7 +40,7 @@ CRITICAL BEHAVIORS:
- Act as a senior engineer, not a chatbot. Use your domain knowledge to SUGGEST diagnostic steps, not just record what the user says.
- When the user describes a problem area, demonstrate understanding by naming specific sub-categories, common causes, and relevant tools.
- Challenge assumptions constructively: "Before we go down that path, have you considered checking X first? In my experience, that resolves 60% of these cases."
- Capture SPECIFIC commands with exact syntax. Not "check the service" but "Get-Service ADSync | Select-Object Status, StartType".
- Capture SPECIFIC commands with exact syntax (PowerShell/CLI invocations the engineer would actually paste into a shell), not vague directives like "check the service".
- Include expected outcomes for every action: what does success look like?
- Surface edge cases proactively: "What about multi-forest environments?" or "Does this change if they have conditional access policies?"
- Explain WHY the diagnostic order matters: "We check connectivity before auth because a network issue masquerades as an auth failure."
@@ -74,7 +74,7 @@ STRUCTURAL RULES:
- All IDs must be unique strings (use descriptive slugs like "check-service-status")
CROSS-REFERENCE / LOOP-BACK PATTERN:
When a troubleshooting path needs to loop back (e.g., after remediation, re-verify from an earlier checkpoint), set next_node_id to the target node's ID. Example: an action node "restart-ssh-service" can set next_node_id to "verify-ssh-connection" (an ancestor decision node) to create a re-verification loop.
When a troubleshooting path needs to loop back (e.g., after remediation, re-verify from an earlier checkpoint), set next_node_id to the target node's ID — including ancestor decision nodes for re-verification loops. The target ID must already exist somewhere in the tree.
"""
INTERVIEW_PROTOCOL = """
@@ -85,7 +85,7 @@ Ask broad questions to understand the problem domain and scope:
- What type of issue is this flow for?
- Who is the target audience? (Tier 1 help desk, Tier 2, Tier 3?)
- What environment assumptions? (On-prem, hybrid, specific vendors?)
Demonstrate domain expertise immediately. If the user says "Azure AD Sync failures," show understanding: "Are you primarily seeing password hash sync issues, object attribute sync failures, or full directory sync errors?"
Demonstrate domain expertise immediately. When the user names a technology, ask a follow-up that proves you know its common failure modes — a sub-categorization question that only someone fluent in that area would think to ask. Use vocabulary native to whatever the user actually mentioned, not stock examples from past conversations.
DO NOT emit [TREE_UPDATE] during scoping. You are still understanding the problem.
PHASE 2 - DISCOVERY (current_phase: discovery):
@@ -130,7 +130,7 @@ Your response is natural conversational text. When the tree structure changes, i
3. Metadata capture (when you learn the flow's name, description, or tags):
[METADATA]
{"name": "...", "description": "...", "tags": ["..."]}
{"name": "<flow name>", "description": "<one-sentence summary>", "tags": ["<tag1>", "<tag2>"]}
[/METADATA]
IMPORTANT:
@@ -172,8 +172,8 @@ STRUCTURAL RULES:
- All IDs must be unique descriptive slugs (e.g., "check-dns-resolution", not UUIDs)
- The last step MUST be type "procedure_end"
- Use section_headers to organize steps into logical phases
- Commands are arrays of objects: [{"code": "Get-Service ADSync", "label": "Check sync service", "language": "powershell"}]
- Descriptions support [VAR:variable_name] interpolation for intake form variables (e.g., "Connect to [VAR:server_name] via RDP")
- Commands are arrays of objects: [{"code": "<exact command>", "label": "<short label>", "language": "powershell|bash|cmd"}]
- Descriptions support [VAR:variable_name] interpolation for intake form variables. Pick variable names that fit the procedure being built — do not reuse names from prior conversations.
VARIABLE INTERPOLATION:
When the procedure needs per-execution input (server name, IP address, client name, etc.), use [VAR:variable_name] syntax in descriptions and commands. These map to intake form fields that the engineer fills in before starting.
@@ -188,7 +188,7 @@ Understand the process being documented:
- Who will execute it? (Tier 1 help desk, Tier 2, senior engineers?)
- What environment context? (Specific vendor, on-prem vs cloud, tools available?)
- Will this need per-execution input? (server name, client info, IP addresses → intake form fields)
Demonstrate domain expertise: if the user says "Exchange Online mailbox migration," show understanding: "Are we covering full tenant-to-tenant migration, on-prem to Exchange Online cutover, or individual mailbox moves with hybrid?"
Demonstrate domain expertise: when the user names a process, ask a sub-categorization question that distinguishes which variant of that process they mean (the variants will differ by technology — use vocabulary specific to whatever the user mentioned, not examples from prior chats).
DO NOT emit [STEPS_UPDATE] during scoping. You are still understanding the process.
PHASE 2 - DISCOVERY (current_phase: discovery):
@@ -238,12 +238,12 @@ Your response is natural conversational text. When the step structure changes, i
3. Metadata capture (when you learn the flow's name, description, or tags):
[METADATA]
{"name": "...", "description": "...", "tags": ["..."]}
{"name": "<flow name>", "description": "<one-sentence summary>", "tags": ["<tag1>", "<tag2>"]}
[/METADATA]
4. Intake form suggestion (when intake form fields are identified):
[INTAKE_FORM]
[{"variable_name": "server_name", "label": "Server Name", "field_type": "text", "required": true, "placeholder": "e.g., DC01", "group_name": "Server Details", "display_order": 1}]
[{"variable_name": "<snake_case_name>", "label": "<Human Label>", "field_type": "text|password|select|textarea|number|boolean", "required": true|false, "placeholder": "<short hint, optional>", "group_name": "<section heading, optional>", "display_order": <integer>}]
[/INTAKE_FORM]
IMPORTANT:
@@ -659,12 +659,12 @@ Requirements:
Also provide metadata as a separate JSON object after the steps:
[METADATA]
{"name": "...", "description": "...", "tags": ["..."]}
{"name": "<flow name>", "description": "<one-sentence summary>", "tags": ["<tag1>", "<tag2>"]}
[/METADATA]
If we discussed intake form fields, also include:
[INTAKE_FORM]
[{"variable_name": "server_name", "label": "Server Name", "field_type": "text", "required": true, "placeholder": "e.g., DC01", "group_name": "Server Details", "display_order": 1}]
[{"variable_name": "<snake_case_name>", "label": "<Human Label>", "field_type": "text|password|select|textarea|number|boolean", "required": true|false, "placeholder": "<short hint, optional>", "group_name": "<section heading, optional>", "display_order": <integer>}]
[/INTAKE_FORM]"""
else:
generation_instruction = """Based on our entire conversation, generate the COMPLETE and FINAL TreeStructure JSON for this flow.
@@ -681,7 +681,7 @@ Requirements:
Also provide metadata as a separate JSON object after the tree:
[METADATA]
{"name": "...", "description": "...", "tags": ["..."]}
{"name": "<flow name>", "description": "<one-sentence summary>", "tags": ["<tag1>", "<tag2>"]}
[/METADATA]"""
provider_messages.append({"role": "user", "content": generation_instruction})


@@ -199,7 +199,10 @@ async def generate_fixes(
try:
text, in_tok, out_tok = await provider.generate_json(
system_prompt=FIX_SYSTEM_PROMPT,
system_prompt=[
{"type": "text", "text": FIX_SYSTEM_PROMPT},
# cacheable: stable constant across all fix attempts
],
messages=messages,
max_tokens=2048,
)
@@ -232,7 +235,11 @@ async def generate_fixes(
try:
text2, in_tok2, out_tok2 = await provider.generate_json(
system_prompt=FIX_SYSTEM_PROMPT,
system_prompt=[
{"type": "text", "text": FIX_SYSTEM_PROMPT},
# cacheable: stable constant; retry reads the cached
# system block from the first attempt above
],
messages=messages,
max_tokens=2048,
)


@@ -3,16 +3,169 @@ AI Provider abstraction layer.
Supports Gemini (google-genai) and Anthropic (anthropic) as interchangeable
backends for JSON generation used by the AI Flow Builder.
## Prompt caching (Anthropic only)
Callers may pass `system_prompt` as either:
- `str` — backward-compatible, uncached.
- `list[SystemBlock]` — Anthropic structured system blocks. Each block is a
dict of shape `{"type": "text", "text": str, "cache_control": {...}?}`.
Caching policy (policy α, per Phase 0.1 design):
- If any block in the list carries an explicit `cache_control` key, that
caller-authored configuration is honored verbatim.
- If no block carries `cache_control`, the provider applies
`cache_control: {"type": "ephemeral"}` to the first block only. First block
is the common "large static prefix" case (e.g. system prompt, reference data).
Gemini ignores cache_control and concatenates list blocks into one system
string — callers should not rely on Gemini for cache-hit behavior.
TODO(phase0-verify): When a dev environment is available, verify cache-hit
behavior by hitting any FlowPilot endpoint twice within the 5-minute
ephemeral TTL. First call should emit `anthropic.cache` with
`cache_creation_input_tokens > 0`; second call with `cache_read_input_tokens > 0`.
If the second call returns zero reads, inspect the prefix for silent
invalidators (timestamps, unsorted JSON keys, varying tool list ordering).
"""
import logging
from abc import ABC, abstractmethod
from collections.abc import AsyncIterator
from typing import Any
from app.core.config import settings
logger = logging.getLogger(__name__)
# Anthropic structured system block. See module docstring for caching policy.
SystemBlock = dict[str, Any]
def _normalize_system_for_anthropic(
system_prompt: str | list[SystemBlock],
) -> str | list[SystemBlock]:
"""Return the value to pass as the `system=` parameter to the Anthropic API.
- Plain strings pass through untouched (uncached path).
- Lists are returned as structured system blocks. If no block in the list
carries an explicit `cache_control`, `cache_control: {"type": "ephemeral"}`
is applied to the FIRST block only (policy α).
- Caller-authored `cache_control` is never overwritten.
"""
if isinstance(system_prompt, str):
return system_prompt
if not system_prompt:
# Empty list is not a meaningful system prompt — pass empty string so
# Anthropic treats this as "no system prompt" rather than erroring.
return ""
blocks = [dict(b) for b in system_prompt]
already_cached = any("cache_control" in b for b in blocks)
if not already_cached:
blocks[0]["cache_control"] = {"type": "ephemeral"}
return blocks
def _flatten_system_for_gemini(
system_prompt: str | list[SystemBlock],
) -> str:
"""Gemini has no structured system blocks; concatenate list entries."""
if isinstance(system_prompt, str):
return system_prompt
return "\n\n".join(b.get("text", "") for b in system_prompt)
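
To make policy α concrete, here is a standalone sketch that copies the logic of the two helpers above under shorter names (`normalize_for_anthropic`, `flatten_for_gemini`) and runs them on hypothetical prompt text. It is meant as an illustration of the policy, not as a replacement for the module's functions.

```python
from __future__ import annotations

from typing import Any

SystemBlock = dict[str, Any]


def normalize_for_anthropic(system_prompt: str | list[SystemBlock]) -> str | list[SystemBlock]:
    """Cache the first block unless the caller already placed a cache_control."""
    if isinstance(system_prompt, str):
        return system_prompt
    if not system_prompt:
        return ""
    blocks = [dict(b) for b in system_prompt]  # shallow copies; caller's dicts stay untouched
    if not any("cache_control" in b for b in blocks):
        blocks[0]["cache_control"] = {"type": "ephemeral"}
    return blocks


def flatten_for_gemini(system_prompt: str | list[SystemBlock]) -> str:
    """Join block texts with blank lines; cache_control is dropped."""
    if isinstance(system_prompt, str):
        return system_prompt
    return "\n\n".join(b.get("text", "") for b in system_prompt)


# Hypothetical prompt text, for illustration only.
implicit = [
    {"type": "text", "text": "STATIC RULES"},
    {"type": "text", "text": "per-request context"},
]
explicit = [
    {"type": "text", "text": "STATIC RULES"},
    {"type": "text", "text": "REFERENCE DATA", "cache_control": {"type": "ephemeral"}},
]

implicit_out = normalize_for_anthropic(implicit)  # first block gains cache_control
explicit_out = normalize_for_anthropic(explicit)  # returned unchanged
gemini_text = flatten_for_gemini(implicit)
```

Because the default only fires when no block carries `cache_control`, a caller who marks the second block keeps full control of where the breakpoint sits, and the first block is left unmarked.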
def build_anthropic_chat_messages(
history: list[dict[str, Any]],
new_message: str,
images: list[dict[str, Any]] | None = None,
format_reminder: str | None = None,
) -> list[dict[str, Any]]:
"""Construct the Anthropic `messages` payload for a cached multi-turn chat.
Responsibilities:
- Copy the valid history messages in order.
- Apply `cache_control: ephemeral` to the LAST history message so the entire
conversation prefix is cached across turns. The new user message stays
uncached (it changes each turn).
- Append `format_reminder` to the new user message if provided. The reminder
is invisible to storage (caller's concern) but helps enforce structured
output compliance at generation time.
- If `images` are provided, render the new user message as a multimodal
content block list (images first, then text). Otherwise, render it as
a plain string.
This helper is Anthropic-specific: the cache-breakpoint pattern, ephemeral
cache_control, and multimodal block shape are all Anthropic conventions.
Do not call it from Gemini code paths.
"""
messages: list[dict[str, Any]] = []
for msg in history:
messages.append({"role": msg["role"], "content": msg["content"]})
# Cache breakpoint on the last existing history message so the entire
# conversation prefix is cached across turns. Safe only when there IS a
# history message; otherwise the new message is the only message.
if messages:
last = messages[-1]
messages[-1] = {
"role": last["role"],
"content": [
{
"type": "text",
"text": last["content"],
"cache_control": {"type": "ephemeral"},
}
],
}
effective_text = new_message + (format_reminder or "")
if images:
content_blocks: list[dict[str, Any]] = []
for img in images:
content_blocks.append(
{
"type": "image",
"source": {
"type": "base64",
"media_type": img["media_type"],
"data": img["data"],
},
}
)
content_blocks.append({"type": "text", "text": effective_text})
messages.append({"role": "user", "content": content_blocks})
else:
messages.append({"role": "user", "content": effective_text})
return messages
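
The payload shape that `build_anthropic_chat_messages` produces is easier to see on a tiny conversation. The sketch below is a condensed, text-only version of the helper (no images, no format reminder) under the hypothetical name `mark_history_breakpoint`. It assumes every history `content` is a plain string, and the message text is made up.

```python
from __future__ import annotations

from typing import Any


def mark_history_breakpoint(history: list[dict[str, Any]], new_message: str) -> list[dict[str, Any]]:
    """Copy history, put the cache breakpoint on its last message, append the new turn."""
    messages = [{"role": m["role"], "content": m["content"]} for m in history]
    if messages:
        last = messages[-1]
        messages[-1] = {
            "role": last["role"],
            "content": [
                {
                    "type": "text",
                    "text": last["content"],
                    "cache_control": {"type": "ephemeral"},
                }
            ],
        }
    # The new user turn changes every request, so it stays a plain uncached string.
    messages.append({"role": "user", "content": new_message})
    return messages


# Hypothetical conversation, for illustration only.
history = [
    {"role": "user", "content": "Printer is offline"},
    {"role": "assistant", "content": "Is it reachable by ping?"},
]
payload = mark_history_breakpoint(history, "No, ping times out")
first_turn = mark_history_breakpoint([], "Printer is offline")
```

Only the last history message is rewritten into block form; earlier messages and the new turn keep string content, and an empty history yields a single uncached user message.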
def _log_anthropic_cache_usage(usage: Any, model: str) -> None:
"""Emit a structured log line capturing cache_read / cache_creation tokens."""
cache_read = getattr(usage, "cache_read_input_tokens", 0) or 0
cache_creation = getattr(usage, "cache_creation_input_tokens", 0) or 0
input_tokens = getattr(usage, "input_tokens", 0) or 0
output_tokens = getattr(usage, "output_tokens", 0) or 0
if cache_read or cache_creation:
logger.info(
"anthropic.cache",
extra={
"event": "anthropic.cache",
"model": model,
"cache_read_input_tokens": cache_read,
"cache_creation_input_tokens": cache_creation,
"input_tokens": input_tokens,
"output_tokens": output_tokens,
},
)
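
The module docstring's verification note (a cache write on the first call, a cache read on the second) can be turned into a small check over the same usage fields this logger reads. The sketch below is one possible helper, not part of the commit: `classify_cache_usage` is a hypothetical name, the usage objects are `SimpleNamespace` fakes, and the token counts are invented.

```python
from types import SimpleNamespace
from typing import Any


def classify_cache_usage(usage: Any) -> str:
    """Label a usage object as 'hit', 'write', or 'uncached'."""
    # Same defensive reads as the logger: a missing attribute or None counts as 0.
    read = getattr(usage, "cache_read_input_tokens", 0) or 0
    created = getattr(usage, "cache_creation_input_tokens", 0) or 0
    if read:
        return "hit"  # at least part of the prefix came from cache
    if created:
        return "write"  # prefix was cached by this call
    return "uncached"


# Fake usage objects with invented token counts, for illustration only.
first_call = SimpleNamespace(cache_creation_input_tokens=2400, cache_read_input_tokens=0)
second_call = SimpleNamespace(cache_creation_input_tokens=0, cache_read_input_tokens=2400)
no_cache_fields = SimpleNamespace(input_tokens=50, output_tokens=10)
none_fields = SimpleNamespace(cache_creation_input_tokens=None, cache_read_input_tokens=None)

labels = [classify_cache_usage(u) for u in (first_call, second_call, no_cache_fields, none_fields)]
```

A second call that comes back as `"write"` or `"uncached"` would be the signal to look for the silent invalidators the docstring lists.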
class AIProvider(ABC):
"""Abstract base class for AI providers."""
@@ -20,14 +173,16 @@ class AIProvider(ABC):
@abstractmethod
async def generate_json(
self,
system_prompt: str,
messages: list[dict[str, str]],
system_prompt: str | list[SystemBlock],
messages: list[dict[str, Any]],
max_tokens: int = 4096,
) -> tuple[str, int, int]:
"""Generate a JSON response from the AI model.
Args:
system_prompt: System-level instruction for the model.
system_prompt: System-level instruction. Plain `str` is uncached
(Anthropic) or used as-is (Gemini). `list[SystemBlock]` enables
Anthropic prompt caching per module-docstring policy.
messages: List of message dicts with "role" and "content" keys.
max_tokens: Maximum output tokens.
@@ -39,37 +194,25 @@ class AIProvider(ABC):
@abstractmethod
async def generate_text(
self,
system_prompt: str,
messages: list[dict[str, str]],
system_prompt: str | list[SystemBlock],
messages: list[dict[str, Any]],
max_tokens: int = 4096,
) -> tuple[str, int, int]:
"""Generate a text response from the AI model (no JSON constraint).
Args:
system_prompt: System-level instruction for the model.
messages: List of message dicts with "role" and "content" keys.
max_tokens: Maximum output tokens.
Returns:
Tuple of (response_text, input_tokens, output_tokens).
See `generate_json` for argument semantics.
"""
...
async def generate_text_stream(
self,
system_prompt: str,
messages: list[dict[str, str]],
system_prompt: str | list[SystemBlock],
messages: list[dict[str, Any]],
max_tokens: int = 4096,
) -> "AsyncIterator[str]":
"""Stream a text response token by token.
Args:
system_prompt: System-level instruction for the model.
messages: List of message dicts with "role" and "content" keys.
max_tokens: Maximum output tokens.
Yields:
Text chunks as they are generated.
See `generate_json` for argument semantics.
"""
raise NotImplementedError("Streaming not supported for this provider")
# Make this an async generator to satisfy type checker
@@ -85,14 +228,15 @@ class GeminiProvider(AIProvider):
async def generate_json(
self,
system_prompt: str,
messages: list[dict[str, str]],
system_prompt: str | list[SystemBlock],
messages: list[dict[str, Any]],
max_tokens: int = 4096,
) -> tuple[str, int, int]:
from google import genai
from google.genai import types as genai_types
client = genai.Client(api_key=self._api_key)
system_text = _flatten_system_for_gemini(system_prompt)
# Convert messages to Gemini Content format
contents: list[genai_types.Content] = []
@@ -106,7 +250,7 @@ class GeminiProvider(AIProvider):
)
config = genai_types.GenerateContentConfig(
system_instruction=system_prompt,
system_instruction=system_text,
max_output_tokens=max_tokens,
response_mime_type="application/json",
)
@@ -137,14 +281,15 @@ class GeminiProvider(AIProvider):
async def generate_text(
self,
system_prompt: str,
messages: list[dict[str, str]],
system_prompt: str | list[SystemBlock],
messages: list[dict[str, Any]],
max_tokens: int = 4096,
) -> tuple[str, int, int]:
from google import genai
from google.genai import types as genai_types
client = genai.Client(api_key=self._api_key)
system_text = _flatten_system_for_gemini(system_prompt)
contents: list[genai_types.Content] = []
for msg in messages:
@@ -157,7 +302,7 @@ class GeminiProvider(AIProvider):
)
config = genai_types.GenerateContentConfig(
system_instruction=system_prompt,
system_instruction=system_text,
max_output_tokens=max_tokens,
# No response_mime_type — allow free-form text
)
@@ -214,16 +359,17 @@ class AnthropicProvider(AIProvider):
async def generate_json(
self,
system_prompt: str,
messages: list[dict[str, str]],
system_prompt: str | list[SystemBlock],
messages: list[dict[str, Any]],
max_tokens: int = 4096,
) -> tuple[str, int, int]:
client = _get_anthropic_client(self._api_key, self._timeout)
normalized_system = _normalize_system_for_anthropic(system_prompt)
response = await client.messages.create(
model=self._model,
max_tokens=max_tokens,
system=system_prompt,
system=normalized_system,
messages=messages,
)
@@ -231,12 +377,14 @@ class AnthropicProvider(AIProvider):
input_tokens = response.usage.input_tokens
output_tokens = response.usage.output_tokens
_log_anthropic_cache_usage(response.usage, self._model)
return text, input_tokens, output_tokens
async def generate_text(
self,
system_prompt: str,
messages: list[dict[str, str]],
system_prompt: str | list[SystemBlock],
messages: list[dict[str, Any]],
max_tokens: int = 4096,
) -> tuple[str, int, int]:
# Anthropic doesn't differentiate between JSON and text mode
@@ -244,20 +392,28 @@ class AnthropicProvider(AIProvider):
async def generate_text_stream(
self,
system_prompt: str,
messages: list[dict[str, str]],
system_prompt: str | list[SystemBlock],
messages: list[dict[str, Any]],
max_tokens: int = 4096,
) -> AsyncIterator[str]:
client = _get_anthropic_client(self._api_key, self._timeout)
normalized_system = _normalize_system_for_anthropic(system_prompt)
async with client.messages.stream(
model=self._model,
max_tokens=max_tokens,
system=system_prompt,
system=normalized_system,
messages=messages,
) as stream:
async for text in stream.text_stream:
yield text
# Per Anthropic SDK, get_final_message() resolves the stream's
# final usage object (including cache_read/cache_creation tokens).
try:
final = await stream.get_final_message()
_log_anthropic_cache_usage(final.usage, self._model)
except Exception as exc: # best-effort telemetry, never fail the stream
logger.debug("anthropic.cache streaming usage unavailable: %s", exc)
def get_ai_provider(model: str | None = None) -> AIProvider:


@@ -89,8 +89,10 @@ Additional rules:
5. Use unique node IDs prefixed with the branch context (e.g., "gpo-check-link")
6. Build the tree bottom-up in your head: create solution/leaf nodes first, then build parent nodes referencing their IDs
Few-shot example showing correct action node next_node_id usage:
{"id": "dns-root", "type": "decision", "question": "Can the client resolve any DNS names?", "help_text": "Run: nslookup google.com", "options": [{"id": "dns-opt-none", "label": "No — nslookup times out or returns 'server failed'", "next_node_id": "dns-check-service"}, {"id": "dns-opt-partial", "label": "Some names resolve but others fail", "next_node_id": "dns-check-specific"}], "children": [{"id": "dns-check-service", "type": "action", "title": "Check DNS Client Service", "description": "Verify the DNS Client service is running on the affected machine", "commands": ["Get-Service -Name Dnscache | Select-Object Status,StartType"], "expected_outcome": "Status should be Running", "next_node_id": "dns-service-solution"}, {"id": "dns-service-solution", "type": "solution", "title": "DNS Service Was Stopped", "description": "The DNS Client service was stopped, preventing all name resolution", "resolution_steps": ["Run: Start-Service Dnscache", "Set startup type: Set-Service Dnscache -StartupType Automatic", "Flush cache: ipconfig /flushdns", "Test: nslookup google.com"]}, {"id": "dns-check-specific", "type": "solution", "title": "Selective DNS Failure — Stale or Missing Records", "description": "Some records resolve correctly, indicating DNS is functional but specific records are stale or missing", "resolution_steps": ["Check DNS server for missing A/CNAME records", "Clear DNS cache on the DNS server: Clear-DnsServerCache", "Flush client cache: ipconfig /flushdns", "Verify with: nslookup <failing-hostname>"]}]}"""
SHAPE-ONLY schema example (do not copy this content verbatim — it shows
how IDs link, NOT what to ask or run; your real tree must reflect the
branch the user described):
{"id": "<root-slug>", "type": "decision", "question": "<diagnostic question for THIS branch>", "help_text": "<optional hint>", "options": [{"id": "<opt-1>", "label": "<observable answer 1>", "next_node_id": "<child-1>"}, {"id": "<opt-2>", "label": "<observable answer 2>", "next_node_id": "<child-2>"}], "children": [{"id": "<child-1>", "type": "action", "title": "<what to do>", "description": "<details>", "commands": ["<exact command for THIS branch>"], "expected_outcome": "<what success looks like>", "next_node_id": "<sibling-id>"}, {"id": "<sibling-id>", "type": "solution", "title": "<resolution title>", "description": "<resolution description>", "resolution_steps": ["<step 1>", "<step 2>"]}, {"id": "<child-2>", "type": "solution", "title": "<other resolution>", "description": "<...>", "resolution_steps": ["<step 1>"]}]}"""
CORRECTIVE_PROMPT_TEMPLATE = """Your previous JSON was invalid for ResolutionFlow's tree schema.
@@ -146,7 +148,10 @@ async def scaffold_branches(
user_message += f"Environment: {', '.join(tags)}\n"
raw_text, input_tokens, output_tokens = await provider.generate_json(
system_prompt=SCAFFOLD_SYSTEM_PROMPT,
system_prompt=[
{"type": "text", "text": SCAFFOLD_SYSTEM_PROMPT},
# cacheable: stable constant across all scaffold calls
],
messages=[{"role": "user", "content": user_message}],
max_tokens=2048,
)
@@ -207,7 +212,13 @@ async def generate_branch_detail(
for attempt in range(3):
raw_text, input_tokens, output_tokens = await provider.generate_json(
system_prompt=BRANCH_DETAIL_SYSTEM_PROMPT,
system_prompt=[
{"type": "text", "text": BRANCH_DETAIL_SYSTEM_PROMPT},
# cacheable: stable constant. Retries in this loop re-read the
# cached system block rather than paying full input cost each
# attempt; the ~2.5k-token prompt with its shape-only schema example
# is the dominant cost here.
],
messages=messages,
max_tokens=8192,
)


@@ -12,10 +12,19 @@ async def log_audit(
resource_type: str,
resource_id: Optional[UUID] = None,
details: Optional[dict] = None,
account_id: Optional[UUID] = None,
) -> None:
"""Record an audit log entry. Does not commit — piggybacks on the caller's commit."""
if account_id is None:
# Derive from the acting user's account as a fallback (one extra query).
from sqlalchemy import select
from app.models.user import User
result = await db.execute(select(User.account_id).where(User.id == user_id))
account_id = result.scalar_one()
entry = AuditLog(
user_id=user_id,
account_id=account_id,
action=action,
resource_type=resource_type,
resource_id=resource_id,


@@ -23,10 +23,33 @@ class Settings(BaseSettings):
return v.replace("postgresql://", "postgresql+asyncpg://", 1)
return v
@property
def DATABASE_URL_SYNC(self) -> str:
"""Get sync URL by removing asyncpg prefix from DATABASE_URL."""
return self.DATABASE_URL.replace("postgresql+asyncpg://", "postgresql://", 1)
# Sync URL for Alembic migrations. Defaults to DATABASE_URL (sync-converted).
# Set explicitly in .env to use a different role for migrations (e.g. superuser)
# when DATABASE_URL has been switched to the app role.
DATABASE_URL_SYNC: str = ""
@field_validator("DATABASE_URL_SYNC", mode="before")
@classmethod
def default_database_url_sync(cls, v: str, info) -> str:
"""Fall back to sync-converted DATABASE_URL if not explicitly set."""
if not v:
base = info.data.get("DATABASE_URL", "")
return base.replace("postgresql+asyncpg://", "postgresql://", 1)
return v
# Admin database — resolutionflow_admin role, BYPASSRLS.
# Used by /admin/* endpoints. Defaults to DATABASE_URL for local dev.
ADMIN_DATABASE_URL: str = ""
@field_validator("ADMIN_DATABASE_URL", mode="before")
@classmethod
def default_admin_database_url(cls, v: str, info) -> str:
"""Fall back to DATABASE_URL if ADMIN_DATABASE_URL is not set."""
if not v:
return info.data.get("DATABASE_URL", "")
if v.startswith("postgresql://"):
return v.replace("postgresql://", "postgresql+asyncpg://", 1)
return v
# JWT Settings
SECRET_KEY: str = _DEFAULT_SECRET_KEY
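The fallback rules in the two validators above can be restated as plain functions, which makes them easy to check without loading `Settings`. This is a sketch only: the function names are hypothetical and the connection strings in the usage note are made-up examples.

```python
ASYNC_PREFIX = "postgresql+asyncpg://"
SYNC_PREFIX = "postgresql://"


def to_sync_url(url: str) -> str:
    """Strip the asyncpg driver suffix, as the DATABASE_URL_SYNC fallback does."""
    return url.replace(ASYNC_PREFIX, SYNC_PREFIX, 1)


def to_async_url(url: str) -> str:
    """Add the asyncpg driver suffix to a bare postgresql:// URL."""
    if url.startswith(SYNC_PREFIX):
        return url.replace(SYNC_PREFIX, ASYNC_PREFIX, 1)
    return url


def resolve_sync_url(explicit: str, database_url: str) -> str:
    """An explicit value wins unchanged; otherwise derive it from DATABASE_URL."""
    return explicit or to_sync_url(database_url)


def resolve_admin_url(explicit: str, database_url: str) -> str:
    """Unset falls back to DATABASE_URL; an explicit value is made async."""
    if not explicit:
        return database_url
    return to_async_url(explicit)
```

For example, `resolve_sync_url("", "postgresql+asyncpg://app@db/rf")` yields `postgresql://app@db/rf`, while an explicitly configured sync URL is returned unchanged. Note that the validators read `info.data`, which only holds fields validated earlier, so they rely on `DATABASE_URL` being declared before both fields.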
@@ -106,6 +129,23 @@ class Settings(BaseSettings):
"kb_convert": "standard",
"script_build": "standard",
"network_diagram_generate": "standard",
# FlowPilot migration Phase 2 — short, latency-sensitive transformation
# of an engineer's answer/check output into a candidate fact.
# Doc Section 6.6 sets Haiku as the default; instrumentation tracks
# disputed_fact_rate so we can escalate to Sonnet if quality drops.
"fact_synthesis": "fast",
# FlowPilot migration Phase 3 — resolution-note preview that ships to
# the customer ticket. Sonnet because customer-facing artifact quality
# matters more than latency; the in-process state_version cache keeps
# cost manageable.
"resolution_note": "standard",
# FlowPilot migration Phase 4 — escalation handoff package. Parallel
# to resolution_note: Sonnet, same cache story, no MCP.
"escalation_package": "standard",
# FlowPilot migration Phase 5 — extract a parameter schema from a
# concrete rendered script so a draft_template can be proposed.
# Creates a persistent library artifact on accept, so Sonnet.
"template_extraction": "standard",
}
def get_model_for_action(self, action_type: str) -> str:
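The body of `get_model_for_action` is cut off by the hunk boundary. As an illustration of the two-step lookup the comments describe (action to tier, tier to model), here is a sketch under stated assumptions: the model identifiers are placeholders, and falling back to the `standard` tier for unknown actions is a guess rather than the commit's behaviour.

```python
# Tier assignments copied from the four entries added in the hunk above.
ACTION_TIERS = {
    "fact_synthesis": "fast",
    "resolution_note": "standard",
    "escalation_package": "standard",
    "template_extraction": "standard",
}

# Placeholder model names; the real identifiers are not shown in this diff.
TIER_MODELS = {
    "fast": "example-fast-model",
    "standard": "example-standard-model",
}


def get_model_for_action(action_type: str, default_tier: str = "standard") -> str:
    """Resolve an action to a model via its tier (assumed fallback: standard)."""
    tier = ACTION_TIERS.get(action_type, default_tier)
    return TIER_MODELS[tier]
```

Keeping the tier as an intermediate step means an action can be escalated from `fast` to `standard` by changing one mapping entry, which matches the escalation path the `fact_synthesis` comment mentions.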


@@ -1,6 +1,7 @@
from sqlalchemy.ext.asyncio import AsyncSession, create_async_engine, async_sessionmaker
from sqlalchemy.orm import DeclarativeBase
from .config import settings
from app.core.tenant_context import register_tenant_listener
# Create async engine
engine = create_async_engine(
@@ -16,6 +17,11 @@ async_session_maker = async_sessionmaker(
expire_on_commit=False
)
# Register the RLS tenant context listener on the app engine.
# Fires at the start of every transaction; issues set_config automatically.
# Must NOT be called on admin_engine — admin connections bypass RLS.
register_tenant_listener(engine)
class Base(DeclarativeBase):
"""Base class for all database models."""


@@ -1,10 +1,12 @@
"""
Centralized query filters for ResolutionFlow.
Provides reusable SQLAlchemy filter builders for tree access control
and step visibility, used across multiple endpoint modules.
Provides reusable SQLAlchemy filter builders for tree access control,
step visibility, and the canonical tenant_filter used by all queries
on tenant-scoped tables.
"""
from __future__ import annotations
import uuid
from typing import TYPE_CHECKING
from sqlalchemy import or_, and_, true as sa_true
@@ -13,6 +15,18 @@ if TYPE_CHECKING:
from app.models.user import User
def tenant_filter(model, account_id: uuid.UUID):
"""Primary app-layer tenant filter.
MUST be used in every SELECT/UPDATE/DELETE on tenant tables.
RLS (Phase 2) is the safety net — this is the primary enforcement.
Usage:
stmt = select(Tree).where(tenant_filter(Tree, current_user.account_id), ...)
"""
return model.account_id == account_id
def build_tree_access_filter(current_user: User):
"""Build the access filter for trees based on user permissions.
@@ -36,10 +50,11 @@ def build_tree_access_filter(current_user: User):
Tree.author_id == current_user.id,
]
if current_user.account_id:
# Team-visible trees: use tenant_filter as the account match
conditions.append(
and_(
Tree.visibility == 'team',
Tree.account_id == current_user.account_id
tenant_filter(Tree, current_user.account_id),
)
)
return or_(*conditions)
@@ -58,11 +73,14 @@ def build_step_visibility_filter(current_user: User):
if current_user.account_id:
return or_(
StepLibrary.visibility == 'public',
and_(StepLibrary.visibility == 'team', StepLibrary.account_id == current_user.account_id),
StepLibrary.created_by == current_user.id # Own private steps
and_(
StepLibrary.visibility == 'team',
tenant_filter(StepLibrary, current_user.account_id),
),
StepLibrary.created_by == current_user.id,
)
else:
return or_(
StepLibrary.visibility == 'public',
StepLibrary.created_by == current_user.id
StepLibrary.created_by == current_user.id,
)


@@ -153,48 +153,29 @@ Identify values that would change between executions (server names, IPs, usernam
## Output Format
Return a JSON object:
Return a JSON object with this SHAPE (DO NOT copy the placeholders below
verbatim — fill each field with content derived from the actual KB article
the engineer attached, NOT from this schema):
```json
{
"title": "Procedure title derived from the article",
"description": "Brief description of what this procedure accomplishes",
"title": "<procedure title derived from the article>",
"description": "<brief description of what this procedure accomplishes>",
"steps": [
{
"id": "unique-step-id",
"type": "step",
"content": "Open Server Manager and navigate to Add Roles on [VAR:server_name]",
"confidence": 0.95,
"source_excerpt": "Step 1: Open Server Manager on DC01..."
},
{
"id": "warning-dns",
"type": "warning",
"content": "WARNING: This will restart DNS and cause brief connectivity loss",
"confidence": 0.90,
"source_excerpt": "Note: Restarting DNS will cause a brief outage"
},
{
"id": "section-verification",
"type": "section_header",
"content": "Verification Steps",
"confidence": 1.0,
"source_excerpt": "Verification"
"id": "<unique-kebab-case-id>",
"type": "step|warning|section_header",
"content": "<step body — may include [VAR:<your_variable>] interpolation>",
"confidence": <float 0.0-1.0>,
"source_excerpt": "<the verbatim sentence/phrase from the article that this step came from>"
}
],
"intake_form": [
{
"variable_name": "server_name",
"label": "Server Name",
"field_type": "text",
"required": true,
"display_order": 1
},
{
"variable_name": "ip_address",
"label": "IP Address",
"field_type": "text",
"required": true,
"display_order": 2
"variable_name": "<snake_case_name fitting THIS procedure>",
"label": "<Human Label>",
"field_type": "text|password|select|textarea|number|boolean",
"required": true|false,
"display_order": <integer>
}
]
}
@@ -425,7 +406,12 @@ async def convert_document(
try:
raw_text, input_tokens, output_tokens = await provider.generate_json(
system_prompt=system_prompt,
system_prompt=[
{"type": "text", "text": system_prompt},
# cacheable: one of two stable constants (TROUBLESHOOTING_SYSTEM_PROMPT
# or PROCEDURAL_SYSTEM_PROMPT) selected by target_type. Each
# variant caches independently by text content.
],
messages=[{"role": "user", "content": user_message}],
max_tokens=16384,
)
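Because the revised prompt now shows placeholders instead of a worked example, it may be worth checking the model's output against the same shape before using it. The commit does not include such a check; the following is a sketch with a hypothetical `validate_conversion` function, using only the enumerations that appear in the prompt text.

```python
import json
from typing import Any

STEP_TYPES = {"step", "warning", "section_header"}
FIELD_TYPES = {"text", "password", "select", "textarea", "number", "boolean"}


def _dict_items(value: Any) -> list[dict]:
    """Return only the dict members of a list; anything else yields nothing."""
    if not isinstance(value, list):
        return []
    return [item for item in value if isinstance(item, dict)]


def validate_conversion(raw_text: str) -> list[str]:
    """Return a list of problems found in the JSON output (empty if none)."""
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]

    problems: list[str] = []
    for key in ("title", "description"):
        value = data.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty {key}")
        elif value.startswith("<") and value.endswith(">"):
            # The prompt's placeholders are wrapped in angle brackets.
            problems.append(f"{key} looks like an unfilled placeholder")

    for index, step in enumerate(_dict_items(data.get("steps"))):
        if step.get("type") not in STEP_TYPES:
            problems.append(f"steps[{index}]: unknown type")
        conf = step.get("confidence")
        is_number = isinstance(conf, (int, float)) and not isinstance(conf, bool)
        if not is_number or not 0.0 <= conf <= 1.0:
            problems.append(f"steps[{index}]: confidence outside 0.0-1.0")

    for index, field in enumerate(_dict_items(data.get("intake_form"))):
        if field.get("field_type") not in FIELD_TYPES:
            problems.append(f"intake_form[{index}]: unknown field_type")
        if not isinstance(field.get("required"), bool):
            problems.append(f"intake_form[{index}]: required is not a boolean")
    return problems
```

The placeholder check targets the failure mode the new prompt wording warns about, where the model copies the schema text verbatim instead of filling it in.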


@@ -21,7 +21,7 @@ async def _fire_maintenance_schedule(schedule_id: str) -> None:
"""Create batch sessions for a scheduled maintenance run."""
# Import all models first to ensure SQLAlchemy mapper relationships resolve
import app.models # noqa: F401
from app.core.database import async_session_maker
from app.core.admin_database import _admin_session_factory as async_session_maker
from app.models.maintenance_schedule import MaintenanceSchedule
from app.models.session import Session
from app.models.target_list import TargetList
@@ -118,7 +118,7 @@ async def _fire_maintenance_schedule(schedule_id: str) -> None:
async def _cleanup_expired_ai_conversations() -> None:
"""Delete expired AI wizard conversations."""
import app.models # noqa: F401
from app.core.database import async_session_maker
from app.core.admin_database import _admin_session_factory as async_session_maker
from app.models.ai_conversation import AIConversation
async with async_session_maker() as db:


@@ -14,10 +14,16 @@ import logging
from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession
from app.core.admin_database import _admin_session_factory
logger = logging.getLogger(__name__)
SERVICE_ACCOUNT_EMAIL = "noreply@resolutionflow.com"
SERVICE_ACCOUNT_NAME = "ResolutionFlow"
# Well-known UUID for the platform account — owns all default/global content.
# Created by migration 3a40fe11b427_create_global_content_tables.
PLATFORM_ACCOUNT_ID = uuid.UUID("00000000-0000-0000-0000-000000000001")
SYSTEM_ACCOUNT_NAME = "ResolutionFlow System"
SYSTEM_ACCOUNT_DISPLAY_CODE = "RF-SYS-1"
@@ -48,40 +54,45 @@ async def _ensure_system_account(db: AsyncSession) -> uuid.UUID:
async def ensure_service_account(db: AsyncSession) -> uuid.UUID:
"""Ensure the ResolutionFlow service account exists and return its ID.
Idempotent — safe to call on every startup. Creates the account if it
does not exist. The account has no usable password and is_service_account=True
so it can never log in via normal auth flows.
Idempotent — safe to call on every startup. This lookup must bypass RLS
because startup runs before any request-scoped tenant context exists and
the users table is tenant-isolated in Phase 4. The service account is
normally created by Alembic migration 1490781700bc; the runtime create path
remains as a self-healing fallback for environments that predate that seed.
"""
_ = db # Retained for call-site compatibility in app lifespan startup.
from app.models.user import User
result = await db.execute(
select(User).where(User.email == SERVICE_ACCOUNT_EMAIL)
)
user = result.scalar_one_or_none()
async with _admin_session_factory() as admin_db:
result = await admin_db.execute(
select(User).where(User.email == SERVICE_ACCOUNT_EMAIL)
)
user = result.scalar_one_or_none()
if user is not None:
if not user.is_service_account:
user.is_service_account = True
await db.commit()
return user.id
if user is not None:
if not user.is_service_account:
user.is_service_account = True
await admin_db.commit()
return user.id
account_id = await _ensure_system_account(db)
account_id = await _ensure_system_account(admin_db)
new_user = User(
id=uuid.uuid4(),
email=SERVICE_ACCOUNT_EMAIL,
name=SERVICE_ACCOUNT_NAME,
password_hash="!service-account-no-login", # bcrypt can't produce this prefix
role="engineer",
is_super_admin=False,
is_team_admin=False,
is_active=True,
is_service_account=True,
must_change_password=False,
account_id=account_id,
account_role="engineer",
)
db.add(new_user)
await db.commit()
logger.info(f"[service_account] Created service account (id={new_user.id})")
return new_user.id
new_user = User(
id=uuid.uuid4(),
email=SERVICE_ACCOUNT_EMAIL,
name=SERVICE_ACCOUNT_NAME,
password_hash="!service-account-no-login", # bcrypt can't produce this prefix
role="engineer",
is_super_admin=False,
is_team_admin=False,
is_active=True,
is_service_account=True,
must_change_password=False,
account_id=account_id,
account_role="engineer",
)
admin_db.add(new_user)
await admin_db.commit()
logger.info(f"[service_account] Created service account (id={new_user.id})")
return new_user.id


@@ -0,0 +1,92 @@
# backend/app/core/tenant_context.py
"""
Per-request tenant context for row-level security.
Flow:
1. require_tenant_context (FastAPI dep) calls set_current_account_id().
2. The SQLAlchemy transaction-begin listener fires on every new transaction
and calls set_config('app.current_account_id', <id>, true) automatically.
3. PostgreSQL RLS policies read current_setting('app.current_account_id', TRUE)
to filter rows.
The ContextVar is asyncio-task-scoped: each concurrent request has its own value.
set_config with is_local=true is transaction-scoped: it resets on COMMIT or
ROLLBACK, so the listener re-applies it at the start of every transaction.
"""
import contextvars
from typing import TYPE_CHECKING
from uuid import UUID
from sqlalchemy import event, or_, text
from sqlalchemy.ext.asyncio import AsyncEngine
if TYPE_CHECKING:
from app.models.user import User
# One slot per async task — each concurrent request gets its own value.
_current_account_id: contextvars.ContextVar[str | None] = contextvars.ContextVar(
"current_account_id", default=None
)
# Platform account — global content visible to all tenants.
PLATFORM_ACCOUNT_ID = UUID("00000000-0000-0000-0000-000000000001")
def set_current_account_id(account_id: UUID) -> contextvars.Token:
"""Set tenant context for the current request coroutine.
Returns a token so the caller can reset it after the request.
"""
return _current_account_id.set(str(account_id))
def clear_current_account_id(token: contextvars.Token) -> None:
"""Reset the ContextVar to its previous value (call in finally block)."""
_current_account_id.reset(token)
def get_current_account_id() -> str | None:
"""Return the account_id string for the current request, or None."""
return _current_account_id.get()
def register_tenant_listener(engine: AsyncEngine) -> None:
"""Register the transaction-begin listener on the given engine.
Must be called once at application startup, AFTER the engine is created.
The listener issues set_config() at the start of every transaction so that
the setting is re-applied automatically even when a request commits
mid-flight and starts a new transaction.
Do NOT call this on admin_engine — admin connections must never set tenant
context automatically.
"""
@event.listens_for(engine.sync_engine, "begin")
def _on_transaction_begin(conn) -> None: # noqa: ANN001
account_id = _current_account_id.get()
if account_id:
# set_config(name, value, is_local=true) ≡ SET LOCAL.
# Unlike SET LOCAL, set_config IS parameterisable.
conn.execute(
text("SELECT set_config('app.current_account_id', :id, true)"),
{"id": account_id},
)
# If no account_id is set, do nothing. The RLS policy falls back to a
# null-matching UUID and returns zero rows — fail-closed behaviour.
def tenant_filter(Model, current_user: "User"): # noqa: ANN001
"""SQLAlchemy filter clause for tables that contain platform-owned rows.
Use for: tree_tags, tree_categories, step_categories, step_library,
template_trees, platform_steps.
For tenant-only tables (trees, sessions, psa_connections, etc.) use:
Model.account_id == current_user.account_id
directly.
"""
return or_(
Model.account_id == current_user.account_id,
Model.account_id == PLATFORM_ACCOUNT_ID,
)
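The module docstring's claim that the ContextVar is task-scoped can be demonstrated with the standard library alone. The sketch below reuses the variable name from the module but is otherwise illustrative: `handle_request` stands in for a request handler and is not part of the commit, and no database is involved.

```python
import asyncio
import contextvars
from typing import Optional

_current_account_id: contextvars.ContextVar[Optional[str]] = contextvars.ContextVar(
    "current_account_id", default=None
)


async def handle_request(account_id: str, delay: float) -> Optional[str]:
    """Set the tenant, yield to other tasks, then read the value back."""
    token = _current_account_id.set(account_id)
    try:
        # Sleeping lets the other request run and set its own value meanwhile.
        await asyncio.sleep(delay)
        return _current_account_id.get()
    finally:
        # Mirrors clear_current_account_id(token) in the module above.
        _current_account_id.reset(token)


async def main() -> list:
    # gather() wraps each coroutine in its own task with a copied context.
    results = await asyncio.gather(
        handle_request("account-a", 0.02),
        handle_request("account-b", 0.01),
    )
    # The outer context never saw either value.
    return list(results) + [_current_account_id.get()]


observed = asyncio.run(main())
```

Each handler reads back its own account even though the two overlap in time, and the value outside both tasks stays `None`, which is the state in which the listener issues no `set_config` call.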


@@ -25,7 +25,8 @@ if settings.SENTRY_DSN:
),
)
from app.core.database import init_db, async_session_maker
from app.core.database import init_db
from app.core.admin_database import _admin_session_factory as async_session_maker
from app.core.logging_config import setup_logging
from app.core.middleware import RequestLoggingMiddleware, ErrorLoggingMiddleware
from app.core.security_headers import SecurityHeadersMiddleware


@@ -54,8 +54,14 @@ from .session_branch import SessionBranch
from .fork_point import ForkPoint
from .session_handoff import SessionHandoff
from .session_resolution_output import SessionResolutionOutput
from .template_tree import TemplateTree
from .platform_step import PlatformStep
from .device_type import DeviceType
from .network_diagram import NetworkDiagram
from .session_fact import SessionFact
from .session_suggested_fix import SessionSuggestedFix
from .draft_template import DraftTemplate
from .account_settings import AccountSettings
__all__ = [
"User",
@@ -124,6 +130,12 @@ __all__ = [
"ForkPoint",
"SessionHandoff",
"SessionResolutionOutput",
"TemplateTree",
"PlatformStep",
"DeviceType",
"NetworkDiagram",
"SessionFact",
"SessionSuggestedFix",
"DraftTemplate",
"AccountSettings",
]


@@ -0,0 +1,99 @@
"""Per-account settings with a JSONB preferences grab-bag.
Rows are created lazily on first write. Reads of a missing row return the
caller-supplied default — no upfront row creation per account.
Settings live in `preferences` until they meet the promotion criteria in
Section 4.6 of FLOWPILOT-MIGRATION.md (hot path / validation / joins), at
which point a future migration adds a typed column and the helpers prefer it.
"""
from __future__ import annotations
import uuid
from datetime import datetime, timezone
from typing import Any, TYPE_CHECKING
from sqlalchemy import DateTime, ForeignKey, text
from sqlalchemy.orm import Mapped, mapped_column, relationship
from sqlalchemy.dialects.postgresql import UUID, JSONB, insert as pg_insert
from sqlalchemy.ext.asyncio import AsyncSession
from sqlalchemy.sql import select
from app.core.database import Base
if TYPE_CHECKING:
from app.models.account import Account
class AccountSettings(Base):
"""One row per account. Created lazily on first `set_setting` call."""
__tablename__ = "account_settings"
account_id: Mapped[uuid.UUID] = mapped_column(
UUID(as_uuid=True),
ForeignKey("accounts.id", ondelete="CASCADE"),
primary_key=True,
)
preferences: Mapped[dict[str, Any]] = mapped_column(
JSONB, nullable=False, default=dict, server_default=text("'{}'::jsonb")
)
created_at: Mapped[datetime] = mapped_column(
DateTime(timezone=True), default=lambda: datetime.now(timezone.utc)
)
updated_at: Mapped[datetime] = mapped_column(
DateTime(timezone=True),
default=lambda: datetime.now(timezone.utc),
onupdate=lambda: datetime.now(timezone.utc),
)
account: Mapped["Account"] = relationship("Account", foreign_keys=[account_id])
@classmethod
async def get_setting(
cls,
db: AsyncSession,
account_id: uuid.UUID,
key: str,
default: Any = None,
) -> Any:
"""Return preferences[key] for the account, or `default` if no row/no key.
Never creates a row — this is the pure-read path.
"""
result = await db.execute(
select(cls.preferences).where(cls.account_id == account_id)
)
prefs = result.scalar_one_or_none()
if prefs is None:
return default
return prefs.get(key, default)
@classmethod
async def set_setting(
cls,
db: AsyncSession,
account_id: uuid.UUID,
key: str,
value: Any,
) -> None:
"""Upsert preferences[key] = value for the account.
Creates the row on first write; on subsequent writes, merges the key
into the existing preferences JSON without clobbering other keys.
Uses PostgreSQL's `||` jsonb merge operator via ON CONFLICT DO UPDATE.
"""
stmt = pg_insert(cls).values(
account_id=account_id,
preferences={key: value},
)
stmt = stmt.on_conflict_do_update(
index_elements=[cls.account_id],
set_={
# Merge the new {key: value} into the existing preferences.
# The `||` operator on jsonb overwrites matching keys and keeps
# all other keys intact.
"preferences": cls.preferences.op("||")(stmt.excluded.preferences),
"updated_at": text("now()"),
},
)
await db.execute(stmt)
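The `||` operator that `set_setting` relies on merges only the top level of two jsonb objects. A plain-Python analogue makes the consequence visible; `merge_preferences` is a hypothetical stand-in for what PostgreSQL does server-side, not code from the commit.

```python
from typing import Any


def merge_preferences(existing: dict[str, Any], patch: dict[str, Any]) -> dict[str, Any]:
    """Python analogue of `existing || patch` for two jsonb objects."""
    # Top-level keys in `patch` overwrite; all other keys are kept.
    return {**existing, **patch}


prefs = {"theme": "dark", "notifications": {"email": True, "sms": False}}

# Writing one key leaves the others intact.
after_theme = merge_preferences(prefs, {"theme": "light"})

# The merge is shallow: a nested object is replaced whole, not merged.
after_nested = merge_preferences(prefs, {"notifications": {"sms": True}})
```

In the second call the `email` entry is gone, so a caller storing a nested object under one key would need to read, modify, and write back the whole value for that key.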


@@ -137,28 +137,6 @@ class AISession(Base):
comment="Snapshot of PSA ticket data at session start",
)
# ── Triage / Cockpit Header ──
client_name: Mapped[Optional[str]] = mapped_column(
String(255), nullable=True,
comment="MSP client name for incident header (AI-inferred or manual)",
)
asset_name: Mapped[Optional[str]] = mapped_column(
String(255), nullable=True,
comment="Device, asset, or user being worked on",
)
issue_category: Mapped[Optional[str]] = mapped_column(
String(100), nullable=True,
comment="Human-readable category (e.g. DNS / Networking)",
)
triage_hypothesis: Mapped[Optional[str]] = mapped_column(
Text, nullable=True,
comment="Current working hypothesis — AI-updated + engineer-editable",
)
evidence_items: Mapped[Optional[list[dict[str, Any]]]] = mapped_column(
JSONB, nullable=True,
comment='What We Know list: [{"text": str, "status": "confirmed"|"ruled_out"|"pending"}]',
)
# ── Resolution / Escalation ──
resolution_summary: Mapped[Optional[str]] = mapped_column(
Text, nullable=True,
@@ -236,6 +214,38 @@ class AISession(Base):
comment="Current task lane state: {questions: [...], actions: [...]}",
)
# ── Resolution / Escalation artifacts (Phase 1 — FlowPilot migration) ──
# Markdown of the posted note + PSA external ID for round-trip traceability.
resolution_note_markdown: Mapped[Optional[str]] = mapped_column(
Text, nullable=True,
comment="Final Resolve note markdown, as posted to the PSA",
)
resolution_note_posted_at: Mapped[Optional[datetime]] = mapped_column(
DateTime(timezone=True), nullable=True,
)
resolution_note_external_id: Mapped[Optional[str]] = mapped_column(
String(128), nullable=True,
comment="PSA (e.g. CW) ticket-note ID returned at post time",
)
escalation_package_markdown: Mapped[Optional[str]] = mapped_column(
Text, nullable=True,
comment="Final Escalate handoff package markdown, as posted to the PSA",
)
escalation_package_posted_at: Mapped[Optional[datetime]] = mapped_column(
DateTime(timezone=True), nullable=True,
)
escalation_package_external_id: Mapped[Optional[str]] = mapped_column(
String(128), nullable=True,
comment="PSA ticket-note ID for the escalation package",
)
# Incremented atomically by any write that invalidates the resolution
# note preview cache (facts, suggested fixes, script generations).
# See FLOWPILOT-MIGRATION.md Section 5.5.
state_version: Mapped[int] = mapped_column(
Integer, nullable=False, default=0, server_default=sa.text("0"),
comment="Monotonic preview-cache version; bumped on state-changing writes",
)
# ── Branching ──
is_branching: Mapped[bool] = mapped_column(
default=False,
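The `state_version` column is described as a version stamp for a preview cache, but the cache itself is not part of this section of the diff. One way such a cache could work, sketched with a hypothetical `PreviewCache` class whose names and structure are assumptions:

```python
import uuid
from typing import Optional


class PreviewCache:
    """Hypothetical in-process cache: one preview per session, valid for one version."""

    def __init__(self) -> None:
        # session_id -> (state_version the preview was built from, markdown)
        self._entries: dict[uuid.UUID, tuple[int, str]] = {}

    def get(self, session_id: uuid.UUID, state_version: int) -> Optional[str]:
        """Return the cached preview only if it was built from this version."""
        entry = self._entries.get(session_id)
        if entry is None or entry[0] != state_version:
            return None
        return entry[1]

    def put(self, session_id: uuid.UUID, state_version: int, markdown: str) -> None:
        """Store a preview, replacing any entry for an older version."""
        self._entries[session_id] = (state_version, markdown)
```

With this shape, a write that bumps `state_version` needs no explicit invalidation call: the next `get` with the new version simply misses and the preview is regenerated.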


@@ -50,6 +50,13 @@ class AISessionStep(Base):
nullable=False,
index=True,
)
account_id: Mapped[uuid.UUID] = mapped_column(
UUID(as_uuid=True),
ForeignKey("accounts.id", ondelete="CASCADE"),
nullable=False,
index=True,
comment="Denormalized from ai_sessions.account_id for direct tenant filtering.",
)
step_order: Mapped[int] = mapped_column(
Integer, nullable=False,
comment="Sequential position in the session (0-indexed)",


@@ -28,6 +28,12 @@ class AISuggestion(Base):
nullable=False,
index=True,
)
account_id: Mapped[uuid.UUID] = mapped_column(
UUID(as_uuid=True),
ForeignKey("accounts.id", ondelete="CASCADE"),
nullable=False,
index=True,
)
session_id: Mapped[Optional[uuid.UUID]] = mapped_column(
UUID(as_uuid=True),
ForeignKey("ai_chat_sessions.id", ondelete="SET NULL"),


@@ -20,6 +20,12 @@ class Attachment(Base):
ForeignKey("sessions.id"),
nullable=False
)
account_id: Mapped[uuid.UUID] = mapped_column(
UUID(as_uuid=True),
ForeignKey("accounts.id", ondelete="CASCADE"),
nullable=False,
index=True,
)
node_id: Mapped[Optional[str]] = mapped_column(String(100), nullable=True)
file_name: Mapped[str] = mapped_column(String(255), nullable=False)
file_type: Mapped[Optional[str]] = mapped_column(String(50), nullable=True)


@@ -21,6 +21,12 @@ class AuditLog(Base):
nullable=False,
index=True
)
account_id: Mapped[uuid.UUID] = mapped_column(
UUID(as_uuid=True),
ForeignKey("accounts.id", ondelete="CASCADE"),
nullable=False,
index=True
)
action: Mapped[str] = mapped_column(String(50), nullable=False, index=True)
resource_type: Mapped[str] = mapped_column(String(50), nullable=False, index=True)
resource_id: Mapped[Optional[uuid.UUID]] = mapped_column(


@@ -39,10 +39,10 @@ class TreeCategory(Base):
nullable=True,
index=True
)
account_id: Mapped[Optional[uuid.UUID]] = mapped_column(
account_id: Mapped[uuid.UUID] = mapped_column(
UUID(as_uuid=True),
ForeignKey("accounts.id", ondelete="CASCADE"),
nullable=True,
nullable=False,
index=True
)
display_order: Mapped[int] = mapped_column(Integer, nullable=False, default=0, index=True)


@@ -10,7 +10,7 @@ from app.core.database import Base
class DeviceType(Base):
"""A device type for network diagram nodes (system or team-custom)."""
"""A device type for network diagram nodes (platform or account-custom)."""
__tablename__ = "device_types"
id: Mapped[uuid.UUID] = mapped_column(
@@ -32,11 +32,11 @@ class DeviceType(Base):
Boolean, nullable=False, default=False,
comment="True for built-in types that cannot be deleted",
)
team_id: Mapped[uuid.UUID | None] = mapped_column(
account_id: Mapped[uuid.UUID] = mapped_column(
UUID(as_uuid=True),
ForeignKey("teams.id", ondelete="CASCADE"),
nullable=True,
comment="NULL for system types, set for team-custom types",
ForeignKey("accounts.id", ondelete="CASCADE"),
nullable=False,
comment="Platform account for system types, tenant account for custom types",
)
sort_order: Mapped[int] = mapped_column(
Integer, nullable=False, default=0,


@@ -0,0 +1,91 @@
"""Draft template model — scripts generated during a session, pending templatization.
Created when an engineer picks "Run now, templatize after resolve" in the
three-option dialog. Post-resolve, the TemplatizePrompt component reads pending
drafts and lets the engineer accept (promotes to `script_templates`) or reject.
"""
import uuid
from datetime import datetime, timezone
from typing import Any, TYPE_CHECKING
from sqlalchemy import (
Text, DateTime, ForeignKey, String, CheckConstraint,
)
from sqlalchemy.orm import Mapped, mapped_column, relationship
from sqlalchemy.dialects.postgresql import UUID, JSONB
from app.core.database import Base
if TYPE_CHECKING:
from app.models.account import Account
from app.models.ai_session import AISession
from app.models.user import User
from app.models.script_template import ScriptCategory, ScriptTemplate
class DraftTemplate(Base):
"""A session-generated script pending conversion to a reusable template."""
__tablename__ = "draft_templates"
__table_args__ = (
CheckConstraint(
"status IN ('pending', 'accepted', 'rejected')",
name="ck_draft_templates_status",
),
)
id: Mapped[uuid.UUID] = mapped_column(
UUID(as_uuid=True), primary_key=True, default=uuid.uuid4
)
account_id: Mapped[uuid.UUID] = mapped_column(
UUID(as_uuid=True),
ForeignKey("accounts.id"),
nullable=False,
)
source_session_id: Mapped[uuid.UUID] = mapped_column(
UUID(as_uuid=True),
ForeignKey("ai_sessions.id"),
nullable=False,
)
source_user_id: Mapped[uuid.UUID] = mapped_column(
UUID(as_uuid=True),
ForeignKey("users.id"),
nullable=False,
)
script_body: Mapped[str] = mapped_column(Text, nullable=False)
proposed_parameters: Mapped[dict[str, Any]] = mapped_column(
JSONB, nullable=False
)
proposed_name: Mapped[str | None] = mapped_column(String(200), nullable=True)
proposed_category_id: Mapped[uuid.UUID | None] = mapped_column(
UUID(as_uuid=True),
ForeignKey("script_categories.id"),
nullable=True,
)
status: Mapped[str] = mapped_column(
String(32), nullable=False, default="pending"
)
resolved_at: Mapped[datetime | None] = mapped_column(
DateTime(timezone=True), nullable=True
)
# Set when status transitions to 'accepted' and the draft is promoted
# to a real script_templates row.
promoted_template_id: Mapped[uuid.UUID | None] = mapped_column(
UUID(as_uuid=True),
ForeignKey("script_templates.id"),
nullable=True,
)
created_at: Mapped[datetime] = mapped_column(
DateTime(timezone=True), default=lambda: datetime.now(timezone.utc)
)
account: Mapped["Account"] = relationship("Account", foreign_keys=[account_id])
source_session: Mapped["AISession"] = relationship(
"AISession", foreign_keys=[source_session_id]
)
source_user: Mapped["User"] = relationship("User", foreign_keys=[source_user_id])
proposed_category: Mapped["ScriptCategory | None"] = relationship(
"ScriptCategory", foreign_keys=[proposed_category_id]
)
promoted_template: Mapped["ScriptTemplate | None"] = relationship(
"ScriptTemplate", foreign_keys=[promoted_template_id]
)
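The check constraint on `DraftTemplate.status` limits the allowed values but says nothing about which transitions are legal. A sketch of an application-level guard follows, assuming that `pending` is the only state a draft can leave and that `resolved_at` is stamped at that moment; both points, and the `resolve_draft` name, are inferred rather than shown in the commit.

```python
from datetime import datetime, timezone
from typing import Optional

# Assumed transition table: only pending drafts can be resolved.
ALLOWED_TRANSITIONS = {
    "pending": {"accepted", "rejected"},
    "accepted": set(),
    "rejected": set(),
}


def resolve_draft(
    current: str, new: str, now: Optional[datetime] = None
) -> tuple[str, datetime]:
    """Validate a status change and return the new status with its timestamp."""
    if new not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"cannot move draft from {current!r} to {new!r}")
    # Timezone-aware UTC, matching the DateTime(timezone=True) columns.
    return new, now or datetime.now(timezone.utc)
```

A caller would assign the returned pair to `status` and `resolved_at`, and on acceptance also set `promoted_template_id` once the `script_templates` row exists.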


@@ -1,6 +1,5 @@
import uuid
from datetime import datetime, timezone
from typing import Optional
from sqlalchemy import String, Text, DateTime, ForeignKey
from sqlalchemy.orm import Mapped, mapped_column
from sqlalchemy.dialects.postgresql import UUID
@@ -11,7 +10,7 @@ class Feedback(Base):
__tablename__ = "feedback"
id: Mapped[uuid.UUID] = mapped_column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
account_id: Mapped[Optional[uuid.UUID]] = mapped_column(UUID(as_uuid=True), ForeignKey("accounts.id", ondelete="SET NULL"), nullable=True)
account_id: Mapped[uuid.UUID] = mapped_column(UUID(as_uuid=True), ForeignKey("accounts.id", ondelete="CASCADE"), nullable=False, index=True)
user_id: Mapped[uuid.UUID] = mapped_column(UUID(as_uuid=True), ForeignKey("users.id", ondelete="SET NULL"), nullable=False)
email: Mapped[str] = mapped_column(String(255), nullable=False)
feedback_type: Mapped[str] = mapped_column(String(50), nullable=False)


@@ -46,6 +46,12 @@ class UserFolder(Base):
nullable=False,
index=True
)
account_id: Mapped[uuid.UUID] = mapped_column(
UUID(as_uuid=True),
ForeignKey("accounts.id", ondelete="CASCADE"),
nullable=False,
index=True,
)
name: Mapped[str] = mapped_column(String(100), nullable=False)
color: Mapped[str] = mapped_column(String(7), nullable=False, default="#6366f1")
icon: Mapped[str] = mapped_column(String(50), nullable=False, default="folder")


@@ -23,6 +23,12 @@ class ForkPoint(Base):
id: Mapped[uuid.UUID] = mapped_column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
session_id: Mapped[uuid.UUID] = mapped_column(UUID(as_uuid=True), ForeignKey("ai_sessions.id", ondelete="CASCADE"), nullable=False, index=True)
account_id: Mapped[uuid.UUID] = mapped_column(
UUID(as_uuid=True),
ForeignKey("accounts.id", ondelete="CASCADE"),
nullable=False,
index=True,
)
parent_branch_id: Mapped[uuid.UUID] = mapped_column(UUID(as_uuid=True), ForeignKey("session_branches.id", ondelete="CASCADE"), nullable=False)
trigger_step_id: Mapped[Optional[uuid.UUID]] = mapped_column(UUID(as_uuid=True), ForeignKey("ai_session_steps.id", ondelete="SET NULL"), nullable=True)
fork_reason: Mapped[str] = mapped_column(Text, nullable=False)


@@ -23,6 +23,12 @@ class MaintenanceSchedule(Base):
created_by: Mapped[Optional[uuid.UUID]] = mapped_column(
UUID(as_uuid=True), ForeignKey("users.id", ondelete="SET NULL"), nullable=True
)
account_id: Mapped[uuid.UUID] = mapped_column(
UUID(as_uuid=True),
ForeignKey("accounts.id", ondelete="CASCADE"),
nullable=False,
index=True,
)
cron_expression: Mapped[str] = mapped_column(String(100), nullable=False)
timezone: Mapped[str] = mapped_column(String(100), nullable=False, default="UTC")
target_list_id: Mapped[Optional[uuid.UUID]] = mapped_column(


@@ -3,7 +3,7 @@ import uuid
from datetime import datetime, timezone
from typing import Any, TYPE_CHECKING
from sqlalchemy import String, Text, Boolean, DateTime, ForeignKey
from sqlalchemy import String, Text, Boolean, DateTime, ForeignKey, text
from sqlalchemy.orm import Mapped, mapped_column, relationship
from sqlalchemy.dialects.postgresql import UUID, JSONB
@@ -14,15 +14,15 @@ if TYPE_CHECKING:
class NetworkDiagram(Base):
"""A network topology diagram, team-scoped."""
"""A network topology diagram scoped to one account."""
__tablename__ = "network_diagrams"
id: Mapped[uuid.UUID] = mapped_column(
UUID(as_uuid=True), primary_key=True, default=uuid.uuid4
)
team_id: Mapped[uuid.UUID] = mapped_column(
account_id: Mapped[uuid.UUID] = mapped_column(
UUID(as_uuid=True),
ForeignKey("teams.id", ondelete="CASCADE"),
ForeignKey("accounts.id", ondelete="CASCADE"),
nullable=False,
index=True,
)
@@ -30,8 +30,8 @@ class NetworkDiagram(Base):
client_name: Mapped[str | None] = mapped_column(String(255), nullable=True)
asset_name: Mapped[str | None] = mapped_column(String(255), nullable=True)
description: Mapped[str | None] = mapped_column(Text, nullable=True)
nodes: Mapped[list[dict[str, Any]]] = mapped_column(JSONB, nullable=False, server_default="'[]'")
edges: Mapped[list[dict[str, Any]]] = mapped_column(JSONB, nullable=False, server_default="'[]'")
nodes: Mapped[list[dict[str, Any]]] = mapped_column(JSONB, nullable=False, server_default=text("'[]'::jsonb"))
edges: Mapped[list[dict[str, Any]]] = mapped_column(JSONB, nullable=False, server_default=text("'[]'::jsonb"))
thumbnail_url: Mapped[str | None] = mapped_column(Text, nullable=True)
is_archived: Mapped[bool] = mapped_column(
Boolean, nullable=False, default=False,


@@ -31,6 +31,12 @@ class NotificationLog(Base):
nullable=False,
index=True,
)
+account_id: Mapped[uuid.UUID] = mapped_column(
+UUID(as_uuid=True),
+ForeignKey("accounts.id", ondelete="CASCADE"),
+nullable=False,
+index=True,
+)
event: Mapped[str] = mapped_column(String(50), nullable=False)
payload: Mapped[dict[str, Any]] = mapped_column(JSONB, nullable=False)
status: Mapped[str] = mapped_column(String(20), default="sent")

@@ -0,0 +1,37 @@
"""Platform step model — platform-owned steps, readable by all users.
No account_id. No RLS. Readable by any authenticated user.
Populated by promoting visibility='public' steps from step_library.
"""
import uuid
from datetime import datetime, timezone
from typing import Optional, Any
from sqlalchemy import String, Boolean, DateTime, ForeignKey
from sqlalchemy.orm import Mapped, mapped_column
from sqlalchemy.dialects.postgresql import UUID, JSONB
from app.core.database import Base
class PlatformStep(Base):
__tablename__ = "platform_steps"
id: Mapped[uuid.UUID] = mapped_column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
title: Mapped[str] = mapped_column(String(255), nullable=False)
step_type: Mapped[str] = mapped_column(String(50), nullable=False, index=True)
content: Mapped[dict[str, Any]] = mapped_column(JSONB, nullable=False)
is_active: Mapped[bool] = mapped_column(Boolean, nullable=False, default=True)
source_step_id: Mapped[Optional[uuid.UUID]] = mapped_column(
UUID(as_uuid=True),
ForeignKey("step_library.id", ondelete="SET NULL"),
nullable=True,
)
created_at: Mapped[datetime] = mapped_column(
DateTime(timezone=True), default=lambda: datetime.now(timezone.utc)
)
updated_at: Mapped[datetime] = mapped_column(
DateTime(timezone=True),
default=lambda: datetime.now(timezone.utc),
onupdate=lambda: datetime.now(timezone.utc),
)
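
The module docstring says `platform_steps` is populated by promoting `visibility='public'` rows from `step_library`, and the model links back through `source_step_id` with `ondelete="SET NULL"`. The commit does not show the promotion code, so the following is only a minimal sketch of that flow. It uses the standard-library `sqlite3` module in place of the project's PostgreSQL/SQLAlchemy stack so it runs anywhere. The `step_library` columns, the `promote_public_steps` function, and the sample rows are assumptions for illustration.

```python
import json
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript(
    """
    -- Assumed shape of step_library; only `visibility` is named in the docstring.
    CREATE TABLE step_library (
        id TEXT PRIMARY KEY,
        title TEXT NOT NULL,
        step_type TEXT NOT NULL,
        content TEXT NOT NULL,
        visibility TEXT NOT NULL
    );
    -- Mirrors the PlatformStep columns relevant to promotion.
    CREATE TABLE platform_steps (
        id TEXT PRIMARY KEY,
        title TEXT NOT NULL,
        step_type TEXT NOT NULL,
        content TEXT NOT NULL,
        is_active INTEGER NOT NULL DEFAULT 1,
        source_step_id TEXT REFERENCES step_library(id) ON DELETE SET NULL
    );
    """
)


def promote_public_steps(conn):
    """Copy public library steps that have not been promoted yet.

    Hypothetical function: returns the number of rows copied.
    """
    rows = conn.execute(
        "SELECT id, title, step_type, content FROM step_library "
        "WHERE visibility = 'public' AND id NOT IN ("
        "  SELECT source_step_id FROM platform_steps"
        "  WHERE source_step_id IS NOT NULL)"
    ).fetchall()
    for source_id, title, step_type, content in rows:
        conn.execute(
            "INSERT INTO platform_steps (id, title, step_type, content, source_step_id) "
            "VALUES (?, ?, ?, ?, ?)",
            (str(uuid.uuid4()), title, step_type, content, source_id),
        )
    conn.commit()
    return len(rows)


# Sample data (made up): one public step and one private step.
public_id = str(uuid.uuid4())
private_id = str(uuid.uuid4())
conn.executemany(
    "INSERT INTO step_library (id, title, step_type, content, visibility) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        (public_id, "Restart print spooler", "action",
         json.dumps({"text": "Restart the Spooler service"}), "public"),
        (private_id, "Internal escalation", "action",
         json.dumps({"text": "Page the on-call lead"}), "private"),
    ],
)
conn.commit()

first_run = promote_public_steps(conn)   # copies only the public step
second_run = promote_public_steps(conn)  # nothing new to promote

# Deleting the source row keeps the platform step and nulls its back-reference.
conn.execute("DELETE FROM step_library WHERE id = ?", (public_id,))
conn.commit()
remaining = conn.execute(
    "SELECT title, source_step_id FROM platform_steps"
).fetchall()
```

The `SET NULL` rule is what lets a platform step outlive the account-owned row it was copied from, which fits a table that has no `account_id` and no RLS.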

@@ -25,6 +25,12 @@ class PsaMemberMapping(Base):
nullable=False,
index=True,
)
+account_id: Mapped[uuid.UUID] = mapped_column(
+UUID(as_uuid=True),
+ForeignKey("accounts.id", ondelete="CASCADE"),
+nullable=False,
+index=True,
+)
user_id: Mapped[uuid.UUID] = mapped_column(
UUID(as_uuid=True),
ForeignKey("users.id", ondelete="CASCADE"),

Some files were not shown because too many files have changed in this diff.