Commit Graph

389 Commits

Author SHA1 Message Date
d68131a865 feat(seed): Phase 9 QA fixture seeder
Adds backend/scripts/seed_phase9_qa_fixtures.py — creates 4 ai_sessions
plus matching session_suggested_fixes that pre-bake the four backend
states the AI orchestrator must produce to mount the five conditional
Phase 9 components:

  A. no template, no draft     → ChatTabStrip + ScriptBuilderTab
  B. ai_drafted_script set      → InlineNoTemplateDialog
  C. script_template_id set     → TemplateMatchPanel
  D. applied_at + status=proposed → EscalateInterceptDialog (verify state)

Background: a Phase 9 QA pass against a regular session left these
five components unreached because the AI didn't emit SUGGEST_FIX in
time/at all. Seeding directly bypasses the AI and lets QA exercise
each surface deterministically.

UUIDs are deterministic (uuid5 over a fixed namespace) so re-runs
upsert. Pass --reset to wipe and recreate. Each session gets two
synthetic conversation messages so the chat header's canAct gate
(messages.length >= 2) opens up Resolve/Escalate.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-25 00:08:38 -04:00
49c6c8fd00 fix(seed): include cancel_at_period_end in test-user subscription INSERT
Discovered during Phase 9 QA: seed_test_users.py was missing the
cancel_at_period_end column in its subscriptions INSERT, but the
column is NOT NULL (added in 016_add_subscription_tables.py).
Result: seed crashed with NotNullViolationError before any users
were created, blocking auth in fresh dev environments.

Pre-existing on main; not introduced by the FlowPilot migration
branch. Default value: false.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-24 23:36:04 -04:00
b14a16a1ab chore(tests): gate RLS tests behind RUN_RLS_TESTS flag
Continues the test-isolation work from dab740d. RLS migration tests run
against a policy-installed database and fail in the default create_all
suite, so they need to be opt-in:

- pytest.ini: register `rls` marker.
- conftest.py: auto-deselect test_rls_isolation.py unless
  RUN_RLS_TESTS=1. Drops the deprecated session-scoped event_loop
  fixture (not needed since pytest-asyncio 0.23+).
- test_rls_isolation.py: tag module with `rls` marker. Replace
  hardcoded `patherly_test` DB reference with parsed DATABASE_TEST_URL
  (matches conftest.py default `resolutionflow_test`). Updated docstring
  command to show RUN_RLS_TESTS=1.
- requirements-dev.txt: bump pytest-asyncio 0.23.0 → 0.24.0 (loop-scope
  marker behavior required by the RLS module fixture).

Run the RLS suite with:
  RUN_RLS_TESTS=1 DB_APP_ROLE_PASSWORD=... pytest tests/test_rls_isolation.py

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-24 16:09:13 -04:00
dab740ddf7 fix(tests): isolate test DB from dev DB and plug admin-db override gap
All checks were successful
Mirror to GitHub / mirror (push) Successful in 3s
Root cause of the 06:32 AM outage: running 'pytest tests/' inside the
resolutionflow_backend container silently dropped the public schema on
the DEV database. Two layered bugs made this possible; both are fixed.

Bug 1 — env-var lookup in conftest.TEST_DATABASE_URL put DATABASE_URL
(which normally points at the dev/prod DB) ahead of DATABASE_TEST_URL.
When DATABASE_URL is set, pytest used the dev DB as the 'test' DB and
the test_db fixture's DROP SCHEMA public CASCADE wiped it. Fixed:
  - Honor only DATABASE_TEST_URL (or the localhost fallback).
  - Assert at module load that the DB name contains 'test' — refuses
    to run otherwise. Makes future misconfiguration impossible.

Bug 2 — conftest overrode app.dependency_overrides[get_db] but not
get_admin_db. Endpoints using get_admin_db (register, admin routes)
bypassed the test session and hit the real admin DB. Before Bug 1 was
fixed this was hidden because both engines pointed at the same dev DB.
With isolation in place, register started failing 'Email already
registered' because of stale users in the dev DB. Fixed:
  - Also override get_admin_db to yield the same test session. RLS is
    not enabled in the create_all-managed test schema, so sharing is
    safe.

Also adds DATABASE_TEST_URL=resolutionflow_test to docker-compose.dev.yml
so pytest in the container works out of the box.

Verified: 49/50 Phase 8 + 9 tests pass against resolutionflow_test; the
1 failure is the pre-existing Phase 8 Issue #4
(test_record_decision_persists_and_bumps_state_version).

Refs gitea #145 (will update that issue with this as the primary fix).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 13:14:08 -04:00
1c855563ee feat(pilot): PATCH /suggested-fixes/:id/script endpoint
Called by the inline Script Builder tab on Submit. Writes
ai_drafted_script + ai_drafted_parameters to the fix without stamping
applied_at (a draft is not an application — that's §5 of the Phase 9
spec). Bumps state_version so Resolve/Escalate preview bundles
regenerate.

409 on terminal fix status. 404 on wrong session. 422 on empty script.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 02:34:06 -04:00
d4fae87236 feat(pilot): inline Script Builder session — idempotent create + auth + filtered list
POST /script-builder/sessions now supports origin='pilot_inline':
- Requires ai_session_id; validates it against current user ownership.
- Get-or-create: returns existing row for (user, ai_session_id) pair.
- Partial unique index on the DB backs the invariant; races resolve to
  the single winner row.

list_sessions + count_user_sessions default-scope to origin='standalone'
so inline scratch sessions don't pollute the /script-builder dashboard
or count against the 5-session cap.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 02:24:57 -04:00
f2fce27f0d feat(pilot): pydantic schemas for inline origin + script PATCH
- ScriptBuilderCreateRequest gains origin ('standalone' | 'pilot_inline')
  and optional ai_session_id. Handler-side validation (next task) enforces
  pilot_inline ⇒ ai_session_id required + owned by caller.
- SessionSuggestedFixScriptRequest added for the new PATCH /script
  endpoint (Phase 9 Task 6).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 01:53:28 -04:00
93c974466a feat(pilot): script_builder_sessions.origin on SQLAlchemy model
Mirrors the DB column added in the prior migration. App-level default
is 'standalone' so existing callers of ScriptBuilderSession(...) work
without code changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 01:48:22 -04:00
8012668975 feat(pilot): add origin + inline idempotency to script_builder_sessions
Phase 9 prep. Adds:
- origin VARCHAR(20) NOT NULL with CHECK ('standalone' | 'pilot_inline')
- invariant: pilot_inline rows must have ai_session_id
- partial unique index on (user_id, ai_session_id) WHERE origin='pilot_inline'
  — backs get-or-create idempotency for the inline Script Builder tab.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 00:22:53 -04:00
70c5da0c75 fix(pilot): persist AI-proposal rejection + clear on outcome write
Issue #3 from phase-8-review-issues.md. 'Not yet' on the AI-confirming
banner was a local-state hide; the proposal re-surfaced on the next
refreshSessionDerived call.

Two-part fix:
- PATCH /outcome now clears ai_outcome_proposal on any terminal action
  (engineer has taken a decision; stale AI proposal is moot).
- New DELETE /ai-sessions/:sid/suggested-fixes/:fid/ai-outcome-proposal
  endpoint for explicit 'Not yet' rejection. Does not touch status
  or state_version — pure UI state.

Frontend handleRejectAIProposal now calls the DELETE and setActiveFix
with the server response.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 22:15:48 -04:00
de2bef3175 fix(pilot): persist Apply — stamp applied_at on click
Issue #2 from phase-8-review-issues.md. Apply was client-side-only via
a bannerApplied flag. Refresh / chat reselect / multi-tab would drop
Verifying state back to Proposed.

- New POST /ai-sessions/{sid}/suggested-fixes/{fid}/apply stamps
  applied_at without changing status (still 'proposed'). Idempotent
  if already stamped; 409 if fix is past proposed (a terminal outcome
  was already recorded).
- Bumps state_version so resolve/escalate preview bundles reflect that
  the fix has entered verifying.
- Frontend handleApplyFix calls the endpoint and uses the returned
  applied_at directly. bannerApplied client flag is removed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 22:10:52 -04:00
362c7b1d79 fix(pilot): outcome-aware Resolve/Escalate previews
Issue #1 from phase-8-review-issues.md. Cache invalidation alone isn't
enough — previews were also omitting outcome fields from the LLM bundle,
so a fresh regenerate still couldn't distinguish proposed / failed /
partial / success.

- PATCH /outcome now bumps ai_sessions.state_version (matches
  record_decision's existing pattern).
- Resolution-note + escalation-package bundles now include status,
  applied_at, verified_at, partial_notes, failure_reason on the active fix.
- Generator prompts prescribe outcome-aware phrasing (closure language
  for success; what-we've-tried + next-steps for failed/partial).
- New end-to-end test asserts the regenerated preview reflects the
  recorded outcome, not just that the cache key changed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 22:04:56 -04:00
2cde6673b0 feat(pilot): [FIX_OUTCOME] system prompt instructions
Tells the AI when + how to emit the [FIX_OUTCOME] marker that Task 4's
parser consumes. Placeholder-only per the anti-parrot pattern — no
literal UUIDs, outcomes, or reasons that could leak into unrelated
sessions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 15:17:21 -04:00
c0112f8bee feat(pilot): [FIX_OUTCOME] marker parser + AI outcome proposal
The AI emits [FIX_OUTCOME] when the engineer indicates in chat that a
prior suggested fix worked, didn't work, or was partially applied. The
marker writes to session_suggested_fixes.ai_outcome_proposal (JSONB),
which the frontend surfaces as a "confirm outcome?" banner. The status
column is only updated when the engineer clicks confirm (via PATCH
/outcome endpoint from Task 3).

Placeholder-only system prompt wiring comes in Task 5.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 15:08:43 -04:00
8988dbc885 feat(pilot): PATCH /suggested-fixes/:id/outcome endpoint + tests
Records engineer-reported outcome (applied_success|applied_failed|
applied_partial|dismissed). Enforces transition rules (partial → success/
failed allowed; terminal outcomes return 409) and notes requirements
(applied_partial requires notes).

Sets verified_at on success/failure, stamps applied_at if not already
set (handles the case where the AI [FIX_OUTCOME] marker fires before
the engineer clicks Apply).

Also fixes pre-existing test-infrastructure bug: network_diagram.py used
bare string server_default="'[]'" for JSONB columns, which asyncpg
rejects during test schema creation. Changed to text("'[]'::jsonb") to
match the pattern used by script_template.py.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 14:59:34 -04:00
4a8e3ae954 feat(pilot): pydantic schemas for fix outcome patch
Adds FixStatus literal (5 values matching the DB check constraint),
extends SessionSuggestedFixResponse with outcome fields, and introduces
SessionSuggestedFixOutcomeRequest for the PATCH /outcome endpoint coming
in Task 3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 14:44:39 -04:00
cdd8bb05cc feat(pilot): add outcome tracking columns to session_suggested_fixes
Phase 8 prep for the fix outcome banner. Adds:
- status (proposed|applied_success|applied_failed|applied_partial|dismissed)
- applied_at, verified_at (timestamps)
- partial_notes, failure_reason (engineer-provided context)
- ai_outcome_proposal (JSONB for AI [FIX_OUTCOME] marker payloads)

Backfills status='dismissed' from user_decision='dismissed'. status is
orthogonal to user_decision — outcome (did the fix work?) vs script-path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 14:40:17 -04:00
4aaf57adb5 feat(pilot): Phase 6 — post-resolve templatize prompt + draft accept/reject
All checks were successful
Mirror to GitHub / mirror (push) Successful in 11s
Closes the loop on the Phase 5 "Run now, templatize after resolve" path.
After a session resolves, drafts queued by the three-option dialog surface
as a modal that lets the engineer review the AI-proposed parameterization
and either save as a reusable team template or skip. A "don't ask again"
toggle writes to account_settings.preferences so the next resolve won't
pop the modal.

Backend:
- /api/v1/draft-templates:
  * GET — list account drafts (pending_only default true; pass false for
    audit view including accepted/rejected)
  * GET /{id} — single draft
  * POST /{id}/accept — promotes to a new script_templates row with
    source_session_id / source_user_id / source_ticket_ref populated
    (drives the Script Library "generated from CW #X · resolved by Y"
    provenance chip). Draft flips to status=accepted,
    promoted_template_id set, resolved_at stamped. 409 on re-accept /
    already-rejected. 400 on unknown category_id.
  * POST /{id}/reject — flips to status=rejected. 409 on re-reject.
- /api/v1/accounts/me/preferences (GET/PATCH) — thin wrapper over
  AccountSettings.get_setting/set_setting. PATCH merges keys into the
  JSONB column, preserving existing keys the client didn't touch.
  Used by the "Don't ask again for this team" checkbox
  (templatize_prompt_enabled=false) and, forward-looking, by
  cw_resolved_status_id / cw_escalated_status_id from Phase 4.
- 13 tests: list filter, accept with/without edited_body, provenance
  copy-through, reject, 409 on re-accept / re-reject, 400 on unknown
  category, prefs round-trip with merge semantics.

Frontend:
- src/components/pilot/script/TemplatizePrompt.tsx — modal showing the
  drafted script with proposed parameters in the Phase 5
  ParameterizationPreview, editable name/category/description, an
  individual-parameter remove button, and the "don't ask again" opt-out.
  Accept posts to /draft-templates/{id}/accept + optionally PATCHes
  preferences. Skip posts /reject.
- src/api/draftTemplates.ts — typed client plus accountPreferencesApi.
- AssistantChatPage: after a successful Resolve (external OR local),
  fetches preferences + pending drafts for the session and queues the
  modal one draft at a time. Escalate does not trigger this flow.
- Sidebar: Scripts nav shows the pending-draft count as a badge. Fetched
  independently of the main sidebar stats so endpoint flakes don't
  break the rest of the sidebar.

Verified live 2026-04-22: seed two drafts → GET sees both pending →
accept draft A (template created, provenance CW #99123 populated) →
reject draft B → pending count drops → PATCH opt-out → GET confirms
persistence.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 02:37:49 -04:00
d0ebdef9e8 fix(ai): full-sweep audit — placeholders only in system prompts + CI guardrail
All checks were successful
Mirror to GitHub / mirror (push) Successful in 10s
The "AI parrots example content from system prompt" bug bit us twice in
one day across two different prompt sites. Patching individual prompts
is treating the symptom; this commit makes the rule structural.

Audit + sanitize:
- assistant_chat_service.ASSISTANT_SYSTEM_PROMPT — already cleaned in
  prior commits, but the [FORK] schema still had literal "Brief reason"
  / "Short name" / "One sentence" placeholders. Replaced with
  <angle-bracket> placeholders. Anti-parrot rule itself rewritten to
  describe the failure mode abstractly instead of naming "jsmith" so
  the rule no longer trips the guardrail (and so the model doesn't
  see "jsmith" as a token at all).
- ai_chat_service.py — removed three concrete-example offenders:
  "Get-Service ADSync" command literal, the "DC01 server_name" intake
  form payload (in two places), and the inline interview demos using
  "Azure AD Sync failures" / "Exchange Online mailbox migration".
  Replaced with technology-neutral schema descriptions.
- ai_tree_generator_service.BRANCH_DETAIL_SYSTEM_PROMPT — replaced the
  fully-fleshed DNS troubleshooting tree (with literal Dnscache /
  ipconfig / google.com / Start-Service) with a placeholder schema
  showing only ID-linkage shape.
- kb_conversion_service.PROCEDURAL_SYSTEM_PROMPT — replaced the worked
  Server Manager + DC01 example payload with a placeholder schema.

Guardrail (tests/test_prompt_anti_parrot.py):
- Imports every module under app/services/ and app/core/ and walks
  every uppercase string constant ending in _PROMPT, _SCHEMA,
  _PROTOCOL, _FORMAT, or _CONTEXT.
- test 1: known-leaked-token list (jsmith, DC01, ADSync, Dnscache,
  google.com, "Outlook keeps", "Teams drops") must not appear in any
  prompt constant. Add to the list when a new leak shows up in prod —
  the list IS the audit trail.
- test 2: marker blocks ([QUESTIONS], [ACTIONS], [SUGGEST_FIX], etc.)
  must contain placeholders only. Distinguishes JSON keys (followed
  by ':', allowed) from JSON values (followed by ',' / ']' / '}',
  must be <placeholder>); allows pipe-separated enum types
  (text|password|select) and a small set of fixed enum values
  (question, diagnostic_check, decision, action, ...). Verified by
  feeding the test a known-bad block — caught it correctly.

Documented the rule in CLAUDE.md → AI / FlowPilot lessons, naming
the test as the enforcement point so future contributors know how to
extend it (add to the known-leaked list when a new leak surfaces).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 02:09:30 -04:00
50215b9110 fix(pilot): strip literal example content from system prompt — model was parroting
All checks were successful
Mirror to GitHub / mirror (push) Successful in 10s
The system prompt had a "Complete example of a correct first response"
section with a specific Outlook/WiFi/jsmith scenario plus literal JSON
payloads in [QUESTIONS], [ACTIONS], [SUGGEST_FIX], and [PROMOTE]
markers. The model was emitting those literal strings (the same
WiFi/laptop questions, the same "Clear cached credentials" suggested
fix, the same "OWA login confirmed for jsmith" promote) on EVERY
unrelated chat — making the task lane look like it was leaking previous-
session data when in fact the AI was just reciting the prompt examples.

Replaced literal example content with `<placeholder>` schemas. Added an
explicit ANTI-PARROT RULE in the FINAL REMINDER section calling out
that the angle-bracket placeholders show SHAPE, not CONTENT, with
concrete examples of the failure mode (printer ticket → don't ask
about Outlook; user not named jsmith → don't name jsmith).

Same scrub applied to the FORK section's "Outlook AND Teams dropping"
and the worked fork-flow example.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 01:36:29 -04:00
fa61376303 feat(pilot): Phase 5 — inline Script Generator integration
All checks were successful
Mirror to GitHub / mirror (push) Successful in 10s
Wires the SuggestedFix card to an inline panel that handles both cases:
template-matched fixes open the Script Library generator with parameters
pre-filled from session context; un-matched fixes open the three-option
dialog (one_off / draft_template / build_template). The decision endpoint
records the path choice with side effects: draft_template persists a
draft_templates row via a Sonnet-driven TemplateExtractionService;
build_template returns a redirect to the Script Builder; one_off just
records the choice.

Backend:
- TemplateExtractionService: drafts a parameter schema from a concrete
  rendered script. Conservative by default ("prefer fewer parameters").
  Round-trip-validates that templated_body only references declared
  parameters; missing-key mismatch falls back to the original script
  with no params. LLM/parse failures fall back identically — the
  engineer can still create a draft and refine in the post-resolve
  prompt (Phase 6).
- /suggested-fixes/{fix_id}/decision side effects:
  * one_off → returns rendered_script (engineer's edited version or the
    fix's ai_drafted_script verbatim)
  * draft_template → same + creates draft_templates row with extracted
    params, returns draft_template_id
  * build_template → returns redirect_path=/scripts/builder?from_session=
    &fix= so the frontend can navigate to the builder pre-loaded
- 400 when a non-template fix has no ai_drafted_script (template-matched
  fixes take the dedicated /scripts/generate path, not this endpoint).
- 12 tests: TemplateExtractionService parse + fallback paths, all four
  decision branches, edited_script override, missing-script 400.

Frontend:
- src/components/pilot/script/{TemplateMatchPanel, NoTemplateDialog,
  ParameterizationPreview}.tsx — inline panels rendered in the task
  lane's bottom slot when the engineer clicks a SuggestedFix card.
- TemplateMatchPanel: loads template via /scripts/templates/{id},
  pre-fills params from fix.ai_drafted_parameters with cyan "from
  session" tags, generates via existing /scripts/generate (already
  bumps state_version on ai_session_id from Phase 3). 404 falls back
  with a clear message instead of erroring.
- NoTemplateDialog: shows the AI-drafted script with proposed parameter
  values highlighted in amber via ParameterizationPreview; three option
  cards with the middle (draft_template) flagged Recommended; inline
  edit on the script body before deciding.
- SuggestedFix card now clickable: onActivate toggles the inline panel.
- AssistantChatPage: scriptPanelOpen state + handleScriptDecision that
  navigates on build_template and toasts on the other paths. Active fix
  changes auto-close the panel so engineers don't act on stale state.
- Cmd+K → "Open inline Script Generator" palette entry surfaces only on
  /pilot/:id routes; fires a window event the chat page subscribes to.
  No Resolve shortcut added per Section 14 decision (browser ⌘R conflict).

Verified 2026-04-22 against the dev stack:
- one_off / draft_template / build_template all return the right shape
  with real Sonnet TemplateExtractionService for the draft path.
- Conservative extraction confirmed: cmdkey + Restart-Process script
  yielded zero proposed parameters as intended.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 00:15:29 -04:00
8fd2c1bac6 feat(pilot): Phase 4 — Resolve + Escalate PSA writebacks with status verification
All checks were successful
Mirror to GitHub / mirror (push) Successful in 11s
Wires the preview popover's Confirm & post action to ConnectWise (and,
via the provider pattern, any future PSA). Adds the parallel Escalate
flow with the handoff-oriented five-section markdown. Sessions without a
linked PSA ticket resolve/escalate locally — markdown stored, status
flipped, nothing posted externally.

Backend:
- EscalationPackageGeneratorService: Sonnet, five sections (Problem /
  What we've confirmed / What we've tried / Current hypothesis /
  Suggested next steps). Shares the preview_cache with a separate KIND
  so Resolve and Escalate previews for the same state coexist.
- PSAWritebackService: post_resolution_note (RESOLUTION note type,
  customer-visible), post_escalation_package (INTERNAL_ANALYSIS,
  handoff for the next engineer only), transition_ticket_status with
  mandatory re-fetch verification. PSAStatusVerificationError surfaces
  loudly when CW silently rejects a status change — the
  ConnectWise anti-pattern CLAUDE.md flags.
- Endpoints:
  * POST /ai-sessions/{id}/escalation-package/preview
  * POST /ai-sessions/{id}/resolution-note/post
  * POST /ai-sessions/{id}/escalation-package/post
  Outcomes: "resolved" / "escalated" with external_id + verified status,
  "resolved_local" / "escalated_local" when no PSA linked.
- Target CW status IDs live in account_settings.preferences
  (cw_resolved_status_id, cw_escalated_status_id). When unset, the post
  proceeds without a status transition — response includes a
  status_transition_skipped_reason rather than silently erroring.
- 7 tests: local-only path, PSA happy path with verified transition,
  status verification failure → 502, skipped transition when
  unconfigured, 409 on already-resolved re-post, escalate parallel path,
  internal-analysis note type enforced.

Frontend:
- ResolutionNotePreview now kind-parameterized ('resolve' | 'escalate')
  with inline edit + Confirm & post. Preview loads from the matching
  backend endpoint; posting calls the matching endpoint; outcome toast
  surfaces the verified CW status or the local-only result.
- AssistantChatPage: previewKind state replaces previewOpen; two toggle
  buttons (Preview Resolve note / Escalate instead) in the lane's bottom
  slot. handleConfirmPost dispatches by kind.

Verified 2026-04-22:
- Local-only Resolve + Escalate round-trip against the dev stack.
- Live Sonnet escalation-package preview; cache hit on repeat call
  with no state change (separate cache kind from resolution-note).
- PSA post + status-verification paths covered by mocked-provider pytest
  cases. Live CW round-trip pending a test CW instance.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 23:54:54 -04:00
66e592096c feat(pilot): Phase 3 — Suggested fix tracking + Resolve preview with state_version cache
Adds the AI-proposed resolution path and the inline preview of the
markdown that will be posted to the customer ticket on Resolve. The
preview is keyed on (session_id, ai_sessions.state_version) so back-to-
back fetches against unchanged state hit an in-process cache instead
of paying for a Sonnet call.

Backend:
- preview_cache: in-process LRU keyed on (kind, session_id, state_version).
  No TTL — state_version is the source of truth. Soft-cap 5000 entries.
- unified_chat_service: [SUGGEST_FIX] parser (last-block-wins, JSON
  payload, confidence clamped 0-100), supersession persistence (sets
  superseded_at on prior active row), atomic state_version bump.
- ResolutionNoteGeneratorService: pulls session, facts, active fix, and
  redacted script_generations into a structured input bundle for Sonnet;
  produces the four-section markdown (Problem / What we confirmed /
  Root cause / Resolution). Sensitive script parameters redacted via
  ScriptTemplateEngine.redact_sensitive driven by the template's
  parameters_schema.
- /api/v1/ai-sessions/{id}/suggested-fixes/active — 200 with the active
  fix or 404.
- /api/v1/ai-sessions/{id}/suggested-fixes/{fix_id}/decision — records
  one_off / draft_template / build_template / dismissed; dismiss
  supersedes; bumps state_version. 409 on dismissing an already-
  superseded fix.
- /api/v1/ai-sessions/{id}/resolution-note/preview — generates or returns
  cached markdown; from_cache flag in payload signals cache hit.
- scripts.py POST /generate now bumps state_version on the linked
  ai_session_id when present (third source of preview-cache invalidation
  per Section 5.5).
- ASSISTANT_SYSTEM_PROMPT documents [SUGGEST_FIX] (when to/not to emit,
  format, supersession semantics).
- 12 tests covering the parser (well-formed, last-wins, malformed,
  confidence clamping), supersession + state_version invariant, all
  decision branches, preview cache hit-on-no-change + miss-after-write.

Frontend:
- src/components/pilot/sections/SuggestedFix.tsx — amber-accented card
  with confidence badge; dismiss action wired to the decision endpoint.
- src/components/pilot/ResolutionNotePreview.tsx — popover with refresh,
  loading state, cached/fresh indicator, ticket-ref display.
- src/api/sessionSuggestedFixes.ts — typed client; getActive normalizes
  404 to null so callers don't have to special-case.
- TaskLane gains suggestedFixSlot + bottomSlot props (rendered after
  Diagnostic Checks; bottomSlot anchors the Resolve action).
- AssistantChatPage: refreshSessionDerived helper batches fact + fix
  refresh; fact mutations and chat sends both schedule a 500ms-debounced
  preview refresh per the Section 5.5 spec.

Verified end-to-end against the dev stack with a real Sonnet call:
- /active 404 → fact create → preview generates four-section markdown
  grounded only in provided facts → second preview call hits cache
  (from_cache=true, no LLM call) → fact write 2 → cache miss, regenerates.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 21:45:52 -04:00
625dba7548 feat(pilot): Phase 2 — What we know (facts) with stable task-lane IDs
Adds the load-bearing structural feature of the FlowPilot migration: a
"What we know" panel that holds confirmed facts for a session, fed by AI
[PROMOTE] markers and engineer-added notes. Facts feed the resolution
note preview (Phase 3) and survive across turns via stable UUIDs assigned
to pending_task_lane items.

Backend:
- FactSynthesisService: create/update/soft-delete facts with atomic
  state_version bumps; LLM-backed synthesize_from_question/check on the
  fact_synthesis (Haiku) action tier per Section 6.6.
- /api/v1/ai-sessions/{id}/facts CRUD + /facts/promote (proposed_text or
  via synthesis). PATCH returns 403 for question/diagnostic_check facts
  (edit the source item instead, Section 7.3).
- unified_chat_service: [PROMOTE] marker parser (JSON-block per Section
  8.1 spec drift note), stable-UUID assignment for pending_task_lane
  questions/actions preserved by exact text/label match across turns.
- ASSISTANT_SYSTEM_PROMPT: documents [PROMOTE] format, when to/not to
  emit, hallucination guardrails, source_ref handling.
- 17 tests covering parser, stable IDs, service validation, CRUD,
  editability rule, both promote modes, 422 null-synthesis path,
  state_version invariant.

Frontend:
- src/components/pilot/sections/{WhatWeKnow,WhatWeKnowItem,AddNoteButton}
  — green-gradient section above Questions, dashed-circle check, inline
  edit/delete gated by the server's editable flag.
- TaskLane gains a whatWeKnowSlot prop (existing assistant/ folder kept
  per the doc's "rename is opportunistic" guidance).
- AssistantChatPage fetches facts on selectChat and refetches after each
  chat send (so [PROMOTE]-synthesized facts appear immediately); auto-
  opens the lane when facts exist.

Verification: end-to-end smoke against the local docker stack confirms
all five endpoints (list/create/patch/delete/promote) plus the 403
editability rule. pytest suite verifies the same with mocked LLM. Live
[PROMOTE] flow remains untested until used in the UI — the marker shape
is covered by parser tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 21:13:44 -04:00
b49772f1a1 feat(models): Phase 1 SQLAlchemy models — SessionFact, SessionSuggestedFix, DraftTemplate, AccountSettings
Backs the schema added in 210d310 with SQLAlchemy 2.0 models.

- SessionFact: "What we know" facts with polymorphic source_ref pointing
  at task-lane item UUIDs inside ai_sessions.pending_task_lane (not a FK
  per Section 4.2).
- SessionSuggestedFix: AI-proposed resolutions with supersession tracking
  and the full user_decision state machine.
- DraftTemplate: post-resolve templatization queue with promotion to
  script_templates.
- AccountSettings: per-account JSONB preferences grab-bag with async
  classmethod helpers — get_setting(db, account_id, key, default) reads
  without creating, set_setting(db, account_id, key, value) upserts via
  Postgres ON CONFLICT + jsonb `||` merge so existing keys are preserved.
  Lazy row creation matches the Phase 1 design.

Column additions on existing models to mirror the migration:
- AISession: resolution_note_* / escalation_package_* / state_version
  (the preview-cache-invalidation counter consumed by Phase 3).
- ScriptTemplate: source_session_id / source_user_id / source_ticket_ref
  (provenance for templates promoted from DraftTemplate).

All four new models registered in app.models.__init__ and __all__.
TYPE_CHECKING-guarded relationship imports throughout, matching the
repo's existing model style.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 18:35:00 +00:00
210d310fb2 feat(db): Phase 1 schema — session_facts, suggested_fixes, draft_templates, account_settings
Adds the backing store for the FlowPilot unified session surface, per
the FLOWPILOT-MIGRATION.md Phase 1 deliverable. Descends from production
head 074 (add_network_diagrams_table).

New tables (all tenant-scoped, all RLS-enabled + forced):
- session_facts — "What we know" facts. source_ref is a polymorphic
  pointer to a task-lane item inside ai_sessions.pending_task_lane
  (no DB-level FK; integrity enforced at service layer per Section 4.2
  of the design doc). Soft-delete via deleted_at; active-facts partial
  index excludes deleted rows.
- session_suggested_fixes — AI-proposed resolutions. One active per
  session at a time (supersession tracked via superseded_at; partial
  index on (session_id) WHERE superseded_at IS NULL powers the
  "find active fix" query).
- draft_templates — scripts pending post-resolve templatization.
  Partial index on (account_id) WHERE status='pending' supports the
  "N scripts ready to review" Script Library badge.
- account_settings — new per-account table with JSONB preferences
  grab-bag. Rows created lazily on first write; get_setting returns
  default when no row exists.

Column additions on ai_sessions:
- resolution_note_markdown / posted_at / external_id
- escalation_package_markdown / posted_at / external_id
- state_version (INTEGER NOT NULL DEFAULT 0) — incremented atomically
  by any write that invalidates the resolution note preview cache
  per Section 5.5. Phase 3 consumes this.

Column additions on script_templates:
- source_session_id, source_user_id, source_ticket_ref — powers the
  "generated from CW #X · resolved by Y · used N times" provenance
  chip in the Script Library.

RLS pattern matches the repo convention (074 / network_diagrams is the
nearest template): ENABLE + FORCE, USING + WITH CHECK on
`account_id = app.current_account_id`. Downgrade is reversible —
drops in the inverse order of creation so FK dependencies unwind.

No runtime verification from code-server; migration apply + downgrade
will be verified on the new dev environment per the standing deferral.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 18:14:26 +00:00
3f0a132058 refactor(ai): rename _call_anthropic_cached → chat_call_cached; extract cache plumbing (Phase 0.4)
Renames the chat caller to a name that signals its actual purpose, and
factors the reusable cached-system-block + cached-history + cache-usage-log
primitives out to app.core.ai_provider so they can be shared with the
provider-generic path without pulling MCP/beta/images into the abstract
interface.

Helpers added to ai_provider.py:
- `build_anthropic_chat_messages(history, new_message, images, format_reminder)`
  — owns: copy history, apply cache_control to last history message,
  append format reminder to new message, render images as multimodal blocks.
  Anthropic-shaped by design; do not call from Gemini paths.

chat_call_cached keeps exactly the concerns that are unique to the one
MCP/beta/multimodal chat caller:
- Anthropic beta endpoint invocation
- Microsoft Learn MCP server wiring (ENABLE_MCP_MICROSOFT_LEARN)
- Retry-without-MCP fallback
- Format-reminder content string (declared as module constant)
- Phase 0.5 telemetry (mcp.turn, mcp.fallback)

Documents in the module docstring AND at the function site that this is
the ONE MCP/beta chat caller and should not become the general provider
path. MCP/beta/images are features of exactly one optional Anthropic beta
endpoint; routing them through AnthropicProvider would leak a provider-
specific concern into the abstract interface that also serves Gemini.

Behavior change: chat_call_cached now reuses the singleton AnthropicProvider
HTTP client via `_get_anthropic_client(...)` instead of instantiating a new
`anthropic.AsyncAnthropic(...)` per call. Matches the provider's own pattern
and avoids burning connections per-turn. No user-visible difference.

No runtime verification from code-server. TODO(phase0-verify) in
ai_provider.py tracks the cache-hit verification owed on the new dev env.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 17:03:09 +00:00
da93ae55c3 feat(ai): opt-in structured-system-block caching for one-shot generators (Phase 0.3)
Wraps each static system prompt in a single-block list so Phase 0.1's
AnthropicProvider applies cache_control: ephemeral automatically (policy α,
first block gets marked when no caller-authored cache_control is present).

Call sites:
- ai_tree_generator.scaffold_branches: SCAFFOLD_SYSTEM_PROMPT (~1k tokens)
- ai_tree_generator.generate_branch_detail: BRANCH_DETAIL_SYSTEM_PROMPT
  (~2.5k tokens with few-shot example); retries inside the same function
  re-read the cached block instead of paying full input cost on each attempt
- kb_conversion.convert_document: TROUBLESHOOTING or PROCEDURAL prompt
  (each caches independently by text content)
- ai_fix.generate_fixes: FIX_SYSTEM_PROMPT on first attempt + corrective retry
- script_builder.send_message: SYSTEM_PROMPT_TEMPLATE (per-session language
  substitution — same-language sessions share cache entries)

Each edit includes an inline comment explaining why the block is cacheable
(stable-constant, retry-reuse, per-language variant) so a future dev can
see the intent at the cache_control marker site.

script_builder history caching deliberately deferred — per Phase 0.1
decision (option i), AnthropicProvider does not automatically cache the
message list. If script_builder's growing 20-message history turns out
to be a visible cost driver via the anthropic.cache telemetry, route
that caller through the 0.4 chat wrapper which handles history caching.

No runtime verification from code-server; cache-hit behavior will be
confirmed against the new dev environment when it's up, per the inline
TODO(phase0-verify) in ai_provider.py.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 16:29:45 +00:00
b3be66652e feat(ai): structured-system-block caching in AnthropicProvider (Phase 0.1)
Widens AIProvider.generate_json / generate_text / generate_text_stream
signatures to accept `system_prompt: str | list[SystemBlock]`:

- `str` (the existing call shape): passes through uncached, unchanged
  behavior. Every existing caller stays on the uncached path — no silent
  behavior change.
- `list[SystemBlock]`: enables Anthropic prompt caching via structured
  system blocks. Caller-authored `cache_control` is honored verbatim
  (policy α); if no block carries it, the provider applies
  `cache_control: {"type": "ephemeral"}` to the first block only.

Gemini ignores cache_control and concatenates list entries into one
system string — the widened signature is strictly additive on that path.

Adds `anthropic.cache` structured-log telemetry: on every Anthropic
response (streaming included, via `stream.get_final_message()`), logs
`cache_read_input_tokens` and `cache_creation_input_tokens`. Telemetry
failure in streaming is swallowed so the user-facing stream never breaks.

Verification deferred: cannot run from code-server (no Python, no DB,
no dev env). TODO(phase0-verify) left inline in the module docstring.
First verification task on the new dev environment is to hit any
FlowPilot endpoint twice within 5 minutes and confirm the second call
shows cache_read_input_tokens > 0 in the `anthropic.cache` log event.
If verification fails, that's a debug task on the new env — not a
blocker for continuing Phase 0.2/0.3/0.4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 16:17:12 +00:00
0fbc1e0a57 feat(telemetry): add MCP per-turn structured-log telemetry (Phase 0.5)
Emits structured `mcp.turn` log events on every Anthropic-path chat turn,
capturing whether MCP was wired in (mcp_available), whether the model
actually invoked an MCP tool (mcp_invoked), which tool names fired,
and whether the silent retry-without-MCP fallback was triggered.
Adds a separate `mcp.fallback` event with error type/message for
fallback occurrences.

Establishes baseline data for deciding whether MCP investment is earning
its keep before Phase 2+ expands the product footprint. Scope: the one
MCP-using code path (`_call_anthropic_cached`) — not a general
instrumentation layer.

No new dependencies, no schema changes, no behavior change. Standard
library `logging` is the sink; PostHog is not wired on the backend.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 15:57:13 +00:00
0d9babb986 fix(rls): add account_id to AISessionStep creations, fix boards toast
Some checks failed
CI / backend (push) Failing after 16m37s
CI / frontend (push) Failing after 45s
CI / e2e (push) Has been skipped
Mirror to GitHub / mirror (push) Successful in 3s
- flowpilot_engine: pass account_id at all 5 AISessionStep instantiation
  sites (_create_step_from_parsed x3, briefing step, status update step).
  Phase 4 RLS blocked every INSERT with NULL account_id — this broke all
  new FlowPilot sessions since the Phase 4 migration was applied.
- integrations: list_boards returns [] on PSAError instead of 502, stopping
  the spurious 'Server error' toast on dashboard load (boards are optional).
- client.ts: 5xx global toast now shows backend detail when available.
- useFlowPilotSession: startSession extracts backend detail for error state;
  suppresses duplicate toast for 5xx (global interceptor already handles it).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 04:41:14 +00:00
567985402f fix(psa): use board/id in (...) for multi-board filter per CW docs
Some checks failed
CI / frontend (push) Has been cancelled
CI / e2e (push) Has been cancelled
CI / backend (push) Has been cancelled
Mirror to GitHub / mirror (push) Successful in 2s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 03:54:05 +00:00
08a4c6600d fix(psa): use resources contains identifier for my tickets filter
Some checks failed
CI / frontend (push) Has been cancelled
CI / e2e (push) Has been cancelled
CI / backend (push) Has been cancelled
Mirror to GitHub / mirror (push) Successful in 3s
CW resources field is a plain string of member identifiers (login names),
not a navigable object. resources/member/id was invalid syntax causing 403.

Now resolves the CW member identifier from the cached member list and
uses: resources contains '{identifier}' which is the correct condition.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 03:53:26 +00:00
29fa48e71b fix(psa): revert to resources/member/id for my tickets filter
Some checks failed
CI / backend (push) Has started running
CI / frontend (push) Has been cancelled
CI / e2e (push) Has been cancelled
Mirror to GitHub / mirror (push) Has been cancelled
Requires CW API member security role to have All scope on Service Tickets.
owner/id was incorrect for workflows using resources-based assignment.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 03:48:10 +00:00
908a867986 fix(psa): use owner/id instead of resources/member/id for my tickets filter
Some checks failed
CI / frontend (push) Has been cancelled
CI / e2e (push) Has been cancelled
CI / backend (push) Has been cancelled
Mirror to GitHub / mirror (push) Has been cancelled
resources/member/id requires All scope on Service Tickets security role.
owner/id (primary assignee) works with standard Mine scope.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 03:43:34 +00:00
346576a730 feat(psa): ticket queue dashboard with board selector and session auto-start
Some checks failed
CI / frontend (push) Has been cancelled
CI / e2e (push) Has been cancelled
CI / backend (push) Has been cancelled
Mirror to GitHub / mirror (push) Successful in 2s
- Add PSABoard type + list_boards() to CW provider (cached 1h)
- Extend search_tickets with assigned_to_me, unassigned, board_ids, page, page_size
- New GET /integrations/psa/boards endpoint
- New TicketQueue dashboard component: My Tickets / Unassigned tabs,
  multi-select board filter, Load more pagination, Start Session per ticket
- Add TicketQueue to QuickStartPage after active sessions
- FlowPilotSessionPage auto-starts with ticket context when navigated
  from TicketQueue (psaTicketId + psaTicket in location.state)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 03:20:45 +00:00
b18072e24b fix(psa): set account_id on PsaMemberMapping in save and auto-match
Some checks failed
CI / frontend (push) Has been cancelled
CI / e2e (push) Has been cancelled
CI / backend (push) Has been cancelled
Mirror to GitHub / mirror (push) Successful in 2s
2026-04-15 02:59:49 +00:00
chihlasm
4037a5213e fix(admin): use EmailStr for owner_email validation in AdminAccountCreate
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 21:25:03 +00:00
chihlasm
0ed5977fee feat(admin): allow setting owner when creating an account
Adds optional owner_email field to the Create Account modal. Superadmin
can specify an existing user's email to assign as account owner at
creation time. Backend 404s with a clear message if the email is unknown.
Error detail now surfaces to the toast instead of a generic message.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 14:30:23 +00:00
chihlasm
c5b8229ef6 fix(admin): allow owner and admin account roles in user creation and role management
Four places were hardcoded to engineer|viewer only:
- AccountRoleUpdate schema (user.py) — blocked PUT /admin/users/{id}/account-role at the API level
- AdminUserCreate schema (admin.py) — blocked creating users with owner/admin role
- AccountDetailPage role dropdowns (create form + inline member role changer)
- AccountsPage create user role dropdown

Now all four accept the full set: owner, admin, engineer, viewer.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 13:24:17 +00:00
chihlasm
8eb814283d fix(psa): fix time entry AttributeError and show all users in member mapping
- Fix create_time_entry() using self._client instead of self.client
- GET /member-mappings now returns all active account users, not just mapped
  ones — allows manual assignment when auto-match by email doesn't work
- PsaMemberMappingResponse mapping fields are now Optional (id, external_member_id,
  external_member_name, matched_by) to represent unmapped users
- Frontend MemberMappingTab skips null external_member_id when building
  localMappings, and derives user list from all returned entries
- Add docs/connectwise-psa-testing-checklist.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 06:09:01 +00:00
chihlasm
c8f571db39 feat(network): thumbnail generation on save, shown on list page
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 01:22:51 +00:00
chihlasm
4a12c9b37d fix(network): persist group node type, size, and child parentId on save/load
Backend DiagramNode schema was missing nodeType, style, and parentId fields —
Pydantic stripped them on save, so group nodes lost their identity on reload
and re-appeared as small device icons.

- Backend: add nodeType, style (NodeStyle), parentId to DiagramNode schema
- Frontend: serialize parentId for device nodes inside groups
- Frontend: restore parentId + extent:'parent' on both deserializer paths (setNodes + history init)
- Frontend: add parentId to DiagramNode interface

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 23:49:26 +00:00
chihlasm
a71f082e25 feat: extract admin account management rework from PR 124 (#138)
* feat: reorganize admin panel around accounts

* feat: expand admin customer account controls

* feat: add admin account detail management

* fix: remove unused admin account icon import

* refactor: design critique fixes for account pages

- Admin accounts: replace dense card grid with compact DataTable
- Account settings: remove redundant hero card, stat grid, header pills
- Fix bg-accent (orange) misuse on decorative elements across 7 files
- Add ConfirmButton for destructive actions (deactivate, remove member)
- Replace single-field modals with inline editing (plan, trial)
- Add contextual help: display code tooltip, improved empty states
- Non-owner aside explanation for hidden owner-only sections
- Admin sidebar: group 11 items into 5 labeled sections
- Rename UsersPage.tsx → AccountsPage.tsx to match route
- Fix border radius consistency, hide zero-count badges

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use get_admin_db for all new admin account endpoints

All admin endpoints query across tenants without a tenant context.
get_db (app-role, subject to RLS) was never imported and would crash
at runtime — replace all 6 occurrences with get_admin_db (BYPASSRLS).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 04:44:51 -04:00
chihlasm
abd79bc763 feat: extract network map builder from PR 124 (#137)
* feat: add device_types table with system seed data

Creates DeviceType SQLAlchemy model and migration 073 that provisions the
device_types table with 28 system-seeded device types across 7 categories
(network, compute, storage, cloud, endpoint, infrastructure, security).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add network_diagrams table

Create NetworkDiagram SQLAlchemy model with JSONB nodes/edges, team-scoped with client/asset metadata, and Alembic migration 074.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add Pydantic schemas for device types and network diagrams

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add device types CRUD router

Adds GET/POST/PUT/DELETE endpoints at /device-types with team-scoped access. System types are read-only; custom types are scoped to the creating team.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add AI generation service for network diagrams

Adds network_diagram_ai_service.py with generate_diagram() function that
calls the AI provider to convert plain-English network descriptions into
structured DiagramNode/DiagramEdge data. Registers the action in
ACTION_MODEL_MAP as a standard-tier route.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add network diagrams CRUD + AI generate + export/import router

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add TypeScript types for network diagrams

Adds all interfaces for network diagrams and device types including
DiagramNode, DiagramEdge, DeviceProperties, NetworkDiagramResponse,
AI generate request/response, import/export shapes, and list item types.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add frontend API clients for device types and network diagrams

Adds deviceTypesApi (list, create, update, remove) and networkDiagramsApi
(list, get, create, update, archive, duplicate, exportJson, importJson,
aiGenerate, listClients) following the existing apiClient module pattern.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: add device registry, DeviceNode, ConnectionEdge for React Flow

Creates the React Flow building blocks for the network diagram editor:
device type registry with icon/color mappings, DeviceNode component with
status indicators and connection handles, ConnectionEdge with per-type
styling, and nodeTypes/edgeTypes registration maps.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add DeviceToolbar panel with search, categories, drag-drop, custom type creation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add PropertiesPanel for node and edge property editing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add AIAssistPanel with replace and merge modes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add NetworkCanvas wrapper and DiagramHeader components

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add DiagramEditor page assembling all panels with auto-save and AI generation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add Network Diagrams list page with search, client filter, import

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add Network Maps to sidebar navigation and router

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: resolve TypeScript errors in DeviceToolbar and DiagramEditor

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: resolve stale selection bug in network diagram PropertiesPanel

Selection state now stores IDs and derives objects from live arrays,
so edits in PropertiesPanel inputs reflect immediately.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add React Flow UI foundation components for network diagrams

BaseNode (structured node shell with header/content/footer slots),
BaseHandle (styled connection handle), LabeledHandle (handle with
port label), NodeStatusIndicator (status border effect),
NodeTooltip (hover details via NodeToolbar).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add LabeledGroupNode and AnimatedSvgEdge components

GroupNode for subnet/VLAN/site grouping with positioned label badge.
AnimatedSvgEdge for traffic flow visualization with animated SVG
shape along edge path. Both registered in type maps.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor: DeviceNode uses BaseNode, BaseHandle, StatusIndicator, Tooltip

Replaces hand-rolled node layout with composable React Flow UI
components. Status is now a border effect instead of a dot.
Hover tooltip shows hostname, IP, vendor, role, notes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add grouping toolbar items and traffic flow toggle

DeviceToolbar gets Subnet/VLAN/Site/DMZ grouping section with
drag-drop. PropertiesPanel gets Show Traffic toggle that switches
edges between connection and animated types. DiagramEditor handles
both device and group node drops.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address code review findings for React Flow UI integration

- Use screenToFlowPosition() for drop coordinates (fixes zoom/pan bug)
- Remove duplicate selection border from DeviceNode (BaseNode handles it)
- Add w-full to GroupNode for proper container sizing
- Remove unused 'selected' destructuring from DeviceNode

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add ISP icon to network diagram device registry

Globe icon with accent color, under cloud category.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: improve drag-and-drop feel in network diagram editor

Grip icons on draggable toolbar items, press effect on drag start,
dashed border overlay with 'Drop to add' text when dragging over canvas.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add ContextMenu component for network diagram editor

Charcoal-styled context menu with action factories for node
and canvas variants. Viewport-clamped positioning, auto-dismiss
on click outside, escape, or scroll.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add useCanvasShortcuts hook for copy/paste/duplicate

Keyboard shortcuts with preventDefault and input guard.
Clipboard stores nodes with relative positions and edge indices.
Paste computes canvas center via screenToFlowPosition.
Duplicate offsets +30px. Supports both device and group nodes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: wire context menu and keyboard shortcuts into diagram editor

Right-click context menus for nodes (copy/duplicate/delete) and
canvas (paste/select-all/fit-view). Right-click selects the node
per spec. serializeNodes now handles group nodes correctly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: context menu dismisses on pane click, ISP in toolbar

Context menu now closes when clicking anywhere on the canvas via
onPaneClick prop. ISP device added as built-in toolbar item under
Internet section so it's always available without a database entry.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: backend code review fixes for network diagrams

- Replace legacy Optional imports with modern str | None syntax
- Type JSONB columns as Mapped[list[dict[str, Any]]]
- Escape SQL LIKE wildcards (%, _) in diagram search
- Type DiagramNode.position as Position(x, y) Pydantic model
- Wrap AI response parsing in KeyError handler for clean 422 errors
- Remove unused Optional/TYPE_CHECKING imports from schemas/models
- Extract _get_available_slugs helper to DRY duplicate queries

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: network diagram editor UX — straight edges, snap-to-grid, ISP in Cloud, group resize

- Straight edges: replace SmoothStepEdge with BaseEdge + getStraightPath so
  connections draw direct diagonal lines instead of orthogonal bent paths
- Snap-to-grid: add snapToGrid/snapGrid=[20,20] to NetworkCanvas so nodes
  align consistently when dragged
- ISP in Cloud: remove standalone "Internet" sidebar section, inject ISP into
  the Cloud category loop with search support and correct item count
- Group node resize: add NodeResizer to GroupNode (subnet/VLAN/site/DMZ),
  handles visible when selected; dimensions saved/restored correctly on
  reload (also fixes group node load bug where type was always 'device')
- DiagramNode type: add nodeType and style optional fields

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: network diagram team_id guard + multi-style edge routing

Backend:
- Guard create_diagram with 422 if current_user.team_id is None (prevents
  NOT NULL constraint crash for accounts not yet assigned to a team)
- Add routing field to DiagramEdge schema (straight/curved/step)

Frontend:
- ConnectionEdge now supports straight (default), curved (bezier), and
  step (smooth-step) routing per-edge via routing field in edge data
- PropertiesPanel Connection section gets a Line Style toggle:
  Straight | Curved | Step buttons, active state highlights in accent
- handleEdgeUpdate and serializeEdges now propagate the routing field
- DiagramEdge type gets optional routing field

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: network diagrams UX overhaul — icons, empty canvas, properties panel

- Colorize: semantic category colors for all device types (network=blue,
  security=orange, compute=emerald, endpoint=amber, storage=violet,
  cloud=cyan, infra=steel); better icons (Router, ShieldAlert, Boxes,
  Package, Gauge, PlugZap, Video, Radio); MiniMap uses category colors
- Onboard: centered AI generate prompt on empty canvas with 5 MSP-specific
  example chips, ⌘↵ shortcut, spinner; AIAssistPanel only shown with nodes
- Arrange: properties panel — status badge grid at top, fields grouped into
  Network (IP/Subnet/VLAN) and Hardware (Hostname/Vendor/Model/Role) sections
- Delight: segmented topology color bar on listing cards; backend returns
  category_counts via single extra query on list endpoint
- Harden: real PNG export via html-to-image + getNodesBounds/getViewportForBounds
- Polish: ChevronDown replaces unicode ▾, click-outside for client filter,
  consistent spinner in empty prompt

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: drop changelog noise from network extraction

* fix: align network map builder with account isolation

* feat: add manual create option for network maps

* feat: make manual network map creation easier to discover

* fix(network-maps): address design critique — harden, normalize, clarify, polish

- Archive: two-step inline confirm in card dropdown menu
- Delete Device/Edge: two-step inline confirm in PropertiesPanel footer
- Context menu Delete: floating confirm bar instead of immediate deletion
- AI Generate New: two-step confirm when replacing existing diagram nodes
- DiagramHeader: show 'Unsaved changes' in amber when isDirty and not saving
- deviceRegistry: SECURITY_COLOR #f97316 → #f87171 (deprecated ember orange removed)
- CanvasEmptyPrompt: remove backdrop-blur (design system violation)
- CanvasEmptyPrompt: remove redundant 'Skip AI' bottom button (duplicate of Build manually card)
- CanvasEmptyPrompt: rounded-xl/rounded-2xl → rounded-lg, border-2 → border
- Topology bar: h-1 → h-2 + native tooltip with category breakdown
- AIAssistPanel: replace pulse-dot loading with spinner (consistent with rest of feature)
- ContextMenu: add shadow-lg (consistent with other dropdowns)
- DeviceNode tooltip: Position.Bottom → Position.Top (avoids canvas-edge clipping)
- CanvasEmptyPrompt: raise ⌘↵ hint from /50 opacity to full text-muted-foreground

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(network-maps): bring to front / send to back layering for nodes

Three entry points for z-index control:
- Right-click context menu: Bring to Front / Send to Back with ] / [ shortcuts, separated by dividers from copy/delete groups
- Properties panel: Layer row with Bring Front + Send Back buttons, tooltip shows keyboard shortcut
- Keyboard: ] brings selected node(s) to front, [ sends to back (skips when input focused)

Context menu also gains divider support (dividerBefore flag) for visual grouping.
Layering handlers use max/min zIndex across all nodes so repeated presses always stack correctly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: swap switch icon from Layers → Network (Lucide)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: icon size picker (S/M/L) on device nodes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat: drag-to-resize device nodes + BrickWallFire for firewall

- NodeResizer on DeviceNode (same pattern as group nodes); icon scales
  proportionally with node width, clamped 16–60px
- Removes S/M/L static picker — resize is now direct manipulation
- firewall: ShieldAlert → BrickWallFire

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: trigger Railway rebuild

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: add missing hero_001.jpg to git (was untracked, broke Railway deploy)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: ShieldAlert still referenced in CATEGORY_DEFAULTS after icon swap

Removed ShieldAlert from imports when swapping firewall icon to BrickWallFire
but left it in CATEGORY_DEFAULTS — runtime crash, device toolbar empty.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(network): proportional node resize with locked aspect ratio

Nodes grew into rectangles because NodeResizer had no aspect ratio
constraint, minWidth != minHeight, and icon/text only scaled from width.

- DeviceNode: add keepAspectRatio + equal minWidth/minHeight (80×80),
  maxWidth/maxHeight (280×280), scale icon and label/IP font sizes from
  Math.min(width, height) so all content grows uniformly
- DiagramEditor: set explicit 120×120 style on dropped device nodes so
  React Flow has a definite starting size for aspect ratio calculation
- DiagramEditor: persist device node style (width/height) in
  serializeNodes and restore it on load so size survives save/reload

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(lint): suppress ESLint errors in network diagram components

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 02:38:01 -04:00
chihlasm
52553d62d2 fix(tests): update expectations for RLS-correct behavior
- test_rls_isolation: add pytestmark for module-scoped event loop to fix
  "Future attached to a different loop" with pytest-asyncio 0.23 + asyncpg
  module-scoped fixtures
- test_admin_categories_global: global categories use PLATFORM_ACCOUNT_ID
  not NULL; update stale assertion
- test_permissions_account: with RLS, cross-tenant tree access returns 404
  (invisible) not 403 (forbidden) — update to match actual behavior

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 03:48:30 +00:00
chihlasm
a48660700a fix: background jobs and lifespan must use BYPASSRLS sessions
All code that runs outside a request context (APScheduler jobs,
lifespan startup) has no app.current_account_id set, so the
app-role session returns 0 rows from every RLS-protected table.

Changed to _admin_session_factory (BYPASSRLS) in:
- knowledge_flywheel_scheduler.py — queries ai_sessions
- psa_retry_scheduler.py — queries psa_post_log
- retention_cleanup.py — queries assistant_chats
- scheduler.py (_fire_maintenance_schedule, _cleanup_expired_ai_conversations)
- main.py (archive_stale_ai_sessions, _process_notification_retries,
  load_all_schedules at startup)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 03:44:23 +00:00
chihlasm
3ff886363c fix: use BYPASSRLS session for all auth deps and user-mutation endpoints
Phase 4 enabled RLS on the users table. All code paths that touch users
(or other RLS-protected tables) before require_tenant_context sets
app.current_account_id must use get_admin_db (BYPASSRLS):

- deps.py: get_current_user and get_current_active_user → get_admin_db
- auth.py: all endpoints → get_admin_db (login, register, refresh, etc.
  run before tenant context exists; mutation endpoints also need session
  consistency since current_user is in the admin session)
- accounts.py: transfer_ownership, leave_account, delete_account
  → get_admin_db (these mutate current_user directly)
- onboarding.py: dismiss_onboarding → get_admin_db (same reason)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 03:25:18 +00:00
chihlasm
501442e5f0 fix: seed_test_users must use ADMIN_DATABASE_URL after Phase 4 RLS on users
RLS is now enabled on the users table. The seed script was using the
app-role connection (DATABASE_URL) which has no tenant context at seed
time — all SELECTs return 0 rows and INSERTs are blocked by FORCE RLS.

Falls back to DATABASE_URL if ADMIN_DATABASE_URL is not set (local dev
without roles configured).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 03:12:46 +00:00
chihlasm
ec322f7cdf fix: bootstrap service account with BYPASSRLS session 2026-04-12 02:44:36 +00:00