resolutionflow

Author	SHA1	Message	Date
Michael Chihlas	029680ab2d	feat(escalations): unify /escalate through HandoffManager Replaces the legacy flowpilot_engine.escalate_session orchestration with a single canonical path through HandoffManager. Every escalation now creates a SessionHandoff row, fans out via the SSE bus, persists AppNotification rows for the bell icon, dispatches to external channels (Slack/Teams) via notify(), and emails per-user — regardless of whether the call entered through /escalate (legacy URL) or /handoff (new URL). The senior-pickup magic-moment screen now works end-to-end from the EscalateModal bell-icon path the user just tested. Backend - HandoffCreateRequest gains optional target_user_id (the equivalent of the legacy escalated_to_id field). Self-targeting rejected. - HandoffManager.create_handoff handles intent='escalate' end-to-end: sets escalation_reason + escalated_to_id, builds the legacy enhanced AI escalation_package (Sonnet, lazy-imported from flowpilot_engine, graceful fallback on failure), and merges handoff metadata into it. Eager-loads session.steps and session.user via selectinload — required by both the enhanced-package builder and notify() to avoid MissingGreenlet on async lazy access. - HandoffManager.finalize_escalation generates SessionDocumentation, pushes documentation to PSA, and runs notify() — pre-commit so the AppNotification rows persist atomically with the handoff. - HandoffManager.dispatch_escalation_notifications keeps only the fire-and-forget IO (bus publish, per-user emails) — runs post-commit. Pulls engineer name via a separate User query rather than relying on session.user lazy access. - /handoff endpoint passes target_user_id through and calls finalize_escalation pre-commit.
- /escalate endpoint is now a thin shim: owner-only session lookup, HandoffManager.create_handoff(intent='escalate'), finalize_escalation, commit, dispatch_escalation_notifications, return SessionCloseResponse built from documentation + psa_result. flowpilot_engine.escalate_session is no longer called by any endpoint. - pickup_session accepts both 'requesting_escalation' (legacy in-flight sessions) and 'escalated' (new canonical) so the migration is seamless for sessions already in the queue. - Escalation queue list and sidebar count now match either status. Frontend - useFlowPilotSession optimistic update flips status to 'escalated' instead of 'requesting_escalation' so the page state matches the unified backend response. Verified end-to-end live: a fresh /escalate call from the junior produces status='escalated', a SessionHandoff row, a SessionDocumentation, PSA push attempted (no_psa for this test session), AND a bell-icon AppNotification for the team admin with link /pilot/{session_id}?pickup=true. Backend test suite: 1103 passed. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 22:27:26 -04:00
Michael Chihlas	641853a002	fix(escalations): bell-icon notification opens the pickup flow Two backend changes that unbreak the senior-pickup path from the notification panel: 1. notification_service: session.escalated link template now ends with ?pickup=true so the senior lands in the handoff/pickup flow on click. Without it, navigation hit /pilot/:id directly, which then 404'd on the GET because the senior isn't yet escalated_to_id — the user perceives this as the bell-icon "just clearing the notification". 2. ai_sessions GET access: any account member can now read an escalated session's detail when status is requesting_escalation or escalated. The owner-only guard was overly restrictive for explicitly-shared in-transit states. Tenant boundary is enforced by RLS on the underlying query, so account-scope is the right ceiling here. After pickup, the existing handler/escalated_to_id checks still apply. Verified live: re-login as the senior engineer and GET the active escalated session — now returns 200 with full detail. Focused test subset plus tests/test_sessions.py and tests/test_session_sharing.py → 94 passed in 43.26s, no regressions. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 21:29:47 -04:00
Michael Chihlas	9bdd9959a8	fix(handoff): bound escalation assessment latency Co-Authored-By: Codex <noreply@openai.com>	2026-04-27 20:03:14 -04:00
Michael Chihlas	87bd0b7c56	WIP: SSE pub/sub for live escalation arrivals (paused for Codex review) First half of the WebSocket/SSE push slice. Paused mid-flight to hand the branch to Codex for outside-voice review before stacking more commits on top. See .ai/HANDOFF.md for the full pause context + what to look at. What's here: - backend/app/core/escalation_bus.py — module-level singleton in-memory pub/sub keyed by account_id. asyncio.Queue per subscriber with 64-event maxsize and drop-on-full semantics. Designed to be swappable for Redis pub/sub when Railway scales past single-replica. - backend/app/api/endpoints/session_handoffs.py — GET /api/v1/ai-sessions/escalations/stream SSE endpoint. Auth via require_engineer_or_admin. 25s heartbeat. Account-scoped subscribe bound to current_user.account_id. - backend/app/services/handoff_manager.py — dispatch_escalation_notifications now publishes a `handoff_created` event to the bus BEFORE the email fan-out, in a try/except so a bus failure can't block email delivery. - backend/tests/test_escalation_bus.py — 7 unit tests, all green standalone (0.14s). Cross-tenant isolation, drop-on-full, no-subscribers. - backend/tests/test_handoff_manager.py — +1 dispatcher integration test (publishes to bus, payload shape). - backend/tests/test_session_handoffs_api.py — +2 endpoint tests (viewer blocked, ready event handshake). 
[gstack-context] Decisions: - SSE over WebSocket (one-way, browser EventSource semantics, fewer moving parts behind Railway proxy) - In-memory bus over Redis for v1 pilot (3 MSPs, single replica) - Drop-on-full subscriber queue rather than back-pressure publishers - Bus publish ahead of email send, both wrapped in try/except so neither can break handoff creation - Frontend will be a fetch-based ReadableStream reader matching the existing streamDocumentation pattern, not native EventSource (custom-header auth) Remaining (post-Codex): - Frontend SSE subscription in EscalationQueue.tsx (slide-in, reconnect, tab-title flash, prefers-reduced-motion) - Magic-moment handoff-context screen - Re-run the full backend test suite to verify the SSE + dispatcher integration tests (bus units already green standalone) Tried: - Running the full test suite repeatedly without xdist; the per-test DROP SCHEMA + recreate fixture made wall-clock prohibitive when multiple stale runs collided on the same Postgres test schema. Resolution: -n auto next time. [/gstack-context] Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 19:29:07 -04:00
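The bus described in commit 87bd0b7c56 can be pictured with a minimal sketch like the one below. It is an illustration under stated assumptions, not the contents of escalation_bus.py: the class name `EscalationBus` and its method names are hypothetical, and only the account_id keying, the per-subscriber asyncio.Queue with a 64-event maxsize, and the drop-on-full behavior come from the commit message.

```python
import asyncio
from collections import defaultdict


class EscalationBus:
    """In-memory pub/sub keyed by account_id (hypothetical sketch)."""

    def __init__(self, maxsize: int = 64) -> None:
        self._maxsize = maxsize
        self._subscribers = defaultdict(set)  # account_id -> set of queues

    def subscribe(self, account_id: str) -> asyncio.Queue:
        # Each subscriber gets its own bounded queue.
        queue = asyncio.Queue(maxsize=self._maxsize)
        self._subscribers[account_id].add(queue)
        return queue

    def unsubscribe(self, account_id: str, queue: asyncio.Queue) -> None:
        self._subscribers[account_id].discard(queue)

    def publish(self, account_id: str, event: dict) -> int:
        """Deliver to this account's subscribers only; return the delivered count."""
        delivered = 0
        for queue in list(self._subscribers.get(account_id, ())):
            try:
                queue.put_nowait(event)
                delivered += 1
            except asyncio.QueueFull:
                # Drop-on-full: a slow subscriber misses the event
                # instead of applying back-pressure to the publisher.
                pass
        return delivered
```

Because `publish` never awaits, a stalled SSE client cannot delay handoff creation; the trade-off is that such a client may miss events until it reconnects.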
Michael Chihlas	07d0db9579	feat(handoff): email engineer-or-admin teammates on escalation First half of the Escalation Mode notification dual-path. WebSocket/SSE push is the second half (next commit) — email handles offline seniors, push handles online ones for the magic-moment demo. HandoffManager.dispatch_escalation_notifications: - Pulls active engineer/admin/owner-role users in the same account_id (excludes the escalator + viewers + soft-deleted) - Sends via existing EmailService.send_notification_email, concurrent via asyncio.gather; per-message failures don't block the rest - Wrapped in try/except: any exception is logged + swallowed. Handoff creation is authoritative; notification is advisory. This is the graceful-degradation regression both eng + codex reviews flagged as critical (handoff must succeed even if SMTP is down). Endpoint wiring (POST /ai-sessions/{id}/handoff): - Dispatch fires AFTER db.commit() — never email about a rolled-back handoff. Trust-erosion bug if we got that wrong. - Only fires for intent=escalate. Park is private to the escalator. Tests (4 new): - emails-engineer-recipients-in-account: viewer excluded, escalator excluded, only the engineer/admin teammates get the message - skipped-for-park-intent: park doesn't fan out - graceful-degradation-when-email-raises: RuntimeError from the email service does NOT bubble out of dispatch - endpoint-dispatches-on-escalate: end-to-end wiring through POST Per-channel delivery records (replacing the dead `notification_sent` boolean per Codex correction) is a v1.x story — for now application logs are the audit trail. See docs/plans/2026-04-27-escalation-mode-wedge-design.md. 20 tests green across handoff_manager + session_handoffs_api + flowpilot_analytics_escalations. No regressions. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 15:58:05 -04:00
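The graceful-degradation rule in commit 07d0db9579 (concurrent sends, per-message failures isolated, nothing bubbling out of dispatch) might look roughly like the sketch below. The function name `dispatch_emails` and the `send_email` callable are hypothetical stand-ins; the real dispatcher goes through EmailService.send_notification_email, whose signature is not shown in this log.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


async def dispatch_emails(send_email, recipients, subject):
    """Send to every recipient concurrently; never raise to the caller."""
    try:
        results = await asyncio.gather(
            *(send_email(to, subject) for to in recipients),
            return_exceptions=True,  # one failed message must not cancel the rest
        )
    except Exception:
        # Notification is advisory: log and swallow anything unexpected.
        logger.exception("escalation notification dispatch failed")
        return 0
    failures = [r for r in results if isinstance(r, Exception)]
    for err in failures:
        logger.warning("escalation email failed: %r", err)
    return len(results) - len(failures)
```

With `return_exceptions=True`, a failing send shows up as an exception object in the results list, so the remaining sends still complete and the caller only sees a delivered count.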
Michael Chihlas	49f88569da	wip(handoff): restore backend suite to green Co-Authored-By: Codex <noreply@openai.com>	2026-04-25 06:13:23 -04:00
Michael Chihlas	1c904373f8	Merge main into feat/flowpilot-migration Brings in PR #141 (PSA ticket management) so FlowPilot can ship on top of a unified main. Two manual conflict resolutions: 1. CLAUDE.md — kept the FlowPilot ai-handoff rewrite (`.ai/`-driven protocol). The pre-rewrite reference content (CW integration notes, lessons archive, env vars table) lives in `docs/connectwise/`, `docs/LESSONS-ARCHIVE.md`, and DEV-ENV.md by design. 2. frontend/src/pages/AssistantChatPage.tsx — both conflict regions were purely additive. Concatenated FlowPilot's Phase 2-9 state hooks (facts, activeFix, preview*, scriptPanelOpen, templatizeQueue) with PSA's spin-off ticket state (linkedTicket, showNewTicket, spinOffHint). Both modal mounts (TemplatizePrompt, ShortcutsHelpOverlay, NewTicketModal) kept. All setters wired by either branch are intact. Verification: - `tsc -b` clean across the merged tree. - Browser smoke-test (Session B fixture): Phase 9 ProposalBanner ("Run AI-drafted PowerShell to recover SSL VPN") renders alongside PSA's new Tickets sidebar icon. Console clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-25 01:03:33 -04:00
Michael Chihlas	d4fae87236	feat(pilot): inline Script Builder session — idempotent create + auth + filtered list POST /script-builder/sessions now supports origin='pilot_inline': - Requires ai_session_id; validates it against current user ownership. - Get-or-create: returns existing row for (user, ai_session_id) pair. - Partial unique index on the DB backs the invariant; races resolve to the single winner row. list_sessions + count_user_sessions default-scope to origin='standalone' so inline scratch sessions don't pollute the /script-builder dashboard or count against the 5-session cap. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-24 02:24:57 -04:00
Michael Chihlas	362c7b1d79	fix(pilot): outcome-aware Resolve/Escalate previews Issue #1 from phase-8-review-issues.md. Cache invalidation alone isn't enough — previews were also omitting outcome fields from the LLM bundle, so a fresh regenerate still couldn't distinguish proposed / failed / partial / success. - PATCH /outcome now bumps ai_sessions.state_version (matches record_decision's existing pattern). - Resolution-note + escalation-package bundles now include status, applied_at, verified_at, partial_notes, failure_reason on the active fix. - Generator prompts prescribe outcome-aware phrasing (closure language for success; what-we've-tried + next-steps for failed/partial). - New end-to-end test asserts the regenerated preview reflects the recorded outcome, not just that the cache key changed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 22:04:56 -04:00
Michael Chihlas	2cde6673b0	feat(pilot): [FIX_OUTCOME] system prompt instructions Tells the AI when + how to emit the [FIX_OUTCOME] marker that Task 4's parser consumes. Placeholder-only per the anti-parrot pattern — no literal UUIDs, outcomes, or reasons that could leak into unrelated sessions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 15:17:21 -04:00
Michael Chihlas	c0112f8bee	feat(pilot): [FIX_OUTCOME] marker parser + AI outcome proposal The AI emits [FIX_OUTCOME] when the engineer indicates in chat that a prior suggested fix worked, didn't work, or was partially applied. The marker writes to session_suggested_fixes.ai_outcome_proposal (JSONB), which the frontend surfaces as a "confirm outcome?" banner. The status column is only updated when the engineer clicks confirm (via PATCH /outcome endpoint from Task 3). Placeholder-only system prompt wiring comes in Task 5. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 15:08:43 -04:00
Michael Chihlas	d0ebdef9e8	fix(ai): full-sweep audit — placeholders only in system prompts + CI guardrail The "AI parrots example content from system prompt" bug bit us twice in one day across two different prompt sites. Patching individual prompts is treating the symptom; this commit makes the rule structural. Audit + sanitize: - assistant_chat_service.ASSISTANT_SYSTEM_PROMPT — already cleaned in prior commits, but the [FORK] schema still had literal "Brief reason" / "Short name" / "One sentence" placeholders. Replaced with <angle-bracket> placeholders. Anti-parrot rule itself rewritten to describe the failure mode abstractly instead of naming "jsmith" so the rule no longer trips the guardrail (and so the model doesn't see "jsmith" as a token at all). - ai_chat_service.py — removed three concrete-example offenders: "Get-Service ADSync" command literal, the "DC01 server_name" intake form payload (in two places), and the inline interview demos using "Azure AD Sync failures" / "Exchange Online mailbox migration". Replaced with technology-neutral schema descriptions. - ai_tree_generator_service.BRANCH_DETAIL_SYSTEM_PROMPT — replaced the fully-fleshed DNS troubleshooting tree (with literal Dnscache / ipconfig / google.com / Start-Service) with a placeholder schema showing only ID-linkage shape. - kb_conversion_service.PROCEDURAL_SYSTEM_PROMPT — replaced the worked Server Manager + DC01 example payload with a placeholder schema. Guardrail (tests/test_prompt_anti_parrot.py): - Imports every module under app/services/ and app/core/ and walks every uppercase string constant ending in _PROMPT, _SCHEMA, _PROTOCOL, _FORMAT, or _CONTEXT. - test 1: known-leaked-token list (jsmith, DC01, ADSync, Dnscache, google.com, "Outlook keeps", "Teams drops") must not appear in any prompt constant. Add to the list when a new leak shows up in prod — the list IS the audit trail.
- test 2: marker blocks ([QUESTIONS], [ACTIONS], [SUGGEST_FIX], etc.) must contain placeholders only. Distinguishes JSON keys (followed by ':', allowed) from JSON values (followed by ',' / ']' / '}', must be <placeholder>); allows pipe-separated enum types (text|password|select) and a small set of fixed enum values (question, diagnostic_check, decision, action, ...). Verified by feeding the test a known-bad block — caught it correctly. Documented the rule in CLAUDE.md → AI / FlowPilot lessons, naming the test as the enforcement point so future contributors know how to extend it (add to the known-leaked list when a new leak surfaces). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 02:09:30 -04:00
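The first guardrail check in commit d0ebdef9e8 (scan prompt constants for known-leaked tokens) could be sketched as follows. This is not the contents of tests/test_prompt_anti_parrot.py: `find_leaks` is a hypothetical helper that inspects a single module, and the token tuple is only a subset of the list named in the commit message.

```python
PROMPT_SUFFIXES = ("_PROMPT", "_SCHEMA", "_PROTOCOL", "_FORMAT", "_CONTEXT")
# Subset of the known-leaked tokens listed in the commit message.
KNOWN_LEAKED_TOKENS = ("jsmith", "DC01", "ADSync", "Dnscache", "google.com")


def find_leaks(module):
    """Return (constant_name, token) pairs for leaked tokens in prompt constants."""
    leaks = []
    for name, value in vars(module).items():
        # Only uppercase constants whose names end in a prompt-like suffix.
        if not (name.isupper() and name.endswith(PROMPT_SUFFIXES)):
            continue
        if not isinstance(value, str):
            continue
        for token in KNOWN_LEAKED_TOKENS:
            if token in value:
                leaks.append((name, token))
    return leaks
```

A pytest case built on this would import each module under the scanned packages and assert that `find_leaks(module)` is empty, so adding a token to the tuple extends the audit without touching the test logic.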
Michael Chihlas	50215b9110	fix(pilot): strip literal example content from system prompt — model was parroting The system prompt had a "Complete example of a correct first response" section with a specific Outlook/WiFi/jsmith scenario plus literal JSON payloads in [QUESTIONS], [ACTIONS], [SUGGEST_FIX], and [PROMOTE] markers. The model was emitting those literal strings (the same WiFi/laptop questions, the same "Clear cached credentials" suggested fix, the same "OWA login confirmed for jsmith" promote) on EVERY unrelated chat — making the task lane look like it was leaking previous-session data when in fact the AI was just reciting the prompt examples. Replaced literal example content with `<placeholder>` schemas. Added an explicit ANTI-PARROT RULE in the FINAL REMINDER section calling out that the angle-bracket placeholders show SHAPE, not CONTENT, with concrete examples of the failure mode (printer ticket → don't ask about Outlook; user not named jsmith → don't name jsmith). Same scrub applied to the FORK section's "Outlook AND Teams dropping" and the worked fork-flow example. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 01:36:29 -04:00
Michael Chihlas	fa61376303	feat(pilot): Phase 5 — inline Script Generator integration Wires the SuggestedFix card to an inline panel that handles both cases: template-matched fixes open the Script Library generator with parameters pre-filled from session context; un-matched fixes open the three-option dialog (one_off / draft_template / build_template). The decision endpoint records the path choice with side effects: draft_template persists a draft_templates row via a Sonnet-driven TemplateExtractionService; build_template returns a redirect to the Script Builder; one_off just records the choice. Backend: - TemplateExtractionService: drafts a parameter schema from a concrete rendered script. Conservative by default ("prefer fewer parameters"). Round-trip-validates that templated_body only references declared parameters; missing-key mismatch falls back to the original script with no params. LLM/parse failures fall back identically — the engineer can still create a draft and refine in the post-resolve prompt (Phase 6). - /suggested-fixes/{fix_id}/decision side effects: * one_off → returns rendered_script (engineer's edited version or the fix's ai_drafted_script verbatim) * draft_template → same + creates draft_templates row with extracted params, returns draft_template_id * build_template → returns redirect_path=/scripts/builder?from_session= &fix= so the frontend can navigate to the builder pre-loaded - 400 when a non-template fix has no ai_drafted_script (template-matched fixes take the dedicated /scripts/generate path, not this endpoint). - 12 tests: TemplateExtractionService parse + fallback paths, all four decision branches, edited_script override, missing-script 400. Frontend: - src/components/pilot/script/{TemplateMatchPanel, NoTemplateDialog, ParameterizationPreview}.tsx — inline panels rendered in the task lane's bottom slot when the engineer clicks a SuggestedFix card.
- TemplateMatchPanel: loads template via /scripts/templates/{id}, pre-fills params from fix.ai_drafted_parameters with cyan "from session" tags, generates via existing /scripts/generate (already bumps state_version on ai_session_id from Phase 3). 404 falls back with a clear message instead of erroring. - NoTemplateDialog: shows the AI-drafted script with proposed parameter values highlighted in amber via ParameterizationPreview; three option cards with the middle (draft_template) flagged Recommended; inline edit on the script body before deciding. - SuggestedFix card now clickable: onActivate toggles the inline panel. - AssistantChatPage: scriptPanelOpen state + handleScriptDecision that navigates on build_template and toasts on the other paths. Active fix changes auto-close the panel so engineers don't act on stale state. - Cmd+K → "Open inline Script Generator" palette entry surfaces only on /pilot/:id routes; fires a window event the chat page subscribes to. No Resolve shortcut added per Section 14 decision (browser ⌘R conflict). Verified 2026-04-22 against the dev stack: - one_off / draft_template / build_template all return the right shape with real Sonnet TemplateExtractionService for the draft path. - Conservative extraction confirmed: cmdkey + Restart-Process script yielded zero proposed parameters as intended. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 00:15:29 -04:00
Michael Chihlas	8fd2c1bac6	feat(pilot): Phase 4 — Resolve + Escalate PSA writebacks with status verification Wires the preview popover's Confirm & post action to ConnectWise (and, via the provider pattern, any future PSA). Adds the parallel Escalate flow with the handoff-oriented five-section markdown. Sessions without a linked PSA ticket resolve/escalate locally — markdown stored, status flipped, nothing posted externally. Backend: - EscalationPackageGeneratorService: Sonnet, five sections (Problem / What we've confirmed / What we've tried / Current hypothesis / Suggested next steps). Shares the preview_cache with a separate KIND so Resolve and Escalate previews for the same state coexist. - PSAWritebackService: post_resolution_note (RESOLUTION note type, customer-visible), post_escalation_package (INTERNAL_ANALYSIS, handoff for the next engineer only), transition_ticket_status with mandatory re-fetch verification. PSAStatusVerificationError surfaces loudly when CW silently rejects a status change — the ConnectWise anti-pattern CLAUDE.md flags. - Endpoints: * POST /ai-sessions/{id}/escalation-package/preview * POST /ai-sessions/{id}/resolution-note/post * POST /ai-sessions/{id}/escalation-package/post Outcomes: "resolved" / "escalated" with external_id + verified status, "resolved_local" / "escalated_local" when no PSA linked. - Target CW status IDs live in account_settings.preferences (cw_resolved_status_id, cw_escalated_status_id). When unset, the post proceeds without a status transition — response includes a status_transition_skipped_reason rather than silently erroring. - 7 tests: local-only path, PSA happy path with verified transition, status verification failure → 502, skipped transition when unconfigured, 409 on already-resolved re-post, escalate parallel path, internal-analysis note type enforced.
Frontend: - ResolutionNotePreview now kind-parameterized ('resolve' | 'escalate') with inline edit + Confirm & post. Preview loads from the matching backend endpoint; posting calls the matching endpoint; outcome toast surfaces the verified CW status or the local-only result. - AssistantChatPage: previewKind state replaces previewOpen; two toggle buttons (Preview Resolve note / Escalate instead) in the lane's bottom slot. handleConfirmPost dispatches by kind. Verified 2026-04-22: - Local-only Resolve + Escalate round-trip against the dev stack. - Live Sonnet escalation-package preview; cache hit on repeat call with no state change (separate cache kind from resolution-note). - PSA post + status-verification paths covered by mocked-provider pytest cases. Live CW round-trip pending a test CW instance. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 23:54:54 -04:00
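The "mandatory re-fetch verification" in commit 8fd2c1bac6 amounts to a write-then-read check. A minimal synchronous sketch, assuming a hypothetical provider object with `update_status` and `get_ticket` methods and a `status_id` field (the real PSAWritebackService interface is not shown in this log and is likely async):

```python
class PSAStatusVerificationError(Exception):
    """Raised when the re-fetched ticket does not show the requested status."""


def transition_ticket_status(provider, ticket_id, target_status_id):
    """Request a status change, then re-fetch to confirm it actually applied."""
    provider.update_status(ticket_id, target_status_id)
    # Do not trust the write response: read the ticket back and compare.
    ticket = provider.get_ticket(ticket_id)
    actual = ticket.get("status_id")
    if actual != target_status_id:
        raise PSAStatusVerificationError(
            f"ticket {ticket_id}: expected status {target_status_id}, found {actual}"
        )
    return actual
```

Raising a dedicated exception lets the endpoint map a silently rejected change to an explicit error response (the commit's test list mentions a 502 for this case) instead of reporting success.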
Michael Chihlas	66e592096c	feat(pilot): Phase 3 — Suggested fix tracking + Resolve preview with state_version cache Adds the AI-proposed resolution path and the inline preview of the markdown that will be posted to the customer ticket on Resolve. The preview is keyed on (session_id, ai_sessions.state_version) so back-to-back fetches against unchanged state hit an in-process cache instead of paying for a Sonnet call. Backend: - preview_cache: in-process LRU keyed on (kind, session_id, state_version). No TTL — state_version is the source of truth. Soft-cap 5000 entries. - unified_chat_service: [SUGGEST_FIX] parser (last-block-wins, JSON payload, confidence clamped 0-100), supersession persistence (sets superseded_at on prior active row), atomic state_version bump. - ResolutionNoteGeneratorService: pulls session, facts, active fix, and redacted script_generations into a structured input bundle for Sonnet; produces the four-section markdown (Problem / What we confirmed / Root cause / Resolution). Sensitive script parameters redacted via ScriptTemplateEngine.redact_sensitive driven by the template's parameters_schema. - /api/v1/ai-sessions/{id}/suggested-fixes/active — 200 with the active fix or 404. - /api/v1/ai-sessions/{id}/suggested-fixes/{fix_id}/decision — records one_off / draft_template / build_template / dismissed; dismiss supersedes; bumps state_version. 409 on dismissing an already-superseded fix. - /api/v1/ai-sessions/{id}/resolution-note/preview — generates or returns cached markdown; from_cache flag in payload signals cache hit. - scripts.py POST /generate now bumps state_version on the linked ai_session_id when present (third source of preview-cache invalidation per Section 5.5). - ASSISTANT_SYSTEM_PROMPT documents [SUGGEST_FIX] (when to/not to emit, format, supersession semantics).
- 12 tests covering the parser (well-formed, last-wins, malformed, confidence clamping), supersession + state_version invariant, all decision branches, preview cache hit-on-no-change + miss-after-write. Frontend: - src/components/pilot/sections/SuggestedFix.tsx — amber-accented card with confidence badge; dismiss action wired to the decision endpoint. - src/components/pilot/ResolutionNotePreview.tsx — popover with refresh, loading state, cached/fresh indicator, ticket-ref display. - src/api/sessionSuggestedFixes.ts — typed client; getActive normalizes 404 to null so callers don't have to special-case. - TaskLane gains suggestedFixSlot + bottomSlot props (rendered after Diagnostic Checks; bottomSlot anchors the Resolve action). - AssistantChatPage: refreshSessionDerived helper batches fact + fix refresh; fact mutations and chat sends both schedule a 500ms-debounced preview refresh per the Section 5.5 spec. Verified end-to-end against the dev stack with a real Sonnet call: - /active 404 → fact create → preview generates four-section markdown grounded only in provided facts → second preview call hits cache (from_cache=true, no LLM call) → fact write 2 → cache miss, regenerates. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 21:45:52 -04:00
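The preview cache from commit 66e592096c can be illustrated with a small LRU built on `collections.OrderedDict`. This is a sketch under assumptions, not the project's preview_cache module: the class and method names are hypothetical, and only the (kind, session_id, state_version) key, the absence of a TTL, and the 5000-entry soft cap come from the commit message.

```python
from collections import OrderedDict


class PreviewCache:
    """In-process LRU keyed on (kind, session_id, state_version); no TTL."""

    def __init__(self, max_entries=5000):
        self._max = max_entries
        self._entries = OrderedDict()

    def get(self, kind, session_id, state_version):
        key = (kind, session_id, state_version)
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key]

    def put(self, kind, session_id, state_version, markdown):
        key = (kind, session_id, state_version)
        self._entries[key] = markdown
        self._entries.move_to_end(key)
        while len(self._entries) > self._max:
            self._entries.popitem(last=False)  # evict least recently used
```

Because state_version is part of the key, any write that bumps it makes the next lookup miss without explicit invalidation; stale entries simply age out under the cap.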
Michael Chihlas	625dba7548	feat(pilot): Phase 2 — What we know (facts) with stable task-lane IDs Adds the load-bearing structural feature of the FlowPilot migration: a "What we know" panel that holds confirmed facts for a session, fed by AI [PROMOTE] markers and engineer-added notes. Facts feed the resolution note preview (Phase 3) and survive across turns via stable UUIDs assigned to pending_task_lane items. Backend: - FactSynthesisService: create/update/soft-delete facts with atomic state_version bumps; LLM-backed synthesize_from_question/check on the fact_synthesis (Haiku) action tier per Section 6.6. - /api/v1/ai-sessions/{id}/facts CRUD + /facts/promote (proposed_text or via synthesis). PATCH returns 403 for question/diagnostic_check facts (edit the source item instead, Section 7.3). - unified_chat_service: [PROMOTE] marker parser (JSON-block per Section 8.1 spec drift note), stable-UUID assignment for pending_task_lane questions/actions preserved by exact text/label match across turns. - ASSISTANT_SYSTEM_PROMPT: documents [PROMOTE] format, when to/not to emit, hallucination guardrails, source_ref handling. - 17 tests covering parser, stable IDs, service validation, CRUD, editability rule, both promote modes, 422 null-synthesis path, state_version invariant. Frontend: - src/components/pilot/sections/{WhatWeKnow,WhatWeKnowItem,AddNoteButton} — green-gradient section above Questions, dashed-circle check, inline edit/delete gated by the server's editable flag. - TaskLane gains a whatWeKnowSlot prop (existing assistant/ folder kept per the doc's "rename is opportunistic" guidance). - AssistantChatPage fetches facts on selectChat and refetches after each chat send (so [PROMOTE]-synthesized facts appear immediately); auto-opens the lane when facts exist. Verification: end-to-end smoke against the local docker stack confirms all five endpoints (list/create/patch/delete/promote) plus the 403 editability rule. pytest suite verifies the same with mocked LLM.
Live [PROMOTE] flow remains untested until used in the UI — the marker shape is covered by parser tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 21:13:44 -04:00
Michael Chihlas	3f0a132058	refactor(ai): rename _call_anthropic_cached → chat_call_cached; extract cache plumbing (Phase 0.4) Renames the chat caller to a name that signals its actual purpose, and factors the reusable cached-system-block + cached-history + cache-usage-log primitives out to app.core.ai_provider so they can be shared with the provider-generic path without pulling MCP/beta/images into the abstract interface. Helpers added to ai_provider.py: - `build_anthropic_chat_messages(history, new_message, images, format_reminder)` — owns: copy history, apply cache_control to last history message, append format reminder to new message, render images as multimodal blocks. Anthropic-shaped by design; do not call from Gemini paths. chat_call_cached keeps exactly the concerns that are unique to the one MCP/beta/multimodal chat caller: - Anthropic beta endpoint invocation - Microsoft Learn MCP server wiring (ENABLE_MCP_MICROSOFT_LEARN) - Retry-without-MCP fallback - Format-reminder content string (declared as module constant) - Phase 0.5 telemetry (mcp.turn, mcp.fallback) Documents in the module docstring AND at the function site that this is the ONE MCP/beta chat caller and should not become the general provider path. MCP/beta/images are features of exactly one optional Anthropic beta endpoint; routing them through AnthropicProvider would leak a provider-specific concern into the abstract interface that also serves Gemini. Behavior change: chat_call_cached now reuses the singleton AnthropicProvider HTTP client via `_get_anthropic_client(...)` instead of instantiating a new `anthropic.AsyncAnthropic(...)` per call. Matches the provider's own pattern and avoids burning connections per-turn. No user-visible difference. No runtime verification from code-server. TODO(phase0-verify) in ai_provider.py tracks the cache-hit verification owed on the new dev env. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 17:03:09 +00:00
Michael Chihlas	da93ae55c3	feat(ai): opt-in structured-system-block caching for one-shot generators (Phase 0.3) Wraps each static system prompt in a single-block list so Phase 0.1's AnthropicProvider applies cache_control: ephemeral automatically (policy α, first block gets marked when no caller-authored cache_control is present). Call sites: - ai_tree_generator.scaffold_branches: SCAFFOLD_SYSTEM_PROMPT (~1k tokens) - ai_tree_generator.generate_branch_detail: BRANCH_DETAIL_SYSTEM_PROMPT (~2.5k tokens with few-shot example); retries inside the same function re-read the cached block instead of paying full input cost on each attempt - kb_conversion.convert_document: TROUBLESHOOTING or PROCEDURAL prompt (each caches independently by text content) - ai_fix.generate_fixes: FIX_SYSTEM_PROMPT on first attempt + corrective retry - script_builder.send_message: SYSTEM_PROMPT_TEMPLATE (per-session language substitution — same-language sessions share cache entries) Each edit includes an inline comment explaining why the block is cacheable (stable-constant, retry-reuse, per-language variant) so a future dev can see the intent at the cache_control marker site. script_builder history caching deliberately deferred — per Phase 0.1 decision (option i), AnthropicProvider does not automatically cache the message list. If script_builder's growing 20-message history turns out to be a visible cost driver via the anthropic.cache telemetry, route that caller through the 0.4 chat wrapper which handles history caching. No runtime verification from code-server; cache-hit behavior will be confirmed against the new dev environment when it's up, per the inline TODO(phase0-verify) in ai_provider.py. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 16:29:45 +00:00
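The "policy α" rule in commit da93ae55c3 (mark the first system block as cacheable unless the caller already authored a cache_control) can be sketched as a pure function over Anthropic-style system blocks. The helper name `as_cached_system_blocks` is hypothetical and this is not the AnthropicProvider code; the block shape follows the Anthropic Messages API's text block with an optional `cache_control` of type `ephemeral`.

```python
def as_cached_system_blocks(system):
    """Normalize a system prompt to a block list and apply first-block caching.

    `system` is either a plain string or a list of text-block dicts.
    """
    if isinstance(system, str):
        blocks = [{"type": "text", "text": system}]
    else:
        blocks = [dict(b) for b in system]  # copy so the caller's list is untouched
    # Only mark a block when the caller has not authored any cache_control.
    if blocks and not any("cache_control" in b for b in blocks):
        blocks[0]["cache_control"] = {"type": "ephemeral"}
    return blocks
```

Keeping caller-authored markers untouched means a call site that needs a different cache boundary can opt out of the default simply by placing its own `cache_control`.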
Michael Chihlas	0fbc1e0a57	feat(telemetry): add MCP per-turn structured-log telemetry (Phase 0.5) Emits structured `mcp.turn` log events on every Anthropic-path chat turn, capturing whether MCP was wired in (mcp_available), whether the model actually invoked an MCP tool (mcp_invoked), which tool names fired, and whether the silent retry-without-MCP fallback was triggered. Adds a separate `mcp.fallback` event with error type/message for fallback occurrences. Establishes baseline data for deciding whether MCP investment is earning its keep before Phase 2+ expands the product footprint. Scope: the one MCP-using code path (`_call_anthropic_cached`) — not a general instrumentation layer. No new dependencies, no schema changes, no behavior change. Standard library `logging` is the sink; PostHog is not wired on the backend. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 15:57:13 +00:00
Michael Chihlas	995a0c1d2e	fix(psa): use schedule entries for ticket co-assignees (CW canonical pattern) The previous implementation PATCHed the `resources` string directly, which CW silently ignores because `resources` is a server-derived read-only field (it's populated from schedule entries of type/id=4, not freely writable). Per CW docs (openapi line 70949): "Please use the /schedule/entries?conditions=type/id=4 AND objectId={id} endpoint". Behavior per spec: - No owner + assign user → set owner (existing behavior kept) - Has owner + assign different user → POST /schedule/entries with type/id=4, member, objectId; owner untouched - User already assigned (owner or schedule entry) → idempotent no-op - Remove owner → clear owner (existing behavior kept) - Remove co-assignee → DELETE /schedule/entries/{entry_id} - list_resources now merges owner + schedule-entry members, deduped by id Required CW security role permission on the API member: - Service > Resource Scheduling > Add/Inquire/Delete Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 00:34:18 +00:00
Michael Chihlas	f6a24ea4e1	fix(psa): resource assignment targets CW `owner`, status PATCH verifies apply Previous `resources`-string PATCH was silently ignored by CW — the `resources` field is server-derived from the ticket's owner + schedule entries, not freely writable. Status PATCH could also silently no-op when a cross-board status id was sent. - add_resource: when the ticket is unassigned, set the `owner` MemberReference (the canonical writable primary-assignee field). If already owned by someone else, append the identifier to the `resources` co-assignee string best-effort. - remove_resource: clear `owner` (with remove→replace:null fallback) if the target is the current owner, otherwise strip from `resources`. - list_resources: merge owner + resources string, deduped by member id, so the UI reflects both single-owner and multi-resource assignments. - update_ticket_status: verify CW applied the status by comparing the response body's status.id — raises PSAError with a clear message when CW silently rejects the change (e.g., status invalid for ticket's board), instead of reporting spurious success. - Frontend: surface the backend error detail in the toast so users see the real reason instead of a generic "Failed to update" message. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-16 21:39:21 +00:00
Michael Chihlas	04ff2ea301	fix(tickets): refresh status and resources in detail panel after update Status update was returning only new_status (string) and the parent list's onStatusUpdated only set status_name. The <select> was bound to status_id, which never changed — so it visually reverted to the old status even though the PATCH succeeded. - Backend: include new_status_id in the status-update response. - Panel: own currentStatusId/currentStatusName state so the select reflects the change immediately and survives stale parent snapshots. - Parent list: update status_id on both the row and selectedTicket so the list row stays in sync when the panel stays open. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-16 21:28:48 +00:00
Michael Chihlas	60851b400a	fix(tickets): status filter dropdown and CW resource assignment - Status filter: aggregate statuses across all boards (deduped by name) when no board is selected. Backend accepts status_name and filters by status/name so the same status matches across boards. - Resource assignment: CW has no /service/tickets/{id}/members endpoint — assignees live in the ticket's comma-separated `resources` string field. Rewrote list/add/remove to read/PATCH that field via member identifier. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-16 21:03:00 +00:00
Michael Chihlas	294b309faa	fix: pre-landing review fixes — company_id filter and CW condition injection - Apply company_id filter in CW search_tickets conditions (was silently ignored) - Sanitize query string to strip single quotes before CW condition interpolation - Add psaError state to TicketsPage for permissions error surfacing Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-16 14:42:05 +00:00
Michael Chihlas	7fa81f69a6	feat(psa): add spin-off ticket system prompt rule, backend routing tests Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-16 03:01:21 +00:00
Michael Chihlas	a5e9615666	feat(psa): add ticket_service.py with list/add/remove resource, update_status, create_ticket Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-16 02:52:32 +00:00
Michael Chihlas	e714088a2b	feat(psa): implement list/add/remove resources, create_ticket, paginated search in CW provider Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-16 02:49:20 +00:00
Michael Chihlas	ff0ec143e2	feat(psa): add PSAResource, TicketCreatePayload types and abstract provider methods Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-16 02:45:24 +00:00
Michael Chihlas	8d964e64e4	fix(psa): update autotask/halopsa stub search_tickets return type annotation Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-16 02:44:08 +00:00
Michael Chihlas	44634b1145	feat(psa): add PaginatedTicketResult type, update provider search_tickets signature Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-16 02:41:48 +00:00
Michael Chihlas	0d9babb986	fix(rls): add account_id to AISessionStep creations, fix boards toast - flowpilot_engine: pass account_id at all 5 AISessionStep instantiation sites (_create_step_from_parsed x3, briefing step, status update step). Phase 4 RLS blocked every INSERT with NULL account_id — this broke all new FlowPilot sessions since the Phase 4 migration was applied. - integrations: list_boards returns [] on PSAError instead of 502, stopping the spurious 'Server error' toast on dashboard load (boards are optional). - client.ts: 5xx global toast now shows backend detail when available. - useFlowPilotSession: startSession extracts backend detail for error state; suppresses duplicate toast for 5xx (global interceptor already handles it). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 04:41:14 +00:00
Michael Chihlas	567985402f	fix(psa): use board/id in (...) for multi-board filter per CW docs Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 03:54:05 +00:00
Michael Chihlas	08a4c6600d	fix(psa): use resources contains identifier for my tickets filter CW resources field is a plain string of member identifiers (login names), not a navigable object. resources/member/id was invalid syntax causing 403. Now resolves the CW member identifier from the cached member list and uses: resources contains '{identifier}' which is the correct condition. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 03:53:26 +00:00
Michael Chihlas	29fa48e71b	fix(psa): revert to resources/member/id for my tickets filter Requires CW API member security role to have All scope on Service Tickets. owner/id was incorrect for workflows using resources-based assignment. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 03:48:10 +00:00
Michael Chihlas	908a867986	fix(psa): use owner/id instead of resources/member/id for my tickets filter resources/member/id requires All scope on Service Tickets security role. owner/id (primary assignee) works with standard Mine scope. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 03:43:34 +00:00
Michael Chihlas	346576a730	feat(psa): ticket queue dashboard with board selector and session auto-start - Add PSABoard type + list_boards() to CW provider (cached 1h) - Extend search_tickets with assigned_to_me, unassigned, board_ids, page, page_size - New GET /integrations/psa/boards endpoint - New TicketQueue dashboard component: My Tickets / Unassigned tabs, multi-select board filter, Load more pagination, Start Session per ticket - Add TicketQueue to QuickStartPage after active sessions - FlowPilotSessionPage auto-starts with ticket context when navigated from TicketQueue (psaTicketId + psaTicket in location.state) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 03:20:45 +00:00
chihlasm	8eb814283d	fix(psa): fix time entry AttributeError and show all users in member mapping - Fix create_time_entry() using self._client instead of self.client - GET /member-mappings now returns all active account users, not just mapped ones — allows manual assignment when auto-match by email doesn't work - PsaMemberMappingResponse mapping fields are now Optional (id, external_member_id, external_member_name, matched_by) to represent unmapped users - Frontend MemberMappingTab skips null external_member_id when building localMappings, and derives user list from all returned entries - Add docs/connectwise-psa-testing-checklist.md Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-14 06:09:01 +00:00
chihlasm	abd79bc763	feat: extract network map builder from PR 124 (#137) * feat: add device_types table with system seed data Creates DeviceType SQLAlchemy model and migration 073 that provisions the device_types table with 28 system-seeded device types across 7 categories (network, compute, storage, cloud, endpoint, infrastructure, security). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add network_diagrams table Create NetworkDiagram SQLAlchemy model with JSONB nodes/edges, team-scoped with client/asset metadata, and Alembic migration 074. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add Pydantic schemas for device types and network diagrams Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add device types CRUD router Adds GET/POST/PUT/DELETE endpoints at /device-types with team-scoped access. System types are read-only; custom types are scoped to the creating team. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add AI generation service for network diagrams Adds network_diagram_ai_service.py with generate_diagram() function that calls the AI provider to convert plain-English network descriptions into structured DiagramNode/DiagramEdge data. Registers the action in ACTION_MODEL_MAP as a standard-tier route. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add network diagrams CRUD + AI generate + export/import router Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add TypeScript types for network diagrams Adds all interfaces for network diagrams and device types including DiagramNode, DiagramEdge, DeviceProperties, NetworkDiagramResponse, AI generate request/response, import/export shapes, and list item types.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat: add frontend API clients for device types and network diagrams Adds deviceTypesApi (list, create, update, remove) and networkDiagramsApi (list, get, create, update, archive, duplicate, exportJson, importJson, aiGenerate, listClients) following the existing apiClient module pattern. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat: add device registry, DeviceNode, ConnectionEdge for React Flow Creates the React Flow building blocks for the network diagram editor: device type registry with icon/color mappings, DeviceNode component with status indicators and connection handles, ConnectionEdge with per-type styling, and nodeTypes/edgeTypes registration maps. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add DeviceToolbar panel with search, categories, drag-drop, custom type creation Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add PropertiesPanel for node and edge property editing Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add AIAssistPanel with replace and merge modes Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add NetworkCanvas wrapper and DiagramHeader components Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add DiagramEditor page assembling all panels with auto-save and AI generation Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add Network Diagrams list page with search, client filter, import Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add Network Maps to sidebar navigation and router Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve TypeScript errors in DeviceToolbar and DiagramEditor Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve stale selection bug in network diagram PropertiesPanel 
Selection state now stores IDs and derives objects from live arrays, so edits in PropertiesPanel inputs reflect immediately. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add React Flow UI foundation components for network diagrams BaseNode (structured node shell with header/content/footer slots), BaseHandle (styled connection handle), LabeledHandle (handle with port label), NodeStatusIndicator (status border effect), NodeTooltip (hover details via NodeToolbar). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add LabeledGroupNode and AnimatedSvgEdge components GroupNode for subnet/VLAN/site grouping with positioned label badge. AnimatedSvgEdge for traffic flow visualization with animated SVG shape along edge path. Both registered in type maps. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: DeviceNode uses BaseNode, BaseHandle, StatusIndicator, Tooltip Replaces hand-rolled node layout with composable React Flow UI components. Status is now a border effect instead of a dot. Hover tooltip shows hostname, IP, vendor, role, notes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add grouping toolbar items and traffic flow toggle DeviceToolbar gets Subnet/VLAN/Site/DMZ grouping section with drag-drop. PropertiesPanel gets Show Traffic toggle that switches edges between connection and animated types. DiagramEditor handles both device and group node drops. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address code review findings for React Flow UI integration - Use screenToFlowPosition() for drop coordinates (fixes zoom/pan bug) - Remove duplicate selection border from DeviceNode (BaseNode handles it) - Add w-full to GroupNode for proper container sizing - Remove unused 'selected' destructuring from DeviceNode Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add ISP icon to network diagram device registry Globe icon with accent color, under cloud category. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: improve drag-and-drop feel in network diagram editor Grip icons on draggable toolbar items, press effect on drag start, dashed border overlay with 'Drop to add' text when dragging over canvas. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add ContextMenu component for network diagram editor Charcoal-styled context menu with action factories for node and canvas variants. Viewport-clamped positioning, auto-dismiss on click outside, escape, or scroll. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add useCanvasShortcuts hook for copy/paste/duplicate Keyboard shortcuts with preventDefault and input guard. Clipboard stores nodes with relative positions and edge indices. Paste computes canvas center via screenToFlowPosition. Duplicate offsets +30px. Supports both device and group nodes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: wire context menu and keyboard shortcuts into diagram editor Right-click context menus for nodes (copy/duplicate/delete) and canvas (paste/select-all/fit-view). Right-click selects the node per spec. serializeNodes now handles group nodes correctly. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: context menu dismisses on pane click, ISP in toolbar Context menu now closes when clicking anywhere on the canvas via onPaneClick prop. ISP device added as built-in toolbar item under Internet section so it's always available without a database entry. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: backend code review fixes for network diagrams - Replace legacy Optional imports with modern str | None syntax - Type JSONB columns as Mapped[list[dict[str, Any]]] - Escape SQL LIKE wildcards (%, _) in diagram search - Type DiagramNode.position as Position(x, y) Pydantic model - Wrap AI response parsing in KeyError handler for clean 422 errors - Remove unused Optional/TYPE_CHECKING imports from schemas/models - Extract _get_available_slugs helper to DRY duplicate queries Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: network diagram editor UX — straight edges, snap-to-grid, ISP in Cloud, group resize - Straight edges: replace SmoothStepEdge with BaseEdge + getStraightPath so connections draw direct diagonal lines instead of orthogonal bent paths - Snap-to-grid: add snapToGrid/snapGrid=[20,20] to NetworkCanvas so nodes align consistently when dragged - ISP in Cloud: remove standalone "Internet" sidebar section, inject ISP into the Cloud category loop with search support and correct item count - Group node resize: add NodeResizer to GroupNode (subnet/VLAN/site/DMZ), handles visible when selected; dimensions saved/restored correctly on reload (also fixes group node load bug where type was always 'device') - DiagramNode type: add nodeType and style optional fields Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: network diagram team_id guard + multi-style edge routing Backend: - Guard create_diagram with 422 if current_user.team_id is None (prevents NOT NULL constraint crash for accounts not yet assigned to a team) - Add
routing field to DiagramEdge schema (straight/curved/step) Frontend: - ConnectionEdge now supports straight (default), curved (bezier), and step (smooth-step) routing per-edge via routing field in edge data - PropertiesPanel Connection section gets a Line Style toggle: Straight | Curved | Step buttons, active state highlights in accent - handleEdgeUpdate and serializeEdges now propagate the routing field - DiagramEdge type gets optional routing field Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat: network diagrams UX overhaul — icons, empty canvas, properties panel - Colorize: semantic category colors for all device types (network=blue, security=orange, compute=emerald, endpoint=amber, storage=violet, cloud=cyan, infra=steel); better icons (Router, ShieldAlert, Boxes, Package, Gauge, PlugZap, Video, Radio); MiniMap uses category colors - Onboard: centered AI generate prompt on empty canvas with 5 MSP-specific example chips, ⌘↵ shortcut, spinner; AIAssistPanel only shown with nodes - Arrange: properties panel — status badge grid at top, fields grouped into Network (IP/Subnet/VLAN) and Hardware (Hostname/Vendor/Model/Role) sections - Delight: segmented topology color bar on listing cards; backend returns category_counts via single extra query on list endpoint - Harden: real PNG export via html-to-image + getNodesBounds/getViewportForBounds - Polish: ChevronDown replaces unicode ▾, click-outside for client filter, consistent spinner in empty prompt Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * chore: drop changelog noise from network extraction * fix: align network map builder with account isolation * feat: add manual create option for network maps * feat: make manual network map creation easier to discover * fix(network-maps): address design critique — harden, normalize, clarify, polish - Archive: two-step inline confirm in card dropdown menu - Delete Device/Edge: two-step inline confirm in PropertiesPanel footer - Context menu
Delete: floating confirm bar instead of immediate deletion - AI Generate New: two-step confirm when replacing existing diagram nodes - DiagramHeader: show 'Unsaved changes' in amber when isDirty and not saving - deviceRegistry: SECURITY_COLOR #f97316 → #f87171 (deprecated ember orange removed) - CanvasEmptyPrompt: remove backdrop-blur (design system violation) - CanvasEmptyPrompt: remove redundant 'Skip AI' bottom button (duplicate of Build manually card) - CanvasEmptyPrompt: rounded-xl/rounded-2xl → rounded-lg, border-2 → border - Topology bar: h-1 → h-2 + native tooltip with category breakdown - AIAssistPanel: replace pulse-dot loading with spinner (consistent with rest of feature) - ContextMenu: add shadow-lg (consistent with other dropdowns) - DeviceNode tooltip: Position.Bottom → Position.Top (avoids canvas-edge clipping) - CanvasEmptyPrompt: raise ⌘↵ hint from /50 opacity to full text-muted-foreground Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(network-maps): bring to front / send to back layering for nodes Three entry points for z-index control: - Right-click context menu: Bring to Front / Send to Back with ] / [ shortcuts, separated by dividers from copy/delete groups - Properties panel: Layer row with Bring Front + Send Back buttons, tooltip shows keyboard shortcut - Keyboard: ] brings selected node(s) to front, [ sends to back (skips when input focused) Context menu also gains divider support (dividerBefore flag) for visual grouping. Layering handlers use max/min zIndex across all nodes so repeated presses always stack correctly. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: swap switch icon from Layers → Network (Lucide) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat: icon size picker (S/M/L) on device nodes Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat: drag-to-resize device nodes + BrickWallFire for firewall - NodeResizer on DeviceNode (same pattern as group nodes); icon scales proportionally with node width, clamped 16–60px - Removes S/M/L static picker — resize is now direct manipulation - firewall: ShieldAlert → BrickWallFire Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * chore: trigger Railway rebuild Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: add missing hero_001.jpg to git (was untracked, broke Railway deploy) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: ShieldAlert still referenced in CATEGORY_DEFAULTS after icon swap Removed ShieldAlert from imports when swapping firewall icon to BrickWallFire but left it in CATEGORY_DEFAULTS — runtime crash, device toolbar empty. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(network): proportional node resize with locked aspect ratio Nodes grew into rectangles because NodeResizer had no aspect ratio constraint, minWidth != minHeight, and icon/text only scaled from width. 
- DeviceNode: add keepAspectRatio + equal minWidth/minHeight (80×80), maxWidth/maxHeight (280×280), scale icon and label/IP font sizes from Math.min(width, height) so all content grows uniformly - DiagramEditor: set explicit 120×120 style on dropped device nodes so React Flow has a definite starting size for aspect ratio calculation - DiagramEditor: persist device node style (width/height) in serializeNodes and restore it on load so size survives save/reload Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(lint): suppress ESLint errors in network diagram components Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 02:38:01 -04:00
chihlasm	a48660700a	fix: background jobs and lifespan must use BYPASSRLS sessions All code that runs outside a request context (APScheduler jobs, lifespan startup) has no app.current_account_id set, so the app-role session returns 0 rows from every RLS-protected table. Changed to _admin_session_factory (BYPASSRLS) in: - knowledge_flywheel_scheduler.py — queries ai_sessions - psa_retry_scheduler.py — queries psa_post_log - retention_cleanup.py — queries assistant_chats - scheduler.py (_fire_maintenance_schedule, _cleanup_expired_ai_conversations) - main.py (archive_stale_ai_sessions, _process_notification_retries, load_all_schedules at startup) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 03:44:23 +00:00
chihlasm	64f004a62c	feat: tenant isolation Phase 4 — RLS on 31 remaining tables + script_builder fix Enable RLS on all remaining tenant-scoped tables (31 tables): Standard policy (tenant sees own rows): users, account_invites, account_limit_overrides, account_feature_overrides, subscriptions, ai_chat_sessions, ai_conversations, ai_session_steps, ai_session_embeddings, ai_suggestions, ai_usage, assistant_chats, attachments, copilot_conversations, feedback, file_uploads, fork_points, kb_imports, notifications, notification_configs, notification_logs, psa_activity_logs, psa_member_mappings, script_builder_sessions, script_categories, session_ratings, tree_embeddings, user_folders, user_pinned_trees Platform-visibility policy (own rows OR PLATFORM_ACCOUNT_ID): platform_steps, template_trees Intentionally skipped: accounts (IS the root table, no account_id column) plan_feature_defaults (platform config, no account_id column) Also fixes script_builder_service.create_session() which was missing account_id= on ScriptBuilderSession construction, causing 500s on all script builder endpoints (pre-existing CI failure). Adds Phase 4 RLS isolation tests covering: users, script_builder_sessions, ai_session_steps, notifications, platform_steps, template_trees. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 01:25:28 +00:00
chihlasm	758cd61621	fix: propagate account_id through all write paths missing NOT NULL coverage Service layer (production code): - branch_manager: set account_id on SessionBranch (root + fork) and ForkPoint from session.account_id; load session in create_fork for this purpose - handoff_manager: set account_id on SessionHandoff from session.account_id - ai_suggestions endpoint: set account_id on AISuggestion from current_user - steps endpoint (/feedback): set account_id on StepRating from current_user - ratings endpoint: set account_id on StepRating from current_user Test infrastructure: - conftest.py: seed PLATFORM_ACCOUNT_ID (00000000-...-0001) account after Base.metadata.create_all so global categories and gallery items have a valid FK - test_rls_isolation: add _ensure_rls_schema fixture that runs 'alembic upgrade head' before module tests — previous function-scoped test_db fixtures drop the schema, leaving the RLS tests with no tables - test_branding: create Account before User in helper functions - test_admin_gallery: set account_id=PLATFORM_ACCOUNT_ID on Tree/ScriptTemplate - test_public_templates: set account_id=PLATFORM_ACCOUNT_ID on Tree, ScriptTemplate, TreeCategory - test_resolution_outputs: set account_id=session.account_id on SessionResolutionOutput - test_analytics_phase5: set account_id on PsaPostLog - test_draft_trees: replace account_id=None with PLATFORM_ACCOUNT_ID in migration default test (NOT NULL now enforced) - test_maintenance_schedules: set account_id on other_tree - test_save_session_as_tree: set account_id on all 5 Session() constructors Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-11 04:24:36 +00:00
chihlasm	b641ac6c55	fix: set account_id on session_supporting_data, session_resolution_outputs, maintenance_schedules, psa_post_log constructors Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-10 06:44:17 +00:00
chihlasm	29a9573d6e	fix: CRITICAL — scope copilot tree query to current account (#131) * docs: add tenant data isolation design spec Complete architecture plan for multi-tenant data isolation across all layers (PostgreSQL RLS, application-layer filtering, schema migration, testing strategy, and phased rollout checklist). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: add background job isolation policy to tenant isolation spec Documents policy for all 5 existing background jobs: - Knowledge Flywheel and PSA Retry flagged for account_id threading - Chat Retention already follows correct pattern (model for others) - Maintenance Schedule Firing needs account_id in queries + Session creation - AI Conversation Expiry approved as cross-tenant with justification Adds approved cross-tenant query registry and Phase 2 checklist items. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: add tenant isolation Phase 0 implementation plan 8 tasks covering: CRITICAL copilot hotfix, tenant_filter() helper, get_tenant_context dependency, analytics/category/AI session gap fixes, full UUID endpoint audit, TargetList dead code audit, teams orphan check, and CI grep check for missing tenant filters. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: CRITICAL — scope copilot tree query to current account A user who knew another account's tree UUID could start a copilot conversation, causing the tree's full node structure, names, and descriptions to be sent to the AI as part of the system prompt. Fix: add account_id (or is_default / visibility='public') filter to the tree SELECT in copilot_service.start_conversation(). Returns 404 for inaccessible trees. Test added in test_tenant_isolation_p0.py. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-09 00:41:30 -04:00
chihlasm	290f2be2fd	fix: resolve "sorry something went wrong" errors and show images in chat Three fixes from beta tester session feedback: 1. MCP error handling (backend/app/services/assistant_chat_service.py) - The MCP Microsoft Learn integration was catching only BadRequestError. Any other error type (APIStatusError, APIConnectionError, timeout) from the external MCP server propagated as a 502, causing the generic error. - Now catches all Exception types when MCP is active and retries without MCP using the stable client.messages.create endpoint. 2. Frontend error UX (frontend/src/pages/AssistantChatPage.tsx) - catch {} was silently swallowing all errors and inserting a generic assistant message. Now: differentiates 429 (rate limit) vs 502/503 (AI unavailable), removes the optimistic user message on failure, restores the failed message to the input so users can retry without retyping, and logs errors to console for debugging. 3. Image attachments visible in chat (frontend/src/components/assistant/ChatMessage.tsx) - Uploaded images were sent to the AI correctly but never shown in the chat thread. Now captures preview URLs before clearing pendingUploads and renders thumbnails above the user bubble, clickable to full size. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-07 13:09:16 +00:00
chihlasm	e8e12cc7e5	fix: move session lifecycle actions to header bar in AssistantChatPage - Add persistent session header with title, status badge, Resolve, Escalate, and Update Ticket/Share Update buttons — mirrors FlowPilotSessionPage pattern exactly - Update Ticket label when psa_ticket_id present, Share Update otherwise - Full mobile support via ⋯ overflow menu (Resolve, Escalate, Update, Pause) - Strip _(not yet completed)_ markers from stored conversation_messages in unified_chat_service to prevent stale task lane items from prior turns leaking into new sessions via the AI's re-include instruction - Add currentChatRef guard to handleResumeNew (was missing unlike handleSend) - Remove Update/Conclude from chatbar — toolbar is now input utilities only Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-07 06:31:24 +00:00
chihlasm	8bd395a0c7	fix: resolve task lane stale state, partial submit, and closure bugs - Import and call clearTaskState before updating questions/actions in handleSend and handleTaskSubmit so new AI tasks always replace stale sessionStorage cache instead of being overridden by it - Include pending (not yet completed) tasks in the AI message on partial submit so the AI knows which tasks were left unanswered - Fix stale closure in TaskLane saveTaskLane useEffect — use refs for questions/actions so the debounced backend save always uses current values - Add responses field to pending_task_lane TypeScript type, removing the unsafe double-cast in selectChat - Instruct the AI to re-surface incomplete tasks unless ≥75% confident the information is no longer needed Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-06 16:53:48 +00:00
chihlasm	f4143e52a1	feat: overhaul session documentation, PSA notes, and client communications - Reformat PSA resolution/escalation notes: clean single-line header, steps with engineer responses inline, remove duplicate timing blocks, remove AI confidence section, add follow-up recommendations - Standardize time display to decimal hours (e.g. 0.25 hrs) across all note formatters and status update context - Add follow_up_recommendations to SessionDocumentation schema and surface in SessionDocView; extracted from resolution suggestion steps - Add _build_what_we_know() helper: uses session.evidence_items when cockpit branch merges, falls back to deriving findings from steps - Fix option label lookup in generate_status_update (was passing raw machine values to AI instead of human-readable labels) - Add 'What We Know' section to status update ticket notes prompt - Improve _build_session_context in resolution_output_generator to include intake text and full step details instead of truncated chat - Add request_info audience type: client-facing information request that skips the length step and generates a numbered question list - Improve client_update and email_draft prompts with per-context guidance (status/resolution/escalation) and fix escalation subject line from 'Specialist Review' to 'Specialist Assistance' Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-05 15:18:31 +00:00
chihlasm	cb33787c08	fix: close race conditions in script builder session and slug creation - script_builder endpoint: pg_advisory_xact_lock on user_id before session count check, preventing concurrent creates from both passing the MAX_SESSIONS_PER_USER guard - script_builder_service send_message: pg_advisory_xact_lock on session_id before message count check, preventing concurrent sends from both passing the MAX_MESSAGES_PER_SESSION guard - script_builder_service save_to_library: replace check-then-insert slug logic with IntegrityError retry loop (3 attempts with fresh UUID suffix); add unique constraint on script_templates.slug (migration 070) - ScriptBuilderPage: add creatingSessionRef to serialize concurrent handleSend calls that would otherwise both call createSession() while session is still null Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-01 05:09:42 +00:00
chihlasm	d6d1002172	fix: add status_update to step_type CHECK constraint The generate_status_update service inserted AISessionStep with step_type='status_update' which violated the DB CHECK constraint, causing a 500 error. Also fix incorrect field name confidence_score (should be confidence_at_step) and remove nonexistent confidence_tier. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-29 06:35:11 +00:00
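
Commit cb33787c08 above replaces check-then-insert slug logic with an IntegrityError retry loop backed by a unique constraint. The following is a minimal sketch of that general pattern, not the repository's code: it uses the standard-library sqlite3 module as a stand-in for the project's database layer, and the names make_demo_db and save_with_unique_slug are hypothetical. Only the 3-attempt limit, the fresh UUID suffix, and the unique constraint on script_templates.slug come from the commit message.

```python
import sqlite3
import uuid


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory table with a unique slug column (demo stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE script_templates (slug TEXT NOT NULL UNIQUE)")
    return conn


def save_with_unique_slug(conn: sqlite3.Connection, base_slug: str,
                          max_attempts: int = 3) -> str:
    """Insert a row, retrying with a fresh UUID suffix on a slug collision.

    The unique constraint is the arbiter: no separate SELECT runs first,
    so two concurrent writers cannot both pass an existence check.
    """
    slug = base_slug
    for _ in range(max_attempts):
        try:
            # "with conn" commits on success and rolls back on an exception.
            with conn:
                conn.execute(
                    "INSERT INTO script_templates (slug) VALUES (?)", (slug,)
                )
            return slug
        except sqlite3.IntegrityError:
            # Collision: try again with a new random suffix.
            slug = f"{base_slug}-{uuid.uuid4().hex[:8]}"
    raise RuntimeError(f"no unique slug for {base_slug!r} after {max_attempts} attempts")


if __name__ == "__main__":
    db = make_demo_db()
    print(save_with_unique_slug(db, "reset-password"))  # base slug is free
    print(save_with_unique_slug(db, "reset-password"))  # base slug plus random suffix
```

The design point is that the database rejects the duplicate atomically, so the application only has to react to the failure instead of trying to predict it.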

113 Commits