resolutionflow

Author	SHA1	Message	Date
Michael Chihlas	694279f89e	feat(sales): add POST /sales-leads public endpoint Phase 2 Task 29 — public Talk-to-Sales submission endpoint. - New POST /api/v1/sales-leads (public, no auth, rate-limited 5/hour per IP). - Inserts a sales_leads row, fires best-effort notification email and PostHog server-side capture; failures are logged but never fail the request. - New EmailService.send_sales_lead_notification static method. - New SALES_LEAD_RECIPIENT_EMAIL setting (defaults to sales@resolutionflow.com). - Schemas: SalesLeadCreate / SalesLeadCreateResponse with literal source enum. - Tests: happy path (row + email), email-failure resilience, and rate-limit enforcement (re-enables the slowapi limiter for the rate-limit assertion since DEBUG=true disables it by default in tests). PostHog server-side instrumentation point is wired in but no-ops gracefully until app.core.analytics.posthog exists — turning it on is a one-line change when the backend SDK is configured. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-06 20:12:03 -04:00
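The "best-effort" rule in the commit above (side effects are logged on failure but never fail the request) can be sketched as a small wrapper. This is a minimal illustration, not the endpoint's actual code: `run_best_effort` and its `label` argument are hypothetical names, and the real handler is presumably async.

```python
import logging

logger = logging.getLogger("sales_leads")


def run_best_effort(label, fn, *args, **kwargs):
    """Run a side effect; log and swallow any failure (hypothetical helper).

    Returns True when the side effect succeeded and False when it raised,
    so the caller can keep going and still return a success response.
    """
    try:
        fn(*args, **kwargs)
        return True
    except Exception:
        # The lead row is already stored, so the failure is only recorded.
        logger.exception("best-effort side effect failed: %s", label)
        return False
```

Under these assumptions, the notification email and the analytics capture would each be wrapped separately, so one failing does not prevent the other from running.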
Michael Chihlas	f4606f073a	feat(auth): add Google OAuth callback with oauth_identities linking Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-06 19:14:30 -04:00
Michael Chihlas	f683bb5720	feat(billing): add /billing/checkout-session via BillingService Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-06 19:14:30 -04:00
Michael Chihlas	0d1b305619	fix(escalations): live-test fixes from QA bash Bundles four fixes from the live debugging session: 1. AssistantChatPage: replace urlSessionId === activeChatId gate with a loadedChatIdsRef. After `8914391` made activeChatId initialize from urlSessionId, the gate short-circuited fresh mounts and selectChat never fired. Symptom: senior picks up an escalation, lands on a blank chat surface with no conversation history and no sidebar entry. Fix also adds loadChats() in handleStartHere so the picked-up session appears in the sidebar (its escalated_to_id is null pre-claim, so listSessions doesn't return it until claim_session sets it). 2. config: bump ESCALATION_AI_ASSESSMENT_TIMEOUT_SECONDS 15s → 45s. Sonnet was hitting tail latency at 15s in the field, leaving the magic-moment placeholder permanent. Background-task architecture (`e8ba74e`) means this no longer blocks the user; it's just the budget before publishing has_assessment=false. NOTE: live test still shows assessment not populating — see HANDOFF for the consolidation plan that supersedes this. 3. Enter-to-submit: chat-input convention (Enter submits, Shift+Enter inserts newline) on the escalate-flow forms. RichTextInput gains an optional onSubmit prop; EscalateModal wires it to handleSubmit; ConcludeSessionModal gets the same handler on its plain textarea. 4. PendingEscalations: each row is now expandable. Click row body to reveal the engineer's escalation reason, step count on record, confidence tier, and PSA ticket number. Pick Up still clicks through directly. Single-expand-at-a-time keeps the dashboard compact. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-29 00:18:40 -04:00
Michael Chihlas	aca915b047	fix(escalations): bump assessment timeout, surface picked-up sessions in sidebar Two field-reported issues from live wedge testing. ESCALATION_AI_ASSESSMENT_TIMEOUT_SECONDS bumped 5s → 15s. The 5s bound fired too aggressively against the Sonnet diagnostic assessment prompt; ~4-8s is typical but tail latency hits 12-14s. The fallback "Assessment unavailable — model didn't respond in time" placeholder was showing on the magic-moment screen for two consecutive escalations, which kills the demo. 15s keeps the click-path bounded but lets the typical case return real content. Real fix is async generation (kick off, persist when done, surface "still computing" with refresh) — captured as a follow-up; bumping the bound is the right call for the wedge demo. list_sessions now matches escalated_to_id == current_user.id alongside the existing user_id and escalation_package.picked_up_by clauses. The unified HandoffManager.claim_session sets escalated_to_id but doesn't write the legacy picked_up_by JSONB key, so picked-up sessions never showed in the senior's chat list — the senior would land on the session detail (active chat) but the sidebar showed only their other unrelated sessions. User reported this as "4 different versions of the session in the chat history section" — they were actually 4 unrelated empty sessions the senior owned, plus the picked-up session was just invisible. Backend tests still 94/94. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-28 00:04:08 -04:00
Michael Chihlas	9bdd9959a8	fix(handoff): bound escalation assessment latency Co-Authored-By: Codex <noreply@openai.com>	2026-04-27 20:03:14 -04:00
Michael Chihlas	bc15952857	fix(tests): stabilize escalation SSE backend tests Co-Authored-By: Codex <noreply@openai.com>	2026-04-27 19:47:43 -04:00
Michael Chihlas	87bd0b7c56	WIP: SSE pub/sub for live escalation arrivals (paused for Codex review) First half of the WebSocket/SSE push slice. Paused mid-flight to hand the branch to Codex for outside-voice review before stacking more commits on top. See .ai/HANDOFF.md for the full pause context + what to look at. What's here: - backend/app/core/escalation_bus.py — module-level singleton in-memory pub/sub keyed by account_id. asyncio.Queue per subscriber with 64-event maxsize and drop-on-full semantics. Designed to be swappable for Redis pub/sub when Railway scales past single-replica. - backend/app/api/endpoints/session_handoffs.py — GET /api/v1/ai-sessions/escalations/stream SSE endpoint. Auth via require_engineer_or_admin. 25s heartbeat. Account-scoped subscribe bound to current_user.account_id. - backend/app/services/handoff_manager.py — dispatch_escalation_notifications now publishes a `handoff_created` event to the bus BEFORE the email fan-out, in a try/except so a bus failure can't block email delivery. - backend/tests/test_escalation_bus.py — 7 unit tests, all green standalone (0.14s). Cross-tenant isolation, drop-on-full, no-subscribers. - backend/tests/test_handoff_manager.py — +1 dispatcher integration test (publishes to bus, payload shape). - backend/tests/test_session_handoffs_api.py — +2 endpoint tests (viewer blocked, ready event handshake). 
[gstack-context] Decisions: - SSE over WebSocket (one-way, browser EventSource semantics, fewer moving parts behind Railway proxy) - In-memory bus over Redis for v1 pilot (3 MSPs, single replica) - Drop-on-full subscriber queue rather than back-pressure publishers - Bus publish ahead of email send, both wrapped in try/except so neither can break handoff creation - Frontend will be a fetch-based ReadableStream reader matching the existing streamDocumentation pattern, not native EventSource (custom-header auth) Remaining (post-Codex): - Frontend SSE subscription in EscalationQueue.tsx (slide-in, reconnect, tab-title flash, prefers-reduced-motion) - Magic-moment handoff-context screen - Re-run the full backend test suite to verify the SSE + dispatcher integration tests (bus units already green standalone) Tried: - Running the full test suite repeatedly without xdist; the per-test DROP SCHEMA + recreate fixture made wall-clock prohibitive when multiple stale runs collided on the same Postgres test schema. Resolution: -n auto next time. [/gstack-context] Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 19:29:07 -04:00
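The bus described in the commit above (in-memory, keyed by account_id, one bounded asyncio.Queue per subscriber, drop-on-full) might look roughly like the sketch below. The class and method names are assumptions for illustration; only the 64-event bound and the drop-on-full behaviour come from the commit message.

```python
import asyncio
from collections import defaultdict


class EscalationBus:
    """In-memory pub/sub keyed by account_id (illustrative sketch only)."""

    def __init__(self, max_queue_size=64):
        self._max_queue_size = max_queue_size
        # account_id -> set of subscriber queues
        self._subscribers = defaultdict(set)

    def subscribe(self, account_id):
        queue = asyncio.Queue(maxsize=self._max_queue_size)
        self._subscribers[account_id].add(queue)
        return queue

    def unsubscribe(self, account_id, queue):
        self._subscribers[account_id].discard(queue)
        if not self._subscribers[account_id]:
            del self._subscribers[account_id]

    def publish(self, account_id, event):
        """Deliver to every subscriber of this account; return the delivered count."""
        delivered = 0
        for queue in list(self._subscribers.get(account_id, ())):
            try:
                queue.put_nowait(event)
                delivered += 1
            except asyncio.QueueFull:
                # Drop-on-full: a slow subscriber loses the event rather
                # than applying back-pressure to the publisher.
                pass
        return delivered
```

Because `publish` never awaits, a stalled SSE client cannot delay handoff creation. The trade-off is that a subscriber whose queue is full silently misses events.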
Michael Chihlas	d0ebdef9e8	fix(ai): full-sweep audit — placeholders only in system prompts + CI guardrail The "AI parrots example content from system prompt" bug bit us twice in one day across two different prompt sites. Patching individual prompts is treating the symptom; this commit makes the rule structural. Audit + sanitize: - assistant_chat_service.ASSISTANT_SYSTEM_PROMPT — already cleaned in prior commits, but the [FORK] schema still had literal "Brief reason" / "Short name" / "One sentence" placeholders. Replaced with <angle-bracket> placeholders. Anti-parrot rule itself rewritten to describe the failure mode abstractly instead of naming "jsmith" so the rule no longer trips the guardrail (and so the model doesn't see "jsmith" as a token at all). - ai_chat_service.py — removed three concrete-example offenders: "Get-Service ADSync" command literal, the "DC01 server_name" intake form payload (in two places), and the inline interview demos using "Azure AD Sync failures" / "Exchange Online mailbox migration". Replaced with technology-neutral schema descriptions. - ai_tree_generator_service.BRANCH_DETAIL_SYSTEM_PROMPT — replaced the fully-fleshed DNS troubleshooting tree (with literal Dnscache / ipconfig / google.com / Start-Service) with a placeholder schema showing only ID-linkage shape. - kb_conversion_service.PROCEDURAL_SYSTEM_PROMPT — replaced the worked Server Manager + DC01 example payload with a placeholder schema. Guardrail (tests/test_prompt_anti_parrot.py): - Imports every module under app/services/ and app/core/ and walks every uppercase string constant ending in _PROMPT, _SCHEMA, _PROTOCOL, _FORMAT, or _CONTEXT. - test 1: known-leaked-token list (jsmith, DC01, ADSync, Dnscache, google.com, "Outlook keeps", "Teams drops") must not appear in any prompt constant. Add to the list when a new leak shows up in prod — the list IS the audit trail. 
- test 2: marker blocks ([QUESTIONS], [ACTIONS], [SUGGEST_FIX], etc.) must contain placeholders only. Distinguishes JSON keys (followed by ':', allowed) from JSON values (followed by ',' / ']' / '}', must be <placeholder>); allows pipe-separated enum types (text\|password\|select) and a small set of fixed enum values (question, diagnostic_check, decision, action, ...). Verified by feeding the test a known-bad block — caught it correctly. Documented the rule in CLAUDE.md → AI / FlowPilot lessons, naming the test as the enforcement point so future contributors know how to extend it (add to the known-leaked list when a new leak surfaces). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 02:09:30 -04:00
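Test 1 of the guardrail described above (scan uppercase prompt constants for known-leaked tokens) could be approximated as follows. This is a sketch under assumptions: the helper names are hypothetical, matching is assumed to be a case-sensitive substring check, and a single module is passed in rather than every module under app/services/ and app/core/.

```python
PROMPT_SUFFIXES = ("_PROMPT", "_SCHEMA", "_PROTOCOL", "_FORMAT", "_CONTEXT")

# Token list taken from the commit message; extend it when a new leak surfaces.
KNOWN_LEAKED_TOKENS = [
    "jsmith", "DC01", "ADSync", "Dnscache", "google.com",
    "Outlook keeps", "Teams drops",
]


def iter_prompt_constants(module):
    """Yield (name, value) for uppercase string constants with a prompt-like suffix."""
    for name, value in vars(module).items():
        if name.isupper() and name.endswith(PROMPT_SUFFIXES) and isinstance(value, str):
            yield name, value


def find_leaks(module):
    """Return (constant_name, token) pairs for every leaked token found."""
    return [
        (name, token)
        for name, value in iter_prompt_constants(module)
        for token in KNOWN_LEAKED_TOKENS
        if token in value
    ]
```

A pytest case would then assert that `find_leaks(module)` is empty for each imported module.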
Michael Chihlas	fa61376303	feat(pilot): Phase 5 — inline Script Generator integration Wires the SuggestedFix card to an inline panel that handles both cases: template-matched fixes open the Script Library generator with parameters pre-filled from session context; un-matched fixes open the three-option dialog (one_off / draft_template / build_template). The decision endpoint records the path choice with side effects: draft_template persists a draft_templates row via a Sonnet-driven TemplateExtractionService; build_template returns a redirect to the Script Builder; one_off just records the choice. Backend: - TemplateExtractionService: drafts a parameter schema from a concrete rendered script. Conservative by default ("prefer fewer parameters"). Round-trip-validates that templated_body only references declared parameters; missing-key mismatch falls back to the original script with no params. LLM/parse failures fall back identically — the engineer can still create a draft and refine in the post-resolve prompt (Phase 6). - /suggested-fixes/{fix_id}/decision side effects: * one_off → returns rendered_script (engineer's edited version or the fix's ai_drafted_script verbatim) * draft_template → same + creates draft_templates row with extracted params, returns draft_template_id * build_template → returns redirect_path=/scripts/builder?from_session= &fix= so the frontend can navigate to the builder pre-loaded - 400 when a non-template fix has no ai_drafted_script (template-matched fixes take the dedicated /scripts/generate path, not this endpoint). - 12 tests: TemplateExtractionService parse + fallback paths, all four decision branches, edited_script override, missing-script 400. Frontend: - src/components/pilot/script/{TemplateMatchPanel, NoTemplateDialog, ParameterizationPreview}.tsx — inline panels rendered in the task lane's bottom slot when the engineer clicks a SuggestedFix card. 
- TemplateMatchPanel: loads template via /scripts/templates/{id}, pre-fills params from fix.ai_drafted_parameters with cyan "from session" tags, generates via existing /scripts/generate (already bumps state_version on ai_session_id from Phase 3). 404 falls back with a clear message instead of erroring. - NoTemplateDialog: shows the AI-drafted script with proposed parameter values highlighted in amber via ParameterizationPreview; three option cards with the middle (draft_template) flagged Recommended; inline edit on the script body before deciding. - SuggestedFix card now clickable: onActivate toggles the inline panel. - AssistantChatPage: scriptPanelOpen state + handleScriptDecision that navigates on build_template and toasts on the other paths. Active fix changes auto-close the panel so engineers don't act on stale state. - Cmd+K → "Open inline Script Generator" palette entry surfaces only on /pilot/:id routes; fires a window event the chat page subscribes to. No Resolve shortcut added per Section 14 decision (browser ⌘R conflict). Verified 2026-04-22 against the dev stack: - one_off / draft_template / build_template all return the right shape with real Sonnet TemplateExtractionService for the draft path. - Conservative extraction confirmed: cmdkey + Restart-Process script yielded zero proposed parameters as intended. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 00:15:29 -04:00
Michael Chihlas	8fd2c1bac6	feat(pilot): Phase 4 — Resolve + Escalate PSA writebacks with status verification Wires the preview popover's Confirm & post action to ConnectWise (and, via the provider pattern, any future PSA). Adds the parallel Escalate flow with the handoff-oriented five-section markdown. Sessions without a linked PSA ticket resolve/escalate locally — markdown stored, status flipped, nothing posted externally. Backend: - EscalationPackageGeneratorService: Sonnet, five sections (Problem / What we've confirmed / What we've tried / Current hypothesis / Suggested next steps). Shares the preview_cache with a separate KIND so Resolve and Escalate previews for the same state coexist. - PSAWritebackService: post_resolution_note (RESOLUTION note type, customer-visible), post_escalation_package (INTERNAL_ANALYSIS, handoff for the next engineer only), transition_ticket_status with mandatory re-fetch verification. PSAStatusVerificationError surfaces loudly when CW silently rejects a status change — the ConnectWise anti-pattern CLAUDE.md flags. - Endpoints: * POST /ai-sessions/{id}/escalation-package/preview * POST /ai-sessions/{id}/resolution-note/post * POST /ai-sessions/{id}/escalation-package/post Outcomes: "resolved" / "escalated" with external_id + verified status, "resolved_local" / "escalated_local" when no PSA linked. - Target CW status IDs live in account_settings.preferences (cw_resolved_status_id, cw_escalated_status_id). When unset, the post proceeds without a status transition — response includes a status_transition_skipped_reason rather than silently erroring. - 7 tests: local-only path, PSA happy path with verified transition, status verification failure → 502, skipped transition when unconfigured, 409 on already-resolved re-post, escalate parallel path, internal-analysis note type enforced. 
Frontend: - ResolutionNotePreview now kind-parameterized ('resolve' \| 'escalate') with inline edit + Confirm & post. Preview loads from the matching backend endpoint; posting calls the matching endpoint; outcome toast surfaces the verified CW status or the local-only result. - AssistantChatPage: previewKind state replaces previewOpen; two toggle buttons (Preview Resolve note / Escalate instead) in the lane's bottom slot. handleConfirmPost dispatches by kind. Verified 2026-04-22: - Local-only Resolve + Escalate round-trip against the dev stack. - Live Sonnet escalation-package preview; cache hit on repeat call with no state change (separate cache kind from resolution-note). - PSA post + status-verification paths covered by mocked-provider pytest cases. Live CW round-trip pending a test CW instance. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 23:54:54 -04:00
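The "mandatory re-fetch verification" in the commit above amounts to: write the status, read the ticket back, and fail loudly if the PSA did not apply the change. A minimal synchronous sketch follows. The provider methods `update_status` and `get_ticket` and the `status_id` key are hypothetical; only `transition_ticket_status` and `PSAStatusVerificationError` are names from the commit, and the real service is likely async.

```python
class PSAStatusVerificationError(Exception):
    """Raised when the PSA accepted the call but the status did not change."""


def transition_ticket_status(provider, ticket_id, target_status_id):
    """Update a ticket status, then re-fetch to confirm it took effect.

    `provider` is any object exposing update_status() and get_ticket();
    both method names are assumptions for this sketch.
    """
    provider.update_status(ticket_id, target_status_id)

    # Never trust the write response alone: read the ticket back.
    actual = provider.get_ticket(ticket_id)["status_id"]
    if actual != target_status_id:
        raise PSAStatusVerificationError(
            f"ticket {ticket_id}: expected status {target_status_id}, found {actual}"
        )
    return actual
```

An endpoint built on this could translate the exception into the 502 response listed among the tests.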
Michael Chihlas	66e592096c	feat(pilot): Phase 3 — Suggested fix tracking + Resolve preview with state_version cache Adds the AI-proposed resolution path and the inline preview of the markdown that will be posted to the customer ticket on Resolve. The preview is keyed on (session_id, ai_sessions.state_version) so back-to- back fetches against unchanged state hit an in-process cache instead of paying for a Sonnet call. Backend: - preview_cache: in-process LRU keyed on (kind, session_id, state_version). No TTL — state_version is the source of truth. Soft-cap 5000 entries. - unified_chat_service: [SUGGEST_FIX] parser (last-block-wins, JSON payload, confidence clamped 0-100), supersession persistence (sets superseded_at on prior active row), atomic state_version bump. - ResolutionNoteGeneratorService: pulls session, facts, active fix, and redacted script_generations into a structured input bundle for Sonnet; produces the four-section markdown (Problem / What we confirmed / Root cause / Resolution). Sensitive script parameters redacted via ScriptTemplateEngine.redact_sensitive driven by the template's parameters_schema. - /api/v1/ai-sessions/{id}/suggested-fixes/active — 200 with the active fix or 404. - /api/v1/ai-sessions/{id}/suggested-fixes/{fix_id}/decision — records one_off / draft_template / build_template / dismissed; dismiss supersedes; bumps state_version. 409 on dismissing an already- superseded fix. - /api/v1/ai-sessions/{id}/resolution-note/preview — generates or returns cached markdown; from_cache flag in payload signals cache hit. - scripts.py POST /generate now bumps state_version on the linked ai_session_id when present (third source of preview-cache invalidation per Section 5.5). - ASSISTANT_SYSTEM_PROMPT documents [SUGGEST_FIX] (when to/not to emit, format, supersession semantics). 
- 12 tests covering the parser (well-formed, last-wins, malformed, confidence clamping), supersession + state_version invariant, all decision branches, preview cache hit-on-no-change + miss-after-write. Frontend: - src/components/pilot/sections/SuggestedFix.tsx — amber-accented card with confidence badge; dismiss action wired to the decision endpoint. - src/components/pilot/ResolutionNotePreview.tsx — popover with refresh, loading state, cached/fresh indicator, ticket-ref display. - src/api/sessionSuggestedFixes.ts — typed client; getActive normalizes 404 to null so callers don't have to special-case. - TaskLane gains suggestedFixSlot + bottomSlot props (rendered after Diagnostic Checks; bottomSlot anchors the Resolve action). - AssistantChatPage: refreshSessionDerived helper batches fact + fix refresh; fact mutations and chat sends both schedule a 500ms-debounced preview refresh per the Section 5.5 spec. Verified end-to-end against the dev stack with a real Sonnet call: - /active 404 → fact create → preview generates four-section markdown grounded only in provided facts → second preview call hits cache (from_cache=true, no LLM call) → fact write 2 → cache miss, regenerates. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 21:45:52 -04:00
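The preview cache from the commit above (in-process LRU keyed on kind, session_id, and state_version, with no TTL and a soft cap) can be illustrated with an OrderedDict. The class and method names are assumptions; the key shape and the 5000-entry cap come from the commit message.

```python
from collections import OrderedDict


class PreviewCache:
    """LRU cache keyed on (kind, session_id, state_version); illustrative only."""

    def __init__(self, max_entries=5000):
        self._max_entries = max_entries
        self._entries = OrderedDict()

    def get(self, kind, session_id, state_version):
        key = (kind, session_id, state_version)
        if key not in self._entries:
            return None
        # Mark as most recently used.
        self._entries.move_to_end(key)
        return self._entries[key]

    def put(self, kind, session_id, state_version, markdown):
        key = (kind, session_id, state_version)
        self._entries[key] = markdown
        self._entries.move_to_end(key)
        # Evict least recently used entries once the cap is exceeded.
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)
```

No explicit invalidation is needed in this design: any write that bumps state_version changes the key, so the next lookup misses and stale entries simply age out of the LRU.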
Michael Chihlas	625dba7548	feat(pilot): Phase 2 — What we know (facts) with stable task-lane IDs Adds the load-bearing structural feature of the FlowPilot migration: a "What we know" panel that holds confirmed facts for a session, fed by AI [PROMOTE] markers and engineer-added notes. Facts feed the resolution note preview (Phase 3) and survive across turns via stable UUIDs assigned to pending_task_lane items. Backend: - FactSynthesisService: create/update/soft-delete facts with atomic state_version bumps; LLM-backed synthesize_from_question/check on the fact_synthesis (Haiku) action tier per Section 6.6. - /api/v1/ai-sessions/{id}/facts CRUD + /facts/promote (proposed_text or via synthesis). PATCH returns 403 for question/diagnostic_check facts (edit the source item instead, Section 7.3). - unified_chat_service: [PROMOTE] marker parser (JSON-block per Section 8.1 spec drift note), stable-UUID assignment for pending_task_lane questions/actions preserved by exact text/label match across turns. - ASSISTANT_SYSTEM_PROMPT: documents [PROMOTE] format, when to/not to emit, hallucination guardrails, source_ref handling. - 17 tests covering parser, stable IDs, service validation, CRUD, editability rule, both promote modes, 422 null-synthesis path, state_version invariant. Frontend: - src/components/pilot/sections/{WhatWeKnow,WhatWeKnowItem,AddNoteButton} — green-gradient section above Questions, dashed-circle check, inline edit/delete gated by the server's editable flag. - TaskLane gains a whatWeKnowSlot prop (existing assistant/ folder kept per the doc's "rename is opportunistic" guidance). - AssistantChatPage fetches facts on selectChat and refetches after each chat send (so [PROMOTE]-synthesized facts appear immediately); auto- opens the lane when facts exist. Verification: end-to-end smoke against the local docker stack confirms all five endpoints (list/create/patch/delete/promote) plus the 403 editability rule. pytest suite verifies the same with mocked LLM. 
Live [PROMOTE] flow remains untested until used in the UI — the marker shape is covered by parser tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 21:13:44 -04:00
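The stable-ID rule from the Phase 2 commit above (task-lane items keep their UUID across turns when their text matches exactly) can be sketched like this. The function name and the `text` / `id` keys are assumptions about the item shape, not the service's actual schema.

```python
import uuid


def assign_stable_ids(previous_items, current_items):
    """Reuse the previous UUID for items whose text matches exactly.

    Items are plain dicts with a "text" key (hypothetical shape); anything
    without an exact match in the previous turn gets a fresh UUID.
    """
    known = {item["text"]: item["id"] for item in previous_items}
    result = []
    for item in current_items:
        item_id = known.get(item["text"]) or str(uuid.uuid4())
        result.append({**item, "id": item_id})
    return result
```

Exact matching is deliberately strict: a reworded question is treated as a new item, so facts that reference the old ID never silently point at different content.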
Michael Chihlas	3f0a132058	refactor(ai): rename _call_anthropic_cached → chat_call_cached; extract cache plumbing (Phase 0.4) Renames the chat caller to a name that signals its actual purpose, and factors the reusable cached-system-block + cached-history + cache-usage-log primitives out to app.core.ai_provider so they can be shared with the provider-generic path without pulling MCP/beta/images into the abstract interface. Helpers added to ai_provider.py: - `build_anthropic_chat_messages(history, new_message, images, format_reminder)` — owns: copy history, apply cache_control to last history message, append format reminder to new message, render images as multimodal blocks. Anthropic-shaped by design; do not call from Gemini paths. chat_call_cached keeps exactly the concerns that are unique to the one MCP/beta/multimodal chat caller: - Anthropic beta endpoint invocation - Microsoft Learn MCP server wiring (ENABLE_MCP_MICROSOFT_LEARN) - Retry-without-MCP fallback - Format-reminder content string (declared as module constant) - Phase 0.5 telemetry (mcp.turn, mcp.fallback) Documents in the module docstring AND at the function site that this is the ONE MCP/beta chat caller and should not become the general provider path. MCP/beta/images are features of exactly one optional Anthropic beta endpoint; routing them through AnthropicProvider would leak a provider- specific concern into the abstract interface that also serves Gemini. Behavior change: chat_call_cached now reuses the singleton AnthropicProvider HTTP client via `_get_anthropic_client(...)` instead of instantiating a new `anthropic.AsyncAnthropic(...)` per call. Matches the provider's own pattern and avoids burning connections per-turn. No user-visible difference. No runtime verification from code-server. TODO(phase0-verify) in ai_provider.py tracks the cache-hit verification owed on the new dev env. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 17:03:09 +00:00
Michael Chihlas	da93ae55c3	feat(ai): opt-in structured-system-block caching for one-shot generators (Phase 0.3) Wraps each static system prompt in a single-block list so Phase 0.1's AnthropicProvider applies cache_control: ephemeral automatically (policy α, first block gets marked when no caller-authored cache_control is present). Call sites: - ai_tree_generator.scaffold_branches: SCAFFOLD_SYSTEM_PROMPT (~1k tokens) - ai_tree_generator.generate_branch_detail: BRANCH_DETAIL_SYSTEM_PROMPT (~2.5k tokens with few-shot example); retries inside the same function re-read the cached block instead of paying full input cost on each attempt - kb_conversion.convert_document: TROUBLESHOOTING or PROCEDURAL prompt (each caches independently by text content) - ai_fix.generate_fixes: FIX_SYSTEM_PROMPT on first attempt + corrective retry - script_builder.send_message: SYSTEM_PROMPT_TEMPLATE (per-session language substitution — same-language sessions share cache entries) Each edit includes an inline comment explaining why the block is cacheable (stable-constant, retry-reuse, per-language variant) so a future dev can see the intent at the cache_control marker site. script_builder history caching deliberately deferred — per Phase 0.1 decision (option i), AnthropicProvider does not automatically cache the message list. If script_builder's growing 20-message history turns out to be a visible cost driver via the anthropic.cache telemetry, route that caller through the 0.4 chat wrapper which handles history caching. No runtime verification from code-server; cache-hit behavior will be confirmed against the new dev environment when it's up, per the inline TODO(phase0-verify) in ai_provider.py. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 16:29:45 +00:00
Michael Chihlas	b3be66652e	feat(ai): structured-system-block caching in AnthropicProvider (Phase 0.1) Widens AIProvider.generate_json / generate_text / generate_text_stream signatures to accept `system_prompt: str \| list[SystemBlock]`: - `str` (the existing call shape): passes through uncached, unchanged behavior. Every existing caller stays on the uncached path — no silent behavior change. - `list[SystemBlock]`: enables Anthropic prompt caching via structured system blocks. Caller-authored `cache_control` is honored verbatim (policy α); if no block carries it, the provider applies `cache_control: {"type": "ephemeral"}` to the first block only. Gemini ignores cache_control and concatenates list entries into one system string — the widened signature is strictly additive on that path. Adds `anthropic.cache` structured-log telemetry: on every Anthropic response (streaming included, via `stream.get_final_message()`), logs `cache_read_input_tokens` and `cache_creation_input_tokens`. Telemetry failure in streaming is swallowed so the user-facing stream never breaks. Verification deferred: cannot run from code-server (no Python, no DB, no dev env). TODO(phase0-verify) left inline in the module docstring. First verification task on the new dev environment is to hit any FlowPilot endpoint twice within 5 minutes and confirm the second call shows cache_read_input_tokens > 0 in the `anthropic.cache` log event. If verification fails, that's a debug task on the new env — not a blocker for continuing Phase 0.2/0.3/0.4. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 16:17:12 +00:00
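The caching policy described in the Phase 0.1 commit above (strings pass through uncached; caller-authored cache_control is honored; otherwise only the first block is marked ephemeral) can be sketched as below. The function names are hypothetical, blocks are assumed to be plain dicts in the Anthropic text-block shape, and the newline separator used for Gemini is an assumption, since the commit only says the entries are concatenated.

```python
import copy


def apply_default_cache_control(system_prompt):
    """Return the system prompt with the default cache marker applied.

    - str: returned unchanged (uncached path).
    - list of blocks: if no block carries cache_control, mark the first
      block as ephemeral; otherwise leave the caller's markers untouched.
    """
    if isinstance(system_prompt, str):
        return system_prompt
    blocks = copy.deepcopy(system_prompt)  # never mutate the caller's list
    if blocks and not any("cache_control" in block for block in blocks):
        blocks[0]["cache_control"] = {"type": "ephemeral"}
    return blocks


def flatten_for_gemini(system_prompt):
    """Collapse blocks into one string for a provider without cache_control."""
    if isinstance(system_prompt, str):
        return system_prompt
    return "\n\n".join(block["text"] for block in system_prompt)
```

A call site opts in by wrapping its static prompt in a one-element list, for example `[{"type": "text", "text": SCAFFOLD_SYSTEM_PROMPT}]`.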
chihlasm	abd79bc763	feat: extract network map builder from PR 124 (#137 ) * feat: add device_types table with system seed data Creates DeviceType SQLAlchemy model and migration 073 that provisions the device_types table with 28 system-seeded device types across 7 categories (network, compute, storage, cloud, endpoint, infrastructure, security). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add network_diagrams table Create NetworkDiagram SQLAlchemy model with JSONB nodes/edges, team-scoped with client/asset metadata, and Alembic migration 074. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add Pydantic schemas for device types and network diagrams Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add device types CRUD router Adds GET/POST/PUT/DELETE endpoints at /device-types with team-scoped access. System types are read-only; custom types are scoped to the creating team. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add AI generation service for network diagrams Adds network_diagram_ai_service.py with generate_diagram() function that calls the AI provider to convert plain-English network descriptions into structured DiagramNode/DiagramEdge data. Registers the action in ACTION_MODEL_MAP as a standard-tier route. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add network diagrams CRUD + AI generate + export/import router Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add TypeScript types for network diagrams Adds all interfaces for network diagrams and device types including DiagramNode, DiagramEdge, DeviceProperties, NetworkDiagramResponse, AI generate request/response, import/export shapes, and list item types. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat: add frontend API clients for device types and network diagrams Adds deviceTypesApi (list, create, update, remove) and networkDiagramsApi (list, get, create, update, archive, duplicate, exportJson, importJson, aiGenerate, listClients) following the existing apiClient module pattern. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat: add device registry, DeviceNode, ConnectionEdge for React Flow Creates the React Flow building blocks for the network diagram editor: device type registry with icon/color mappings, DeviceNode component with status indicators and connection handles, ConnectionEdge with per-type styling, and nodeTypes/edgeTypes registration maps. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add DeviceToolbar panel with search, categories, drag-drop, custom type creation Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add PropertiesPanel for node and edge property editing Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add AIAssistPanel with replace and merge modes Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add NetworkCanvas wrapper and DiagramHeader components Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add DiagramEditor page assembling all panels with auto-save and AI generation Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add Network Diagrams list page with search, client filter, import Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add Network Maps to sidebar navigation and router Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve TypeScript errors in DeviceToolbar and DiagramEditor Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve stale selection bug in network diagram PropertiesPanel 
Selection state now stores IDs and derives objects from live arrays, so edits in PropertiesPanel inputs reflect immediately. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add React Flow UI foundation components for network diagrams BaseNode (structured node shell with header/content/footer slots), BaseHandle (styled connection handle), LabeledHandle (handle with port label), NodeStatusIndicator (status border effect), NodeTooltip (hover details via NodeToolbar). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add LabeledGroupNode and AnimatedSvgEdge components GroupNode for subnet/VLAN/site grouping with positioned label badge. AnimatedSvgEdge for traffic flow visualization with animated SVG shape along edge path. Both registered in type maps. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: DeviceNode uses BaseNode, BaseHandle, StatusIndicator, Tooltip Replaces hand-rolled node layout with composable React Flow UI components. Status is now a border effect instead of a dot. Hover tooltip shows hostname, IP, vendor, role, notes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add grouping toolbar items and traffic flow toggle DeviceToolbar gets Subnet/VLAN/Site/DMZ grouping section with drag-drop. PropertiesPanel gets Show Traffic toggle that switches edges between connection and animated types. DiagramEditor handles both device and group node drops. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: address code review findings for React Flow UI integration - Use screenToFlowPosition() for drop coordinates (fixes zoom/pan bug) - Remove duplicate selection border from DeviceNode (BaseNode handles it) - Add w-full to GroupNode for proper container sizing - Remove unused 'selected' destructuring from DeviceNode Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add ISP icon to network diagram device registry Globe icon with accent color, under cloud category. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: improve drag-and-drop feel in network diagram editor Grip icons on draggable toolbar items, press effect on drag start, dashed border overlay with 'Drop to add' text when dragging over canvas. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add ContextMenu component for network diagram editor Charcoal-styled context menu with action factories for node and canvas variants. Viewport-clamped positioning, auto-dismiss on click outside, escape, or scroll. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add useCanvasShortcuts hook for copy/paste/duplicate Keyboard shortcuts with preventDefault and input guard. Clipboard stores nodes with relative positions and edge indices. Paste computes canvas center via screenToFlowPosition. Duplicate offsets +30px. Supports both device and group nodes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: wire context menu and keyboard shortcuts into diagram editor Right-click context menus for nodes (copy/duplicate/delete) and canvas (paste/select-all/fit-view). Right-click selects the node per spec. serializeNodes now handles group nodes correctly. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: context menu dismisses on pane click, ISP in toolbar Context menu now closes when clicking anywhere on the canvas via onPaneClick prop. ISP device added as built-in toolbar item under Internet section so it's always available without a database entry. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: backend code review fixes for network diagrams - Replace legacy Optional imports with modern str | None syntax - Type JSONB columns as Mapped[list[dict[str, Any]]] - Escape SQL LIKE wildcards (%, _) in diagram search - Type DiagramNode.position as Position(x, y) Pydantic model - Wrap AI response parsing in KeyError handler for clean 422 errors - Remove unused Optional/TYPE_CHECKING imports from schemas/models - Extract _get_available_slugs helper to DRY duplicate queries Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: network diagram editor UX — straight edges, snap-to-grid, ISP in Cloud, group resize - Straight edges: replace SmoothStepEdge with BaseEdge + getStraightPath so connections draw direct diagonal lines instead of orthogonal bent paths - Snap-to-grid: add snapToGrid/snapGrid=[20,20] to NetworkCanvas so nodes align consistently when dragged - ISP in Cloud: remove standalone "Internet" sidebar section, inject ISP into the Cloud category loop with search support and correct item count - Group node resize: add NodeResizer to GroupNode (subnet/VLAN/site/DMZ), handles visible when selected; dimensions saved/restored correctly on reload (also fixes group node load bug where type was always 'device') - DiagramNode type: add nodeType and style optional fields Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: network diagram team_id guard + multi-style edge routing Backend: - Guard create_diagram with 422 if current_user.team_id is None (prevents NOT NULL constraint crash for accounts not yet assigned to a team) - Add 
routing field to DiagramEdge schema (straight/curved/step) Frontend: - ConnectionEdge now supports straight (default), curved (bezier), and step (smooth-step) routing per-edge via routing field in edge data - PropertiesPanel Connection section gets a Line Style toggle: Straight | Curved | Step buttons, active state highlights in accent - handleEdgeUpdate and serializeEdges now propagate the routing field - DiagramEdge type gets optional routing field Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat: network diagrams UX overhaul — icons, empty canvas, properties panel - Colorize: semantic category colors for all device types (network=blue, security=orange, compute=emerald, endpoint=amber, storage=violet, cloud=cyan, infra=steel); better icons (Router, ShieldAlert, Boxes, Package, Gauge, PlugZap, Video, Radio); MiniMap uses category colors - Onboard: centered AI generate prompt on empty canvas with 5 MSP-specific example chips, ⌘↵ shortcut, spinner; AIAssistPanel only shown with nodes - Arrange: properties panel — status badge grid at top, fields grouped into Network (IP/Subnet/VLAN) and Hardware (Hostname/Vendor/Model/Role) sections - Delight: segmented topology color bar on listing cards; backend returns category_counts via single extra query on list endpoint - Harden: real PNG export via html-to-image + getNodesBounds/getViewportForBounds - Polish: ChevronDown replaces unicode ▾, click-outside for client filter, consistent spinner in empty prompt Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * chore: drop changelog noise from network extraction * fix: align network map builder with account isolation * feat: add manual create option for network maps * feat: make manual network map creation easier to discover * fix(network-maps): address design critique — harden, normalize, clarify, polish - Archive: two-step inline confirm in card dropdown menu - Delete Device/Edge: two-step inline confirm in PropertiesPanel footer - Context menu 
Delete: floating confirm bar instead of immediate deletion - AI Generate New: two-step confirm when replacing existing diagram nodes - DiagramHeader: show 'Unsaved changes' in amber when isDirty and not saving - deviceRegistry: SECURITY_COLOR #f97316 → #f87171 (deprecated ember orange removed) - CanvasEmptyPrompt: remove backdrop-blur (design system violation) - CanvasEmptyPrompt: remove redundant 'Skip AI' bottom button (duplicate of Build manually card) - CanvasEmptyPrompt: rounded-xl/rounded-2xl → rounded-lg, border-2 → border - Topology bar: h-1 → h-2 + native tooltip with category breakdown - AIAssistPanel: replace pulse-dot loading with spinner (consistent with rest of feature) - ContextMenu: add shadow-lg (consistent with other dropdowns) - DeviceNode tooltip: Position.Bottom → Position.Top (avoids canvas-edge clipping) - CanvasEmptyPrompt: raise ⌘↵ hint from /50 opacity to full text-muted-foreground Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(network-maps): bring to front / send to back layering for nodes Three entry points for z-index control: - Right-click context menu: Bring to Front / Send to Back with ] / [ shortcuts, separated by dividers from copy/delete groups - Properties panel: Layer row with Bring Front + Send Back buttons, tooltip shows keyboard shortcut - Keyboard: ] brings selected node(s) to front, [ sends to back (skips when input focused) Context menu also gains divider support (dividerBefore flag) for visual grouping. Layering handlers use max/min zIndex across all nodes so repeated presses always stack correctly. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: swap switch icon from Layers → Network (Lucide) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat: icon size picker (S/M/L) on device nodes Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat: drag-to-resize device nodes + BrickWallFire for firewall - NodeResizer on DeviceNode (same pattern as group nodes); icon scales proportionally with node width, clamped 16–60px - Removes S/M/L static picker — resize is now direct manipulation - firewall: ShieldAlert → BrickWallFire Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * chore: trigger Railway rebuild Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: add missing hero_001.jpg to git (was untracked, broke Railway deploy) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: ShieldAlert still referenced in CATEGORY_DEFAULTS after icon swap Removed ShieldAlert from imports when swapping firewall icon to BrickWallFire but left it in CATEGORY_DEFAULTS — runtime crash, device toolbar empty. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(network): proportional node resize with locked aspect ratio Nodes grew into rectangles because NodeResizer had no aspect ratio constraint, minWidth != minHeight, and icon/text only scaled from width. 
- DeviceNode: add keepAspectRatio + equal minWidth/minHeight (80×80), maxWidth/maxHeight (280×280), scale icon and label/IP font sizes from Math.min(width, height) so all content grows uniformly - DiagramEditor: set explicit 120×120 style on dropped device nodes so React Flow has a definite starting size for aspect ratio calculation - DiagramEditor: persist device node style (width/height) in serializeNodes and restore it on load so size survives save/reload Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(lint): suppress ESLint errors in network diagram components Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 02:38:01 -04:00
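The backend review fix in the squashed commit above mentions escaping SQL LIKE wildcards (%, _) in diagram search. A minimal Python sketch of that idea follows; the helper names `escape_like` and `contains_pattern` and the choice of backslash as the escape character are assumptions for illustration, not the repository's actual code.

```python
def escape_like(term: str, escape_char: str = "\\") -> str:
    """Escape LIKE wildcards so user input is matched literally.

    Hypothetical helper: the name and the backslash escape character are
    assumptions, not taken from the repository.
    """
    # Escape the escape character first so the later replacements
    # do not double-escape what was just added.
    return (
        term.replace(escape_char, escape_char * 2)
        .replace("%", escape_char + "%")
        .replace("_", escape_char + "_")
    )


def contains_pattern(term: str) -> str:
    """Build a 'contains' pattern around the escaped search term."""
    return f"%{escape_like(term)}%"
```

The resulting pattern only behaves literally if the query also declares the escape character, for example via an `ESCAPE '\'` clause or the `escape` argument that SQLAlchemy's `ilike()` accepts.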
chihlasm	a48660700a	fix: background jobs and lifespan must use BYPASSRLS sessions All code that runs outside a request context (APScheduler jobs, lifespan startup) has no app.current_account_id set, so the app-role session returns 0 rows from every RLS-protected table. Changed to _admin_session_factory (BYPASSRLS) in: - knowledge_flywheel_scheduler.py — queries ai_sessions - psa_retry_scheduler.py — queries psa_post_log - retention_cleanup.py — queries assistant_chats - scheduler.py (_fire_maintenance_schedule, _cleanup_expired_ai_conversations) - main.py (archive_stale_ai_sessions, _process_notification_retries, load_all_schedules at startup) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 03:44:23 +00:00
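The failure mode described in a48660700a (no tenant set outside a request, so every RLS-protected query returns 0 rows) can be illustrated with a pure-Python simulation. This is only a model of the behaviour, not the real PostgreSQL policy or the project's session factories; `current_account_id`, `ROWS`, and `visible_rows` are hypothetical names.

```python
from contextvars import ContextVar
from typing import Dict, List, Optional

# Hypothetical stand-in for the per-request app.current_account_id setting.
current_account_id: ContextVar[Optional[str]] = ContextVar(
    "current_account_id", default=None
)

# Illustrative rows only; not data from the project.
ROWS: List[Dict[str, object]] = [
    {"id": 1, "account_id": "acct-a"},
    {"id": 2, "account_id": "acct-b"},
]


def visible_rows(rows: List[Dict[str, object]], bypass_rls: bool = False) -> List[Dict[str, object]]:
    """Simulate row-level security: filter rows by the current tenant.

    With no tenant set (as in a scheduler job or lifespan startup), the
    app-role view is empty. bypass_rls models a BYPASSRLS admin session.
    """
    if bypass_rls:
        return list(rows)
    account_id = current_account_id.get()
    if account_id is None:
        return []
    return [row for row in rows if row["account_id"] == account_id]
```

In this model a background job that calls `visible_rows(ROWS)` sees nothing, while `visible_rows(ROWS, bypass_rls=True)` sees every row, which mirrors why the jobs were moved to the admin session factory.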
chihlasm	ec322f7cdf	fix: bootstrap service account with BYPASSRLS session	2026-04-12 02:44:36 +00:00
chihlasm	e05472615b	feat: tenant isolation Phase 3 — audit_logs, tree_shares, remaining RLS P3-A: Add account_id to audit_logs model + migration (backfill via user_id → users.account_id). log_audit() gains optional account_id param with fallback SELECT to avoid churn across 40 call sites. P3-B: Add account_id to tree_shares model + migration (backfill via created_by → users.account_id). TreeShare constructor updated in trees.py. P3-C: Enable RLS on 6 remaining tables: step_ratings, step_usage_log, target_lists, session_shares, audit_logs, tree_shares. P3-D: Drop team_id from target_lists — endpoint, schema, and model now use account_id as the sole isolation key. P3-E: Append Phase 3 RLS isolation tests for all 6 tables. test_target_lists.py: fix cross-account test to use Account model (not Team) and set account_id on new User. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-11 07:02:35 +00:00
chihlasm	ce4cfc3240	fix: set account_id on PsaPostLog in psa_post_to_ticket (missed third write path); fix get_admin_db docstring Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-10 07:12:45 +00:00
chihlasm	4f4bc435da	docs: broaden admin_database docstring to cover non-admin BYPASSRLS use cases Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-10 06:51:53 +00:00
chihlasm	b9da0e7107	chore: resolve merge conflicts with main - deps.py: keep require_tenant_context + require_admin_db (RLS deps); drop unused get_tenant_context stub from Phase 0 - categories.py: keep both PLATFORM_ACCOUNT_ID and tenant_filter imports (body uses both) - tenant-isolation spec: keep main's resolved TargetList/teams audit answers Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-10 04:57:39 +00:00
chihlasm	a5c5eb6cc3	fix: convert DATABASE_URL_SYNC from property to overridable field for Alembic superuser URL Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-10 04:03:32 +00:00
chihlasm	b0e5f12897	feat: register RLS transaction-begin listener on app engine at startup Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-10 03:49:49 +00:00
chihlasm	b4f8694f6b	feat: add tenant_context module — ContextVar, transaction listener, tenant_filter Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-10 03:48:34 +00:00
chihlasm	6f1becf21f	feat: add admin_engine and get_admin_db for BYPASSRLS admin endpoints Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-10 03:46:29 +00:00
chihlasm	acbfb3fb37	feat: add ADMIN_DATABASE_URL setting with fallback to DATABASE_URL Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-10 03:45:52 +00:00
chihlasm	a394a1d464	fix: replace account_id=None with PLATFORM_ACCOUNT_ID for global content After migration 174f442795b7 enforces NOT NULL on account_id, all platform/global content must use the sentinel platform account instead of NULL. Three categories of fixes: 1. trees.py: is_default trees now get PLATFORM_ACCOUNT_ID (not None) 2. admin_categories.py: global category CRUD now uses PLATFORM_ACCOUNT_ID 3. categories.py, tags.py, step_categories.py: creation endpoints coerce None → PLATFORM_ACCOUNT_ID; IS NULL filter queries updated to == PLATFORM_ACCOUNT_ID (IS NULL queries returned empty after migration backfilled all global rows to the platform account) Defines PLATFORM_ACCOUNT_ID constant in app/core/service_account.py. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-09 18:35:52 +00:00
chihlasm	b3dba57bc5	feat: tenant isolation Phase 0 — app-layer filters, UUID audit, CI gate (#132) * docs: add tenant data isolation design spec Complete architecture plan for multi-tenant data isolation across all layers (PostgreSQL RLS, application-layer filtering, schema migration, testing strategy, and phased rollout checklist). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: add background job isolation policy to tenant isolation spec Documents policy for all 5 existing background jobs: - Knowledge Flywheel and PSA Retry flagged for account_id threading - Chat Retention already follows correct pattern (model for others) - Maintenance Schedule Firing needs account_id in queries + Session creation - AI Conversation Expiry approved as cross-tenant with justification Adds approved cross-tenant query registry and Phase 2 checklist items. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: add tenant isolation Phase 0 implementation plan 8 tasks covering: CRITICAL copilot hotfix, tenant_filter() helper, get_tenant_context dependency, analytics/category/AI session gap fixes, full UUID endpoint audit, TargetList dead code audit, teams orphan check, and CI grep check for missing tenant filters. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat: add tenant_filter() helper and get_tenant_context dependency tenant_filter(model, account_id) is the canonical app-layer tenant scoping expression. Every query on a tenant table must use it. build_tree_access_filter and build_step_visibility_filter updated to call tenant_filter() internally for the account_id match. get_tenant_context is a FastAPI dependency that returns account_id or raises 403 if the user has no account — prevents raw access to current_user.account_id and centralises the null check. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: scope analytics/flows/{tree_id} to requesting account Any authenticated user could read flow analytics (session counts, completion rates, CSAT) for any tree UUID. Now returns 404 if the tree doesn't belong to the requesting account. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: scope category tree_count to requesting account tree_count on GET /categories/{id} was including trees from all accounts, leaking cross-tenant row counts. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: restrict AI session search to current user only Search endpoint used OR(user_id, account_id), exposing other users' problem_summary and problem_domain within the same account. Sessions are user-scoped only — cross-user access requires explicit escalation or sharing. List and search endpoints now behave consistently. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: add ownership check and 404 responses to ai-sessions endpoints Cross-tenant isolation audit found: - retry-psa-push had NO ownership check (CRITICAL) — any user could retry any session's PSA push - save_task_lane used db.get() without ownership filter, returned 403 revealing existence - get_session returned 403 instead of 404 for unauthorized access - stream_documentation returned 403 instead of 404 All now use query-level user_id filtering and return 404 to avoid revealing existence. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: return 404 instead of 403 for cross-tenant session access All session endpoints (get, update, complete, scratchpad, variables, export, ticket-link) now return 404 instead of 403 when a user tries to access another user's session. This prevents confirming existence of resources across tenant boundaries. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: return 404 instead of 403 for cross-tenant tree access get_tree and update_tree now return 404 when a user cannot access a tree (private tree from another account). Prevents confirming resource existence across tenant boundaries. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: return 404 instead of 403 for cross-tenant step access get_step_or_404 now returns 404 when can_view_step or can_edit_step fails, preventing confirmation of step existence across tenant boundaries. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: return 404 instead of 403 for cross-tenant upload access get_upload_url and delete_upload now return 404 when the upload belongs to a different account/user, preventing resource existence confirmation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: return 404 instead of 403 for cross-tenant share access revoke_share and create_share now return 404 when the caller is not the owner, preventing resource existence confirmation across users. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: return 404 instead of 403 for cross-team tree access in maintenance schedules _get_tree_or_403 now returns 404 when the user's team does not match, preventing confirmation of tree existence across teams. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: return 404 instead of 403 for cross-account tag access get_tag now returns 404 for account-specific tags that belong to another account, preventing resource existence confirmation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: return 404 instead of 403 for cross-account step category access get_step_category now returns 404 for account-specific categories that belong to another account, preventing resource existence confirmation. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * test: add cross-tenant isolation tests for Task 6 UUID audit Tests cover: - Tree GET/PUT returns 404 for cross-account access - Session GET returns 404 for cross-user access - AI session GET returns 404 for cross-user access - AI session retry-psa-push requires ownership - Upload URL returns 404 for cross-account access - Share revoke returns 404 for cross-user access Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: return 404 (not 403) for get_documentation cross-user access; add missing Task 6 tests get_documentation was revealing session existence via 403. Added pre-check query filtering by session_id AND user_id before calling the engine. Also add cross-tenant isolation tests for steps, tags, step_categories, and maintenance_schedules endpoints fixed in Task 6 (TDD was skipped). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: address Task 6 quality review — rename helper, restore 403 for intra-account, add docs test - Rename _get_tree_or_403 → _get_tree_or_404 in maintenance_schedules.py (function now raises 404, old name was misleading) - Restore HTTP 403 for intra-account permission failures in update_tree: same-account users who can see a tree but can't edit it got 404 (wrong); only cross-account lookups should return 404 to avoid confirming existence - Apply same 403/404 distinction to update_tree_visibility - Add test: get_documentation must return 404 for cross-user session access - Add comment documenting owner-only design for documentation endpoints Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * chore: Task 7+8 — TargetList audit, CI tenant-filter grep check Task 7: TargetList dead code audit - Found active code references in 12+ files across backend and frontend (full CRUD API + frontend page + MaintenanceScheduleSection + BatchLaunchModal) - Decision: migrate to account_id in Phase 1 (cannot drop) - DB row count not available from code-server — 
must verify from VPS SSH before Phase 1 migration - Teams orphan check query documented; must run from VPS SSH before Phase 1 - Results documented in spec Section 9 Task 8: CI tenant-filter enforcement check (warn mode) - Create backend/scripts/check_tenant_filters.py Scans endpoint and service files for select() on tenant tables without tenant_filter/account_id/user_id in surrounding context. Currently reports 109 warnings (Phase 1 backlog). Exits 0 (warn mode). - Add Check tenant filter enforcement step to backend CI job Add --fail flag after Phase 1 backlog clears to make it blocking. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: record Phase 0 audit results — 0 orphaned teams, 0 target_list rows Both checks confirmed 2026-04-09 from production DB. Phase 1 migration is safe to proceed. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-09 00:42:19 -04:00
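The 403/404 distinction settled in the Task 6 quality review above (cross-account lookups return 404 so existence is never confirmed, while same-account users who may see but not edit a tree get 403) can be sketched as follows. The `Tree` and `User` shapes and the owner-only edit rule are assumptions made for illustration; the real permission checks are not shown in the log.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class User:
    id: str
    account_id: str


@dataclass(frozen=True)
class Tree:
    id: str
    account_id: str
    owner_id: str  # hypothetical: assume only the owner may edit


def update_tree_status(user: User, tree: Optional[Tree]) -> int:
    """Return the HTTP status an update attempt would produce.

    A missing tree and a tree in another account are indistinguishable
    to the caller: both yield 404.
    """
    if tree is None or tree.account_id != user.account_id:
        return 404
    # Same account: the tree's existence is already visible, so a
    # permission failure can safely be reported as 403.
    if tree.owner_id != user.id:
        return 403
    return 200
```

The key design point is ordering: the tenant check runs before any permission check, so no permission error can leak across account boundaries.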
chihlasm	2f3781bfc2	feat: add generate_text_stream to AnthropicProvider for SSE support Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 23:02:35 +00:00
chihlasm	c6772c6607	perf: singleton AsyncAnthropic client to avoid per-call connection setup Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-28 23:02:07 +00:00
Michael Chihlas	28f8200b36	feat: add Script Builder service and API endpoints - Script Builder service with language-specific system prompts (PowerShell, Bash, Python) - AI-powered script generation with code block extraction and filename detection - Context window management (last 20 messages) and session message limits - REST API: CRUD sessions, send messages, save to Script Library - Rate limiting on message endpoint (10/min), max 5 concurrent sessions per user - Registered script_build action in AI model tier routing (standard tier) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-21 16:58:26 -04:00
chihlasm	10cf5f45eb	refactor: consolidate LLM JSON parsing into shared llm_utils module Extracted duplicate _strip_markdown_fences / _parse_llm_json functions from 7 files into app/services/llm_utils.py. Two shared functions: - strip_markdown_fences(): fence stripping only - parse_llm_json(): fence stripping + JSON parse + error logging Files updated: flowpilot_engine, knowledge_flywheel, session_to_flow_service, ai_tree_generator_service, ai_fix_service, ai_chat_service, kb_conversion_service Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-21 03:25:25 +00:00
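A sketch of what the two shared functions named in 10cf5f45eb might look like. Only the function names and their described roles (fence stripping; fence stripping + JSON parse + error logging) come from the commit message. The regular expression, the `FENCE` constant, and the choice to return `None` on a parse failure are assumptions.

```python
import json
import logging
import re
from typing import Any, Optional

logger = logging.getLogger(__name__)

# Three backticks, built programmatically to keep this example readable.
FENCE = "`" * 3

# Matches one fenced block with an optional language tag (assumed format).
_FENCE_RE = re.compile(
    r"^" + re.escape(FENCE) + r"[\w-]*[ \t]*\n(.*?)\n?" + re.escape(FENCE) + r"$",
    re.DOTALL,
)


def strip_markdown_fences(text: str) -> str:
    """Return the body of a fenced block, or the trimmed text if unfenced."""
    stripped = text.strip()
    match = _FENCE_RE.match(stripped)
    return match.group(1).strip() if match else stripped


def parse_llm_json(text: str) -> Optional[Any]:
    """Strip fences, parse JSON, and log (rather than raise) on failure.

    Returning None on failure is an assumption; the commit only says
    that errors are logged.
    """
    try:
        return json.loads(strip_markdown_fences(text))
    except json.JSONDecodeError:
        logger.warning("LLM response was not valid JSON: %.200s", text)
        return None
```

Centralising this in one module means a change to fence handling is made once instead of in each of the seven services listed in the commit.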
chihlasm	60334cde93	fix(db): add rollback on exception in get_db dependency Prevents InFailedSQLTransaction cascade — when a request fails mid-transaction, the connection is rolled back before being returned to the pool, so subsequent requests on the same connection don't inherit a poisoned transaction state. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-21 01:10:07 +00:00
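The rollback-on-exception pattern from 60334cde93 can be sketched with a yield-style dependency. `FakeSession` is a hypothetical stand-in for an async database session so the example runs without a database; the real `get_db` signature and session type are not shown in the log.

```python
import asyncio
from typing import AsyncIterator, Callable, List


class FakeSession:
    """Hypothetical stand-in for an async DB session; records calls."""

    def __init__(self) -> None:
        self.events: List[str] = []

    async def rollback(self) -> None:
        self.events.append("rollback")

    async def close(self) -> None:
        self.events.append("close")


async def get_db(session_factory: Callable[[], FakeSession]) -> AsyncIterator[FakeSession]:
    """Yield a session; roll back if the request handler raised."""
    session = session_factory()
    try:
        yield session
    except Exception:
        # Clear the failed transaction before the connection is reused.
        await session.rollback()
        raise
    finally:
        await session.close()


async def simulate_failed_request() -> List[str]:
    """Drive the generator the way a framework would on a handler error."""
    session = FakeSession()
    gen = get_db(lambda: session)
    await gen.__anext__()  # dependency setup: handler receives the session
    try:
        await gen.athrow(RuntimeError("handler failed"))
    except RuntimeError:
        pass
    return session.events
```

Running `asyncio.run(simulate_failed_request())` shows the rollback happening before the close, which is the ordering that keeps a failed transaction from being inherited by the next request.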
chihlasm	7518fe643b	fix(cors): return proper responses from middleware instead of re-raising Root cause: Both RequestLoggingMiddleware and ErrorLoggingMiddleware used BaseHTTPMiddleware and re-raised exceptions. When an exception (like a 401 from auth) was re-raised, the response never flowed back through CORSMiddleware, so browsers received error responses without CORS headers. This made 401 errors appear as CORS errors, breaking session resume and other operations after token expiry. Fix: Both middlewares now catch exceptions and return JSONResponse objects (with correct status codes from HTTPException) instead of re-raising. This ensures responses always flow through CORSMiddleware and receive proper Access-Control-Allow-Origin headers. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 06:05:20 +00:00
chihlasm	c7d602cfa5	feat(evidence): add S3 storage service and file_uploads model Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-20 03:15:37 +00:00
chihlasm	0f750e63e0	feat(notifications): add Phase 4 Slice 2 — multi-channel notification system Full notification infrastructure with in-app, email, Slack, and Teams channels: Backend: - NotificationConfig, NotificationLog, Notification models + migration - Notification service with event routing, channel delivery, retry logic - 9 API endpoints (config CRUD + in-app notifications) - APScheduler retry job with exponential backoff (30s, 2m, 10m) - Wired into escalation, proposal approval, and knowledge flywheel - Pydantic event key validation, cross-tenant protection on recipients Frontend: - TypeScript types + API client for all notification endpoints - NotificationsPanel: bell icon with unread badge, dropdown, mark-read - NotificationSettings: channel config, event toggles, test, delete confirm - Notifications tab on IntegrationsPage - ARIA attributes, Escape handler, settings link on panel Review fixes (13 issues resolved): - notify() no longer commits/rolls back caller's transaction (critical) - retry_failed_notifications returns count instead of None (critical) - NotificationSettings moved inside dedicated tab (critical) - target_user_ids scoped by account_id (security) - Email loop collects all failures before raising - Slack webhook validates response body - events_enabled rejects unknown event keys - link column widened to String(500) - Dead code removed from _auto_reinforce - Delete confirmation, ARIA, Escape key support Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-19 12:37:54 +00:00
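The retry schedule quoted in 0f750e63e0 (30s, 2m, 10m) can be expressed as a small lookup. The function name `next_retry_at` and the decision to give up after the third retry are assumptions; the commit only states the delays.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Delays from the commit message: 30 seconds, 2 minutes, 10 minutes.
RETRY_DELAYS = (
    timedelta(seconds=30),
    timedelta(minutes=2),
    timedelta(minutes=10),
)


def next_retry_at(failed_attempts: int, now: datetime) -> Optional[datetime]:
    """Return when to retry after `failed_attempts` failures, or None.

    None means the schedule is exhausted (assumed behaviour: stop retrying).
    """
    if failed_attempts < 1:
        raise ValueError("failed_attempts must be at least 1")
    if failed_attempts > len(RETRY_DELAYS):
        return None
    return now + RETRY_DELAYS[failed_attempts - 1]
```

A scheduler job could then select log rows whose stored retry time has passed, attempt delivery again, and either clear the row or advance it to the next delay.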
chihlasm	2f18056fd1	feat: add security headers middleware with report-only CSP Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-18 02:38:42 +00:00
chihlasm	8534dbfb5f	feat: command palette, PSA ticket context, session-to-flow converter (#108) * feat: add paletteIntent utility for command palette query classification Detects query intent ('question' | 'keyword' | 'page' | 'empty') to drive smart result ordering in the enhanced command palette. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add recentFlows localStorage utility for command palette empty state Tracks recently visited flows (capped at 10) with deduplication by id, surfaced in command palette when query is empty. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: rewrite CommandPalette with categorized results and smart ranking - Adds FlowPilot AI result (always present when query is non-empty) - Intent-aware ordering: question → FlowPilot prominent; page → pages first; keyword → FlowPilot at top with flows/sessions/tags below - Pages section with admin-gated items (uses useAuthStore) - Tags extracted from flow search results with ?tag= navigation - Quick Actions for create/import/scripts - Empty state shows recent flows + quick actions - Grouped rendering with section labels per design system - Keyboard nav flattened across groups Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add FlowPilot prefill handoff from command palette to AssistantChatPage When navigated to /assistant with location.state.prefill, automatically creates a new chat and sends the prefill message without user interaction. Clears location state after handling to prevent re-trigger on back navigation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: track recently visited flows for command palette empty state Calls addRecentFlow after tree data loads in both TreeNavigationPage and ProceduralNavigationPage so the command palette can surface recent flows when the query is empty. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * refactor: use useMemo instead of useCallback for groups builder in CommandPalette Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add PSA ticket context Pydantic schemas (Task 6) Add TicketDetails, CompanyInfo, ContactInfo, ConfigItem, TicketNote, RelatedTicket, and TicketContext models in schemas/psa_context.py for structured ticket context enrichment used by AI prompt injection. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add ticket context prompt formatter (Task 7) format_ticket_context_for_prompt() in services/psa/ticket_context.py serializes TicketContext into structured text for AI system prompts, with 10-note limit, 200-char text previews, and human-readable timestamps. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add get_ticket_context() to ConnectWise provider (Task 8) Fetches ticket details, company, contact, configurations, notes, and related open tickets in parallel via asyncio.gather with partial failure tolerance. Results are cached with a 5-minute TTL per ticket/connection. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add GET /integrations/psa/tickets/{id}/context endpoint (Task 9) Returns rich TicketContext for a ticket ID. Handles PSA auth failures (returns structured error), ticket-not-found (404), and general PSA errors (502). Requires active PSA connection for the user's account. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: inject PSA ticket context into copilot system prompt (Task 10) When a copilot conversation has an associated session with a linked PSA ticket, fetch the ticket context and append it to the system prompt. Failure is non-critical — errors are logged and the copilot proceeds without context rather than failing the request. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add PSA context API client with TypeScript interfaces Defines TicketDetails, CompanyInfo, ContactInfo, ConfigItemInfo, TicketNote, RelatedTicket, and TicketContext interfaces matching backend psa_context.py schemas. Exports psaContextApi with getTicketContext(). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add useTicketContext hook for PSA ticket context fetching Accepts psaTicketId and psaConnectionId, fetches context on mount when both IDs are present, and exposes refresh() for manual re-fetch. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add TicketContextPanel component with accordion sections Glass-card panel showing ticket summary, status/priority/SLA, and accordion sections for Client, Contact, Devices, Notes, and Related tickets. Matches design system with font-label labels and ice-cyan accents. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: mount TicketContextPanel in session runners when ticket is linked ProceduralNavigationPage renders panel in left sidebar below step checklist. TreeNavigationPage renders panel above breadcrumb trail. Both use useTicketContext hook and show panel only when psa_ticket_id is set. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add fallback_steps to TypeScript types (Task 15) Add optional fallback_steps field to ProceduralStep interface. Add FallbackStepRecord interface and fallback_decisions field to Session. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add backend validation for fallback steps (Task 16) Validate fallback_steps in procedural flow validation: required fields, no nested fallback_steps, no duplicate IDs. Add FallbackStepRecord schema and fallback_decisions field to SessionResponse. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: create FallbackSteps UI component (Task 17) Collapsible component supporting edit and execute modes. Edit mode provides title/description inputs with add/remove controls. Execute mode shows "This worked" / "Didn't help" action buttons with emerald/ rose styling. Amber accent styling throughout. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: integrate FallbackSteps into editor and session runner (Task 18) Wire FallbackSteps edit mode into StepEditor for procedure_step type with add/remove/update handlers using crypto.randomUUID(). Add execute mode rendering in ProceduralNavigationPage with fallbackDecisions state tracking per parent step. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add session-to-flow request/response schemas (Task 19) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add session-to-flow AI generation service (Task 20) Converts completed troubleshooting sessions into reusable procedural flows with fallback branches. Includes PSA ticket context integration and AI-generated step validation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add POST /ai/session-to-flow endpoint (Task 21) Converts a completed session into a reusable procedural flow using AI. Includes quota checking, usage recording, and proper error handling. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: add Create Flow from Session button to session detail page (Task 22) Adds sessionToFlow API client, exports from api/index.ts, and integrates a prominent "Create Flow from Session" button on SessionDetailPage for completed sessions. Generates a procedural flow via AI then navigates to the procedural editor. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: cast tree_type to TreeType in session-to-flow creation Fixes build error where string was not assignable to TreeType. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: update Playwright test selectors to match actual UI - Use Control+k instead of Meta+k (Linux/CI compatibility) - Use 'AI Assistant' group label instead of 'FlowPilot AI' - Match actual FlowPilot chat page elements (Start a Conversation, New Chat, textarea) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: update Playwright test selectors to match actual UI - Use specific command palette placeholder to avoid ambiguous matches - Fix 'Quick Actions' scoping (two elements with same text) - Fix 'Resolved' exact match on session detail page - Fix tree editor to use getByText instead of getByDisplayValue - Fix 'Add Step' strict mode by using .first() - Fix fallback description placeholder text - Update playwright.config.ts to use port 5433 and resolutionflow DB - Update FlowPilot chat selectors to match actual page layout 11/17 new tests now passing. Remaining 6 need procedural session navigation investigation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: resolve all Playwright test failures — 16/16 passing - Fix procedural session tests: sessions auto-start, no Start button - Fix strict mode violations: use getByRole('heading') for step titles - Fix FlowPilot chat: use button role selector for New Chat - Fix command palette page nav: scope Analytics click to palette modal - Fix fallback runner: remove non-existent Start button click - Update playwright.config to port 5433 for local Docker Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-16 13:39:17 -04:00
chihlasm	46865882c6	feat: ConnectWise PSA integration (#106) PSA abstraction layer with provider pattern, ConnectWise integration (connection management, ticket linking, note posting, status updates, member mapping), Integrations page UI, Fernet credential encryption, in-memory TTL cache, 6 DB migrations, ConnectWise API reference docs.	2026-03-15 01:45:35 -04:00
chihlasm	d4dbf44781	feat: Script Generator Phase 1+2 — backend, engine, API, frontend, template editor, parameter detector Complete Script Generator feature including: Backend: - ScriptCategory, ScriptTemplate, ScriptGeneration models - ScriptTemplateEngine with substitution, filters, sanitization - CRUD + share API endpoints with permission checks - Integration tests for permissions and sharing - Migration 057 with AD User Management seed templates Frontend — Script Library: - Browse templates with category tabs and search - Configure pane with parameter form and script generation - Script preview with live substitution and copy/download - scriptGeneratorStore Zustand store Frontend — Template Editor: - Full CRUD form with metadata, script body (Monaco Editor), parameters - ParameterSchemaBuilder with visual builder + JSON toggle - ScriptManagePage with routing and nav link Frontend — Parameter Detector: - Client-side PowerShell parameter detection engine - Detects script-level param() blocks and variable assignments - Type inference from PS type annotations and value patterns - ParameterDetectorStepper one-by-one review UI with accept/skip Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-14 20:18:59 -04:00
chihlasm	042a12b190	feat: add landing page with beta signup + raise KB node limit to 100 Landing page at /landing with full marketing content: hero, features, pricing, testimonials, and beta email signup form. Beta signups email beta@resolutionflow.com via new public endpoint. Unauthenticated users redirect to landing instead of login. Also raises KB Accelerator node limit from 50 to 100 to accommodate dense troubleshooting articles. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-12 00:23:29 -04:00
chihlasm	458c2d9cab	fix: prevent circular parent_node_id in KB troubleshooting import AI-generated trees can have circular next_node_id references (e.g., node A → B → A). The parent mapping now checks for cycles before assigning parent_node_id, preventing FK deadlocks during insert. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-11 23:46:49 -04:00
chihlasm	efafcff4b2	fix: topological insert for KB import nodes to satisfy parent FK Nodes with parent_node_id references were inserted in a single batch, causing FK violations when children were inserted before their parents. Now inserts roots first, flushes, then children in subsequent passes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-11 23:41:25 -04:00
chihlasm	03390ed59f	feat: enable Markdown (.md) file upload in KB Accelerator Moved md from Phase 2 extensions to allowed formats, added extraction handler (reuses txt handler), and updated plan_limits defaults to include md for all plans. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-11 23:29:51 -04:00
Michael Chihlas	8c73233dd0	fix: KB conversion — increase max_tokens, add JSON repair, improve error handling - Increase max_tokens from 8192 to 16384 to prevent truncation on long articles - Add _try_repair_json() that fixes trailing commas and attempts to close unclosed brackets/braces from truncated AI responses - Log full raw response (first 2000 chars) on parse failure for debugging - Set status to 'failed' with user-friendly error message instead of leaving imports stuck in 'processing' state Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-11 02:57:27 -04:00
Michael Chihlas	91d2bc6df3	fix: KB Accelerator tree builder, approve all, canvas delete button - Fix _build_troubleshooting_tree() to handle deep nesting, warning nodes, and DAG deduplication (placed set prevents duplicate IDs) - Fix step_sync VARCHAR(255) overflow on publish by truncating title - Add "Approve All" button to KB review screen - Add delete button (hover-reveal) to flow canvas nodes Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-11 01:59:03 -04:00
Michael Chihlas	71ff4a8c35	feat: KB Accelerator — convert KB articles into interactive flows Full-stack implementation of the KB Accelerator feature that converts static MSP knowledge base articles into interactive troubleshooting and procedural flows using AI. Backend: - Migrations 054/055: kb_imports, kb_import_nodes tables + plan_limits KB columns - SQLAlchemy models with relationships and self-referential node hierarchy - Text extraction service (txt, paste, docx with structural metadata) - AI conversion service with MSP-specialist prompts for both flow types - 8 API endpoints: upload, get, list, convert, edit node, commit, delete, quota - Tier-gated access via plan_limits (free: 3 lifetime, pro/team: unlimited) - 8 integration tests covering upload, get/list, quota, commit, delete Frontend: - TypeScript types and API client for all KB Accelerator endpoints - Multi-step wizard page: upload → processing → review → success - Upload screen with paste/file tabs, drag-drop, target type selector - Two-panel review screen with source highlighting and node cards - Per-node actions: approve, edit, regenerate, insert, delete - Confidence color indicators (green/amber/red) - Sidebar navigation with Sparkles icon - Code-split lazy-loaded route at /kb-accelerator Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 20:56:28 -04:00
chihlasm	e3a1e6fb75	feat: Sentry error monitoring for React frontend (#98) * feat: add Sentry error monitoring, tracing, and session replay - Install @sentry/react and @sentry/vite-plugin - Create instrument.ts with error monitoring, browser tracing (20% prod), and session replay (10% sessions, 100% on errors) - Wire React 19 reactErrorHandler() on createRoot error hooks - Wrap router with wrapCreateBrowserRouterV7 for route-aware transactions - Configure sentryVitePlugin for source map uploads - Add VITE_SENTRY_DSN to .env.example Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat: add Sentry error monitoring and tracing to FastAPI backend - Install sentry-sdk[fastapi] with auto-enabled FastAPI + Anthropic integrations - Init before app = FastAPI() with env-aware sample rates (100% dev, 20% prod) - Filter /health endpoint from traces to reduce noise - Add SENTRY_DSN to config settings Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-07 19:29:58 -05:00
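
Note on 458c2d9cab and efafcff4b2: the two KB import fixes listed above describe a cycle check before assigning parent_node_id and a roots-first, multi-pass insert. The sketch below illustrates that idea in plain Python under stated assumptions; it is not the repository's code. The function names `assign_parents` and `insertion_passes`, the `(source, target)` link tuples, and the in-memory dict standing in for the database are all hypothetical.

```python
def assign_parents(next_links):
    """Derive a parent for each node from (source, target) links.

    A link is skipped when the target is already the source itself or one of
    its ancestors, because accepting it would close a cycle (A -> B -> A).
    """
    parent = {}

    def is_self_or_ancestor(candidate, node):
        # Walk upward from `node` through already-assigned parents.
        seen = set()
        while node is not None and node not in seen:
            if node == candidate:
                return True
            seen.add(node)
            node = parent.get(node)
        return False

    for source, target in next_links:
        if target in parent:
            continue  # keep the first parent assigned to this node
        if is_self_or_ancestor(target, source):
            continue  # would create a circular parent chain
        parent[target] = source
    return parent


def insertion_passes(node_ids, parent):
    """Group nodes into passes so every parent is placed before its children.

    Pass 1 holds the roots; each later pass holds nodes whose parent was
    placed in an earlier pass. In a real import, a flush would sit between
    passes (assumption, mirroring the commit description).
    """
    placed = set()
    remaining = list(node_ids)
    passes = []
    while remaining:
        ready = [n for n in remaining
                 if parent.get(n) is None or parent[n] in placed]
        if not ready:
            raise ValueError("unresolvable parent references")
        passes.append(ready)
        placed.update(ready)
        remaining = [n for n in remaining if n not in placed]
    return passes


if __name__ == "__main__":
    links = [("A", "B"), ("B", "C"), ("C", "A")]  # last link closes a cycle
    parents = assign_parents(links)
    print(parents)
    print(insertion_passes(["C", "B", "A"], parents))
```

Doing the cycle check at assignment time means the pass builder only ever sees a forest, so it terminates unless a node points at a parent that is missing from the batch, in which case it raises instead of looping.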

106 Commits