resolutionflow

Author	SHA1	Message	Date
Michael Chihlas	0f90c0e199	refactor(sidebar): collapse rail/sections to single-IA, log docs - Sidebar: kill the drifting railGroups + sections dual definition. Single source of truth (workItems / libraryItems / footerItems) rendered in both pinned and rail modes; pin/unpin is a width and label affordance, not an IA switch. Hairline divider replaces section labels. Guides moves to the footer alongside Account. Renames: Home -> Dashboard, History -> Sessions, Insights -> Analytics. - CURRENT-STATE.md: log PR #158 (session impeccable pass + tasklane keyboard flow) under "Recently shipped". - PRODUCT.md: design-context source of truth (users, brand, aesthetic); sibling to DESIGN-SYSTEM.md. - skills-lock.json: lock /impeccable + /documentation-writer skill versions so other sessions reproduce the same tooling state. - Drop stale .impeccable.md. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-04 22:50:19 -04:00
chihlasm	93fa4eac5c	Merge pull request 'feat(guides): rewrite in-product User Guides as Diátaxis how-tos' (#159) from feat/guides-diataxis-rewrite into main	2026-05-02 02:19:53 +00:00
Michael Chihlas	dc71d5873b	docs(ai): mark guides rewrite as merged in handoff and current task Update HANDOFF.md, CURRENT_TASK.md, and SESSION_LOG.md to reflect that PR #159 is being merged into main, replacing the in-flight "uncommitted" language with the merged-state rollup. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 21:25:44 -04:00
Michael Chihlas	307a6285e6	feat(guides): rewrite in-product User Guides as Diátaxis how-tos Replace 15 feature-dump guides with 43 problem-oriented how-tos grouped under 10 categories. Drop Maintenance Flows / AI Assistant / Flow Assist Sparkles; those surfaces no longer exist post-FlowPilot pivot. Rename Step Library → Solutions Library throughout. Correct every "click X in the sidebar" reference to match live labels (Home, History, Tickets, Flows, Scripts, Data, Acct). Schema: add `category: CategoryId` and optional `relatedSlugs` to Guide; new Category type and `categories` const drive hub ordering. GuidesHubPage renders category sections (auto-hides empty); GuideDetailPage renders a related-guides footer when set; GuideCard drops the misleading "N sections" subtitle. Fix step.tip markdown rendering: `bold` rendered literally because tip used plain text instead of the same regex replacement used on instruction. 14 net-new how-tos for FlowPilot-era surfaces with no prior coverage: tasklane keyboard flow, view-what-we-know, ask-AI mid-session, pause-and-leave, resolve, record-fix-outcome, escalate (Escalation Mode), post-docs-to-ticket, send-client-update, build-script-from-scratch, open-suggested-flow, pin-a-flow, invite-teammate. Browser-verified against engineer + owner test users (sidebar labels, account sub-pages, pilot-screen header buttons, Tasks panel, integration form). tsc clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 21:16:51 -04:00
chihlasm	5e10005276	Merge pull request 'feat(session): impeccable pass + tasklane keyboard flow' (#158) from feat/session-distill-quieter into main Reviewed-on: #158 -Michael Chihlas	2026-05-01 21:53:13 +00:00
Michael Chihlas	d3a9031e23	chore(session): bump keyboard hint contrast + drop redundant font-sans Two small ergonomic fixes after the impeccable pass: - TaskLane keyboard hints (⏎ submit · ⇧⏎ newline) under each open input were rendered at text-muted-foreground/70, just shy of legible at a glance. Drop the /70 opacity modifier so they read at full muted weight on first look without becoming visually loud. - 12 sites across the session screen had explicit font-sans utilities, but the body default is already IBM Plex Sans (via --font-sans in index.css and Tailwind v4's default-sans binding). None of the call sites sit inside a font-heading or font-mono cascade, so every font-sans there was a no-op. Drop them. ConcludeSessionModal also had three "text-xs font-sans text-xs" triplets; drop both the redundant font-sans and the doubled text-xs in one pass. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-01 16:50:09 -04:00
Michael Chihlas	708e8b977f	chore(ai): log followup TODOs surfaced during impeccable pass Two backlog entries surfaced while polishing the session screen: - ConcludeSessionModal paused/escalated step forces a single-artifact choice (Ticket Notes / Client Update / Email Draft). Real escalations often need at least two of the three. Recommended shape: multi-select with smart pre-checks per outcome, parallel generation, per-result Copy / Post / Send actions. Feature work, deferred. - bg-card-hover Tailwind class doesn't resolve in CommandPalette. The --color-bg-card-hover token generates bg-bg-card-hover (Tailwind v4 takes the full token name minus --color-). Other call sites use the explicit hover:bg-[var(--color-bg-card-hover)] form that works; the CommandPalette classes silently produce nothing. Fix is two lines — swap to the explicit form, or add a --color-card-hover semantic mapping in index.css. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-01 16:23:15 -04:00
Michael Chihlas	8b0358af3b	fix(parameterization): word-boundary check prevents over-eager value match ParameterizationPreview.tokenize() matched highlight values via raw seg.text.startsWith(value, cursor) with no word-boundary check and no minimum length. A param value like "D" (e.g. a drive letter) lit up every capital D in the script body — Get-ADUser, Add-Type, Disable- all rendered as proposed-parameter pills. Add a word-boundary guard: a candidate match is only accepted if either side of the match either falls at start/end of the segment, OR the adjacent character is non-alphanumeric. The guard is conditional on whether the value itself starts/ends with a word char, so values that begin or end in punctuation (e.g. "D:\\Folder") still match cleanly when they sit next to whitespace or punctuation. Surfaced 2026-05-01 while testing the suggested-fix flow with a real PowerShell script. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-01 16:23:05 -04:00
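A minimal sketch of the word-boundary guard described in the commit above. The real `ParameterizationPreview.tokenize()` is not shown in this log, so the helper names (`isWordChar`, `matchesAtWordBoundary`) and the ASCII-only definition of "alphanumeric" are assumptions made for illustration.

```typescript
// Hypothetical helper: true only for ASCII letters and digits.
// `undefined` (reading past either end of the segment) counts as a boundary.
const isWordChar = (ch: string | undefined): boolean =>
  ch !== undefined && /[A-Za-z0-9]/.test(ch);

// Hypothetical stand-in for the match test inside tokenize().
function matchesAtWordBoundary(text: string, value: string, cursor: number): boolean {
  if (value.length === 0 || !text.startsWith(value, cursor)) return false;
  const end = cursor + value.length;
  // Left guard applies only when the value itself starts with a word character.
  if (isWordChar(value[0]) && isWordChar(text[cursor - 1])) return false;
  // Right guard applies only when the value itself ends with a word character.
  if (isWordChar(value[value.length - 1]) && isWordChar(text[end])) return false;
  return true;
}
```

With this guard, the value `D` no longer matches inside `Get-ADUser`, while a value such as `D:\Folder` still matches when it sits next to whitespace or punctuation.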
Michael Chihlas	0156aae684	feat(session): impeccable session-screen pass + tasklane keyboard flow Multi-step UX refactor of the assistant chat session screen, run via the $impeccable skill. Heuristic score moved 24/40 → 33/40 (+9), with the biggest gains on Aesthetic & Minimalist (1→3), Consistency & Standards (1→3), and Recognition Rather Than Recall (2→4). Distill — chat region: - Remove the "Suggested checks" chip strip + selected-chip detail card; the TaskLane is the single canonical home for "what to do next" - Add an inline Next steps · N pending cue above the latest action-bearing AI bubble (anchors attention without duplicating the lane's items) - Link banner ↔ script-panel lifecycle: collapsing or dismissing the ProposalBanner now also hides the InlineNoTemplateDialog / TemplateMatchPanel - Drop backdrop-blur on the handoff-context overlay (DESIGN-SYSTEM hard rule) Quieter — drop decoration overshoot: - Remove 3px side stripes on TaskLane done cards, all 6 ProposalBanner modes, WhatWeKnowItem fact rows - Drop bg-gradient surfaces on WhatWeKnow + every ProposalBanner mode - Drop 2px accent borderTop on the TaskLane header - Replace bordered avatar boxes in banners with inline state-colored icons - Each surface now uses a single decoration channel (top border + inline icon) Layout: - Header consolidates to Resolve + Escalate + ⋯ kebab; Context, New Ticket, Update Ticket, Pause now live behind the kebab on desktop, with feature parity in the existing mobile overflow menu - Messages column anchors to max-w-3xl mx-auto to match the composer - Chat bubbles drop from rounded-2xl to rounded-xl for vocabulary alignment Typeset: - Unify text sizing from 14 distinct sizes (with sub-pixel oddities and rem/px duplicates) to a 5-step scale: 10px / 11px / text-xs / 13px / text-sm WhatWeKnow collapsible: - Header is now a toggle; section body hides when collapsed - Auto-collapses on first render when facts ≥ 5 so Questions / Diagnostic Checks stay above the fold - 
Engineer's choice persists in sessionStorage per session and beats the auto-collapse heuristic on subsequent renders - key=activeChatId on both render sites resets state cleanly across sessions Polish: - Split MessageCircleQuestion into Pencil (question Answer CTA, write affordance) + HelpCircle (per-check Explain toggle, universal help icon) — same icon for two different jobs was a discoverability bug - Drop redundant text-xs from font-sans text-[0.625rem] / text-[0.6875rem] double-class definitions; the more-specific size always wins TaskLane keyboard flow: - Enter submits and auto-advances to the next pending task; Shift+Enter inserts a newline (consistent across question and action textareas — paste events don't fire keydown, so paste-then-Enter still works as expected) - Esc cancels (same as the Cancel button) - After the last pending task is submitted, focus moves to the Send Responses button so the engineer can fire the whole batch with one more keystroke - Subtle hint row under each open input teaches the shortcut Type-check, lint, and build all clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-01 16:22:50 -04:00
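The TaskLane keyboard flow above can be sketched as two pure functions. This is an illustration under stated assumptions, not the component's code: `resolveTaskKey`, `nextFocusTarget`, and the `"send-responses"` sentinel are hypothetical names, and "next pending task" is taken to mean the first task still pending after the submit.

```typescript
type TaskKeyAction = "submit" | "newline" | "cancel" | "none";

// Only the fields of a keyboard event that the flow depends on.
interface KeyInput {
  key: string;
  shiftKey: boolean;
}

// Enter submits, Shift+Enter inserts a newline, Esc cancels.
function resolveTaskKey(e: KeyInput): TaskKeyAction {
  if (e.key === "Escape") return "cancel";
  if (e.key === "Enter") return e.shiftKey ? "newline" : "submit";
  return "none";
}

// After a submit, focus the next pending task, or the Send Responses
// button (hypothetical sentinel id) once nothing is pending.
function nextFocusTarget(pendingIds: readonly string[], submittedId: string): string {
  const [next] = pendingIds.filter((id) => id !== submittedId);
  return next ?? "send-responses";
}
```

A textarea `onKeyDown` handler would call `resolveTaskKey(event)` and call `preventDefault()` only for `"submit"` and `"cancel"`, so Shift+Enter keeps the browser's default newline behavior.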
Michael Chihlas	4d8b107121	wip(handoff): start issue cleanup plan sections 1 and 2 Co-Authored-By: Codex <noreply@openai.com>	2026-05-01 02:04:19 -04:00
Michael Chihlas	a21fe93454	wip(handoff): clean stale TODOs and plan issue cleanup Co-Authored-By: Codex <noreply@openai.com>	2026-05-01 01:47:41 -04:00
Michael Chihlas	595844de0b	wip(handoff): audit TODO and Gitea issue validity Co-Authored-By: Codex <noreply@openai.com>	2026-05-01 01:41:37 -04:00
chihlasm	b74d3cf584	Merge pull request 'chore(ai): post-#156 handoff + log shipped features in CHANGELOG/CURRENT-STATE' (#157) from chore/post-156-handoff into main Reviewed-on: #157 by Michael Chihlas	2026-05-01 04:38:22 +00:00
Michael Chihlas	50ddacdb66	docs: log #155 + #156 in CHANGELOG/CURRENT-STATE Adds Unreleased entries for the Escalation Mode wedge and the suggested-fix Awaiting verification outcome, both user-visible features merged this week. Refreshes CURRENT-STATE last-updated date to 2026-05-01 and adds a "Recently shipped (post-0.1.0.0)" quick-reference block at the top. VERSION untouched (still 0.1.0.0; pre-PMF, no release scheduled). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-01 00:32:01 -04:00
Michael Chihlas	a5e2dcf43f	chore(ai): post-#156 handoff (feature shipped, QA report attached) Updates the .ai/ handoff trio after PR #156 merge: - CURRENT_TASK.md: clear active task; record #156 in Recently shipped alongside #155 with one-line summary and QA-report pointer. - HANDOFF.md: rewrite resume point as "pick next from TODO/roadmap"; document carry-forward env quirks (CONTAINER=1 for Chromium, docker-01 hosts entry, multi-head alembic state). - SESSION_LOG.md: append session entry for QA + merge. Also includes the .gstack/qa-reports/ artifacts (report + 8 screenshots). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-30 23:45:10 -04:00
chihlasm	3ba4532675	Merge PR #156: pending-verification (applied_pending non-terminal outcome) Adds applied_pending non-terminal status, pending_reason column, PendingBanner UI, and review fixes for page-level Resolve/Escalate intercepts. QA: 5/7 scripted checks PASS with concrete evidence. 2 entry-path checks deferred; same handlers verified via tested transitions.	2026-05-01 03:42:10 +00:00
Michael Chihlas	15042af6e2	docs(ai): document docker-exec pattern for hosts without native toolchains The code-server LXC has bun and docker but no python/node/npm on PATH, which left Codex unable to reproduce build/test commands. Adds a 6-line block to PROJECT_CONTEXT.md showing the docker exec resolutionflow_{backend,frontend} form, and updates the AGENTS.md "Tooling you do NOT have" line to point Codex at it instead of suggesting toolchain installs. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-30 23:02:53 -04:00
Michael Chihlas	5bee264d70	fix(suggested-fix-pending): apply PR #156 review fixes - Page-level Resolve patches applied_pending → applied_success before opening the resolution flow, so resolved sessions don't carry a provisional pending fix. - Page-level Escalate intercept now catches applied_pending in addition to verifying/partial; intercept copy generalized from "Verifying state" to "still needs an outcome." - PendingBanner gains a Dismiss action, matching the PR body and the backend's allowed pending → dismissed transition. - resolution_note_generator and escalation_package_generator system prompts no longer include real-looking pending examples (anti-parrot guardrail compliance). Verified via Docker: prompt anti-parrot 2/2, suggested-fix outcome suite 21/21, frontend tsc -b clean, npm run build clean. Co-Authored-By: Codex <noreply@openai.com>	2026-04-30 23:02:46 -04:00
Michael Chihlas	7cee7228dc	docs(ai): refresh handoff for PR #156 (pending-verification feature) Closes out Escalation Mode (PR #155 merged) and pivots active task to the new applied_pending suggested-fix outcome on PR #156. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-30 17:37:08 -04:00
Michael Chihlas	00663a4734	feat(suggested-fix): add applied_pending status for deferred verification Engineer applies a fix but can't verify yet (waiting on client power-cycle, AD replication, async sync). Today the verifying banner forces a synchronous verdict (worked / didn't / partial); anything else means leaving the banner stale or guessing wrong. This adds a fourth outcome that parks the fix in a non-terminal "Awaiting verification" state with a reason ("waiting on what?") and exposes it on the chat-anchored banner so the engineer doesn't lose track. Backend - New non-terminal status `applied_pending` parallel to `applied_partial`. - New `pending_reason` column (nullable Text): the "what are you waiting on?" prose, mirrors `partial_notes`. Required when outcome=applied_pending. - Outcome endpoint allows pending in/out transitions; pending stamps applied_at but NOT verified_at (it's parked, not verified). - Resolution-note + escalation-package prompts handle the new status: resolution note frames the fix as provisional; escalation package surfaces pending verification as the leading hypothesis with reference to what's being waited on. - Migration: add column + extend status CHECK constraint. Frontend - New `BannerMode = 'pending'` + `PendingBanner` component (info-tone, parallel to PartialBanner) with worked / didn't / update-reason actions. - VerifyingBanner overflow menu adds "Waiting to verify…". - Nudge banner's "Still checking" button now actually records pending with a reason, instead of just silencing for the session. - AssistantChatPage banner-mode derivation maps applied_pending → 'pending'.
Tests: 4 new integration tests covering pending notes requirement, reason storage + applied_at/verified_at semantics, pending→success transition, and pending_reason update on re-PATCH. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-30 17:32:37 -04:00
chihlasm	ac42f971fc	Merge PR #155: Escalation Mode wedge (live arrival + magic-moment pickup) Magic-moment handoff-context screen on senior pickup, live SSE escalation arrivals, time-to-first-action metric, role-gated claim with atomic conflict resolution, and chat ownership extension for claimed sessions.	2026-04-30 21:32:16 +00:00
Michael Chihlas	f10649abc2	fix(escalations): atomic claim + self-claim rejection + queue exclusion Codex review pass on the escalation wedge. Reworks claim_session from read-then-write to a conditional UPDATE so two seniors racing can't both win, blocks the original engineer from claiming their own handoff, and filters self-escalated sessions out of the dashboard escalation queue. Also preassigns the handoff UUID before flush so the compatibility escalation_package payload carries it. Removes legacy frontend pickup state (claiming, handleStartHere) that broke tsc --noEmit. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-30 16:21:20 -04:00
Michael Chihlas	ab5e0deaf7	docs(ai): session 3 handoff — QA complete, chat ownership decision logged Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-30 01:32:39 -04:00
Michael Chihlas	f601a0db58	docs(ai): QA complete — escalation mode wedge browser-verified All paths pass. One critical fix: chat endpoint now allows escalated_to_id as a valid sender so the senior can run AI analysis on claimed sessions. PR #155 ready for review. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-30 00:26:18 -04:00
Michael Chihlas	dc69c9ddfb	fix(escalations): allow claimed-by user to send chat messages to escalated session unified_chat_service.send_chat_message checked AISession.user_id == user_id, blocking the senior who claimed an escalation from sending the AI briefing. Now also allows AISession.escalated_to_id == user_id (the claimer). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-30 00:17:31 -04:00
Michael Chihlas	db717b0b3f	feat(escalations): magic-moment 3-option CTA + claim 500 fix - HandoffContextScreen: 3-option layout (Continue/AI analysis/Own thing) with hasTaskLane, activeOptionKey, spinner/disabled states - AssistantChatPage: wire up handleContinue, handleAIAnalysis, handleOwnThing handlers; chip detail expansion inline with copy-button fix; post-escalation redirect to dashboard on ConcludeSessionModal close - TaskLane: fix async copy button (await + execCommand fallback + copiedKey visual feedback); whitespace-pre-wrap on command blocks - Fix 500 on claim: Pydantic v2 model_validate() + model_copy(update={}) (was passing update= kwarg directly which v2 rejects) - HandoffResponse schema: handed_off_by_name field Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-30 00:05:02 -04:00
Michael Chihlas	fb2dc222fd	docs(ai): handoff for fresh session (AI consolidation plan locked) - HANDOFF: rewritten resume point. AI summary blocker is the active task; consolidation plan is the path. 5-step implementation order with watch-outs and breadcrumbs. - CURRENT_TASK: updated commit table through `0d1b305`. Documents the live-test results (what works, the AI summary blocker), full consolidation design with proposed payload shape. - SESSION_LOG: chronological entry covering live QA bash, two pickup bugs found + fixed, the three Enter/dashboard/timeout fixes, and the architectural smell that surfaced. - DECISIONS: new entry "Consolidate the three per-escalation AI calls into one structured generation", with rejected alternatives (bump timeout further, copy status-update content the wrong way, switch to Haiku) and consequences (5s magic-moment, ~60% token reduction, instant Ticket Notes button, schema enforcement required, migration concerns documented). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-29 00:21:30 -04:00
Michael Chihlas	0d1b305619	fix(escalations): live-test fixes from QA bash Bundles four fixes from the live debugging session: 1. AssistantChatPage: replace urlSessionId === activeChatId gate with a loadedChatIdsRef. After `8914391` made activeChatId initialize from urlSessionId, the gate short-circuited fresh mounts and selectChat never fired. Symptom: senior picks up an escalation, lands on a blank chat surface with no conversation history and no sidebar entry. Fix also adds loadChats() in handleStartHere so the picked-up session appears in the sidebar (its escalated_to_id is null pre-claim, so listSessions doesn't return it until claim_session sets it). 2. config: bump ESCALATION_AI_ASSESSMENT_TIMEOUT_SECONDS 15s → 45s. Sonnet was hitting tail latency at 15s in the field, leaving the magic-moment placeholder permanent. Background-task architecture (`e8ba74e`) means this no longer blocks the user; it's just the budget before publishing has_assessment=false. NOTE: live test still shows assessment not populating — see HANDOFF for the consolidation plan that supersedes this. 3. Enter-to-submit: chat-input convention (Enter submits, Shift+Enter inserts newline) on the escalate-flow forms. RichTextInput gains an optional onSubmit prop; EscalateModal wires it to handleSubmit; ConcludeSessionModal gets the same handler on its plain textarea. 4. PendingEscalations: each row is now expandable. Click row body to reveal the engineer's escalation reason, step count on record, confidence tier, and PSA ticket number. Pick Up still clicks through directly. Single-expand-at-a-time keeps the dashboard compact. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-29 00:18:40 -04:00
Michael Chihlas	b7d7ff06d2	docs(ai): refresh handoff for compute swap - HANDOFF: rewritten resume point. First action on resume is `git push` (commits `0f00ee5` and `665530f` are local-only). Visual QA + bug bash is the active work; 4 plan-locked items + the structural task-lane fix all need real-browser verification. - CURRENT_TASK: add `0f00ee5` and `665530f` to the commit table; reframe "Just shipped" as a per-commit summary; flag the task-lane fix as needing visual confirmation. - SESSION_LOG: chronological entry for this session with full detail (audit, four polish items, race-condition wiring, structural task-lane fix, test status, files touched). - DECISIONS: new entry "Tag the task-lane state with an owner chatId" documenting the structural pattern, what was rejected, and the forward implication that future task-lane state slices follow the same owner-tagging pattern. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-28 08:21:23 -04:00
Michael Chihlas	665530f812	fix(assistant-chat): tag task-lane state with owner chatId to kill stale flash The previous fix (`8914391`) only blocked the mount-time sessionStorage restore when the page entered with prefill or ?pickup=true. It didn't cover any path where the page was already mounted and activeChatId flipped without the in-memory task-lane state going through reset+ repopulate cleanly — in-place URL navigation, mid-flight pickup, HMR re-runs, the gap between setActiveChatId(B) and the AI response that finally populates B's questions/actions. Root cause: activeQuestions / activeActions / showTaskLane were never intrinsically tied to a chatId. They were treated as "the active chat's data" by convention, with no structural enforcement. Any window where they survived past their owning chat leaked previous-session data into the new view. The persistence effect made it worse: it stamped the sessionStorage chatId field with activeChatId at write time, so a mid-transition snapshot {chatId: B, questions: [A's]} would happily restore A's data for B on the next mount. Fix: introduce taskLaneOwnerChatId state that records the chatId those in-memory questions/actions/show values BELONG to. Set at every site that populates them (sendPrefill, selectChat, handleSend, handleTaskSubmit, handleResumeNew, refreshFacts, handleApplyFix). Cleared in resetSessionDerivedState. The persistence effect now writes ownerChatId as the chatId tag, not activeChatId — so the snapshot is always self-consistent. Render gate: taskLaneIsForActiveChat = ownerChatId === activeChatId. ANDed into all three render conditions (toolbar Tasks button, narrow- viewport floating drawer, main side panel). The lane is structurally unable to display data tagged with a different chat. The mount-time skipTaskLaneRestore guard stays — it kills the flash between component mount and the first sendPrefill effect run, which the owner-gate alone doesn't cover. 
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-28 02:42:31 -04:00
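The owner-tagging pattern from the commit above can be sketched outside React as plain data plus a render gate. Only `taskLaneIsForActiveChat` and the `ownerChatId === activeChatId` comparison come from the commit message; the `TaskLaneState` shape and the other function names are assumptions for illustration.

```typescript
// Hypothetical shape: the lane's data always travels with the chat id it belongs to.
interface TaskLaneState {
  ownerChatId: string | null;
  questions: string[];
  actions: string[];
  show: boolean;
}

// Every site that populates the lane also stamps the owner.
function populateTaskLane(chatId: string, questions: string[], actions: string[]): TaskLaneState {
  return { ownerChatId: chatId, questions, actions, show: true };
}

// Counterpart of clearing the lane in resetSessionDerivedState.
function resetTaskLane(): TaskLaneState {
  return { ownerChatId: null, questions: [], actions: [], show: false };
}

// Render gate named in the commit: owner must equal the active chat.
function taskLaneIsForActiveChat(lane: TaskLaneState, activeChatId: string | null): boolean {
  return lane.ownerChatId === activeChatId;
}

function shouldRenderLane(lane: TaskLaneState, activeChatId: string | null): boolean {
  return lane.show && taskLaneIsForActiveChat(lane, activeChatId);
}

// The persisted snapshot is tagged with the owner, not the active chat,
// so a mid-transition write cannot pair chat B's id with chat A's data.
function snapshotForStorage(lane: TaskLaneState): string {
  return JSON.stringify({ chatId: lane.ownerChatId, questions: lane.questions, actions: lane.actions });
}
```

If `activeChatId` flips from A to B before B's data arrives, `shouldRenderLane` returns false for the lane still owned by A, which is the structural guarantee the commit describes.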
Michael Chihlas	0f00ee5e01	feat(escalations): close out plan-locked wedge polish Four items from the design-plan audit, all flagged as locked-design or Codex corrections, shipped together so the GTM demo path covers them end-to-end before bug bash. 1. Live AI assessment refresh on the magic-moment screen. Backend already publishes handoff_assessment_ready when enrich_escalation_async commits; wire the frontend listener so the senior sees the assessment populate without a manual reopen. New event type + onAssessmentReady handler on streamEscalations; AssistantChatPage opens a scoped SSE subscription whenever it tracks a handoff missing its assessment, refetches on match, and replaces magicHandoff / overlayHandoff in place. Closes the loop on the async-assessment commit `e8ba74e`. 2. Suggested-step chips below the chat input. Locked design from the plan (Codex correction). Chip strip renders above the composer post-claim when ai_assessment_data.suggested_steps[] is non-empty. Click prefills the input and focuses; first send or explicit X hides for the session. 3. Unread 6px dot on EscalationQueue cards. localStorage-persisted seen set (rf-escalation-seen, capped 200). Dot top-right when not seen. Cleared on open (card click) or claim (Pick Up) — NOT on hover, per Codex correction. Pick Up stops propagation so it doesn't double-fire. 4. Race-condition toast on claim conflict. The /claim endpoint previously silently overwrote claimed_by — both seniors thought they owned the session. New HandoffAlreadyClaimedError carries the winner's id/name/ timestamp; claim_session rejects different-user re-claims (same-user is idempotent for double-click safety); endpoint returns 409 with structured detail. AssistantChatPage.handleStartHere extracts and surfaces "Already claimed by {name} {time_ago}." via toast, drops ?pickup=true, dismisses magic-moment so the loser flows back to queue. Tests: 2 new unit tests in test_handoff_manager.py (conflict raises, same-user idempotent). 
Full handoff + escalation suite (34 tests) green. Frontend tsc -b clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-28 01:59:28 -04:00
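Item 3 above (the persisted seen set behind the unread dot) can be sketched as follows. The key `rf-escalation-seen` and the cap of 200 are from the commit message; the `KeyValueStore` interface, the function names, the JSON-array encoding, and the oldest-first eviction policy are assumptions.

```typescript
// Hypothetical storage shape; the browser's localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const SEEN_KEY = "rf-escalation-seen"; // key named in the commit message
const SEEN_CAP = 200;                  // cap named in the commit message

function loadSeen(store: KeyValueStore): string[] {
  try {
    const parsed: unknown = JSON.parse(store.getItem(SEEN_KEY) ?? "[]");
    return Array.isArray(parsed)
      ? parsed.filter((x): x is string => typeof x === "string")
      : [];
  } catch {
    return []; // corrupt entry: treat every card as unread
  }
}

function markSeen(store: KeyValueStore, id: string): void {
  const next = loadSeen(store).filter((x) => x !== id);
  next.push(id); // most recently seen id goes last
  // Assumption: once over the cap, the oldest ids are evicted first.
  const capped = next.slice(Math.max(0, next.length - SEEN_CAP));
  store.setItem(SEEN_KEY, JSON.stringify(capped));
}

function isUnread(store: KeyValueStore, id: string): boolean {
  return !loadSeen(store).includes(id);
}
```

In the browser this would be called as `markSeen(window.localStorage, handoffId)` on card open or on Pick Up, and not on hover, matching the behavior the commit describes.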
Michael Chihlas	8914391336	fix(assistant-chat): kill stale task-lane flash on new-session entry Two compounding bugs caused the previous session's questions/actions to render briefly when entering a new chat, visible as "the new session instantly pops with old session task-lane data" the user reported. The race - AssistantChatPage's activeQuestions / activeActions / showTaskLane useState initializers synchronously read sessionStorage's rf-tasklane-meta. They restore the persisted task-lane state if its saved chatId matches the freshly-resolved activeChatId. - On dashboard prefill flow, the page mounts on /pilot with location.state.prefill set; activeChatId initializes from sessionStorage's rf-active-chat-id (the previous session). The previous session's task-lane meta matches that chatId, so the initializer restores it. First paint shows old questions/actions. sendPrefill's resetSessionDerivedState fires later from a useEffect, but only after the flash. - Same pattern hits the senior-pickup flow: ?pickup=true means we're about to render the magic-moment screen and discard whatever chat the senior was previously on, but the underlying chat surface still initializes with their old task-lane meta. The amplifier - resetSessionDerivedState wiped the in-memory state but never removed sessionStorage's rf-tasklane-meta. Any remount or reload before the next persistence-effect write could re-hydrate the cleared state from the still-stale sessionStorage entry. Fixes - Initializer guard: when location.state.prefill is set OR ?pickup=true is in the URL, skip the sessionStorage restore entirely. Kills the first-paint flash for both entry paths.
- Eager wipe: resetSessionDerivedState now also calls sessionStorage.removeItem('rf-tasklane-meta'). The persistence effect re-saves on the next state change anyway, so the only window where sessionStorage is empty is the exact window where stale-tag leakage was happening. tsc -b clean. No backend changes. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-28 01:26:29 -04:00
Michael Chihlas	e8ba74ed6d	feat(escalations): distinguishable notifications, async AI, richer sidebar Three improvements driven by live wedge testing. 1) Notification title now includes a problem snippet and PSA ticket suffix when present: "Escalation from Jane · #12345: Outlook is failing to sync email…" Replaces the prior "Session escalated by Jane" copy that made every escalation from the same junior look identical in the bell panel. Snippet is trimmed to 70 chars with ellipsis. handoff_manager now passes psa_ticket_id through in the notify() payload so this works for both /escalate and /handoff entry points. 2) AI enrichment (assessment + enhanced escalation_package) moved to a FastAPI BackgroundTask. The escalating engineer no longer waits on 15-25s of Sonnet latency; handoff creation returns as soon as snapshot, status flip, dual-write, documentation, PSA push, and notify() are committed. enrich_escalation_async opens its own DB session, runs both AI calls, updates handoff.ai_assessment + session.escalation_package, commits, and publishes a new `handoff_assessment_ready` event on the escalation bus. Frontend doesn't yet listen for that event; the magic-moment screen still shows a placeholder ("AI assessment is still generating. Reopen this view in a few seconds…") which is honest about the state. Live polling / auto-refresh on the bus event is the natural next step. 3) ChatSidebar entries now surface the problem summary as a secondary line and tag PSA-linked sessions with a monospace #ticket badge plus an "Escalated" pill on in-transit sessions. ChatListItem grew problem_summary, psa_ticket_id, and status fields; loadChats populates them from listSessions.
The user couldn't tell their own sessions apart in the sidebar because they all rendered as "New Chat" with no distinguishing detail — this fixes that for any session, escalated or not. Test plan - Backend full suite: 1103 passed in 255.85s with -n auto. - Frontend tsc -b clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-28 00:34:32 -04:00
Michael Chihlas	aca915b047	fix(escalations): bump assessment timeout, surface picked-up sessions in sidebar Two field-reported issues from live wedge testing. ESCALATION_AI_ASSESSMENT_TIMEOUT_SECONDS bumped 5s → 15s. The 5s bound fired too aggressively against the Sonnet diagnostic assessment prompt; ~4-8s is typical but tail latency hits 12-14s. The fallback "Assessment unavailable — model didn't respond in time" placeholder was showing on the magic-moment screen for two consecutive escalations, which kills the demo. 15s keeps the click-path bounded but lets the typical case return real content. Real fix is async generation (kick off, persist when done, surface "still computing" with refresh) — captured as a follow-up; bumping the bound is the right call for the wedge demo. list_sessions now matches escalated_to_id == current_user.id alongside the existing user_id and escalation_package.picked_up_by clauses. The unified HandoffManager.claim_session sets escalated_to_id but doesn't write the legacy picked_up_by JSONB key, so picked-up sessions never showed in the senior's chat list — the senior would land on the session detail (active chat) but the sidebar showed only their other unrelated sessions. User reported this as "4 different versions of the session in the chat history section" — they were actually 4 unrelated empty sessions the senior owned, plus the picked-up session was just invisible. Backend tests still 94/94. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-28 00:04:08 -04:00
Michael Chihlas	e910bcc67d	fix(escalations): wire magic-moment + claim into AssistantChatPage The /pilot/:id route renders AssistantChatPage, not FlowPilotSessionPage (the latter is dead code with no active route). The earlier magic-moment integration sat in the wrong file, so clicking Pick Up from the dashboard navigated to /pilot/:id?pickup=true and AssistantChatPage just loaded the chat surface with no claim — the senior never saw the magic-moment screen and the handoff stayed unclaimed (status escalated, permanently in the queue). Adds full pickup awareness to AssistantChatPage: - ?pickup=true on entry triggers a handoff fetch via handoffsApi.listHandoffs (account-scoped, no claim required). magicState transitions loading → visible (handoff found) or loading → dismissed (no handoff or fetch failed). The dismiss path also strips ?pickup=true from the URL so a refresh doesn't re-enter loading state. - The existing selectChat-from-URL effect is gated on magicState — it skips while we're loading or showing the magic-moment so the chat surface doesn't race the claim flow. After claim it re-fires and populates messages from conversation_messages because the senior is now escalated_to_id and GET succeeds. - Magic-moment renders as full-page take-over (sidebar hidden) until Start here. handleStartHere calls handoffsApi.claimHandoff, drops ?pickup=true, and dismisses — the regular chat then loads. - Toolbar Context button (visible when magicHandoff is in memory) re-opens the screen as a dismissible overlay. Lazy-fetches the handoff when needed. Verified tsc -b clean and Vite HMR picked the file up without errors. 
The wire-level integration was already verified in earlier commits: listHandoffs returns the unclaimed handoff for a senior pre-claim, claimHandoff flips status escalated → active and sets escalated_to_id. Note: the prior FlowPilotSessionPage magic-moment integration is now in dead code (file is unreferenced from router). Left in place for this commit; will come out in a follow-up cleanup once we're confident the AssistantChatPage path is solid in production. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 23:23:00 -04:00
Michael Chihlas	5085bb47c2	docs(ai): handoff state after /escalate unification through HandoffManager Records `029680a` — every escalation now funnels through HandoffManager regardless of which URL it entered through, so /escalate from EscalateModal produces the full set of artifacts (handoff row, AppNotification, SSE event, Slack/Teams via notify, per-user emails, documentation, PSA push) and the bell-icon notification opens the magic-moment screen end-to-end. Notes the legacy SessionBriefing branch + flowpilot_engine.escalate_session as orphaned, scheduled for removal after pilots have run a couple of weeks on the unified path. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 22:29:40 -04:00
Michael Chihlas	029680ab2d	feat(escalations): unify /escalate through HandoffManager Replaces the legacy flowpilot_engine.escalate_session orchestration with a single canonical path through HandoffManager. Every escalation now creates a SessionHandoff row, fans out via the SSE bus, persists AppNotification rows for the bell icon, dispatches to external channels (Slack/Teams) via notify(), and emails per-user — regardless of whether the call entered through /escalate (legacy URL) or /handoff (new URL). The senior-pickup magic-moment screen now works end-to-end from the EscalateModal bell-icon path the user just tested. Backend - HandoffCreateRequest gains optional target_user_id (the equivalent of the legacy escalated_to_id field). Self-targeting rejected. - HandoffManager.create_handoff handles intent='escalate' end-to-end: sets escalation_reason + escalated_to_id, builds the legacy enhanced AI escalation_package (Sonnet, lazy-imported from flowpilot_engine, graceful fallback on failure), and merges handoff metadata into it. Eager-loads session.steps and session.user via selectinload — required by both the enhanced-package builder and notify() to avoid MissingGreenlet on async lazy access. - HandoffManager.finalize_escalation generates SessionDocumentation, pushes documentation to PSA, and runs notify() — pre-commit so the AppNotification rows persist atomically with the handoff. - HandoffManager.dispatch_escalation_notifications keeps only the fire-and-forget IO (bus publish, per-user emails) — runs post-commit. Pulls engineer name via a separate User query rather than relying on session.user lazy access. - /handoff endpoint passes target_user_id through and calls finalize_escalation pre-commit. 
- /escalate endpoint is now a thin shim: owner-only session lookup, HandoffManager.create_handoff(intent='escalate'), finalize_escalation, commit, dispatch_escalation_notifications, return SessionCloseResponse built from documentation + psa_result. flowpilot_engine.escalate_session is no longer called by any endpoint. - pickup_session accepts both 'requesting_escalation' (legacy in-flight sessions) and 'escalated' (new canonical) so the migration is seamless for sessions already in the queue. - Escalation queue list and sidebar count now match either status. Frontend - useFlowPilotSession optimistic update flips status to 'escalated' instead of 'requesting_escalation' so the page state matches the unified backend response. Verified end-to-end live: a fresh /escalate call from the junior produces status='escalated', a SessionHandoff row, a SessionDocumentation, PSA push attempted (no_psa for this test session), AND a bell-icon AppNotification for the team admin with link /pilot/{session_id}?pickup=true. Backend test suite: 1103 passed. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 22:27:26 -04:00
Michael Chihlas	2a2329ad19	docs(ai): handoff state after bell-icon fix; record draft PR #155 Updates the handoff trio after the legacy notification flow fix and the branch push. PR #155 is open against main as draft. Resume point is now visual QA via /qa, then deferred follow-ups (chat-input suggested-step chips, snapshot expansion). Logs the open question about whether EscalateModal should switch to /handoff. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 21:33:44 -04:00
Michael Chihlas	641853a002	fix(escalations): bell-icon notification opens the pickup flow Some checks failed Mirror to GitHub / mirror (push) Successful in 4s Details CI / backend (pull_request) Failing after 1m17s Details CI / frontend (pull_request) Successful in 4m53s Details CI / e2e (pull_request) Successful in 9m18s Details Two backend changes that unbreak the senior-pickup path from the notification panel: 1. notification_service: session.escalated link template now ends with ?pickup=true so the senior lands in the handoff/pickup flow on click. Without it, navigation hit /pilot/:id directly, which then 404'd on the GET because the senior isn't yet escalated_to_id — the user perceives this as the bell-icon "just clearing the notification". 2. ai_sessions GET access: any account member can now read an escalated session's detail when status is requesting_escalation or escalated. The owner-only guard was overly restrictive for explicitly-shared in-transit states. Tenant boundary is enforced by RLS on the underlying query, so account-scope is the right ceiling here. After pickup, the existing handler/escalated_to_id checks still apply. Verified live: re-login as the senior engineer and GET the active escalated session — now returns 200 with full detail. Focused test subset plus tests/test_sessions.py and tests/test_session_sharing.py → 94 passed in 43.26s, no regressions. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 21:29:47 -04:00
Michael Chihlas	c194ba4a43	docs(ai): handoff state after magic-moment screen lands Marks the magic-moment handoff-context screen as shipped, points the next session at visual QA + push + draft PR, and captures the deferred follow-ups (suggested-step chips, snapshot expansion, toolbar button on revisits, owner analytics, Playwright e2e). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 21:08:07 -04:00
Michael Chihlas	8e9d22e0e0	feat(escalations): magic-moment handoff-context screen on pickup Adds the dedicated 4-section handoff-context view that renders BEFORE the FlowPilot session for senior techs picking up an escalated session, then dissolves on "Start here". This is the wedge's demonstrable magic moment — what the GTM Loom records. - HandoffContextScreen.tsx: pure presentational, takes a HandoffResponse plus onStartHere / onDismiss callbacks. Sections: header (problem summary, domain, step count, escalated-time, priority badge), "What's been tried" (engineer notes + step-count affordance), "AI assessment" (likely_cause / suggested_steps / confidence badge), Start here CTA. Confidence badge accepts both numeric (0..1) and string ("low"/"medium"/"high") shapes — backend currently emits the latter. Renders an explicit "assessment unavailable" branch when ai_assessment_data is null (the 5s timeout from `9bdd995` fired). Honors prefers-reduced-motion (animate-fade-in vs animate-slide-up). ARIA dialog + focus on the primary CTA. Esc dismisses when used as a re-openable overlay; pre-claim, Start here is the only exit. - FlowPilotSessionPage.tsx: on /pilot/:id?pickup=true, fetch the handoff list via handoffsApi.listHandoffs (account-scoped via RLS, no claim required) and find the latest unclaimed escalate handoff. If found, render the magic-moment screen and skip the regular loadSession (the senior isn't yet escalated_to_id, so GET would 404). Start here calls claimHandoff, drops the pickup query param, dismisses the screen — the existing loadSession effect then fires because the senior is now escalated_to_id. A "Context" toolbar button on active sessions re-opens the screen as a dismissible overlay (visible only when the senior arrived via the magic-moment flow this session — handoff lookup on demand). 
Verified end-to-end against the running dev stack: listHandoffs returns the unclaimed handoff with full payload; claim flips session status from escalated → active; subsequent GET succeeds. tsc -b clean. Defers (TODO followups): suggested-step chips below the chat input that prefill on click (requires threading through to FlowPilotMessageBar); snapshot expansion to include the recent diagnostic steps pre-claim; toolbar Context button on sessions where the senior didn't arrive via magic-moment. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 21:06:14 -04:00
Michael Chihlas	f65b65790c	docs(ai): handoff state after frontend SSE slice lands Marks the SSE subscription as shipped, points the next-session resume target at the magic-moment handoff-context screen, and logs the live end-to-end verification. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 20:57:20 -04:00
Michael Chihlas	b8627f4180	feat(escalations): subscribe EscalationQueue to live SSE arrivals Adds the frontend live-arrival slice on top of the test-stabilized SSE backend. Senior techs now see a junior's escalation slide into the queue without refresh. - streamEscalations(handlers, signal) in aiSessions.ts: fetch-based ReadableStream parser (native EventSource cannot send auth headers). Handles SSE frames, partial frames across chunks, : keepalive heartbeats. Dispatches ready and handoff_created. - HandoffCreatedEvent + EscalationStreamHandlers types mirror the bus payload published by HandoffManager.dispatch_escalation_notifications. - EscalationQueue.tsx: AbortController-managed subscription with exponential-backoff reconnect (1s → 30s cap, attempt counter resets on ready). On handoff_created, refetch and diff against previous IDs via sessionsRef; new arrivals prepended (newest-first) above established cards (oldest-first preserved). Slide-in tag held for 800ms so the locked 200ms animation completes. Tab-title flash prefixes (N) while document.hidden, restores on focus / unmount. prefers-reduced-motion swaps slide-in for fade-in. ARIA region + aria-live=polite + aria-label on heading. Pick Up bumped to py-2.5 to clear the 44px touch floor. Verified end-to-end against the running dev stack: subscriber received the ready frame on connect; after posting a handoff via the API, the subscriber received the handoff_created frame with the expected payload — wire format matches the parser. Backend regression: focused subset still 32 passed in 18.91s. Frontend tsc -b clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 20:57:15 -04:00
Michael Chihlas	02d5c6c08c	docs(ai): refresh handoff state for next-session pickup under 200k context Default Claude Code model is being switched from Opus 4.7 1M-context to Opus 4.7 (200k). Tighten the per-session pickup docs so they're self-sufficient under the smaller window: - CURRENT_TASK now reflects the post-Codex state: 8 commits on the branch (5 feat + WIP SSE + 2 Codex test/latency fixes + 1 doc refresh), 32/32 backend tests with -n auto, frontend tsc -b clean. Remaining work re-scoped: the SSE backend half is feature-complete and tested, so what's left is the FRONTEND SSE subscription in EscalationQueue.tsx, then the magic-moment handoff-context screen, then push + draft PR. - Session log gets a Claude Code entry covering today's planning → build → pause-for-Codex arc, the design decisions locked into the doc and code, the two TODOs added (peer-tech escalation, mobile responsive), and the model-switch context for the next session. - HANDOFF.md needs no change — Codex's update in `9bdd995` already describes the resume point and watch-outs cleanly. No code change. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 20:13:40 -04:00
Michael Chihlas	9bdd9959a8	fix(handoff): bound escalation assessment latency Co-Authored-By: Codex <noreply@openai.com>	2026-04-27 20:03:14 -04:00
Michael Chihlas	fff8338bf2	docs(ai): track escalation assessment latency follow-up Co-Authored-By: Codex <noreply@openai.com>	2026-04-27 19:55:31 -04:00
Michael Chihlas	bc15952857	fix(tests): stabilize escalation SSE backend tests Co-Authored-By: Codex <noreply@openai.com>	2026-04-27 19:47:43 -04:00
Michael Chihlas	ba46fc5644	docs(ai): pause Escalation Mode build mid-SSE for Codex review Update HANDOFF to reflect: - Build paused after the WIP SSE commit (`87bd0b7`) - What Codex should look at on the SSE bus + endpoint + dispatch wiring - Resume point post-review: re-run tests with -n auto, then frontend SSE subscription, then magic-moment screen - Test-suite watch-out: per-test DROP SCHEMA fixture means concurrent pytest runs on the same DB collide; always one-suite-at-a-time or -n auto with conftest's per-worker DB isolation No code change. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 19:29:16 -04:00
Michael Chihlas	87bd0b7c56	WIP: SSE pub/sub for live escalation arrivals (paused for Codex review) First half of the WebSocket/SSE push slice. Paused mid-flight to hand the branch to Codex for outside-voice review before stacking more commits on top. See .ai/HANDOFF.md for the full pause context + what to look at. What's here: - backend/app/core/escalation_bus.py — module-level singleton in-memory pub/sub keyed by account_id. asyncio.Queue per subscriber with 64-event maxsize and drop-on-full semantics. Designed to be swappable for Redis pub/sub when Railway scales past single-replica. - backend/app/api/endpoints/session_handoffs.py — GET /api/v1/ai-sessions/escalations/stream SSE endpoint. Auth via require_engineer_or_admin. 25s heartbeat. Account-scoped subscribe bound to current_user.account_id. - backend/app/services/handoff_manager.py — dispatch_escalation_notifications now publishes a `handoff_created` event to the bus BEFORE the email fan-out, in a try/except so a bus failure can't block email delivery. - backend/tests/test_escalation_bus.py — 7 unit tests, all green standalone (0.14s). Cross-tenant isolation, drop-on-full, no-subscribers. - backend/tests/test_handoff_manager.py — +1 dispatcher integration test (publishes to bus, payload shape). - backend/tests/test_session_handoffs_api.py — +2 endpoint tests (viewer blocked, ready event handshake). 
[gstack-context] Decisions: - SSE over WebSocket (one-way, browser EventSource semantics, fewer moving parts behind Railway proxy) - In-memory bus over Redis for v1 pilot (3 MSPs, single replica) - Drop-on-full subscriber queue rather than back-pressure publishers - Bus publish ahead of email send, both wrapped in try/except so neither can break handoff creation - Frontend will be a fetch-based ReadableStream reader matching the existing streamDocumentation pattern, not native EventSource (custom-header auth) Remaining (post-Codex): - Frontend SSE subscription in EscalationQueue.tsx (slide-in, reconnect, tab-title flash, prefers-reduced-motion) - Magic-moment handoff-context screen - Re-run the full backend test suite to verify the SSE + dispatcher integration tests (bus units already green standalone) Tried: - Running the full test suite repeatedly without xdist; the per-test DROP SCHEMA + recreate fixture made wall-clock prohibitive when multiple stale runs collided on the same Postgres test schema. Resolution: -n auto next time. [/gstack-context] Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 19:29:07 -04:00
Michael Chihlas	a283d0d3fd	docs(ai): refresh handoff state mid-flight on Escalation Mode build Capture the in-flight state of the Escalation Mode wedge build so the next session (or Codex resume) picks up cleanly without re-deriving context: - CURRENT_TASK now describes the wedge, what's done across the 5 commits on this branch, what remains (WebSocket push, magic-moment screen, analytics page, e2e), and the two-metric framing readers MUST internalize before quoting numbers - HANDOFF resume point is the WebSocket/SSE push (live-arrival half of the notification dual-path); includes suggested first slice + watch-outs (no user_id on ai_session_step, denormalized account_id, peer-escalation still gated to session owner) - Both files reference the design doc and the kill-switch criterion No code change. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-27 16:38:14 -04:00
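
The in-memory escalation bus described in commit 87bd0b7c56 (pub/sub keyed by account_id, one asyncio.Queue per subscriber with a 64-event maxsize and drop-on-full semantics) can be sketched roughly as follows. This is an illustrative sketch only, not the contents of backend/app/core/escalation_bus.py: the class name EscalationBus, its method names, and the integer return value of publish are assumptions made for the example.

```python
import asyncio
from collections import defaultdict
from typing import Any, Dict, Set

MAX_QUEUE_SIZE = 64  # per-subscriber bound quoted in the commit message


class EscalationBus:
    """Hypothetical in-memory pub/sub keyed by account_id (names are assumed)."""

    def __init__(self) -> None:
        # account_id -> set of subscriber queues for that account only
        self._subscribers: Dict[str, Set[asyncio.Queue]] = defaultdict(set)

    def subscribe(self, account_id: str) -> asyncio.Queue:
        """Register a bounded queue for one subscriber of one account."""
        queue: asyncio.Queue = asyncio.Queue(maxsize=MAX_QUEUE_SIZE)
        self._subscribers[account_id].add(queue)
        return queue

    def unsubscribe(self, account_id: str, queue: asyncio.Queue) -> None:
        """Remove a subscriber and drop the account entry once it is empty."""
        subscribers = self._subscribers.get(account_id)
        if subscribers is None:
            return
        subscribers.discard(queue)
        if not subscribers:
            del self._subscribers[account_id]

    def publish(self, account_id: str, event: Any) -> int:
        """Deliver an event to every subscriber of account_id.

        A full queue drops the event for that subscriber instead of
        blocking the publisher. Returns how many queues accepted it.
        """
        delivered = 0
        for queue in list(self._subscribers.get(account_id, ())):
            try:
                queue.put_nowait(event)
                delivered += 1
            except asyncio.QueueFull:
                pass  # drop-on-full: a slow subscriber misses this event
        return delivered
```

Under these assumptions, a publish for one account never reaches another account's queues, and a subscriber that stops reading loses events past the 64th rather than slowing down handoff creation.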


1085 Commits