Commit Graph

10 Commits

Author SHA1 Message Date
f10649abc2 fix(escalations): atomic claim + self-claim rejection + queue exclusion
All checks were successful
Mirror to GitHub / mirror (push) Successful in 5s
CI / frontend (pull_request) Successful in 4m59s
CI / backend (pull_request) Successful in 10m22s
CI / e2e (pull_request) Successful in 10m46s
Codex review pass on the escalation wedge. Reworks claim_session from
read-then-write to a conditional UPDATE so two seniors racing can't both
win, blocks the original engineer from claiming their own handoff, and
filters self-escalated sessions out of the dashboard escalation queue.
Also preassigns the handoff UUID before flush so the compatibility
escalation_package payload carries it. Removes legacy frontend pickup
state (claiming, handleStartHere) that broke tsc --noEmit.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-30 16:21:20 -04:00
db717b0b3f feat(escalations): magic-moment 3-option CTA + claim 500 fix
- HandoffContextScreen: 3-option layout (Continue/AI analysis/Own thing)
  with hasTaskLane, activeOptionKey, spinner/disabled states
- AssistantChatPage: wire up handleContinue, handleAIAnalysis, handleOwnThing
  handlers; chip detail expansion inline with copy-button fix; post-escalation
  redirect to dashboard on ConcludeSessionModal close
- TaskLane: fix async copy button (await + execCommand fallback + copiedKey
  visual feedback); whitespace-pre-wrap on command blocks
- Fix 500 on claim: Pydantic v2 model_validate() + model_copy(update={})
  (was passing update= kwarg directly which v2 rejects)
- HandoffResponse schema: handed_off_by_name field

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-30 00:05:02 -04:00
0f00ee5e01 feat(escalations): close out plan-locked wedge polish
Four items from the design-plan audit, all flagged as locked-design or
Codex corrections, shipped together so the GTM demo path covers them
end-to-end before bug bash.

1. Live AI assessment refresh on the magic-moment screen. Backend already
   publishes handoff_assessment_ready when enrich_escalation_async commits;
   wire the frontend listener so the senior sees the assessment populate
   without a manual reopen. New event type + onAssessmentReady handler on
   streamEscalations; AssistantChatPage opens a scoped SSE subscription
   whenever it tracks a handoff missing its assessment, refetches on match,
   and replaces magicHandoff / overlayHandoff in place. Closes the loop on
   the async-assessment commit e8ba74e.

2. Suggested-step chips below the chat input. Locked design from the plan
   (Codex correction). Chip strip renders above the composer post-claim
   when ai_assessment_data.suggested_steps[] is non-empty. Click prefills
   the input and focuses; first send or explicit X hides for the session.

3. Unread 6px dot on EscalationQueue cards. localStorage-persisted seen
   set (rf-escalation-seen, capped 200). Dot top-right when not seen.
   Cleared on open (card click) or claim (Pick Up) — NOT on hover, per
   Codex correction. Pick Up stops propagation so it doesn't double-fire.

4. Race-condition toast on claim conflict. The /claim endpoint previously
   silently overwrote claimed_by — both seniors thought they owned the
   session. New HandoffAlreadyClaimedError carries the winner's id/name/
   timestamp; claim_session rejects different-user re-claims (same-user is
   idempotent for double-click safety); endpoint returns 409 with
   structured detail. AssistantChatPage.handleStartHere extracts and
   surfaces "Already claimed by {name} {time_ago}." via toast, drops
   ?pickup=true, dismisses magic-moment so the loser flows back to queue.

Tests: 2 new unit tests in test_handoff_manager.py (conflict raises,
same-user idempotent). Full handoff + escalation suite (34 tests) green.
Frontend tsc -b clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-28 01:59:28 -04:00
e8ba74ed6d feat(escalations): distinguishable notifications, async AI, richer sidebar
All checks were successful
Mirror to GitHub / mirror (push) Successful in 6m5s
CI / frontend (pull_request) Successful in 11m59s
CI / e2e (pull_request) Successful in 10m7s
CI / backend (pull_request) Successful in 16m22s
Three improvements driven by live wedge testing.

1) Notification title now includes a problem snippet and PSA ticket
   suffix when present:
     "Escalation from Jane · #12345: Outlook is failing to sync email…"
   Replaces the prior "Session escalated by Jane" copy that made every
   escalation from the same junior look identical in the bell panel.
   Snippet is trimmed to 70 chars with ellipsis. handoff_manager now
   passes psa_ticket_id through in the notify() payload so this works
   for both /escalate and /handoff entry points.

2) AI enrichment (assessment + enhanced escalation_package) moved to
   a FastAPI BackgroundTask. The escalating engineer no longer waits
   on 15-25s of Sonnet latency — handoff creation returns as soon as
   snapshot, status flip, dual-write, documentation, PSA push, and
   notify() are committed. enrich_escalation_async opens its own DB
   session, runs both AI calls, updates handoff.ai_assessment +
   session.escalation_package, commits, and publishes a new
   `handoff_assessment_ready` event on the escalation bus. Frontend
   doesn't yet listen for that event — the magic-moment screen still
   shows a placeholder ("AI assessment is still generating. Reopen
   this view in a few seconds…") which is honest about the state.
   Live polling / auto-refresh on the bus event is the natural next
   step.

3) ChatSidebar entries now surface the problem summary as a secondary
   line and tag PSA-linked sessions with a monospace #ticket badge plus
   an "Escalated" pill on in-transit sessions. ChatListItem grew
   problem_summary, psa_ticket_id, and status fields; loadChats
   populates them from listSessions. The user couldn't tell their own
   sessions apart in the sidebar because they all rendered as "New
   Chat" with no distinguishing detail — this fixes that for any
   session, escalated or not.

Test plan
- Backend full suite: 1103 passed in 255.85s with -n auto.
- Frontend tsc -b clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-28 00:34:32 -04:00
029680ab2d feat(escalations): unify /escalate through HandoffManager
All checks were successful
Mirror to GitHub / mirror (push) Successful in 4s
CI / frontend (pull_request) Successful in 5m8s
CI / backend (pull_request) Successful in 10m13s
CI / e2e (pull_request) Successful in 10m47s
Replaces the legacy flowpilot_engine.escalate_session orchestration with
a single canonical path through HandoffManager. Every escalation now
creates a SessionHandoff row, fans out via the SSE bus, persists
AppNotification rows for the bell icon, dispatches to external channels
(Slack/Teams) via notify(), and emails per-user — regardless of whether
the call entered through /escalate (legacy URL) or /handoff (new URL).
The senior-pickup magic-moment screen now works end-to-end from the
EscalateModal bell-icon path the user just tested.

Backend
- HandoffCreateRequest gains optional target_user_id (the equivalent of
  the legacy escalated_to_id field). Self-targeting rejected.
- HandoffManager.create_handoff handles intent='escalate' end-to-end:
  sets escalation_reason + escalated_to_id, builds the legacy enhanced
  AI escalation_package (Sonnet, lazy-imported from flowpilot_engine,
  graceful fallback on failure), and merges handoff metadata into it.
  Eager-loads session.steps and session.user via selectinload — required
  by both the enhanced-package builder and notify() to avoid
  MissingGreenlet on async lazy access.
- HandoffManager.finalize_escalation generates SessionDocumentation,
  pushes documentation to PSA, and runs notify() — pre-commit so the
  AppNotification rows persist atomically with the handoff.
- HandoffManager.dispatch_escalation_notifications keeps only the
  fire-and-forget IO (bus publish, per-user emails) — runs post-commit.
  Pulls engineer name via a separate User query rather than relying on
  session.user lazy access.
- /handoff endpoint passes target_user_id through and calls
  finalize_escalation pre-commit.
- /escalate endpoint is now a thin shim: owner-only session lookup,
  HandoffManager.create_handoff(intent='escalate'), finalize_escalation,
  commit, dispatch_escalation_notifications, return SessionCloseResponse
  built from documentation + psa_result. flowpilot_engine.escalate_session
  is no longer called by any endpoint.
- pickup_session accepts both 'requesting_escalation' (legacy in-flight
  sessions) and 'escalated' (new canonical) so the migration is seamless
  for sessions already in the queue.
- Escalation queue list and sidebar count now match either status.

Frontend
- useFlowPilotSession optimistic update flips status to 'escalated'
  instead of 'requesting_escalation' so the page state matches the
  unified backend response.

Verified end-to-end live: a fresh /escalate call from the junior produces
status='escalated', a SessionHandoff row, a SessionDocumentation, PSA
push attempted (no_psa for this test session), AND a bell-icon
AppNotification for the team admin with link
/pilot/{session_id}?pickup=true. Backend test suite: 1103 passed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-27 22:27:26 -04:00
9bdd9959a8 fix(handoff): bound escalation assessment latency
Co-Authored-By: Codex <noreply@openai.com>
2026-04-27 20:03:14 -04:00
87bd0b7c56 WIP: SSE pub/sub for live escalation arrivals (paused for Codex review)
First half of the WebSocket/SSE push slice. Paused mid-flight to hand
the branch to Codex for outside-voice review before stacking more
commits on top. See .ai/HANDOFF.md for the full pause context + what
to look at.

What's here:
- backend/app/core/escalation_bus.py — module-level singleton in-memory
  pub/sub keyed by account_id. asyncio.Queue per subscriber with
  64-event maxsize and drop-on-full semantics. Designed to be swappable
  for Redis pub/sub when Railway scales past single-replica.
- backend/app/api/endpoints/session_handoffs.py — GET
  /api/v1/ai-sessions/escalations/stream SSE endpoint. Auth via
  require_engineer_or_admin. 25s heartbeat. Account-scoped subscribe
  bound to current_user.account_id.
- backend/app/services/handoff_manager.py — dispatch_escalation_notifications
  now publishes a `handoff_created` event to the bus BEFORE the email
  fan-out, in a try/except so a bus failure can't block email delivery.
- backend/tests/test_escalation_bus.py — 7 unit tests, all green
  standalone (0.14s). Cross-tenant isolation, drop-on-full, no-subscribers.
- backend/tests/test_handoff_manager.py — +1 dispatcher integration test
  (publishes to bus, payload shape).
- backend/tests/test_session_handoffs_api.py — +2 endpoint tests (viewer
  blocked, ready event handshake).

[gstack-context]
Decisions:
  - SSE over WebSocket (one-way, browser EventSource semantics, fewer
    moving parts behind Railway proxy)
  - In-memory bus over Redis for v1 pilot (3 MSPs, single replica)
  - Drop-on-full subscriber queue rather than back-pressure publishers
  - Bus publish ahead of email send, both wrapped in try/except so
    neither can break handoff creation
  - Frontend will be a fetch-based ReadableStream reader matching the
    existing streamDocumentation pattern, not native EventSource
    (custom-header auth)
Remaining (post-Codex):
  - Frontend SSE subscription in EscalationQueue.tsx (slide-in,
    reconnect, tab-title flash, prefers-reduced-motion)
  - Magic-moment handoff-context screen
  - Re-run the full backend test suite to verify the SSE +
    dispatcher integration tests (bus units already green standalone)
Tried:
  - Running the full test suite repeatedly without xdist; the per-test
    DROP SCHEMA + recreate fixture made wall-clock prohibitive when
    multiple stale runs collided on the same Postgres test schema.
    Resolution: -n auto next time.
[/gstack-context]

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-27 19:29:07 -04:00
07d0db9579 feat(handoff): email engineer-or-admin teammates on escalation
First half of the Escalation Mode notification dual-path. WebSocket/SSE
push is the second half (next commit) — email handles offline seniors,
push handles online ones for the magic-moment demo.

HandoffManager.dispatch_escalation_notifications:
- Pulls active engineer/admin/owner-role users in the same account_id
  (excludes the escalator + viewers + soft-deleted)
- Sends via existing EmailService.send_notification_email, concurrent
  via asyncio.gather; per-message failures don't block the rest
- Wrapped in try/except: any exception is logged + swallowed. Handoff
  creation is authoritative; notification is advisory. This is the
  graceful-degradation regression both eng + codex reviews flagged as
  critical (handoff must succeed even if SMTP is down).

Endpoint wiring (POST /ai-sessions/{id}/handoff):
- Dispatch fires AFTER db.commit() — never email about a rolled-back
  handoff. Trust-erosion bug if we got that wrong.
- Only fires for intent=escalate. Park is private to the escalator.

Tests (4 new):
- emails-engineer-recipients-in-account: viewer excluded, escalator
  excluded, only the engineer/admin teammates get the message
- skipped-for-park-intent: park doesn't fan out
- graceful-degradation-when-email-raises: RuntimeError from the email
  service does NOT bubble out of dispatch
- endpoint-dispatches-on-escalate: end-to-end wiring through POST

Per-channel delivery records (replacing the dead `notification_sent`
boolean per Codex correction) is a v1.x story — for now application
logs are the audit trail. See
docs/plans/2026-04-27-escalation-mode-wedge-design.md.

20 tests green across handoff_manager + session_handoffs_api +
flowpilot_analytics_escalations. No regressions.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-27 15:58:05 -04:00
chihlasm
758cd61621 fix: propagate account_id through all write paths missing NOT NULL coverage
Service layer (production code):
- branch_manager: set account_id on SessionBranch (root + fork) and ForkPoint
  from session.account_id; load session in create_fork for this purpose
- handoff_manager: set account_id on SessionHandoff from session.account_id
- ai_suggestions endpoint: set account_id on AISuggestion from current_user
- steps endpoint (/feedback): set account_id on StepRating from current_user
- ratings endpoint: set account_id on StepRating from current_user

Test infrastructure:
- conftest.py: seed PLATFORM_ACCOUNT_ID (00000000-...-0001) account after
  Base.metadata.create_all so global categories and gallery items have a valid FK
- test_rls_isolation: add _ensure_rls_schema fixture that runs
  'alembic upgrade head' before module tests — previous function-scoped
  test_db fixtures drop the schema, leaving the RLS tests with no tables
- test_branding: create Account before User in helper functions
- test_admin_gallery: set account_id=PLATFORM_ACCOUNT_ID on Tree/ScriptTemplate
- test_public_templates: set account_id=PLATFORM_ACCOUNT_ID on Tree,
  ScriptTemplate, TreeCategory
- test_resolution_outputs: set account_id=session.account_id on
  SessionResolutionOutput
- test_analytics_phase5: set account_id on PsaPostLog
- test_draft_trees: replace account_id=None with PLATFORM_ACCOUNT_ID in
  migration default test (NOT NULL now enforced)
- test_maintenance_schedules: set account_id on other_tree
- test_save_session_as_tree: set account_id on all 5 Session() constructors

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-11 04:24:36 +00:00
chihlasm
f84b868d13 feat: add HandoffManager service with dual-write and integration tests
Unified park/escalate handoff management with snapshot generation,
AI diagnostic assessment for escalations (via _call_ai), claim workflow
that reactivates sessions, PSA push via existing psa_documentation_service,
and team queue query. Dual-writes to ai_sessions.escalation_package for
backward compatibility.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 08:42:21 +00:00