Commit Graph

3 Commits

Author SHA1 Message Date
87bd0b7c56 WIP: SSE pub/sub for live escalation arrivals (paused for Codex review)
First half of the WebSocket/SSE push slice. Paused mid-flight to hand
the branch to Codex for outside-voice review before stacking more
commits on top. See .ai/HANDOFF.md for the full pause context + what
to look at.

What's here:
- backend/app/core/escalation_bus.py — module-level singleton in-memory
  pub/sub keyed by account_id. asyncio.Queue per subscriber with
  64-event maxsize and drop-on-full semantics. Designed to be swappable
  for Redis pub/sub when Railway scales past single-replica.
- backend/app/api/endpoints/session_handoffs.py — GET
  /api/v1/ai-sessions/escalations/stream SSE endpoint. Auth via
  require_engineer_or_admin. 25s heartbeat. Account-scoped subscribe
  bound to current_user.account_id.
- backend/app/services/handoff_manager.py — dispatch_escalation_notifications
  now publishes a `handoff_created` event to the bus BEFORE the email
  fan-out, in a try/except so a bus failure can't block email delivery.
- backend/tests/test_escalation_bus.py — 7 unit tests, all green
  standalone (0.14s). Cross-tenant isolation, drop-on-full, no-subscribers.
- backend/tests/test_handoff_manager.py — +1 dispatcher integration test
  (publishes to bus, payload shape).
- backend/tests/test_session_handoffs_api.py — +2 endpoint tests (viewer
  blocked, ready event handshake).

[gstack-context]
Decisions:
  - SSE over WebSocket (one-way, browser EventSource semantics, fewer
    moving parts behind Railway proxy)
  - In-memory bus over Redis for v1 pilot (3 MSPs, single replica)
  - Drop-on-full subscriber queue rather than back-pressure publishers
  - Bus publish ahead of email send, both wrapped in try/except so
    neither can break handoff creation
  - Frontend will be a fetch-based ReadableStream reader matching the
    existing streamDocumentation pattern, not native EventSource
    (custom-header auth)
Remaining (post-Codex):
  - Frontend SSE subscription in EscalationQueue.tsx (slide-in,
    reconnect, tab-title flash, prefers-reduced-motion)
  - Magic-moment handoff-context screen
  - Re-run the full backend test suite to verify the SSE +
    dispatcher integration tests (bus units already green standalone)
Tried:
  - Running the full test suite repeatedly without xdist; the per-test
    DROP SCHEMA + recreate fixture made wall-clock prohibitive when
    multiple stale runs collided on the same Postgres test schema.
    Resolution: -n auto next time.
[/gstack-context]

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-27 19:29:07 -04:00
7a5b853b3b feat(api): role-gate handoff claim to engineer-or-admin
POST /ai-sessions/{id}/handoffs/{hid}/claim previously required only an
authenticated user, so a viewer-role account user could claim escalations.
Codex review flagged this as wedge-relevant: the Escalation Mode race-
condition story (two seniors clicking Pick Up simultaneously) depends on
auth gating for audit integrity. Originally captured as a deferred TODO
during /plan-eng-review, then moved in-scope by /codex review.

Swap the dep to require_engineer_or_admin. One-line change. Two new tests:
- viewer_role gets 403 with "Engineer or admin access required"
- engineer/owner role still succeeds and claimed_at + claimed_by populate

Existing handoff create + queue tests unaffected.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-27 15:46:59 -04:00
chihlasm
9d2a8332aa feat: add handoff API endpoints with queue and integration tests
Four endpoints: create handoff (park/escalate), list handoff history,
claim session, and team queue. Two routers: session-scoped router with
prefix /ai-sessions/{session_id} and queue_router with prefix /ai-sessions.
queue_router registered before ai_sessions.router to avoid /{session_id}
path conflict on GET /ai-sessions/queue.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 08:43:32 +00:00