WIP: SSE pub/sub for live escalation arrivals (paused for Codex review)
First half of the WebSocket/SSE push slice. Paused mid-flight to hand
the branch to Codex for outside-voice review before stacking more
commits on top. See .ai/HANDOFF.md for the full pause context + what
to look at.
What's here:
- backend/app/core/escalation_bus.py — module-level singleton in-memory
pub/sub keyed by account_id. asyncio.Queue per subscriber with
64-event maxsize and drop-on-full semantics. Designed to be swappable
for Redis pub/sub when Railway scales past single-replica.
- backend/app/api/endpoints/session_handoffs.py — GET
/api/v1/ai-sessions/escalations/stream SSE endpoint. Auth via
require_engineer_or_admin. 25s heartbeat. Account-scoped subscribe
bound to current_user.account_id.
- backend/app/services/handoff_manager.py — dispatch_escalation_notifications
now publishes a `handoff_created` event to the bus BEFORE the email
fan-out, in a try/except so a bus failure can't block email delivery.
- backend/tests/test_escalation_bus.py — 7 unit tests, all green
standalone (0.14s). Cross-tenant isolation, drop-on-full, no-subscribers.
- backend/tests/test_handoff_manager.py — +1 dispatcher integration test
(publishes to bus, payload shape).
- backend/tests/test_session_handoffs_api.py — +2 endpoint tests (viewer
blocked, ready event handshake).
[gstack-context]
Decisions:
- SSE over WebSocket (one-way, browser EventSource semantics, fewer
moving parts behind Railway proxy)
- In-memory bus over Redis for v1 pilot (3 MSPs, single replica)
- Drop-on-full subscriber queue rather than back-pressure publishers
- Bus publish ahead of email send, both wrapped in try/except so
neither can break handoff creation
- Frontend will be a fetch-based ReadableStream reader matching the
existing streamDocumentation pattern, not native EventSource
(custom-header auth)
Remaining (post-Codex):
- Frontend SSE subscription in EscalationQueue.tsx (slide-in,
reconnect, tab-title flash, prefers-reduced-motion)
- Magic-moment handoff-context screen
- Re-run the full backend test suite to verify the SSE +
dispatcher integration tests (bus units already green standalone)
Tried:
- Running the full test suite repeatedly without xdist; the per-test
DROP SCHEMA + recreate fixture made wall-clock prohibitive when
multiple stale runs collided on the same Postgres test schema.
Resolution: -n auto next time.
[/gstack-context]
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
@@ -15,6 +15,7 @@ from sqlalchemy.ext.asyncio import AsyncSession
|
||||
|
||||
from app.core.config import settings
|
||||
from app.core.email import EmailService
|
||||
from app.core.escalation_bus import bus as escalation_bus
|
||||
from app.models.ai_session import AISession
|
||||
from app.models.session_branch import SessionBranch
|
||||
from app.models.session_handoff import SessionHandoff
|
||||
@@ -114,6 +115,29 @@ class HandoffManager:
|
||||
if handoff.intent != "escalate":
|
||||
return 0
|
||||
|
||||
# Publish to the in-memory bus first so connected senior-tech inboxes
|
||||
# see the new card slide in within ~1s of escalate. This path is
|
||||
# fire-and-forget (no IO, just memory) so it can sit ahead of the
|
||||
# email fan-out.
|
||||
try:
|
||||
await escalation_bus.publish(
|
||||
handoff.account_id,
|
||||
{
|
||||
"type": "handoff_created",
|
||||
"handoff_id": str(handoff.id),
|
||||
"session_id": str(handoff.session_id),
|
||||
"priority": handoff.priority,
|
||||
"engineer_notes": handoff.engineer_notes or "",
|
||||
"created_at": handoff.created_at.isoformat()
|
||||
if handoff.created_at
|
||||
else None,
|
||||
},
|
||||
)
|
||||
except Exception:
|
||||
logger.exception(
|
||||
"EscalationBus publish failed for handoff %s", handoff.id
|
||||
)
|
||||
|
||||
try:
|
||||
recipients = (
|
||||
await self.db.execute(
|
||||
|
||||
Reference in New Issue
Block a user