fix(escalations): live-test fixes from QA bash

Bundles four fixes from the live debugging session:

1. AssistantChatPage: replace urlSessionId === activeChatId gate with a
   loadedChatIdsRef. After 8914391 made activeChatId initialize from
   urlSessionId, the gate short-circuited fresh mounts and selectChat
   never fired. Symptom: senior picks up an escalation, lands on a blank
   chat surface with no conversation history and no sidebar entry. Fix
   also adds loadChats() in handleStartHere so the picked-up session
   appears in the sidebar (its escalated_to_id is null pre-claim, so
   listSessions doesn't return it until claim_session sets it).

2. config: bump ESCALATION_AI_ASSESSMENT_TIMEOUT_SECONDS 15s → 45s.
   Sonnet was hitting tail latency at 15s in the field, leaving the
   magic-moment placeholder permanent. Background-task architecture
   (e8ba74e) means this no longer blocks the user; it's just the budget
   before publishing has_assessment=false.
   NOTE: live test still shows assessment not populating; see HANDOFF
   for the consolidation plan that supersedes this.

3. Enter-to-submit: chat-input convention (Enter submits, Shift+Enter
   inserts newline) on the escalate-flow forms. RichTextInput gains an
   optional onSubmit prop; EscalateModal wires it to handleSubmit;
   ConcludeSessionModal gets the same handler on its plain textarea.

4. PendingEscalations: each row is now expandable. Click row body to
   reveal the engineer's escalation reason, step count on record,
   confidence tier, and PSA ticket number. Pick Up still clicks through
   directly. Single-expand-at-a-time keeps the dashboard compact.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-29 00:18:40 -04:00
parent b7d7ff06d2
commit 0d1b305619
6 changed files with 162 additions and 38 deletions
--- a/backend/app/core/config.py
+++ b/backend/app/core/config.py
@@ -111,14 +111,16 @@ class Settings(BaseSettings):
    GOOGLE_AI_API_KEY: Optional[str] = None
    AI_MODEL_GEMINI: str = "gemini-2.5-flash"
    AI_MODEL_ANTHROPIC: str = "claude-sonnet-4-6"
-    # 15s is generous for the click-path; Claude usually returns a 500-token
-    # diagnostic in 4-8s but tail latency on the assessment prompt has hit
-    # 12-14s in the field. Going below this leaves too many escalations with
-    # the "Assessment unavailable — model didn't respond in time" placeholder
-    # the senior sees on the magic-moment screen. Real fix is async generation
-    # (kick off, persist when done, surface "still computing" with refresh) —
-    # that's a follow-up; bumping the bound keeps the wedge demo coherent.
-    ESCALATION_AI_ASSESSMENT_TIMEOUT_SECONDS: int = 15
+    # Bound for the diagnostic assessment Sonnet call. Generation runs in a
+    # FastAPI BackgroundTask (commit e8ba74e), so this no longer blocks the
+    # senior's click; it only bounds how long we wait before publishing
+    # `handoff_assessment_ready` with has_assessment=false. 15s was hitting
+    # tail latency on Sonnet (timeout 03:57:35 in field testing 2026-04-29),
+    # leaving the magic-moment placeholder permanent. 45s is the right
+    # ceiling: well above Sonnet p99 for a 500-token output, far enough
+    # below "the senior gives up watching" that we still surface SOMETHING
+    # on persistent slowness.
+    ESCALATION_AI_ASSESSMENT_TIMEOUT_SECONDS: int = 45

    # Model tier routing — maps action types to model tiers
    AI_MODEL_TIERS: dict[str, str] = {