fix(escalations): atomic claim + self-claim rejection + queue exclusion
All checks were successful
Mirror to GitHub / mirror (push) Successful in 5s
CI / frontend (pull_request) Successful in 4m59s
CI / backend (pull_request) Successful in 10m22s
CI / e2e (pull_request) Successful in 10m46s

Codex review pass on the escalation wedge. Reworks claim_session from
read-then-write to a conditional UPDATE so two seniors racing can't both
win, blocks the original engineer from claiming their own handoff, and
filters self-escalated sessions out of the dashboard escalation queue.
Also preassigns the handoff UUID before flush so the compatibility
escalation_package payload carries it. Removes legacy frontend pickup
state (claiming, handleStartHere) that broke tsc --noEmit.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
2026-04-30 16:21:20 -04:00
parent ab5e0deaf7
commit f10649abc2
10 changed files with 248 additions and 134 deletions

View File

@@ -2,34 +2,33 @@
# HANDOFF.md
**Last updated:** 2026-04-30 (session 3 — QA pass)
**Last updated:** 2026-04-30 (Codex review-fix pass)
**Active task:** **Escalation Mode** wedge — BROWSER QA COMPLETE. Branch: `feat/escalation-metric-endpoint`. PR #155 ready to mark ready-for-review.
**Active task:** **Escalation Mode** wedge — BROWSER QA COMPLETE + review fixes applied. Branch: `feat/escalation-metric-endpoint`. PR #155 ready to mark ready-for-review after committing this fix pass.
## Where the previous session ended
## Where this session ended
Browser QA pass completed. One critical bug found and fixed during QA.
Code-review fixes were applied after browser QA:
**Bug found + fixed (commit dc69c9d):**
- `POST /ai-sessions/{id}/chat → 400` when senior clicks "Get AI analysis" — `send_chat_message` checked `session.user_id == user_id` but the senior is `escalated_to_id`, not `user_id`. Fixed by adding `OR escalated_to_id == user_id` in the WHERE clause.
- `claim_session` now uses atomic conditional `UPDATE ... WHERE claimed_by IS NULL` instead of read-then-write, so simultaneous senior pickup cannot silently overwrite `claimed_by`.
- Original escalators cannot claim their own handoff. The escalation queue also excludes the current user's own escalated sessions, preventing the post-escalation dashboard from showing the junior their own handoff.
- `session.escalation_package["handoff_id"]` is now populated from a preassigned UUID instead of `None` before flush.
- Frontend build blockers removed: deleted unused legacy `claiming` / `handleStartHere` path in `AssistantChatPage` and unused `onStartHere` destructuring in `HandoffContextScreen`.
**All QA checks passed (17/17 backend tests pass):**
**Validation:**
- Post-escalation redirect: junior gets "Session escalated. Heading back to your dashboard." toast + navigates to `/`
- Magic-moment screen: header, metadata, two-column AI assessment, 2-option CTA (no task lane) all render correctly
- "I'll take it from here": claim → dismiss → chat surface → composer focused
- "Get AI analysis": claim → briefing sent → AI responds → task lane populates ✅ (fixed)
- Task lane copy button: toast + checkmark visual feedback ✅
- Chip expansion: inline detail card + "Open in Tasks panel" scroll ✅
- Post-claim overlay: "Context" toolbar button → dismissible mode → only Close button ✅
- `git diff --check`
- `cd backend && pytest --override-ini='addopts=' tests/test_handoff_manager.py tests/test_session_handoffs_api.py tests/test_escalation_bus.py``28 passed in 42.23s`
- `cd frontend && /config/.bun/bin/bunx tsc -p tsconfig.app.json --noEmit --pretty false && /config/.bun/bin/bunx tsc -p tsconfig.node.json --noEmit --pretty false`
- Full frontend build could not complete because generated dirs are root-owned in this workspace: `frontend/node_modules/.tmp`, `frontend/node_modules/.vite-temp`, and likely `frontend/dist` produce EACCES. Type errors from review are fixed.
**Not testable in dev (known limitations):**
- "Continue where X left off": requires senior to have existing task lane for session (won't occur on first pickup)
- 409 race condition: requires two distinct senior accounts; backend logic reviewed and correct
- Browser-level 409 race toast still requires two distinct senior accounts. Backend claim write is now atomic and covered by service/API tests for conflict, self-claim, and idempotent same-user retry.
## Resume point — DO THIS NEXT
**Ship:** Mark PR #155 ready-for-review and demo to stakeholder. No engineering work remaining.
**Ship:** Commit this review-fix pass, then mark PR #155 ready-for-review and demo to stakeholder.
Optional before shipping:
- Record Loom demo walking through the escalation flow end-to-end
@@ -37,6 +36,8 @@ Optional before shipping:
## Key files changed this session
- `backend/app/services/handoff_manager.py``_generate_handoff_summary` replaces old assessment pair; `enrich_escalation_async` unified; `claim_session` eager-loads `handed_off_by_user`
- `backend/app/api/endpoints/ai_sessions.py` — escalation queue excludes the current user's own escalations
- `backend/app/api/endpoints/session_handoffs.py` — self-claim returns 403
- `backend/app/services/flowpilot_engine.py``generate_status_update` early-returns saved prose for `context='escalation'`
- `backend/app/schemas/session_handoff.py``handed_off_by_name: str | None = None` added
- `backend/app/api/endpoints/session_handoffs.py` — both create + claim endpoints pass `handed_off_by_name`
@@ -44,11 +45,12 @@ Optional before shipping:
- `frontend/src/components/flowpilot/HandoffContextScreen.tsx` — 3-option CTA; `hasTaskLane`, `activeOptionKey`, `onContinue/onAIAnalysis/onOwnThing` props
- `frontend/src/components/assistant/TaskLane.tsx``id="task-lane-card-{idx}"` on all card variants
- `frontend/src/pages/AssistantChatPage.tsx``handleContinue`, `handleAIAnalysis`, `handleOwnThing` handlers; chip → card navigation; `activeOptionKey` state
- `backend/tests/test_handoff_manager.py`, `backend/tests/test_session_handoffs_api.py` — regression coverage for atomic/idempotent claim, self-claim rejection, queue self-exclusion, and pre-flush handoff ID
## Watch-outs
- Dev stack: backend `:8000`, frontend `:5173`, postgres `:5433` (docker-compose). HMR works.
- Test users (Acme MSP, password `TestPass123!`): `engineer@resolutionflow.example.com` (junior), `teamadmin@resolutionflow.example.com` (senior).
- `handleAIAnalysis` pre-adds `urlSessionId` to `loadedChatIdsRef` before dismissing so the normal selectChat effect doesn't double-fire. It then calls `selectChat` manually before sending the briefing.
- `claiming` state is now only used by the legacy `handleStartHere` (which is no longer wired to any UI). `activeOptionKey !== null` is the new `isProcessing` signal.
- Legacy `claiming` / `handleStartHere` on `AssistantChatPage` was removed; `activeOptionKey !== null` is the active pre-claim processing signal.
- The bus is acceptable for v1 pilot scale only (Railway single-replica). Redis pub/sub is the swap when horizontal scaling appears.