resolutionflow/.ai/HANDOFF.md

<!-- Keep under ~2K tokens. Old handoffs live in SESSION_LOG.md. Do not let this file accumulate history. -->

# HANDOFF.md

**Last updated:** 2026-05-30

**Active task:** L1 AI Tree Builder **Phase 2A — COMPLETE**. All 19 plan tasks done on
branch `feat/l1-ai-tree-builder-phase-2a` (branched from `main` @ `87236b5`), pushed to
Gitea, **PR #193 open** (`main` ← `feat/l1-ai-tree-builder-phase-2a`, mergeable):
<https://gitea.resolutionflow.com/chihlasm/resolutionflow/pulls/193>.

## Resume point — review & merge PR #193

Nothing left to build. Next session:
1. Check Gitea CI on PR #193 (`gitea.resolutionflow.com/chihlasm/resolutionflow/actions`
   — `gh` cannot read Gitea CI). If green, review + merge.
2. After merge: `alembic upgrade head` on prod (3 new migrations, head `1fd88a68b145`),
   update CURRENT-STATE.md + roadmap.
3. **Before wide enablement (spec §5.3):** run a live constrained-decoding smoke test for
   `ai_tree_builder.generate_next_node` and benchmark Sonnet vs Opus for the
   `l1_realtime_build` action key. All model calls are mocked in tests — AI *quality* is
   unverified against a live model.

## What shipped (all verified this session)

- **Backend (Tasks 1–12):** 3 migrations (`ai_build` kind; `accounts.enabled_l1_categories`;
  `FlowProposal.l1_session_id` + nullable source + exactly-one CHECK; head `1fd88a68b145`).
  Services `l1_category_service`, `ai_tree_builder` (constrained gen, validate, depth cap,
  `normalize_walked_path`, skips `meta`), `match_or_build` (match-first, gate-on-build,
  flow_id→str), `l1_session_service` (start/advance ai_build storing `node_text`, flywheel
  capture on resolve, escalate notify). `l1.session.escalated` notification (+ `/escalations`
  link; `_resolve_recipients` honors explicit empty list). API: intake dispatch (build seeds
  a hidden `{"node_type":"meta","category":...}` walked_path entry), `/next-node`,
  `/escalations`, `GET|PATCH /accounts/me/l1-categories`, `require_account_owner_or_admin`.
- **Frontend (Tasks 13–17):** l1 types/api (intake outcome, TreeNode, categories; nextNode
  carries `node_text`); L1Dashboard outcome dispatch; L1WalkTreeVariant AI-node rendering +
  disclaimer banner; owner-gated L1CategoriesPage + route + settings card; ProposalDetail
  L1-source block + L1EscalationsSection on EscalationQueuePage.
- **Tests (Task 18 + throughout):** ~114 Phase 2A backend tests, including an integration
  test covering both intake→build→walk→resolve→proposal and
  intake→build→walk→escalate→notify→list; network-stubbed e2e.
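
A minimal sketch of the hidden `meta` entry described under Backend, to make the shape
concrete for review. This is not the shipped `normalize_walked_path`: the helper names, the
`question`/`action` node types, and the `printing` category are assumptions for
illustration. Only the `{"node_type":"meta","category":...}` shape and `node_text` come from
the notes above.

```python
from __future__ import annotations

from typing import Any

Step = dict[str, Any]


def visible_steps(walked_path: list[Step]) -> list[Step]:
    """Drop hidden bookkeeping entries (node_type == "meta") from a walked path.

    Hypothetical stand-in for the 'skips meta' behavior; the real service may do more.
    """
    return [step for step in walked_path if step.get("node_type") != "meta"]


def seeded_category(walked_path: list[Step]) -> str | None:
    """Return the category from the first meta entry, or None if none was seeded."""
    for step in walked_path:
        if step.get("node_type") == "meta":
            return step.get("category")
    return None


# Illustrative data only: node types other than "meta" are assumed.
walked_path: list[Step] = [
    {"node_type": "meta", "category": "printing"},  # seeded at intake, never shown
    {"node_type": "question", "node_text": "Is the printer powered on?"},
    {"node_type": "action", "node_text": "Restart the print spooler."},
]
```

Keeping the category inside `walked_path` means it travels with the session, at the cost
that every consumer of the path has to filter `meta` entries out before rendering or
counting depth.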

**Verification (Task 19):** the 11 Phase 2A backend test files run together = **124
passed / 0 errors**; frontend `tsc -b` + `npm run lint` + `npm run build` clean;
migration `downgrade -3` → `upgrade head` roundtrips cleanly.
**⚠️ Do NOT trust a local serial `pytest tests/`:** a complete serial run reports
`723 passed / 507 errors`. Of those errors, 502 are `asyncpg ... another operation is in
progress`, raised in subsystems this branch does not touch. They are an artifact of running
serially against a single DB with a shared event loop, not a regression: the erroring files
pass in isolation (test_branch_manager + test_feedback + test_fix_outcome_endpoint = 74/74).
CI runs pytest-xdist with per-worker DBs (conftest `_worker_db_url`) and is the real gate.
(Earlier handoff revisions wrongly claimed "1376 passed / 0 failed"; that number was never
from a complete run and is corrected here.)
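
For context on why CI behaves differently, here is a rough sketch of how a per-worker
database URL can be derived under pytest-xdist. It is not the repo's `_worker_db_url`: the
function name, the `_gwN` suffix scheme, and the example URL are assumptions. The one
confirmed detail is that pytest-xdist sets the `PYTEST_XDIST_WORKER` environment variable
(`gw0`, `gw1`, ...) inside each worker process.

```python
from __future__ import annotations

import os
from urllib.parse import urlsplit, urlunsplit


def worker_db_url(base_url: str, worker_id: str | None = None) -> str:
    """Return a database URL unique to the current xdist worker.

    Hypothetical helper: appends "_<worker_id>" to the database name so each
    worker gets its own database. Outside xdist the base URL is returned as-is.
    """
    if worker_id is None:
        # Unset when pytest runs without xdist; "master" mirrors xdist's own label.
        worker_id = os.environ.get("PYTEST_XDIST_WORKER", "master")
    if worker_id == "master":
        return base_url
    parts = urlsplit(base_url)
    return urlunsplit(parts._replace(path=f"{parts.path}_{worker_id}"))


# Example with an assumed URL:
#   worker_db_url("postgresql+asyncpg://u:p@localhost:5432/app_test", "gw1")
# yields the same URL with the database name "app_test_gw1".
```

With one database per worker, no two workers contend for the same database, which is
consistent with CI not hitting the `another operation is in progress` errors seen in the
local serial run.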

## Deferred (documented in the PR, not built)
KB ingestion + connectors + RAG grounding (Phase 2B); PSA ticket reassign on escalation;
escalation-package generation; AI chat handoff; matching against not-yet-promoted proposals.

## ⚠️ Session tooling note (in case it recurs)
The Bash output channel was intermittently unreliable this session (stale/cached output;
once fabricated a passing result; `Write` once reported success without persisting). What
worked:

- Single-value Bash commands (`grep -c`, `wc -l`, `git rev-parse --short`) are reliable.
- Redirect multi-line work to a temp file and `Read` it.
- NEVER batch a commit with its own verification: verify in a separate step and read a
  unique sentinel before committing.
- After any Write/Edit that matters, re-`grep` the file to confirm it persisted.

Other gotchas:

- Backend tests: always pass `--override-ini="addopts="` (NOT `-p no:cov`, which conflicts
  with the `--cov` in addopts and makes pytest exit before running).
- Frontend: `*-dim` color tokens aren't `--color-*-dim`; use `/10` opacity modifiers.

## Carry-forward (Phase O — separate, user-side, gated on EIN)
Phase O self-serve cutover (Stripe live-mode, apex DNS, Railway prod env, flag flip) remains
the prior active task; all code blockers closed, blocked on user's EIN. Not touched this session.