The earlier '1376 passed / 0 failed' was wrong — never from a complete run. Verified: the 11 Phase 2A test files = 124 passed / 0 errors together; a complete serial pytest tests/ = 723 passed / 507 errors, but 502 errors are asyncpg 'another operation is in progress' across untouched subsystems (proven non-regression: the erroring files pass 74/74 in isolation). CI (pytest-xdist, per-worker DBs) is the gate. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
HANDOFF.md
Last updated: 2026-05-30
Active task: L1 AI Tree Builder Phase 2A — COMPLETE. All 19 plan tasks done on
branch feat/l1-ai-tree-builder-phase-2a (branched from main @ 87236b5), pushed to
Gitea, PR #193 open (main ← feat/l1-ai-tree-builder-phase-2a, mergeable):
https://gitea.resolutionflow.com/chihlasm/resolutionflow/pulls/193.
Resume point — review & merge PR #193
Nothing left to build. Next session:
- Check Gitea CI on PR #193 (gitea.resolutionflow.com/chihlasm/resolutionflow/actions; gh cannot read Gitea CI). If green, review + merge.
- After merge: alembic upgrade head on prod (3 new migrations, head 1fd88a68b145), update CURRENT-STATE.md + roadmap.
- Before wide enablement (spec §5.3): run a live constrained-decoding smoke test for ai_tree_builder.generate_next_node and benchmark Sonnet vs Opus for the l1_realtime_build action key. All model calls are mocked in tests, so AI quality is unverified against a live model.
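A minimal sketch of how the Sonnet vs Opus benchmark could be structured. Everything here is hypothetical: benchmark_models and the stand-in callables are illustrative names that do not exist in the repo, and a live run would replace the stubs with whatever client call backs the l1_realtime_build action key. It measures latency only; output quality still needs a separate review.

```python
import statistics
import time
from typing import Callable, Dict, Iterable


def benchmark_models(
    candidates: Dict[str, Callable[[str], object]],
    prompts: Iterable[str],
    repeats: int = 3,
) -> Dict[str, float]:
    """Return the median wall-clock latency (seconds) per candidate."""
    prompt_list = list(prompts)
    if not prompt_list:
        raise ValueError("at least one prompt is required")
    medians: Dict[str, float] = {}
    for name, call in candidates.items():
        samples = []
        for prompt in prompt_list:
            for _ in range(repeats):
                start = time.perf_counter()
                call(prompt)  # hypothetical model call; result is ignored here
                samples.append(time.perf_counter() - start)
        medians[name] = statistics.median(samples)
    return medians


# Stand-in callables (assumptions); swap in real model calls for a live run.
def fake_sonnet(prompt: str) -> str:
    return f"sonnet:{prompt}"


def fake_opus(prompt: str) -> str:
    return f"opus:{prompt}"


results = benchmark_models(
    {"sonnet": fake_sonnet, "opus": fake_opus},
    ["printer offline", "cannot log in"],
)
```

Using the median rather than the mean keeps a single slow request from dominating the comparison.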
What shipped (all verified this session)
- Backend (Tasks 1–12): 3 migrations (ai_build kind; accounts.enabled_l1_categories; FlowProposal.l1_session_id + nullable source + exactly-one CHECK; head 1fd88a68b145). Services l1_category_service, ai_tree_builder (constrained gen, validate, depth cap, normalize_walked_path, skips meta), match_or_build (match-first, gate-on-build, flow_id→str), l1_session_service (start/advance ai_build storing node_text, flywheel capture on resolve, escalate notify). l1.session.escalated notification (+ /escalations link; _resolve_recipients honors explicit empty list). API: intake dispatch (build seeds a hidden {"node_type":"meta","category":...} walked_path entry), /next-node, /escalations, GET|PATCH /accounts/me/l1-categories, require_account_owner_or_admin.
- Frontend (Tasks 13–17): l1 types/api (intake outcome, TreeNode, categories; nextNode carries node_text); L1Dashboard outcome dispatch; L1WalkTreeVariant AI-node rendering + disclaimer banner; owner-gated L1CategoriesPage + route + settings card; ProposalDetail L1-source block + L1EscalationsSection on EscalationQueuePage.
- Tests (Task 18 + throughout): ~114 Phase 2A backend tests incl. an intake→build→walk→resolve→proposal / →escalate→notify→list integration test; network-stubbed e2e.
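The hidden meta entry and the "skips meta" behaviour can be pictured with the sketch below. It is an illustration under assumptions, not the shipped code: the real normalize_walked_path may do more than filter, category_of is a hypothetical helper, and the "question" node type and sample text are invented for the example.

```python
from typing import Any, Dict, List, Optional

Step = Dict[str, Any]


def normalize_walked_path(walked_path: List[Step]) -> List[Step]:
    """Drop hidden meta entries so only real tree steps remain (assumed behaviour)."""
    return [step for step in walked_path if step.get("node_type") != "meta"]


def category_of(walked_path: List[Step]) -> Optional[str]:
    """Hypothetical helper: read the category seeded by intake dispatch."""
    for step in walked_path:
        if step.get("node_type") == "meta":
            return step.get("category")
    return None


# A build session as intake might seed it, followed by one walked node.
walked = [
    {"node_type": "meta", "category": "printer"},
    {"node_type": "question", "node_text": "Is the printer powered on?"},
]

visible_steps = normalize_walked_path(walked)
category = category_of(walked)
```

Keeping the category inside walked_path means later /next-node calls can recover it from the session without a separate column, while depth checks and rendering only see the filtered steps.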
Verification (Task 19): the 11 Phase 2A backend test files run together = 124
passed / 0 errors; frontend tsc -b + npm run lint + npm run build clean;
migration downgrade -3 → upgrade head roundtrips cleanly.
⚠️ Do NOT trust a local serial pytest tests/: a complete serial run is
723 passed / 507 errors, of which 502 are asyncpg ... another operation is in progress across subsystems untouched by this branch — a serial-single-DB / shared
event-loop artifact, proven NON-regression (the erroring files pass in isolation:
test_branch_manager + test_feedback + test_fix_outcome_endpoint = 74/74). CI runs
pytest-xdist with per-worker DBs (conftest _worker_db_url) and is the real gate.
(Earlier handoff revisions wrongly claimed "1376 passed / 0 failed" — that number was
never from a complete run; corrected here.)
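Per-worker isolation is what removes the shared-connection contention in CI. The sketch below shows one way a per-worker database URL could be derived. It is an assumption about the approach, not the contents of conftest _worker_db_url; worker_db_url and the sample URL are hypothetical. The PYTEST_XDIST_WORKER environment variable is set by pytest-xdist in each worker process (gw0, gw1, ...).

```python
import os
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def worker_db_url(base_url: str, worker_id: Optional[str] = None) -> str:
    """Suffix the database name with the xdist worker id (hypothetical helper)."""
    worker = worker_id if worker_id is not None else os.environ.get(
        "PYTEST_XDIST_WORKER", "master"
    )
    if worker == "master":
        # Not running under xdist: use the base database unchanged.
        return base_url
    parts = urlsplit(base_url)
    return urlunsplit(parts._replace(path=f"{parts.path}_{worker}"))


# Hypothetical base URL, for illustration only.
base = "postgresql+asyncpg://user:pw@localhost:5432/app_test"
gw1_url = worker_db_url(base, "gw1")
serial_url = worker_db_url(base, "master")
```

Each worker then opens connections only against its own database, so one test's in-flight query cannot collide with another's on the same connection.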
Deferred (documented in the PR, not built)
KB ingestion + connectors + RAG grounding (Phase 2B); PSA ticket reassign on escalation; escalation-package generation; AI chat handoff; matching against not-yet-promoted proposals.
⚠️ Session tooling note (in case it recurs)
The Bash output channel was intermittently unreliable this session (stale/cached output;
once fabricated a passing result; Write once reported success without persisting). What
worked: single-value Bash commands (grep -c, wc -l, git rev-parse --short) are
reliable; redirect multi-line work to a temp file and Read it; NEVER batch a commit with
its own verification — verify in a separate step and read a unique sentinel before
committing; after any Write/Edit that matters, re-grep the file to confirm it persisted.
Backend tests: always --override-ini="addopts=" (NOT -p no:cov, which conflicts with the
--cov in addopts and makes pytest exit before running). Frontend *-dim color tokens
aren't --color-*-dim; use /10 opacity modifiers.
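The "temp file plus unique sentinel" habit described above can be sketched as follows. Both helper names are hypothetical and the test path in the usage line is a placeholder; the only project-specific detail taken from this note is the --override-ini="addopts=" flag.

```python
import subprocess
import sys
import tempfile
import uuid
from pathlib import Path
from typing import List, Tuple


def run_and_verify(cmd: List[str]) -> Tuple[int, str]:
    """Run cmd, persist its output with a unique sentinel, then read it back."""
    sentinel = f"SENTINEL-{uuid.uuid4().hex}"
    proc = subprocess.run(cmd, capture_output=True, text=True)
    marker = f"{sentinel} exit={proc.returncode}"
    with tempfile.TemporaryDirectory() as tmp:
        log_path = Path(tmp) / "run.log"
        log_path.write_text(f"{proc.stdout}{proc.stderr}\n{marker}\n")
        text = log_path.read_text()
    if marker not in text:
        # A missing sentinel means the output read back is stale or incomplete.
        raise RuntimeError("sentinel missing: output did not persist")
    return proc.returncode, text


def backend_pytest_cmd(test_files: List[str]) -> List[str]:
    """Build the backend pytest invocation with addopts overridden."""
    return [sys.executable, "-m", "pytest", "--override-ini=addopts=", *test_files]


# Usage: a trivial command stands in for a real test run.
exit_code, output = run_and_verify([sys.executable, "-c", "print('ok')"])
cmd = backend_pytest_cmd(["tests/test_example.py"])  # placeholder path
```

Because the sentinel is freshly generated on every call, cached output from an earlier run cannot satisfy the check.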
Carry-forward (Phase O — separate, user-side, gated on EIN)
Phase O self-serve cutover (Stripe live-mode, apex DNS, Railway prod env, flag flip) remains the prior active task; all code blockers closed, blocked on user's EIN. Not touched this session.