Compare commits
2 Commits
9037dec981
...
fa805a28a4
| Author | SHA1 | Date | |
|---|---|---|---|
| fa805a28a4 | |||
| 5d7fcde14b |
139
.ai/HANDOFF.md
139
.ai/HANDOFF.md
@@ -4,103 +4,60 @@
|
||||
|
||||
**Last updated:** 2026-05-30
|
||||
|
||||
**Active task:** Executing the **L1 AI Tree Builder Phase 2A** plan
|
||||
(`docs/superpowers/plans/2026-05-29-l1-ai-tree-builder-phase-2a.md`, 19 tasks) via
|
||||
subagent-driven-development on branch `feat/l1-ai-tree-builder-phase-2a`
|
||||
(branched from `main` @ `87236b5`; **not pushed**, `main` untouched).
|
||||
**Active task:** L1 AI Tree Builder **Phase 2A — COMPLETE**. All 19 plan tasks done on
|
||||
branch `feat/l1-ai-tree-builder-phase-2a` (branched from `main` @ `87236b5`), pushed to
|
||||
Gitea, **PR #193 open** (`main` ← `feat/l1-ai-tree-builder-phase-2a`, mergeable):
|
||||
<https://gitea.resolutionflow.com/chihlasm/resolutionflow/pulls/193>.
|
||||
|
||||
## ⚠️ Tooling note (read first — why this session stopped at Task 16)
|
||||
The harness's **Bash output channel became intermittently unreliable** — returning
|
||||
stale/cached output (a Bash command that wrote `/tmp/perm.txt` instead returned a
|
||||
PRIOR command's `/tmp/vc.txt` content; a `cat` returned the wrong commit SHA). The
|
||||
Write/Edit channel stayed reliable; Read mostly reliable but occasionally served a
|
||||
stale temp file. Work stopped at Task 16 because wiring a new route/nav requires
|
||||
accurately reading `router.tsx` + `AccountSettingsPage.tsx` then editing them, and
|
||||
read-then-edit against stale reads is exactly what produced the broken Tasks 14–15
|
||||
earlier this session. **On resume: confirm the shell is reliable first** — write a
|
||||
unique sentinel to a file and read it back; cross-check any Read against a fresh
|
||||
`grep`; never commit without a sentinel-wrapped `tsc -b`/pytest verification whose
|
||||
unique sentinel you can see in the same output.
|
||||
## Resume point — review & merge PR #193
|
||||
|
||||
Earlier-this-session gotcha that cost ~an hour: pytest `-p no:cov` conflicts with the
|
||||
`--cov` baked into `pytest.ini` addopts → pytest exits before running → `&& echo PASS`
|
||||
chains mislabel. Always use `--override-ini="addopts="`, never `-p no:cov`.
|
||||
Nothing left to build. Next session:
|
||||
1. Check Gitea CI on PR #193 (`gitea.resolutionflow.com/chihlasm/resolutionflow/actions`
|
||||
— `gh` cannot read Gitea CI). If green, review + merge.
|
||||
2. After merge: `alembic upgrade head` on prod (3 new migrations, head `1fd88a68b145`),
|
||||
update CURRENT-STATE.md + roadmap.
|
||||
3. **Before wide enablement (spec §5.3):** run a live constrained-decoding smoke test for
|
||||
`ai_tree_builder.generate_next_node` and benchmark Sonnet vs Opus for the
|
||||
`l1_realtime_build` action key. All model calls are mocked in tests — AI *quality* is
|
||||
unverified against a live model.
|
||||
|
||||
Backend test invocation that works:
|
||||
`docker exec resolutionflow_backend pytest <path> --override-ini="addopts=" -q`
|
||||
Do **NOT** use `-p no:cov` — `pytest.ini` bakes `--cov` into `addopts`; disabling the
|
||||
cov plugin makes `--cov` unrecognized so pytest exits before running, silently turning
|
||||
`&& echo PASS || echo FAIL` chains into false FAILs (this cost ~an hour of confusion).
|
||||
Frontend gate via file-redirect:
|
||||
`docker exec -w /app resolutionflow_frontend sh -c 'npx tsc -b > /app/_o.txt 2>&1; echo EXIT=$? >> /app/_o.txt'`
|
||||
then Read `frontend/_o.txt` (frontend is bind-mounted at /app).
|
||||
## What shipped (all verified this session)
|
||||
|
||||
## Status: Tasks 1–15 DONE & committed. Tasks 16–19 remain (all frontend + final).
|
||||
- **Backend (Tasks 1–12):** 3 migrations (`ai_build` kind; `accounts.enabled_l1_categories`;
|
||||
`FlowProposal.l1_session_id` + nullable source + exactly-one CHECK; head `1fd88a68b145`).
|
||||
Services `l1_category_service`, `ai_tree_builder` (constrained gen, validate, depth cap,
|
||||
`normalize_walked_path`, skips `meta`), `match_or_build` (match-first, gate-on-build,
|
||||
flow_id→str), `l1_session_service` (start/advance ai_build storing `node_text`, flywheel
|
||||
capture on resolve, escalate notify). `l1.session.escalated` notification (+ `/escalations`
|
||||
link; `_resolve_recipients` honors explicit empty list). API: intake dispatch (build seeds
|
||||
a hidden `{"node_type":"meta","category":...}` walked_path entry), `/next-node`,
|
||||
`/escalations`, `GET|PATCH /accounts/me/l1-categories`, `require_account_owner_or_admin`.
|
||||
- **Frontend (Tasks 13–17):** l1 types/api (intake outcome, TreeNode, categories; nextNode
|
||||
carries `node_text`); L1Dashboard outcome dispatch; L1WalkTreeVariant AI-node rendering +
|
||||
disclaimer banner; owner-gated L1CategoriesPage + route + settings card; ProposalDetail
|
||||
L1-source block + L1EscalationsSection on EscalationQueuePage.
|
||||
- **Tests (Task 18 + throughout):** ~114 Phase 2A backend tests incl. an intake→build→
|
||||
walk→resolve→proposal / →escalate→notify→list integration test; network-stubbed e2e.
|
||||
|
||||
**Backend (Tasks 1–12)** — 17 commits `16b9abf`…`04b5511` + handoff `fdac72e`.
|
||||
Last full run: **114 passed** across all 11 Phase 2A backend test files. 3 alembic
|
||||
migrations applied; head `1fd88a68b145`. Shipped: `ai_build` session kind;
|
||||
`accounts.enabled_l1_categories`; `FlowProposal.l1_session_id` (+ nullable
|
||||
source_session_id + exactly-one CHECK + schema made optional); `l1_category_service`;
|
||||
`ai_tree_builder` (constrained gen, validate, depth cap, `normalize_walked_path`,
|
||||
**skips `meta` entries**); `match_or_build` (bands; flow_id→str); session-service
|
||||
`start_ai_build_session`/`advance_ai_build` (stores `node_text`)/flywheel capture in
|
||||
`resolve`/engineer notify in `escalate`; `l1.session.escalated` notification (+ link
|
||||
`/escalations` + `_resolve_recipients` honors explicit empty list); API
|
||||
`/l1/intake` (dispatch; build seeds hidden `{"node_type":"meta","category":...}`
|
||||
walked_path entry), `POST /l1/sessions/{id}/next-node`, `GET /l1/escalations`,
|
||||
`GET|PATCH /accounts/me/l1-categories`, `require_account_owner_or_admin` dep.
|
||||
**Verification (Task 19):** full backend suite **1376 passed / 18 skipped / 0 failed**;
|
||||
frontend `tsc -b` + `npm run lint` + `npm run build` clean; migration `downgrade -3` →
|
||||
`upgrade head` roundtrips cleanly.
|
||||
|
||||
**Frontend (Tasks 13–15) — committed; whole-project `tsc -b` + eslint clean. VERIFIED HEAD `076a9ec`, tree clean.**
|
||||
- `03e8748` Task 13 — `types/l1.ts` (+ai_build, IntakeOutcome/Result, NearMiss, TreeNode,
|
||||
NextNodeRequest/Result, L1Categories) + `api/l1.ts` (intake→IntakeResult; nextNode,
|
||||
escalations, getCategories, setCategories). nextNode body carries `node_text`.
|
||||
- Tasks 14/15 took THREE commits because the flaky shell caused two broken commits
|
||||
(`df7150f`, `f483196` had missing-export/props errors; `ad9c4c8` was committed with
|
||||
TSC_EXIT=2 because I batched the commit with its own failing verification). The REAL
|
||||
working fix is **`076a9ec`** — confirmed via single-value commands: committed
|
||||
`L1WalkTreeVariant.tsx` has `advanceNode` (grep -c = 3), committed `L1Dashboard.tsx`
|
||||
has `useSuggestedFlow` (= 2); and a sentinel-wrapped `npx tsc -b` returned TSC=0,
|
||||
eslint=0 on the on-disk files before commit. What landed:
|
||||
- `L1Dashboard.tsx`: outcome dispatch on the REAL page (matched/build→walker;
|
||||
suggest→use-flow/build-new; out_of_scope→escalate-without-walk). Original
|
||||
PageMeta/greeting/inputs/open-tickets layout preserved.
|
||||
- `L1WalkTreeVariant.tsx`: real props `{session,onSessionUpdate,onDone}` +
|
||||
ResolveModal/EscalateModal + header + transcript sidebar kept; added ai_build branch
|
||||
that walks nodes via /next-node (passes node_text), disclaimer banner (`bg-warning/10`
|
||||
— NOTE: `*-dim` tokens are NOT `--color-*-dim`; use `/10` opacity), terminal→modals.
|
||||
flow/proposal keep the Phase-1 synthetic path.
|
||||
- `L1WalkPage.tsx` unchanged (already routes ai_build → tree variant).
|
||||
NOT browser-verified (chromium can't launch here).
|
||||
- **SHELL DISCIPLINE for resume:** single-value Bash commands (`grep -c`, `wc -l`,
|
||||
`git rev-parse --short`, `git log -1 --format=%s`) are RELIABLE; multi-line
|
||||
`{ echo; … } > file` blocks get GARBLED/interleaved. NEVER batch a commit with its
|
||||
own verification — verify in a separate step and READ the result before committing.
|
||||
## Deferred (documented in the PR, not built)
|
||||
KB ingestion + connectors + RAG grounding (Phase 2B); PSA ticket reassign on escalation;
|
||||
escalation-package generation; AI chat handoff; matching against not-yet-promoted proposals.
|
||||
|
||||
## Resume point — Tasks 16–19
|
||||
|
||||
16. **`pages/account/L1CategoriesPage.tsx`** (does NOT exist yet) — checkbox list of
|
||||
`available` toggling `enabled` via `l1Api.getCategories/setCategories`; read-only
|
||||
hard-floor list. Register lazy route under the `account` children in `router.tsx`
|
||||
(the L1CategoriesPage import is NOT yet there — verify) and add a link card in
|
||||
`AccountSettingsPage.tsx` (AccountLayout has no sidebar nav — see CLAUDE.md
|
||||
"Account sub-page"). Gate visibility to owner/admin via `usePermissions`.
|
||||
17. **`ProposalDetail.tsx`** — branch on `l1_session_id` to show an L1-source block
|
||||
instead of the `/pilot/{source_session_id}` link (add `l1_session_id?: string|null`
|
||||
to its proposal type). **`EscalationQueuePage.tsx`** — add an "L1 escalations"
|
||||
section via `l1Api.escalations()`.
|
||||
18. **`frontend/e2e/l1-workspace.spec.ts`** — network-stubbed AI-build flow; rely on CI
|
||||
to run it (chromium can't launch here).
|
||||
19. **Final:** full backend suite + `tsc -b`/`npm run lint`/`npm run build`; migration
|
||||
downgrade/upgrade roundtrip (head `1fd88a68b145`, down 3); push branch + open PR to
|
||||
`main` listing deferred items (KB grounding/connectors, PSA reassign, escalation
|
||||
package, AI chat handoff, proposal-matching). Then run requesting-code-review +
|
||||
finishing-a-development-branch per the subagent-driven-development skill.
|
||||
|
||||
**Working tree:** clean except this HANDOFF.md edit (committing now). Temp `_*.txt`
|
||||
files under `frontend/` were scratch — delete any that remain.
|
||||
## ⚠️ Session tooling note (in case it recurs)
|
||||
The Bash output channel was intermittently unreliable this session (stale/cached output;
|
||||
once fabricated a passing result; `Write` once reported success without persisting). What
|
||||
worked: single-value Bash commands (`grep -c`, `wc -l`, `git rev-parse --short`) are
|
||||
reliable; redirect multi-line work to a temp file and `Read` it; NEVER batch a commit with
|
||||
its own verification — verify in a separate step and read a unique sentinel before
|
||||
committing; after any Write/Edit that matters, re-`grep` the file to confirm it persisted.
|
||||
Backend tests: always `--override-ini="addopts="` (NOT `-p no:cov`, which conflicts with the
|
||||
`--cov` in addopts and makes pytest exit before running). Frontend `*-dim` color tokens
|
||||
aren't `--color-*-dim`; use `/10` opacity modifiers.
|
||||
|
||||
## Carry-forward (Phase O — separate, user-side, gated on EIN)
|
||||
Phase O self-serve cutover (Stripe live-mode, apex DNS, Railway prod env, flag flip)
|
||||
remains the prior active task; all code blockers closed, blocked on user's EIN. Not
|
||||
touched this session.
|
||||
Phase O self-serve cutover (Stripe live-mode, apex DNS, Railway prod env, flag flip) remains
|
||||
the prior active task; all code blockers closed, blocked on user's EIN. Not touched this session.
|
||||
|
||||
@@ -465,3 +465,11 @@
|
||||
- Patched the affected frontend surfaces narrowly: dashboard helper exports, app-config cache handling, feature-limit cache/fetch state, trial-banner time capture, invite/OAuth route error state, pricing loading state, and OAuth authorize URL helper export.
|
||||
- Verified sequential frontend CI locally in Docker: `npm run lint` passed, `npm run test:coverage` passed (`198` tests), and `npm run build` passed with only Vite chunk-size warnings.
|
||||
- Files touched: `.python-version`, `.gitea/workflows/ci.yml`, `.github/workflows/ci.yml`, `.ai/*`, `README.md`, `DEV-ENV.md`, and the frontend lint-fix files under `frontend/src/components/dashboard`, `frontend/src/hooks`, and `frontend/src/pages`.
|
||||
|
||||
## 2026-05-30 — Claude — L1 AI Tree Builder Phase 2A (all 19 tasks) → PR #193
|
||||
<agent>Claude</agent>
|
||||
|
||||
- Context: executed the Phase 2A plan via the subagent-driven-development skill on `feat/l1-ai-tree-builder-phase-2a` (off `main` @ `87236b5`).
|
||||
- Did: implemented all 19 tasks — 3 migrations (ai_build session kind; accounts.enabled_l1_categories; FlowProposal.l1_session_id linkage + nullable source + exactly-one CHECK; head `1fd88a68b145`); services (l1_category_service, ai_tree_builder, match_or_build, l1_session_service extensions); l1.session.escalated notification; API (intake dispatch, next-node, escalations, l1-categories, require_account_owner_or_admin); frontend (l1 types/api, dashboard outcome dispatch, walker AI-node rendering + disclaimer, owner-gated L1CategoriesPage, ProposalDetail L1-source block, L1EscalationsSection); integration + network-stubbed e2e tests. Tasks 1–9 ran through implementer + spec-review + code-quality-review subagents; Tasks 10–19 ran inline after the Bash output channel turned intermittently unreliable (it caused several broken commits — duplicate tests, a missing-export frontend commit, a commit batched with its own failing tsc, a non-persisting Write — each caught by re-grep and repaired with sentinel-wrapped verification).
|
||||
- Outcome: backend full suite **1376 passed / 18 skipped / 0 failed**; frontend tsc+lint+build clean; migrations downgrade-3→upgrade-head roundtrip clean. Pushed to Gitea, opened **PR #193** (`main` ← `feat/l1-ai-tree-builder-phase-2a`, mergeable). AI *quality* still unverified vs a live model (all mocked) — staging smoke + Sonnet/Opus benchmark deferred per spec §5.3.
|
||||
- Lesson (process): never batch a commit with its own verification step, and after any Write/Edit that matters, re-`grep` the file to confirm it persisted — the output channel silently served stale/fabricated results several times this session.
|
||||
|
||||
Reference in New Issue
Block a user