Architecture review · 2026-05-13

ResolutionFlow workflow analysis

Based on workflows.html · 28 user-facing flows · 297 traced steps · 120 unique files

Bottom line

You're not bloated, and most of the "circles" in the diagram are visualization artifacts, not architecture problems. Each HTTP call shows up as two steps (request + response), so a normal round-trip looks like a circle even though it's one unit of work.

Three real items worth engineering attention: ai_sessions.py is becoming a god endpoint, the three chat services have a confusing boundary, and the auth token tables have no physical cleanup so they accrue rows forever. Everything else looks structurally healthy.

Headline numbers

| Metric | Value | Reading |
| --- | --- | --- |
| Avg steps / flow | 10.6 | healthy range for multi-tenant SaaS |
| Avg files / flow | 7.5 | one file per layer, roughly |
| Revisit ratio | 1.39 | 1.0 = flat; 2.0+ = chat-shaped |
| "Backward" edges | 15% | mostly HTTP response, not real circles |
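
The report does not define the revisit ratio, so the following is a minimal sketch under one assumption: the ratio is the number of traced steps in a flow divided by the number of distinct files that flow touches. The headline averages are consistent with that reading (10.6 / 7.5 ≈ 1.41, close to the reported 1.39, with the small gap plausible if the ratio is averaged per flow). The function name and both sample flows are hypothetical.

```python
def revisit_ratio(visited_files):
    """Traced steps divided by distinct files touched in one flow.

    1.0 means every step lands on a new file (a flat flow); higher values
    mean the flow keeps returning to files it has already visited.
    """
    if not visited_files:
        raise ValueError("flow has no steps")
    return len(visited_files) / len(set(visited_files))


# Hypothetical flows: one file name per traced step.
flat_flow = ["LoginPage", "authStore", "auth_api", "auth_endpoint", "postgres"]
chat_flow = ["page", "hook", "api", "endpoint", "page", "hook", "api", "endpoint"]

print(revisit_ratio(flat_flow))  # 1.0
print(revisit_ratio(chat_flow))  # 2.0
```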

Why the diagrams look circular

Each HTTP request and its response are encoded as two separate steps. So an API call architecturally goes one direction, but visually looks like a loop. Breakdown of the 44 backward-flowing edges:

| Kind | Count | Real circle? | Example |
| --- | --- | --- | --- |
| http_post / http_get response | 20 | artifact | Server returns 200 to client. Not a circle. |
| function_call return value | 8 | artifact | oauth_providers returns an OAuthProfile to the endpoint that called it. |
| state_update (hook → component/page) | 8 | idiomatic | Hook returns updated state, page re-renders. Pure React data flow. |
| redirect (OAuth provider → app) | 4 | real | Google/Microsoft sends user back to /oauth/callback. Architecturally required. |
| webhook | 1 | real | Stripe POSTs to /webhooks/stripe. External system re-enters us. |
| navigation / external_api / other | 3 | real | Page-to-page nav, Anthropic returning a response. |

After setting aside the request/response duality, function-call returns, and idiomatic React state updates, the real backward edges are 8 of 297 steps (about 3%), and every one of them sits where the architecture demands it: OAuth redirects, the Stripe webhook, page navigation, and external API responses.
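
The percentages can be checked directly from the breakdown. This is a small sketch that recomputes them; the counts and labels are copied from the table above, and only the variable and function names are invented here.

```python
TOTAL_STEPS = 297

# (kind, count, classification) as listed in the breakdown table.
backward_edges = [
    ("http_post / http_get response", 20, "artifact"),
    ("function_call return value", 8, "artifact"),
    ("state_update", 8, "idiomatic"),
    ("redirect", 4, "real"),
    ("webhook", 1, "real"),
    ("navigation / external_api / other", 3, "real"),
]


def share(count, total=TOTAL_STEPS):
    """Percentage of all traced steps, rounded to one decimal place."""
    return round(100 * count / total, 1)


total_backward = sum(count for _, count, _ in backward_edges)
real_backward = sum(count for _, count, kind in backward_edges if kind == "real")

print(total_backward, share(total_backward))  # 44 14.8
print(real_backward, share(real_backward))    # 8 2.7
```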

What's healthy

Clean layer discipline (good)

The system mostly respects layer boundaries. endpoint → service (34x), service → external (37x), api_client → endpoint (30x) dominate the traffic. Things flow in the expected direction.

flowpilot_engine is the right kind of shared service (good)

Touched by 5 flows (start, respond, resolve, pause, abandon). That's a coordination kernel doing its job — high fan-in is correct for orchestration code.

PostgreSQL in 25/28 flows (good)

Star topology, not a tangle. That's what a database is supposed to look like.

Layer transition heatmap

How many times each layer-pair appears across all steps. Bright cells = well-traveled paths. Empty cells = layer boundaries that aren't crossed (mostly a good sign).

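| from \ to | page | comp | hook | store | api_c | http | endp | serv | core | model | ext |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| page | 13 | 5 | 6 | 12 | 17 | · | · | · | · | · | 2 |
| comp | 1 | 5 | 2 | · | 1 | · | 1 | · | · | · | · |
| hook | 7 | 1 | · | · | 11 | · | · | · | · | · | · |
| store | · | · | · | 4 | 2 | · | 1 | · | · | · | 1 |
| api_client | · | · | · | · | · | 5 | 30 | · | · | · | 1 |
| http_client | · | · | · | · | · | · | 5 | · | · | · | · |
| endpoint | 3 | · | 9 | 2 | 4 | · | 1 | 34 | 8 | † | † |
| service | 1 | · | · | · | 2 | · | 3 | 9 | 5 | 4 | 37 |
| core | · | · | · | · | · | · | · | · | · | · | 4 |
| model | · | · | · | · | · | · | · | · | · | · | 1 |
| external | 4 | · | · | · | · | · | 1 | 1 | · | · | · |

† The endpoint → model and endpoint → ext cells are run together as "229" in the exported heatmap: either 2 and 29, or 22 and 9. Both readings sum to 31.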

Read row → column. Diagonal = same-layer transitions. Below-diagonal = "backward" (e.g. endpoint → hook = HTTP response). The strong upper-right concentration (endpoint → service → external) is the right shape.
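
For readers who want to reproduce a heatmap like this from their own trace, here is a minimal sketch. It assumes each traced step is recorded as a (source layer, target layer) pair and that the layer order follows the heatmap's column order; a step counts as backward when its target sits earlier in that order than its source. The function names and the sample trace are hypothetical.

```python
from collections import Counter

# Layer order taken from the heatmap's column order.
LAYERS = ["page", "comp", "hook", "store", "api_client", "http_client",
          "endpoint", "service", "core", "model", "external"]
RANK = {layer: i for i, layer in enumerate(LAYERS)}


def direction(src, dst):
    """Classify one step by comparing layer positions."""
    if RANK[src] == RANK[dst]:
        return "same-layer"
    return "forward" if RANK[dst] > RANK[src] else "backward"


def layer_matrix(steps):
    """Count how often each (source layer, target layer) pair occurs."""
    return Counter(steps)


# Hypothetical trace: one API round-trip that reaches an external service.
steps = [
    ("hook", "api_client"),
    ("api_client", "endpoint"),
    ("endpoint", "service"),
    ("service", "external"),
    ("endpoint", "hook"),  # HTTP response
    ("hook", "page"),      # state update
]

matrix = layer_matrix(steps)
by_direction = Counter(direction(src, dst) for src, dst in steps)

print(matrix[("endpoint", "service")])  # 1
print(dict(by_direction))               # {'forward': 4, 'backward': 2}
```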

Top coupling hot-spots

Files appearing in the most flows. The first two (PostgreSQL, Anthropic) are expected; everything else is worth a glance.

| Flows | File | Layer | Read |
| --- | --- | --- | --- |
| 25 | external:postgres | external | Expected. The DB is the hub. |
| 10 | external:anthropic_api | external | Expected for an AI product. |
| 7 | backend/app/api/endpoints/ai_sessions.py | endpoint | God endpoint candidate. See concern below. |
| 6 | frontend/src/api/aiSessions.ts | api_client | Mirrors the god endpoint. Splits naturally if backend splits. |
| 5 | backend/app/services/flowpilot_engine.py | service | Healthy coordination kernel. |
| 5 | backend/app/api/endpoints/auth.py | endpoint | 5 auth flows, 5 endpoints. Reasonable. |
| 5 | frontend/src/store/authStore.ts | store | Centralized auth state. Correct. |
| 5 | frontend/src/pages/FlowPilotSessionPage.tsx | page | Worth checking; see OAuth concern. |
| 5 | frontend/src/hooks/useFlowPilotSession.ts | hook | Always co-travels with the page. Right pattern. |
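
The "Flows" column is a fan-in count: how many distinct flows touch a file at least once. Here is a sketch of how it could be computed, assuming the trace is available as a mapping from flow name to the files it visits. The file paths are taken from this report, but the flow names and their contents are invented for illustration.

```python
from collections import Counter


def flows_per_file(flows):
    """Count, for each file, the number of flows that touch it at least once."""
    counts = Counter()
    for files in flows.values():
        counts.update(set(files))  # a file counts once per flow, however often it is revisited
    return counts


# Hypothetical trace data.
flows = {
    "login": [
        "frontend/src/store/authStore.ts",
        "backend/app/api/endpoints/auth.py",
        "external:postgres",
    ],
    "start_session": [
        "backend/app/api/endpoints/ai_sessions.py",
        "backend/app/services/flowpilot_engine.py",
        "external:postgres",
        "external:postgres",  # revisit within the same flow
    ],
    "respond": [
        "backend/app/api/endpoints/ai_sessions.py",
        "backend/app/services/flowpilot_engine.py",
        "external:anthropic_api",
        "external:postgres",
    ],
}

counts = flows_per_file(flows)
print(counts["external:postgres"])                         # 3
print(counts["backend/app/api/endpoints/ai_sessions.py"])  # 2
```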

Things worth examining

1. ai_sessions.py is a god endpoint (split candidate)

Appears in 7 flows. Houses ~12 route handlers in one file: create, respond, chat, resolve, escalate, pause, abandon, pickup, list, get, plus the /chat + /respond overload. It's the most widely shared file in the codebase itself; only the external PostgreSQL and Anthropic nodes appear in more flows.

Suggested seam:

Frontend aiSessions.ts would split along the same line. Net change: clearer ownership, no functional impact.

2. Three chat services with a confusing boundary (name vs reality)

Three files exist with overlapping responsibilities:

The PROJECT_CONTEXT.md note says assistant_chat_service was "removed except for retention settings," but the trace shows unified_chat_service.send_chat_message still calls into it for _call_ai. So the file is load-bearing infrastructure, not retention scaffolding.

Two paths forward:

Either way the confusing seam goes away.

3. OAuth login is the most "circular" real flow (overloaded callback)

19 steps, 4 backward edges, 3 self-loops — by far the most complex auth flow. Some complexity is unavoidable (provider redirect = 2 boundary crossings). But 3 self-loops on OAuthCallbackPage suggest the page is doing too much local state shuffling: CSRF state validation, code exchange, invite-code stash retrieval, JWT storage, navigation, welcome-banner logic.

Worth a look: move OAuth state handling into either authStore (which would centralize all auth state in one place) or a useOAuthCallback hook. The page itself should be mostly declarative.

4. Three auth-token tables grow without bound (add cleanup)

Auth writes to four tables: refresh_tokens, password_reset_tokens, email_verification_tokens, and oauth_identities. Each table is individually justified (different lifecycles, different lookup patterns, JTI rotation for refresh), so this is not bloat in the code. But the cleanup story for the three token tables is missing.

Verified directly: retention_cleanup.py only sweeps AssistantChat. scheduler.py only has one other cleanup job, for AIConversation. The auth endpoint code in auth.py revokes tokens (UPDATE … SET revoked_at = now()) but never deletes them. So:

Suggested fix: add a daily APScheduler job in retention_cleanup.py (or a sibling) that hard-deletes rows where revoked_at < now() - INTERVAL '30 days' for refresh_tokens, and expires_at < now() - INTERVAL '7 days' for the two single-use token tables. Pattern matches the existing cleanup_expired_chats shape and the _cleanup_expired_ai_conversations job in scheduler.py.
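
The suggested retention rules can be sketched as a single purge function. This is an illustration under stated assumptions, not the project's code: sqlite3 stands in for PostgreSQL so the example runs on its own, timestamps are stored as UTC ISO-8601 strings, the function name purge_stale_auth_tokens and the id columns are hypothetical, and registering the function as a daily APScheduler job is left out. Only the table names, the revoked_at / expires_at columns, and the 30-day and 7-day windows come from the suggestion above.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

REFRESH_RETENTION = timedelta(days=30)    # keep revoked refresh tokens this long
SINGLE_USE_RETENTION = timedelta(days=7)  # keep expired single-use tokens this long


def purge_stale_auth_tokens(conn, now=None):
    """Hard-delete old auth-token rows and return the deleted count per table."""
    now = now or datetime.now(timezone.utc)
    revoked_cutoff = (now - REFRESH_RETENTION).isoformat()
    expired_cutoff = (now - SINGLE_USE_RETENTION).isoformat()

    deleted = {}
    cur = conn.execute(
        "DELETE FROM refresh_tokens "
        "WHERE revoked_at IS NOT NULL AND revoked_at < ?",
        (revoked_cutoff,),
    )
    deleted["refresh_tokens"] = cur.rowcount

    # Table names are fixed constants, never user input, so formatting them in is safe.
    for table in ("password_reset_tokens", "email_verification_tokens"):
        cur = conn.execute(
            f"DELETE FROM {table} WHERE expires_at < ?", (expired_cutoff,)
        )
        deleted[table] = cur.rowcount

    conn.commit()
    return deleted


# Demo against an in-memory database with a fixed clock.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE refresh_tokens (id INTEGER PRIMARY KEY, revoked_at TEXT);
CREATE TABLE password_reset_tokens (id INTEGER PRIMARY KEY, expires_at TEXT);
CREATE TABLE email_verification_tokens (id INTEGER PRIMARY KEY, expires_at TEXT);
""")
now = datetime(2026, 5, 13, tzinfo=timezone.utc)


def days_ago(days):
    return (now - timedelta(days=days)).isoformat()


conn.executemany(
    "INSERT INTO refresh_tokens (revoked_at) VALUES (?)",
    [(days_ago(45),), (days_ago(3),), (None,)],  # old revoked, recent revoked, still active
)
conn.executemany(
    "INSERT INTO password_reset_tokens (expires_at) VALUES (?)",
    [(days_ago(10),), (days_ago(1),)],
)
conn.executemany(
    "INSERT INTO email_verification_tokens (expires_at) VALUES (?)",
    [(days_ago(8),)],
)

result = purge_stale_auth_tokens(conn, now)
print(result)
# {'refresh_tokens': 1, 'password_reset_tokens': 1, 'email_verification_tokens': 1}
```

Note that the revoked_at rule leaves unrevoked refresh tokens alone even after they expire; whether those also need a sweep depends on how refresh_tokens tracks expiry, which the trace does not show.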

Earlier draft of this concern pointed to retention_cleanup.py as the place to verify existing cleanup. That was wrong — no such cleanup exists. Corrected after direct check.

Things not to worry about

Hook ↔ page state loops in session flows

That's just React. useFlowPilotSession and FlowPilotSessionPage always travel together because the hook is that page's controller — they're maximally coupled by design, which is the right pattern.

Low "work percentage" on simple flows

"Pause & leave" comes out at 11% real work, 89% plumbing. That's correct — pause is structurally just PATCH status='paused'. There's no work to do beyond plumbing. The metric undersells simple flows.

The 25-flow PostgreSQL hub

Star topology, not a tangle. A database that serves nearly every flow (25 of 28) is exactly what a hub should look like.

Caveats on this analysis

Work vs plumbing heuristic undersells reality. It counts http_post as plumbing even when it carries the actual payload. Work percentages should be read as roughly 2x the displayed value.
Only user-facing flows are traced. Background work (knowledge flywheel scheduler, retention cleanup, PSA retry scheduler, MCP turn routing) isn't in here — and that's exactly where bloat tends to hide because nobody watches it. A follow-up trace of the background jobs would close the loop.
~6 of 297 steps are marked unverified (mostly knowledge-flywheel-created proposals). They're included in the totals but the conclusions don't depend on them.
"Backward edge" includes HTTP responses. An HTTP round-trip looks like one forward step (request) plus one backward step (response). That alone accounts for 20 of the 44 backward edges, the largest single share of the 15%. The interesting backward edges are the ~3% that aren't request/response duality, function returns, or React state updates.