resolutionflow/backend/tests/test_ai_endpoints.py
chihlasm 97cd297f46 feat: AI-assisted flow builder with 4-stage wizard (#87)
* feat: AI-assisted flow builder with 4-stage wizard

Implements the complete AI flow builder feature using a guided 4-stage
wizard (Foundation → Scaffold → Branch Detail → Review & Assemble).
AI assists at bounded points using Claude Haiku for cost-efficient
structured JSON generation (~$0.01-0.03/flow).

Backend: new models (ai_conversations, ai_usage), Alembic migration,
quota enforcement with billing anchor, Anthropic API integration with
prompt caching, tree validation, conversation CRUD with 24h TTL,
APScheduler cleanup job, 5 API endpoints, Pydantic schemas.

Frontend: TypeScript types, API client, Zustand store for wizard state,
7 components (modal, step indicator, foundation form, branch selector,
branch detail view, tree preview, quota display), MyTreesPage integration
with "Build with AI" button (hidden when AI not configured).

Tests: 14 validator unit tests + 11 endpoint integration tests with
mocked Anthropic (zero real API spend). All 25 tests passing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: dashboard design doc and implementation plan

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: Phase 1 — pinnedFlowsStore, pagination hook, cached quota hook, sidebar refactor

- Add pin() to pinnedFlowsApi
- Create pinnedFlowsStore (Zustand) — single source of truth for pin state
- Add dashboardMyFlowsView preference to userPreferencesStore
- Create usePaginationParams hook (URL-synced)
- Create useCachedQuota hook (5-min TTL)
- Sidebar uses pinnedFlowsStore instead of local state

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: Phase 2 — pin/favorite buttons on all library view components

- TreeGridView: star in top-right corner of cards
- TreeListView: star at end of each row
- TreeTableView: dedicated leftmost Favorite column
- All with proper a11y (aria-label), event isolation, loading states

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: Phase 3 — Library page create dropdown + AI Builder + pin wiring

- Replace single Create link with dropdown menu (3 flow types + AI Builder)
- Wire pinnedFlowsStore to all view components
- AI Builder modal integration via useCachedQuota hook

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: Phase 4 — Dashboard refactor with Favorites grid + paginated My Flows

- Favorites section: compact grid from pinnedFlowsStore, max 2 rows, expandable
- My Flows: author_id filter, URL-synced pagination (10/25/50/All)
- View toggle (grid/list/table) with independent preference
- Skeleton loaders, empty states with CTAs
- Create dropdown with AI Builder option
- 500-item ceiling for "Show All" mode

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat: Phase 5 — Sidebar pinned section dual collapse + show more/less

- Header collapse hides entire section, resets to 5 items on re-expand
- List truncation: show first 5, "Show more (N)" expands to all
- Clicking a flow auto-collapses back to 5
- Smooth max-height CSS transition (250ms ease-out)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: stabilize usePaginationParams to prevent infinite re-render loop

allowedPageSizes array was recreated every render as a useMemo dep,
causing infinite updates. Use useRef to stabilize the reference.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: remove Set-based Zustand selectors causing infinite re-render loop

Zustand selectors returning new Set() on every call fail Object.is
equality check, triggering continuous re-renders. Replaced with
useMemo-derived Sets in consuming components.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: pin route ordering and star icon overlap in grid view

Move GET /pinned and PATCH /pinned/reorder before GET /{tree_id} to
prevent FastAPI from matching "pinned" as a UUID path parameter (422).
Relocate star button from absolute positioning into the header row to
avoid overlapping privacy icons and category badges.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
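
The route-ordering fix above can be illustrated with a toy first-match router. This is a minimal sketch, not FastAPI itself: `resolve`, `get_tree`, and `get_pinned` are hypothetical names, and the regex patterns only approximate how a static path and a UUID path parameter compete for the same URL.

```python
import re
import uuid


def resolve(routes, path):
    """Try routes in registration order; the first matching pattern wins."""
    for pattern, handler in routes:
        match = re.fullmatch(pattern, path)
        if match:
            return handler(*match.groups())
    return 404, "not found"


def get_tree(tree_id):
    # Mimics UUID path-parameter validation: a non-UUID value is rejected.
    try:
        uuid.UUID(tree_id)
    except ValueError:
        return 422, "tree_id is not a valid UUID"
    return 200, f"tree {tree_id}"


def get_pinned():
    return 200, "pinned trees"


# Parameterized route first: "pinned" is captured as tree_id and rejected.
wrong_order = [(r"/([^/]+)", get_tree), (r"/pinned", get_pinned)]

# Static route first: "/pinned" is handled before the catch-all sees it.
right_order = [(r"/pinned", get_pinned), (r"/([^/]+)", get_tree)]
```

With `wrong_order`, `resolve(wrong_order, "/pinned")` yields the 422 result; with `right_order`, the same path reaches `get_pinned`.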

* fix: code review fixes — date calc, input validation, rate limits, shared components

- Fix monthly_reset_at crash when billing anchor day exceeds next month's length
- Add environment_tags sanitization (max 20 tags, 100 chars each) to prevent prompt injection
- Add @limiter.limit("10/minute") rate limiting to all AI endpoints
- Use getTreeNavigatePath() routing helper instead of hardcoded paths
- Extract shared CreateFlowDropdown component from QuickStartPage and TreeLibraryPage
- Clear useCachedQuota on logout to prevent stale data across user sessions
- Add useRef guard to scaffold useEffect to prevent potential double-fire
- Use node.id as React key instead of array index in BranchDetailView
- Remove redundant dead logic in ai_tree_validator

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
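
Two of the review fixes above lend themselves to a short sketch: clamping the billing anchor day to the length of the target month, and bounding `environment_tags`. The function names and the choice to truncate (rather than reject) oversized input are assumptions for illustration; the actual code may behave differently.

```python
import calendar
from datetime import date

MAX_TAGS = 20          # limit stated in the commit message
MAX_TAG_LENGTH = 100   # limit stated in the commit message


def next_monthly_reset(today: date, anchor_day: int) -> date:
    """Return the next reset date after `today`, clamping the anchor day
    to the last day of months that are too short (hypothetical helper)."""
    last_day = calendar.monthrange(today.year, today.month)[1]
    candidate = today.replace(day=min(anchor_day, last_day))
    if candidate > today:
        return candidate
    # The anchor has already passed this month: roll over to the next one.
    if today.month == 12:
        year, month = today.year + 1, 1
    else:
        year, month = today.year, today.month + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(anchor_day, last_day))


def sanitize_tags(tags):
    """Collapse whitespace, cap each tag's length, drop empty tags, and cap
    the tag count. Truncation instead of rejection is an assumption."""
    cleaned = []
    for tag in tags:
        tag = " ".join(str(tag).split())[:MAX_TAG_LENGTH]
        if tag:
            cleaned.append(tag)
    return cleaned[:MAX_TAGS]
```

An anchor day of 31 evaluated on January 31 resolves to the last day of February instead of raising `ValueError`.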

* fix: correct Anthropic model ID to full dated version

claude-haiku-4-5 is not a valid model alias — Anthropic requires the
full dated model ID claude-haiku-4-5-20251001.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: strip markdown code fences from AI JSON responses

Haiku sometimes wraps its JSON in ```json ... ``` despite the prompt
instructing otherwise. Strip fences before parsing to avoid JSONDecodeError
at char 0.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
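
Fence stripping of the kind described above might look like the following. This is a minimal sketch: `strip_code_fences` and `parse_ai_json` are hypothetical names, and the regex only handles a single fence wrapping the whole response.

```python
import json
import re

FENCE = "`" * 3  # a literal triple backtick

# Optional language tag after the opening fence, then the body, then the closing fence.
_FENCE_RE = re.compile(rf"^{FENCE}[\w-]*[ \t]*\n(.*?)\n?{FENCE}\s*$", re.DOTALL)


def strip_code_fences(text: str) -> str:
    """Return the payload inside a wrapping markdown fence, or the text unchanged."""
    stripped = text.strip()
    match = _FENCE_RE.match(stripped)
    return match.group(1).strip() if match else stripped


def parse_ai_json(text: str):
    """Parse model output as JSON after removing any wrapping fence."""
    return json.loads(strip_code_fences(text))
```

Unfenced responses pass through untouched, so the helper is safe to apply to every response before `json.loads`.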

* fix: increase branch_detail max_tokens to 8192 and add response logging

Truncated output at 4096 tokens produces invalid JSON mid-generation.
Also logs stop_reason and output_tokens per attempt to diagnose failures.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
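
A truncation check along the lines described above could be sketched as follows. The Anthropic Messages API reports `stop_reason == "max_tokens"` when output hits the token limit; the logger name and the `is_truncated` helper are assumptions, and `SimpleNamespace` stands in for a real response object.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger("ai_flow_builder")  # hypothetical logger name


def is_truncated(response, attempt: int) -> bool:
    """Log per-attempt diagnostics and report whether output hit the token cap."""
    logger.info(
        "attempt=%d stop_reason=%s output_tokens=%d",
        attempt,
        response.stop_reason,
        response.usage.output_tokens,
    )
    return response.stop_reason == "max_tokens"


# Stand-in responses shaped like the fields the check reads.
cut_off = SimpleNamespace(stop_reason="max_tokens", usage=SimpleNamespace(output_tokens=4096))
complete = SimpleNamespace(stop_reason="end_turn", usage=SimpleNamespace(output_tokens=1200))
```

Checking `stop_reason` before parsing distinguishes a truncated response from one that is merely malformed.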

* fix: pass explicit status='draft' when creating AI-generated flow

Tree model defaults to 'published' in the DB schema, but passing status=None
from the constructor overrides that default, causing a nullable=False violation
and a 500 on save.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: auto-advance branch detail and pin navigation bar

- Auto-advance to next undetailed branch after generation completes,
  using a useEffect that watches the count of detailed branches
- Cap tree preview at max-h-48 with internal scroll so the nav bar
  is never pushed off screen
- Make nav bar sticky bottom-0 with bg-card so it stays visible
  regardless of content height

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: increase branch retries to 3 and relax cross-reference validation on final attempt

next_node_id mismatches are a common model hallucination that the retry
prompt doesn't reliably fix. On the final (3rd) attempt, accept the branch
with strict=False so only truly fatal errors (missing fields, dead ends,
bad JSON) cause a hard failure. Cross-reference issues are minor and
fixable in the tree editor.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: strengthen prompt to prevent next_node_id mismatches, keep strict validation

Rather than lowering the validation bar, improve the system prompt:
- Rule 6 now explicitly states next_node_id must match a direct child's id
- Added rule 10: build tree bottom-up to avoid forward-reference errors
- Corrective prompt now calls out the ID mismatch constraint specifically

Reverts the strict=False fallback — flows must be correct before saving.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
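
The retry behavior after this revert (up to three attempts, strict validation on every one, a corrective prompt between attempts) can be sketched as below. This is a synchronous simplification with hypothetical names; `generate` and `validate` are caller-supplied callables, not the project's actual functions.

```python
import json

MAX_ATTEMPTS = 3  # attempt count taken from the commit message


def generate_with_retries(generate, validate, prompt):
    """Call generate(prompt) -> str until validate(tree) -> list[str] is empty.

    Validation stays strict on every attempt; after the last failure the
    errors are raised rather than accepted.
    """
    errors = []
    current_prompt = prompt
    for _attempt in range(MAX_ATTEMPTS):
        raw = generate(current_prompt)
        try:
            tree = json.loads(raw)
        except json.JSONDecodeError as exc:
            errors = [f"invalid JSON: {exc}"]
        else:
            errors = validate(tree)
            if not errors:
                return tree
        # Corrective prompt: restate the original request plus the specific errors.
        bullet_list = "\n".join(f"- {error}" for error in errors)
        current_prompt = f"{prompt}\n\nYour previous answer was rejected:\n{bullet_list}"
    raise ValueError(f"generation failed after {MAX_ATTEMPTS} attempts: {errors}")
```

Feeding the concrete validation errors back into the prompt is what gives the model a chance to correct ID mismatches on the next attempt.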

* fix: persist branch viewing index in store to survive phase remounts

Local useState resets to 0 every time phase transitions from 'generating'
back to 'detailing', causing the view to snap back to branch 1.

Move viewingIndex to store's currentBranchIndex (already existed) and
advance it in generateBranchDetail after success. Component reads from
store so remounts no longer lose position.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: correct publish validation to check title instead of action/solution fields

The publish validator was checking for an 'action' field on action nodes
and a 'solution' field on solution nodes, but the actual node schema
(confirmed from seed data and frontend types) uses 'title'/'description'.
This caused all AI-generated trees to fail publish validation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: correct action node schema and improve AI flow quality

- Fix action nodes to use next_node_id (not children) for continuation,
  matching how TreeNavigationPage.tsx navigates action nodes
- Validator now requires next_node_id on all action nodes and flags
  missing ones as broken dead ends
- Update _check_branch_termination: action nodes are not dead ends since
  they continue via next_node_id (validated separately)
- Improve scaffold prompt: branch names must describe observable symptoms
  users can self-identify, not internal category names
- Update branch_detail prompt with clearer action node schema, corrected
  few-shot example showing proper next_node_id on action nodes
- Improve assemble_tree root question to be more user-facing

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
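
The validation rules described above can be sketched as a small recursive check. This is an illustration under assumptions, not the contents of `ai_tree_validator`: `validate_branch` is a hypothetical name, and the node shape (`children`, `options`, `next_node_id`) follows the fixture used in `test_ai_endpoints.py`.

```python
def validate_branch(root: dict) -> list:
    """Return a list of error strings; an empty list means the branch passed."""
    errors = []

    def visit(node: dict) -> None:
        node_id = node.get("id", "<missing id>")
        children = node.get("children", [])
        child_ids = {child.get("id") for child in children}
        kind = node.get("type")

        if kind == "decision":
            # A decision node with no children is a dead end.
            if not children:
                errors.append(f"{node_id}: decision node has no children")
            # Each option must point at one of this node's direct children.
            for option in node.get("options", []):
                target = option.get("next_node_id")
                if target not in child_ids:
                    errors.append(f"{node_id}: option targets unknown child {target!r}")
        elif kind == "action":
            # Action nodes continue via next_node_id, not via children.
            if not node.get("next_node_id"):
                errors.append(f"{node_id}: action node is missing next_node_id")

        for child in children:
            visit(child)

    visit(root)
    return errors


sample_branch = {
    "id": "svc-root",
    "type": "decision",
    "question": "Is the service running?",
    "options": [{"id": "opt-no", "label": "No", "next_node_id": "svc-restart"}],
    "children": [
        {"id": "svc-restart", "type": "action", "title": "Restart Service",
         "next_node_id": "svc-restart-ok"},
        {"id": "svc-restart-ok", "type": "solution", "title": "Service Restored"},
    ],
}
```

Note that the solution node is a sibling of the action node that points to it, which is why action nodes are checked only for the presence of `next_node_id`.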

* docs: add AI flow builder gotchas to CLAUDE.md (#23-25)

- Action nodes use next_node_id (not children) for navigation
- Anthropic model IDs require full dated version string
- Claude API may wrap JSON in markdown fences

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: resolve CI lint errors and httpx dependency conflict

- Fix httpx version conflict: requirements-dev.txt now uses >=0.27.0 to match requirements.txt
- Extract CSAT helper functions to csatUtils.ts to fix react-refresh/only-export-components
- Remove default export from admin/EmptyState.tsx shim (same rule)
- Fix empty catch block in Modal.tsx (no-empty)
- Add eslint-disable comments for intentional setState-in-effect patterns in
  FlowAnalyticsPanel, QuickLaunch, NodeEditorPanel, useCachedQuota,
  MyAnalyticsPage, TeamAnalyticsPage
- Add eslint-disable comments for intentional _children destructure in NodeEditorPanel
- Fix _parentId unused var in useTreeLayout.ts
- Rewrite usePaginationParams.ts to avoid reading refs during render

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: update tests to match action node schema (next_node_id, not children)

- Update _make_valid_tree() in test_ai_tree_validator to use next_node_id
  on action nodes (solution is a sibling, not a child)
- Fix test_dead_end_action_node → test_dead_end_decision_node (action nodes
  don't have child-based dead ends; dead ends are decision nodes with no children)
- Add test_action_missing_next_node_id for the new validation rule
- Update BRANCH_DETAIL_JSON in test_ai_endpoints to use next_node_id pattern
- Update test_draft_trees.py to use "title" field for action/solution nodes
  (tree_validation.py was updated on this branch to require "title", not "action"/"solution")

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: update remaining tests and session_to_tree for title field rename

- test_tree_validation.py: replace "action"/"solution" content fields with "title"
- test_procedural_flows.py: update solution node fixtures to use "title"
- test_save_session_as_tree.py: update fixtures and assertions for "title" field
- session_to_tree.py: generate "title" instead of "action"/"solution" on converted nodes;
  fall back to legacy field names when reading from old tree snapshots for compatibility

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
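
The legacy fallback described above amounts to preferring `title` and only then consulting the old field names. A minimal sketch, assuming a hypothetical `node_title` helper and plain dict nodes:

```python
LEGACY_TITLE_FIELDS = ("action", "solution")  # field names used by old tree snapshots


def node_title(node: dict) -> str:
    """Prefer the current "title" field, falling back to legacy field names."""
    if node.get("title"):
        return node["title"]
    for field in LEGACY_TITLE_FIELDS:
        if node.get(field):
            return node[field]
    return ""
```

New nodes are always written with `title`; the fallback applies only when reading older data.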

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-23 00:03:54 -05:00


"""Integration tests for AI Flow Builder endpoints.

All Anthropic API calls are mocked — zero real API spend.
"""
import json
from unittest.mock import AsyncMock, patch, MagicMock

import pytest

from app.core.config import settings

# ── Sample AI responses ──

SCAFFOLD_RESPONSE_JSON = json.dumps({
    "branches": [
        {"name": "Service Not Running", "description": "The target service is stopped or crashed."},
        {"name": "Authentication Failures", "description": "Users cannot authenticate against the service."},
        {"name": "Network Connectivity", "description": "Network-level issues preventing access."},
        {"name": "Configuration Errors", "description": "Misconfiguration of the service or its dependencies."},
    ]
})

BRANCH_DETAIL_JSON = json.dumps({
    "id": "svc-root",
    "type": "decision",
    "question": "Is the service running?",
    "options": [
        {"id": "opt-yes", "label": "Yes", "next_node_id": "svc-check-logs"},
        {"id": "opt-no", "label": "No", "next_node_id": "svc-restart"},
    ],
    "children": [
        {
            "id": "svc-check-logs",
            "type": "action",
            "title": "Check Event Logs",
            "description": "Check Windows Event Viewer for errors.",
            "commands": ["Get-EventLog -LogName Application -Newest 20"],
            "next_node_id": "svc-logs-resolved",
        },
        {
            "id": "svc-logs-resolved",
            "type": "solution",
            "title": "Issue Found in Logs",
            "description": "Error identified and resolved.",
            "resolution_steps": ["Fix the error", "Restart service"],
        },
        {
            "id": "svc-restart",
            "type": "action",
            "title": "Restart Service",
            "description": "Attempt to restart the service.",
            "commands": ["Restart-Service -Name 'TestService'"],
            "next_node_id": "svc-restart-ok",
        },
        {
            "id": "svc-restart-ok",
            "type": "solution",
            "title": "Service Restored",
            "description": "Service is running after restart.",
            "resolution_steps": ["Verify connectivity", "Document in ticket"],
        },
    ],
})


def _mock_anthropic_response(text: str, input_tokens: int = 100, output_tokens: int = 200):
    """Create a mock Anthropic API response."""
    response = MagicMock()
    response.content = [MagicMock(text=text)]
    response.usage = MagicMock(input_tokens=input_tokens, output_tokens=output_tokens)
    return response


@pytest.fixture
def enable_ai():
    """Temporarily enable AI by setting a fake API key."""
    original = settings.ANTHROPIC_API_KEY
    settings.ANTHROPIC_API_KEY = "test-key-fake"
    yield
    settings.ANTHROPIC_API_KEY = original


@pytest.fixture
def disable_ai():
    """Ensure AI is disabled."""
    original = settings.ANTHROPIC_API_KEY
    settings.ANTHROPIC_API_KEY = None
    yield
    settings.ANTHROPIC_API_KEY = original


# ── Quota endpoint ──

@pytest.mark.asyncio
async def test_quota_returns_disabled_when_no_key(client, auth_headers, disable_ai):
    """GET /ai/quota returns ai_enabled=false when no API key."""
    response = await client.get("/api/v1/ai/quota", headers=auth_headers)
    assert response.status_code == 200
    data = response.json()
    assert data["ai_enabled"] is False
    assert data["allowed"] is False


@pytest.mark.asyncio
async def test_quota_returns_enabled_with_key(client, auth_headers, enable_ai):
    """GET /ai/quota returns ai_enabled=true with API key configured."""
    response = await client.get("/api/v1/ai/quota", headers=auth_headers)
    assert response.status_code == 200
    data = response.json()
    assert data["ai_enabled"] is True
    assert data["allowed"] is True


# ── Start endpoint ──

@pytest.mark.asyncio
async def test_start_requires_auth(client, enable_ai):
    """POST /ai/start requires authentication."""
    response = await client.post("/api/v1/ai/start", json={
        "flow_type": "troubleshooting",
        "name": "Test Flow",
        "description": "Test",
    })
    assert response.status_code == 401


@pytest.mark.asyncio
async def test_start_returns_503_when_disabled(client, auth_headers, disable_ai):
    """POST /ai/start returns 503 when AI is not configured."""
    response = await client.post(
        "/api/v1/ai/start",
        json={
            "flow_type": "troubleshooting",
            "name": "Test Flow",
            "description": "Test description",
        },
        headers=auth_headers,
    )
    assert response.status_code == 503


@pytest.mark.asyncio
async def test_start_creates_conversation(client, auth_headers, enable_ai):
    """POST /ai/start creates a conversation and returns conversation_id."""
    response = await client.post(
        "/api/v1/ai/start",
        json={
            "flow_type": "troubleshooting",
            "name": "DNS Issues",
            "description": "Troubleshooting DNS resolution failures",
            "environment_tags": ["Windows Server", "Active Directory"],
        },
        headers=auth_headers,
    )
    assert response.status_code == 201
    data = response.json()
    assert "conversation_id" in data
    assert data["status"] == "foundation"


@pytest.mark.asyncio
async def test_start_validates_input(client, auth_headers, enable_ai):
    """POST /ai/start rejects invalid input."""
    response = await client.post(
        "/api/v1/ai/start",
        json={
            "flow_type": "troubleshooting",
            "name": "",  # Empty name
            "description": "Test",
        },
        headers=auth_headers,
    )
    assert response.status_code == 422


# ── Scaffold endpoint ──

@pytest.mark.asyncio
async def test_scaffold_success(client, auth_headers, enable_ai):
    """POST /ai/scaffold returns AI-generated branches."""
    # Create conversation first
    start_resp = await client.post(
        "/api/v1/ai/start",
        json={
            "flow_type": "troubleshooting",
            "name": "DNS Issues",
            "description": "DNS resolution failures",
        },
        headers=auth_headers,
    )
    conversation_id = start_resp.json()["conversation_id"]

    # Mock Anthropic
    mock_response = _mock_anthropic_response(SCAFFOLD_RESPONSE_JSON)
    with patch("app.core.ai_tree_generator_service._get_client") as mock_client:
        mock_client.return_value.messages.create = AsyncMock(return_value=mock_response)
        response = await client.post(
            "/api/v1/ai/scaffold",
            json={"conversation_id": conversation_id},
            headers=auth_headers,
        )

    assert response.status_code == 200
    data = response.json()
    assert data["status"] == "scaffolding"
    assert len(data["branches"]) == 4
    assert data["branches"][0]["name"] == "Service Not Running"


@pytest.mark.asyncio
async def test_scaffold_invalid_conversation(client, auth_headers, enable_ai):
    """POST /ai/scaffold returns 404 for nonexistent conversation."""
    response = await client.post(
        "/api/v1/ai/scaffold",
        json={"conversation_id": "00000000-0000-0000-0000-000000000000"},
        headers=auth_headers,
    )
    assert response.status_code == 404


# ── Branch detail endpoint ──

@pytest.mark.asyncio
async def test_branch_detail_success(client, auth_headers, enable_ai):
    """POST /ai/branch-detail returns AI-generated branch nodes."""
    # Create and scaffold first
    start_resp = await client.post(
        "/api/v1/ai/start",
        json={
            "flow_type": "troubleshooting",
            "name": "Service Issues",
            "description": "Service troubleshooting",
        },
        headers=auth_headers,
    )
    conversation_id = start_resp.json()["conversation_id"]

    scaffold_mock = _mock_anthropic_response(SCAFFOLD_RESPONSE_JSON)
    with patch("app.core.ai_tree_generator_service._get_client") as mock_client:
        mock_client.return_value.messages.create = AsyncMock(return_value=scaffold_mock)
        await client.post(
            "/api/v1/ai/scaffold",
            json={"conversation_id": conversation_id},
            headers=auth_headers,
        )

    # Now generate branch detail
    detail_mock = _mock_anthropic_response(BRANCH_DETAIL_JSON)
    with patch("app.core.ai_tree_generator_service._get_client") as mock_client:
        mock_client.return_value.messages.create = AsyncMock(return_value=detail_mock)
        response = await client.post(
            "/api/v1/ai/branch-detail",
            json={
                "conversation_id": conversation_id,
                "branch_name": "Service Not Running",
            },
            headers=auth_headers,
        )

    assert response.status_code == 200
    data = response.json()
    assert data["branch_name"] == "Service Not Running"
    assert data["steps"]["id"] == "svc-root"
    assert data["steps"]["type"] == "decision"


# ── Assemble endpoint ──

@pytest.mark.asyncio
async def test_assemble_success(client, auth_headers, enable_ai):
    """POST /ai/assemble returns assembled tree from branches with detail."""
    # Create conversation
    start_resp = await client.post(
        "/api/v1/ai/start",
        json={
            "flow_type": "troubleshooting",
            "name": "Service Issues",
            "description": "Service troubleshooting",
        },
        headers=auth_headers,
    )
    conversation_id = start_resp.json()["conversation_id"]

    # Scaffold
    scaffold_mock = _mock_anthropic_response(SCAFFOLD_RESPONSE_JSON)
    with patch("app.core.ai_tree_generator_service._get_client") as mock_client:
        mock_client.return_value.messages.create = AsyncMock(return_value=scaffold_mock)
        await client.post(
            "/api/v1/ai/scaffold",
            json={"conversation_id": conversation_id},
            headers=auth_headers,
        )

    # Assemble with branch detail included
    branch_tree = json.loads(BRANCH_DETAIL_JSON)
    response = await client.post(
        "/api/v1/ai/assemble",
        json={
            "conversation_id": conversation_id,
            "selected_branches": [
                {
                    "name": "Service Not Running",
                    "description": "The target service is stopped.",
                    "steps": branch_tree,
                },
                {
                    "name": "Authentication Failures",
                    "description": "Users cannot authenticate.",
                    "steps": branch_tree,
                },
            ],
        },
        headers=auth_headers,
    )
    assert response.status_code == 200
    data = response.json()
    assert data["status"] == "completed"
    assert data["suggested_name"] == "Service Issues"
    assert "tree_structure" in data
    assert data["tree_structure"]["type"] == "decision"
    assert data["summary"]["node_count"] > 0
    assert data["summary"]["solution_count"] >= 2


@pytest.mark.asyncio
async def test_assemble_requires_min_2_branches(client, auth_headers, enable_ai):
    """POST /ai/assemble rejects fewer than 2 branches."""
    start_resp = await client.post(
        "/api/v1/ai/start",
        json={
            "flow_type": "troubleshooting",
            "name": "Test",
            "description": "Test",
        },
        headers=auth_headers,
    )
    conversation_id = start_resp.json()["conversation_id"]

    response = await client.post(
        "/api/v1/ai/assemble",
        json={
            "conversation_id": conversation_id,
            "selected_branches": [
                {"name": "Only Branch", "description": "Just one"},
            ],
        },
        headers=auth_headers,
    )
    assert response.status_code == 422