feat: add procedural flow support to AI chat builder (Flow Assist)

- Add procedural-specific system prompts (schema, interview protocol, response format)
- Dispatch prompts by flow_type: procedural/maintenance use the flat steps schema, troubleshooting uses the decision tree schema
- Parse [STEPS_UPDATE] and [INTAKE_FORM] markers in AI responses
- Add validate_generated_procedural_steps() validator
- Handle intake form extraction in AI chat import endpoint
- Add StaticStepsPreview component for procedural flow preview
- Update store and page to render correct preview by flow type

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 02:20:14 -05:00
parent 07a723c687
commit f86e16661a
7 changed files with 1298 additions and 38 deletions
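
For orientation, here is a minimal sketch of the two payloads this commit introduces and of how `[VAR:variable_name]` placeholders might be resolved at execution time. The field names follow the prompt text in the diff; the step content and the `render_step_text` helper are assumptions made for illustration and are not part of the commit.

```python
import re

# Illustrative [STEPS_UPDATE] payload; the step content is invented for this example.
steps_payload = {
    "steps": [
        {"id": "pre-flight", "type": "section_header", "title": "Pre-Flight Checks"},
        {
            "id": "connect-to-server",
            "type": "procedure_step",
            "title": "Connect to the server",
            "description": "Connect to [VAR:server_name] via RDP",
            "content_type": "action",
        },
        {
            "id": "check-sync-service",
            "type": "procedure_step",
            "title": "Check the sync service",
            "description": "Confirm the sync service is running on [VAR:server_name]",
            "content_type": "verification",
            "commands": [
                {"code": "Get-Service ADSync", "label": "Check sync service", "language": "powershell"}
            ],
        },
        {"id": "procedure-complete", "type": "procedure_end", "title": "Procedure complete"},
    ]
}

# Illustrative [INTAKE_FORM] payload, mirroring the field names used in the prompt.
intake_form_payload = [
    {
        "variable_name": "server_name",
        "label": "Server Name",
        "field_type": "text",
        "required": True,
        "placeholder": "e.g., DC01",
        "group_name": "Server Details",
        "display_order": 1,
    }
]

VAR_PATTERN = re.compile(r"\[VAR:(\w+)\]")


def render_step_text(text: str, values: dict[str, str]) -> str:
    """Hypothetical helper: replace each [VAR:name] with its submitted value.

    Placeholders without a submitted value are left untouched so gaps stay visible.
    """
    return VAR_PATTERN.sub(lambda m: values.get(m.group(1), m.group(0)), text)


submitted = {"server_name": "DC01"}
rendered = [
    render_step_text(step.get("description", ""), submitted)
    for step in steps_payload["steps"]
]
```

Leaving unknown placeholders in place, rather than raising or blanking them, is one possible design choice; a stricter runtime could reject the execution instead.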
--- a/backend/app/api/endpoints/ai_chat.py
+++ b/backend/app/api/endpoints/ai_chat.py
@@ -390,11 +390,18 @@ async def import_tree(
    # Always create a new Tree record (no duplicate check — user may
    # want multiple copies or re-import after edits)
    metadata = session.tree_metadata or {}
+
+    # Extract intake form from metadata if present (procedural flows)
+    intake_form = None
+    if isinstance(metadata.get("intake_form"), list):
+        intake_form = metadata.get("intake_form")  # read only: metadata may alias session.tree_metadata
+
    tree = Tree(
        name=data.name or metadata.get("name", "AI-Generated Flow"),
        description=data.description or metadata.get("description", ""),
        tree_type=session.flow_type,
        tree_structure=session.working_tree,
+        intake_form=intake_form,
        author_id=current_user.id,
        account_id=current_user.account_id,
        category_id=data.category_id,
--- a/backend/app/core/ai_chat_service.py
+++ b/backend/app/core/ai_chat_service.py
@@ -15,7 +15,7 @@ from sqlalchemy import select
 from sqlalchemy.ext.asyncio import AsyncSession

 from app.core.ai_provider import get_ai_provider
-from app.core.ai_tree_validator import validate_generated_tree
+from app.core.ai_tree_validator import validate_generated_tree, validate_generated_procedural_steps
 from app.core.config import settings
 from app.models.ai_chat_session import AIChatSession

@@ -140,18 +140,139 @@ IMPORTANT:
 """


+PROCEDURAL_SCHEMA_CONTEXT = """
+PROCEDURAL STEP SCHEMA — This is what you are building:
+
+The flow is an ordered array of steps in a JSON object: {"steps": [...]}
+
+Each step has a "type" field:
+
+1. procedure_step — A concrete step the engineer performs
+   Required: id (string), type ("procedure_step"), title (string), description (string)
+   Optional:
+   - content_type ("action"|"informational"|"verification"|"warning") — default "action"
+   - estimated_minutes (number)
+   - commands (array of objects: {code: string, label?: string, language?: string}) — exact CLI/PowerShell syntax
+   - expected_outcome (string) — what success looks like
+   - verification_prompt (string) — question to confirm completion
+   - verification_type ("checkbox"|"text_input") — how the engineer confirms
+   - warning_text (string) — caution or prerequisite info
+   - notes_enabled (boolean) — allow engineer to capture notes on this step
+   - reference_url (string) — link to documentation
+
+2. section_header — Groups steps into logical phases
+   Required: id (string), type ("section_header"), title (string)
+   Section headers apply to all subsequent steps until the next section_header.
+
+3. procedure_end — Terminal marker (always the last step)
+   Required: id (string), type ("procedure_end"), title (string)
+
+STRUCTURAL RULES:
+- Steps are executed in array order (flat list, no branching)
+- All IDs must be unique descriptive slugs (e.g., "check-dns-resolution", not UUIDs)
+- The last step MUST be type "procedure_end"
+- Use section_headers to organize steps into logical phases
+- Commands are arrays of objects: [{"code": "Get-Service ADSync", "label": "Check sync service", "language": "powershell"}]
+- Descriptions support [VAR:variable_name] interpolation for intake form variables (e.g., "Connect to [VAR:server_name] via RDP")
+
+VARIABLE INTERPOLATION:
+When the procedure needs per-execution input (server name, IP address, client name, etc.), use [VAR:variable_name] syntax in descriptions and commands. These map to intake form fields that the engineer fills in before starting.
+"""
+
+PROCEDURAL_INTERVIEW_PROTOCOL = """
+INTERVIEW PHASES — Follow this progression:
+
+PHASE 1 - SCOPING (current_phase: scoping):
+Understand the process being documented:
+- What process or procedure is this flow for?
+- Who will execute it? (Tier 1 help desk, Tier 2, senior engineers?)
+- What environment context? (Specific vendor, on-prem vs cloud, tools available?)
+- Will this need per-execution input? (server name, client info, IP addresses → intake form fields)
+Demonstrate domain expertise: if the user says "Exchange Online mailbox migration," show understanding: "Are we covering full tenant-to-tenant migration, on-prem to Exchange Online cutover, or individual mailbox moves with hybrid?"
+DO NOT emit [STEPS_UPDATE] during scoping. You are still understanding the process.
+
+PHASE 2 - DISCOVERY (current_phase: discovery):
+Build the procedure step by step IN ORDER:
+- Start with prerequisites and initial verification
+- Walk through each step sequentially — ask what happens first, then next, then next
+- Suggest section headers to organize logical phases (e.g., "Pre-Flight Checks", "Migration", "Verification")
+- Capture specific commands, tools, and expected outcomes for each step
+- Identify where [VAR:variable_name] placeholders are needed
+EMIT [STEPS_UPDATE] when you and the user have agreed on concrete steps. Build progressively — emit partial step lists as you go.
+
+PHASE 3 - ENRICHMENT (current_phase: enrichment):
+Circle back to enrich existing steps:
+- Add exact PowerShell/CLI commands with full syntax
+- Add verification prompts for critical steps
+- Add warning_text for steps with risk (data loss, downtime, etc.)
+- Add estimated_minutes for time-critical procedures
+- Add expected_outcome for action steps
+- Suggest reference_url links to documentation
+- Identify missing edge cases or safety checks
+EMIT [STEPS_UPDATE] when enriching steps with additional detail.
+
+PHASE 4 - REVIEW (current_phase: review):
+Present a summary:
+- Total step count by content_type
+- Outline of sections and steps
+- List of intake form variables ([VAR:...]) used
+- Flag any steps missing commands or verification
+- Offer chance to reorder, add, or remove steps
+EMIT [STEPS_UPDATE] only if the user requests changes.
+
+TRANSITION between phases by emitting [PHASE:phase_name] when the conversation naturally moves to the next stage. You decide when enough information has been gathered for each phase.
+"""
+
+PROCEDURAL_RESPONSE_FORMAT = """
+RESPONSE FORMAT:
+
+Your response is natural conversational text. When the step structure changes, include structured markers that will be parsed by the system (the user will NOT see these markers):
+
+1. Steps update (only when structure changes — see phase rules above):
+[STEPS_UPDATE]
+{"steps": [...valid steps array...]}
+[/STEPS_UPDATE]
+
+2. Phase transition (when moving to next phase):
+[PHASE:discovery]
+
+3. Metadata capture (when you learn the flow's name, description, or tags):
+[METADATA]
+{"name": "...", "description": "...", "tags": ["..."]}
+[/METADATA]
+
+4. Intake form suggestion (when intake form fields are identified):
+[INTAKE_FORM]
+[{"variable_name": "server_name", "label": "Server Name", "field_type": "text", "required": true, "placeholder": "e.g., DC01", "group_name": "Server Details", "display_order": 1}]
+[/INTAKE_FORM]
+
+IMPORTANT:
+- Include [STEPS_UPDATE] sparingly. Only when concrete steps are established or modified.
+- The steps update should be the COMPLETE working step list, not a diff.
+- Always include conversational text OUTSIDE the markers — never respond with only markers.
+- The procedure_end step is always included as the last step.
+"""
+
+
 def _build_system_prompt(flow_type: str) -> str:
    """Assemble the full system prompt for the chat builder."""
-    flow_context = (
-        "The user wants to build a TROUBLESHOOTING flow — a diagnostic decision tree "
-        "that guides engineers through symptom identification, diagnostic checks, and "
-        "resolution steps."
-        if flow_type == "troubleshooting"
-        else "The user wants to build a PROCEDURAL flow — a step-by-step process guide "
-        "with phases, checklists, and verification steps."
-    )
-
-    return f"{ROLE_PERSONA}\n\n{flow_context}\n\n{SCHEMA_CONTEXT}\n\n{INTERVIEW_PROTOCOL}\n\n{RESPONSE_FORMAT}"
+    if flow_type in ("procedural", "maintenance"):
+        flow_context = (
+            "The user wants to build a PROCEDURAL flow — a step-by-step process guide "
+            "with ordered phases, verification checkpoints, and optional intake form variables. "
+            "This is NOT a branching decision tree — it is a flat, sequential procedure."
+        )
+        return (
+            f"{ROLE_PERSONA}\n\n{flow_context}\n\n"
+            f"{PROCEDURAL_SCHEMA_CONTEXT}\n\n{PROCEDURAL_INTERVIEW_PROTOCOL}\n\n{PROCEDURAL_RESPONSE_FORMAT}"
+        )
+    else:
+        flow_context = (
+            "The user wants to build a TROUBLESHOOTING flow — a diagnostic decision tree "
+            "that guides engineers through symptom identification, diagnostic checks, and "
+            "resolution steps."
+        )
+        return f"{ROLE_PERSONA}\n\n{flow_context}\n\n{SCHEMA_CONTEXT}\n\n{INTERVIEW_PROTOCOL}\n\n{RESPONSE_FORMAT}"


 def _strip_markdown_fences(text: str) -> str:
@@ -177,6 +298,7 @@ def _parse_ai_response(raw_response: str) -> dict[str, Any]:
        "tree_update": None,
        "phase": None,
        "metadata": None,
+        "intake_form": None,
    }

    # Extract [TREE_UPDATE]...[/TREE_UPDATE]
@@ -198,6 +320,40 @@ def _parse_ai_response(raw_response: str) -> dict[str, Any]:
            logger.warning("Truncated [TREE_UPDATE] block detected (no closing tag) — stripping from display")
            result["content"] = raw_response[: truncated_match.start()]

+    # Extract [STEPS_UPDATE]...[/STEPS_UPDATE] (procedural flows)
+    steps_match = re.search(
+        r"\[STEPS_UPDATE\]\s*([\s\S]*?)\s*\[/STEPS_UPDATE\]", result["content"]
+    )
+    if steps_match:
+        try:
+            raw_json = _strip_markdown_fences(steps_match.group(1))
+            result["tree_update"] = json.loads(raw_json)
+        except (json.JSONDecodeError, ValueError) as e:
+            logger.warning("Failed to parse steps update JSON: %s", e)
+        result["content"] = result["content"][: steps_match.start()] + result["content"][steps_match.end() :]
+    else:
+        truncated_steps = re.search(r"\[STEPS_UPDATE\][\s\S]*$", result["content"])
+        if truncated_steps:
+            logger.warning("Truncated [STEPS_UPDATE] block detected (no closing tag) — stripping from display")
+            result["content"] = result["content"][: truncated_steps.start()]
+
+    # Extract [INTAKE_FORM]...[/INTAKE_FORM] (procedural flows)
+    intake_match = re.search(
+        r"\[INTAKE_FORM\]\s*([\s\S]*?)\s*\[/INTAKE_FORM\]", result["content"]
+    )
+    if intake_match:
+        try:
+            raw_json = _strip_markdown_fences(intake_match.group(1))
+            result["intake_form"] = json.loads(raw_json)
+        except (json.JSONDecodeError, ValueError) as e:
+            logger.warning("Failed to parse intake form JSON: %s", e)
+        result["content"] = result["content"][: intake_match.start()] + result["content"][intake_match.end() :]
+    else:
+        truncated_intake = re.search(r"\[INTAKE_FORM\][\s\S]*$", result["content"])
+        if truncated_intake:
+            logger.warning("Truncated [INTAKE_FORM] block detected — stripping from display")
+            result["content"] = result["content"][: truncated_intake.start()]
+
    # Extract [PHASE:name]
    phase_match = re.search(r"\[PHASE:(\w+)\]", result["content"])
    if phase_match:
@@ -318,12 +474,19 @@ async def send_message(
    # only require valid root structure, not min node counts)
    tree_update = parsed["tree_update"]
    if tree_update:
-        if not isinstance(tree_update, dict) or tree_update.get("type") != "decision":
-            logger.warning("AI tree update rejected: root must be a decision node")
-            tree_update = None
-        elif not tree_update.get("id"):
-            logger.warning("AI tree update rejected: root node missing id")
-            tree_update = None
+        if session.flow_type in ("procedural", "maintenance"):
+            # Procedural: must be a dict with a "steps" list
+            if not isinstance(tree_update, dict) or not isinstance(tree_update.get("steps"), list):
+                logger.warning("AI steps update rejected: must be a dict with a 'steps' list")
+                tree_update = None
+        else:
+            # Troubleshooting: root must be a decision node
+            if not isinstance(tree_update, dict) or tree_update.get("type") != "decision":
+                logger.warning("AI tree update rejected: root must be a decision node")
+                tree_update = None
+            elif not tree_update.get("id"):
+                logger.warning("AI tree update rejected: root node missing id")
+                tree_update = None

    # Update session state
    history.append({"role": "assistant", "content": parsed["content"], "timestamp": now_iso})
@@ -345,6 +508,11 @@ async def send_message(
        merged.update(parsed["metadata"])
        session.tree_metadata = merged

+    if parsed.get("intake_form"):
+        merged = dict(session.tree_metadata or {})
+        merged["intake_form"] = parsed["intake_form"]
+        session.tree_metadata = merged
+
    session.updated_at = datetime.now(timezone.utc)

    return parsed["content"], tree_update, parsed["phase"], parsed["metadata"]
@@ -367,7 +535,33 @@ async def generate_final_tree(
        for msg in session.conversation_history
    ]

-    generation_instruction = """Based on our entire conversation, generate the COMPLETE and FINAL TreeStructure JSON for this flow.
+    if session.flow_type in ("procedural", "maintenance"):
+        generation_instruction = """Based on our entire conversation, generate the COMPLETE and FINAL procedural steps JSON for this flow.
+
+Requirements:
+- Output format: {"steps": [...]} — a JSON object with a "steps" array
+- Include ALL steps, section headers, and details we discussed
+- Use descriptive step IDs (slugs, not UUIDs)
+- Steps are in execution order (flat list, no branching)
+- Use section_header steps to organize into logical phases
+- Every procedure_step should have commands with exact syntax where discussed
+- Every procedure_step should have expected_outcome and verification_prompt where discussed
+- Include content_type, estimated_minutes, warning_text, and reference_url where discussed
+- Use [VAR:variable_name] syntax in descriptions/commands for intake form variables
+- The LAST step MUST be type "procedure_end"
+- Respond with ONLY the JSON — no conversational text, no markdown fences
+
+Also provide metadata as a separate JSON object after the steps:
+[METADATA]
+{"name": "...", "description": "...", "tags": ["..."]}
+[/METADATA]
+
+If we discussed intake form fields, also include:
+[INTAKE_FORM]
+[{"variable_name": "server_name", "label": "Server Name", "field_type": "text", "required": true, "placeholder": "e.g., DC01", "group_name": "Server Details", "display_order": 1}]
+[/INTAKE_FORM]"""
+    else:
+        generation_instruction = """Based on our entire conversation, generate the COMPLETE and FINAL TreeStructure JSON for this flow.

 Requirements:
 - Include ALL branches, steps, and solutions we discussed
@@ -421,21 +615,30 @@ Also provide metadata as a separate JSON object after the tree:
                continue
            raise ValueError("AI failed to produce valid JSON after retry")

-        errors = validate_generated_tree(tree)
-        if errors:
+        if session.flow_type in ("procedural", "maintenance"):
+            val_errors = validate_generated_procedural_steps(tree)
+        else:
+            val_errors = validate_generated_tree(tree)
+
+        if val_errors:
            if attempt == 0:
                provider_messages.append({"role": "assistant", "content": response_text})
                correction = (
-                    f"The tree has validation errors: {'; '.join(errors)}. "
+                    f"The generated structure has validation errors: {'; '.join(val_errors)}. "
                    "Please fix these issues and respond with the corrected JSON only."
                )
                provider_messages.append({"role": "user", "content": correction})
                continue
-            raise ValueError(f"Generated tree failed validation: {'; '.join(errors)}")
+            raise ValueError(f"Generated structure failed validation: {'; '.join(val_errors)}")

        # Success
        session.working_tree = tree
        session.tree_metadata = metadata
+        if parsed.get("intake_form"):
+            merged = dict(session.tree_metadata or {})
+            merged["intake_form"] = parsed["intake_form"]
+            session.tree_metadata = merged
+            metadata = session.tree_metadata
        session.current_phase = "generation"
        session.updated_at = datetime.now(timezone.utc)
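
The `[STEPS_UPDATE]` and `[INTAKE_FORM]` branches in `_parse_ai_response` follow the same pattern: find a closed block, parse its JSON, strip it from the visible text, and fall back to stripping a truncated block. A minimal sketch of that pattern as a single helper is shown below. The name `extract_marker_block` is hypothetical, and the sketch skips the `_strip_markdown_fences` call because that function's body is not shown in this diff.

```python
import json
import logging
import re
from typing import Any

logger = logging.getLogger(__name__)


def extract_marker_block(content: str, marker: str) -> tuple[Any, str]:
    """Hypothetical helper: return (parsed_json_or_None, content_without_block).

    Handles both a closed [MARKER]...[/MARKER] block and a truncated block
    that has an opening tag but no closing tag.
    """
    name = re.escape(marker)
    match = re.search(rf"\[{name}\]\s*([\s\S]*?)\s*\[/{name}\]", content)
    if match:
        parsed = None
        try:
            parsed = json.loads(match.group(1))
        except json.JSONDecodeError as exc:
            logger.warning("Failed to parse %s JSON: %s", marker, exc)
        # Remove the block from the text whether or not the JSON parsed.
        return parsed, content[: match.start()] + content[match.end():]

    truncated = re.search(rf"\[{name}\][\s\S]*$", content)
    if truncated:
        logger.warning("Truncated [%s] block detected (no closing tag)", marker)
        return None, content[: truncated.start()]

    return None, content


reply = 'Here are the steps.\n[STEPS_UPDATE]\n{"steps": []}\n[/STEPS_UPDATE]\nAnything else?'
steps, visible = extract_marker_block(reply, "STEPS_UPDATE")
```

With a helper like this, each marker would need one call instead of a copied branch, at the cost of one more indirection when reading the parser.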

--- a/backend/app/core/ai_tree_validator.py
+++ b/backend/app/core/ai_tree_validator.py
@@ -230,3 +230,96 @@ def count_tree_stats(tree: dict[str, Any]) -> dict[str, int]:

    _count(tree, 1)
    return stats
+
+
+# --- Procedural flow validation ---
+
+VALID_PROCEDURAL_STEP_TYPES = {"procedure_step", "procedure_end", "section_header"}
+VALID_CONTENT_TYPES = {"action", "informational", "verification", "warning"}
+
+
+def validate_generated_procedural_steps(tree: dict[str, Any]) -> list[str]:
+    """Validate an AI-generated procedural step array.
+
+    Expects a dict with a 'steps' key containing a list of step objects.
+    Returns a list of error strings. Empty list means valid.
+    """
+    errors: list[str] = []
+
+    if not isinstance(tree, dict):
+        return ["Procedural flow must be a JSON object"]
+
+    steps = tree.get("steps")
+    if not isinstance(steps, list) or len(steps) == 0:
+        return ["Procedural flow must have a non-empty 'steps' array"]
+
+    if len(steps) > 100:
+        errors.append(
+            f"Procedural flow has {len(steps)} steps. Maximum 100 allowed."
+        )
+
+    all_ids: set[str] = set()
+    procedure_step_count = 0
+    procedure_end_count = 0
+
+    for i, step in enumerate(steps):
+        if not isinstance(step, dict):
+            errors.append(f"Step at index {i} is not an object")
+            continue
+
+        # Check required fields
+        step_id = step.get("id")
+        step_type = step.get("type")
+        step_title = step.get("title")
+
+        if not step_id or not isinstance(step_id, str):
+            errors.append(f"Step at index {i} missing or invalid 'id' (must be a string)")
+        elif step_id in all_ids:
+            errors.append(f"Duplicate step ID: '{step_id}'")
+        else:
+            all_ids.add(step_id)
+
+        if not step_type or step_type not in VALID_PROCEDURAL_STEP_TYPES:
+            errors.append(
+                f"Step '{step_id or f'index {i}'}' has invalid type '{step_type}'. "
+                f"Must be one of: {', '.join(sorted(VALID_PROCEDURAL_STEP_TYPES))}"
+            )
+        else:
+            if step_type == "procedure_step":
+                procedure_step_count += 1
+            elif step_type == "procedure_end":
+                procedure_end_count += 1
+
+        if not step_title or not isinstance(step_title, str):
+            errors.append(f"Step '{step_id or f'index {i}'}' missing or invalid 'title' (must be a string)")
+
+        # Validate content_type if present
+        content_type = step.get("content_type")
+        if content_type is not None and content_type not in VALID_CONTENT_TYPES:
+            errors.append(
+                f"Step '{step_id or f'index {i}'}' has invalid content_type '{content_type}'. "
+                f"Must be one of: {', '.join(sorted(VALID_CONTENT_TYPES))}"
+            )
+
+    # Must have exactly one procedure_end as the last step
+    if procedure_end_count == 0:
+        errors.append("Procedural flow must have exactly one 'procedure_end' step")
+    elif procedure_end_count > 1:
+        errors.append(
+            f"Procedural flow has {procedure_end_count} 'procedure_end' steps. "
+            "Must have exactly one."
+        )
+    else:
+        # Exactly one — check it's the last step
+        last_step = steps[-1]
+        if isinstance(last_step, dict) and last_step.get("type") != "procedure_end":
+            errors.append("The 'procedure_end' step must be the last step in the array")
+
+    # Need at least 2 procedure_step items
+    if procedure_step_count < 2:
+        errors.append(
+            f"Procedural flow has only {procedure_step_count} 'procedure_step' items. "
+            "Need at least 2 for a useful procedure."
+        )
+
+    return errors
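
The validator above checks step structure but does not compare `[VAR:variable_name]` placeholders with the intake form. A possible complementary check is sketched below, assuming placeholders appear only in step descriptions and command `code` strings as the prompt describes. The function name `find_undeclared_variables` is hypothetical and not part of this commit.

```python
import re
from typing import Any

VAR_PATTERN = re.compile(r"\[VAR:(\w+)\]")


def find_undeclared_variables(
    steps: list[dict[str, Any]], intake_form: list[dict[str, Any]]
) -> set[str]:
    """Hypothetical check: return variables used in steps but missing from the form."""
    declared = {
        field.get("variable_name") for field in intake_form if isinstance(field, dict)
    }

    used: set[str] = set()
    for step in steps:
        if not isinstance(step, dict):
            continue
        # Collect the text fields where the prompt allows placeholders.
        texts = [step.get("description") or ""]
        for command in step.get("commands") or []:
            if isinstance(command, dict):
                texts.append(command.get("code") or "")
        for text in texts:
            if isinstance(text, str):
                used.update(VAR_PATTERN.findall(text))

    return used - declared


example_steps = [
    {
        "id": "ping-host",
        "type": "procedure_step",
        "title": "Ping the host",
        "description": "Connect to [VAR:server_name]",
        "commands": [{"code": "ping [VAR:ip_address]"}],
    }
]
example_form = [{"variable_name": "server_name"}]
missing = find_undeclared_variables(example_steps, example_form)
```

A non-empty result could be reported as one more error string in the retry prompt, so the model is asked to either declare the field or drop the placeholder.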