fix(ai): full-sweep audit — placeholders only in system prompts + CI guardrail

The "AI parrots example content from system prompt" bug bit us twice in one day across two different prompt sites. Patching individual prompts is treating the symptom; this commit makes the rule structural.

Audit + sanitize:
- assistant_chat_service.ASSISTANT_SYSTEM_PROMPT: already cleaned in prior commits, but the [FORK] schema still had literal "Brief reason" / "Short name" / "One sentence" placeholders. Replaced with <angle-bracket> placeholders. Anti-parrot rule itself rewritten to describe the failure mode abstractly instead of naming "jsmith" so the rule no longer trips the guardrail (and so the model doesn't see "jsmith" as a token at all).
- ai_chat_service.py: removed three concrete-example offenders: "Get-Service ADSync" command literal, the "DC01 server_name" intake form payload (in two places), and the inline interview demos using "Azure AD Sync failures" / "Exchange Online mailbox migration". Replaced with technology-neutral schema descriptions.
- ai_tree_generator_service.BRANCH_DETAIL_SYSTEM_PROMPT: replaced the fully-fleshed DNS troubleshooting tree (with literal Dnscache / ipconfig / google.com / Start-Service) with a placeholder schema showing only ID-linkage shape.
- kb_conversion_service.PROCEDURAL_SYSTEM_PROMPT: replaced the worked Server Manager + DC01 example payload with a placeholder schema.

Guardrail (tests/test_prompt_anti_parrot.py):
- Imports every module under app/services/ and app/core/ and walks every uppercase string constant ending in _PROMPT, _SCHEMA, _PROTOCOL, _FORMAT, or _CONTEXT.
- test 1: known-leaked-token list (jsmith, DC01, ADSync, Dnscache, google.com, "Outlook keeps", "Teams drops") must not appear in any prompt constant. Add to the list when a new leak shows up in prod; the list IS the audit trail.
- test 2: marker blocks ([QUESTIONS], [ACTIONS], [SUGGEST_FIX], etc.) must contain placeholders only. Distinguishes JSON keys (followed by ':', allowed) from JSON values (followed by ',' / ']' / '}', must be <placeholder>); allows pipe-separated enum types (text|password|select) and a small set of fixed enum values (question, diagnostic_check, decision, action, ...).

Verified by feeding the test a known-bad block; it was caught correctly.

Documented the rule in CLAUDE.md → AI / FlowPilot lessons, naming the test as the enforcement point so future contributors know how to extend it (add to the known-leaked list when a new leak surfaces).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
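
Reviewer note: tests/test_prompt_anti_parrot.py is not shown on this page, so the following is only a minimal sketch of the two checks as the message describes them, not the committed code. Every name here (prompt_constants, leaked_tokens, literal_values_in_marker_blocks, is_placeholder) is hypothetical, the case-insensitive token match is an assumption, and the module walk is reduced to a plain namespace dict such as vars(module).

```python
import re

# Suffixes and tokens are taken from the commit message.
PROMPT_SUFFIXES = ("_PROMPT", "_SCHEMA", "_PROTOCOL", "_FORMAT", "_CONTEXT")
KNOWN_LEAKED_TOKENS = [
    "jsmith", "DC01", "ADSync", "Dnscache", "google.com",
    "Outlook keeps", "Teams drops",
]
# Only the fixed enum values the message names; the real set is longer.
ALLOWED_ENUM_VALUES = {"question", "diagnostic_check", "decision", "action"}

# [MARKER] ... [/MARKER] blocks, and quoted strings plus the character after them.
MARKER_BLOCK = re.compile(r"\[([A-Z_]+)\](.*?)\[/\1\]", re.DOTALL)
QUOTED = re.compile(r'"([^"]*)"\s*([:,\]}])')


def prompt_constants(namespace):
    """Yield (name, value) for uppercase string constants with a prompt-like suffix."""
    for name, value in namespace.items():
        if isinstance(value, str) and name.isupper() and name.endswith(PROMPT_SUFFIXES):
            yield name, value


def leaked_tokens(text):
    """Test 1: return known-leaked tokens found in text (case-insensitive by assumption)."""
    lowered = text.lower()
    return [token for token in KNOWN_LEAKED_TOKENS if token.lower() in lowered]


def is_placeholder(value):
    """Accept <angle-bracket> placeholders, fixed enum values, and pipe-separated enums."""
    if value.startswith("<") and value.endswith(">"):
        return True
    if value in ALLOWED_ENUM_VALUES:
        return True
    parts = value.split("|")
    return len(parts) > 1 and all(re.fullmatch(r"[a-z_]+", part) for part in parts)


def literal_values_in_marker_blocks(text):
    """Test 2: return (marker, value) pairs for JSON values that are not placeholders."""
    offenders = []
    for block in MARKER_BLOCK.finditer(text):
        for value, follower in QUOTED.findall(block.group(2)):
            # A quoted string followed by ':' is a JSON key, which is allowed.
            if follower != ":" and not is_placeholder(value):
                offenders.append((block.group(1), value))
    return offenders


BAD_BLOCK = '[INTAKE_FORM]\n[{"variable_name": "server_name", "placeholder": "e.g., DC01"}]\n[/INTAKE_FORM]'
GOOD_BLOCK = '[METADATA]\n{"name": "<flow name>", "tags": ["<tag1>", "<tag2>"]}\n[/METADATA]'
```

Under these assumptions, a pytest wrapper would loop prompt_constants(vars(module)) over each imported service module and assert that both helpers return an empty list for every constant. BAD_BLOCK mirrors the removed intake-form payload in this diff and GOOD_BLOCK mirrors its replacement.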
2026-04-22 02:09:30 -04:00
parent 50215b9110
commit d0ebdef9e8
6 changed files with 223 additions and 54 deletions
--- a/backend/app/core/ai_chat_service.py
+++ b/backend/app/core/ai_chat_service.py
@@ -40,7 +40,7 @@ CRITICAL BEHAVIORS:
 - Act as a senior engineer, not a chatbot. Use your domain knowledge to SUGGEST diagnostic steps, not just record what the user says.
 - When the user describes a problem area, demonstrate understanding by naming specific sub-categories, common causes, and relevant tools.
 - Challenge assumptions constructively: "Before we go down that path, have you considered checking X first? In my experience, that resolves 60% of these cases."
-- Capture SPECIFIC commands with exact syntax. Not "check the service" but "Get-Service ADSync | Select-Object Status, StartType".
+- Capture SPECIFIC commands with exact syntax (PowerShell/CLI invocations the engineer would actually paste into a shell), not vague directives like "check the service".
 - Include expected outcomes for every action: what does success look like?
 - Surface edge cases proactively: "What about multi-forest environments?" or "Does this change if they have conditional access policies?"
 - Explain WHY the diagnostic order matters: "We check connectivity before auth because a network issue masquerades as an auth failure."
@@ -74,7 +74,7 @@ STRUCTURAL RULES:
 - All IDs must be unique strings (use descriptive slugs like "check-service-status")

 CROSS-REFERENCE / LOOP-BACK PATTERN:
-When a troubleshooting path needs to loop back (e.g., after remediation, re-verify from an earlier checkpoint), set next_node_id to the target node's ID. Example: an action node "restart-ssh-service" can set next_node_id to "verify-ssh-connection" (an ancestor decision node) to create a re-verification loop.
+When a troubleshooting path needs to loop back (e.g., after remediation, re-verify from an earlier checkpoint), set next_node_id to the target node's ID — including ancestor decision nodes for re-verification loops. The target ID must already exist somewhere in the tree.
 """

 INTERVIEW_PROTOCOL = """
@@ -85,7 +85,7 @@ Ask broad questions to understand the problem domain and scope:
 - What type of issue is this flow for?
 - Who is the target audience? (Tier 1 help desk, Tier 2, Tier 3?)
 - What environment assumptions? (On-prem, hybrid, specific vendors?)
-Demonstrate domain expertise immediately. If the user says "Azure AD Sync failures," show understanding: "Are you primarily seeing password hash sync issues, object attribute sync failures, or full directory sync errors?"
+Demonstrate domain expertise immediately. When the user names a technology, ask a follow-up that proves you know its common failure modes — a sub-categorization question that only someone fluent in that area would think to ask. Use vocabulary native to whatever the user actually mentioned, not stock examples from past conversations.
 DO NOT emit [TREE_UPDATE] during scoping. You are still understanding the problem.

 PHASE 2 - DISCOVERY (current_phase: discovery):
@@ -130,7 +130,7 @@ Your response is natural conversational text. When the tree structure changes, i

 3. Metadata capture (when you learn the flow's name, description, or tags):
 [METADATA]
-{"name": "...", "description": "...", "tags": ["..."]}
+{"name": "<flow name>", "description": "<one-sentence summary>", "tags": ["<tag1>", "<tag2>"]}
 [/METADATA]

 IMPORTANT:
@@ -172,8 +172,8 @@ STRUCTURAL RULES:
 - All IDs must be unique descriptive slugs (e.g., "check-dns-resolution", not UUIDs)
 - The last step MUST be type "procedure_end"
 - Use section_headers to organize steps into logical phases
-- Commands are arrays of objects: [{"code": "Get-Service ADSync", "label": "Check sync service", "language": "powershell"}]
-- Descriptions support [VAR:variable_name] interpolation for intake form variables (e.g., "Connect to [VAR:server_name] via RDP")
+- Commands are arrays of objects: [{"code": "<exact command>", "label": "<short label>", "language": "powershell|bash|cmd"}]
+- Descriptions support [VAR:variable_name] interpolation for intake form variables. Pick variable names that fit the procedure being built — do not reuse names from prior conversations.

 VARIABLE INTERPOLATION:
 When the procedure needs per-execution input (server name, IP address, client name, etc.), use [VAR:variable_name] syntax in descriptions and commands. These map to intake form fields that the engineer fills in before starting.
@@ -188,7 +188,7 @@ Understand the process being documented:
 - Who will execute it? (Tier 1 help desk, Tier 2, senior engineers?)
 - What environment context? (Specific vendor, on-prem vs cloud, tools available?)
 - Will this need per-execution input? (server name, client info, IP addresses → intake form fields)
-Demonstrate domain expertise: if the user says "Exchange Online mailbox migration," show understanding: "Are we covering full tenant-to-tenant migration, on-prem to Exchange Online cutover, or individual mailbox moves with hybrid?"
+Demonstrate domain expertise: when the user names a process, ask a sub-categorization question that distinguishes which variant of that process they mean (the variants will differ by technology — use vocabulary specific to whatever the user mentioned, not examples from prior chats).
 DO NOT emit [STEPS_UPDATE] during scoping. You are still understanding the process.

 PHASE 2 - DISCOVERY (current_phase: discovery):
@@ -238,12 +238,12 @@ Your response is natural conversational text. When the step structure changes, i

 3. Metadata capture (when you learn the flow's name, description, or tags):
 [METADATA]
-{"name": "...", "description": "...", "tags": ["..."]}
+{"name": "<flow name>", "description": "<one-sentence summary>", "tags": ["<tag1>", "<tag2>"]}
 [/METADATA]

 4. Intake form suggestion (when intake form fields are identified):
 [INTAKE_FORM]
-[{"variable_name": "server_name", "label": "Server Name", "field_type": "text", "required": true, "placeholder": "e.g., DC01", "group_name": "Server Details", "display_order": 1}]
+[{"variable_name": "<snake_case_name>", "label": "<Human Label>", "field_type": "text|password|select|textarea|number|boolean", "required": true|false, "placeholder": "<short hint, optional>", "group_name": "<section heading, optional>", "display_order": <integer>}]
 [/INTAKE_FORM]

 IMPORTANT:
@@ -659,12 +659,12 @@ Requirements:

 Also provide metadata as a separate JSON object after the steps:
 [METADATA]
-{"name": "...", "description": "...", "tags": ["..."]}
+{"name": "<flow name>", "description": "<one-sentence summary>", "tags": ["<tag1>", "<tag2>"]}
 [/METADATA]

 If we discussed intake form fields, also include:
 [INTAKE_FORM]
-[{"variable_name": "server_name", "label": "Server Name", "field_type": "text", "required": true, "placeholder": "e.g., DC01", "group_name": "Server Details", "display_order": 1}]
+[{"variable_name": "<snake_case_name>", "label": "<Human Label>", "field_type": "text|password|select|textarea|number|boolean", "required": true|false, "placeholder": "<short hint, optional>", "group_name": "<section heading, optional>", "display_order": <integer>}]
 [/INTAKE_FORM]"""
    else:
        generation_instruction = """Based on our entire conversation, generate the COMPLETE and FINAL TreeStructure JSON for this flow.
@@ -681,7 +681,7 @@ Requirements:

 Also provide metadata as a separate JSON object after the tree:
 [METADATA]
-{"name": "...", "description": "...", "tags": ["..."]}
+{"name": "<flow name>", "description": "<one-sentence summary>", "tags": ["<tag1>", "<tag2>"]}
 [/METADATA]"""

    provider_messages.append({"role": "user", "content": generation_instruction})
--- a/backend/app/core/ai_tree_generator_service.py
+++ b/backend/app/core/ai_tree_generator_service.py
@@ -89,8 +89,10 @@ Additional rules:
 5. Use unique node IDs prefixed with the branch context (e.g., "gpo-check-link")
 6. Build the tree bottom-up in your head: create solution/leaf nodes first, then build parent nodes referencing their IDs

-Few-shot example showing correct action node next_node_id usage:
-{"id": "dns-root", "type": "decision", "question": "Can the client resolve any DNS names?", "help_text": "Run: nslookup google.com", "options": [{"id": "dns-opt-none", "label": "No — nslookup times out or returns 'server failed'", "next_node_id": "dns-check-service"}, {"id": "dns-opt-partial", "label": "Some names resolve but others fail", "next_node_id": "dns-check-specific"}], "children": [{"id": "dns-check-service", "type": "action", "title": "Check DNS Client Service", "description": "Verify the DNS Client service is running on the affected machine", "commands": ["Get-Service -Name Dnscache | Select-Object Status,StartType"], "expected_outcome": "Status should be Running", "next_node_id": "dns-service-solution"}, {"id": "dns-service-solution", "type": "solution", "title": "DNS Service Was Stopped", "description": "The DNS Client service was stopped, preventing all name resolution", "resolution_steps": ["Run: Start-Service Dnscache", "Set startup type: Set-Service Dnscache -StartupType Automatic", "Flush cache: ipconfig /flushdns", "Test: nslookup google.com"]}, {"id": "dns-check-specific", "type": "solution", "title": "Selective DNS Failure — Stale or Missing Records", "description": "Some records resolve correctly, indicating DNS is functional but specific records are stale or missing", "resolution_steps": ["Check DNS server for missing A/CNAME records", "Clear DNS cache on the DNS server: Clear-DnsServerCache", "Flush client cache: ipconfig /flushdns", "Verify with: nslookup <failing-hostname>"]}]}"""
+SHAPE-ONLY schema example (do not copy this content verbatim — it shows
+how IDs link, NOT what to ask or run; your real tree must reflect the
+branch the user described):
+{"id": "<root-slug>", "type": "decision", "question": "<diagnostic question for THIS branch>", "help_text": "<optional hint>", "options": [{"id": "<opt-1>", "label": "<observable answer 1>", "next_node_id": "<child-1>"}, {"id": "<opt-2>", "label": "<observable answer 2>", "next_node_id": "<child-2>"}], "children": [{"id": "<child-1>", "type": "action", "title": "<what to do>", "description": "<details>", "commands": ["<exact command for THIS branch>"], "expected_outcome": "<what success looks like>", "next_node_id": "<sibling-id>"}, {"id": "<sibling-id>", "type": "solution", "title": "<resolution title>", "description": "<resolution description>", "resolution_steps": ["<step 1>", "<step 2>"]}, {"id": "<child-2>", "type": "solution", "title": "<other resolution>", "description": "<...>", "resolution_steps": ["<step 1>"]}]}"""


 CORRECTIVE_PROMPT_TEMPLATE = """Your previous JSON was invalid for ResolutionFlow's tree schema.
--- a/backend/app/core/kb_conversion_service.py
+++ b/backend/app/core/kb_conversion_service.py
@@ -153,48 +153,29 @@ Identify values that would change between executions (server names, IPs, usernam

 ## Output Format

-Return a JSON object:
+Return a JSON object with this SHAPE (DO NOT copy the placeholders below
+verbatim — fill each field with content derived from the actual KB article
+the engineer attached, NOT from this schema):
 ```json
 {
-  "title": "Procedure title derived from the article",
-  "description": "Brief description of what this procedure accomplishes",
+  "title": "<procedure title derived from the article>",
+  "description": "<brief description of what this procedure accomplishes>",
  "steps": [
    {
-      "id": "unique-step-id",
-      "type": "step",
-      "content": "Open Server Manager and navigate to Add Roles on [VAR:server_name]",
-      "confidence": 0.95,
-      "source_excerpt": "Step 1: Open Server Manager on DC01..."
-    },
-    {
-      "id": "warning-dns",
-      "type": "warning",
-      "content": "WARNING: This will restart DNS and cause brief connectivity loss",
-      "confidence": 0.90,
-      "source_excerpt": "Note: Restarting DNS will cause a brief outage"
-    },
-    {
-      "id": "section-verification",
-      "type": "section_header",
-      "content": "Verification Steps",
-      "confidence": 1.0,
-      "source_excerpt": "Verification"
+      "id": "<unique-kebab-case-id>",
+      "type": "step|warning|section_header",
+      "content": "<step body — may include [VAR:<your_variable>] interpolation>",
+      "confidence": <float 0.0-1.0>,
+      "source_excerpt": "<the verbatim sentence/phrase from the article that this step came from>"
    }
  ],
  "intake_form": [
    {
-      "variable_name": "server_name",
-      "label": "Server Name",
-      "field_type": "text",
-      "required": true,
-      "display_order": 1
-    },
-    {
-      "variable_name": "ip_address",
-      "label": "IP Address",
-      "field_type": "text",
-      "required": true,
-      "display_order": 2
+      "variable_name": "<snake_case_name fitting THIS procedure>",
+      "label": "<Human Label>",
+      "field_type": "text|password|select|textarea|number|boolean",
+      "required": true|false,
+      "display_order": <integer>
    }
  ]
 }