2026-04-25 06:02:14 +00:00
6 changed files with 223 additions and 54 deletions
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -179,6 +179,7 @@ python -m scripts.seed_trees                                        # seed (from
 - **Model tier routing:** `settings.get_model_for_action(action_type)`. Always alias form (`claude-sonnet-4-6`).
 - **FlowPilot must ask GUI-vs-script before suggesting either** when both are viable — see `FLOWPILOT_SYSTEM_PROMPT` in `flowpilot_engine.py`.
 - **Telemetry events to grep:** `anthropic.cache` (prompt-cache hit/create), `mcp.turn` (per-turn MCP availability), `mcp.fallback` (MCP silent-retry fired).
+- **Don't put literal payloads in system prompts.** Bit us twice in one day: a worked `[QUESTIONS]` example with literal "Outlook + jsmith" content, and a full DNS troubleshooting tree, both caused Claude to recite that content on unrelated tickets — the symptom looked like task-lane state leaking across chats. The fix is structural: every output example in a system prompt uses `<placeholder>` syntax (`{"text": "<one short, specific question>"}`), never literal field values. Real-looking format examples live in few-shot messages (separate file, separate code path), not system prompts. Guardrail: `tests/test_prompt_anti_parrot.py` scans every `*_PROMPT`/`*_SCHEMA`/`*_PROTOCOL`/`*_FORMAT` constant in `app/services/` and `app/core/`; CI fails when a marker block contains a literal JSON value or when a known leaked token (jsmith, DC01, ADSync, Dnscache, etc.) appears anywhere in a prompt.

 ### Frontend / UI

--- a/backend/app/core/ai_chat_service.py
+++ b/backend/app/core/ai_chat_service.py
@@ -40,7 +40,7 @@ CRITICAL BEHAVIORS:
 - Act as a senior engineer, not a chatbot. Use your domain knowledge to SUGGEST diagnostic steps, not just record what the user says.
 - When the user describes a problem area, demonstrate understanding by naming specific sub-categories, common causes, and relevant tools.
 - Challenge assumptions constructively: "Before we go down that path, have you considered checking X first? In my experience, that resolves 60% of these cases."
- Capture SPECIFIC commands with exact syntax. Not "check the service" but "Get-Service ADSync | Select-Object Status, StartType".
+- Capture SPECIFIC commands with exact syntax (PowerShell/CLI invocations the engineer would actually paste into a shell), not vague directives like "check the service".
 - Include expected outcomes for every action: what does success look like?
 - Surface edge cases proactively: "What about multi-forest environments?" or "Does this change if they have conditional access policies?"
 - Explain WHY the diagnostic order matters: "We check connectivity before auth because a network issue masquerades as an auth failure."
@@ -74,7 +74,7 @@ STRUCTURAL RULES:
 - All IDs must be unique strings (use descriptive slugs like "check-service-status")

 CROSS-REFERENCE / LOOP-BACK PATTERN:
-When a troubleshooting path needs to loop back (e.g., after remediation, re-verify from an earlier checkpoint), set next_node_id to the target node's ID. Example: an action node "restart-ssh-service" can set next_node_id to "verify-ssh-connection" (an ancestor decision node) to create a re-verification loop.
+When a troubleshooting path needs to loop back (e.g., after remediation, re-verify from an earlier checkpoint), set next_node_id to the target node's ID — including ancestor decision nodes for re-verification loops. The target ID must already exist somewhere in the tree.
 """

 INTERVIEW_PROTOCOL = """
@@ -85,7 +85,7 @@ Ask broad questions to understand the problem domain and scope:
 - What type of issue is this flow for?
 - Who is the target audience? (Tier 1 help desk, Tier 2, Tier 3?)
 - What environment assumptions? (On-prem, hybrid, specific vendors?)
-Demonstrate domain expertise immediately. If the user says "Azure AD Sync failures," show understanding: "Are you primarily seeing password hash sync issues, object attribute sync failures, or full directory sync errors?"
+Demonstrate domain expertise immediately. When the user names a technology, ask a follow-up that proves you know its common failure modes — a sub-categorization question that only someone fluent in that area would think to ask. Use vocabulary native to whatever the user actually mentioned, not stock examples from past conversations.
 DO NOT emit [TREE_UPDATE] during scoping. You are still understanding the problem.

 PHASE 2 - DISCOVERY (current_phase: discovery):
@@ -130,7 +130,7 @@ Your response is natural conversational text. When the tree structure changes, i

 3. Metadata capture (when you learn the flow's name, description, or tags):
 [METADATA]
-{"name": "...", "description": "...", "tags": ["..."]}
+{"name": "<flow name>", "description": "<one-sentence summary>", "tags": ["<tag1>", "<tag2>"]}
 [/METADATA]

 IMPORTANT:
@@ -172,8 +172,8 @@ STRUCTURAL RULES:
 - All IDs must be unique descriptive slugs (e.g., "check-dns-resolution", not UUIDs)
 - The last step MUST be type "procedure_end"
 - Use section_headers to organize steps into logical phases
- Commands are arrays of objects: [{"code": "Get-Service ADSync", "label": "Check sync service", "language": "powershell"}]
- Descriptions support [VAR:variable_name] interpolation for intake form variables (e.g., "Connect to [VAR:server_name] via RDP")
+- Commands are arrays of objects: [{"code": "<exact command>", "label": "<short label>", "language": "powershell|bash|cmd"}]
+- Descriptions support [VAR:variable_name] interpolation for intake form variables. Pick variable names that fit the procedure being built — do not reuse names from prior conversations.

 VARIABLE INTERPOLATION:
 When the procedure needs per-execution input (server name, IP address, client name, etc.), use [VAR:variable_name] syntax in descriptions and commands. These map to intake form fields that the engineer fills in before starting.
@@ -188,7 +188,7 @@ Understand the process being documented:
 - Who will execute it? (Tier 1 help desk, Tier 2, senior engineers?)
 - What environment context? (Specific vendor, on-prem vs cloud, tools available?)
 - Will this need per-execution input? (server name, client info, IP addresses → intake form fields)
-Demonstrate domain expertise: if the user says "Exchange Online mailbox migration," show understanding: "Are we covering full tenant-to-tenant migration, on-prem to Exchange Online cutover, or individual mailbox moves with hybrid?"
+Demonstrate domain expertise: when the user names a process, ask a sub-categorization question that distinguishes which variant of that process they mean (the variants will differ by technology — use vocabulary specific to whatever the user mentioned, not examples from prior chats).
 DO NOT emit [STEPS_UPDATE] during scoping. You are still understanding the process.

 PHASE 2 - DISCOVERY (current_phase: discovery):
@@ -238,12 +238,12 @@ Your response is natural conversational text. When the step structure changes, i

 3. Metadata capture (when you learn the flow's name, description, or tags):
 [METADATA]
-{"name": "...", "description": "...", "tags": ["..."]}
+{"name": "<flow name>", "description": "<one-sentence summary>", "tags": ["<tag1>", "<tag2>"]}
 [/METADATA]

 4. Intake form suggestion (when intake form fields are identified):
 [INTAKE_FORM]
-[{"variable_name": "server_name", "label": "Server Name", "field_type": "text", "required": true, "placeholder": "e.g., DC01", "group_name": "Server Details", "display_order": 1}]
+[{"variable_name": "<snake_case_name>", "label": "<Human Label>", "field_type": "text|password|select|textarea|number|boolean", "required": true|false, "placeholder": "<short hint, optional>", "group_name": "<section heading, optional>", "display_order": <integer>}]
 [/INTAKE_FORM]

 IMPORTANT:
@@ -659,12 +659,12 @@ Requirements:

 Also provide metadata as a separate JSON object after the steps:
 [METADATA]
-{"name": "...", "description": "...", "tags": ["..."]}
+{"name": "<flow name>", "description": "<one-sentence summary>", "tags": ["<tag1>", "<tag2>"]}
 [/METADATA]

 If we discussed intake form fields, also include:
 [INTAKE_FORM]
-[{"variable_name": "server_name", "label": "Server Name", "field_type": "text", "required": true, "placeholder": "e.g., DC01", "group_name": "Server Details", "display_order": 1}]
+[{"variable_name": "<snake_case_name>", "label": "<Human Label>", "field_type": "text|password|select|textarea|number|boolean", "required": true|false, "placeholder": "<short hint, optional>", "group_name": "<section heading, optional>", "display_order": <integer>}]
 [/INTAKE_FORM]"""
    else:
        generation_instruction = """Based on our entire conversation, generate the COMPLETE and FINAL TreeStructure JSON for this flow.
@@ -681,7 +681,7 @@ Requirements:

 Also provide metadata as a separate JSON object after the tree:
 [METADATA]
-{"name": "...", "description": "...", "tags": ["..."]}
+{"name": "<flow name>", "description": "<one-sentence summary>", "tags": ["<tag1>", "<tag2>"]}
 [/METADATA]"""

    provider_messages.append({"role": "user", "content": generation_instruction})
--- a/backend/app/core/ai_tree_generator_service.py
+++ b/backend/app/core/ai_tree_generator_service.py
@@ -89,8 +89,10 @@ Additional rules:
 5. Use unique node IDs prefixed with the branch context (e.g., "gpo-check-link")
 6. Build the tree bottom-up in your head: create solution/leaf nodes first, then build parent nodes referencing their IDs

-Few-shot example showing correct action node next_node_id usage:
-{"id": "dns-root", "type": "decision", "question": "Can the client resolve any DNS names?", "help_text": "Run: nslookup google.com", "options": [{"id": "dns-opt-none", "label": "No — nslookup times out or returns 'server failed'", "next_node_id": "dns-check-service"}, {"id": "dns-opt-partial", "label": "Some names resolve but others fail", "next_node_id": "dns-check-specific"}], "children": [{"id": "dns-check-service", "type": "action", "title": "Check DNS Client Service", "description": "Verify the DNS Client service is running on the affected machine", "commands": ["Get-Service -Name Dnscache | Select-Object Status,StartType"], "expected_outcome": "Status should be Running", "next_node_id": "dns-service-solution"}, {"id": "dns-service-solution", "type": "solution", "title": "DNS Service Was Stopped", "description": "The DNS Client service was stopped, preventing all name resolution", "resolution_steps": ["Run: Start-Service Dnscache", "Set startup type: Set-Service Dnscache -StartupType Automatic", "Flush cache: ipconfig /flushdns", "Test: nslookup google.com"]}, {"id": "dns-check-specific", "type": "solution", "title": "Selective DNS Failure — Stale or Missing Records", "description": "Some records resolve correctly, indicating DNS is functional but specific records are stale or missing", "resolution_steps": ["Check DNS server for missing A/CNAME records", "Clear DNS cache on the DNS server: Clear-DnsServerCache", "Flush client cache: ipconfig /flushdns", "Verify with: nslookup <failing-hostname>"]}]}"""
+SHAPE-ONLY schema example (do not copy this content verbatim — it shows
+how IDs link, NOT what to ask or run; your real tree must reflect the
+branch the user described):
+{"id": "<root-slug>", "type": "decision", "question": "<diagnostic question for THIS branch>", "help_text": "<optional hint>", "options": [{"id": "<opt-1>", "label": "<observable answer 1>", "next_node_id": "<child-1>"}, {"id": "<opt-2>", "label": "<observable answer 2>", "next_node_id": "<child-2>"}], "children": [{"id": "<child-1>", "type": "action", "title": "<what to do>", "description": "<details>", "commands": ["<exact command for THIS branch>"], "expected_outcome": "<what success looks like>", "next_node_id": "<sibling-id>"}, {"id": "<sibling-id>", "type": "solution", "title": "<resolution title>", "description": "<resolution description>", "resolution_steps": ["<step 1>", "<step 2>"]}, {"id": "<child-2>", "type": "solution", "title": "<other resolution>", "description": "<...>", "resolution_steps": ["<step 1>"]}]}"""


 CORRECTIVE_PROMPT_TEMPLATE = """Your previous JSON was invalid for ResolutionFlow's tree schema.
--- a/backend/app/core/kb_conversion_service.py
+++ b/backend/app/core/kb_conversion_service.py
@@ -153,48 +153,29 @@ Identify values that would change between executions (server names, IPs, usernam

 ## Output Format

-Return a JSON object:
+Return a JSON object with this SHAPE (DO NOT copy the placeholders below
+verbatim — fill each field with content derived from the actual KB article
+the engineer attached, NOT from this schema):
 ```json
 {
-  "title": "Procedure title derived from the article",
-  "description": "Brief description of what this procedure accomplishes",
+  "title": "<procedure title derived from the article>",
+  "description": "<brief description of what this procedure accomplishes>",
  "steps": [
    {
-      "id": "unique-step-id",
-      "type": "step",
-      "content": "Open Server Manager and navigate to Add Roles on [VAR:server_name]",
-      "confidence": 0.95,
-      "source_excerpt": "Step 1: Open Server Manager on DC01..."
-    },
-    {
-      "id": "warning-dns",
-      "type": "warning",
-      "content": "WARNING: This will restart DNS and cause brief connectivity loss",
-      "confidence": 0.90,
-      "source_excerpt": "Note: Restarting DNS will cause a brief outage"
-    },
-    {
-      "id": "section-verification",
-      "type": "section_header",
-      "content": "Verification Steps",
-      "confidence": 1.0,
-      "source_excerpt": "Verification"
+      "id": "<unique-kebab-case-id>",
+      "type": "step|warning|section_header",
+      "content": "<step body — may include [VAR:<your_variable>] interpolation>",
+      "confidence": <float 0.0-1.0>,
+      "source_excerpt": "<the verbatim sentence/phrase from the article that this step came from>"
    }
  ],
  "intake_form": [
    {
-      "variable_name": "server_name",
-      "label": "Server Name",
-      "field_type": "text",
-      "required": true,
-      "display_order": 1
-    },
-    {
-      "variable_name": "ip_address",
-      "label": "IP Address",
-      "field_type": "text",
-      "required": true,
-      "display_order": 2
+      "variable_name": "<snake_case_name fitting THIS procedure>",
+      "label": "<Human Label>",
+      "field_type": "text|password|select|textarea|number|boolean",
+      "required": true|false,
+      "display_order": <integer>
    }
  ]
 }
--- a/backend/app/services/assistant_chat_service.py
+++ b/backend/app/services/assistant_chat_service.py
@@ -247,7 +247,7 @@ the most relevant branch to investigate first.
 To create a fork, append this marker AFTER your [QUESTIONS]/[ACTIONS] markers:

 [FORK]
-{"fork_reason": "Brief reason", "options": [{"label": "Short name", "description": "One sentence"}, {"label": "Another", "description": "One sentence"}]}
+{"fork_reason": "<one short sentence: why these branches need independent investigation>", "options": [{"label": "<short hypothesis name for branch 1>", "description": "<one sentence: what this branch will check>"}, {"label": "<branch 2 name>", "description": "<...>"}]}
 [/FORK]

 2-4 options. Never mention "fork", "branch", or "path" in your visible text.
@@ -274,11 +274,12 @@ ANTI-PARROT RULE: The schemas above use placeholders in `<angle brackets>` to sh
 the SHAPE of valid output. Your real questions, actions, facts, and suggested fixes \
 must be derived from the engineer's CURRENT message — never copy placeholder text, \
 never reuse content from a prior unrelated session, never invent ticket-specific \
-details (usernames, hostnames, error codes, application names) that the engineer \
-has not stated. If the engineer asks about printers, do not produce questions about \
-Outlook. If the engineer says nothing about a user named jsmith, never name jsmith \
-in your output. Tickets unrelated to the example domains in the schema must produce \
-questions and actions from THAT ticket's domain.
+details (usernames, hostnames, IPs, error codes, application names, ticket numbers) \
+that the engineer has not stated. The technology, vocabulary, and named entities in \
+your output must match the technology, vocabulary, and named entities in the \
+engineer's most recent message. If the engineer's ticket is about a different \
+domain than the last ticket you saw, your output must reflect the new domain — \
+do not let the previous ticket's specifics bleed into the new one.
 """


--- a/backend/tests/test_prompt_anti_parrot.py
+++ b/backend/tests/test_prompt_anti_parrot.py
@@ -0,0 +1,184 @@
+"""Guardrail: literal output payloads must not live in any LLM system prompt.
+
+This test exists because the same anti-pattern bit us twice in the same
+day: a worked example with literal content (Outlook + jsmith + literal
+JSON; full DNS troubleshooting tree) sitting inside a `*_PROMPT` constant
+caused Claude to recite that content on unrelated tickets, making the
+task lane look like it was leaking previous-session data.
+
+The fix is structural: every output example in a system prompt must use
+`<placeholder>` or `<...>` syntax, never literal field values, command
+names, hostnames, or usernames that the model could parrot. Format
+examples that need real-looking content live in few-shot messages
+(separate file, separate code path, model treats them as past behavior),
+not in system prompts.
+
+Failure messages here name the constant + line; fix by replacing the
+literal payload with a placeholder schema, or by moving the example
+out of the system prompt entirely.
+
+See CLAUDE.md → Critical Lessons → "Don't put literal payloads in
+system prompts" for the longer rationale.
+"""
+from __future__ import annotations
+
+import importlib
+import inspect
+import pkgutil
+import re
+from typing import Iterator
+
+import pytest
+
+# Modules to scan. We deliberately import the modules (not just walk source
+# files) so we get the actual string values of `*_PROMPT` constants — which
+# may be assembled from concat / .format() / f-strings.
+_MODULE_PACKAGES = ("app.services", "app.core")
+
+
+def _iter_prompt_constants() -> Iterator[tuple[str, str, str]]:
+    """Yield (module_name, constant_name, value) for every uppercase string
+    constant whose name ends in `_PROMPT` (or `_SCHEMA`/`_PROTOCOL`/`_FORMAT`
+    — same anti-pattern risk).
+
+    Skips modules that fail to import to keep the test resilient when an
+    individual module has unrelated breakage.
+    """
+    suffixes = ("_PROMPT", "_SCHEMA", "_PROTOCOL", "_FORMAT", "_CONTEXT")
+    for pkg_name in _MODULE_PACKAGES:
+        pkg = importlib.import_module(pkg_name)
+        for mod_info in pkgutil.iter_modules(pkg.__path__, prefix=f"{pkg_name}."):
+            try:
+                mod = importlib.import_module(mod_info.name)
+            except Exception:
+                continue
+            for name, value in inspect.getmembers(mod):
+                if not name.isupper() or not name.endswith(suffixes):
+                    continue
+                if not isinstance(value, str):
+                    continue
+                yield mod_info.name, name, value
+
+
+# ── The forbidden patterns ──────────────────────────────────────────────────
+
+# A literal username pattern that Claude has historically parroted across
+# unrelated tickets. The list isn't exhaustive — it's the exact strings
+# we've seen leak. Add to it if a new one shows up in production.
+_FORBIDDEN_LITERAL_TOKENS: tuple[str, ...] = (
+    "jsmith",       # leaked from an Outlook/AD example
+    "DC01",         # leaked from an intake-form example
+    "ADSync",       # leaked from a commands-array example
+    "Dnscache",     # leaked from a DNS troubleshooting tree example
+    "google.com",   # leaked from a DNS troubleshooting tree example
+    "Outlook keeps", "Teams drops",  # specific phrasings from a worked Outlook/WiFi example
+)
+
+# Marker-with-payload patterns. A `[QUESTIONS]\n[{...JSON with real field values...}]`
+# block in a prompt is the highest-risk shape — the model treats it as a
+# canonical response template. We allow placeholder content (anything inside
+# angle brackets `<...>` is treated as a placeholder, not a literal).
+#
+# Restrictions on the regex (to avoid false positives where the marker name
+# appears in prose like "include [QUESTIONS] markers"):
+# - opening tag must be at start of string OR preceded by newline/whitespace
+#   AND followed by newline+JSON-ish content
+# - block content must START with `[` or `{` after optional whitespace,
+#   so prose blocks (like the closing-tag-distance regex match across
+#   markdown headings) are excluded
+_MARKER_BLOCK_RE = re.compile(
+    r"(?:^|\n)\[(QUESTIONS|ACTIONS|SUGGEST_FIX|PROMOTE|FORK|TREE_UPDATE|STEPS_UPDATE|INTAKE_FORM|METADATA|DELTA)\]"
+    r"\s*\n"                          # forced newline before content
+    r"(\s*[\[{][\s\S]*?)"             # content must start with [ or {
+    r"\s*\n\[/\1\]"
+)
+
+# Heuristic: only flag JSON VALUES, not JSON KEYS. Keys are followed by `:`,
+# values come after `: ` (object value) or are bare strings inside an array.
+# The shape we're defending against is `{"text": "Is this user on a laptop?"}` —
+# the value `"Is this user on a laptop?"` is a literal payload the model will
+# recite. Keys like `"text"` are part of the schema and must stay literal.
+#
+# Matches a quoted string that has at least 3 chars, no angle brackets, and
+# is followed by a JSON value-terminator (`,` `]` `}`) — i.e. NOT followed
+# by `:` (which would mark it as a key).
+_QUOTED_VALUE_RE = re.compile(
+    r'"([^"<>][^"<>]{2,}?)"\s*(?=[,\]\}])'
+)
+# Substrings that, if PRESENT in the candidate value, indicate it's a
+# placeholder marker rather than literal output. Be strict — broad markers
+# like "?" alone would whitelist any sentence ending in a question mark,
+# defeating the test's purpose.
+_PLACEHOLDER_HINTS = ("...", "snake_case", "kebab-case", "<", "TODO")
+# Schema enum-like values that are part of the format spec, not parrotable text.
+_ALLOWED_ENUM_VALUES = frozenset({
+    "text", "password", "select", "boolean", "number", "textarea", "multi_text",
+    "powershell", "bash", "cmd", "python",
+    "question", "diagnostic_check", "user_note", "ai_synthesis",
+    "decision", "action", "solution", "procedure_step", "section_header", "procedure_end",
+    "step", "warning",
+})
+
+
+def _block_has_literal_payload(block_body: str) -> tuple[bool, str | None]:
+    """Return (True, offending_string) if the marker block looks like literal output."""
+    for m in _QUOTED_VALUE_RE.finditer(block_body):
+        s = m.group(1).strip()
+        if not s:
+            continue
+        # Pure placeholder hints — accept.
+        if any(h in s for h in _PLACEHOLDER_HINTS):
+            continue
+        # Pipe-separated enum like `text|password|select` — schema spec.
+        if "|" in s:
+            continue
+        # Single-word enum value we explicitly allow.
+        if s in _ALLOWED_ENUM_VALUES:
+            continue
+        # JSON ellipsis-style placeholders, ".." etc.
+        if all(c in "._" for c in s):
+            continue
+        return True, s
+    return False, None
+
+
+# ── Tests ──────────────────────────────────────────────────────────────────
+
+def test_no_known_leaked_literal_tokens_in_prompts() -> None:
+    """Constants must not contain strings the model has historically parroted.
+
+    Adding a new entry to _FORBIDDEN_LITERAL_TOKENS after a production leak is
+    the right way to extend coverage — keep this list as the audit trail.
+    """
+    failures: list[str] = []
+    for module_name, const_name, value in _iter_prompt_constants():
+        for token in _FORBIDDEN_LITERAL_TOKENS:
+            if token in value:
+                failures.append(
+                    f"{module_name}.{const_name} contains forbidden literal token "
+                    f"{token!r} — replace with a <placeholder>. See CLAUDE.md → "
+                    f"'Don't put literal payloads in system prompts'."
+                )
+    assert not failures, "\n".join(failures)
+
+
+def test_marker_blocks_in_prompts_use_placeholders_not_literal_payloads() -> None:
+    """Every marker block in a system prompt must contain placeholders only.
+
+    A block like `[QUESTIONS]\\n[{"text": "Is this user on a laptop or desktop?"}]\\n[/QUESTIONS]`
+    will be recited verbatim by Claude on unrelated tickets. Use angle-bracket
+    placeholders instead: `[{"text": "<one short, specific question>"}]`.
+    """
+    failures: list[str] = []
+    for module_name, const_name, value in _iter_prompt_constants():
+        for m in _MARKER_BLOCK_RE.finditer(value):
+            marker = m.group(1)
+            body = m.group(2)
+            has_literal, offender = _block_has_literal_payload(body)
+            if has_literal:
+                failures.append(
+                    f"{module_name}.{const_name}: [{marker}] block contains literal "
+                    f"payload string {offender!r}. Replace with a <placeholder>. "
+                    f"See CLAUDE.md → 'Don't put literal payloads in system prompts'."
+                )
+    assert not failures, "\n".join(failures)