Harden the Anthropic provider and lay the groundwork for schema-constrained
JSON, optimizing the existing claude-sonnet-4-6 / claude-haiku-4-5 usage
(no model changes).
ai_provider.py:
- _extract_text_from_response replaces fragile response.content[0].text:
skips non-text leading blocks (e.g. thinking), returns the first text
block, logs an anthropic.stop_reason warning on max_tokens/refusal
(truncation now observable), and raises ValueError on a no-text response.
- generate_json gains an optional `schema` param. Anthropic wires it to
output_config.format (structured outputs); schema=None preserves the exact
prior call for every existing caller. Gemini accepts-and-ignores it.
kb_conversion_service.py:
- TROUBLESHOOTING_SCHEMA / PROCEDURAL_SCHEMA + _schema_for_target_type(),
modelled as a strict superset of every field the prompts emit.
- convert_document passes the schema only when the new
AI_KB_CONVERT_STRUCTURED_OUTPUT setting is True (default False). The
_try_repair_json fallback stays as belt-and-suspenders.
Tests: 14 provider + 7 schema, TDD (red-green). Live constrained-decoding
smoke-test still required before enabling the flag in production.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The "AI parrots example content from system prompt" bug bit us twice in
one day across two different prompt sites. Patching individual prompts
is treating the symptom; this commit makes the rule structural.
Audit + sanitize:
- assistant_chat_service.ASSISTANT_SYSTEM_PROMPT — already cleaned in
prior commits, but the [FORK] schema still had literal "Brief reason"
/ "Short name" / "One sentence" placeholders. Replaced with
<angle-bracket> placeholders. Anti-parrot rule itself rewritten to
describe the failure mode abstractly instead of naming "jsmith" so
the rule no longer trips the guardrail (and so the model doesn't
see "jsmith" as a token at all).
- ai_chat_service.py — removed three concrete-example offenders:
"Get-Service ADSync" command literal, the "DC01 server_name" intake
form payload (in two places), and the inline interview demos using
"Azure AD Sync failures" / "Exchange Online mailbox migration".
Replaced with technology-neutral schema descriptions.
- ai_tree_generator_service.BRANCH_DETAIL_SYSTEM_PROMPT — replaced the
fully-fleshed DNS troubleshooting tree (with literal Dnscache /
ipconfig / google.com / Start-Service) with a placeholder schema
showing only ID-linkage shape.
- kb_conversion_service.PROCEDURAL_SYSTEM_PROMPT — replaced the worked
Server Manager + DC01 example payload with a placeholder schema.
Guardrail (tests/test_prompt_anti_parrot.py):
- Imports every module under app/services/ and app/core/ and walks
every uppercase string constant ending in _PROMPT, _SCHEMA,
_PROTOCOL, _FORMAT, or _CONTEXT.
- test 1: known-leaked-token list (jsmith, DC01, ADSync, Dnscache,
google.com, "Outlook keeps", "Teams drops") must not appear in any
prompt constant. Add to the list when a new leak shows up in prod —
the list IS the audit trail.
- test 2: marker blocks ([QUESTIONS], [ACTIONS], [SUGGEST_FIX], etc.)
must contain placeholders only. Distinguishes JSON keys (followed
by ':', allowed) from JSON values (followed by ',' / ']' / '}',
must be <placeholder>); allows pipe-separated enum types
(text|password|select) and a small set of fixed enum values
(question, diagnostic_check, decision, action, ...). Verified by
feeding the test a known-bad block — caught it correctly.
Documented the rule in CLAUDE.md → AI / FlowPilot lessons, naming
the test as the enforcement point so future contributors know how to
extend it (add to the known-leaked list when a new leak surfaces).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wraps each static system prompt in a single-block list so Phase 0.1's
AnthropicProvider applies cache_control: ephemeral automatically (policy α,
first block gets marked when no caller-authored cache_control is present).
Call sites:
- ai_tree_generator.scaffold_branches: SCAFFOLD_SYSTEM_PROMPT (~1k tokens)
- ai_tree_generator.generate_branch_detail: BRANCH_DETAIL_SYSTEM_PROMPT
(~2.5k tokens with few-shot example); retries inside the same function
re-read the cached block instead of paying full input cost on each attempt
- kb_conversion.convert_document: TROUBLESHOOTING or PROCEDURAL prompt
(each caches independently by text content)
- ai_fix.generate_fixes: FIX_SYSTEM_PROMPT on first attempt + corrective retry
- script_builder.send_message: SYSTEM_PROMPT_TEMPLATE (per-session language
substitution — same-language sessions share cache entries)
Each edit includes an inline comment explaining why the block is cacheable
(stable-constant, retry-reuse, per-language variant) so a future dev can
see the intent at the cache_control marker site.
script_builder history caching deliberately deferred — per Phase 0.1
decision (option i), AnthropicProvider does not automatically cache the
message list. If script_builder's growing 20-message history turns out
to be a visible cost driver via the anthropic.cache telemetry, route
that caller through the 0.4 chat wrapper which handles history caching.
No runtime verification from code-server; cache-hit behavior will be
confirmed against the new dev environment when it's up, per the inline
TODO(phase0-verify) in ai_provider.py.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Landing page at /landing with full marketing content: hero, features,
pricing, testimonials, and beta email signup form. Beta signups email
beta@resolutionflow.com via new public endpoint. Unauthenticated users
redirect to landing instead of login. Also raises KB Accelerator node
limit from 50 to 100 to accommodate dense troubleshooting articles.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
AI-generated trees can have circular next_node_id references (e.g.,
node A → B → A). The parent mapping now checks for cycles before
assigning parent_node_id, preventing FK deadlocks during insert.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Nodes with parent_node_id references were inserted in a single batch,
causing FK violations when children were inserted before their parents.
Now inserts roots first, flushes, then children in subsequent passes.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Increase max_tokens from 8192 to 16384 to prevent truncation on long articles
- Add _try_repair_json() that fixes trailing commas and attempts to close
unclosed brackets/braces from truncated AI responses
- Log full raw response (first 2000 chars) on parse failure for debugging
- Set status to 'failed' with user-friendly error message instead of leaving
imports stuck in 'processing' state
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>