Commit Graph

3 Commits

Author SHA1 Message Date
da93ae55c3 feat(ai): opt-in structured-system-block caching for one-shot generators (Phase 0.3)
Wraps each static system prompt in a single-block list so Phase 0.1's
AnthropicProvider applies cache_control: ephemeral automatically (policy α,
first block gets marked when no caller-authored cache_control is present).

Call sites:
- ai_tree_generator.scaffold_branches: SCAFFOLD_SYSTEM_PROMPT (~1k tokens)
- ai_tree_generator.generate_branch_detail: BRANCH_DETAIL_SYSTEM_PROMPT
  (~2.5k tokens with few-shot example); retries inside the same function
  re-read the cached block instead of paying full input cost on each attempt
- kb_conversion.convert_document: TROUBLESHOOTING or PROCEDURAL prompt
  (each caches independently by text content)
- ai_fix.generate_fixes: FIX_SYSTEM_PROMPT on first attempt + corrective retry
- script_builder.send_message: SYSTEM_PROMPT_TEMPLATE (per-session language
  substitution — same-language sessions share cache entries)

Each edit includes an inline comment explaining why the block is cacheable
(stable-constant, retry-reuse, per-language variant) so a future dev can
see the intent at the cache_control marker site.

script_builder history caching deliberately deferred — per Phase 0.1
decision (option i), AnthropicProvider does not automatically cache the
message list. If script_builder's growing 20-message history turns out
to be a visible cost driver via the anthropic.cache telemetry, route
that caller through the 0.4 chat wrapper which handles history caching.

No runtime verification from code-server; cache-hit behavior will be
confirmed against the new dev environment when it's up, per the inline
TODO(phase0-verify) in ai_provider.py.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 16:29:45 +00:00
10cf5f45eb refactor: consolidate LLM JSON parsing into shared llm_utils module
Extracted duplicate _strip_markdown_fences / _parse_llm_json functions
from 7 files into app/services/llm_utils.py. Two shared functions:
- strip_markdown_fences(): fence stripping only
- parse_llm_json(): fence stripping + JSON parse + error logging

Files updated: flowpilot_engine, knowledge_flywheel, session_to_flow_service,
ai_tree_generator_service, ai_fix_service, ai_chat_service, kb_conversion_service

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-21 03:25:25 +00:00
chihlasm
373736c594 feat: add AI fix service with prompt building and validation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 17:25:34 -05:00