resolutionflow

chihlasm/resolutionflow

Fork 0

Commit Graph

Author	SHA1	Message	Date
Michael Chihlas	da93ae55c3	feat(ai): opt-in structured-system-block caching for one-shot generators (Phase 0.3) Wraps each static system prompt in a single-block list so Phase 0.1's AnthropicProvider applies cache_control: ephemeral automatically (policy α, first block gets marked when no caller-authored cache_control is present). Call sites: - ai_tree_generator.scaffold_branches: SCAFFOLD_SYSTEM_PROMPT (~1k tokens) - ai_tree_generator.generate_branch_detail: BRANCH_DETAIL_SYSTEM_PROMPT (~2.5k tokens with few-shot example); retries inside the same function re-read the cached block instead of paying full input cost on each attempt - kb_conversion.convert_document: TROUBLESHOOTING or PROCEDURAL prompt (each caches independently by text content) - ai_fix.generate_fixes: FIX_SYSTEM_PROMPT on first attempt + corrective retry - script_builder.send_message: SYSTEM_PROMPT_TEMPLATE (per-session language substitution — same-language sessions share cache entries) Each edit includes an inline comment explaining why the block is cacheable (stable-constant, retry-reuse, per-language variant) so a future dev can see the intent at the cache_control marker site. script_builder history caching deliberately deferred — per Phase 0.1 decision (option i), AnthropicProvider does not automatically cache the message list. If script_builder's growing 20-message history turns out to be a visible cost driver via the anthropic.cache telemetry, route that caller through the 0.4 chat wrapper which handles history caching. No runtime verification from code-server; cache-hit behavior will be confirmed against the new dev environment when it's up, per the inline TODO(phase0-verify) in ai_provider.py. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-17 16:29:45 +00:00
chihlasm	10cf5f45eb	refactor: consolidate LLM JSON parsing into shared llm_utils module Extracted duplicate _strip_markdown_fences / _parse_llm_json functions from 7 files into app/services/llm_utils.py. Two shared functions: - strip_markdown_fences(): fence stripping only - parse_llm_json(): fence stripping + JSON parse + error logging Files updated: flowpilot_engine, knowledge_flywheel, session_to_flow_service, ai_tree_generator_service, ai_fix_service, ai_chat_service, kb_conversion_service Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-21 03:25:25 +00:00
chihlasm	373736c594	feat: add AI fix service with prompt building and validation Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-26 17:25:34 -05:00

Author

SHA1

Message

Date

Michael Chihlas

da93ae55c3

feat(ai): opt-in structured-system-block caching for one-shot generators (Phase 0.3)

Wraps each static system prompt in a single-block list so Phase 0.1's
AnthropicProvider applies cache_control: ephemeral automatically (policy α,
first block gets marked when no caller-authored cache_control is present).

Call sites:
- ai_tree_generator.scaffold_branches: SCAFFOLD_SYSTEM_PROMPT (~1k tokens)
- ai_tree_generator.generate_branch_detail: BRANCH_DETAIL_SYSTEM_PROMPT
  (~2.5k tokens with few-shot example); retries inside the same function
  re-read the cached block instead of paying full input cost on each attempt
- kb_conversion.convert_document: TROUBLESHOOTING or PROCEDURAL prompt
  (each caches independently by text content)
- ai_fix.generate_fixes: FIX_SYSTEM_PROMPT on first attempt + corrective retry
- script_builder.send_message: SYSTEM_PROMPT_TEMPLATE (per-session language
  substitution — same-language sessions share cache entries)

Each edit includes an inline comment explaining why the block is cacheable
(stable-constant, retry-reuse, per-language variant) so a future dev can
see the intent at the cache_control marker site.

script_builder history caching deliberately deferred — per Phase 0.1
decision (option i), AnthropicProvider does not automatically cache the
message list. If script_builder's growing 20-message history turns out
to be a visible cost driver via the anthropic.cache telemetry, route
that caller through the 0.4 chat wrapper which handles history caching.

No runtime verification from code-server; cache-hit behavior will be
confirmed against the new dev environment when it's up, per the inline
TODO(phase0-verify) in ai_provider.py.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-17 16:29:45 +00:00

chihlasm

10cf5f45eb

refactor: consolidate LLM JSON parsing into shared llm_utils module

Extracted duplicate _strip_markdown_fences / _parse_llm_json functions
from 7 files into app/services/llm_utils.py. Two shared functions:
- strip_markdown_fences(): fence stripping only
- parse_llm_json(): fence stripping + JSON parse + error logging

Files updated: flowpilot_engine, knowledge_flywheel, session_to_flow_service,
ai_tree_generator_service, ai_fix_service, ai_chat_service, kb_conversion_service

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-03-21 03:25:25 +00:00

chihlasm

373736c594

feat: add AI fix service with prompt building and validation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-26 17:25:34 -05:00

3 Commits