feat: FlowPilot migration — Phase 1-9 + Phase 9 bug fixes + QA fixture harness #147
@@ -557,6 +557,21 @@ A codebase audit revealed that prompt caching is only implemented in `assistant_
- **0.1** Promote `AnthropicProvider.generate_json()` and `generate_text_stream()` in `ai_provider.py` to the cached pattern currently implemented in `assistant_chat_service.py:_call_anthropic_cached()`. Convert the `system` string parameter to a structured system block list with `cache_control: {"type": "ephemeral"}` on the static portion. Add a second breakpoint on the last history message. For the streaming variant, capture the final usage object via `get_final_message()`. Log `cache_read_input_tokens` and `cache_creation_input_tokens` on every response.
- **0.2** Update `integrations.py:557` (`/tickets/ai-parse`) to move the members list and team-stable boards data into a cached system block.
> **Phase 0.2: pending target endpoint.** The `/tickets/ai-parse` endpoint described in the original migration doc does not exist in the codebase as of this commit. When this endpoint is built, apply the cached-system-block pattern:
>
> ```python
> system_blocks = [
>     {"type": "text", "text": members_json, "cache_control": {"type": "ephemeral"}},  # cacheable: team-stable
>     {"type": "text", "text": boards_json, "cache_control": {"type": "ephemeral"}},  # cacheable: team-stable
>     {"type": "text", "text": engineer_description},  # uncached: per-request
> ]
> ```
>
> Remove this note when the endpoint is implemented and the pattern applied.
- **0.3** Add `cache_control` to one-shot generators: `ai_tree_generator`, `kb_conversion`, `ai_fix`, `script_builder`. Same pattern as 0.1.
- **0.4** Extract the caching logic from `assistant_chat_service.py:_call_anthropic_cached()` into `AnthropicProvider` and delete `_call_anthropic_cached`. `assistant_chat_service` should call the provider like every other service. This prevents two canonical implementations of the same pattern.
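
The two breakpoints from 0.1 and the usage logging can be sketched without touching the network. This is a minimal illustration, not the actual `AnthropicProvider` code: the helper names `build_cached_request` and `cache_usage_fields` are hypothetical, and the sketch assumes history messages are plain dicts whose `content` is either a string or a list of block dicts.

```python
from copy import deepcopy
from typing import Any

EPHEMERAL = {"type": "ephemeral"}


def build_cached_request(static_system: str, history: list[dict[str, Any]]) -> dict[str, Any]:
    """Build `system` and `messages` kwargs with two cache breakpoints.

    Hypothetical helper: breakpoint 1 sits on the static system text,
    breakpoint 2 on the last block of the last history message.
    """
    system_blocks = [
        {"type": "text", "text": static_system, "cache_control": dict(EPHEMERAL)},
    ]

    # Copy so the caller's history is never mutated.
    messages = deepcopy(history)
    if messages:
        last = messages[-1]
        # A plain string cannot carry cache_control, so wrap it in a text block.
        if isinstance(last["content"], str):
            last["content"] = [{"type": "text", "text": last["content"]}]
        if last["content"]:
            last["content"][-1]["cache_control"] = dict(EPHEMERAL)

    return {"system": system_blocks, "messages": messages}


def cache_usage_fields(usage: Any) -> dict[str, int]:
    """Extract the two cache counters from a usage object, treating missing or None as 0."""
    return {
        "cache_read_input_tokens": getattr(usage, "cache_read_input_tokens", 0) or 0,
        "cache_creation_input_tokens": getattr(usage, "cache_creation_input_tokens", 0) or 0,
    }
```

The returned dict is meant to be splatted into the provider's request alongside `model` and `max_tokens`. For the streaming variant, the `usage` passed to `cache_usage_fields` would be the one on the message returned by `get_final_message()`.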