docs(decisions): scope structured outputs to flat-array JSON (close 3c)

Record the 3c finding: Anthropic structured outputs apply only to flat-array generate_json outputs (kb_conversion). ai_fix and knowledge_flywheel flow-gen emit recursive/nested decision trees that the "no recursive schemas" limit excludes; their fence-strippers stay. Documents the deferred kb-only _try_repair_json removal pending staging validation of the AI_KB_CONVERT_STRUCTURED_OUTPUT flag. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-28 15:09:16 -04:00
parent 3fde3369c8
commit 02db15f118
1 changed files with 26 additions and 0 deletions
--- a/.ai/DECISIONS.md
+++ b/.ai/DECISIONS.md
@@ -13,6 +13,32 @@

 ---

+## 2026-05-28 — Scope Anthropic structured outputs to flat-array JSON only
+
+**Context:** Optimizing the existing Claude API usage (no model change). The Anthropic path in `generate_json` (`ai_provider.py`) had no equivalent to the Gemini path's `response_mime_type="application/json"` — it prompted for JSON and relied on downstream defenses: `_strip_markdown_fences` (ai_fix), `parse_llm_json` (knowledge_flywheel), and `_try_repair_json` (kb_conversion, which balances unclosed braces on truncated output). Anthropic structured outputs (`output_config.format` with a JSON schema) guarantee valid, parseable JSON and would eliminate those band-aids. The question was which of the four `generate_json` call sites can adopt it.
+
+Structured outputs has hard schema limits: **no recursive schemas**, and **every object must set `additionalProperties: false`** (so the schema must enumerate exactly the fields the model emits — a superset is impossible, an omission makes a field unproducible). Tracing the call sites against those limits:
+
+- **kb_conversion** → output is `{title, description, nodes: [...]}` / `{...steps[], intake_form[]}` — **flat arrays**, references by `next_node_id`/id, no nesting. Expressible.
+- **ai_fix** → returns a fixed *node that is itself a subtree*; `_find_node_by_id` recurses `node["children"]` and the prompt requires decision nodes to have ≥2 children. **Recursive, arbitrary depth.**
+- **knowledge_flywheel flow-gen** → emits `tree_structure`, a decision-tree root with nested `children`/`options`, persisted as an opaque blob.
+- **knowledge_flywheel enhancement** → flat `new_nodes[] + modified_options[]`; expressible but low-frequency and only fence-stripped.
+
+**Decision:** Apply structured outputs to **flat-array outputs only** — i.e. `kb_conversion`. Wired via an optional `schema=` param on `AIProvider.generate_json` (`None` = legacy prompt-only behavior; Anthropic maps it to `output_config.format`, Gemini ignores it), with the two KB schemas + `_schema_for_target_type()` in `kb_conversion_service.py`, gated behind `settings.AI_KB_CONVERT_STRUCTURED_OUTPUT` (default **False**) pending a live constrained-decoding smoke-test in staging. The robustness fixes that motivated the work — `_extract_text_from_response` (skip non-text blocks, log `max_tokens`/`refusal`, raise on no-text) — live in the shared provider, so **all four** callers already benefit regardless of schema adoption.
+
+**Rejected:**
+- **Forcing schemas on ai_fix / flow-gen.** Their outputs are recursive/nested decision trees; a bounded-depth schema would reject valid deeper trees and break generation. Wrong architecture for marginal/zero benefit (flow-gen's tree is stored as a blob, never schema-validated downstream).
+- **Wiring the flywheel enhancement site.** Flat and technically expressible, but low call frequency and only fence-stripping today — marginal benefit against the risk of a blind (un-live-tested) `additionalProperties: false` schema.
+- **Deleting the fence-strip / repair helpers now.** `_strip_markdown_fences` / `parse_llm_json` must stay — they protect the recursive paths that can't use schemas. Only `_try_repair_json` (kb-only) becomes removable, and only *after* the flag is validated in staging.
+
+**Consequences:**
+- Structured outputs is the tool for flat JSON; recursive decision-tree outputs are excluded by design. New flat-JSON `generate_json` callers can opt in via `schema=`; recursive ones should not.
+- `AI_KB_CONVERT_STRUCTURED_OUTPUT` must be smoke-tested against the live model (both target types) before production enablement. Open risk: whether Anthropic accepts optional (non-`required`) fields — if not, the schemas need every field in `required` with nullable types. The flag makes this fully reversible.
+- Deferred cleanup: once the flag is validated, remove only `_try_repair_json` from the kb_conversion Anthropic path; leave the fence-strippers.
+- Work lives on branch `feat/ai-structured-outputs` (commits `84a02a5`, `1388357`), based on `design/l1-workspace`.
+
+---
+
 ## 2026-05-13 — Session expiration policy: 3d idle / 14d absolute defaults + per-account override

 **Context:** User report: "I login to ResolutionFlow and never have to log back in." Investigation found refresh tokens at `REFRESH_TOKEN_EXPIRE_DAYS=7` with JTI rotation (`security.py:36`) — every `/auth/refresh` minted a fresh 7-day window. Net effect: a sliding 7-day session with no absolute cap. Visit once a week, logged in forever. Acceptable for pilot but not for MSP buyers whose SOC2 / cyber-insurance auditors require enforced session timeouts. Required for the same Phase O launch readiness as the other gates already in flight.