# Solutions Library + Smart RAG — Design Spec

> **Status:** SPEC ONLY — not implementing yet. Build after colleague pilot (Week 3-4).
> **Date:** 2026-03-23
> **Context:** [GTM validation plan](resolutionflow-gtm-design.md) — copilot-first, team knowledge flywheel

---

## Problem

Engineers solve the same problems repeatedly across an MSP. Today that knowledge lives in engineers' heads, scattered PSA ticket notes, or nowhere. When an engineer resolves a tricky issue through the FlowPilot copilot, that knowledge dies with the session. The next engineer who hits the same issue starts from scratch.

## Solution

**Solutions Library** — a team knowledge base that builds itself from resolved copilot sessions and feeds back into future sessions via RAG.

Two halves:
1. **Capture & Dedup** — save resolutions from copilot sessions, prevent duplicates
2. **Smart RAG** — FlowPilot pulls from the Solutions Library during live sessions and surfaces relevant prior resolutions

## How It Works

### 1. Resolution Capture (post-session)

When an engineer resolves a copilot session, FlowPilot auto-generates a structured resolution:

```json
{
  "title": "Exchange Online mailbox not receiving email",
  "problem": "User reports not receiving emails in Outlook. OWA also shows no new mail.",
  "root_cause": "Mail flow rule blocking external senders due to tenant-wide transport rule misconfiguration",
  "resolution_steps": [
    "Checked MX records — correct",
    "Ran message trace in Exchange Admin Center — messages queued",
    "Found transport rule 'Block External' was enabled tenant-wide instead of per-group",
    "Disabled rule, emails delivered within 5 minutes"
  ],
  "environment_tags": ["exchange-online", "mail-flow", "transport-rules"],
  "auto_detected_category": "Microsoft 365"
}
```

The engineer is then prompted: **"Save this as a reusable solution?"**
- One-click save with auto-generated content
- Can edit title, tags, steps before saving
- Can skip (not every session produces reusable knowledge)
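
The draft-then-edit flow above can be sketched as a small typed structure. This is a minimal sketch, not the planned implementation: `SolutionDraft`, `draft_from_dict`, and `apply_engineer_edits` are hypothetical names, and the choice of required fields and tag normalization are assumptions. Field names follow the JSON example.

```python
from __future__ import annotations

from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class SolutionDraft:
    """Auto-generated resolution, shaped like the JSON example above."""
    title: str
    problem: str
    root_cause: str
    resolution_steps: list[str]
    environment_tags: list[str] = field(default_factory=list)
    auto_detected_category: str | None = None


# Assumption: a draft without these fields is not worth offering for save.
REQUIRED_FIELDS = ("title", "problem", "root_cause", "resolution_steps")

# Per the prompt above, the engineer may edit title, tags, and steps.
EDITABLE_FIELDS = {"title", "environment_tags", "resolution_steps"}


def draft_from_dict(data: dict) -> SolutionDraft:
    """Validate and normalize the model's JSON output into a draft."""
    missing = [name for name in REQUIRED_FIELDS if not data.get(name)]
    if missing:
        raise ValueError(f"draft is missing required fields: {missing}")
    return SolutionDraft(
        title=data["title"].strip(),
        problem=data["problem"].strip(),
        root_cause=data["root_cause"].strip(),
        resolution_steps=[s.strip() for s in data["resolution_steps"] if s.strip()],
        # Assumption: tags are stored lowercase so filters match consistently.
        environment_tags=[t.strip().lower() for t in data.get("environment_tags", [])],
        auto_detected_category=data.get("auto_detected_category"),
    )


def apply_engineer_edits(draft: SolutionDraft, **changes) -> SolutionDraft:
    """Return a copy of the draft with the engineer's edits applied."""
    unexpected = set(changes) - EDITABLE_FIELDS
    if unexpected:
        raise ValueError(f"fields not editable before save: {sorted(unexpected)}")
    return replace(draft, **changes)
```

Keeping the draft immutable means the auto-generated version and the edited version can both be retained, which may help answer open question 5 (how often engineers rewrite the AI summary).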

### 2. Dedup Check (on save)

Before saving, the system runs a similarity search (cosine similarity between embeddings) against the team's existing solutions.

**If similarity > 0.85 (strong match):**
- Show existing solution side-by-side with new one
- Three options:
  - **Merge** — update existing solution with new context/steps (keeps the better version, increments usage count)
  - **Keep Both** — they look similar but are actually different problems
  - **Discard** — it's the same thing, don't save

**If similarity is between 0.6 and 0.85, inclusive (partial match):**
- Show as "Related solutions" but save as new by default
- Engineer can choose to merge if they recognize it's the same

**If similarity < 0.6:**
- Save directly, no prompt
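
The three similarity bands can be sketched as a single decision function. The thresholds come from this section; the function and action names are hypothetical, and the pure-Python `cosine_similarity` is for illustration only (in practice the search would likely run inside pgvector).

```python
import math

STRONG_MATCH = 0.85   # above this: prompt Merge / Keep Both / Discard
PARTIAL_MATCH = 0.6   # from here up to STRONG_MATCH: show "Related solutions"


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def dedup_action(best_similarity: float) -> str:
    """Map the best match's similarity to the save-time behavior."""
    if best_similarity > STRONG_MATCH:
        return "prompt_merge"
    if best_similarity >= PARTIAL_MATCH:
        return "show_related"
    return "save_directly"
```

Note the boundary handling: a score of exactly 0.85 or exactly 0.6 falls into the partial-match band.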

### 3. RAG During Live Sessions

When an engineer starts or progresses through a copilot conversation:

1. After the first 2-3 message exchanges (enough context to understand the problem), FlowPilot searches the Solutions Library
2. Uses the conversation context as the query (not just the initial message)
3. If a solution scores above the suggestion threshold (value to be tuned during the pilot), FlowPilot surfaces it naturally:

> *"I found a similar issue. Sarah resolved an Exchange mail flow problem 3 days ago — she found a transport rule was blocking external senders. Want me to walk you through her resolution?"*

**If engineer says yes:**
- FlowPilot presents the resolution steps one at a time
- Engineer confirms each step worked or skips
- At the end, the solution's usage count increments

**If engineer says no:**
- FlowPilot continues open-ended troubleshooting
- The rejection is logged as a `rejected` event, which lowers the solution's confidence score and helps tune future relevance

**Retrieval rules:**
- Only surface solutions from the same team
- Max 1 suggestion per session (don't nag)
- Don't suggest solutions the same engineer saved (they already know them)
- Prefer recent solutions over old ones (tie-breaker)
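
The retrieval rules can be sketched as a filter over search candidates. This is a sketch under assumptions: `Candidate` and `pick_suggestion` are hypothetical names, the threshold is passed in because this spec does not fix a value, and "recent" is interpreted as `created_at` (it could equally be `last_used_at`).

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Candidate:
    solution_id: str
    team_id: str
    created_by: str
    similarity: float
    created_at: datetime


def pick_suggestion(
    candidates: list[Candidate],
    *,
    team_id: str,
    engineer_id: str,
    already_suggested: bool,
    threshold: float,
) -> Optional[Candidate]:
    """Return at most one solution to surface in this session, or None."""
    # Max 1 suggestion per session.
    if already_suggested:
        return None
    eligible = [
        c for c in candidates
        if c.team_id == team_id           # same team only
        and c.created_by != engineer_id   # skip the engineer's own solutions
        and c.similarity > threshold      # must clear the suggestion threshold
    ]
    if not eligible:
        return None
    # Highest similarity wins; the newer solution breaks exact ties.
    return max(eligible, key=lambda c: (c.similarity, c.created_at))
```

The team filter would normally also be applied in the database query; repeating it here keeps the rule explicit where the suggestion is chosen.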

### 4. Confidence Scoring

Each solution gets a confidence score (0-100):

| Event | Score change |
|-------|-------------|
| Saved from resolved session | +50 (base) |
| Another engineer uses it successfully | +15 |
| Engineer accepts RAG suggestion | +10 |
| Engineer rejects RAG suggestion | -5 |
| Multiple engineers save similar (merged) | +20 |
| Not suggested in 90 days | -10 (decay) |

High-confidence solutions are suggested more aggressively. Low-confidence solutions still appear in search but aren't proactively surfaced.
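
The scoring table can be sketched as a replay over a solution's event history. The deltas are taken from the table above; the mapping from table rows to the `event_type` values of `solution_events`, and clamping to 0-100 after every event, are assumptions.

```python
BASE_SCORE = 50  # "Saved from resolved session"

# Assumed mapping from the scoring table to solution_events.event_type.
EVENT_DELTAS = {
    "used": 15,       # another engineer uses it successfully
    "accepted": 10,   # engineer accepts RAG suggestion
    "rejected": -5,   # engineer rejects RAG suggestion
    "merged": 20,     # multiple engineers saved similar solutions
    "decayed": -10,   # not suggested in 90 days
}


def confidence_from_events(event_types: list[str]) -> int:
    """Replay events in order and return the resulting 0-100 score."""
    score = BASE_SCORE
    for event_type in event_types:
        if event_type not in EVENT_DELTAS:
            raise ValueError(f"unknown event type: {event_type!r}")
        # Clamp after each event so the score never leaves the 0-100 range.
        score = max(0, min(100, score + EVENT_DELTAS[event_type]))
    return score
```

For example, a solution that was accepted once and then used successfully once would score 50 + 10 + 15 = 75. Deriving the score from events rather than only mutating a counter makes it possible to recompute scores if the deltas are retuned after the pilot.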

### 5. Solutions Library UI

Replaces the current Step Library page. Card-based grid with:

**Each solution card shows:**
- Title (e.g., "Exchange Online mailbox not receiving email")
- Problem summary (2 lines, truncated)
- Root cause (1 line)
- Tags (environment, category)
- Saved by [engineer name] · [date]
- Used [N] times · Confidence [high/medium/low]

**Page features:**
- Search (full-text + semantic)
- Filter by tag, engineer, confidence, recency
- Sort by most used, most recent, highest confidence
- Manual "Add Solution" button (not just from sessions)
- Edit/delete for solutions you created (team admins can edit any)

---

## Data Model

### `solutions` table

| Column | Type | Notes |
|--------|------|-------|
| id | UUID | PK |
| team_id | UUID | FK to teams |
| created_by | UUID | FK to users |
| title | VARCHAR(255) | |
| problem_description | TEXT | What the user reported |
| root_cause | TEXT | What was actually wrong |
| resolution_steps | JSONB | Array of step strings |
| environment_tags | JSONB | Array of tag strings |
| category | VARCHAR(100) | Auto-detected or manual |
| source_session_id | UUID | FK to ai_sessions (nullable — manual entries have no source) |
| embedding | VECTOR(1536) | For similarity search (pgvector) |
| confidence_score | INTEGER | 0-100, default 50 |
| use_count | INTEGER | Times used via RAG suggestion |
| last_used_at | TIMESTAMPTZ | |
| created_at | TIMESTAMPTZ | |
| updated_at | TIMESTAMPTZ | |
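
A similarity lookup against the `embedding` column might look like the sketch below. It only builds the query and parameters; nothing is executed. Assumptions: psycopg-style `%(name)s` placeholders, pgvector's `<=>` cosine-distance operator (so similarity is `1 - distance`), and the hypothetical helper names `to_vector_literal` and `build_similarity_query`.

```python
EMBEDDING_DIM = 1536  # must match the VECTOR(1536) column

SIMILAR_SOLUTIONS_SQL = """
SELECT id, title, 1 - (embedding <=> %(query)s::vector) AS similarity
FROM solutions
WHERE team_id = %(team_id)s
  AND embedding IS NOT NULL
ORDER BY embedding <=> %(query)s::vector
LIMIT %(limit)s
"""


def to_vector_literal(embedding: list[float]) -> str:
    """Format a vector in pgvector's text form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


def build_similarity_query(
    team_id: str, query_embedding: list[float], limit: int = 5
) -> tuple[str, dict]:
    """Return (sql, params) for the nearest solutions within one team."""
    if len(query_embedding) != EMBEDDING_DIM:
        raise ValueError(
            f"expected {EMBEDDING_DIM} dimensions, got {len(query_embedding)}"
        )
    params = {
        "team_id": team_id,
        "query": to_vector_literal(query_embedding),
        "limit": limit,
    }
    return SIMILAR_SOLUTIONS_SQL, params
```

The same query could serve both the dedup check on save and mid-session retrieval, with only the query embedding and the threshold applied to `similarity` differing.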

### `solution_events` table (for confidence scoring)

| Column | Type | Notes |
|--------|------|-------|
| id | UUID | PK |
| solution_id | UUID | FK to solutions |
| event_type | VARCHAR(30) | 'used', 'accepted', 'rejected', 'merged', 'decayed' |
| user_id | UUID | FK to users (nullable for decay events) |
| session_id | UUID | FK to ai_sessions (nullable) |
| created_at | TIMESTAMPTZ | |

---

## Existing Infrastructure to Reuse

| What exists | Where | How it maps |
|-------------|-------|-------------|
| Knowledge Flywheel | `services/knowledge_flywheel.py` | Session analysis → can generate solution drafts |
| Knowledge Gap Service | `services/knowledge_gap_service.py` | Detects weak options → can flag sessions worth saving |
| RAG in assistant chat | `services/ai_chat_service.py` | Already does retrieval — extend to Solutions Library |
| Step Library UI | `components/step-library/` | Restyle as Solutions Library |
| pgvector | Already in Docker image (`pgvector/pgvector:pg16`) | Embedding storage + similarity search |
| FlowPilot session conclusion | `components/flowpilot/` | Add "Save as Solution" prompt |

---

## Implementation Phases (future)

### Phase 1: Capture & Library
- Solutions table + migrations
- Post-session "Save as Solution" prompt in FlowPilot
- Auto-generate resolution summary from session transcript
- Solutions Library page (replaces Step Library)
- Manual add/edit/delete

### Phase 2: Dedup
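- Embedding generation on save (provider TBD, e.g. OpenAI or Voyage AI; Anthropic does not offer a first-party embedding model). The model's output dimension must match the `VECTOR(1536)` column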
- Similarity search on save
- Merge/Keep Both/Discard UI

### Phase 3: Smart RAG
- Mid-session similarity search
- Natural language suggestion in FlowPilot conversation
- Accept/reject tracking
- Confidence scoring + decay job

### Phase 4: Team Intelligence
- "Trending solutions" on dashboard
- "Your team resolved 12 Exchange issues this week" insights
- Solution suggestions in the copilot intake ("Common issues today: VPN, Exchange, AD lockouts")

---

## Open Questions (to answer during pilot)

1. Do engineers actually want to save resolutions, or is it friction?
2. How similar do problems need to be before a suggestion is helpful vs. annoying?
3. Should solutions be editable by the whole team, or only the creator + admins?
4. What's the right moment to prompt "Save as Solution" — right after resolution, or in a follow-up?
5. Do engineers trust AI-generated resolution summaries, or do they want to write their own?

These questions should be answerable after 2-4 weeks of pilot usage.