# Phase C: Sensitive Data Redaction - Consolidated Implementation Plan

> **Status:** Approved, ready for implementation
> **Spec:** `docs/plans/2026-02-13-EXPORT-IMPROVEMENTS-SPEC.md` section C1
> **UI Decision:** Simple toggle (Option 1)
> **Redaction Posture:** Conservative (false positives > false negatives)
> **Branch:** `feat/export-phase-c`
> **No DB migration required**

---

## Overview

Server-side regex redaction with a simple checkbox toggle in the export preview modal. Redaction runs **after** export generation and variable resolution, so no sensitive data slips through via late substitution. There is no rich editor; the existing textarea stays. The user sees a summary of what was masked and can manually edit the result.

Redaction is **non-persistent** and **request-scoped**: database records are never mutated.

---

## Scope

**In scope:**
- Redaction for exported content in SessionDetailPage preview/download/copy flows
- Backend redaction summary returned to frontend for user visibility
- Conservative pattern set (IPv4, IPv6, email, bearer/API/JWT-like tokens, UNC paths)

**Out of scope:**
- Rich editor / highlight / per-item unmask controls
- Redaction changes to non-export APIs or persisted session data
- Hostname masking (MSP tickets legitimately reference hostnames)

---

## Design Decisions

| Decision | Rationale |
|----------|-----------|
| Redaction runs post-generation, post-variable-substitution | Prevents misses from late substitutions; redacts the final rendered text |
| Fail-closed on error | If `redaction_mode="mask"` and redaction processing fails, return 500; never leak unredacted content |
| Conservative detection | Prefer false positives over false negatives; users can manually edit |
| Idempotent output | Running redaction twice on already-redacted content produces the same result |
| Deterministic replacement order | Patterns applied in fixed order to prevent overlapping-match inconsistencies |
| Non-persistent | DB records are never mutated; redaction is request-scoped |
| Hostname exclusion | MSP tickets legitimately reference hostnames |

---

## Backend

### 1. New File: `backend/app/services/redaction_service.py`

**`RedactionSummary` dataclass:**
```python
from dataclasses import dataclass


@dataclass
class RedactionSummary:
    ips: int = 0
    emails: int = 0
    tokens: int = 0
    unc_paths: int = 0

    @property
    def total(self) -> int:
        return self.ips + self.emails + self.tokens + self.unc_paths
```

**Compiled regex pattern registry (deterministic order):**

| Priority | Pattern | Regex | Replacement |
|----------|---------|-------|-------------|
| 1 | Bearer tokens | `Bearer\s+[A-Za-z0-9._-]+` | `[TOKEN REDACTED]` |
| 2 | API key patterns | Long hex/base64 strings (32+ chars) | `[TOKEN REDACTED]` |
| 3 | UNC paths | `\\\\[\w.-]+\\[\w$.-]+` | `[UNC PATH REDACTED]` |
| 4 | Email | `\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b` | `[EMAIL REDACTED]` |
| 5 | IPv6 | `\b(?:[0-9a-fA-F]{1,4}:){2,7}[0-9a-fA-F]{1,4}\b` | `[IP REDACTED]` |
| 6 | IPv4 | `\b(?:\d{1,3}\.){3}\d{1,3}\b` | `[IP REDACTED]` |

> **Priority rationale:** More specific/longer patterns run first to prevent partial matches: bearer tokens before general tokens, IPv6 before IPv4, and so on.

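
The effect of the fixed order can be seen in a small sketch. The helper `redact_in_order` is hypothetical, and the API-key regex is an assumption, since the plan only specifies "long hex/base64 strings (32+ chars)". With the planned order the whole `Bearer ...` credential collapses into one placeholder; with the order swapped, the `Bearer` keyword is left behind.

```python
import re

BEARER = re.compile(r"Bearer\s+[A-Za-z0-9._-]+")
# Assumed stand-in for the "API key" row: a run of 32+ hex/base64 characters.
API_KEY = re.compile(r"[A-Za-z0-9+/_-]{32,}={0,2}")


def redact_in_order(text, patterns):
    """Apply (pattern, replacement) pairs in sequence; return (text, match count)."""
    total = 0
    for pattern, replacement in patterns:
        text, count = pattern.subn(replacement, text)
        total += count
    return text, total


line = "Authorization: Bearer 0123456789abcdef0123456789abcdef0123"

planned = [(BEARER, "[TOKEN REDACTED]"), (API_KEY, "[TOKEN REDACTED]")]
swapped = list(reversed(planned))

print(redact_in_order(line, planned))  # ('Authorization: [TOKEN REDACTED]', 1)
print(redact_in_order(line, swapped))  # ('Authorization: Bearer [TOKEN REDACTED]', 1)
```

Both orders hide the secret and report one match, but only the planned order produces the same output regardless of which token pattern would have matched first.
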
**Core function:**
```python
def apply_redaction_to_text(content: str) -> tuple[str, RedactionSummary]:
    """
    Apply all redaction patterns to text content.
    Uses re.subn for replacement + counting in one pass per pattern.
    Returns (redacted_content, summary).
    """
```

- Compile all patterns at module load time (not per-request)
- Use `re.subn()` for simultaneous replacement and counting
- Ensure idempotent output: already-redacted placeholders like `[IP REDACTED]` must not be re-matched
- Raise exception on unexpected errors (fail-closed behavior enforced by caller)

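
A minimal sketch of how the function body might look under the requirements listed above. The `_PATTERNS` name and its tuple layout are hypothetical, the API-key regex is an assumption (the plan gives only the 32-character minimum), and the email pattern's final character class is written as `[A-Za-z]`.

```python
import re
from dataclasses import dataclass


@dataclass
class RedactionSummary:
    ips: int = 0
    emails: int = 0
    tokens: int = 0
    unc_paths: int = 0

    @property
    def total(self) -> int:
        return self.ips + self.emails + self.tokens + self.unc_paths


# (summary field, pattern, replacement), compiled once at import, in priority order.
_PATTERNS = [
    ("tokens", re.compile(r"Bearer\s+[A-Za-z0-9._-]+"), "[TOKEN REDACTED]"),
    ("tokens", re.compile(r"[A-Za-z0-9+/_-]{32,}={0,2}"), "[TOKEN REDACTED]"),  # assumed
    ("unc_paths", re.compile(r"\\\\[\w.-]+\\[\w$.-]+"), "[UNC PATH REDACTED]"),
    ("emails", re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
     "[EMAIL REDACTED]"),
    ("ips", re.compile(r"\b(?:[0-9a-fA-F]{1,4}:){2,7}[0-9a-fA-F]{1,4}\b"), "[IP REDACTED]"),
    ("ips", re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"), "[IP REDACTED]"),
]


def apply_redaction_to_text(content: str) -> tuple[str, RedactionSummary]:
    """Run every pattern in order; subn replaces and counts in one pass each."""
    summary = RedactionSummary()
    for field, pattern, replacement in _PATTERNS:
        content, count = pattern.subn(replacement, content)
        setattr(summary, field, getattr(summary, field) + count)
    return content, summary


sample = r"Ping 10.0.0.5 from \\fileserver\share$ and mail admin@example.com"
redacted, summary = apply_redaction_to_text(sample)
print(redacted)       # Ping [IP REDACTED] from [UNC PATH REDACTED] and mail [EMAIL REDACTED]
print(summary.total)  # 3

# Idempotency: a second pass finds nothing and leaves the text unchanged.
again, second = apply_redaction_to_text(redacted)
print(again == redacted, second.total)  # True 0
```

Exceptions are deliberately not caught here, so the caller can enforce the fail-closed rule.
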
### 2. Schema Change: `backend/app/schemas/session.py`

Add to `SessionExport`:
```python
redaction_mode: Literal["none", "mask"] = "none"
```

### 3. Endpoint Integration: `backend/app/api/endpoints/sessions.py`

Update export flow with this execution order:
```
1. Fetch session
2. Generate export by format (markdown/text/html)
3. Resolve variables
4. IF redaction_mode == "mask":
     Call redaction service on final rendered content
     If redaction raises → return 500 (fail-closed)
5. Set response headers
6. Return content
```

**Critical: Redaction happens AFTER steps 2-3**, not before format branching.

**Response headers:**
- `X-Redaction-Mode: none|mask` (always present on export responses)
- `X-Redaction-Summary: {"ips": 3, "emails": 2, "tokens": 1, "unc_paths": 0, "total": 6}` (present only when mode is `mask`)

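
Steps 4 to 6 and the fail-closed rule can be sketched without any web framework. Everything named here is hypothetical: `finalize_export`, the `redact` callable (which returns the text plus a plain dict of counts), and the error body. A real endpoint would raise the framework's HTTP error instead of returning a status tuple.

```python
import json
from typing import Callable, Dict, Tuple

Redactor = Callable[[str], Tuple[str, Dict[str, int]]]


def finalize_export(
    content: str, redaction_mode: str, redact: Redactor
) -> Tuple[int, str, Dict[str, str]]:
    """Redact the rendered export if requested, then build the response headers.

    Returns (status_code, body, headers).
    """
    headers = {"X-Redaction-Mode": redaction_mode}
    if redaction_mode == "mask":
        try:
            content, summary = redact(content)
        except Exception:
            # Fail closed: return an error rather than the unredacted content.
            return 500, "Export redaction failed", headers
        headers["X-Redaction-Summary"] = json.dumps(summary)
    return 200, content, headers


def broken_redactor(text: str):
    raise RuntimeError("simulated pattern failure")


status, body, headers = finalize_export("gateway 10.0.0.5", "mask", broken_redactor)
print(status, "10.0.0.5" in body)  # 500 False
```
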
**Redaction footer appended to export content when matches exist:**
```
--- Redacted: 3 IPs, 2 emails, 1 token ---
```

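
One possible way to build that footer line from the summary counts, as a sketch. The function name `format_redaction_footer`, the "UNC path" label, and the choice to omit zero-count categories are assumptions inferred from the example above.

```python
def format_redaction_footer(ips: int, emails: int, tokens: int, unc_paths: int) -> str:
    """Return the footer line, or an empty string when nothing was redacted."""
    labels = [(ips, "IP"), (emails, "email"), (tokens, "token"), (unc_paths, "UNC path")]
    # Skip zero counts and pluralize everything except exactly one.
    parts = [f"{n} {label}{'' if n == 1 else 's'}" for n, label in labels if n > 0]
    return f"--- Redacted: {', '.join(parts)} ---" if parts else ""


print(format_redaction_footer(3, 2, 1, 0))  # --- Redacted: 3 IPs, 2 emails, 1 token ---
```
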
Keep existing media types and exported-flag behavior unchanged.

### 4. CORS Header Exposure: `backend/main.py`

Update **both** CORS middleware branches to expose redaction headers:
```python
expose_headers=[
    "X-Redaction-Mode",
    "X-Redaction-Summary",
    "X-Correlation-ID",
    "X-Process-Time"
]
```

> **Without this, the frontend cannot read these custom headers on cross-origin responses.** Browsers expose only CORS-safelisted response headers to scripts unless the server lists the others in `Access-Control-Expose-Headers`.

---

## Frontend

### 5. Types: `frontend/src/types/session.ts`

Add to `SessionExport` type:
```typescript
redaction_mode?: 'none' | 'mask';
```

Add new interface:
```typescript
interface RedactionSummary {
  ips: number;
  emails: number;
  tokens: number;
  unc_paths: number;
  total: number;
}
```

### 6. API Layer: `frontend/src/api/sessions.ts`

**Keep existing `export()` function unchanged** for backward compatibility.

**Add new function:**
```typescript
async function exportWithMeta(
  id: string,
  options: SessionExport
): Promise<{
  content: string;
  redactionMode: 'none' | 'mask';
  redactionSummary: RedactionSummary | null;
}> {
  // Makes same API call but parses response headers
  // Safely parse X-Redaction-Summary with try/catch
  // Returns structured metadata alongside content
}
```

> **Why a separate function?** Existing callers of `export()` don't break. Preview flows that need metadata use the new function. Clean separation.

### 7. Session Detail Page: `frontend/src/pages/SessionDetailPage.tsx`

- Add state: `redactionMode: 'none' | 'mask'` (default: `'none'`)
- Add state: `redactionSummary: RedactionSummary | null`
- Use `exportWithMeta()` for preview and toggle-refresh flows
- Pass toggle callback and summary to `ExportPreviewModal`
- Keep "Copy for Ticket" and non-preview copy behavior unchanged unless explicitly toggled
- Follow same pattern as existing `includeSummary` state

### 8. Export Preview Modal: `frontend/src/components/session/ExportPreviewModal.tsx`

**New props:**
```typescript
redactionEnabled?: boolean;
onToggleRedaction?: (enabled: boolean) => void;
redactionSummary?: RedactionSummary | null;
```

**Checkbox (match the existing "Include Summary" visual pattern):**
```tsx
<label className="flex items-center gap-2 text-sm text-white/60 cursor-pointer">
  <input type="checkbox" checked={redactionEnabled} onChange={...} />
  Mask Sensitive Data
</label>
```

**Summary display:**
- When matches exist: `"Masked: 3 IPs, 2 emails, 1 token"` in `text-blue-400`
- When mask is on but no matches: `"No sensitive data detected"` in `text-white/40`
- Helper text below toggle: `"Toggling reloads content and replaces any manual edits"` in `text-white/30 text-xs`

---

## Testing

### Backend Unit Tests: `backend/tests/test_redaction_service.py`

| Test Case | Description |
|-----------|-------------|
| Individual patterns | Each pattern type independently (IPv4, IPv6, email, bearer token, API key, UNC path) |
| Mixed content | Multiple pattern types in single text block, verify aggregate counts |
| No matches | Input with no sensitive data returns unchanged text and zero counts |
| Idempotency | Already-redacted placeholders (`[IP REDACTED]`) are not re-matched or double-counted |
| Token boundaries | Conservative token detection minimum-length boundaries (32+ chars) |
| Edge cases | Empty strings, None handling, very long strings |
| Total calculation | `summary.total` matches sum of individual counts |

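
As an illustration of the "Token boundaries" and "Idempotency" rows, the checks below are written in plain-assert style that pytest would also collect. The `API_KEY` regex is an assumed stand-in, since the plan fixes only the 32-character minimum, and the test names are hypothetical.

```python
import re

# Assumed stand-in for the API-key pattern (32+ hex/base64 characters).
API_KEY = re.compile(r"[A-Za-z0-9+/_-]{32,}={0,2}")

PLACEHOLDERS = (
    "[TOKEN REDACTED]",
    "[UNC PATH REDACTED]",
    "[EMAIL REDACTED]",
    "[IP REDACTED]",
)


def test_token_boundaries():
    assert API_KEY.search("a" * 31) is None         # one short of the minimum
    assert API_KEY.fullmatch("a" * 32) is not None  # exactly at the minimum
    assert API_KEY.sub("[TOKEN REDACTED]", "id=" + "f" * 40) == "id=[TOKEN REDACTED]"


def test_placeholders_are_not_rematched():
    for placeholder in PLACEHOLDERS:
        assert API_KEY.search(placeholder) is None


test_token_boundaries()
test_placeholders_are_not_rematched()
```
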
### Backend Integration Tests: `backend/tests/test_sessions.py` (extend)

| Test Case | Description |
|-----------|-------------|
| `redaction_mode=none` | Returns unmasked export and `X-Redaction-Mode: none` header |
| `redaction_mode=mask` | Masks content and sets parseable `X-Redaction-Summary` header |
| Variable substitution | Content from variable resolution is also masked when matching patterns |
| Media types unchanged | Export content types remain the same regardless of redaction |
| Exported flag unchanged | Existing exported-flag semantics for completed/in-progress sessions unchanged |
| Error behavior | Redaction failure returns 500, not unredacted content |

### Frontend Validation

- `npm run build` validates types
- `npm run test` for any existing test suites
- Verify `exportWithMeta` header parsing behavior
- Verify `ExportPreviewModal` toggle and summary rendering states

### Manual QA Checklist

- [ ] Preview with redaction OFF shows original content
- [ ] Preview with redaction ON masks sensitive data and shows accurate summary
- [ ] Toggle redaction repeatedly and verify stable counts and content
- [ ] Download from preview uses the currently shown (edited/masked) content
- [ ] Copy for Ticket respects current redaction choice
- [ ] Content with variables resolves correctly, then redacts
- [ ] Redaction footer appears in exported content when matches exist
- [ ] Summary line disappears when redaction is toggled off

---

## Acceptance Criteria

1. User can enable/disable masking via preview toggle without page reload
2. Masked output contains no raw matches for any covered pattern
3. Summary counts are visible in UI and match backend-calculated values
4. No persisted session fields are changed by export redaction
5. Existing export formats and Phase B features continue to pass current tests
6. Redaction failure results in 500 error, never unredacted content delivery

---

## Files to Create/Modify

| Action | File | Notes |
|--------|------|-------|
| **Create** | `backend/app/services/redaction_service.py` | Core redaction engine |
| **Create** | `backend/tests/test_redaction_service.py` | Unit tests for redaction |
| **Modify** | `backend/app/schemas/session.py` | Add `redaction_mode` to `SessionExport` |
| **Modify** | `backend/app/api/endpoints/sessions.py` | Integration point (post-generation) |
| **Modify** | `backend/main.py` | CORS `expose_headers` for both branches |
| **Modify** | `frontend/src/types/session.ts` | Add `RedactionSummary` interface + `redaction_mode` |
| **Modify** | `frontend/src/api/sessions.ts` | Add `exportWithMeta()` function |
| **Modify** | `frontend/src/components/session/ExportPreviewModal.tsx` | Checkbox + summary UI |
| **Modify** | `frontend/src/pages/SessionDetailPage.tsx` | State management + wiring |
| **Extend** | `backend/tests/test_sessions.py` | Integration tests for export + redaction |

---

## Implementation Order

1. `redaction_service.py` + unit tests (standalone, no dependencies)
2. Schema change in `session.py`
3. Endpoint integration in `sessions.py` + CORS update in `main.py`
4. Backend integration tests
5. Frontend types + API layer (`session.ts`, `sessions.ts`)
6. Frontend UI (`ExportPreviewModal.tsx`, `SessionDetailPage.tsx`)
7. Manual QA against checklist

---

## Assumptions & Defaults

- Default redaction mode is `none`
- Redaction scope is export content only, not stored session data
- Hostnames are intentionally not masked
- Conservative detection is accepted, including possible false positives
- No DB migration is required
- Existing `export()` API function remains unchanged for backward compatibility