chihlasm 350c977eda feat: add procedural flows with intake forms, navigation, and seed templates

Adds a new "procedural" tree type for linear step-by-step project workflows
(domain controller setup, M365 onboarding, VPN config, etc). Includes intake
form builder, two-panel step navigation, variable resolution, procedural
exports, 3 seed templates, and UI rename from "Trees" to "Flows".

Also archives 19 implemented plan docs and creates deferred features backlog.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-14 04:13:52 -05:00

Phase C: Sensitive Data Redaction — Consolidated Implementation Plan

Status: Approved — ready for implementation
Spec: docs/plans/2026-02-13-EXPORT-IMPROVEMENTS-SPEC.md section C1
UI Decision: Simple toggle (Option 1)
Redaction Posture: Conservative (prefer false positives over false negatives)
Branch: feat/export-phase-c
No DB migration required

Overview

Server-side regex redaction with a simple checkbox toggle in the export preview modal. Redaction runs after export generation and variable resolution to ensure no sensitive data slips through via late substitution. No rich editor — keeps the existing textarea. User sees a summary of what was masked and can manually edit the result.

Redaction is non-persistent and request-scoped — database records are never mutated.

Scope

In scope:

Redaction for exported content in SessionDetailPage preview/download/copy flows
Backend redaction summary returned to frontend for user visibility
Conservative pattern set (IPv4, IPv6, email, bearer/API/JWT-like tokens, UNC paths)

Out of scope:

Rich editor / highlight / per-item unmask controls
Redaction changes to non-export APIs or persisted session data
Hostname masking (MSP tickets legitimately reference hostnames)

Design Decisions

Decision	Rationale
Redaction runs post-generation, post-variable-substitution	Prevents misses from late substitutions; redacts the final rendered text
Fail-closed on error	If `redaction_mode="mask"` and redaction processing fails, return 500 — never leak unredacted content
Conservative detection	Prefer false positives over false negatives; users can manually edit
Idempotent output	Running redaction twice on already-redacted content produces the same result
Deterministic replacement order	Patterns applied in fixed order to prevent overlapping-match inconsistencies
Non-persistent	DB records are never mutated; redaction is request-scoped
Hostname exclusion	MSP tickets legitimately reference hostnames

Backend

1. New File: `backend/app/services/redaction_service.py`

RedactionSummary dataclass:

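from dataclasses import dataclass


@dataclass
class RedactionSummary:
    """Match counts per category for one export request."""

    ips: int = 0        # IPv4 and IPv6 matches
    emails: int = 0
    tokens: int = 0     # bearer tokens and API-key-like strings
    unc_paths: int = 0

    @property
    def total(self) -> int:
        # Sum of all categories; mirrors the "total" key in X-Redaction-Summary.
        return self.ips + self.emails + self.tokens + self.unc_paths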

Compiled regex pattern registry (deterministic order):

Priority	Pattern	Regex	Replacement
1	Bearer tokens	`Bearer\s+[A-Za-z0-9._-]+`	`[TOKEN REDACTED]`
2	API key patterns	Long hex/base64 strings (32+ chars)	`[TOKEN REDACTED]`
3	UNC paths	`\\\\[\w.-]+\\[\w$.-]+`	`[UNC PATH REDACTED]`
4	Email	`\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b`	`[EMAIL REDACTED]`
5	IPv6	`\b(?:[0-9a-fA-F]{1,4}:){2,7}[0-9a-fA-F]{1,4}\b`	`[IP REDACTED]`
6	IPv4	`\b(?:\d{1,3}\.){3}\d{1,3}\b`	`[IP REDACTED]`

Priority rationale: More specific/longer patterns match first to prevent partial matches. Bearer tokens before general tokens, IPv6 before IPv4, etc.
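
To illustrate why the order is fixed, the sketch below applies the Bearer pattern and a generic long-token pattern in both orders. The API-key regex is an assumption (the plan only describes it as 32+ character hex/base64 strings), and `redact` is a hypothetical helper, not part of the planned service.

```python
import re

BEARER = re.compile(r"Bearer\s+[A-Za-z0-9._-]+")
# Assumed stand-in for the "long hex/base64 strings (32+ chars)" rule.
API_KEY = re.compile(r"[A-Za-z0-9_-]{32,}")
TOKEN = "[TOKEN REDACTED]"


def redact(text, patterns):
    """Apply each pattern in the given order, replacing matches with TOKEN."""
    for pattern in patterns:
        text = pattern.sub(TOKEN, text)
    return text


line = "Authorization: Bearer " + "a1b2c3d4" * 5  # 40-character token

specific_first = redact(line, [BEARER, API_KEY])
generic_first = redact(line, [API_KEY, BEARER])

print(specific_first)  # Authorization: [TOKEN REDACTED]
print(generic_first)   # Authorization: Bearer [TOKEN REDACTED]
```

Both orders hide the secret, but they produce different text. Applying the specific pattern first removes the `Bearer` prefix as well, and keeping one fixed order means the same input always yields the same output.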

Core function:

def apply_redaction_to_text(content: str) -> tuple[str, RedactionSummary]:
    """
    Apply all redaction patterns to text content.
    Uses re.subn for replacement + counting in one pass per pattern.
    Returns (redacted_content, summary).
    """

Compile all patterns at module load time (not per-request)
Use re.subn() for simultaneous replacement and counting
Ensure idempotent output — already-redacted placeholders like [IP REDACTED] must not be re-matched
Raise exception on unexpected errors (fail-closed behavior enforced by caller)
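
A minimal sketch of how the registry and core function could fit together, under stated assumptions: the API-key regex is a guess at the "32+ chars" rule, `_PATTERNS` is a hypothetical name, and both IP patterns are assumed to count toward `ips`. It is not the final implementation.

```python
import re
from dataclasses import dataclass


@dataclass
class RedactionSummary:
    ips: int = 0
    emails: int = 0
    tokens: int = 0
    unc_paths: int = 0

    @property
    def total(self) -> int:
        return self.ips + self.emails + self.tokens + self.unc_paths


# (summary field, compiled pattern, replacement), applied top to bottom.
# Compiled once at import time, not per request.
_PATTERNS = [
    ("tokens", re.compile(r"Bearer\s+[A-Za-z0-9._-]+"), "[TOKEN REDACTED]"),
    # Assumption: the plan does not give this regex, only "32+ chars".
    ("tokens", re.compile(r"[A-Za-z0-9_-]{32,}"), "[TOKEN REDACTED]"),
    ("unc_paths", re.compile(r"\\\\[\w.-]+\\[\w$.-]+"), "[UNC PATH REDACTED]"),
    ("emails",
     re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
     "[EMAIL REDACTED]"),
    ("ips",
     re.compile(r"\b(?:[0-9a-fA-F]{1,4}:){2,7}[0-9a-fA-F]{1,4}\b"),
     "[IP REDACTED]"),
    ("ips", re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"), "[IP REDACTED]"),
]


def apply_redaction_to_text(content: str) -> tuple[str, RedactionSummary]:
    """Return (redacted_content, summary); any exception propagates to the caller."""
    summary = RedactionSummary()
    for field_name, pattern, replacement in _PATTERNS:
        # subn replaces and counts in one pass for this pattern.
        content, count = pattern.subn(replacement, content)
        setattr(summary, field_name, getattr(summary, field_name) + count)
    return content, summary


sample = (
    r"Connect to \\fs01\share$ from 10.0.0.5, "
    "mail admin@example.com, auth Bearer abc.def-123"
)
redacted, summary = apply_redaction_to_text(sample)
again, second_summary = apply_redaction_to_text(redacted)  # idempotency check
```

None of the placeholders contain characters that the patterns require (digits with dots, `@`, colons, backslashes, or 32-character runs), so a second pass leaves the text unchanged. Two properties of the IPv6 regex as written are worth covering in tests: it does not match `::`-compressed addresses such as `fe80::1`, and it does match `HH:MM:SS` timestamps and MAC addresses.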

2. Schema Change: `backend/app/schemas/session.py`

Add to SessionExport (requires `from typing import Literal` in the module):

redaction_mode: Literal["none", "mask"] = "none"

3. Endpoint Integration: `backend/app/api/endpoints/sessions.py`

Update export flow with this execution order:

1. Fetch session
2. Generate export by format (markdown/text/html)
3. Resolve variables
4. IF redaction_mode == "mask":
     Call redaction service on final rendered content
     If redaction raises → return 500 (fail-closed)
5. Set response headers
6. Return content

Critical: Redaction happens AFTER steps 2-3, not before format branching.
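
The sketch below shows why the order matters and what fail-closed looks like in code. Everything here is illustrative: the `{{name}}` placeholder syntax, `resolve_variables`, `build_export`, and `RedactionFailed` are hypothetical names, and only the IPv4 pattern is included to keep the example short.

```python
import re

IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


class RedactionFailed(Exception):
    """Hypothetical error that the endpoint would translate into HTTP 500."""


def resolve_variables(text: str, variables: dict) -> str:
    # Assumed {{name}} syntax; unknown names are left as-is.
    return re.sub(
        r"\{\{(\w+)\}\}",
        lambda m: variables.get(m.group(1), m.group(0)),
        text,
    )


def mask_ips(text: str) -> str:
    return IPV4.sub("[IP REDACTED]", text)


def build_export(template: str, variables: dict, redaction_mode: str,
                 redactor=mask_ips) -> str:
    content = resolve_variables(template, variables)  # step 3
    if redaction_mode == "mask":                      # step 4
        try:
            content = redactor(content)
        except Exception as exc:
            # Fail closed: never fall back to the unredacted content.
            raise RedactionFailed("redaction failed") from exc
    return content


template = "Gateway: {{gw}}"
variables = {"gw": "10.0.0.1"}

correct = build_export(template, variables, "mask")
# Redacting before substitution finds nothing to mask, so the IP leaks.
wrong_order = resolve_variables(mask_ips(template), variables)

print(correct)      # Gateway: [IP REDACTED]
print(wrong_order)  # Gateway: 10.0.0.1
```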

Response headers:

X-Redaction-Mode: none|mask — always present on export responses
X-Redaction-Summary: {"ips": 3, "emails": 2, "tokens": 1, "unc_paths": 0, "total": 6} — present only when mode is mask

Redaction footer appended to export content when matches exist:

--- Redacted: 3 IPs, 2 emails, 1 token ---
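
One possible way to build the header value and the footer line from the same counts is sketched below. The function names are hypothetical, and the "UNC path" label is an assumption, since the example footer only shows IPs, emails, and tokens.

```python
import json
from typing import Optional

# (summary key, singular label, plural label).
# The UNC labels are assumed; the plan's example does not show them.
_LABELS = [
    ("ips", "IP", "IPs"),
    ("emails", "email", "emails"),
    ("tokens", "token", "tokens"),
    ("unc_paths", "UNC path", "UNC paths"),
]


def summary_header_value(counts: dict) -> str:
    """JSON for X-Redaction-Summary; ASCII-only and single-line by default."""
    payload = {key: counts.get(key, 0) for key, _, _ in _LABELS}
    payload["total"] = sum(payload.values())
    return json.dumps(payload)


def redaction_footer(counts: dict) -> Optional[str]:
    """Footer line listing non-zero categories, or None when nothing matched."""
    parts = []
    for key, singular, plural in _LABELS:
        n = counts.get(key, 0)
        if n > 0:
            parts.append(f"{n} {singular if n == 1 else plural}")
    if not parts:
        return None
    return "--- Redacted: " + ", ".join(parts) + " ---"


counts = {"ips": 3, "emails": 2, "tokens": 1, "unc_paths": 0}
print(summary_header_value(counts))
# {"ips": 3, "emails": 2, "tokens": 1, "unc_paths": 0, "total": 6}
print(redaction_footer(counts))
# --- Redacted: 3 IPs, 2 emails, 1 token ---
```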

Keep existing media types and exported-flag behavior unchanged.

4. CORS Header Exposure: `backend/main.py`

Update both CORS middleware branches to expose redaction headers:

expose_headers=[
    "X-Redaction-Mode",
    "X-Redaction-Summary",
    "X-Correlation-ID",
    "X-Process-Time"
]

Without this, frontend JavaScript cannot read these custom headers on cross-origin responses: browsers expose only the CORS-safelisted response headers unless the server lists others in `Access-Control-Expose-Headers`.

Frontend

5. Types: `frontend/src/types/session.ts`

Add to SessionExport type:

redaction_mode?: 'none' | 'mask';

Add new interface:

interface RedactionSummary {
  ips: number;
  emails: number;
  tokens: number;
  unc_paths: number;
  total: number;
}

6. API Layer: `frontend/src/api/sessions.ts`

Keep existing export() function unchanged for backward compatibility.

Add new function:

async function exportWithMeta(
  id: string,
  options: SessionExport
): Promise<{
  content: string;
  redactionMode: 'none' | 'mask';
  redactionSummary: RedactionSummary | null;
}> {
  // Makes same API call but parses response headers
  // Safely parse X-Redaction-Summary with try/catch
  // Returns structured metadata alongside content
}

Why a separate function? Existing callers of export() don't break. Preview flows that need metadata use the new function. Clean separation.

7. Session Detail Page: `frontend/src/pages/SessionDetailPage.tsx`

Add state: redactionMode: 'none' | 'mask' (default: 'none')
Add state: redactionSummary: RedactionSummary | null
Use exportWithMeta() for preview and toggle-refresh flows
Pass toggle callback and summary to ExportPreviewModal
Keep "Copy for Ticket" and non-preview copy behavior unchanged unless the user has enabled the redaction toggle
Follow same pattern as existing includeSummary state

8. Export Preview Modal: `frontend/src/components/session/ExportPreviewModal.tsx`

New props:

redactionEnabled?: boolean;
onToggleRedaction?: (enabled: boolean) => void;
redactionSummary?: RedactionSummary | null;

Checkbox — match existing "Include Summary" visual pattern:

<label className="flex items-center gap-2 text-sm text-white/60 cursor-pointer">
  <input type="checkbox" checked={redactionEnabled} onChange={...} />
  Mask Sensitive Data
</label>

Summary display:

When matches exist: "Masked: 3 IPs, 2 emails, 1 token" in text-blue-400
When mask is on but no matches: "No sensitive data detected" in text-white/40
Helper text below toggle: "Toggling reloads content and replaces any manual edits" in text-white/30 text-xs

Testing

Backend Unit Tests: `backend/tests/test_redaction_service.py`

Test Case	Description
Individual patterns	Each pattern type independently (IPv4, IPv6, email, bearer token, API key, UNC path)
Mixed content	Multiple pattern types in single text block, verify aggregate counts
No matches	Input with no sensitive data returns unchanged text and zero counts
Idempotency	Already-redacted placeholders (`[IP REDACTED]`) are not re-matched or double-counted
Token boundaries	Token detection honors the 32-character minimum: a 31-character string passes through, a 32-character string is masked
Edge cases	Empty strings, None handling, very long strings
Total calculation	`summary.total` matches sum of individual counts
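
A few of these cases could be written as plain pytest-style functions, as sketched below. To stay self-contained the sketch tests a small stand-in `redact` function with an assumed API-key regex, not the planned `apply_redaction_to_text`; the real tests would import the service instead.

```python
import re

# Stand-in patterns for illustration; the API-key regex is an assumption.
API_KEY = re.compile(r"[A-Za-z0-9_-]{32,}")
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def redact(text: str) -> tuple[str, int]:
    """Stand-in redactor returning (redacted_text, match_count)."""
    text, tokens = API_KEY.subn("[TOKEN REDACTED]", text)
    text, ips = IPV4.subn("[IP REDACTED]", text)
    return text, tokens + ips


def test_token_boundary():
    assert redact("f" * 31) == ("f" * 31, 0)           # below the minimum
    assert redact("f" * 32) == ("[TOKEN REDACTED]", 1)  # at the minimum


def test_idempotency():
    once, first_count = redact("host 192.168.1.10 key " + "f" * 32)
    twice, second_count = redact(once)
    assert first_count == 2
    assert twice == once       # placeholders are not re-matched
    assert second_count == 0   # and not double-counted


def test_no_matches():
    text = "Restarted the print spooler on FILESRV01."
    assert redact(text) == (text, 0)
```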

Backend Integration Tests: `backend/tests/test_sessions.py` (extend)

Test Case	Description
`redaction_mode=none`	Returns unmasked export and `X-Redaction-Mode: none` header
`redaction_mode=mask`	Masks content and sets parseable `X-Redaction-Summary` header
Variable substitution	Content from variable resolution is also masked when matching patterns
Media types unchanged	Export content types remain the same regardless of redaction
Exported flag unchanged	Existing exported-flag semantics for completed/in-progress sessions unchanged
Error behavior	Redaction failure returns 500, not unredacted content

Frontend Validation

npm run build validates types
npm run test for any existing test suites
Verify exportWithMeta header parsing behavior
Verify ExportPreviewModal toggle and summary rendering states

Manual QA Checklist

Preview with redaction OFF shows original content
Preview with redaction ON masks sensitive data and shows accurate summary
Toggle redaction repeatedly — verify stable counts and content
Download from preview uses the currently shown (edited/masked) content
Copy for Ticket respects current redaction choice
Content with variables resolves correctly, then redacts
Redaction footer appears in exported content when matches exist
Summary line disappears when redaction is toggled off

Acceptance Criteria

User can enable/disable masking via preview toggle without page reload
Masked output contains no raw matches for any covered pattern
Summary counts are visible in UI and match backend-calculated values
No persisted session fields are changed by export redaction
Existing export formats and Phase B features continue to pass current tests
Redaction failure results in 500 error, never unredacted content delivery

Files to Create/Modify

Action	File	Notes
Create	`backend/app/services/redaction_service.py`	Core redaction engine
Create	`backend/tests/test_redaction_service.py`	Unit tests for redaction
Modify	`backend/app/schemas/session.py`	Add `redaction_mode` to `SessionExport`
Modify	`backend/app/api/endpoints/sessions.py`	Integration point (post-generation)
Modify	`backend/main.py`	CORS `expose_headers` for both branches
Modify	`frontend/src/types/session.ts`	Add `RedactionSummary` interface + `redaction_mode`
Modify	`frontend/src/api/sessions.ts`	Add `exportWithMeta()` function
Modify	`frontend/src/components/session/ExportPreviewModal.tsx`	Checkbox + summary UI
Modify	`frontend/src/pages/SessionDetailPage.tsx`	State management + wiring
Extend	`backend/tests/test_sessions.py`	Integration tests for export + redaction

Implementation Order

redaction_service.py + unit tests (standalone, no dependencies)
Schema change in session.py
Endpoint integration in sessions.py + CORS update in main.py
Backend integration tests
Frontend types + API layer (session.ts, sessions.ts)
Frontend UI (ExportPreviewModal.tsx, SessionDetailPage.tsx)
Manual QA against checklist

Assumptions & Defaults

Default redaction mode is none
Redaction scope is export content only, not stored session data
Hostnames are intentionally not masked
Conservative detection is accepted, including possible false positives
No DB migration is required
Existing export() API function remains unchanged for backward compatibility
