Covers tree_type expansion, target_lists + maintenance_schedules data model, APScheduler-based auto-session creation, batch launch modal, and phased rollout plan. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Maintenance Flows — Design Document
Date: 2026-02-17
Status: Approved
Phase: Design (pre-implementation)
Overview
Add maintenance as a first-class flow type in ResolutionFlow, alongside troubleshooting and procedural. Maintenance flows are designed for MSP scheduled/repeatable infrastructure tasks (e.g., patching Citrix servers, updating FSLogix, updating RDS software). They share the procedural execution engine but add scheduling, multi-target batch launching, and saved target lists.
Goals
- Visual separation of maintenance flows from troubleshooting and project flows
- Batch launch: one flow run against N servers/targets simultaneously, each tracked as an independent session
- Saved target lists per team, with ad-hoc entry and future PSA/RMM import
- Scheduled auto-session creation with in-app notifications
- Re-use target lists from previous batch runs
Data Model
tree_type expansion
Migration: Drop and recreate the ck_trees_tree_type check constraint to allow 'troubleshooting' | 'procedural' | 'maintenance'.
Maintenance flows reuse tree_structure (step-by-step like procedural) and intake_form (for capturing target-specific context at session start, e.g., patch version).
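The constraint swap could look roughly like the sketch below. It assumes PostgreSQL and a constrained column named tree_type on the trees table (the column name is an assumption; only the constraint name ck_trees_tree_type is given above). Inside an Alembic migration, each statement could be passed to op.execute(); the helper only builds the SQL so it can be checked in isolation.

```python
ALLOWED_TREE_TYPES = ("troubleshooting", "procedural", "maintenance")


def tree_type_constraint_sql(allowed=ALLOWED_TREE_TYPES):
    """Return the two DDL statements that swap the check constraint.

    Assumes the constrained column is trees.tree_type (hypothetical name).
    """
    values = ", ".join(f"'{t}'" for t in allowed)
    return [
        # Drop the old two-value constraint first...
        "ALTER TABLE trees DROP CONSTRAINT ck_trees_tree_type",
        # ...then recreate it with 'maintenance' included.
        "ALTER TABLE trees ADD CONSTRAINT ck_trees_tree_type "
        f"CHECK (tree_type IN ({values}))",
    ]
```

A downgrade would run the same pair with 'maintenance' removed from the tuple, after dealing with any existing maintenance rows.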
target_lists table (new)
id UUID PRIMARY KEY DEFAULT gen_random_uuid()
team_id UUID NOT NULL REFERENCES teams(id) ON DELETE CASCADE
created_by UUID REFERENCES users(id) ON DELETE SET NULL
name VARCHAR(255) NOT NULL
description TEXT
targets JSONB NOT NULL -- [{ "label": "RDS-01", "notes": "..." }, ...]
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
- Scoped to team; any engineer can create/edit/delete their team's lists
- Each target entry: label (required, display name / hostname) + notes (optional, IP, role, etc.)
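Because targets is free-form JSONB, the API would likely need to validate entries before saving. A minimal sketch of that check follows; the function name normalize_targets is hypothetical, and the 255-character cap is an assumption borrowed from the sessions.target_label column width.

```python
def normalize_targets(raw):
    """Validate a list of target entries and return a cleaned copy.

    Each entry must be an object with a non-empty string label;
    notes is optional and dropped when blank.
    """
    if not isinstance(raw, list):
        raise ValueError("targets must be a list")
    cleaned = []
    for i, entry in enumerate(raw):
        if not isinstance(entry, dict):
            raise ValueError(f"target {i} must be an object")
        label = entry.get("label")
        if not isinstance(label, str) or not label.strip():
            raise ValueError(f"target {i} is missing a label")
        label = label.strip()
        # Assumption: keep labels within the width of sessions.target_label.
        if len(label) > 255:
            raise ValueError(f"target {i} label is longer than 255 characters")
        item = {"label": label}
        notes = entry.get("notes")
        if notes is not None:
            if not isinstance(notes, str):
                raise ValueError(f"target {i} notes must be a string")
            if notes.strip():
                item["notes"] = notes.strip()
        cleaned.append(item)
    return cleaned
```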
maintenance_schedules table (new)
id UUID PRIMARY KEY DEFAULT gen_random_uuid()
tree_id UUID NOT NULL REFERENCES trees(id) ON DELETE CASCADE
created_by UUID REFERENCES users(id) ON DELETE SET NULL
cron_expression VARCHAR(100) NOT NULL -- e.g. "0 9 15 * *"
timezone VARCHAR(100) NOT NULL DEFAULT 'UTC'
target_list_id UUID REFERENCES target_lists(id) ON DELETE SET NULL
is_active BOOLEAN NOT NULL DEFAULT true
next_run_at TIMESTAMPTZ NOT NULL
last_run_at TIMESTAMPTZ
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
- One active schedule per maintenance flow (enforced at API level)
- target_list_id is optional: if null, the schedule auto-creates sessions without targets (the engineer specifies targets on the pending sessions)
- next_run_at is computed from cron_expression + timezone at creation/update
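To illustrate how next_run_at follows from cron_expression + timezone, here is a deliberately simplified, standard-library-only sketch. It is not a full cron parser: it accepts only "*", single numbers, and comma lists, and it rejects any day-of-week value other than "*". The function name compute_next_run_at is hypothetical; a real implementation would more likely delegate to the scheduler library's cron trigger.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def _expand(field, lo, hi):
    """Expand '*', 'n', or 'a,b,c' into a sorted list of ints within [lo, hi]."""
    if field == "*":
        return list(range(lo, hi + 1))
    values = sorted({int(part) for part in field.split(",")})
    if values[0] < lo or values[-1] > hi:
        raise ValueError(f"cron field out of range: {field!r}")
    return values


def compute_next_run_at(cron_expression, tz_name, after):
    """Return the first fire time strictly after `after`, as an aware UTC datetime.

    `after` must be timezone-aware. Simplification: ranges, steps and
    day-of-week are not supported in this sketch.
    """
    minute_f, hour_f, dom_f, month_f, dow_f = cron_expression.split()
    if dow_f != "*":
        raise NotImplementedError("day-of-week is not handled in this sketch")
    minutes = _expand(minute_f, 0, 59)
    hours = _expand(hour_f, 0, 23)
    days = set(_expand(dom_f, 1, 31))
    months = set(_expand(month_f, 1, 12))

    tz = ZoneInfo(tz_name)
    after_utc = after.astimezone(timezone.utc)
    day = after.astimezone(tz).date()
    for _ in range(366 * 8):  # generous upper bound on the search window
        if day.month in months and day.day in days:
            for hour in hours:
                for minute in minutes:
                    # Interpret the wall-clock time in the schedule's timezone.
                    local = datetime(day.year, day.month, day.day, hour, minute, tzinfo=tz)
                    candidate = local.astimezone(timezone.utc)
                    if candidate > after_utc:
                        return candidate
        day += timedelta(days=1)
    raise ValueError("cron expression does not fire within the search window")
```

With the example expression "0 9 15 * *" in UTC and a reference time of 2026-02-17 12:00 UTC, this returns 2026-03-15 09:00 UTC.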
Sessions — batch tracking fields (new columns)
batch_id UUID -- all sessions from one batch launch share this value
target_label VARCHAR(255) -- e.g. "RDS-01"
- batch_id is generated at batch launch time (not per-session)
- target_label is the label from the target list entry or ad-hoc input
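The relationship between the two columns can be sketched with plain dictionaries standing in for session rows. The function name and the id and tree_id keys are assumptions for illustration; only batch_id, target_label, and the pending status come from this document.

```python
import uuid


def build_batch_sessions(tree_id, target_labels):
    """Return one pending session record per label, all sharing one batch_id."""
    batch_id = uuid.uuid4()  # generated once per launch, not once per session
    return [
        {
            "id": uuid.uuid4(),       # each session keeps its own identity
            "tree_id": tree_id,
            "batch_id": batch_id,
            "target_label": label,
            "status": "pending",
        }
        for label in target_labels
    ]
```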
Scheduling Engine
APScheduler runs in-process with the FastAPI backend (async scheduler).
On startup:
- Load all is_active=true maintenance schedules
- Register each as an APScheduler job using its cron_expression + timezone
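The startup steps above might be wired up as in the following sketch. It assumes an APScheduler 3.x style scheduler (for example AsyncIOScheduler) whose add_job accepts the "cron" trigger alias with field keyword arguments. The schedule rows are plain dictionaries, and register_schedules, on_fire, and the job id format are hypothetical names.

```python
def register_schedules(scheduler, schedules, on_fire):
    """Register one cron job per active schedule and return the job ids.

    `scheduler` is expected to expose an APScheduler 3.x style add_job();
    `on_fire` is a hypothetical callback that receives the schedule id.
    """
    job_ids = []
    for s in schedules:
        if not s["is_active"]:
            continue
        # A 5-field crontab string maps onto the cron trigger's keyword fields.
        # Caution: APScheduler numbers weekdays from Monday = 0, unlike crontab,
        # so numeric day-of-week values may need translating.
        minute, hour, day, month, day_of_week = s["cron_expression"].split()
        job_id = f"maintenance-schedule-{s['id']}"
        scheduler.add_job(
            on_fire,
            "cron",
            minute=minute,
            hour=hour,
            day=day,
            month=month,
            day_of_week=day_of_week,
            timezone=s["timezone"],
            args=[s["id"]],
            id=job_id,
            replace_existing=True,  # lets the same call serve schedule updates
        )
        job_ids.append(job_id)
    return job_ids
```

Using a stable job id per schedule means an update can re-register the job in place, and a disabled schedule can be dropped with the scheduler's remove_job(job_id).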
When a schedule fires:
- Resolve target list (target_list_id → targets, or empty list if null)
- Generate a new batch_id
- Create one Session per target with batch_id, target_label, status pending
- Update last_run_at, compute and update next_run_at
- Create in-app notification: "Maintenance run ready: [Flow Name] — N sessions created"
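The firing steps above can be traced end to end with an in-memory sketch. Everything here operates on plain dictionaries rather than database rows. The names run_schedule and compute_next are hypothetical, and compute_next is passed in so the sketch does not depend on any particular cron library.

```python
import uuid
from datetime import datetime, timezone


def run_schedule(schedule, target_lists, flow_name, compute_next, now=None):
    """Apply the firing steps to plain dicts; return (sessions, notification)."""
    now = now or datetime.now(timezone.utc)

    # Resolve the target list; a null target_list_id yields no targets.
    target_list = target_lists.get(schedule.get("target_list_id"))
    targets = target_list["targets"] if target_list else []

    # One batch_id for everything created by this firing.
    batch_id = uuid.uuid4()

    # One pending session per target.
    sessions = [
        {
            "tree_id": schedule["tree_id"],
            "batch_id": batch_id,
            "target_label": target["label"],
            "status": "pending",
        }
        for target in targets
    ]

    # Record this run and work out the following one.
    schedule["last_run_at"] = now
    schedule["next_run_at"] = compute_next(
        schedule["cron_expression"], schedule["timezone"], now
    )

    # Text of the in-app notification.
    notification = f"Maintenance run ready: {flow_name} — {len(sessions)} sessions created"
    return sessions, notification
```

In this sketch a schedule without a target list produces zero sessions. How the pending, target-less sessions mentioned in the data model are created in that case is left open here.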
Schedule changes (create/update/disable) are applied to APScheduler immediately via the API.
Batch Launch (Ad-hoc)
Triggered from the maintenance flow detail page. The engineer picks a target list via a modal with four tabs:
| Tab | Description |
|---|---|
| Saved List | Pick from team's saved target lists |
| Previous Run | Browse this flow's past batches and re-use the targets from one of them |
| Manual Entry | Paste/type server names (one per line) |
| PSA/RMM Import | Placeholder — "Coming soon" |
After choosing targets, the engineer sees a preview: "Will create N sessions for: RDS-01, RDS-02..."
On confirm, N sessions are created with a shared batch_id and status pending.
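For the Manual Entry tab, turning pasted text into labels and building the preview line might look like the sketch below. Both function names are hypothetical, and the max_shown cut-off is an assumption, since the design does not say how many labels the preview lists before truncating.

```python
def parse_manual_targets(text):
    """Split pasted text into target labels, one per line; blank lines are skipped."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def preview_message(labels, max_shown=2):
    """Build the confirmation preview; max_shown is a hypothetical cut-off."""
    shown = ", ".join(labels[:max_shown])
    suffix = "..." if len(labels) > max_shown else ""
    return f"Will create {len(labels)} sessions for: {shown}{suffix}"
```

Whether duplicate labels should be merged or rejected is not specified in this design, so the sketch keeps them as entered.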
UI / UX
Sidebar
All Flows [total]
Troubleshooting [count]
Projects [count]
Maintenance [count] ← new
Links to /trees?type=maintenance.
TreeLibraryPage
- typeFilter expands to 'all' | 'troubleshooting' | 'procedural' | 'maintenance'
- Maintenance flows show a distinct badge (wrench icon, amber accent color)
Flow Editor
- New flow type selector includes "Maintenance"
- Uses the same ProceduralEditorPage; no new editor needed
Maintenance Flow Detail Page (/flows/:id/maintenance)
New page shown when opening a maintenance flow (via getTreeNavigatePath). Sections:
- Overview: name, description, steps summary
- Schedule panel: set/edit/disable cron schedule, timezone, assigned target list
- Batch Launch button: opens target list modal
- Run history: past batches grouped by batch_id, status rollup (e.g., "6/8 complete")
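The run-history rollup could be computed along these lines. This is a sketch over plain dictionaries; the function name is hypothetical, and the terminal status value "completed" is an assumption, since this document only names the pending status.

```python
from collections import defaultdict


def batch_rollups(sessions, done_status="completed"):
    """Group sessions by batch_id and summarize completion per batch.

    done_status is a hypothetical name for the terminal session status.
    """
    groups = defaultdict(list)
    for session in sessions:
        # Sessions launched outside a batch have no batch_id and are skipped.
        if session.get("batch_id") is not None:
            groups[session["batch_id"]].append(session)
    return {
        batch_id: f"{sum(1 for s in group if s['status'] == done_status)}/{len(group)} complete"
        for batch_id, group in groups.items()
    }
```

The same grouping would serve the collapsed batch rows on the Sessions page.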
Sessions Page — Batch View
Sessions with a shared batch_id are collapsed into a single row:
- Flow name, launch date, target count, completion status
- Expand to see individual target sessions
Target Lists Settings (/account/target-lists)
New page under Team settings. Engineers can:
- Create a named target list with target entries (label + optional notes)
- Edit / delete existing lists
- See last-used date per list
Routing
getTreeNavigatePath() in @/lib/routing gains 'maintenance' case → /flows/:id/maintenance.
Individual session execution from the detail page still uses ProceduralNavigationPage.
Rollout Phases
| Phase | Scope |
|---|---|
| 1 — DB + API | Alembic migration, model changes, target_lists + schedules endpoints, batch session creation API |
| 2 — Core UI | Sidebar entry, type filter, flow badge, maintenance detail page, batch launch modal |
| 3 — Scheduler | APScheduler integration, auto-session creation, in-app notifications |
| 4 — Target Lists | Saved lists settings page under Team settings |
Each phase is independently shippable without breaking existing flows.
Testing
- test_maintenance_tree_type.py: CRUD, check constraint, filter by type
- test_target_lists.py: create/list/update/delete, team scoping
- test_maintenance_schedules.py: create/update/disable, next_run_at calculation, schedule fires + creates correct batch sessions
- test_batch_sessions.py: correct session count, shared batch_id, target_label values, re-use previous session targets
- Frontend: npm run build after each phase
Future
- PSA/RMM import (ConnectWise, Kaseya) for target lists — Phase 4 roadmap item
- Patch window constraints (maintenance flows only run within defined windows)
- Per-target session results dashboard