Michael Chihlas 52e8190211 Initial commit: Backend API Phase 1a complete
- FastAPI backend with JWT auth
- PostgreSQL database schema
- Trees and Sessions CRUD APIs
- Export functionality (Markdown, Text, HTML)
- Docker setup for local development
- Alembic migrations
2026-01-22 14:38:53 -05:00


# Technical Architecture
## System Architecture
### High-Level Architecture
```
┌────────────────────────────────────────┐
│ Frontend (React/Vue)                   │
│ - Tree Navigation UI                   │
│ - Tree Editor                          │
│ - Session Management                   │
│ - Export Functionality                 │
└───────────────────┬────────────────────┘
                    │ REST API / WebSocket
┌───────────────────┴────────────────────┐
│ Backend (Python Flask/FastAPI)         │
│ - Authentication & Authorization       │
│ - Tree CRUD Operations                 │
│ - Session Tracking                     │
│ - Export Generation                    │
│ - File Upload/Storage                  │
│ - Automation Execution (Future)        │
└───────────────────┬────────────────────┘
                    │
┌───────────────────┴────────────────────┐
│ Database (PostgreSQL)                  │
│ - Trees (JSON)                         │
│ - Sessions                             │
│ - Users & Teams                        │
│ - Attachments Metadata                 │
└───────────────────┬────────────────────┘
                    │
┌───────────────────┴────────────────────┐
│ Object Storage (S3/MinIO)              │
│ - Screenshots                          │
│ - Command Outputs                      │
│ - Logs & Attachments                   │
└────────────────────────────────────────┘
```
## Tech Stack
### Frontend
**Primary Choice: React**
- **Pros:** Large ecosystem, mature PWA tooling for offline support, familiar to most developers
- **Alternatives:** Vue.js (simpler), Svelte (faster)
- **UI Framework:** Tailwind CSS + shadcn/ui (clean, professional look)
- **State Management:** React Context + useReducer (simple) or Zustand (if needed)
- **Routing:** React Router
- **Offline:** Service Workers + IndexedDB for offline tree caching
### Backend
**Primary Choice: Python FastAPI**
- **Pros:** Modern, fast, async support, automatic API docs, matches Michael's learning path
- **Alternatives:** Flask (simpler but less performant), Django (heavier)
- **Authentication:** JWT tokens + httpOnly cookies
- **Validation:** Pydantic models
- **ORM:** SQLAlchemy 2.0 (async)
- **Migration:** Alembic
### Database
**Primary Choice: PostgreSQL**
- **Pros:** JSON/JSONB support perfect for tree storage, reliable, scalable
- **Schema Design:**
  - Hybrid approach: Relational for users/sessions, JSONB for tree structure
  - Full-text search for tree discovery
  - Indexes on frequently queried fields
### File Storage
**Primary Choice: S3-compatible storage**
- **Development:** MinIO (self-hosted, S3-compatible)
- **Production:** AWS S3 or DigitalOcean Spaces
- **Strategy:** Pre-signed URLs for uploads, CDN for delivery
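The idea behind expiring, signed URLs can be illustrated with a standard-library sketch. This is not the S3 signing algorithm; in practice the storage SDK's own presign call would generate the URL. The names `SECRET`, `sign_url`, and `verify_url` are hypothetical and exist only for this example.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"dev-only-secret"  # hypothetical; load from configuration in practice


def _signature(path: str, expires: int) -> str:
    """HMAC-SHA256 over the path and expiry, hex encoded."""
    message = f"{path}:{expires}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def sign_url(path: str, ttl_seconds: int = 300, now: Optional[float] = None) -> str:
    """Return `path` with an expiry timestamp and signature appended."""
    expires = int(now if now is not None else time.time()) + ttl_seconds
    query = urlencode({"expires": expires, "sig": _signature(path, expires)})
    return f"{path}?{query}"


def verify_url(url: str, now: Optional[float] = None) -> bool:
    """Accept the URL only if it has not expired and the signature matches."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    current = now if now is not None else time.time()
    if current > expires:
        return False
    return hmac.compare_digest(sig, _signature(parsed.path, expires))
```

Because the path is part of the signed message, a client cannot reuse a signature issued for one object to fetch another.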
### Hosting
**Development:**
- Frontend: Local dev server (Vite)
- Backend: Local Python server
- Database: Docker PostgreSQL
**Production Options:**
1. **Simple Start:** Railway or Render (full-stack hosting)
   - Cost: ~$10-20/month
   - Pros: Easy deployment, managed databases
   - Cons: Less control, potential scaling issues
2. **Scalable:** DigitalOcean Droplets + Managed DB
   - Cost: ~$30-50/month
   - Pros: More control, better performance
   - Cons: More maintenance
3. **Enterprise:** AWS/Azure
   - Cost: Variable
   - Pros: Full feature set, enterprise compliance
   - Cons: Complex, expensive
## Data Models
### Database Schema
#### Teams Table
```sql
-- Created before users because users.team_id references teams(id)
CREATE TABLE teams (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name VARCHAR(255) NOT NULL,
    created_at TIMESTAMP DEFAULT NOW()
);
```
#### Users Table
```sql
CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    email VARCHAR(255) UNIQUE NOT NULL,
    password_hash VARCHAR(255) NOT NULL,
    name VARCHAR(255) NOT NULL,
    role VARCHAR(50) NOT NULL, -- admin, engineer, viewer
    team_id UUID REFERENCES teams(id),
    created_at TIMESTAMP DEFAULT NOW(),
    last_login TIMESTAMP
);
```
#### Trees Table
```sql
CREATE TABLE trees (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name VARCHAR(255) NOT NULL,
    description TEXT,
    category VARCHAR(100), -- Citrix, Active Directory, Networking, etc.
    tree_structure JSONB NOT NULL, -- The actual decision tree
    author_id UUID REFERENCES users(id),
    team_id UUID REFERENCES teams(id),
    is_active BOOLEAN DEFAULT true,
    version INTEGER DEFAULT 1,
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW(),
    usage_count INTEGER DEFAULT 0
);
CREATE INDEX idx_trees_category ON trees(category);
CREATE INDEX idx_trees_team ON trees(team_id);
-- coalesce() keeps trees with a NULL description searchable by name
CREATE INDEX idx_trees_search ON trees
    USING gin(to_tsvector('english', name || ' ' || coalesce(description, '')));
```
#### Sessions Table
```sql
CREATE TABLE sessions (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tree_id UUID REFERENCES trees(id),
    user_id UUID REFERENCES users(id),
    tree_snapshot JSONB NOT NULL, -- Copy of tree at time of use (for version tracking)
    path_taken JSONB NOT NULL, -- Array of node_ids visited
    decisions JSONB NOT NULL, -- Decisions made at each node with notes
    started_at TIMESTAMP DEFAULT NOW(),
    completed_at TIMESTAMP,
    ticket_number VARCHAR(100),
    client_name VARCHAR(255),
    exported BOOLEAN DEFAULT false
);
CREATE INDEX idx_sessions_user ON sessions(user_id);
CREATE INDEX idx_sessions_tree ON sessions(tree_id);
CREATE INDEX idx_sessions_dates ON sessions(started_at, completed_at);
```
#### Attachments Table
```sql
CREATE TABLE attachments (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    session_id UUID REFERENCES sessions(id),
    node_id VARCHAR(100), -- Which decision node this was attached to
    file_name VARCHAR(255) NOT NULL,
    file_type VARCHAR(50),
    file_size INTEGER,
    storage_path VARCHAR(500), -- S3 key or file path
    uploaded_at TIMESTAMP DEFAULT NOW()
);
```
### Tree Structure (JSON)
```json
{
  "tree_id": "citrix-vda-not-registering",
  "name": "Citrix VDA Not Registering",
  "category": "Citrix",
  "description": "Troubleshoot VDA registration issues with Delivery Controller",
  "estimated_time": "10-15 minutes",
  "start_node": "node_1",
  "metadata": {
    "author": "Michael Chihlas",
    "created": "2026-01-22",
    "last_updated": "2026-01-22",
    "version": "1.0",
    "tags": ["citrix", "vda", "registration", "delivery-controller"]
  },
  "nodes": {
    "node_1": {
      "id": "node_1",
      "type": "decision",
      "question": "Can you ping the VDA from the Delivery Controller?",
      "help_text": "From DDC, open PowerShell and run: Test-Connection -ComputerName VDA-HOSTNAME -Count 4",
      "documentation_links": [
        {
          "title": "Citrix VDA Communication Requirements",
          "url": "https://docs.citrix.com/..."
        }
      ],
      "decision_type": "yes_no",
      "allow_notes": true,
      "allow_attachments": true,
      "allow_custom_branch": true,
      "yes": "node_2",
      "no": "node_network_issue"
    },
    "node_2": {
      "id": "node_2",
      "type": "decision",
      "question": "What is the status of the Citrix Virtual Desktop Agent service?",
      "help_text": "Run: Get-Service -Name 'BrokerAgent' | Select-Object Status, StartType",
      "decision_type": "multiple_choice",
      "allow_notes": true,
      "options": [
        {
          "label": "Running",
          "value": "running",
          "next": "node_check_broker"
        },
        {
          "label": "Stopped",
          "value": "stopped",
          "next": "node_start_service"
        },
        {
          "label": "Stuck in Starting/Stopping",
          "value": "stuck",
          "next": "node_service_stuck"
        }
      ]
    },
    "node_check_broker": {
      "id": "node_check_broker",
      "type": "action",
      "title": "Check Broker Connection",
      "instruction": "Verify the VDA can communicate with the Delivery Controller on port 80/443",
      "commands": [
        "Test-NetConnection -ComputerName DDC-HOSTNAME -Port 80",
        "Test-NetConnection -ComputerName DDC-HOSTNAME -Port 443"
      ],
      "documentation_links": [
        {
          "title": "VDA to DDC Communication Ports",
          "url": "https://docs.citrix.com/..."
        }
      ],
      "automation_available": false,
      "next": "node_3"
    },
    "node_start_service": {
      "id": "node_start_service",
      "type": "action",
      "title": "Start Citrix VDA Service",
      "instruction": "Start the Citrix Virtual Desktop Agent service and check for errors",
      "commands": [
        "Start-Service -Name 'BrokerAgent'",
        "Get-Service -Name 'BrokerAgent' | Select-Object Status"
      ],
      "automation_available": true,
      "automation": {
        "type": "powershell",
        "script_id": "start-citrix-vda-service",
        "description": "Starts Citrix VDA service with error handling",
        "requires_elevation": true,
        "parameters": []
      },
      "next": "node_verify_registration"
    },
    "node_verify_registration": {
      "id": "node_verify_registration",
      "type": "decision",
      "question": "Is the VDA now showing as registered in Citrix Studio?",
      "help_text": "Open Citrix Studio > Machine Catalogs > Check VDA status",
      "decision_type": "yes_no",
      "yes": "node_resolved",
      "no": "node_check_event_viewer"
    },
    "node_resolved": {
      "id": "node_resolved",
      "type": "resolution",
      "title": "Issue Resolved",
      "summary": "VDA successfully registered with Delivery Controller",
      "resolution_notes": "Document the specific action that resolved the issue in the notes above."
    },
    "node_network_issue": {
      "id": "node_network_issue",
      "type": "branch",
      "title": "Network Connectivity Issue Detected",
      "description": "Unable to ping VDA from DDC - this is a network/firewall issue",
      "suggested_next_tree": "network-connectivity-troubleshooting",
      "manual_steps": [
        "Check if VDA is powered on",
        "Verify network cable is connected",
        "Check firewall rules between DDC and VDA",
        "Verify DNS resolution for VDA hostname"
      ]
    }
  }
}
```
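Since nodes point to each other by id (`yes`, `no`, `next`, and each option's `next`), a tree can be checked for references to undefined nodes before it is saved. The following is a minimal sketch assuming the field names shown in the example; `referenced_node_ids` and `find_dangling_references` are hypothetical helper names, not part of the existing backend.

```python
from typing import Any


def referenced_node_ids(node: dict[str, Any]) -> list[str]:
    """Collect every node id that `node` points to."""
    targets = [node[key] for key in ("yes", "no", "next") if key in node]
    targets += [option["next"] for option in node.get("options", [])]
    return targets


def find_dangling_references(tree: dict[str, Any]) -> list[tuple[str, str]]:
    """Return (source, target) pairs whose target is missing from tree["nodes"]."""
    nodes = tree["nodes"]
    dangling = []
    if tree["start_node"] not in nodes:
        dangling.append(("start_node", tree["start_node"]))
    for node_id, node in nodes.items():
        for target in referenced_node_ids(node):
            if target not in nodes:
                dangling.append((node_id, target))
    return dangling
```

The example tree above is an excerpt: it references `node_service_stuck`, `node_3`, and `node_check_event_viewer` without defining them, so a check like this would flag those three links.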
### Session Data Structure
```json
{
  "session_id": "abc-123",
  "tree_id": "citrix-vda-not-registering",
  "tree_snapshot": { /* full tree at time of use */ },
  "user_id": "user-456",
  "started_at": "2026-01-22T14:30:00Z",
  "completed_at": "2026-01-22T14:45:00Z",
  "ticket_number": "INC-12345",
  "client_name": "City of Warner Robins",
  "path_taken": [
    "node_1",
    "node_2",
    "node_start_service",
    "node_verify_registration",
    "node_resolved"
  ],
  "decisions": [
    {
      "node_id": "node_1",
      "question": "Can you ping the VDA from the Delivery Controller?",
      "answer": "yes",
      "notes": "Ping successful, 2ms response time, no packet loss",
      "timestamp": "2026-01-22T14:31:00Z",
      "attachments": []
    },
    {
      "node_id": "node_2",
      "question": "What is the status of the Citrix Virtual Desktop Agent service?",
      "answer": "stopped",
      "notes": "Service was stopped. Checking dependencies - NetLogon service also stopped.",
      "timestamp": "2026-01-22T14:33:00Z",
      "attachments": ["attachment-id-789"]
    },
    {
      "node_id": "node_start_service",
      "action_performed": "Started Citrix VDA service",
      "notes": "Started NetLogon first, then BrokerAgent. Both services now running.",
      "automation_used": false,
      "timestamp": "2026-01-22T14:40:00Z",
      "attachments": []
    },
    {
      "node_id": "node_verify_registration",
      "question": "Is the VDA now showing as registered in Citrix Studio?",
      "answer": "yes",
      "notes": "VDA shows as 'Registered' in Studio. User able to launch session successfully.",
      "timestamp": "2026-01-22T14:44:00Z",
      "attachments": ["screenshot-registration.png"]
    }
  ]
}
```
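A session shaped like the example above carries enough information to produce ticket notes. The sketch below shows one possible Markdown rendering; the function name `export_session_markdown` and the output layout are assumptions for illustration, not the format the export endpoint actually produces.

```python
from typing import Any


def export_session_markdown(session: dict[str, Any]) -> str:
    """Render a session dict as Markdown troubleshooting notes."""
    lines = [
        f"# Troubleshooting Notes: {session['tree_id']}",
        "",
        f"- Ticket: {session.get('ticket_number', 'n/a')}",
        f"- Client: {session.get('client_name', 'n/a')}",
        f"- Started: {session['started_at']}",
        f"- Completed: {session.get('completed_at') or 'in progress'}",
        "",
        "## Steps",
    ]
    for number, decision in enumerate(session["decisions"], start=1):
        # Decision nodes carry question/answer; action nodes carry action_performed.
        if "question" in decision:
            lines.append(f"{number}. **{decision['question']}** Answer: {decision['answer']}")
        else:
            lines.append(f"{number}. **Action:** {decision['action_performed']}")
        if decision.get("notes"):
            lines.append(f"   - Notes: {decision['notes']}")
    return "\n".join(lines)
```

Rendering from the stored `decisions` array rather than from the live tree keeps exported notes stable even if the tree is edited later.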
## API Endpoints
### Authentication
```
POST /api/auth/register - Register new user
POST /api/auth/login - Login
POST /api/auth/logout - Logout
GET /api/auth/me - Get current user
POST /api/auth/refresh - Refresh JWT token
```
### Trees
```
GET /api/trees - List all trees (with filters)
GET /api/trees/:id - Get specific tree
POST /api/trees - Create new tree (admin/engineer only)
PUT /api/trees/:id - Update tree (admin/engineer only)
DELETE /api/trees/:id - Soft delete tree (admin only)
GET /api/trees/categories - List all categories
GET /api/trees/search - Full-text search trees
```
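The role restrictions noted on the tree endpoints can be expressed as a small permission table. This is a sketch only: the `Role` values come from the `users.role` column comment, while `TREE_PERMISSIONS`, `can`, and the assumption that every role may read trees are illustrative choices not confirmed by this document.

```python
from enum import Enum


class Role(str, Enum):
    ADMIN = "admin"
    ENGINEER = "engineer"
    VIEWER = "viewer"


# Roles allowed to perform each tree operation, following the endpoint notes.
# Read access for all roles is an assumption.
TREE_PERMISSIONS: dict[str, frozenset[Role]] = {
    "read": frozenset({Role.ADMIN, Role.ENGINEER, Role.VIEWER}),
    "create": frozenset({Role.ADMIN, Role.ENGINEER}),
    "update": frozenset({Role.ADMIN, Role.ENGINEER}),
    "delete": frozenset({Role.ADMIN}),
}


def can(role: str, action: str) -> bool:
    """Return True if `role` may perform `action`; unknown values are denied."""
    try:
        return Role(role) in TREE_PERMISSIONS[action]
    except (ValueError, KeyError):
        return False
```

Denying unknown roles and actions by default means a typo in either one fails closed rather than granting access.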
### Sessions
```
GET /api/sessions - List user's sessions
GET /api/sessions/:id - Get specific session
POST /api/sessions - Start new troubleshooting session
PUT /api/sessions/:id - Update session (add decisions/notes)
POST /api/sessions/:id/complete - Mark session as complete
POST /api/sessions/:id/export - Export session to formatted notes
```
### Attachments
```
POST /api/sessions/:id/attachments - Upload attachment
GET /api/sessions/:id/attachments - List attachments
GET /api/attachments/:id - Get attachment
DELETE /api/attachments/:id - Delete attachment
```
### Teams (Phase 2)
```
GET /api/teams - List teams
POST /api/teams - Create team (admin only)
GET /api/teams/:id/members - List team members
POST /api/teams/:id/members - Add team member
DELETE /api/teams/:id/members/:user_id - Remove team member
```
### Analytics (Phase 3)
```
GET /api/analytics/trees/:id/usage - Tree usage statistics
GET /api/analytics/trees/:id/paths - Common paths taken
GET /api/analytics/team/performance - Team troubleshooting metrics
GET /api/analytics/user/history - User's troubleshooting history
```
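The "common paths taken" statistic can be derived directly from the `path_taken` arrays stored on sessions. A minimal sketch, assuming the paths have already been loaded into memory; `common_paths` is a hypothetical helper name.

```python
from collections import Counter
from typing import Iterable


def common_paths(paths: Iterable[list[str]], top_n: int = 3) -> list[tuple[tuple[str, ...], int]]:
    """Count identical node sequences and return the `top_n` most frequent."""
    counts = Counter(tuple(path) for path in paths)
    return counts.most_common(top_n)
```

For large session volumes the same grouping would more likely be done in SQL; the in-memory version only illustrates the aggregation.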
### Automation (Phase 4)
```
GET /api/automation/scripts - List available automation scripts
POST /api/automation/execute - Execute automation script
GET /api/automation/history - Automation execution history
```
## Security Considerations
### Authentication & Authorization
- JWT tokens with short expiry (15 min access, 7 day refresh)
- Role-based access control (RBAC)
- Password requirements: minimum 10 characters, plus complexity rules
- Rate limiting on auth endpoints
- Account lockout after failed attempts
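To make the access/refresh lifetimes concrete, here is a standard-library sketch of HS256-signed tokens with a 15 minute access expiry and a 7 day refresh expiry. It is for illustration only: a real backend would more likely rely on a maintained JWT library, and `SECRET`, `issue_token`, and `verify_token` are hypothetical names.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Any, Optional

SECRET = b"dev-only-secret"  # hypothetical; load from configuration in practice
ACCESS_TTL = 15 * 60  # 15 minutes
REFRESH_TTL = 7 * 24 * 60 * 60  # 7 days


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str) -> str:
    return _b64url(hmac.new(SECRET, signing_input.encode(), hashlib.sha256).digest())


def issue_token(user_id: str, kind: str, now: Optional[float] = None) -> str:
    """Issue an 'access' or 'refresh' token for `user_id`."""
    issued = int(now if now is not None else time.time())
    ttl = ACCESS_TTL if kind == "access" else REFRESH_TTL
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    claims = {"sub": user_id, "type": kind, "iat": issued, "exp": issued + ttl}
    payload = _b64url(json.dumps(claims).encode())
    return f"{header}.{payload}.{_sign(f'{header}.{payload}')}"


def verify_token(token: str, now: Optional[float] = None) -> Optional[dict[str, Any]]:
    """Return the claims if the signature is valid and the token is unexpired."""
    try:
        header, payload, signature = token.split(".")
    except ValueError:
        return None
    if not hmac.compare_digest(signature, _sign(f"{header}.{payload}")):
        return None
    claims = json.loads(_b64url_decode(payload))
    current = now if now is not None else time.time()
    return claims if current < claims["exp"] else None
```

The signature is checked before the payload is decoded, so tampered tokens are rejected without parsing attacker-controlled JSON.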
### Data Protection
- All passwords hashed with bcrypt (cost factor 12)
- Sensitive data encrypted at rest
- HTTPS only in production
- CORS properly configured
- SQL injection prevention (parameterized queries)
- XSS prevention (input sanitization, CSP headers)
### File Upload Security
- File type validation (whitelist only)
- File size limits (10MB per file)
- Virus scanning (ClamAV integration for Phase 3)
- Separate storage domain (prevent XSS via uploads)
- Signed URLs with expiration
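The type and size rules above could be enforced with a check along these lines before a file is accepted. The 10 MB limit comes from this section; the contents of `ALLOWED_EXTENSIONS` and the name `validate_upload` are assumptions, since the permitted types are not enumerated here.

```python
from pathlib import PurePosixPath

MAX_FILE_SIZE = 10 * 1024 * 1024  # 10 MB per file
# Hypothetical allowlist; the document does not list the permitted types.
ALLOWED_EXTENSIONS = frozenset({".png", ".jpg", ".jpeg", ".txt", ".log"})


def validate_upload(file_name: str, file_size: int) -> list[str]:
    """Return validation problems; an empty list means the file is accepted."""
    problems = []
    extension = PurePosixPath(file_name).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"file type {extension or '(none)'} is not allowed")
    if file_size <= 0:
        problems.append("file is empty")
    elif file_size > MAX_FILE_SIZE:
        problems.append("file exceeds the 10 MB limit")
    return problems
```

An extension check is only a first filter; inspecting the file content on the server would still be needed to catch renamed files.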
### API Security
- Rate limiting (100 requests/min per user)
- Request size limits
- API versioning via a path prefix (/api/v1/...); the endpoint listings in this document omit the version segment
- Audit logging for sensitive operations
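A per-user limit such as 100 requests per minute can be sketched with a sliding window of request timestamps. This in-memory version is an assumption for illustration (a multi-process deployment would need shared state), and `SlidingWindowRateLimiter` is a hypothetical class name.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowRateLimiter:
    """Allow at most `limit` requests per user within `window_seconds`."""

    def __init__(self, limit: int = 100, window_seconds: float = 60.0) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._hits: dict[str, deque[float]] = defaultdict(deque)

    def allow(self, user_id: str, now: Optional[float] = None) -> bool:
        current = time.monotonic() if now is None else now
        hits = self._hits[user_id]
        # Drop timestamps that have fallen out of the window.
        while hits and current - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(current)
        return True
```

Rejected requests are not recorded, so a client that keeps retrying while limited does not extend its own lockout.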
## Performance Considerations
### Database
- Indexes on frequently queried fields
- Connection pooling
- Query optimization (EXPLAIN ANALYZE)
- Consider read replicas for Phase 3+
### Caching Strategy
- Redis for session storage (Phase 2)
- Cache frequently accessed trees
- CDN for static assets
- Browser caching headers
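Caching frequently accessed trees can start as a small time-to-live cache in front of the database lookup. The sketch below is one possible shape under that assumption; `TTLCache`, `get_or_load`, and the 5 minute default are illustrative, and a shared cache such as Redis would replace the in-process dictionary once multiple app servers are involved.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """Cache loader results per key for `ttl_seconds`."""

    def __init__(self, ttl_seconds: float = 300.0) -> None:
        self.ttl_seconds = ttl_seconds
        self._entries: dict[str, tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[str], Any], now: Optional[float] = None) -> Any:
        current = time.monotonic() if now is None else now
        entry = self._entries.get(key)
        if entry is not None and current < entry[0]:
            return entry[1]  # still fresh
        value = loader(key)
        self._entries[key] = (current + self.ttl_seconds, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop a cached entry, for example after the tree is updated."""
        self._entries.pop(key, None)
```

Calling `invalidate` from the tree update path keeps edits visible immediately instead of waiting for the entry to expire.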
### Frontend Performance
- Code splitting (lazy load routes)
- Tree data cached in IndexedDB
- Debounced search inputs
- Virtualized lists for large datasets
- Optimistic UI updates
## Monitoring & Observability
### Logging
- Structured logging (JSON format)
- Log levels: DEBUG, INFO, WARNING, ERROR, CRITICAL
- Request ID tracking across services
- User action auditing
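Structured JSON logs with a request ID can be produced with the standard `logging` module alone. This is a sketch under the assumption that some middleware sets the request ID at the start of each request; `request_id_var`, `JsonFormatter`, and `configure_logging` are hypothetical names.

```python
import json
import logging
from contextvars import ContextVar

# Set by request-handling code (assumed); "-" marks logs outside a request.
request_id_var = ContextVar("request_id", default="-")


class JsonFormatter(logging.Formatter):
    """Format each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "request_id": request_id_var.get(),
        }
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def configure_logging(level: int = logging.INFO) -> None:
    """Route all root-logger output through the JSON formatter."""
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logging.basicConfig(level=level, handlers=[handler], force=True)
```

A context variable is used for the request ID so that concurrent async requests each see their own value.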
### Metrics (Phase 3)
- API response times
- Database query performance
- Error rates
- User engagement metrics
- System resource usage
### Error Tracking
- Sentry integration for error tracking
- User-friendly error messages
- Automatic error reporting with context
## Deployment Strategy
### CI/CD Pipeline
1. **Development:** Local development with hot reload
2. **Testing:** Automated tests on PR
3. **Staging:** Auto-deploy to staging environment
4. **Production:** Manual approval → deploy
### Database Migrations
- Alembic for schema migrations
- Backwards-compatible changes
- Rollback capability
- Test migrations on staging first
### Backup Strategy
- Automated daily database backups
- Point-in-time recovery capability
- File storage replication
- Backup retention: 30 days
## Future Technical Considerations
### Scalability
- Horizontal scaling (multiple app servers)
- Database sharding (by team_id)
- Microservices architecture (if needed)
- Message queue for async tasks (Celery + Redis)
### Mobile Apps
- React Native for iOS/Android
- Shared API backend
- Offline-first architecture
- Push notifications for team updates
### AI/ML Integration (Phase 5+)
- Suggest next steps based on past sessions
- Auto-categorize tickets
- Predict resolution time
- Natural language tree navigation