Updated documentation; added PERFORMANCE-HEALTH-CHECK.md
66
CLAUDE.md
@@ -91,6 +91,20 @@ When adding new frontend pages or components, use "ResolutionFlow" for any user-
|
||||
- Purple gradient theme, custom fonts (Plus Jakarta Sans, Inter, Outfit)
|
||||
- Custom SVG logo in header and auth pages
|
||||
- Updated favicon and browser tab title
|
||||
- **Token Refresh Fix:**
|
||||
- Silent refresh with single-flight queue (prevents concurrent 401 race conditions)
|
||||
- Backend `get_refresh_token_payload` dependency extracts refresh token from Authorization header
|
||||
- Frontend Axios interceptor queues failed requests during refresh, retries after success
|
||||
- Auth store synced after silent refresh via `setTokens` action
|
||||
- **Session Scratchpad (Floating Overlay):**
|
||||
- Fixed-position overlay panel (420px wide, 55vh tall) on right edge
|
||||
- Floating button when collapsed, slide-in panel when expanded
|
||||
- Ctrl+/ keyboard shortcut to toggle
|
||||
- Auto-save with 1s debounce, markdown preview, localStorage persistence
|
||||
- Main content adjusts width via padding transition when panel opens
|
||||
- **Global Thin Scrollbar Styling:**
|
||||
- 6px thin scrollbars site-wide (Firefox `scrollbar-width: thin` + WebKit pseudo-elements)
|
||||
- Theme-aware colors using CSS variables (`--border`, `--muted-foreground`)
|
||||
|
||||
### What's In Progress
|
||||
|
||||
@@ -180,7 +194,7 @@ patherly/
|
||||
│ │ ├── router.tsx
|
||||
│ │ ├── assets/brand/ # Brand logos (SVG)
|
||||
│ │ ├── api/ # Axios API client
|
||||
│ │ │ ├── client.ts # Axios instance with interceptors
|
||||
│ │ │ ├── client.ts # Axios instance with refresh queue interceptor
|
||||
│ │ │ ├── auth.ts
|
||||
│ │ │ ├── trees.ts
|
||||
│ │ │ └── sessions.ts
|
||||
@@ -196,7 +210,7 @@ patherly/
|
||||
│ │ │ ├── tree-editor/ # Tree editor components
|
||||
│ │ │ ├── tree-preview/ # Visual tree preview
|
||||
│ │ │ ├── step-library/ # Step library browser, forms, modals
|
||||
│ │ │ ├── session/ # Session modals, scratchpad sidebar
|
||||
│ │ │ ├── session/ # Session modals, scratchpad floating overlay
|
||||
│ │ │ └── ui/ # MarkdownContent
|
||||
│ │ ├── pages/
|
||||
│ │ │ ├── LoginPage.tsx
|
||||
@@ -508,6 +522,33 @@ Key state: `pendingStep`, `pendingContinuationNodeId`, `customBranchMode`, `bran
|
||||
Custom steps are stored in session JSONB (`custom_steps` field) and referenced by UUID in `pathTaken`.
|
||||
`findNode()` only searches tree structure -- use `findCustomStep()` for custom step UUIDs.
|
||||
|
||||
### Token Refresh: Match Frontend/Backend Contract
|
||||
|
||||
The refresh endpoint must accept tokens the same way the frontend sends them.
|
||||
|
||||
```python
from typing import Annotated

from fastapi import Depends

# WRONG - a bare `str` parameter is read from the query string, but the frontend sends an Authorization header
@router.post("/refresh")
async def refresh_token(refresh_token: str):
    payload = decode_token(refresh_token)

# CORRECT - Use dependency that reads from Authorization header
@router.post("/refresh")
async def refresh_token(
    payload: Annotated[dict, Depends(get_refresh_token_payload)],
):
    ...  # endpoint body omitted
```
|
||||
|
||||
The frontend Axios interceptor sends `Authorization: Bearer <refresh_token>`. The backend must extract it from the header, not expect it as a query/body parameter.
|
||||
|
||||
### CORS Errors Can Mask Server 500s
|
||||
|
||||
When the backend raises an unhandled exception and returns a 500 Internal Server Error, the error response is produced outside the CORS middleware, so it carries no CORS headers. The browser then reports a CORS error and hides the real cause. Always check backend logs first when debugging CORS issues locally.
|
||||
|
||||
### Run Migrations Before Local Testing
|
||||
|
||||
After cloning or pulling new changes, always run `alembic upgrade head` before starting the backend. Missing migrations cause 500 errors (e.g., `column does not exist`) that manifest as CORS errors in the browser.
|
||||
|
||||
---
|
||||
|
||||
## API Endpoints Reference
|
||||
@@ -586,7 +627,7 @@ interface Decision {
|
||||
|
||||
### State Management
|
||||
|
||||
- **Auth:** `useAuthStore` - Zustand with localStorage persistence
|
||||
- **Auth:** `useAuthStore` - Zustand with localStorage persistence (includes `setTokens` for silent refresh sync)
|
||||
- **Theme:** `useThemeStore` - Dark/light/system preference
|
||||
- **Tree Editor:** `useTreeEditorStore` - Zustand + immer + zundo (undo/redo)
|
||||
- **User Preferences:** `useUserPreferencesStore` - Zustand with localStorage persistence (export format default)
|
||||
@@ -612,9 +653,28 @@ interface Decision {
|
||||
import api from '@/api/client'
|
||||
|
||||
// Token refresh handled automatically by interceptor
|
||||
// Concurrent 401s are queued — only one refresh request fires at a time
|
||||
// On refresh failure, user is logged out and redirected to /login
|
||||
const response = await api.get('/api/v1/trees')
|
||||
```
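
The single-flight behaviour described in the comments above can be illustrated outside Axios. The following is a minimal Python `asyncio` sketch of the idea, not the project's interceptor; `SingleFlightRefresher` and `fake_refresh` are hypothetical names used only for this example:

```python
import asyncio


class SingleFlightRefresher:
    """Shares one in-flight refresh among all concurrent callers."""

    def __init__(self, refresh_fn):
        self._refresh_fn = refresh_fn  # hypothetical coroutine returning a new access token
        self._inflight = None          # Task for the refresh currently running, if any

    async def get_fresh_token(self):
        if self._inflight is None:
            # The first caller starts the refresh; later callers reuse the same Task.
            self._inflight = asyncio.ensure_future(self._run())
        return await self._inflight

    async def _run(self):
        try:
            return await self._refresh_fn()
        finally:
            # Clear the slot so the next token expiry triggers a new refresh.
            self._inflight = None


async def _demo():
    calls = 0

    async def fake_refresh():
        nonlocal calls
        calls += 1
        await asyncio.sleep(0.01)  # stand-in for the network round trip
        return f"token-{calls}"

    refresher = SingleFlightRefresher(fake_refresh)
    # Five requests hit an expired token at the same time.
    tokens = await asyncio.gather(*(refresher.get_fresh_token() for _ in range(5)))
    return calls, tokens


refresh_calls, tokens = asyncio.run(_demo())
print(refresh_calls, tokens)
```

All five callers await the same task, so only one refresh request is made and a failure would propagate to every queued caller.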
|
||||
|
||||
### Floating Overlay Pattern (Scratchpad)
|
||||
|
||||
The scratchpad uses `position: fixed` with an `onOpenChange` callback so the parent page can adjust layout:
|
||||
|
||||
```tsx
|
||||
// Child: ScratchpadSidebar.tsx
|
||||
onOpenChange?: (isOpen: boolean) => void
|
||||
// Fires when collapsed state changes, parent uses it to add/remove padding
|
||||
|
||||
// Parent: TreeNavigationPage.tsx
|
||||
const [scratchpadOpen, setScratchpadOpen] = useState(...)
|
||||
<div className={cn('...', scratchpadOpen && 'pr-[440px]')}>
|
||||
<div className="mx-auto max-w-4xl"> {/* centers in available space */}
|
||||
```
|
||||
|
||||
Position overlay at `right-2` (not `right-0`) so it sits inside the page scrollbar, and use full `rounded-lg` (not `rounded-l-lg`).
|
||||
|
||||
---
|
||||
|
||||
## Common Tasks
|
||||
|
||||
634
docs/PERFORMANCE-HEALTH-CHECK.md
Normal file
@@ -0,0 +1,634 @@
|
||||
# ResolutionFlow Performance Health Check
|
||||
|
||||
**Purpose:** Verify application performance and scalability before/during beta testing
|
||||
**When to run:** Before beta launch, then monthly during growth phase
|
||||
**Time required:** 2-3 hours first time, 30-60 minutes for routine checks
|
||||
|
||||
---
|
||||
|
||||
## Prerequisites
|
||||
|
||||
- [ ] Docker Desktop running
|
||||
- [ ] Access to Railway dashboard
|
||||
- [ ] VS Code open with ResolutionFlow project
|
||||
- [ ] Python virtual environment activated
|
||||
- [ ] Node.js installed (for the frontend build in Section 2.2; k6 is a standalone binary and does not need Node.js)
|
||||
|
||||
---
|
||||
|
||||
## 1. Database Performance Check
|
||||
|
||||
### 1.1 Verify Indexes Exist
|
||||
|
||||
**Why:** Indexes work like the index in a book. Without one, PostgreSQL has to scan every row of the table, which gets slower as data grows. With one, lookups on the indexed column stay fast.
|
||||
|
||||
**Commands to run:**
|
||||
```bash
|
||||
# Connect to your Railway PostgreSQL database
|
||||
# Get connection string from Railway dashboard → PostgreSQL service → Variables → DATABASE_URL
|
||||
|
||||
# Option 1: Use Railway CLI
|
||||
railway connect PostgreSQL
|
||||
|
||||
# Option 2: Use psql directly
|
||||
psql "your-database-url-here"
|
||||
```
|
||||
|
||||
**Once connected, run:**
|
||||
```sql
|
||||
-- Check what indexes exist
|
||||
SELECT
|
||||
tablename,
|
||||
indexname,
|
||||
indexdef
|
||||
FROM pg_indexes
|
||||
WHERE schemaname = 'public'
|
||||
ORDER BY tablename, indexname;
|
||||
```
|
||||
|
||||
**What you're looking for:**
|
||||
|
||||
✅ **GOOD:** You should see indexes on:
|
||||
- `users.email` (for login lookups)
|
||||
- `users.username` (for login lookups)
|
||||
- `trees.created_by` (for "my trees" queries)
|
||||
- `tree_nodes.tree_id` (for loading tree structure)
|
||||
- `sessions.tree_id` (for session lookups)
|
||||
|
||||
❌ **BAD:** If these are missing, queries will slow down as data grows
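
Comparing the query output against the list above by eye is error-prone, so the check can be scripted. This is a minimal sketch, assuming the rows come from the `pg_indexes` query as `(tablename, indexdef)` pairs; the helper names and the sample index names are illustrative, and only the leading column of each index is considered:

```python
import re

# Columns the checklist above expects to be indexed, as (table, column) pairs.
EXPECTED = [
    ("users", "email"),
    ("users", "username"),
    ("trees", "created_by"),
    ("tree_nodes", "tree_id"),
    ("sessions", "tree_id"),
]


def leading_column(indexdef):
    """Return the first name inside the parentheses of a CREATE INDEX statement."""
    match = re.search(r'\(\s*"?(\w+)"?', indexdef)
    return match.group(1) if match else None


def missing_indexes(rows, expected=EXPECTED):
    """rows: iterable of (tablename, indexdef) tuples taken from pg_indexes."""
    covered = {(table, leading_column(indexdef)) for table, indexdef in rows}
    return [pair for pair in expected if pair not in covered]


# Made-up sample rows in the shape PostgreSQL reports them.
sample_rows = [
    ("users", "CREATE UNIQUE INDEX users_email_key ON public.users USING btree (email)"),
    ("users", "CREATE UNIQUE INDEX users_username_key ON public.users USING btree (username)"),
    ("trees", "CREATE INDEX idx_trees_created_by ON public.trees USING btree (created_by)"),
    ("tree_nodes", "CREATE UNIQUE INDEX tree_nodes_pkey ON public.tree_nodes USING btree (id)"),
]
missing = missing_indexes(sample_rows)
print(missing)
```

Any pair left in `missing` is a candidate for a `CREATE INDEX` statement.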
|
||||
|
||||
**Fix if needed:**
|
||||
```sql
|
||||
-- Example: Add missing index
|
||||
CREATE INDEX idx_trees_created_by ON trees(created_by);
|
||||
CREATE INDEX idx_tree_nodes_tree_id ON tree_nodes(tree_id);
|
||||
CREATE INDEX idx_sessions_tree_id ON sessions(tree_id);
|
||||
```
|
||||
|
||||
### 1.2 Test Query Performance
|
||||
|
||||
**Run realistic queries and time them:**
|
||||
```sql
|
||||
-- Enable timing
|
||||
\timing
|
||||
|
||||
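-- Test: Full-text search on trees (simulates search bar)
-- Note: if description can be NULL, name || ' ' || description is NULL too;
-- in that case use coalesce(description, '') here and in any matching index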
|
||||
SELECT * FROM trees
|
||||
WHERE to_tsvector('english', name || ' ' || description) @@ to_tsquery('english', 'password');
|
||||
|
||||
-- Test: Load tree with all nodes (simulates opening tree editor)
|
||||
SELECT tn.*
|
||||
FROM tree_nodes tn
|
||||
WHERE tn.tree_id = 1 -- Replace with actual tree ID
|
||||
ORDER BY tn.position;
|
||||
|
||||
-- Test: User's tree list (simulates dashboard)
|
||||
SELECT * FROM trees
|
||||
WHERE created_by = 1 -- Replace with actual user ID
|
||||
ORDER BY updated_at DESC
|
||||
LIMIT 20;
|
||||
```
|
||||
|
||||
**Benchmarks:**
|
||||
- ✅ **GOOD:** All queries < 50ms
|
||||
- ⚠️ **WARNING:** Any query 50-200ms (optimize later)
|
||||
- ❌ **BAD:** Any query > 200ms (optimize NOW)
|
||||
|
||||
### 1.3 Check Database Size
|
||||
```sql
|
||||
-- See how much data you have
|
||||
SELECT
|
||||
schemaname,
|
||||
tablename,
|
||||
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size
|
||||
FROM pg_tables
|
||||
WHERE schemaname = 'public'
|
||||
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;
|
||||
```
|
||||
|
||||
**What this tells you:** If tables are growing unexpectedly large, you might have data bloat or missing cleanup logic.
|
||||
|
||||
---
|
||||
|
||||
## 2. Frontend Performance Check
|
||||
|
||||
### 2.1 Test Large Tree Rendering
|
||||
|
||||
**Create a "stress test" tree:**
|
||||
|
||||
1. Log into ResolutionFlow frontend
|
||||
2. Create a new tree called "Performance Test - Large Tree"
|
||||
3. Add 50-100 nodes (use copy/paste to speed this up)
|
||||
4. Save the tree
|
||||
|
||||
**What to watch:**
|
||||
|
||||
- Does the editor lag when adding nodes?
|
||||
- Does scrolling feel smooth?
|
||||
- Does saving take more than 2-3 seconds?
|
||||
|
||||
**Tools to use:**
|
||||
|
||||
Open Chrome DevTools (F12):
|
||||
```
|
||||
1. Go to Performance tab
|
||||
2. Click Record (red circle)
|
||||
3. Interact with large tree (scroll, add nodes, expand/collapse)
|
||||
4. Stop recording
|
||||
5. Look for long tasks on the Main track (flagged in red); these are the blocking/slow operations
|
||||
```
|
||||
|
||||
**Benchmarks:**
|
||||
- ✅ **GOOD:** No operations block for > 100ms
|
||||
- ⚠️ **WARNING:** Some operations 100-300ms
|
||||
- ❌ **BAD:** Operations > 300ms (users will notice lag)
|
||||
|
||||
### 2.2 Check Bundle Size
|
||||
|
||||
**Why:** Large JavaScript bundles = slow initial page load
|
||||
```bash
|
||||
# From your React frontend directory
|
||||
cd frontend
|
||||
npm run build
|
||||
|
||||
# Look at the output - it will show bundle sizes
|
||||
```
|
||||
|
||||
**Benchmarks:**
|
||||
- ✅ **GOOD:** Main bundle < 500KB gzipped
|
||||
- ⚠️ **WARNING:** 500KB - 1MB
|
||||
- ❌ **BAD:** > 1MB (investigate what's bloating it)
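
If the build output does not report gzipped sizes, they can be measured directly. This is a rough sketch, assuming the build writes `.js` files under a directory such as `frontend/dist` (the path and both helper names are assumptions); Python's default gzip level may differ slightly from what the hosting platform applies:

```python
import gzip
from pathlib import Path


def gzipped_sizes(dist_dir, pattern="*.js"):
    """Map each matching file under dist_dir to its gzip-compressed size in bytes."""
    sizes = {}
    for path in sorted(Path(dist_dir).rglob(pattern)):
        sizes[str(path.relative_to(dist_dir))] = len(gzip.compress(path.read_bytes()))
    return sizes


def rate_bundle(size_bytes):
    """Apply the thresholds above: under 500 KB good, up to 1 MB warning, else bad."""
    kb = size_bytes / 1024
    if kb < 500:
        return "GOOD"
    if kb <= 1024:
        return "WARNING"
    return "BAD"
```

A typical use would be to loop over `gzipped_sizes("frontend/dist").items()` after `npm run build` and print `rate_bundle(size)` next to each file name.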
|
||||
|
||||
### 2.3 Lighthouse Audit
|
||||
|
||||
**Chrome has this built-in:**
|
||||
```
|
||||
1. Open ResolutionFlow in Chrome
|
||||
2. F12 → Lighthouse tab
|
||||
3. Select "Desktop" + "Performance"
|
||||
4. Click "Analyze page load"
|
||||
```
|
||||
|
||||
**Benchmarks:**
|
||||
- ✅ **GOOD:** Performance score > 80
|
||||
- ⚠️ **WARNING:** 60-80
|
||||
- ❌ **BAD:** < 60
|
||||
|
||||
**Common issues and fixes:**
|
||||
- "Eliminate render-blocking resources" → defer or async-load non-critical scripts and styles
|
||||
- "Reduce unused JavaScript" → code splitting needed
|
||||
- "Serve images in next-gen formats" → use WebP instead of PNG
|
||||
|
||||
---
|
||||
|
||||
## 3. API Response Time Check
|
||||
|
||||
### 3.1 Manual Timing Test
|
||||
|
||||
**Use Railway logs:**
|
||||
```
|
||||
1. Go to Railway dashboard → API service → Deployments
|
||||
2. Click "View Logs"
|
||||
3. Perform actions in ResolutionFlow frontend
|
||||
4. Watch logs for response times
|
||||
```
|
||||
|
||||
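Access log lines look like the sample below. Uvicorn's default access log does not include a duration, so a suffix like `[0.023s]` only appears if the app logs request timing itself (for example via middleware):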
|
||||
```
|
||||
INFO: 127.0.0.1 - "GET /api/trees HTTP/1.1" 200 OK [0.023s]
|
||||
```
|
||||
|
||||
**Benchmarks:**
|
||||
- ✅ **GOOD:** Most endpoints < 100ms
|
||||
- ⚠️ **WARNING:** Some endpoints 100-300ms
|
||||
- ❌ **BAD:** Any endpoint > 500ms
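
To get per-request durations into the logs, the app has to record them. This is a minimal sketch of a pure ASGI middleware that does so using only the standard library; the class name, logger name and log format are assumptions, not part of the ResolutionFlow codebase:

```python
import logging
import time

logger = logging.getLogger("request_timing")  # hypothetical logger name


class TimingMiddleware:
    """Pure ASGI middleware that logs method, path, status and duration."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            # Pass websocket and lifespan traffic through untouched.
            await self.app(scope, receive, send)
            return

        start = time.perf_counter()
        status = {"code": None}

        async def send_wrapper(message):
            # The status code travels in the first response message.
            if message["type"] == "http.response.start":
                status["code"] = message["status"]
            await send(message)

        try:
            await self.app(scope, receive, send_wrapper)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            logger.info("%s %s %s [%.1fms]", scope["method"], scope["path"],
                        status["code"], elapsed_ms)
```

With FastAPI this could be registered via `app.add_middleware(TimingMiddleware)`, after which each request produces one timing line that can be compared against the benchmarks above.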
|
||||
|
||||
### 3.2 Automated API Testing
|
||||
|
||||
**Create a simple test script:**
|
||||
```python
# File: tests/performance_test.py

import time
from statistics import mean

import httpx

API_BASE = "https://api.resolutionflow.com"  # Your Railway API URL
TOKEN = "your-jwt-token-here"  # Get from browser DevTools after login

headers = {"Authorization": f"Bearer {TOKEN}"}


def time_endpoint(method, path, **kwargs):
    """Time a single API request and return (elapsed_ms, status_code)."""
    start = time.perf_counter()
    response = httpx.request(method, f"{API_BASE}{path}", headers=headers, **kwargs)
    elapsed = (time.perf_counter() - start) * 1000  # Convert to milliseconds
    return elapsed, response.status_code


# Test critical endpoints as (method, path, extra request kwargs)
tests = [
    ("GET", "/api/trees", {}),
    ("GET", "/api/trees/1", {}),  # Replace with actual tree ID
    ("GET", "/api/trees/1/nodes", {}),
    ("POST", "/api/trees/search", {"json": {"query": "password"}}),
]

print("API Performance Test Results:")
print("-" * 50)

for method, path, kwargs in tests:
    times = []
    statuses = set()
    for _ in range(5):  # Run each test 5 times
        elapsed, status = time_endpoint(method, path, **kwargs)
        times.append(elapsed)
        statuses.add(status)

    avg_time = mean(times)
    print(f"{method} {path} (status: {sorted(statuses)})")
    print(f"  Average: {avg_time:.2f}ms")
    print(f"  Min: {min(times):.2f}ms, Max: {max(times):.2f}ms")
    print()
```
|
||||
|
||||
**Run it:**
|
||||
```bash
|
||||
python tests/performance_test.py
|
||||
```
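
An average can hide a slow tail, and the k6 thresholds later in this document are expressed as p95. As a small sketch, assuming the timings are collected as a list of milliseconds, the standard library can report percentiles alongside the mean (`summarize` is a hypothetical helper name):

```python
from statistics import mean, quantiles


def summarize(times_ms):
    """Return mean, median and p95 for a list of latencies in milliseconds."""
    # quantiles(n=100) returns the 99 cut points p1..p99 and needs at least two samples.
    cuts = quantiles(times_ms, n=100, method="inclusive")
    return {"mean": mean(times_ms), "p50": cuts[49], "p95": cuts[94]}


sample = [20.0] * 18 + [400.0, 400.0]  # eighteen fast requests and two slow outliers
stats = summarize(sample)
print(stats)
```

With only 5 runs per endpoint the p95 is dominated by the slowest sample, so raising the run count makes the number more meaningful.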
|
||||
|
||||
---
|
||||
|
||||
## 4. Monitoring Setup
|
||||
|
||||
### 4.1 Railway Built-in Monitoring
|
||||
|
||||
**What Railway gives you for free:**
|
||||
```
|
||||
1. Go to Railway dashboard
|
||||
2. Click each service (API, Frontend, PostgreSQL)
|
||||
3. Go to "Metrics" tab
|
||||
```
|
||||
|
||||
**Watch for:**
|
||||
- CPU usage spikes (should stay < 50% normally)
|
||||
- Memory usage growing over time (memory leak indicator)
|
||||
- Request rate (see usage patterns)
|
||||
|
||||
**Set up alerts:**
|
||||
```
|
||||
1. Railway dashboard → Project Settings → Notifications
|
||||
2. Add your email
|
||||
3. Enable "Deployment Failed" and "Service Crashed"
|
||||
```
|
||||
|
||||
### 4.2 Sentry Error Tracking (Recommended)
|
||||
|
||||
**Why add Sentry:**
|
||||
- Free tier = 5,000 errors/month
|
||||
- Email alerts when things break
|
||||
- See exact user actions before crash
|
||||
- Industry standard (your future dev team will expect this)
|
||||
|
||||
**Setup (5 minutes):**
|
||||
|
||||
**Backend (FastAPI):**
|
||||
```bash
|
||||
pip install sentry-sdk[fastapi]
|
||||
```
|
||||
```python
|
||||
# File: main.py (add at the top)
|
||||
|
||||
import sentry_sdk
|
||||
|
||||
sentry_sdk.init(
|
||||
dsn="your-sentry-dsn-here", # Get from sentry.io after signup
|
||||
traces_sample_rate=0.1, # 10% of requests (free tier friendly)
|
||||
environment="production",
|
||||
)
|
||||
```
|
||||
|
||||
**Frontend (React):**
|
||||
```bash
|
||||
npm install @sentry/react
|
||||
```
|
||||
```javascript
// File: src/index.js (add at the top)

import * as Sentry from "@sentry/react";

Sentry.init({
  dsn: "your-sentry-dsn-here",
  // `new Sentry.BrowserTracing()` was removed in SDK v8; use the function form
  integrations: [Sentry.browserTracingIntegration()],
  tracesSampleRate: 0.1,
  environment: "production",
});
```
|
||||
|
||||
**Get your DSN:**
|
||||
```
|
||||
1. Sign up at sentry.io (free)
|
||||
2. Create one project per platform: one for "FastAPI" and one for "React" (each gets its own DSN)
|
||||
3. Copy the DSN (looks like: https://abc123@o123.ingest.sentry.io/456)
|
||||
4. Add to Railway environment variables:
|
||||
- SENTRY_DSN=your-dsn-here
|
||||
```
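
The backend snippet above hardcodes the DSN, while the steps store it in a Railway environment variable. This is a minimal sketch of reading it from the environment instead; `sentry_settings` is a hypothetical helper, and `SENTRY_TRACES_SAMPLE_RATE` and `SENTRY_ENVIRONMENT` are assumed variable names that the steps above do not define:

```python
import os


def sentry_settings(env=os.environ):
    """Build keyword arguments for sentry_sdk.init() from environment variables.

    Returns None when SENTRY_DSN is unset, so local runs send nothing.
    """
    dsn = env.get("SENTRY_DSN")
    if not dsn:
        return None
    return {
        "dsn": dsn,
        "traces_sample_rate": float(env.get("SENTRY_TRACES_SAMPLE_RATE", "0.1")),
        "environment": env.get("SENTRY_ENVIRONMENT", "production"),
    }


# In main.py this would be used roughly as:
#     settings = sentry_settings()
#     if settings:
#         sentry_sdk.init(**settings)
```

Keeping the DSN out of source control also lets staging and production report to different Sentry projects without code changes.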
|
||||
|
||||
**What you get:**
|
||||
|
||||
- Email when errors occur
|
||||
- Stack traces showing exactly what broke
|
||||
- User session replay (see what they clicked before crash); this requires enabling Sentry's Replay integration, which the snippets above do not configure
|
||||
- Performance monitoring (slow API calls flagged automatically)
|
||||
|
||||
---
|
||||
|
||||
## 5. Load Testing with k6
|
||||
|
||||
**Why k6:**
|
||||
- Industry standard (Grafana Labs maintains it)
|
||||
- Shows how response times and error rates change as the number of concurrent users grows
|
||||
- Simple JavaScript syntax
|
||||
- Free and open source
|
||||
|
||||
### 5.1 Install k6
|
||||
|
||||
**Windows (using Chocolatey):**
|
||||
```powershell
|
||||
choco install k6
|
||||
```
|
||||
|
||||
**Or download directly:**
|
||||
- Go to: https://k6.io/docs/get-started/installation/
|
||||
- Download Windows installer
|
||||
- Run installer
|
||||
|
||||
**Verify:**
|
||||
```bash
|
||||
k6 version
|
||||
```
|
||||
|
||||
### 5.2 Create Load Test Script
|
||||
|
||||
**File: `tests/load_test.js`**
|
||||
```javascript
|
||||
import http from 'k6/http';
|
||||
import { check, sleep } from 'k6';
|
||||
|
||||
// Test configuration
|
||||
export const options = {
|
||||
stages: [
|
||||
{ duration: '30s', target: 10 }, // Ramp up to 10 users over 30s
|
||||
{ duration: '1m', target: 10 }, // Stay at 10 users for 1 minute
|
||||
{ duration: '30s', target: 20 }, // Ramp up to 20 users
|
||||
{ duration: '1m', target: 20 }, // Stay at 20 users for 1 minute
|
||||
{ duration: '30s', target: 0 }, // Ramp down to 0
|
||||
],
|
||||
thresholds: {
|
||||
http_req_duration: ['p(95)<500'], // 95% of requests must complete in 500ms
|
||||
http_req_failed: ['rate<0.01'], // Less than 1% of requests can fail
|
||||
},
|
||||
};
|
||||
|
||||
const BASE_URL = 'https://api.resolutionflow.com';
|
||||
// Setup: runs once before the test starts; the returned token is shared by all virtual users
|
||||
export function setup() {
|
||||
const loginRes = http.post(`${BASE_URL}/api/auth/login`,
|
||||
JSON.stringify({
|
||||
username: 'test_user', // Replace with test account
|
||||
password: 'test_password',
|
||||
}),
|
||||
{ headers: { 'Content-Type': 'application/json' } }
|
||||
);
|
||||
|
||||
return { token: loginRes.json('access_token') };
|
||||
}
|
||||
|
||||
// Main test: Simulate realistic user behavior
|
||||
export default function (data) {
|
||||
const headers = {
|
||||
'Authorization': `Bearer ${data.token}`,
|
||||
'Content-Type': 'application/json',
|
||||
};
|
||||
|
||||
// Scenario 1: Load dashboard (get trees list)
|
||||
let res = http.get(`${BASE_URL}/api/trees`, { headers });
|
||||
check(res, {
|
||||
'dashboard loaded': (r) => r.status === 200,
|
||||
'dashboard fast': (r) => r.timings.duration < 300,
|
||||
});
|
||||
sleep(1); // User reads for 1 second
|
||||
|
||||
// Scenario 2: Open a tree
|
||||
res = http.get(`${BASE_URL}/api/trees/1`, { headers }); // Replace with real tree ID
|
||||
check(res, {
|
||||
'tree loaded': (r) => r.status === 200,
|
||||
'tree load fast': (r) => r.timings.duration < 500,
|
||||
});
|
||||
sleep(2); // User reads tree for 2 seconds
|
||||
|
||||
// Scenario 3: Load tree nodes
|
||||
res = http.get(`${BASE_URL}/api/trees/1/nodes`, { headers });
|
||||
check(res, {
|
||||
'nodes loaded': (r) => r.status === 200,
|
||||
'nodes fast': (r) => r.timings.duration < 500,
|
||||
});
|
||||
sleep(1);
|
||||
|
||||
// Scenario 4: Search trees
|
||||
res = http.post(
|
||||
`${BASE_URL}/api/trees/search`,
|
||||
JSON.stringify({ query: 'password reset' }),
|
||||
{ headers }
|
||||
);
|
||||
check(res, {
|
||||
'search worked': (r) => r.status === 200,
|
||||
'search fast': (r) => r.timings.duration < 400,
|
||||
});
|
||||
sleep(2);
|
||||
}
|
||||
```
|
||||
|
||||
### 5.3 Run Load Test
|
||||
|
||||
**Basic test (uses the `stages` in the script, peaking at 20 users):**
|
||||
```bash
|
||||
k6 run tests/load_test.js
|
||||
```
|
||||
|
||||
**Aggressive test (50 users; `--vus` and `--duration` override the script's `stages`):**
|
||||
```bash
|
||||
k6 run --vus 50 --duration 2m tests/load_test.js
|
||||
```
|
||||
|
||||
**What the output means:**
|
||||
```
|
||||
✓ dashboard loaded
|
||||
✓ dashboard fast
|
||||
|
||||
checks.........................: 95.23% ✓ 1234 ✗ 78
|
||||
data_received..................: 1.2 MB 20 kB/s
|
||||
data_sent......................: 456 kB 7.6 kB/s
|
||||
http_req_blocked...............: avg=1.2ms min=0s med=0s max=45ms p(90)=0s p(95)=0s
|
||||
http_req_duration..............: avg=142ms min=23ms med=98ms max=1.2s p(90)=245ms p(95)=387ms
|
||||
http_reqs......................: 1234 20.5/s
|
||||
```
|
||||
|
||||
**How to read this:**
|
||||
|
||||
- `checks`: % of `check()` assertions that passed (want > 95%)
- `http_req_duration p(95)`: 95% of requests completed faster than this (want < 500ms)
- `http_reqs`: Total requests sent, followed by the average requests per second
- `http_req_failed`: % of requests that errored (want < 1%); this line is not shown in the trimmed sample above
|
||||
|
||||
### 5.4 Interpret Results
|
||||
|
||||
**✅ GOOD (Ready for beta):**
|
||||
```
|
||||
http_req_duration p(95) < 500ms
|
||||
http_req_failed < 1%
|
||||
All checks passing > 95%
|
||||
```
|
||||
|
||||
**⚠️ WARNING (Watch closely during beta):**
|
||||
```
|
||||
http_req_duration p(95) 500-1000ms
|
||||
http_req_failed 1-5%
|
||||
Some checks failing
|
||||
```
|
||||
|
||||
**❌ BAD (Fix before beta launch):**
|
||||
```
|
||||
http_req_duration p(95) > 1000ms
|
||||
http_req_failed > 5%
|
||||
Lots of timeouts or 500 errors
|
||||
```
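
The three bands above can be encoded so that a run is classified the same way every month. This is a minimal sketch, assuming the p95, failure rate and check pass rate are read off the k6 summary by hand; `rate_load_test` is a hypothetical helper, and treating a check pass rate of 95% or lower as a warning is one interpretation of "some checks failing":

```python
def rate_load_test(p95_ms, failed_rate, checks_rate):
    """Classify a k6 run using the GOOD / WARNING / BAD bands above.

    failed_rate and checks_rate are fractions between 0 and 1.
    """
    if p95_ms > 1000 or failed_rate > 0.05:
        return "BAD"
    if p95_ms >= 500 or failed_rate >= 0.01 or checks_rate <= 0.95:
        return "WARNING"
    return "GOOD"


# Values from the sample output in Section 5.3, assuming no failed requests.
verdict = rate_load_test(p95_ms=387, failed_rate=0.0, checks_rate=0.9523)
print(verdict)
```

The worst band wins: a single metric in the BAD range classifies the whole run as BAD.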
|
||||
|
||||
---
|
||||
|
||||
## 6. Pre-Launch Checklist
|
||||
|
||||
Run this checklist **before** inviting beta testers:
|
||||
|
||||
### Database
|
||||
- [ ] All critical indexes exist (Section 1.1)
|
||||
- [ ] Query performance < 200ms (Section 1.2)
|
||||
- [ ] No unexplained table bloat (Section 1.3)
|
||||
|
||||
### Frontend
|
||||
- [ ] Large tree (100 nodes) renders without lag (Section 2.1)
|
||||
- [ ] Bundle size < 1MB (Section 2.2)
|
||||
- [ ] Lighthouse score > 70 (Section 2.3)
|
||||
|
||||
### API
|
||||
- [ ] All endpoints < 500ms under load (Sections 3 and 5)
- [ ] Railway logs show no errors (Section 3.1)
|
||||
|
||||
### Monitoring
|
||||
- [ ] Railway alerts configured (Section 4.1)
|
||||
- [ ] Sentry installed (optional but recommended) (Section 4.2)
|
||||
|
||||
### Load Testing
|
||||
- [ ] k6 test passes with 20 concurrent users (Section 5.3)
|
||||
- [ ] Request failure rate below 1% during load test (Section 5.4)
|
||||
|
||||
---
|
||||
|
||||
## 7. Monthly Health Check (After Launch)
|
||||
|
||||
Once live with beta testers, run this monthly:
|
||||
|
||||
**Quick version (30 minutes):**
|
||||
```bash
|
||||
# 1. Check Railway metrics
|
||||
# Look for: CPU/memory trends, error rate spikes
|
||||
|
||||
# 2. Review Sentry errors (if installed)
|
||||
# Look for: New error patterns, increasing error rates
|
||||
|
||||
# 3. Run quick load test
|
||||
k6 run tests/load_test.js
|
||||
|
||||
# 4. Check database query times
|
||||
# Run queries from Section 1.2, watch for slowdowns
|
||||
```
|
||||
|
||||
**When to do deep dive:**
|
||||
- After adding major new features
|
||||
- If users report slowness
|
||||
- Before scaling to new MSP clients
|
||||
- Every 3 months minimum
|
||||
|
||||
---
|
||||
|
||||
## 8. Common Performance Issues & Fixes
|
||||
|
||||
### Issue: "Search is slow"
|
||||
|
||||
**Diagnosis:**
|
||||
```sql
|
||||
EXPLAIN ANALYZE
|
||||
SELECT * FROM trees
|
||||
WHERE to_tsvector('english', name || ' ' || description) @@ to_tsquery('english', 'password');
|
||||
```
|
||||
|
||||
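**Fix:** Add a GIN index. The index expression must match the query's `to_tsvector(...)` expression exactly, or PostgreSQL will not use it: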
|
||||
```sql
|
||||
CREATE INDEX idx_trees_fts ON trees USING GIN (to_tsvector('english', name || ' ' || description));
|
||||
```
|
||||
|
||||
### Issue: "Loading tree nodes is slow"
|
||||
|
||||
**Diagnosis:** Missing index on foreign key
|
||||
|
||||
**Fix:**
|
||||
```sql
|
||||
CREATE INDEX idx_tree_nodes_tree_id ON tree_nodes(tree_id);
|
||||
```
|
||||
|
||||
### Issue: "Dashboard takes forever to load"
|
||||
|
||||
**Diagnosis:** Fetching too much data
|
||||
|
||||
**Fix:** Add pagination to API:
|
||||
```python
|
||||
# Instead of: SELECT * FROM trees
|
||||
# Use: SELECT * FROM trees ORDER BY updated_at DESC LIMIT 20 OFFSET 0  (ORDER BY keeps pages stable)
|
||||
```
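
The LIMIT/OFFSET pattern can be tried end to end without touching the real database. This is a minimal sketch using an in-memory SQLite table as a stand-in for the PostgreSQL `trees` table; `list_trees`, the `page`/`page_size` parameters and the integer `updated_at` column are assumptions for illustration:

```python
import sqlite3


def list_trees(conn, page=1, page_size=20):
    """Return one page of trees, newest first, using LIMIT/OFFSET."""
    offset = (page - 1) * page_size
    cursor = conn.execute(
        "SELECT id, name FROM trees ORDER BY updated_at DESC, id DESC LIMIT ? OFFSET ?",
        (page_size, offset),
    )
    return cursor.fetchall()


# Demo data: 45 trees whose updated_at increases with the id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trees (id INTEGER PRIMARY KEY, name TEXT, updated_at INTEGER)")
conn.executemany(
    "INSERT INTO trees (id, name, updated_at) VALUES (?, ?, ?)",
    [(i, f"tree-{i}", i) for i in range(1, 46)],
)
first_page = list_trees(conn, page=1)
last_page = list_trees(conn, page=3)
print(len(first_page), len(last_page))
```

The secondary `id DESC` sort is a tie-breaker so that rows with equal `updated_at` values cannot move between pages.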
|
||||
|
||||
### Issue: "Frontend feels sluggish"
|
||||
|
||||
**Diagnosis:** Re-rendering too often
|
||||
|
||||
**Fix:** Add React.memo() to components, use proper dependency arrays in useEffect
|
||||
|
||||
### Issue: "API crashes under load"
|
||||
|
||||
**Diagnosis:** Often the service is running out of memory or CPU; confirm this in the Railway Metrics tab (Section 4.1) before changing limits
|
||||
|
||||
**Fix:**
|
||||
```
|
||||
1. Railway dashboard → API service → Settings
|
||||
2. Increase memory limit (default is 512MB, try 1GB)
|
||||
3. Enable auto-scaling if needed
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Resources
|
||||
|
||||
**Tools mentioned:**
|
||||
- k6: https://k6.io/docs/
|
||||
- Sentry: https://sentry.io/
|
||||
- PostgreSQL EXPLAIN: https://www.postgresql.org/docs/current/using-explain.html
|
||||
- Chrome Lighthouse: Built into Chrome DevTools (F12)
|
||||
|
||||
**When to get help:**
|
||||
- k6 test failing badly (> 10% error rate)
|
||||
- Database queries consistently > 1 second
|
||||
- Sentry showing critical errors
|
||||
- Railway CPU/memory maxing out
|
||||
|
||||
**Next steps after this checklist:**
|
||||
- If all checks pass → Launch beta confidently
|
||||
- If warnings found → Document them, monitor during beta
|
||||
- If critical issues → Fix before launch, re-run tests
|
||||