# Troubleshooting Decision Tree Application

> Transform chaos into clarity - guided troubleshooting with automatic documentation for MSP engineers.

## Project Status: 📋 Planning Phase

Currently in the planning and architecture phase. Development will begin once key decisions are made and initial troubleshooting scenarios are documented.

---

## The Problem

MSP engineers face constant context switching between diverse technical issues (file shares, server outages, VPN failures, Active Directory problems). This creates:

- **Cognitive overload:** 15-25 minutes to regain focus after each context switch
- **Inconsistent documentation:** Under pressure, notes are rushed or incomplete
- **Lost tribal knowledge:** Best troubleshooting paths live only in senior engineers' heads
- **Repeated work:** Same issues investigated from scratch each time
- **Burnout:** Research shows context switching is a major contributor to burnout

## The Solution

An intelligent decision tree system that:

- ✅ **Guides** engineers through proven troubleshooting paths
- ✅ **Captures** decisions and notes automatically as you work
- ✅ **Generates** professional ticket documentation with one click
- ✅ **Builds** institutional knowledge that improves over time
- ✅ **Reduces** cognitive load during high-stress situations

### Success Metric
If Michael (our primary user) uses this tool for **50% of his tickets in 3 months**, we've succeeded.

---

## Key Features

### MVP (Weeks 1-3)
- 🌳 **Tree Navigation** - Step-by-step guided troubleshooting
- 📝 **Automatic Notes** - Capture context at each decision point
- 📄 **Export** - Generate professional documentation (plain text, markdown, HTML)
- 🔐 **Multi-User** - Team authentication and access control
- 📚 **Documentation Links** - Contextual links to KB articles and vendor docs

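
To make the MVP features above more concrete, here is a minimal sketch of how tree navigation and note capture could fit together. Nothing in it is a decided design: the `Node`, `Step`, and `Session` names, their fields, and the sample tree are all assumptions made for illustration, and a real implementation would persist this data rather than hold it in memory.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One step in a tree: a question with labeled answers, or a resolution."""
    id: str
    prompt: str
    # Maps an answer label (e.g. "yes") to the id of the next node.
    options: dict[str, str] = field(default_factory=dict)
    # Contextual links to KB articles or vendor docs for this step.
    doc_links: list[str] = field(default_factory=list)

    @property
    def is_terminal(self) -> bool:
        # A node with no outgoing options ends the walk.
        return not self.options


@dataclass
class Step:
    """A recorded decision: which node, which answer, and the engineer's note."""
    node_id: str
    answer: str
    note: str = ""


@dataclass
class Session:
    """Tracks one engineer's walk through one tree for one ticket."""
    ticket: str
    nodes: dict[str, Node]
    current_id: str
    steps: list[Step] = field(default_factory=list)

    @property
    def current(self) -> Node:
        return self.nodes[self.current_id]

    def choose(self, answer: str, note: str = "") -> Node:
        """Record the answer and note, then advance to the next node."""
        node = self.current
        if answer not in node.options:
            raise ValueError(f"{answer!r} is not an option at node {node.id!r}")
        self.steps.append(Step(node.id, answer, note))
        self.current_id = node.options[answer]
        return self.current


# Hypothetical three-node tree for a file share ticket.
nodes = {
    "ping": Node("ping", "Can the user ping the file server?",
                 {"yes": "perms", "no": "network"}),
    "perms": Node("perms", "Resolution: review share and NTFS permissions."),
    "network": Node("network", "Resolution: escalate to network troubleshooting."),
}

session = Session(ticket="T-1001", nodes=nodes, current_id="ping")
end = session.choose("yes", note="Ping replies normally.")
print(end.prompt)
```

Because every call to `choose` appends a `Step`, the notes accumulate as a side effect of navigating, which is the "automatic" part of the Automatic Notes feature.
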
### Phase 2 (Weeks 4-6)
- 👥 **Team Management** - Controlled authorship, shared access
- ✏️ **Tree Editor** - Visual interface to create/modify decision trees
- 📱 **Mobile Responsive** - Works on phone/tablet for on-site work
- 🔀 **Custom Branches** - Add unique steps on-the-fly during troubleshooting
- 🔍 **Search & Categories** - Find the right tree quickly

### Phase 3 (Weeks 7-12)
- 📎 **Attachments** - Upload screenshots, logs, command outputs
- 💾 **Offline Mode** - Continue working without internet, sync when back online
- 🏢 **Client Context** - Auto-fill client-specific details (server names, topologies)
- 📧 **Send to Engineer** - Generate simplified checklist for onsite techs
- 📊 **Analytics** - Track usage, common paths, team performance

### Phase 4 (Months 4-6)
- 🔌 **API & Integrations** - Connect to ConnectWise, Kaseya, LabTech
- ⚡ **Automation** - Execute PowerShell scripts directly from trees
- 🏢 **Enterprise Features** - SSO, white-labeling, advanced RBAC
- 🌐 **Marketplace** - Share and discover community-contributed trees

---

## Tech Stack

### Frontend
- **React** - Modern, flexible, excellent offline support
- **Tailwind CSS** - Rapid UI development
- **Service Workers** - Offline capability
- **IndexedDB** - Local data storage

### Backend
- **Python FastAPI** - Modern, fast, async support
- **SQLAlchemy** - ORM with async support
- **PostgreSQL** - Reliable database with excellent JSON support
- **Alembic** - Database migrations

### Infrastructure
- **S3-Compatible Storage** - File attachments (MinIO for dev, S3/Spaces for prod)
- **Railway/Render** - Simple hosting to start
- **Docker** - Containerized development environment

---

## Project Structure

```
troubleshooting-tree-app/
├── docs/
│   ├── 01-PROJECT-OVERVIEW.md              # Vision, goals, market analysis
│   ├── 02-TECHNICAL-ARCHITECTURE.md        # System design, data models, API specs
│   ├── 03-DEVELOPMENT-ROADMAP.md           # Phases, timeline, milestones
│   ├── 04-FEATURE-SPECIFICATIONS.md        # Detailed feature descriptions
│   └── 05-QUESTIONS-AND-ACTION-ITEMS.md    # Decisions needed, next steps
├── backend/                                # Python FastAPI application (future)
├── frontend/                               # React application (future)
├── database/                               # Database schemas, migrations (future)
└── README.md                               # This file
```

---

## Getting Started

### For Michael (Primary User)

**Immediate Action Items:**

1. **Answer Key Questions** (see `docs/05-QUESTIONS-AND-ACTION-ITEMS.md`)
   - Timeline needs
   - Budget for hosting
   - Team size
   - Branding preferences

2. **Document 5 Troubleshooting Scenarios**
   - Citrix VDA Not Registering
   - FSLogix Profile Issues
   - Active Directory Replication Failure
   - SonicWall VPN Tunnel Down
   - User Unable to Access File Share

   See the template in `docs/05-QUESTIONS-AND-ACTION-ITEMS.md`.

3. **Provide Sample Export**
   - Show how you currently write ticket notes
   - What format/level of detail is needed

4. **Review Documentation**
   - Read through all docs in the `docs/` folder
   - Flag anything unclear or that you disagree with
   - Add your own thoughts/ideas

### For Developers (Future)

Once development starts:

1. Clone repository
2. Set up development environment (Docker)
3. Install dependencies
4. Run migrations
5. Start development servers
6. See `CONTRIBUTING.md` for coding standards

---

## Development Principles

1. **User First** - Every feature must solve a real problem for Michael and his team
2. **Speed Matters** - Tool must be faster than doing it manually
3. **Progressive Enhancement** - Start simple, add complexity only when needed
4. **Offline Capable** - Many MSP sites have poor connectivity
5. **Automation-Ready** - Architecture supports future integration with scripts/tools
6. **Documentation Over Memory** - Capture tribal knowledge explicitly
7. **Fail Gracefully** - Never lose user's work, even if server fails

---

## Use Cases

### Scenario 1: Standard Troubleshooting
Michael gets a ticket: "User can't access file share"
1. Opens app, selects "File Share Access Issues" tree
2. Enters ticket number, client name
3. Follows decision tree, making selections and adding notes
4. Reaches resolution in 10 minutes
5. Clicks "Export", copies formatted notes into ticket
6. Done - professional documentation with zero extra effort

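
The "Export" step in the scenario above could be sketched as follows. This is only an illustration of turning recorded decisions into Markdown ticket notes: the `Step` fields, the `export_markdown` function, the output layout, and the sample ticket data are assumptions, since the real export format depends on the sample ticket notes still to be provided.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One recorded decision; field names are illustrative."""
    question: str
    answer: str
    note: str = ""


def export_markdown(ticket: str, client: str, tree: str,
                    steps: list[Step], resolution: str) -> str:
    """Render a finished session as Markdown ticket notes."""
    lines = [
        f"# {tree}",
        "",
        f"- **Ticket:** {ticket}",
        f"- **Client:** {client}",
        "",
        "## Steps",
        "",
    ]
    for number, step in enumerate(steps, start=1):
        lines.append(f"{number}. {step.question} **{step.answer}**")
        if step.note:
            # Indent so the note renders as part of the numbered item.
            lines.append(f"   - Note: {step.note}")
    lines += ["", "## Resolution", "", resolution]
    return "\n".join(lines)


# Hypothetical session data for a file share ticket.
notes = export_markdown(
    ticket="T-1001",
    client="Example Client",
    tree="File Share Access Issues",
    steps=[
        Step("Can the user ping the file server?", "Yes", "Ping replies normally."),
        Step("Is the user in the share's access group?", "No"),
    ],
    resolution="Added the user to the access group; access confirmed.",
)
print(notes)
```

Plain text and HTML exports could reuse the same step data with a different renderer, which keeps the session record independent of any one output format.
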
### Scenario 2: Complex Multi-Step Issue
Michael is troubleshooting a Citrix VDA registration failure
1. Starts with "VDA Not Registering" tree
2. Discovers network issue, branches to "Network Connectivity" tree
3. Finds firewall blocking traffic, attaches screenshot of rule
4. Returns to VDA tree, continues troubleshooting
5. Automation script restarts services, captures output
6. VDA registers successfully
7. Exports comprehensive notes showing entire diagnostic path

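
Branching into a second tree and then returning, as in steps 2 and 4 above, is one of the trickier parts of this scenario. One possible approach, shown here purely as a sketch, is to keep a stack of paused positions. The `Position` and `Navigator` names and the node ids are hypothetical, and the real design may differ.

```python
from dataclasses import dataclass, field


@dataclass
class Position:
    """Where the engineer currently is: a tree name and a node id within it."""
    tree: str
    node: str


@dataclass
class Navigator:
    """Keeps a stack of paused positions so a detour can return to its origin."""
    current: Position
    paused: list[Position] = field(default_factory=list)
    # Full diagnostic path across trees, kept for the final export.
    path: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        self._log("start")

    def _log(self, event: str) -> None:
        self.path.append(f"{event}: {self.current.tree}/{self.current.node}")

    def move(self, node: str) -> None:
        """Advance within the current tree."""
        self.current = Position(self.current.tree, node)
        self._log("step")

    def branch_to(self, tree: str, node: str) -> None:
        """Pause the current tree and jump into another one."""
        self.paused.append(self.current)
        self.current = Position(tree, node)
        self._log("branch")

    def return_to_parent(self) -> None:
        """Resume the most recently paused tree at the node where it was left."""
        if not self.paused:
            raise RuntimeError("no paused tree to return to")
        self.current = self.paused.pop()
        self._log("return")


nav = Navigator(Position("VDA Not Registering", "check-network"))
nav.branch_to("Network Connectivity", "check-firewall")
nav.move("firewall-rule-found")
nav.return_to_parent()
print("\n".join(nav.path))
```

A stack also handles nested detours (a branch inside a branch) without extra bookkeeping, and the `path` list preserves the order of events for the exported notes.
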
### Scenario 3: Junior Engineer Learning
New engineer Sarah gets an escalated Active Directory issue
1. Selects "AD Replication Failure" tree (created by Michael)
2. Tree guides her step-by-step with commands to run
3. At each step, links to Microsoft Learn docs explain concepts
4. She adds detailed notes about what she found
5. Reaches point requiring senior help, shares session link with Michael
6. Michael reviews her work, sees exactly what she tried
7. Guides her through final steps over Slack
8. Sarah learns the process, documents it properly

### Scenario 4: On-Site Technician
Michael needs hands at a remote site
1. Creates troubleshooting plan in app
2. Clicks "Send to Engineer", generates simplified checklist
3. Sends link to on-site tech via text
4. Tech follows steps, checks boxes, adds photos of error messages
5. Reports back results in real-time
6. Michael adjusts plan remotely if needed
7. Issue resolved with minimal back-and-forth

---

## Why This Could Be Special

### For Individual Engineers
- Save 30+ minutes per complex ticket
- Never lose track of troubleshooting progress
- Professional documentation every time
- Learn from experienced engineers' approaches
- Build personal knowledge base over time

### For MSP Teams
- Standardize troubleshooting procedures
- Onboard junior engineers faster
- Capture institutional knowledge before engineers leave
- Improve ticket documentation quality
- Identify training gaps and common issues
- Track team performance and efficiency

### For the Market
- 30,000+ MSPs in North America alone
- Adjacent markets: Internal IT, DevOps, Technical Support
- Current solutions are either too generic (flowchart tools) or too rigid (static runbooks)
- **Unique Value:** Purpose-built for technical troubleshooting with automation integration

### Potential Business Model
- **Free Tier:** Personal use, limited trees
- **Pro Tier:** $15-25/user/month - Team features, unlimited trees, analytics
- **Enterprise:** Custom pricing - API, SSO, white-labeling
- **Marketplace:** Revenue share on community trees
- **Professional Services:** Custom tree development, training, consulting

---

## Inspiration & Similar Tools

### What Exists Today
- **ServiceNow Knowledge Base** - Good for static docs, not interactive troubleshooting
- **IT Glue** - Documentation repository, not a troubleshooting guide
- **Confluence Decision Trees** - Generic flowcharts, not execution-focused
- **Custom Runbooks** - Static, not adaptive, no automation

### What We're Building
Imagine if ServiceNow Knowledge, flowchart tools, and PowerShell automation had a baby specifically designed for MSP troubleshooting. That's this.

---

## FAQ

**Q: Why not just use a wiki or documentation system?**
A: Wikis are great for reference, but they don't guide you through troubleshooting in real-time or automatically generate ticket notes from your actions.

**Q: Won't creating trees take more time than just doing the work?**
A: Initially, yes. But after 2-3 uses of a tree, you've saved more time than you spent creating it. Plus, the tree captures knowledge that helps the entire team.

**Q: What if the tree doesn't cover my specific issue?**
A: You can add custom branches on-the-fly during troubleshooting. These custom paths can then be incorporated into the tree for next time.

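
As a rough illustration of the custom-branch answer above, the sketch below attaches a new step to an existing node and flags it so a tree author can later decide whether to promote it. The `Node` fields, the `custom` flag, and both helper functions are assumptions for the example, not a committed design.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    id: str
    prompt: str
    options: dict[str, str] = field(default_factory=dict)  # answer -> next node id
    custom: bool = False  # True for steps added during a session


def add_custom_branch(nodes: dict[str, Node], at: str, answer: str,
                      new_node: Node) -> None:
    """Attach a new answer at an existing node, leading to a new custom step."""
    if new_node.id in nodes:
        raise ValueError(f"node id {new_node.id!r} already exists")
    if answer in nodes[at].options:
        raise ValueError(f"{answer!r} is already an option at {at!r}")
    new_node.custom = True
    nodes[new_node.id] = new_node
    nodes[at].options[answer] = new_node.id


def custom_nodes(nodes: dict[str, Node]) -> list[str]:
    """List custom step ids so a tree author can review them for promotion."""
    return [node.id for node in nodes.values() if node.custom]


# Hypothetical session copy of a tree.
nodes = {
    "mapped": Node("mapped", "Is the drive mapped?", {"yes": "perms", "no": "map"}),
    "perms": Node("perms", "Resolution: review permissions."),
    "map": Node("map", "Resolution: remap the drive."),
}

add_custom_branch(
    nodes, at="mapped", answer="mapped but offline",
    new_node=Node("offline-files", "Resolution: reset the Offline Files cache."),
)
print(custom_nodes(nodes))
```

Working on a per-session copy of the tree, as assumed here, would keep ad hoc steps from changing the shared tree until someone reviews them.
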
**Q: How is this different from a flowchart tool?**
A: Flowcharts are static diagrams. This is an active troubleshooting companion that captures your work and generates documentation.

**Q: Can I use this offline?**
A: Yes (Phase 3). Trees are cached locally, you can work offline, and changes sync when you're back online.

**Q: Will this replace my ticketing system?**
A: No, it complements it. You still create tickets in your PSA, but this generates the detailed notes you paste into tickets.

**Q: Can I automate steps?**
A: Yes (Phase 4). Integrate PowerShell scripts and other automation that can be triggered directly from decision nodes.

---

## Contributing

This is currently a private project in planning phase. Once we move to active development, we'll create a `CONTRIBUTING.md` with:
- Code of conduct
- Development workflow
- Coding standards
- Testing requirements
- PR process

---

## Contact & Feedback

**Primary User:** Michael Chihlas
**Project Lead:** [To be determined]
**Communication:** [To be determined]

For questions, suggestions, or to get involved, contact Michael.

---

## License

[To be determined]

Options being considered:
- Open source (MIT/Apache 2.0) - maximize adoption
- Source-available with commercial license - protect business interests
- Proprietary - if building as commercial product

---

## Acknowledgments

- Research on context switching and burnout that inspired this project
- MSP community for sharing their pain points and workflows
- All the engineers who've struggled with documentation and wished for a better way

---

## Roadmap at a Glance

```
└─ [📋 Planning] ← WE ARE HERE
    ├─ 📝 Document requirements
    ├─ ✅ Make key decisions
    └─ 🏗️ Set up initial architecture

└─ [🚀 Week 1-3: MVP]
    ├─ Basic tree navigation
    ├─ Export functionality
    └─ 5 starter trees

└─ [👥 Week 4-6: Team Ready]
    ├─ Team management
    ├─ Tree editor
    └─ Mobile responsive

└─ [💼 Week 7-12: Professional]
    ├─ Attachments
    ├─ Offline mode
    └─ Analytics

└─ [🔌 Month 4-6: Platform]
    ├─ API & integrations
    ├─ Automation
    └─ Enterprise features

└─ [🚀 Beyond: Growth]
    ├─ Marketplace
    ├─ AI features
    └─ Mobile apps
```

---

**Last Updated:** 2026-01-22
**Project Status:** Planning Phase
**Next Milestone:** Answer key questions, document first 3 troubleshooting scenarios