feat: add Microsoft Learn MCP integration + refine assistant system prompt

- Integrate Microsoft Learn MCP server via Anthropic's MCP connector for real-time documentation lookups (docs search, fetch, code samples)
- Refine system prompt: clear persona, structured answer guidelines, when to use RAG flows vs Microsoft Learn, guardrails against fabrication
- Add ENABLE_MCP_MICROSOFT_LEARN config toggle (default: True)
- Fix bugs from prior edit: wrong MCP URL, broken indentation, undefined usage/token variables, NOT_GIVEN for disabled MCP params
- Log MCP tool usage and cache performance

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 19:13:34 -05:00
parent 2007dcb990
commit e4c5948fbd
2 changed files with 89 additions and 16 deletions
--- a/backend/app/core/config.py
+++ b/backend/app/core/config.py
@@ -84,6 +84,9 @@ class Settings(BaseSettings):
    AI_MODEL_GEMINI: str = "gemini-2.5-flash"
    AI_MODEL_ANTHROPIC: str = "claude-haiku-4-5-20251001"
+    # MCP (Model Context Protocol) integrations
+    ENABLE_MCP_MICROSOFT_LEARN: bool = True
    # Embedding / RAG
    VOYAGE_API_KEY: Optional[str] = None
    EMBEDDING_MODEL: str = "voyage-3.5"
--- a/backend/app/services/assistant_chat_service.py
+++ b/backend/app/services/assistant_chat_service.py
@@ -7,6 +7,9 @@ Uses Anthropic prompt caching to reduce cost on multi-turn conversations:
 - The static system prompt is cached (ephemeral, 5-min TTL)
 - The conversation history prefix is cached via a breakpoint on the
  last existing message before the new user input
+Optionally connects to Microsoft Learn via Anthropic's MCP connector
+for real-time documentation lookups (controlled by ENABLE_MCP_MICROSOFT_LEARN).
 """
 import logging
 from typing import Any
@@ -21,20 +24,49 @@ from app.services.rag_service import search as rag_search, build_rag_context, ex
 logger = logging.getLogger(__name__)
-ASSISTANT_SYSTEM_PROMPT = """You are a Senior Systems and Network Engineer with 15+ years of experience working in Managed Service Provider (MSP) environments. You specialize in:
-- Windows Server, Active Directory, Group Policy, and Hybrid Identity (Entra ID)
-- Networking (TCP/IP, DNS, DHCP, VPN, firewall troubleshooting, Cisco/Fortinet)
-- Virtualization (VMware, Hyper-V) and cloud platforms (Azure, AWS, M365)
-- Endpoint management, RMM tools, and PSA platforms (ConnectWise, Datto, Kaseya)
-- PowerShell scripting and automation
-When answering:
-- Be direct and actionable — MSP engineers need fast, practical answers
-- Include specific commands, paths, and config values when relevant
-- Mention potential risks or gotchas before suggesting changes
-- If a relevant troubleshooting flow exists in the team's library, reference it
-- Keep responses concise but thorough — prefer bullet points and code blocks
-- Format code with proper markdown code blocks
+ASSISTANT_SYSTEM_PROMPT = """\
+You are ResolutionFlow Assistant — an expert IT systems engineer embedded in a \
+troubleshooting platform built for Managed Service Provider (MSP) teams.
+## Your Role
+You are a senior peer helping fellow MSP engineers solve problems fast. You have \
+deep expertise across the MSP technology stack:
+- Windows Server, Active Directory, Group Policy, Hybrid Identity (Entra ID / Azure AD)
+- Networking: TCP/IP, DNS, DHCP, VPN, firewalls (Cisco, Fortinet, Meraki, SonicWall)
+- Virtualization: VMware vSphere, Hyper-V, Proxmox
+- Cloud platforms: Microsoft 365, Azure, AWS
+- Endpoint management, RMM tools, and PSA platforms (ConnectWise, Datto, Kaseya, NinjaRMM)
+- PowerShell scripting and automation
+- Security: MFA, Conditional Access, EDR, backup/DR
+## How to Answer
+- **Be direct and actionable.** Engineers are mid-ticket — give them the answer, \
+not a lecture. Lead with the fix, then explain why.
+- **Include specifics.** Exact commands, registry paths, config values, port numbers. \
+Vague advice wastes time.
+- **Warn before you wreck.** If a step could cause downtime, data loss, or a lockout, \
+say so upfront — before the command.
+- **Use structured formatting.** Bullet points for steps, code blocks for commands, \
+bold for key terms. Engineers scan, they don't read essays.
+- **Say when you're unsure.** If you don't know the exact answer, say so. Suggest \
+where to verify (vendor docs, a specific KB article) rather than guessing.
+## Using the Team's Flow Library
+Your team has built troubleshooting flows in ResolutionFlow. When relevant flows \
+appear in the context below, reference them by name so the engineer can launch them \
+directly. Prefer the team's proven flows over ad-hoc instructions when they exist.
+## Using Microsoft Learn Documentation
+You have access to Microsoft's official documentation via Microsoft Learn. Use it when:
+- The question involves exact cmdlet syntax, API parameters, or configuration steps
+- You need to verify current Microsoft/Azure behavior or requirements
+- No team flow covers the topic and vendor-specific detail would help
+Do NOT use Microsoft Learn for every question — only when official docs add real value.
+## Boundaries
+- Stay focused on IT infrastructure, systems administration, and MSP operations.
+- If a question is clearly outside your domain, say so briefly and redirect.
+- Never fabricate error codes, KB article numbers, or CLI flags. If unsure, say so.
 """
@@ -81,7 +113,8 @@ async def _call_anthropic_cached(
    """Call Anthropic with prompt caching on system prompt and history.
    Uses structured system blocks so the static base prompt is cached
-    independently from the per-query RAG context.
+    independently from the per-query RAG context. Optionally connects
+    to Microsoft Learn via MCP for real-time documentation lookups.
    """
    import anthropic
@@ -126,18 +159,55 @@ async def _call_anthropic_cached(
    # Add the new user message (uncached — it's new each turn)
    messages.append({"role": "user", "content": new_message})
-    response = await client.messages.create(
+    # MCP server config (optional — controlled by settings)
+    mcp_servers = anthropic.NOT_GIVEN
+    tools = anthropic.NOT_GIVEN
+    if settings.ENABLE_MCP_MICROSOFT_LEARN:
+        mcp_servers = [
+            {
+                "type": "url",
+                "url": "https://learn.microsoft.com/api/mcp",
+                "name": "microsoft-learn",
+            }
+        ]
+        tools = [
+            {
+                "type": "mcp_toolset",
+                "mcp_server_name": "microsoft-learn",
+            }
+        ]
+    response = await client.beta.messages.create(
        model=settings.AI_MODEL_ANTHROPIC,
        max_tokens=max_tokens,
        system=system_blocks,
        messages=messages,
+        mcp_servers=mcp_servers,
+        tools=tools,
+        betas=["mcp-client-2025-11-20"],
    )
-    text = response.content[0].text
+    # Extract text from response — MCP responses can have multiple block
+    # types (text, mcp_tool_use, mcp_tool_result). We join all text blocks.
+    text_parts = []
+    mcp_tools_used = []
+    for block in response.content:
+        if hasattr(block, "text"):
+            text_parts.append(block.text)
+        if getattr(block, "type", None) == "mcp_tool_use":
+            mcp_tools_used.append(getattr(block, "name", "unknown"))
+    text = "\n".join(text_parts) if text_parts else ""
    usage = response.usage
    input_tokens = usage.input_tokens
    output_tokens = usage.output_tokens
+    # Log MCP tool usage
+    if mcp_tools_used:
+        logger.info("MCP tools used: %s", ", ".join(mcp_tools_used))
    # Log cache performance
    cache_read = getattr(usage, "cache_read_input_tokens", 0) or 0
    cache_creation = getattr(usage, "cache_creation_input_tokens", 0) or 0
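
Review note: the block-extraction loop added in the last hunk can be exercised without calling the API. The following is a minimal sketch, not code from this commit. It assumes response content blocks expose `type`, `text`, and `name` attributes the way the diff reads them; `split_response_blocks`, the `SimpleNamespace` stand-ins, and the tool name in the sample data are hypothetical and used only for illustration.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def split_response_blocks(blocks: Iterable[Any]) -> tuple[str, list[str]]:
    """Return (joined text, MCP tool names) from a list of content blocks.

    Hypothetical helper mirroring the loop in the diff: any block with a
    `text` attribute contributes text, and blocks whose type is
    "mcp_tool_use" contribute their tool name.
    """
    text_parts: list[str] = []
    mcp_tools_used: list[str] = []
    for block in blocks:
        if hasattr(block, "text"):
            text_parts.append(block.text)
        if getattr(block, "type", None) == "mcp_tool_use":
            mcp_tools_used.append(getattr(block, "name", "unknown"))
    # Joining an empty list already yields "", so no extra guard is needed.
    return "\n".join(text_parts), mcp_tools_used


# Stand-in blocks (assumed shapes, not real SDK objects).
sample_blocks = [
    SimpleNamespace(type="text", text="Checking the docs."),
    SimpleNamespace(type="mcp_tool_use", name="microsoft_docs_search"),
    SimpleNamespace(type="mcp_tool_result", content=[]),
    SimpleNamespace(type="text", text="Use Get-ADUser -Identity <name>."),
]

text, tools = split_response_blocks(sample_blocks)
print(text)
print(tools)
```

Pulling the loop into a small pure function like this would make it straightforward to unit-test the multi-block case that the old `response.content[0].text` access could not handle.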