feat: add Microsoft Learn MCP integration + refine assistant system prompt

- Integrate Microsoft Learn MCP server via Anthropic's MCP connector
  for real-time documentation lookups (docs search, fetch, code samples)
- Refine system prompt: clear persona, structured answer guidelines,
  when to use RAG flows vs Microsoft Learn, guardrails against fabrication
- Add ENABLE_MCP_MICROSOFT_LEARN config toggle (default: True)
- Fix bugs from prior edit: wrong MCP URL, broken indentation, undefined
  usage/token variables, disabled MCP params not passed as NOT_GIVEN
- Log MCP tool usage and cache performance

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Michael Chihlas
2026-03-05 19:13:34 -05:00
parent 2007dcb990
commit e4c5948fbd
2 changed files with 89 additions and 16 deletions


@@ -84,6 +84,9 @@ class Settings(BaseSettings):
     AI_MODEL_GEMINI: str = "gemini-2.5-flash"
     AI_MODEL_ANTHROPIC: str = "claude-haiku-4-5-20251001"
+    # MCP (Model Context Protocol) integrations
+    ENABLE_MCP_MICROSOFT_LEARN: bool = True
     # Embedding / RAG
     VOYAGE_API_KEY: Optional[str] = None
     EMBEDDING_MODEL: str = "voyage-3.5"
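
A minimal sketch of how a boolean toggle such as ENABLE_MCP_MICROSOFT_LEARN could gate the optional request parameters. The helper name `build_mcp_kwargs` is hypothetical, and returning an empty dict when disabled is an assumption made for illustration: it is an alternative to passing a sentinel default, not the approach this commit takes. The server URL and names are copied from the commit's service change.

```python
from typing import Any


def build_mcp_kwargs(enabled: bool) -> dict[str, Any]:
    """Return extra request keyword arguments, or none when MCP is disabled.

    Hypothetical helper: omitting the keys entirely is one alternative to
    passing a sentinel value for each disabled parameter.
    """
    if not enabled:
        return {}
    server_name = "microsoft-learn"
    return {
        "mcp_servers": [
            {
                "type": "url",
                "url": "https://learn.microsoft.com/api/mcp",
                "name": server_name,
            }
        ],
        # The toolset entry must refer to the server by the same name.
        "tools": [
            {"type": "mcp_toolset", "mcp_server_name": server_name}
        ],
    }


print(sorted(build_mcp_kwargs(True)))
print(build_mcp_kwargs(False))
```

A caller could then spread the result into the request with `**build_mcp_kwargs(flag)`, so a disabled toggle adds nothing to the call.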


@@ -7,6 +7,9 @@ Uses Anthropic prompt caching to reduce cost on multi-turn conversations:
 - The static system prompt is cached (ephemeral, 5-min TTL)
 - The conversation history prefix is cached via a breakpoint on the
   last existing message before the new user input
+Optionally connects to Microsoft Learn via Anthropic's MCP connector
+for real-time documentation lookups (controlled by ENABLE_MCP_MICROSOFT_LEARN).
 """
 import logging
 from typing import Any
@@ -21,20 +24,49 @@ from app.services.rag_service import search as rag_search, build_rag_context, ex
 logger = logging.getLogger(__name__)
-ASSISTANT_SYSTEM_PROMPT = """You are a Senior Systems and Network Engineer with 15+ years of experience working in Managed Service Provider (MSP) environments. You specialize in:
-- Windows Server, Active Directory, Group Policy, and Hybrid Identity (Entra ID)
-- Networking (TCP/IP, DNS, DHCP, VPN, firewall troubleshooting, Cisco/Fortinet)
-- Virtualization (VMware, Hyper-V) and cloud platforms (Azure, AWS, M365)
-- Endpoint management, RMM tools, and PSA platforms (ConnectWise, Datto, Kaseya)
-- PowerShell scripting and automation
-When answering:
-- Be direct and actionable — MSP engineers need fast, practical answers
-- Include specific commands, paths, and config values when relevant
-- Mention potential risks or gotchas before suggesting changes
-- If a relevant troubleshooting flow exists in the team's library, reference it
-- Keep responses concise but thorough — prefer bullet points and code blocks
-- Format code with proper markdown code blocks
+ASSISTANT_SYSTEM_PROMPT = """\
+You are ResolutionFlow Assistant — an expert IT systems engineer embedded in a \
+troubleshooting platform built for Managed Service Provider (MSP) teams.
+## Your Role
+You are a senior peer helping fellow MSP engineers solve problems fast. You have \
+deep expertise across the MSP technology stack:
+- Windows Server, Active Directory, Group Policy, Hybrid Identity (Entra ID / Azure AD)
+- Networking: TCP/IP, DNS, DHCP, VPN, firewalls (Cisco, Fortinet, Meraki, SonicWall)
+- Virtualization: VMware vSphere, Hyper-V, Proxmox
+- Cloud platforms: Microsoft 365, Azure, AWS
+- Endpoint management, RMM tools, and PSA platforms (ConnectWise, Datto, Kaseya, NinjaRMM)
+- PowerShell scripting and automation
+- Security: MFA, Conditional Access, EDR, backup/DR
+## How to Answer
+- **Be direct and actionable.** Engineers are mid-ticket — give them the answer, \
+not a lecture. Lead with the fix, then explain why.
+- **Include specifics.** Exact commands, registry paths, config values, port numbers. \
+Vague advice wastes time.
+- **Warn before you wreck.** If a step could cause downtime, data loss, or a lockout, \
+say so upfront — before the command.
+- **Use structured formatting.** Bullet points for steps, code blocks for commands, \
+bold for key terms. Engineers scan, they don't read essays.
+- **Say when you're unsure.** If you don't know the exact answer, say so. Suggest \
+where to verify (vendor docs, a specific KB article) rather than guessing.
+## Using the Team's Flow Library
+Your team has built troubleshooting flows in ResolutionFlow. When relevant flows \
+appear in the context below, reference them by name so the engineer can launch them \
+directly. Prefer the team's proven flows over ad-hoc instructions when they exist.
+## Using Microsoft Learn Documentation
+You have access to Microsoft's official documentation via Microsoft Learn. Use it when:
+- The question involves exact cmdlet syntax, API parameters, or configuration steps
+- You need to verify current Microsoft/Azure behavior or requirements
+- No team flow covers the topic and vendor-specific detail would help
+Do NOT use Microsoft Learn for every question — only when official docs add real value.
+## Boundaries
+- Stay focused on IT infrastructure, systems administration, and MSP operations.
+- If a question is clearly outside your domain, say so briefly and redirect.
+- Never fabricate error codes, KB article numbers, or CLI flags. If unsure, say so.
 """
@@ -81,7 +113,8 @@ async def _call_anthropic_cached(
     """Call Anthropic with prompt caching on system prompt and history.
     Uses structured system blocks so the static base prompt is cached
-    independently from the per-query RAG context.
+    independently from the per-query RAG context. Optionally connects
+    to Microsoft Learn via MCP for real-time documentation lookups.
     """
     import anthropic
@@ -126,18 +159,55 @@ async def _call_anthropic_cached(
     # Add the new user message (uncached — it's new each turn)
     messages.append({"role": "user", "content": new_message})
-    response = await client.messages.create(
+    # MCP server config (optional — controlled by settings)
+    mcp_servers = anthropic.NOT_GIVEN
+    tools = anthropic.NOT_GIVEN
+    if settings.ENABLE_MCP_MICROSOFT_LEARN:
+        mcp_servers = [
+            {
+                "type": "url",
+                "url": "https://learn.microsoft.com/api/mcp",
+                "name": "microsoft-learn",
+            }
+        ]
+        tools = [
+            {
+                "type": "mcp_toolset",
+                "mcp_server_name": "microsoft-learn",
+            }
+        ]
+    response = await client.beta.messages.create(
         model=settings.AI_MODEL_ANTHROPIC,
         max_tokens=max_tokens,
         system=system_blocks,
         messages=messages,
+        mcp_servers=mcp_servers,
+        tools=tools,
+        betas=["mcp-client-2025-11-20"],
     )
-    text = response.content[0].text
+    # Extract text from response — MCP responses can have multiple block
+    # types (text, mcp_tool_use, mcp_tool_result). We join all text blocks.
+    text_parts = []
+    mcp_tools_used = []
+    for block in response.content:
+        if hasattr(block, "text"):
+            text_parts.append(block.text)
+        if getattr(block, "type", None) == "mcp_tool_use":
+            mcp_tools_used.append(getattr(block, "name", "unknown"))
+    text = "\n".join(text_parts) if text_parts else ""
     usage = response.usage
     input_tokens = usage.input_tokens
     output_tokens = usage.output_tokens
+    # Log MCP tool usage
+    if mcp_tools_used:
+        logger.info("MCP tools used: %s", ", ".join(mcp_tools_used))
     # Log cache performance
     cache_read = getattr(usage, "cache_read_input_tokens", 0) or 0
     cache_creation = getattr(usage, "cache_creation_input_tokens", 0) or 0
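
The block-handling loop in this hunk can be exercised on its own, without calling the API. The following is a rough sketch under stated assumptions: `TextBlock`, `ToolUseBlock`, `split_content`, and the tool name `docs_search` are hypothetical stand-ins invented for illustration, not SDK types or real tool names. Only the loop body mirrors the diff.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class TextBlock:
    """Stand-in for a text content block (hypothetical, not an SDK type)."""
    text: str
    type: str = "text"


@dataclass
class ToolUseBlock:
    """Stand-in for an MCP tool-use block; it has no .text attribute."""
    name: str
    type: str = "mcp_tool_use"


def split_content(blocks: list[Any]) -> tuple[str, list[str]]:
    """Join every text block and collect the names of MCP tools that ran."""
    text_parts: list[str] = []
    tools_used: list[str] = []
    for block in blocks:
        # Blocks without a .text attribute (tool calls, tool results) are skipped.
        if hasattr(block, "text"):
            text_parts.append(block.text)
        if getattr(block, "type", None) == "mcp_tool_use":
            tools_used.append(getattr(block, "name", "unknown"))
    return "\n".join(text_parts), tools_used


# "docs_search" is a made-up tool name used only for this example.
blocks = [
    TextBlock("Checking the docs."),
    ToolUseBlock("docs_search"),
    TextBlock("Use Get-ADUser."),
]
text, tools = split_content(blocks)
print(text)
print(tools)
```

Reading only the first block, as the removed line did, would presumably fail or return partial text whenever a tool call comes first, which is why the loop joins every block that carries text.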