MCP Integration Deep Dive¶
This page documents how adapters handle MCP servers internally — the connection lifecycle, tool schema transformation, agent execution loops, trace capture, and message conversion. For MCP configuration and usage, see the MCP overview and MCP-enabled verification.
MCP Execution Flow¶
When mcp_urls_dict is set on a ModelConfig, the verification pipeline switches from simple LLM invocation to agent-based execution. The full MCP execution flow follows these steps:
1. Connect Connect to each MCP server (HTTP/SSE or stdio)
│
2. Discover List available tools from each server
│
3. Filter Apply mcp_tool_filter to restrict tool set
│
4. Transform Convert MCP tools to adapter-native format
│
5. Execute Run agent loop (LLM call → tool use → tool result → repeat)
│
6. Capture Record trace (messages + tool results + usage)
│
7. Convert Transform adapter messages to unified Message format
│
8. Return AgentResult with final_response, raw_trace, trace_messages
Each adapter implements this flow differently based on its backend infrastructure.
Core MCP Utilities¶
The karenina.utils.mcp module provides adapter-agnostic utilities for MCP server interaction:
Session Management (utils/mcp/client.py)¶
| Function | Purpose |
|---|---|
connect_mcp_session(exit_stack, config) |
Connect to a single MCP server via HTTP/SSE. Registers cleanup with AsyncExitStack. |
connect_all_mcp_servers(exit_stack, mcp_servers) |
Connect to all configured servers in parallel. Returns dict[str, ClientSession]. |
get_all_mcp_tools(sessions) |
Call list_tools() on each session. Returns list[tuple[server_name, session, mcp_tool]]. |
These utilities use the core mcp Python package (mcp.client.streamable_http.streamablehttp_client) for HTTP transport. The Claude Tool adapter reuses these directly; other adapters have their own connection mechanisms.
Tool Description Management (utils/mcp/tools.py)¶
| Function | Purpose |
|---|---|
afetch_tool_descriptions(mcp_urls_dict, tool_filter) |
Async: fetch tool names and descriptions from MCP servers. 30-second timeout per server. |
fetch_tool_descriptions(mcp_urls_dict, tool_filter) |
Sync wrapper with smart event loop detection — uses shared BlockingPortal, falls back to ThreadPoolExecutor, or asyncio.run(). 45-second overall timeout. |
apply_tool_description_overrides(tools, overrides) |
Modify tool descriptions in-place. Works with both LangChain Tool and MCP Tool objects. |
Tool Schema Transformation¶
MCP servers expose tools with a standard schema. Each adapter converts these into its native tool format:
MCP Tool Structure¶
An MCP tool has three fields:
name— Unique tool identifier (e.g.,"web_search")description— Human-readable purpose textinputSchema— JSON Schema dict defining the tool's parameters
Adapter-Specific Transformations¶
| Adapter | Transformation | Details |
|---|---|---|
| LangChain | MCP tool → LangChain Tool | Via langchain-mcp-adapters MultiServerMCPClient.get_tools(). Returns LangChain Tool objects for direct use in LangGraph agents. |
| Claude Tool | MCP tool → @beta_async_tool function |
wrap_mcp_tool() creates an async function decorated with Anthropic's @beta_async_tool. The function calls session.call_tool(tool.name, kwargs) and records results in a ToolResultCollector. |
| Claude Agent SDK | MCP tool → SDK-native tools | The SDK discovers and manages tools natively via mcp_servers in ClaudeAgentOptions. Tools use mcp__{server_name}__{tool_name} naming. |
| Manual | N/A | No MCP support. Raises ValueError if mcp_urls_dict is set. |
Tool Filtering and Description Overrides¶
Tool filtering and description overrides apply at steps 3-4 of the execution flow:
- Filtering — Only tools whose names appear in
mcp_tool_filterare passed to the agent. Applied by each adapter before creating the agent. - Overrides — Tool descriptions are replaced using
apply_tool_description_overrides()before the agent sees them. Useful for improving tool selection accuracy.
Adapter-Specific MCP Strategies¶
LangChain Adapter¶
Transport: HTTP/SSE only (URL-based, via langchain-mcp-adapters)
Key classes:
LangChainAgentAdapter— The AgentPort implementationMultiServerMCPClient— From langchain-mcp-adapters, manages MCP connections
Connection flow:
_convert_mcp_servers_to_urls()convertsMCPServerConfigdicts to URL strings. RaisesAgentExecutionErrorfor stdio transport (not supported by LangChain's URL-based infrastructure).acreate_mcp_client_and_tools()creates aMultiServerMCPClientwith streamable HTTP config, fetches tools with 30-second timeout, applies filter and description overrides.- Returns
(client, tools)where tools are LangChain Tool objects.
Agent execution:
_create_agent()initializes the LLM viainit_chat_model_unified(), loads tools, builds middleware stack viabuild_agent_middleware(), and creates a LangGraph agent withInMemorySavercheckpointer.arun()converts messages to LangChain format, invokes the agent viaagent.ainvoke(), and tracks token usage viaget_usage_metadata_callback().- On
GraphRecursionError, extracts partial state from the checkpointer viaextract_partial_agent_state()— theInMemorySaverenables recovering the conversation so far even when the limit is hit.
Trace capture: Messages are collected from the agent response dict ({"messages": [...]}), converted back to unified format via LangChainMessageConverter.from_provider(), and formatted into raw_trace via harmonize_agent_response().
Cleanup: cleanup_mcp_client() attempts multiple strategies — client.close(), client.aclose(), client.__exit__() — with warning if no method found.
Unique features:
- Middleware stack: summarization, prompt caching, retry logic via
AgentMiddlewareConfig - Partial state recovery on recursion limit via
InMemorySavercheckpointer - Multi-provider support (any LangChain-compatible LLM)
Claude Agent SDK Adapter¶
Transport: HTTP/SSE and stdio (native SDK support)
Key classes:
ClaudeSDKAgentAdapter— The AgentPort implementationClaudeSDKClient— Wrapper around the Claude Agent SDK
MCP config conversion (mcp.py):
convert_mcp_config() translates karenina's simplified format to SDK-native format:
| Input Format | Detection | Output Type |
|---|---|---|
URL (http:// or https://) |
Starts with http |
{"type": "http", "url": ...} |
| Command with args (whitespace-separated) | Contains whitespace | {"command": cmd, "args": [...]} |
| Single path | Default | {"command": path} |
validate_mcp_config() checks that HTTP configs have type="http" and url, stdio configs have command. convert_and_validate_mcp_config() combines both, raising McpConfigValidationError on failure.
Note
HTTP URLs map to type="http" (streamable HTTP), not "sse". The Claude Agent SDK handles the transport protocol internally.
Agent execution:
_convert_mcp_servers()handles both karenina format and pre-converted SDK format (detects by checking for"type"or"command"keys)._build_options()createsClaudeAgentOptionswithpermission_mode="bypassPermissions", configured max_turns, and MCP servers. When MCP servers are present, it restricts available tools to only MCP tools viaallowed_toolslist, using the naming conventionmcp__{server_name}__{tool_name}.arun()extracts a prompt string (user messages joined) and system prompt, then usesClaudeSDKClientas an async context manager:
async with ClaudeSDKClient(options) as client:
await client.query(prompt_string)
async for msg in client.receive_response():
collected_messages.append(msg)
- Limit detection via
ResultMessage.subtypefield check.
Trace capture: All messages from client.receive_response() (UserMessage, AssistantMessage, ResultMessage) are collected, then converted via ClaudeSDKMessageConverter.from_provider() and formatted into raw_trace via sdk_messages_to_raw_trace().
Unique features:
- Stdio transport — The only adapter supporting stdio-based MCP servers (local commands like
npx @mcp/server-github) - Native SDK MCP management — the SDK handles connection lifecycle, tool discovery, and execution internally
- Session ID from
ResultMessage— enables tracing back to specific SDK sessions
Claude Tool Adapter¶
Transport: HTTP/SSE only (via core mcp Python package)
Key classes:
ClaudeToolAgentAdapter— The AgentPort implementationToolResultCollector— Records tool execution results for trace reconstruction- Uses Anthropic SDK's
client.beta.messages.tool_runner()for the agent loop
Connection flow:
- Creates
AsyncExitStackfor resource management. - Calls
connect_all_mcp_servers(exit_stack, mcp_servers)fromutils/mcp/client.py(or the adapter's ownmcp.pywhich reimplements the same functions). - Calls
get_all_mcp_tools(sessions)to discover tools. - Each tool is wrapped with
wrap_mcp_tool(): - Creates an async function decorated with
@beta_async_tool - The function calls
session.call_tool(mcp_tool.name, kwargs)and extracts text content from the result - Results are recorded in a
ToolResultCollectorfor trace reconstruction
Agent execution:
- Static tools (karenina
Toolobjects) are wrapped viawrap_static_tool()with the same collector pattern. apply_cache_control_to_tool()adds Anthropiccache_controlto the last tool's schema, enabling prompt caching._execute_agent_loop()converts messages to Anthropic format, creates atool_runnerviaclient.beta.messages.tool_runner(), and iterates over responses:- For each response: increment turn counter, drain
ToolResultCollectorfor recorded results, match with previoustool_useblocks, inject astool_resultmessages, convert to unified Message format, aggregate usage, check turn limit.
Trace capture: The ToolResultCollector is central to trace reconstruction:
class ToolResultCollector:
def record(tool_name, content, is_error=False) # Capture result
def drain() -> list[ToolResultRecord] # Get and clear
Each ToolResultRecord holds tool_name, content, and is_error. After each LLM response that includes tool_use blocks, the collector is drained and results are matched to the corresponding tool calls.
Unique features:
- Native prompt caching —
cache_controlwithephemeraltype added to tool schemas, enabling Anthropic's prompt caching for efficiency - Direct API — No framework overhead; uses Anthropic Python SDK directly
tool_runnerhandles the multi-turn loop automatically
Manual Adapter¶
MCP is not supported with the manual interface. Setting mcp_urls_dict on a ModelConfig with interface="manual" raises a ValueError. The manual adapter replays pre-recorded traces, so live tool access is not applicable.
Trace Capture and Message Conversion¶
Each adapter captures traces in two formats:
raw_trace(string) — Legacy format with delimiter-separated messages for backward compatibilitytrace_messages(list[Message]) — Structured format using karenina's unified Message type
Unified Message Format¶
@dataclass
class Message:
role: Role # SYSTEM, USER, ASSISTANT, TOOL
content: list[Content] # TextContent, ToolUseContent, ToolResultContent, ThinkingContent
Content block types:
| Type | Fields | Used For |
|---|---|---|
TextContent |
text |
Plain text responses |
ToolUseContent |
id, name, input |
Tool invocation requests |
ToolResultContent |
tool_use_id, content, is_error |
Tool execution results |
ThinkingContent |
thinking |
Extended thinking blocks (Anthropic) |
Adapter Message Conversion¶
Each adapter has a message converter that translates between its native format and unified Messages:
LangChain (LangChainMessageConverter):
| LangChain Type | Unified Role | Notes |
|---|---|---|
SystemMessage |
SYSTEM |
System prompt |
HumanMessage |
USER |
User input |
AIMessage |
ASSISTANT |
LLM response, with optional ToolUseContent from tool_calls |
ToolMessage |
TOOL |
Tool results via ToolResultContent |
Bidirectional: to_provider() (unified → LangChain) and from_provider() (LangChain → unified).
Claude Tool (Anthropic SDK format):
| Anthropic Format | Unified Role | Notes |
|---|---|---|
{"role": "assistant", "content": [...]} |
ASSISTANT |
Content blocks: text, tool_use, thinking |
{"role": "user", "content": [...]} |
USER |
Anthropic convention: tool results sent in user messages |
Functions: convert_to_anthropic() (unified → Anthropic dict) and convert_from_anthropic_message() (Anthropic response → unified Message).
Claude Agent SDK (SDK message types):
| SDK Type | Unified Role | Notes |
|---|---|---|
UserMessage |
USER |
Input messages |
AssistantMessage |
ASSISTANT |
With TextBlock, ToolUseBlock, ThinkingBlock content |
ResultMessage |
ASSISTANT |
Final response with subtype for limit detection |
Input conversion uses to_prompt_string() (extracts user messages as joined text) and extract_system_prompt() (extracts system messages). Output conversion handles SDK-specific block types.
Raw Trace Format¶
The raw_trace string is a legacy format that concatenates messages with delimiters. Each adapter has its own formatter:
- LangChain:
harmonize_agent_response()— formats from LangChain message list - Claude Tool:
claude_tool_messages_to_raw_trace()— formats from unified messages - Claude Agent SDK:
sdk_messages_to_raw_trace()— formats from SDK message list
The verification pipeline uses trace_messages (structured) by default. The raw_trace is available for debugging and backward compatibility.
Transport Protocol Comparison¶
| Transport | LangChain | Claude Agent SDK | Claude Tool |
|---|---|---|---|
| HTTP/SSE | Via langchain-mcp-adapters MultiServerMCPClient |
Native SDK support (type="http") |
Via core mcp package streamablehttp_client |
| Stdio | Not supported (URL-based only) | Native SDK support (command + args) |
Not supported (HTTP/SSE only) |
| Connection library | langchain-mcp-adapters | Claude Agent SDK | mcp (Python package) |
| Cleanup | cleanup_mcp_client() with fallback strategies |
SDK-managed (context manager) | AsyncExitStack with registered cleanup |
Recursion Limit Handling¶
Each adapter handles the agent exceeding its turn limit differently:
| Adapter | Mechanism | Behavior |
|---|---|---|
| LangChain | Catches GraphRecursionError |
Extracts partial state from InMemorySaver checkpointer. Returns what the agent produced so far. |
| Claude Tool | Turn counter in _execute_agent_loop() |
Compares turn count against config.max_turns. Breaks the loop and returns partial results. |
| Claude Agent SDK | ResultMessage.subtype field |
The SDK signals limit reached via the result message. Sets limit_reached=True on AgentResult. |
In all cases, the pipeline's RecursionLimitAutoFail stage (Stage 3) checks AgentResult.limit_reached and marks the verification as auto-failed if the limit was hit, while preserving the captured trace for inspection.
Key Source Files¶
| File | Purpose |
|---|---|
utils/mcp/client.py |
Adapter-agnostic MCP session management |
utils/mcp/tools.py |
Tool description fetching and override |
ports/agent.py |
AgentPort protocol, AgentConfig, AgentResult, MCPServerConfig types |
ports/messages.py |
Unified Message type and content blocks |
adapters/langchain/mcp.py |
LangChain MCP client and tool loading |
adapters/langchain/agent.py |
LangChain agent adapter with LangGraph |
adapters/claude_agent_sdk/mcp.py |
Claude SDK MCP config conversion |
adapters/claude_agent_sdk/agent.py |
Claude SDK agent adapter |
adapters/claude_tool/mcp.py |
Claude Tool MCP session management |
adapters/claude_tool/agent.py |
Claude Tool agent adapter with tool_runner |
adapters/claude_tool/tools.py |
Tool wrapping and ToolResultCollector |
adapters/factory.py |
Agent factory routing |
Related¶
- MCP overview — Configuration and when to use MCP
- MCP-enabled verification — Running MCP verification workflows
- Adapter architecture — Hexagonal architecture and port/adapter pattern
- Port types — Complete AgentPort protocol reference
- Available adapters — Per-adapter features and configuration