Code Agents + MCP: A Step Up in Efficiency

Combining SmolAgents' code generation with MCP's standardized tools creates a powerful pattern that reduces LLM round-trips and enables complex programmatic reasoning.

Most AI agent frameworks force the LLM to output JSON tool calls, wait for results, and repeat. Code Agents take a different approach: they generate and execute Python code that can call multiple tools, perform calculations, and handle complex logic in a single pass.

When you combine SmolAgents (which writes and executes Python) with MCP (which standardizes tool interfaces), you get an agent that can programmatically manipulate data from standardized tools without constant LLM round-trips.

The Architecture

MCP Server: A standalone process exposing tools like get_exchange_rate. It handles API keys and HTTP requests to external providers.
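
In plain terms, the server's job is: own the API key, make the HTTP call, and hand back a plain result. A dependency-free sketch of that handler — the provider URL is a placeholder and the fetch function is injected, so the secret never leaves this process (wired to a fake fetcher here for illustration):

```python
import os

def make_get_exchange_rate(fetch_json, api_key):
    """Build the server-side tool handler. `fetch_json` stands in for an
    HTTP client; the API key is closed over here, never exposed to the agent."""
    def get_exchange_rate(base_currency: str, target_currency: str) -> str:
        data = fetch_json(
            f"https://rates.example/v1/latest?base={base_currency}",  # placeholder provider
            headers={"Authorization": f"Bearer {api_key}"},
        )
        return str(data["rates"][target_currency])
    return get_exchange_rate

# Fake fetcher so the sketch runs without a real provider:
fake_fetch = lambda url, headers: {"rates": {"EUR": 1.08, "GBP": 0.79}}
handler = make_get_exchange_rate(fake_fetch, api_key=os.environ.get("EXCHANGE_API_KEY", "demo"))
print(handler("USD", "EUR"))  # → 1.08
```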

Client App (FastAPI + SmolAgents):

  • Connects to the MCP Server via STDIO
  • Converts MCP tools into Python callables
  • Injects these callables into the SmolAgents sandbox
  • The agent writes Python code to fetch rates, calculate differences, or format data

The Power: A Single Code Block vs. Multiple Round-Trips

Traditional agent frameworks require multiple LLM calls for this task:

User: "Get USD to EUR and USD to GBP rates. Convert 100 USD to whichever is stronger."

Traditional Agent (3+ LLM calls):

  1. LLM outputs: {"tool": "get_exchange_rate", "args": {"base": "USD", "target": "EUR"}}
  2. System executes, returns 1.08
  3. LLM outputs: {"tool": "get_exchange_rate", "args": {"base": "USD", "target": "GBP"}}
  4. System executes, returns 0.79
  5. LLM outputs: {"tool": "calculator", "args": {"expression": "100 * 1.08"}}
  6. System returns 108.00
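
The round-trip cycle above can be sketched as a plain dispatch loop, with stub tools and the LLM calls replaced by their canned outputs (names here are illustrative, not a real framework's API):

```python
import json

# Stub tools standing in for the real MCP-backed implementations.
TOOLS = {
    "get_exchange_rate": lambda base, target: {("USD", "EUR"): 1.08,
                                               ("USD", "GBP"): 0.79}[(base, target)],
    "calculator": lambda expression: eval(expression),  # illustration only
}

def run_traditional_agent(llm_outputs):
    """Each iteration models one LLM round-trip: parse a JSON tool call,
    execute it, feed the result back, repeat."""
    results = []
    for raw in llm_outputs:  # in reality, each `raw` is a fresh LLM call
        call = json.loads(raw)
        results.append(TOOLS[call["tool"]](**call["args"]))
    return results

results = run_traditional_agent([
    '{"tool": "get_exchange_rate", "args": {"base": "USD", "target": "EUR"}}',
    '{"tool": "get_exchange_rate", "args": {"base": "USD", "target": "GBP"}}',
    '{"tool": "calculator", "args": {"expression": "100 * 1.08"}}',
])
print(results)  # three separate executions, one per LLM round-trip
```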

Code Agent (1 LLM call):

eur_rate = float(get_exchange_rate(base_currency='USD', target_currency='EUR'))
gbp_rate = float(get_exchange_rate(base_currency='USD', target_currency='GBP'))

# "Stronger" is interpreted here as the higher USD→X rate,
# i.e. whichever conversion yields more units per dollar.
if eur_rate > gbp_rate:
    result = 100 * eur_rate
    print(f"EUR is stronger. Converting: $100 USD = €{result:.2f} EUR")
else:
    result = 100 * gbp_rate
    print(f"GBP is stronger. Converting: $100 USD = £{result:.2f} GBP")

The LLM generates this entire block once. The SmolAgents sandbox executes it, making both tool calls and performing the logic without returning to the LLM.
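
The core mechanism can be imitated in a few lines: inject the tool callables into a namespace and exec the generated block once. This is a simplified stand-in for SmolAgents' sandboxed Python executor (which additionally restricts imports, limits steps, and captures output), with the rate lookup stubbed:

```python
# Stub tool standing in for the MCP-backed callable.
def get_exchange_rate(base_currency, target_currency):
    return {"EUR": "1.08", "GBP": "0.79"}[target_currency]

# The string the LLM would generate in its single call:
generated_code = """
eur_rate = float(get_exchange_rate(base_currency='USD', target_currency='EUR'))
gbp_rate = float(get_exchange_rate(base_currency='USD', target_currency='GBP'))
result = 100 * (eur_rate if eur_rate > gbp_rate else gbp_rate)
"""

namespace = {"get_exchange_rate": get_exchange_rate}
exec(generated_code, namespace)      # one pass: both tool calls plus the logic
print(round(namespace["result"], 2))  # → 108.0
```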

Implementation Snippet

SmolAgents provides ToolCollection.from_mcp() to automatically convert MCP tools into agent-compatible callables:

from smolagents import CodeAgent, LiteLLMModel, ToolCollection
from mcp import StdioServerParameters

server_params = StdioServerParameters(
    command="uv",
    args=["run", "mcp_server.py"]
)

model = LiteLLMModel(model_id="openai/gpt-4o")

with ToolCollection.from_mcp(server_params, trust_remote_code=True) as tools:
    agent = CodeAgent(tools=[*tools.tools], model=model, add_base_tools=True)
    
    result = agent.run(
        "Get USD to EUR and USD to GBP rates. Convert 100 USD to whichever is stronger."
    )
    print(result)

The ToolCollection.from_mcp() method:

  1. Spawns the MCP server process via STDIO
  2. Reads the tool schemas
  3. Converts them into SmolAgents Tool objects
  4. Injects them into the agent's execution environment
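
Step 3 is essentially schema-to-closure conversion. A stripped-down version of the idea, with a hypothetical call_tool transport function standing in for the real STDIO client:

```python
def make_callable(schema, call_tool):
    """Turn a tool schema into a plain Python function the agent can call.

    `schema` mimics an MCP tool description; `call_tool` is a hypothetical
    transport function that would send the request over STDIO and return
    the server's result.
    """
    name = schema["name"]
    def tool_fn(**kwargs):
        return call_tool(name, kwargs)
    tool_fn.__name__ = name
    tool_fn.__doc__ = schema.get("description", "")
    return tool_fn

# Fake transport for illustration: echoes what would be sent to the server.
def fake_call_tool(name, args):
    return f"{name} called with {sorted(args)}"

schema = {"name": "get_exchange_rate",
          "description": "Fetch a currency pair's current rate."}
get_exchange_rate = make_callable(schema, fake_call_tool)
print(get_exchange_rate(base_currency="USD", target_currency="EUR"))
```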

Process Isolation:

FastAPI Process (Parent)
    │
    ├─ SmolAgents CodeAgent
    │   └─ Generates Python code
    │       └─ Calls get_exchange_rate(...)
    │
    └─ STDIO ──────> MCP Server Process (Child)
                     └─ Executes actual API call
                     └─ Returns result via STDIO
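
The parent/child STDIO exchange can be demonstrated with a toy child process: one JSON request line in on stdin, one result line out on stdout. (Real MCP servers speak JSON-RPC over these same pipes; this toy omits the envelope.)

```python
import json, subprocess, sys

# Toy "server": reads one JSON line from stdin, answers on stdout.
CHILD = (
    "import sys, json\n"
    "req = json.loads(sys.stdin.readline())\n"
    "rates = {'EUR': 1.08, 'GBP': 0.79}\n"
    "print(json.dumps({'rate': rates[req['target']]}))\n"
)

proc = subprocess.run(
    [sys.executable, "-c", CHILD],
    input=json.dumps({"tool": "get_exchange_rate", "target": "EUR"}) + "\n",
    capture_output=True, text=True,
)
response = json.loads(proc.stdout)
print(response["rate"])  # computed in the child process, not the parent
```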

The agent generates Python code that calls these tools as if they were native functions, but the actual execution happens in a separate, isolated process. This security boundary ensures the agent cannot directly access API keys or make arbitrary external requests.

Why This Matters

Fewer LLM Calls: Complex multi-step operations happen in one code generation instead of multiple tool call cycles.

Real Programming: The agent can use loops, conditionals, and data structures, not just sequential tool calls.

MCP as the Secure Pipe: The MCP server handles credentials and external APIs. The agent just calls Python functions.

Cost Efficiency: One LLM invocation instead of three or more saves tokens and latency.

Conclusion

Code Agents represent a shift from "LLM as orchestrator" to "LLM as programmer." By generating executable code that calls standardized MCP tools, we get the flexibility of programming with the intelligence of language models.

The MCP protocol ensures tools remain portable and secure, while SmolAgents ensures the generated code runs safely in a sandbox.

This pattern works best for tasks requiring calculation, data transformation, or multi-step logic where traditional agents would ping-pong between the LLM and tools repeatedly.

Reference: Code Mode - Cloudflare Blog