> This feature is only available on v1.4.0-prerelease1 and above.

## Overview

Code Mode is a transformative approach to using MCP that solves a critical problem at scale.

**The Problem:** When you connect 8-10 MCP servers (150+ tools), every single request includes all tool definitions in the context. The LLM spends most of its budget reading tool catalogs instead of doing actual work.

**The Solution:** Instead of exposing 150 tools directly, Code Mode exposes just four generic tools. The LLM uses those tools to write Python code (Starlark) that orchestrates everything else in a sandbox.
## The Impact

Compare a workflow across 5 MCP servers with ~100 tools:

Classic MCP Flow:
- 6 LLM turns
- 100 tools in context every turn (600 tool-definition tokens)
- All intermediate results flow through the model

Code Mode Flow:
- 3-4 LLM turns
- Only 4 tools + definitions on-demand
- Intermediate results processed in sandbox
- `listToolFiles`: Discover available MCP servers
- `readToolFile`: Load Python stub signatures on-demand
- `getToolDocs`: Get detailed documentation for a specific tool
- `executeToolCode`: Execute Python code with full tool bindings
## When to Use Code Mode

Enable Code Mode if you have:
- ✅ 3+ MCP servers connected
- ✅ Complex multi-step workflows
- ✅ Concerns about token costs or latency
- ✅ Tools that need to interact with each other

Stick with classic MCP if you have:
- Only 1-2 small MCP servers
- Simple, direct tool calls
- Very latency-sensitive use cases (though Code Mode is usually faster)
## How Code Mode Works

### The Four Tools

Instead of seeing 150+ tool definitions, the model sees four generic tools.

### The Execution Flow

The model first calls `listToolFiles` to discover connected servers, loads only the stub signatures it needs with `readToolFile`, then writes orchestration code and runs it with `executeToolCode`.

Key insight: All the complex orchestration happens inside the sandbox. The LLM only receives the final, compact result, not every intermediate step.

### Why This Matters at Scale

Classic MCP with 5 servers (100 tools): every request carries all 100 tool definitions, and every intermediate result flows back through the model.

Code Mode with the same 5 servers: every request carries only the four meta-tools, and intermediate results are processed in the sandbox.
## Enabling Code Mode

Code Mode must be enabled per MCP client, via the Web UI, the API, or config.json. Once enabled, that client's tools are accessed through the four meta-tools rather than exposed directly.

Best practice: Enable Code Mode for 3+ servers or any "heavy" server (web search, documents, databases).

### Enable Code Mode for a Client

1. Navigate to MCP Gateway in the sidebar
2. Click on a client row to open the configuration sheet
3. In the Basic Information section, toggle Code Mode Client to enabled
4. Click Save Changes

After enabling:
- The client's tools are no longer in the default tool list
- They become accessible through `listToolFiles()` and `readToolFile()`
- The AI can write code using `executeToolCode()` to call them
### Go SDK Setup
## The Four Code Mode Tools

When Code Mode clients are connected, Bifrost automatically adds four meta-tools to every request.

### 1. listToolFiles

Lists all available virtual `.pyi` stub files for connected Code Mode servers.

Example output (Server-level binding):
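The real listing is generated from your connected clients; as a hypothetical illustration (the exact output format may differ), a deployment with two Code Mode servers might return:

```
servers/youtube.pyi
servers/filesystem.pyi
```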
### 2. readToolFile

Reads a virtual `.pyi` file to get compact Python function signatures for tools.

Parameters:
- `fileName` (required): Path like `servers/youtube.pyi` or `servers/youtube/search.pyi`
- `startLine` (optional): 1-based starting line for partial reads
- `endLine` (optional): 1-based ending line for partial reads
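As a hedged sketch of what "compact Python function signatures" could look like, a server-level stub for a hypothetical `youtube` server might contain entries along these lines (real stubs are generated by Bifrost from the server's tool schemas, so names and types will differ):

```python
# Hypothetical contents of servers/youtube.pyi (illustrative only)
def search(query: str, max_results: int = 10) -> dict: ...
def list_channels(channel_id: str) -> dict: ...
```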
### 3. getToolDocs

Gets detailed documentation for a specific tool when the compact signature from `readToolFile` is not sufficient.

Parameters:
- `server` (required): The server name (e.g., `"youtube"`)
- `tool` (required): The tool name (e.g., `"search"`)
### 4. executeToolCode

Executes Python code in a sandboxed Starlark interpreter with access to all Code Mode server tools.

Parameters:
- `code` (required): Python code to execute

Execution rules:
- Python code runs in a Starlark interpreter (a Python subset)
- All Code Mode servers are exposed as global objects (e.g., `youtube`, `filesystem`)
- Tool calls are synchronous; no async/await needed
- Use `print()` for logging (output is captured in logs)
- Assign to the `result` variable to return a value
- Tool execution timeout applies (default 30s)

Syntax reminders:
- Use keyword arguments: `server.tool(param="value")`, NOT `server.tool({"param": "value"})`
- Access dict values with brackets: `result["key"]`, NOT `result.key`
- List comprehensions work: `[x for x in items if x["active"]]`
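To make these rules concrete, here is a hedged sketch of code an LLM might submit to `executeToolCode`. The `youtube` server and its `search` tool are hypothetical; in the real sandbox, server objects are injected as globals, so the `SimpleNamespace` stub below exists only to make the sketch runnable on its own.

```python
from types import SimpleNamespace

# Stand-in for the sandbox-injected `youtube` global (hypothetical tool and fields).
def _search(query, max_results=10):
    return {"items": [{"title": "Intro to MCP", "views": 1200, "active": True},
                      {"title": "Old demo", "views": 90, "active": False}]}

youtube = SimpleNamespace(search=_search)

# --- What the model would pass as the `code` parameter ---
# Keyword arguments, not a dict argument:
response = youtube.search(query="mcp tutorials", max_results=10)

# Access dict values with brackets, not attributes:
items = response["items"]

# List comprehensions work:
active = [v["title"] for v in items if v["active"]]

print("found", len(active), "active videos")  # captured in execution logs

# Assign to `result` to return a value to the model:
result = {"count": len(active), "titles": active}
```

Note that the filtering and counting never pass through the model; only the final `result` dict does.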
## Binding Levels

Code Mode supports two binding levels that control how tools are organized in the virtual file system.

### Server-Level Binding (Default)

All tools from a server are grouped into a single `.pyi` file.

Best for:
- Servers with few tools
- When you want to see all tools at once
- A simpler discovery workflow

### Tool-Level Binding

Each tool gets its own `.pyi` file.

Best for:
- Servers with many tools
- When tools have large/complex schemas
- More focused documentation per tool
## Configuring Binding Level

Binding level is a global setting that controls how Code Mode's virtual file system is organized, and it can be changed via the Web UI, config.json, or the Go SDK. It affects how the AI discovers and loads tool definitions.

The current binding level can be viewed in the MCP configuration overview.

- Server-level (default): one `.pyi` file per MCP server
  - Use when: 5-20 tools per server and you want simple discovery
  - Example: `servers/youtube.pyi` contains all YouTube tools
- Tool-level: one `.pyi` file per individual tool
  - Use when: 30+ tools per server and you want minimal context bloat
  - Example: `servers/youtube/search.pyi`, `servers/youtube/list_channels.pyi`

Both levels expose the same four meta-tools (`listToolFiles`, `readToolFile`, `getToolDocs`, `executeToolCode`). The choice is purely about context efficiency per read operation.

## Auto-Execution with Code Mode
Code Mode tools can be auto-executed in Agent Mode, but with additional validation:
- The `listToolFiles` and `readToolFile` tools are always auto-executable (they're read-only)
- The `executeToolCode` tool is auto-executable only if all tool calls within the code are allowed
### How Validation Works

When `executeToolCode` is called in Agent Mode:

1. Bifrost parses the Python code
2. Extracts all `serverName.toolName()` calls
3. Checks each call against `tools_to_auto_execute` for that server
4. If ALL calls are allowed → auto-execute
5. If ANY call is not allowed → return to user for approval
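The decision rule can be sketched as follows. This is not Bifrost's actual implementation (which parses the code rather than pattern-matching it), and the shape of the allowlist is an assumption for illustration:

```python
import re

def extract_tool_calls(code: str) -> set:
    """Collect (server, tool) pairs from `server.tool(...)` call sites.

    A regex stand-in for real code parsing; illustrative only.
    """
    return set(re.findall(r"\b([A-Za-z_]\w*)\.([A-Za-z_]\w*)\s*\(", code))

def can_auto_execute(code: str, tools_to_auto_execute: dict) -> bool:
    """Auto-execute only if every extracted call is in the allowlist.

    `tools_to_auto_execute` maps server name -> set of allowed tool names
    (a hypothetical shape for this sketch).
    """
    return all(tool in tools_to_auto_execute.get(server, set())
               for server, tool in extract_tool_calls(code))
```

For example, code that calls both `youtube.search` and `filesystem.write_file` is auto-executed only if both pairs are allowed; a single disallowed call sends the whole block back to the user for approval.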
## Code Execution Environment

### Available APIs

| Available | Not Available |
|---|---|
| Python-like syntax | `import` statements |
| Synchronous tool calls | Classes (use dicts) |
| `print()` for logging | File I/O |
| Dict/List operations | Network access |
| List comprehensions | `random`, `time` modules |
### Runtime Environment Details

Engine: Starlark interpreter (Python subset)

Tool Exposure: Tools from Code Mode clients are exposed as global objects.

Execution safeguards:
- Code is validated for syntax errors
- Tool calls are extracted and validated
- Code executes in an isolated Starlark context
- The `result` variable is automatically serialized to JSON
- Default timeout: 30 seconds per tool execution
- Memory isolation: each execution gets its own context
- No access to the host file system or network
- Logs are captured from `print()` calls
## Error Handling

Bifrost provides detailed error messages with hints.

### Timeouts

- Default: 30 seconds per tool call
- Configure via `tool_execution_timeout` in `tool_manager_config`
- Long-running operations are interrupted with a timeout error
## Real-World Impact Comparison

### Scenario: E-commerce Assistant with Multiple Services

Setup:
- 10 MCP servers (product catalog, inventory, payments, shipping, chat, analytics, docs, images, calendar, notifications)
- An average of 15 tools per server = 150 total tools
- Complex multi-step task: "Find matching products, check inventory, compare prices, get shipping estimate, create quote"
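Under Code Mode, most of that task can collapse into a single `executeToolCode` call along these lines. The server names (`catalog`, `inventory`, `shipping`), their tools, and the data shapes are all hypothetical; the `SimpleNamespace` stubs stand in for the sandbox's injected globals so the sketch runs on its own.

```python
from types import SimpleNamespace

# Hypothetical server bindings (injected as globals in the real sandbox).
catalog = SimpleNamespace(search=lambda query: [{"sku": "A1", "price": 19.0},
                                                {"sku": "B2", "price": 24.0}])
inventory = SimpleNamespace(check=lambda sku: {"in_stock": sku == "A1"})
shipping = SimpleNamespace(estimate=lambda sku: {"cost": 4.5})

# --- Single executeToolCode body replacing several LLM turns ---
products = catalog.search(query="red sneakers")
available = [p for p in products if inventory.check(sku=p["sku"])["in_stock"]]

# Starlark has no lambda, so pick the cheapest item with an explicit loop.
cheapest = available[0]
for p in available:
    if p["price"] < cheapest["price"]:
        cheapest = p

# Only this compact quote returns to the model.
result = {"sku": cheapest["sku"],
          "total": cheapest["price"] + shipping.estimate(sku=cheapest["sku"])["cost"]}
```

The search results, stock checks, and price comparison all stay in the sandbox, which is where the turn-count and token savings in the tables below come from.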
### Classic MCP Results
| Metric | Value |
|---|---|
| LLM Turns | 8-10 |
| Tokens in Tool Defs | ~2,400 per turn |
| Avg Request Tokens | 4,000-5,000 |
| Avg Total Cost | $3.20-4.00 |
| Latency | 18-25 seconds |
### Code Mode Results
| Metric | Value |
|---|---|
| LLM Turns | 3-4 |
| Tokens in Tool Defs | ~100-300 per turn |
| Avg Request Tokens | 1,500-2,000 |
| Avg Total Cost | $1.20-1.80 |
| Latency | 8-12 seconds |
## Next Steps

- Agent Mode: Combine Code Mode with auto-execution
- MCP Gateway URL: Expose your tools to external clients