Pydantic AI is a Python agent framework that brings FastAPI-like ergonomics to GenAI development. Because Pydantic AI uses the standard provider SDKs under the hood, you can point those SDKs at Bifrost and layer enterprise features such as governance, semantic caching, MCP tools, and observability on top of your existing agent setup.
Endpoint: `/pydanticai`
Provider Compatibility: This integration works only with AI providers that both Pydantic AI and Bifrost support. Currently supported: OpenAI, Anthropic, and Google Gemini.
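Before wiring up an agent, you can sanity-check that Bifrost is reachable on the integration endpoint. A minimal sketch, assuming Bifrost runs locally on port 8080 and exposes the OpenAI-compatible `chat/completions` route under `/pydanticai/v1` (as the examples below imply):

```python
import httpx

# Smoke test (assumption): POST a one-shot chat completion straight at Bifrost,
# using the OpenAI-compatible request shape the examples below rely on.
response = httpx.post(
    "http://localhost:8080/pydanticai/v1/chat/completions",
    json={
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "ping"}],
    },
    headers={"Authorization": "Bearer dummy-key"},  # keys are managed by Bifrost
    timeout=30,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```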
Setup
```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider

# Configure the provider to route through Bifrost
provider = OpenAIProvider(
    base_url="http://localhost:8080/pydanticai/v1",  # Point to Bifrost
    api_key="dummy-key",  # Keys are managed by Bifrost; alternatively, pass a virtual key
)
model = OpenAIChatModel("gpt-4o-mini", provider=provider)

# Create an agent with the Bifrost-routed model
agent = Agent(model, instructions="Be concise and helpful.")

result = agent.run_sync("Hello! How are you?")
print(result.output)
```
Provider/Model Usage Examples
Your existing Pydantic AI provider switching works unchanged through Bifrost:
```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.models.google import GoogleModel
from pydantic_ai.providers.openai import OpenAIProvider
from pydantic_ai.providers.anthropic import AnthropicProvider
from pydantic_ai.providers.google import GoogleProvider

base_url = "http://localhost:8080/pydanticai"

# OpenAI models via Pydantic AI
openai_provider = OpenAIProvider(base_url=f"{base_url}/v1")
openai_model = OpenAIChatModel("gpt-4o-mini", provider=openai_provider)
openai_agent = Agent(openai_model)

# Anthropic models via Pydantic AI
# Note: the Anthropic SDK appends /v1 internally, so we don't append it here
anthropic_provider = AnthropicProvider(base_url=base_url)
anthropic_model = AnthropicModel("claude-3-haiku-20240307", provider=anthropic_provider)
anthropic_agent = Agent(anthropic_model)

# Google Gemini models via Pydantic AI
google_provider = GoogleProvider(base_url=base_url, api_key="dummy-key")
google_model = GoogleModel("gemini-2.0-flash", provider=google_provider)
google_agent = Agent(google_model)

# All work the same way
openai_result = openai_agent.run_sync("Hello GPT!")
anthropic_result = anthropic_agent.run_sync("Hello Claude!")
gemini_result = google_agent.run_sync("Hello Gemini!")

print(openai_result.output)
print(anthropic_result.output)
print(gemini_result.output)
```
Tool Calling
Pydantic AI’s powerful tool system works seamlessly through Bifrost:
```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider

# Configure Bifrost
provider = OpenAIProvider(base_url="http://localhost:8080/pydanticai/v1")
model = OpenAIChatModel("gpt-4o-mini", provider=provider)

# Define tools as plain functions
def get_weather(location: str) -> str:
    """Get the current weather for a location."""
    return f"The weather in {location} is 72°F and sunny."

def calculate(expression: str) -> str:
    """Perform a mathematical calculation."""
    result = eval(expression)  # Demo only: use a safe expression evaluator in production
    return f"The result is {result}"

# Create an agent with tools
agent = Agent(
    model,
    tools=[get_weather, calculate],
    instructions="You can check weather and do calculations.",
)

result = agent.run_sync("What's the weather in Boston?")
print(result.output)
```
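If you want to confirm the tool round-trip actually went through Bifrost, the run result exposes the full message history, including tool calls and tool returns. A minimal sketch using `result.all_messages()` (the same accessor used in the multi-turn example below):

```python
# Inspect the run's message history to see the tool call and its return value.
# Each message carries a list of parts (user prompt, tool call, tool return, text).
for message in result.all_messages():
    for part in message.parts:
        print(type(part).__name__, getattr(part, "tool_name", ""), sep="\t")
```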
Use RunContext to pass dependencies to your tools:
```python
from dataclasses import dataclass

from pydantic_ai import Agent, RunContext, Tool
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider

@dataclass
class UserContext:
    user_id: int
    user_name: str

# Configure Bifrost
provider = OpenAIProvider(base_url="http://localhost:8080/pydanticai/v1")
model = OpenAIChatModel("gpt-4o-mini", provider=provider)

def get_user_info(ctx: RunContext[UserContext]) -> str:
    """Get information about the current user."""
    return f"User: {ctx.deps.user_name} (ID: {ctx.deps.user_id})"

agent = Agent(
    model,
    deps_type=UserContext,
    tools=[Tool(get_user_info, takes_ctx=True)],
    instructions="You can look up user information.",
)

# Pass dependencies at runtime
deps = UserContext(user_id=123, user_name="Alice")
result = agent.run_sync("What is my user information?", deps=deps)
print(result.output)
```
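Pydantic AI also lets you register tools with decorators instead of the `tools=[...]` list; `@agent.tool` passes the `RunContext` in for you. A short sketch adding a second, hypothetical tool to the agent above:

```python
# Equivalent registration via the decorator API: @agent.tool receives the
# RunContext as its first argument automatically.
@agent.tool
def get_user_email(ctx: RunContext[UserContext]) -> str:
    """Hypothetical tool for illustration: look up the current user's email."""
    return f"{ctx.deps.user_name.lower()}@example.com"
```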
Structured Output
Define response types using Pydantic models:
```python
from pydantic import BaseModel, Field
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider

# Define the structured output type
class CityInfo(BaseModel):
    city: str = Field(description="Name of the city")
    country: str = Field(description="Country where the city is located")
    population: int = Field(description="Approximate population")

# Configure Bifrost
provider = OpenAIProvider(base_url="http://localhost:8080/pydanticai/v1")
model = OpenAIChatModel("gpt-4o-mini", provider=provider)

# Agent with typed output
agent = Agent(
    model,
    output_type=CityInfo,
    instructions="Extract city information from user queries.",
)

result = agent.run_sync("Tell me about Tokyo, Japan")

# result.output is typed as CityInfo
print(f"City: {result.output.city}")
print(f"Country: {result.output.country}")
print(f"Population: {result.output.population}")
```
Streaming Responses
Stream responses in real-time for better UX:
```python
import asyncio

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider

# Configure Bifrost
provider = OpenAIProvider(base_url="http://localhost:8080/pydanticai/v1")
model = OpenAIChatModel("gpt-4o-mini", provider=provider)

agent = Agent(model, instructions="Tell engaging stories.")

async def stream_story():
    async with agent.run_stream("Tell me a short story about a robot.") as response:
        # delta=True yields only the new text on each iteration;
        # the default (delta=False) yields the accumulated text so far.
        async for chunk in response.stream_text(delta=True):
            print(chunk, end="", flush=True)
    print()  # Newline at end

asyncio.run(stream_story())
```
Custom Headers
Add Bifrost-specific headers, such as a virtual key, for governance and tracking:
```python
from httpx import AsyncClient
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider

# Create an HTTP client with custom headers
http_client = AsyncClient(
    headers={
        "x-bf-vk": "your-virtual-key",  # Virtual key for governance
    }
)

# Configure the provider with the custom client
provider = OpenAIProvider(
    base_url="http://localhost:8080/pydanticai/v1",
    http_client=http_client,
)
model = OpenAIChatModel("gpt-4o-mini", provider=provider)
agent = Agent(model)

result = agent.run_sync("Hello!")
print(result.output)
```
Using Direct Keys
Pass API keys directly to bypass Bifrost’s key management. This requires the Allow Direct API keys option to be enabled in the Bifrost configuration.
Learn more: See Quickstart Configuration for enabling direct API key usage.
```python
from httpx import AsyncClient
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.providers.openai import OpenAIProvider
from pydantic_ai.providers.anthropic import AnthropicProvider

base_url = "http://localhost:8080/pydanticai"

# Using an OpenAI key directly
openai_client = AsyncClient(
    headers={"Authorization": "Bearer sk-your-openai-key"}
)
openai_provider = OpenAIProvider(
    base_url=f"{base_url}/v1",
    http_client=openai_client,
)
openai_model = OpenAIChatModel("gpt-4o-mini", provider=openai_provider)
openai_agent = Agent(openai_model)

# Using an Anthropic key directly
# Note: the Anthropic SDK appends /v1 internally, so we don't append it here
anthropic_client = AsyncClient(
    headers={"x-api-key": "sk-ant-your-anthropic-key"}
)
anthropic_provider = AnthropicProvider(
    base_url=base_url,
    http_client=anthropic_client,
)
anthropic_model = AnthropicModel("claude-3-haiku-20240307", provider=anthropic_provider)
anthropic_agent = Agent(anthropic_model)

# Both work through Bifrost with your own keys
openai_result = openai_agent.run_sync("Hello GPT!")
anthropic_result = anthropic_agent.run_sync("Hello Claude!")
```
Multi-turn Conversations
Maintain conversation history across multiple turns:
```python
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider

# Configure Bifrost
provider = OpenAIProvider(base_url="http://localhost:8080/pydanticai/v1")
model = OpenAIChatModel("gpt-4o-mini", provider=provider)

agent = Agent(model, instructions="Remember context from previous messages.")

# First turn
result1 = agent.run_sync("My name is Alice and I live in Paris.")

# Second turn - pass the message history to maintain context
result2 = agent.run_sync(
    "What is my name and where do I live?",
    message_history=result1.all_messages(),
)

print(result2.output)  # Should mention Alice and Paris
```
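The same pattern scales to an open-ended chat loop: carry the accumulated history forward on every turn. A minimal sketch built only on `run_sync` and `all_messages()` as used above:

```python
# Simple REPL-style chat that threads the history through every turn.
history = None  # run_sync accepts message_history=None for the first turn
while True:
    user_input = input("you> ")
    if user_input in {"quit", "exit"}:
        break
    result = agent.run_sync(user_input, message_history=history)
    history = result.all_messages()  # full history, including this turn
    print("agent>", result.output)
```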
Supported Features
The Pydantic AI integration supports all features available in both the Pydantic AI SDK and Bifrost core functionality:
| Feature | Supported |
|---|---|
| Chat Completions | ✅ |
| Tool/Function Calling | ✅ |
| Structured Output | ✅ |
| Streaming | ✅ |
| Multi-turn Conversations | ✅ |
| Dependency Injection | ✅ |
| OpenAI Models | ✅ |
| Anthropic Models | ✅ |
| Google Gemini Models | ✅ |
| Embeddings | ✅ |
| Speech/TTS | ✅ |
| Transcription | ✅ |
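Pydantic AI itself focuses on agents, so for the endpoint-level features in the table that its agent API doesn't surface (embeddings, speech, transcription), you can hit the same Bifrost route with the plain OpenAI SDK. A minimal embeddings sketch, assuming `/pydanticai/v1` accepts standard OpenAI embeddings requests:

```python
from openai import OpenAI

# Point the stock OpenAI client at the same Bifrost route used above.
client = OpenAI(
    base_url="http://localhost:8080/pydanticai/v1",
    api_key="dummy-key",  # keys are managed by Bifrost
)

# Assumption: Bifrost forwards this as a standard OpenAI embeddings call.
embedding = client.embeddings.create(
    model="text-embedding-3-small",
    input="Bifrost routes this embeddings request.",
)
print(len(embedding.data[0].embedding))
```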
Your existing Pydantic AI agents work seamlessly with Bifrost’s enterprise features. 😄
Next Steps