
Overview

Guardrails in Bifrost provide enterprise-grade content safety, security validation, and policy enforcement for LLM requests and responses. The system validates inputs and outputs in real time against your configured policies, protecting deployments against harmful content, prompt injection, PII leakage, and policy violations.

Key Features

  • Multi-Provider Support: AWS Bedrock, Azure Content Safety, and Patronus AI integration
  • Dual-Stage Validation: Guard both inputs (prompts) and outputs (responses)
  • Real-Time Processing: Synchronous and asynchronous validation modes
  • Custom Policies: Define organization-specific guardrail rules
  • Automatic Remediation: Block, redact, or modify content based on policy
  • Comprehensive Logging: Detailed audit trails for compliance

Supported Guardrail Providers

Bifrost integrates with leading guardrail providers to offer comprehensive protection:

AWS Bedrock Guardrails

Amazon Bedrock Guardrails provides enterprise-grade content filtering and safety features with deep AWS integration.
Capabilities:
  • Content Filters: Hate speech, insults, sexual content, violence, misconduct
  • Denied Topics: Block specific topics or categories
  • Word Filters: Custom profanity and sensitive word blocking
  • PII Protection: Detect and redact 50+ PII entity types
  • Contextual Grounding: Verify responses against source documents
  • Prompt Attack Detection: Identify injection and jailbreak attempts
Supported PII Types:
  • Personal identifiers (SSN, passport, driver’s license)
  • Financial information (credit cards, bank accounts)
  • Contact information (email, phone, address)
  • Medical information (health records, insurance)
  • Device identifiers (IP addresses, MAC addresses)

Azure Content Safety

Azure AI Content Safety provides multi-modal content moderation powered by Microsoft's advanced AI models.
Capabilities:
  • Severity-Based Filtering: 4-level severity classification (Safe, Low, Medium, High)
  • Multi-Category Detection: Hate, sexual, violence, self-harm content
  • Prompt Shield: Advanced jailbreak and injection detection
  • Groundedness Detection: Verify factual accuracy against sources
  • Protected Material: Detect copyrighted content
  • Custom Categories: Define organization-specific content policies
Detection Categories:
  • Hate and fairness
  • Sexual content
  • Violence
  • Self-harm
  • Profanity
  • Jailbreak attempts

Patronus AI

Patronus AI specializes in LLM security and safety with advanced evaluation capabilities.
Capabilities:
  • Hallucination Detection: Identify factually incorrect responses
  • PII Detection: Comprehensive personal data identification
  • Toxicity Screening: Multi-language toxic content detection
  • Prompt Injection Defense: Advanced attack pattern recognition
  • Custom Evaluators: Build organization-specific safety checks
  • Real-Time Monitoring: Continuous safety validation
Advanced Features:
  • Context-aware evaluation
  • Multi-turn conversation analysis
  • Custom policy templates
  • Integration with existing safety workflows

Configuration

AWS Bedrock Guardrails Setup

You can configure each guardrail through the Web UI, the API, or config.json; the steps below use the Web UI.
  1. Navigate to Guardrails
    • Open Bifrost UI at http://localhost:8080
    • Go to Enterprise → Guardrails
    • Click Add Guardrail Provider
  2. Configure AWS Bedrock
Required Fields:
  • Provider Name: Descriptive name for this guardrail
  • Provider Type: Select “AWS Bedrock”
  • AWS Region: Your Bedrock region (e.g., us-east-1)
  • Guardrail ID: Your Bedrock guardrail identifier
  • Guardrail Version: Version number or DRAFT
AWS Credentials:
  • Access Key ID: AWS IAM access key
  • Secret Access Key: AWS IAM secret key
  • Session Token: (Optional) For temporary credentials
Validation Settings:
  • Input Validation: Enable for prompt validation
  • Output Validation: Enable for response validation
  • Action on Violation: Block, Log, or Redact
  • Timeout: Max validation time (default: 5s)
  3. Test Configuration
    • Click Test Guardrail
    • Send sample prompt to verify detection
    • Review detection results
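
The config.json route is listed above but no example is shown in this excerpt. The sketch below assembles the Required Fields into a hypothetical entry; the key names (`guardrails`, `type`, `credentials`, and so on) are assumptions for illustration, not the verified Bifrost schema, so check the Bifrost configuration reference before using them:

```json
{
  "guardrails": [
    {
      "name": "bedrock-prod-guardrail",
      "type": "aws_bedrock",
      "region": "us-east-1",
      "guardrail_id": "your-guardrail-id",
      "guardrail_version": "DRAFT",
      "credentials": {
        "access_key_id": "${AWS_ACCESS_KEY_ID}",
        "secret_access_key": "${AWS_SECRET_ACCESS_KEY}"
      },
      "validation": {
        "input": true,
        "output": true,
        "action_on_violation": "block",
        "timeout": "5s"
      }
    }
  ]
}
```

The `name` value matches the `x-bf-guardrail-id` header used in the request examples later in this page.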

Azure Content Safety Setup

You can configure each guardrail through the Web UI, the API, or config.json; the steps below use the Web UI.
  1. Navigate to Guardrails
    • Go to Enterprise → Guardrails
    • Click Add Guardrail Provider
  2. Configure Azure Content Safety
Required Fields:
  • Provider Name: Descriptive name
  • Provider Type: Select “Azure Content Safety”
  • Endpoint: Your Azure Content Safety endpoint
  • API Key: Azure subscription key
Content Filters:
  • Hate: Enable with severity threshold
  • Sexual: Enable with severity threshold
  • Violence: Enable with severity threshold
  • Self-Harm: Enable with severity threshold
Advanced Features:
  • Prompt Shield: Enable jailbreak detection
  • Groundedness Detection: Enable for factual verification
  • Custom Categories: Define organization policies
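
For the config.json route, a hypothetical Azure entry might look like the following. Key names and the severity-threshold encoding are assumptions, not the verified Bifrost schema; the four categories and the Prompt Shield and groundedness toggles come from the fields listed above:

```json
{
  "guardrails": [
    {
      "name": "azure-content-safety-001",
      "type": "azure_content_safety",
      "endpoint": "https://your-resource.cognitiveservices.azure.com",
      "api_key": "${AZURE_CONTENT_SAFETY_KEY}",
      "filters": {
        "hate": "medium",
        "sexual": "medium",
        "violence": "medium",
        "self_harm": "low"
      },
      "prompt_shield": true,
      "groundedness_detection": false
    }
  ]
}
```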

Patronus AI Setup

You can configure each guardrail through the Web UI, the API, or config.json; the steps below use the Web UI.
  1. Navigate to Guardrails
    • Go to Enterprise → Guardrails
    • Click Add Guardrail Provider
  2. Configure Patronus AI
Required Fields:
  • Provider Name: Descriptive name
  • Provider Type: Select “Patronus AI”
  • API Key: Your Patronus API key
  • Environment: Production or Development
Evaluators:
  • Hallucination Detection: Enable factual accuracy checks
  • PII Detection: Enable personal data identification
  • Toxicity: Enable harmful content detection
  • Prompt Injection: Enable attack detection
Custom Policies:
  • Upload organization-specific evaluators
  • Define custom safety criteria
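
For the config.json route, a hypothetical Patronus entry is sketched below. As with the other providers, the key names are assumptions for illustration only; the evaluator list mirrors the Evaluators section above:

```json
{
  "guardrails": [
    {
      "name": "patronus-ai-001",
      "type": "patronus",
      "api_key": "${PATRONUS_API_KEY}",
      "environment": "production",
      "evaluators": {
        "hallucination_detection": true,
        "pii_detection": true,
        "toxicity": true,
        "prompt_injection": true
      }
    }
  ]
}
```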

Using Guardrails in Requests

Attaching Guardrails to API Calls

Once configured, attach guardrails to your LLM requests using custom headers.
Single Guardrail:
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-bf-guardrail-id: bedrock-prod-guardrail" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "Help me with this task"
      }
    ]
  }'
Multiple Guardrails (Sequential):
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-bf-guardrail-ids: bedrock-prod-guardrail,azure-content-safety-001" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "Help me with this task"
      }
    ]
  }'
Guardrail Configuration in Request:
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "Help me with this task"
      }
    ],
    "bifrost_config": {
      "guardrails": {
        "input": ["bedrock-prod-guardrail"],
        "output": ["patronus-ai-001"],
        "async": false
      }
    }
  }'

Guardrail Response Handling

Successful Validation (200):
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1699564800,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "I'd be happy to help you with your task..."
      },
      "finish_reason": "stop"
    }
  ],
  "extra_fields": {
    "guardrails": {
      "input_validation": {
        "guardrail_id": "bedrock-prod-guardrail",
        "status": "passed",
        "violations": [],
        "processing_time_ms": 245
      },
      "output_validation": {
        "guardrail_id": "patronus-ai-001",
        "status": "passed",
        "violations": [],
        "processing_time_ms": 312
      }
    }
  }
}
Validation Failure - Blocked (446):
{
  "error": {
    "message": "Request blocked by guardrails",
    "type": "guardrail_violation",
    "code": 446,
    "details": {
      "guardrail_id": "bedrock-prod-guardrail",
      "validation_stage": "input",
      "violations": [
        {
          "type": "PII",
          "category": "SSN",
          "severity": "HIGH",
          "action": "block",
          "text_excerpt": "My SSN is ***-**-****"
        },
        {
          "type": "prompt_injection",
          "severity": "CRITICAL",
          "action": "block",
          "confidence": 0.95
        }
      ],
      "processing_time_ms": 198
    }
  }
}
Validation Warning - Logged (246):
{
  "id": "chatcmpl-def456",
  "object": "chat.completion",
  "created": 1699564800,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Response with redacted content..."
      },
      "finish_reason": "stop"
    }
  ],
  "bifrost_metadata": {
    "guardrails": {
      "output_validation": {
        "guardrail_id": "azure-content-safety-001",
        "status": "warning",
        "violations": [
          {
            "type": "profanity",
            "severity": "LOW",
            "action": "redact",
            "modifications": 2
          }
        ],
        "processing_time_ms": 187
      }
    }
  }
}
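
Because Bifrost signals guardrail outcomes through non-standard HTTP status codes (246 for warnings that were logged or redacted, 446 for blocked requests, per the examples above), clients should branch on the status code rather than treat every non-200 response as a generic failure. A minimal shell sketch; the dispatch policy here is illustrative, not prescribed by Bifrost:

```shell
#!/bin/sh
# Map a Bifrost HTTP status code to a client-side action.
# 200: normal completion
# 246: completed, but guardrails logged violations or redacted content
# 446: request blocked by guardrails
handle_guardrail_status() {
  case "$1" in
    200) echo "ok" ;;
    246) echo "warn" ;;     # inspect violations in the response metadata
    446) echo "blocked" ;;  # surface the guardrail error to the caller
    *)   echo "error" ;;
  esac
}

# Typical use with curl: capture the status code alongside the body.
# status=$(curl -s -o response.json -w '%{http_code}' \
#   -X POST http://localhost:8080/v1/chat/completions \
#   -H "Content-Type: application/json" \
#   -H "x-bf-guardrail-id: bedrock-prod-guardrail" \
#   -d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "hi"}]}')
# handle_guardrail_status "$status"
```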