
Overview

Guardrails in Bifrost provide enterprise-grade content safety, security validation, and policy enforcement for LLM requests and responses. The system validates inputs and outputs in real time against your configured policies, protecting deployments against harmful content, prompt injection, PII leakage, and policy violations.

Key Features

  • Multi-Provider Support: AWS Bedrock, Azure Content Safety, and Patronus AI integration
  • Dual-Stage Validation: Guard both inputs (prompts) and outputs (responses)
  • Real-Time Processing: Synchronous and asynchronous validation modes
  • Custom Policies: Define organization-specific guardrail rules
  • Automatic Remediation: Block, redact, or modify content based on policy
  • Comprehensive Logging: Detailed audit trails for compliance

Supported Guardrail Providers

Bifrost integrates with leading guardrail providers to offer comprehensive protection:

AWS Bedrock Guardrails

Amazon Bedrock Guardrails provides enterprise-grade content filtering and safety features with deep AWS integration.
Capabilities:
  • Content Filters: Hate speech, insults, sexual content, violence, misconduct
  • Denied Topics: Block specific topics or categories
  • Word Filters: Custom profanity and sensitive word blocking
  • PII Protection: Detect and redact 50+ PII entity types
  • Contextual Grounding: Verify responses against source documents
  • Prompt Attack Detection: Identify injection and jailbreak attempts
Supported PII Types:
  • Personal identifiers (SSN, passport, driver’s license)
  • Financial information (credit cards, bank accounts)
  • Contact information (email, phone, address)
  • Medical information (health records, insurance)
  • Device identifiers (IP addresses, MAC addresses)

Azure Content Safety

Azure AI Content Safety provides multi-modal content moderation powered by Microsoft's advanced AI models.
Capabilities:
  • Severity-Based Filtering: 4-level severity classification (Safe, Low, Medium, High)
  • Multi-Category Detection: Hate, sexual, violence, self-harm content
  • Prompt Shield: Advanced jailbreak and injection detection
  • Groundedness Detection: Verify factual accuracy against sources
  • Protected Material: Detect copyrighted content
  • Custom Categories: Define organization-specific content policies
Detection Categories:
  • Hate and fairness
  • Sexual content
  • Violence
  • Self-harm
  • Profanity
  • Jailbreak attempts

Patronus AI

Patronus AI specializes in LLM security and safety with advanced evaluation capabilities.
Capabilities:
  • Hallucination Detection: Identify factually incorrect responses
  • PII Detection: Comprehensive personal data identification
  • Toxicity Screening: Multi-language toxic content detection
  • Prompt Injection Defense: Advanced attack pattern recognition
  • Custom Evaluators: Build organization-specific safety checks
  • Real-Time Monitoring: Continuous safety validation
Advanced Features:
  • Context-aware evaluation
  • Multi-turn conversation analysis
  • Custom policy templates
  • Integration with existing safety workflows

Configuration

AWS Bedrock Guardrails Setup

You can configure each guardrail through the Web UI, the API, or config.json; the steps below use the Web UI.
  1. Navigate to Guardrails
    • Open Bifrost UI at http://localhost:8080
    • Go to Enterprise → Guardrails
    • Click Add Guardrail Provider
  2. Configure AWS Bedrock
Required Fields:
  • Provider Name: Descriptive name for this guardrail
  • Provider Type: Select “AWS Bedrock”
  • AWS Region: Your Bedrock region (e.g., us-east-1)
  • Guardrail ID: Your Bedrock guardrail identifier
  • Guardrail Version: Version number or DRAFT
AWS Credentials:
  • Access Key ID: AWS IAM access key
  • Secret Access Key: AWS IAM secret key
  • Session Token: (Optional) For temporary credentials
Validation Settings:
  • Input Validation: Enable for prompt validation
  • Output Validation: Enable for response validation
  • Action on Violation: Block, Log, or Redact
  • Timeout: Max validation time (default: 5s)
  3. Test Configuration
    • Click Test Guardrail
    • Send sample prompt to verify detection
    • Review detection results
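
The config.json route is listed above but no example is shown in this excerpt. The sketch below assembles the Required Fields into a hypothetical entry; the key names (`guardrails`, `type`, `credentials`, and so on) are assumptions for illustration, not the verified Bifrost schema, so check the Bifrost configuration reference before using them:

```json
{
  "guardrails": [
    {
      "name": "bedrock-prod-guardrail",
      "type": "aws_bedrock",
      "region": "us-east-1",
      "guardrail_id": "your-guardrail-id",
      "guardrail_version": "DRAFT",
      "credentials": {
        "access_key_id": "${AWS_ACCESS_KEY_ID}",
        "secret_access_key": "${AWS_SECRET_ACCESS_KEY}"
      },
      "validation": {
        "input": true,
        "output": true,
        "action_on_violation": "block",
        "timeout": "5s"
      }
    }
  ]
}
```

The `name` value matches the `x-bf-guardrail-id` header used in the request examples later in this page.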

Azure Content Safety Setup

You can configure each guardrail through the Web UI, the API, or config.json; the steps below use the Web UI.
  1. Navigate to Guardrails
    • Go to Enterprise → Guardrails
    • Click Add Guardrail Provider
  2. Configure Azure Content Safety
Required Fields:
  • Provider Name: Descriptive name
  • Provider Type: Select “Azure Content Safety”
  • Endpoint: Your Azure Content Safety endpoint
  • API Key: Azure subscription key
Content Filters:
  • Hate: Enable with severity threshold
  • Sexual: Enable with severity threshold
  • Violence: Enable with severity threshold
  • Self-Harm: Enable with severity threshold
Advanced Features:
  • Prompt Shield: Enable jailbreak detection
  • Groundedness Detection: Enable for factual verification
  • Custom Categories: Define organization policies
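
For the config.json route, a hypothetical Azure entry might look like the following. Key names and the severity-threshold encoding are assumptions, not the verified Bifrost schema; the four categories and the Prompt Shield and groundedness toggles come from the fields listed above:

```json
{
  "guardrails": [
    {
      "name": "azure-content-safety-001",
      "type": "azure_content_safety",
      "endpoint": "https://your-resource.cognitiveservices.azure.com",
      "api_key": "${AZURE_CONTENT_SAFETY_KEY}",
      "filters": {
        "hate": "medium",
        "sexual": "medium",
        "violence": "medium",
        "self_harm": "low"
      },
      "prompt_shield": true,
      "groundedness_detection": false
    }
  ]
}
```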

Patronus AI Setup

You can configure each guardrail through the Web UI, the API, or config.json; the steps below use the Web UI.
  1. Navigate to Guardrails
    • Go to Enterprise → Guardrails
    • Click Add Guardrail Provider
  2. Configure Patronus AI
Required Fields:
  • Provider Name: Descriptive name
  • Provider Type: Select “Patronus AI”
  • API Key: Your Patronus API key
  • Environment: Production or Development
Evaluators:
  • Hallucination Detection: Enable factual accuracy checks
  • PII Detection: Enable personal data identification
  • Toxicity: Enable harmful content detection
  • Prompt Injection: Enable attack detection
Custom Policies:
  • Upload organization-specific evaluators
  • Define custom safety criteria
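
For the config.json route, a hypothetical Patronus entry is sketched below. As with the other providers, the key names are assumptions for illustration only; the evaluator list mirrors the Evaluators section above:

```json
{
  "guardrails": [
    {
      "name": "patronus-ai-001",
      "type": "patronus",
      "api_key": "${PATRONUS_API_KEY}",
      "environment": "production",
      "evaluators": {
        "hallucination_detection": true,
        "pii_detection": true,
        "toxicity": true,
        "prompt_injection": true
      }
    }
  ]
}
```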

Using Guardrails in Requests

Attaching Guardrails to API Calls

Once configured, attach guardrails to your LLM requests using custom headers.
Single Guardrail:
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-bf-guardrail-id: bedrock-prod-guardrail" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "Help me with this task"
      }
    ]
  }'
Multiple Guardrails (Sequential):
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-bf-guardrail-ids: bedrock-prod-guardrail,azure-content-safety-001" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "Help me with this task"
      }
    ]
  }'
Guardrail Configuration in Request:
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "Help me with this task"
      }
    ],
    "bifrost_config": {
      "guardrails": {
        "input": ["bedrock-prod-guardrail"],
        "output": ["patronus-ai-001"],
        "async": false
      }
    }
  }'

Guardrail Response Handling

Successful Validation (200):
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1699564800,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "I'd be happy to help you with your task..."
      },
      "finish_reason": "stop"
    }
  ],
  "extra_fields": {
    "guardrails": {
      "input_validation": {
        "guardrail_id": "bedrock-prod-guardrail",
        "status": "passed",
        "violations": [],
        "processing_time_ms": 245
      },
      "output_validation": {
        "guardrail_id": "patronus-ai-001",
        "status": "passed",
        "violations": [],
        "processing_time_ms": 312
      }
    }
  }
}
Validation Failure - Blocked (446):
{
  "error": {
    "message": "Request blocked by guardrails",
    "type": "guardrail_violation",
    "code": 446,
    "details": {
      "guardrail_id": "bedrock-prod-guardrail",
      "validation_stage": "input",
      "violations": [
        {
          "type": "PII",
          "category": "SSN",
          "severity": "HIGH",
          "action": "block",
          "text_excerpt": "My SSN is ***-**-****"
        },
        {
          "type": "prompt_injection",
          "severity": "CRITICAL",
          "action": "block",
          "confidence": 0.95
        }
      ],
      "processing_time_ms": 198
    }
  }
}
Validation Warning - Logged (246):
{
  "id": "chatcmpl-def456",
  "object": "chat.completion",
  "created": 1699564800,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Response with redacted content..."
      },
      "finish_reason": "stop"
    }
  ],
  "bifrost_metadata": {
    "guardrails": {
      "output_validation": {
        "guardrail_id": "azure-content-safety-001",
        "status": "warning",
        "violations": [
          {
            "type": "profanity",
            "severity": "LOW",
            "action": "redact",
            "modifications": 2
          }
        ],
        "processing_time_ms": 187
      }
    }
  }
}
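
Because Bifrost signals guardrail outcomes through non-standard HTTP status codes (246 for warnings that were logged or redacted, 446 for blocked requests, per the examples above), clients should branch on the status code rather than treat every non-200 response as a generic failure. A minimal shell sketch; the dispatch policy here is illustrative, not prescribed by Bifrost:

```shell
#!/bin/sh
# Map a Bifrost HTTP status code to a client-side action.
# 200: normal completion
# 246: completed, but guardrails logged violations or redacted content
# 446: request blocked by guardrails
handle_guardrail_status() {
  case "$1" in
    200) echo "ok" ;;
    246) echo "warn" ;;     # inspect violations in the response metadata
    446) echo "blocked" ;;  # surface the guardrail error to the caller
    *)   echo "error" ;;
  esac
}

# Typical use with curl: capture the status code alongside the body.
# status=$(curl -s -o response.json -w '%{http_code}' \
#   -X POST http://localhost:8080/v1/chat/completions \
#   -H "Content-Type: application/json" \
#   -H "x-bf-guardrail-id: bedrock-prod-guardrail" \
#   -d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "hi"}]}')
# handle_guardrail_status "$status"
```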