Overview
Guardrails in Bifrost provide enterprise-grade content safety, security validation, and policy enforcement for LLM requests and responses. The system validates inputs and outputs in real time against your configured policies, protecting against harmful content, prompt injection, PII leakage, and policy violations.
Key Features
| Feature | Description |
|---|---|
| Multi-Provider Support | AWS Bedrock, Azure Content Safety, and Patronus AI integration |
| Dual-Stage Validation | Guard both inputs (prompts) and outputs (responses) |
| Real-Time Processing | Synchronous and asynchronous validation modes |
| Custom Policies | Define organization-specific guardrail rules |
| Automatic Remediation | Block, redact, or modify content based on policy |
| Comprehensive Logging | Detailed audit trails for compliance |
Supported Guardrail Providers
Bifrost integrates with leading guardrail providers to offer comprehensive protection:
AWS Bedrock Guardrails
Amazon Bedrock Guardrails provides enterprise-grade content filtering and safety features with deep AWS integration. Capabilities:
- Content Filters: Hate speech, insults, sexual content, violence, misconduct
- Denied Topics: Block specific topics or categories
- Word Filters: Custom profanity and sensitive word blocking
- PII Protection: Detect and redact 50+ PII entity types
- Contextual Grounding: Verify responses against source documents
- Prompt Attack Detection: Identify injection and jailbreak attempts
Detected PII entity types include:
- Personal identifiers (SSN, passport, driver’s license)
- Financial information (credit cards, bank accounts)
- Contact information (email, phone, address)
- Medical information (health records, insurance)
- Device identifiers (IP addresses, MAC addresses)
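To illustrate the kinds of entities listed above, here is a minimal regex-based PII redaction sketch in Python. This is not Bedrock's implementation (Bedrock's detection is model-based and far more robust); the patterns and entity names are illustrative assumptions only.

```python
import re

# Illustrative patterns only; production PII detection (e.g., Bedrock's)
# uses ML models, not simple regexes.
PII_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "IP_ADDRESS": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}

def redact_pii(text: str) -> str:
    """Replace each detected entity with a placeholder like [SSN]."""
    for entity, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{entity}]", text)
    return text
```

This mirrors the "Redact" remediation action described later in this page: matched spans are replaced with entity-type placeholders rather than blocking the whole request.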
Azure Content Safety
Azure AI Content Safety provides multi-modal content moderation powered by Microsoft’s advanced AI models. Capabilities:
- Severity-Based Filtering: 4-level severity classification (Safe, Low, Medium, High)
- Multi-Category Detection: Hate, sexual, violence, self-harm content
- Prompt Shield: Advanced jailbreak and injection detection
- Groundedness Detection: Verify factual accuracy against sources
- Protected Material: Detect copyrighted content
- Custom Categories: Define organization-specific content policies
Detected categories include:
- Hate and fairness
- Sexual content
- Violence
- Self-harm
- Profanity
- Jailbreak attempts
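Azure's four severity levels lend themselves to simple threshold policies: block content at or above a per-category severity. The sketch below shows that decision logic in Python; the level names mirror the classification above, but the function and threshold values are illustrative, not Azure's API.

```python
# Severity levels ranked as classified above (Safe < Low < Medium < High).
SEVERITY_RANK = {"Safe": 0, "Low": 1, "Medium": 2, "High": 3}

# Hypothetical per-category thresholds: block at or above this severity.
THRESHOLDS = {"Hate": "Low", "Sexual": "Medium", "Violence": "Medium", "SelfHarm": "Low"}

def should_block(category: str, severity: str) -> bool:
    """Return True if the detected severity meets the category's threshold."""
    threshold = THRESHOLDS.get(category, "Low")  # default to the strictest common setting
    return SEVERITY_RANK[severity] >= SEVERITY_RANK[threshold]
```

Lower thresholds mean stricter filtering: a "Low" threshold blocks Low, Medium, and High detections, while "High" blocks only the most severe content.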
Patronus AI
Patronus AI specializes in LLM security and safety evaluation. Capabilities:
- Hallucination Detection: Identify factually incorrect responses
- PII Detection: Comprehensive personal data identification
- Toxicity Screening: Multi-language toxic content detection
- Prompt Injection Defense: Advanced attack pattern recognition
- Custom Evaluators: Build organization-specific safety checks
- Real-Time Monitoring: Continuous safety validation
Additional features include:
- Context-aware evaluation
- Multi-turn conversation analysis
- Custom policy templates
- Integration with existing safety workflows
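A custom evaluator can be thought of as a function from text to a pass/fail verdict with a reason. The sketch below is a generic Python shape for such a check; it is not Patronus's SDK, and the class and field names are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class EvaluationResult:
    passed: bool
    reason: str = ""

class BannedPhraseEvaluator:
    """Illustrative custom check: fail if the text contains a banned phrase."""

    def __init__(self, banned: list[str]):
        self.banned = [p.lower() for p in banned]

    def evaluate(self, text: str) -> EvaluationResult:
        lowered = text.lower()
        for phrase in self.banned:
            if phrase in lowered:
                return EvaluationResult(False, f"contains banned phrase: {phrase}")
        return EvaluationResult(True)
```

Real evaluators (hallucination, toxicity, injection) are model-backed, but the same interface shape applies: input text in, structured verdict out, which is what lets them slot into input and output validation stages alike.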
Configuration
AWS Bedrock Guardrails Setup
Web UI
Navigate to Guardrails
- Open Bifrost UI at http://localhost:8080
- Go to Enterprise → Guardrails
- Click Add Guardrail Provider
- Configure AWS Bedrock
- Provider Name: Descriptive name for this guardrail
- Provider Type: Select “AWS Bedrock”
- AWS Region: Your Bedrock region (e.g., us-east-1)
- Guardrail ID: Your Bedrock guardrail identifier
- Guardrail Version: Version number or DRAFT
- Access Key ID: AWS IAM access key
- Secret Access Key: AWS IAM secret key
- Session Token: (Optional) For temporary credentials
- Input Validation: Enable for prompt validation
- Output Validation: Enable for response validation
- Action on Violation: Block, Log, or Redact
- Timeout: Max validation time (default: 5s)
- Test Configuration
- Click Test Guardrail
- Send a sample prompt to verify detection
- Review detection results
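The config.json contents are not reproduced above. As a sketch of what such a provider entry might contain, mirroring the Web UI fields, the Python dict below serializes to a plausible JSON shape; the field names are assumptions, not Bifrost's documented schema.

```python
import json

# Hypothetical provider entry mirroring the Web UI fields above.
# Field names are illustrative assumptions, not Bifrost's actual schema.
bedrock_guardrail = {
    "name": "production-bedrock-guardrail",
    "type": "aws-bedrock",
    "region": "us-east-1",
    "guardrail_id": "YOUR_GUARDRAIL_ID",
    "guardrail_version": "DRAFT",       # version number or DRAFT
    "input_validation": True,            # validate prompts
    "output_validation": True,           # validate responses
    "action_on_violation": "block",      # block | log | redact
    "timeout_seconds": 5,                # default per the UI
}

print(json.dumps(bedrock_guardrail, indent=2))
```

AWS credentials (access key, secret key, optional session token) would typically be supplied via environment variables or an IAM role rather than stored in config.json.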
Azure Content Safety Setup
Web UI
Navigate to Guardrails
- Go to Enterprise → Guardrails
- Click Add Guardrail Provider
- Configure Azure Content Safety
- Provider Name: Descriptive name
- Provider Type: Select “Azure Content Safety”
- Endpoint: Your Azure Content Safety endpoint
- API Key: Azure subscription key
- Hate: Enable with severity threshold
- Sexual: Enable with severity threshold
- Violence: Enable with severity threshold
- Self-Harm: Enable with severity threshold
- Prompt Shield: Enable jailbreak detection
- Groundedness Detection: Enable for factual verification
- Custom Categories: Define organization policies
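As with Bedrock, the config.json contents for Azure are not shown above; a sketch of a comparable provider entry follows, with per-category severity thresholds as described earlier. Field names are illustrative assumptions, not Bifrost's actual schema.

```python
# Hypothetical Azure Content Safety entry mirroring the Web UI fields above.
# Field names are illustrative assumptions, not Bifrost's actual schema.
azure_guardrail = {
    "name": "azure-content-safety",
    "type": "azure-content-safety",
    "endpoint": "https://YOUR_RESOURCE.cognitiveservices.azure.com",
    "api_key": "YOUR_AZURE_KEY",
    "categories": {
        # Block at or above the given severity (Safe < Low < Medium < High).
        "hate": {"enabled": True, "threshold": "Low"},
        "sexual": {"enabled": True, "threshold": "Medium"},
        "violence": {"enabled": True, "threshold": "Medium"},
        "self_harm": {"enabled": True, "threshold": "Low"},
    },
    "prompt_shield": True,             # jailbreak/injection detection
    "groundedness_detection": True,    # factual verification against sources
}
```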
Patronus AI Setup
Web UI
Navigate to Guardrails
- Go to Enterprise → Guardrails
- Click Add Guardrail Provider
- Configure Patronus AI
- Provider Name: Descriptive name
- Provider Type: Select “Patronus AI”
- API Key: Your Patronus API key
- Environment: Production or Development
- Hallucination Detection: Enable factual accuracy checks
- PII Detection: Enable personal data identification
- Toxicity: Enable harmful content detection
- Prompt Injection: Enable attack detection
- Upload organization-specific evaluators
- Define custom safety criteria
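Completing the set, a sketch of a Patronus provider entry mirroring the Web UI fields above; as before, field names are illustrative assumptions, not Bifrost's actual schema.

```python
# Hypothetical Patronus AI entry mirroring the Web UI fields above.
# Field names are illustrative assumptions, not Bifrost's actual schema.
patronus_guardrail = {
    "name": "patronus-safety",
    "type": "patronus",
    "api_key": "YOUR_PATRONUS_KEY",
    "environment": "production",       # production or development
    "checks": {
        "hallucination_detection": True,
        "pii_detection": True,
        "toxicity": True,
        "prompt_injection": True,
    },
}
```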