
Capabilities
- Violation Scoring: Continuous 0-1 scale violation detection with configurable thresholds
- Custom Natural Language Rules: Define safety rules in plain English without code
- Policy Management: Use pre-built policies from GraySwan platform or create custom ones
- Indirect Prompt Injection (IPI) Detection: Identify hidden instructions in user inputs
- Mutation Detection: Detect attempts to manipulate or alter content
- Reasoning Modes: Choose from fast (“off”), balanced (“hybrid”), or thorough (“thinking”) analysis
Configuration Fields
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
api_key | string | Yes | - | GraySwan API key |
violation_threshold | number | No | 0.5 | Score threshold (0-1) for triggering intervention. Lower values are more strict. |
reasoning_mode | enum | No | ”off” | Analysis depth: off (fastest), hybrid (balanced), or thinking (most thorough) |
policy_id | string | No | - | Single custom policy ID from GraySwan platform |
policy_ids | array | No | - | Multiple policy IDs for aggregated rule evaluation |
rules | object | No | - | Custom natural language rules as key-value pairs |
Request Header Metadata
For each GraySwan monitor call, Bifrost includes sanitized incoming request headers in GraySwanmetadata.headers. This gives GraySwan request context for correlation and policy analysis, such as x-request-id, x-correlation-id, traceparent, x-tenant-id, x-org-id, content-type, and content-length.
Credential-bearing headers are excluded. Bifrost does not send authorization, proxy-authorization, x-api-key, api-key, x-goog-api-key, x-bf-vk, x-bf-api-key, x-bf-api-key-id, cookie, set-cookie, or grayswan-api-key in GraySwan metadata.
This is metadata only: these values are added to the JSON body sent to GraySwan, not forwarded as outbound HTTP headers, and they cannot override the configured GraySwan API key.
Custom Rules Example

Detection Features
- Real-time violation scoring
- Multi-rule evaluation
- IPI attack detection
- Content mutation monitoring
- Detailed violation descriptions with rule attribution

