Core Features
1. Automatic Pricing Synchronization
The Model Catalog manages pricing data through a two-phase approach:
Startup Behavior:
- With ConfigStore: Downloads a pricing sheet from Maxim’s datasheet, persists it to the config store, and then loads it into memory for fast lookups.
- Without ConfigStore: Downloads the pricing sheet directly into memory on every startup.
Ongoing Synchronization:
- When ConfigStore is available, an automatic sync occurs every 24 hours to keep pricing data current.
- All pricing data is cached in memory for O(1) lookup performance during cost calculations.
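The startup and sync behavior described above can be sketched roughly as follows. This is an illustration only, not Bifrost's implementation: `pricingCache`, `fetchFunc`, `persistFunc`, and `startSync` are hypothetical names, and the download step is injected as a function so the sketch runs without network access.

```go
package main

import (
	"sync"
	"time"
)

// fetchFunc downloads the pricing sheet; persistFunc saves it to a config store.
// Both are hypothetical stand-ins for the real download and storage layers.
type fetchFunc func() (map[string]float64, error)
type persistFunc func(map[string]float64) error

// pricingCache keeps per-model prices in a map for constant-time lookups.
type pricingCache struct {
	mu     sync.RWMutex
	prices map[string]float64
}

// load fetches the sheet, persists it when a store is configured
// (persist != nil), and then swaps the in-memory map.
func (c *pricingCache) load(fetch fetchFunc, persist persistFunc) error {
	sheet, err := fetch()
	if err != nil {
		return err
	}
	if persist != nil {
		if err := persist(sheet); err != nil {
			return err
		}
	}
	c.mu.Lock()
	c.prices = sheet
	c.mu.Unlock()
	return nil
}

// lookup returns the price for a model and whether it was found.
func (c *pricingCache) lookup(model string) (float64, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	p, ok := c.prices[model]
	return p, ok
}

// startSync reloads pricing on every tick until stop is closed.
func (c *pricingCache) startSync(interval time.Duration, fetch fetchFunc, persist persistFunc, stop <-chan struct{}) {
	go func() {
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case <-ticker.C:
				_ = c.load(fetch, persist) // a real implementation would log failures
			case <-stop:
				return
			}
		}
	}()
}
```

In this sketch a caller would invoke `load` once at startup and call `startSync` only when a config store is present, mirroring the rule that periodic sync requires ConfigStore.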
2. Multi-Modal Cost Calculation
The Model Catalog supports diverse pricing models across different AI operation types:
- Text Operations: Token-based and character-based pricing for chat completions, text completions, and embeddings.
- Audio Processing: Token-based and duration-based pricing for speech synthesis and transcription.
- Image Processing: Per-image costs with tiered pricing for high-token contexts.
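To make the pricing modes above concrete, here is a minimal sketch of token-based, duration-based, and per-image cost functions with an optional high-token tier. The `pricing` struct and its field names are hypothetical, and the choice to select the tier by total token count is an assumption, not something the catalog documents.

```go
package main

// pricing holds illustrative per-unit rates. All field names are hypothetical.
type pricing struct {
	InputPerToken      float64 // base rate per input token
	OutputPerToken     float64 // base rate per output token
	InputPerTokenHigh  float64 // rate above TierThreshold (0 = no tiered rate)
	OutputPerTokenHigh float64 // rate above TierThreshold (0 = no tiered rate)
	TierThreshold      int     // token count that triggers the high tier; 0 disables tiering
	PerSecond          float64 // duration-based rate for audio
	PerImage           float64 // flat rate per image
}

// tokenCost prices a token-based request, switching to the high-tier rates
// when the total token count exceeds the threshold (an assumed criterion).
func tokenCost(p pricing, inputTokens, outputTokens int) float64 {
	inRate, outRate := p.InputPerToken, p.OutputPerToken
	if p.TierThreshold > 0 && inputTokens+outputTokens > p.TierThreshold {
		if p.InputPerTokenHigh > 0 {
			inRate = p.InputPerTokenHigh
		}
		if p.OutputPerTokenHigh > 0 {
			outRate = p.OutputPerTokenHigh
		}
	}
	return float64(inputTokens)*inRate + float64(outputTokens)*outRate
}

// durationCost prices audio by length in seconds.
func durationCost(p pricing, seconds float64) float64 {
	return seconds * p.PerSecond
}

// imageCost prices a request by the number of images.
func imageCost(p pricing, images int) float64 {
	return float64(images) * p.PerImage
}
```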
3. Model Information Management
The Model Catalog maintains a pool of available models for each provider, populated from the pricing data. This allows for:
- Listing all available models for a given provider.
- Finding all providers that support a specific model.
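The two lookups above amount to a provider-to-models index built from the pricing entries. The sketch below shows one way such a pool could work. `pricingEntry`, `modelPool`, and the method names are hypothetical and are not the catalog's actual API.

```go
package main

import "sort"

// pricingEntry is a hypothetical row of pricing data.
type pricingEntry struct {
	Provider string
	Model    string
}

// modelPool maps a provider name to the models it offers.
type modelPool map[string][]string

// buildPool indexes pricing entries by provider, skipping duplicate rows.
func buildPool(entries []pricingEntry) modelPool {
	pool := make(modelPool)
	seen := make(map[pricingEntry]bool)
	for _, e := range entries {
		if seen[e] {
			continue
		}
		seen[e] = true
		pool[e.Provider] = append(pool[e.Provider], e.Model)
	}
	return pool
}

// modelsForProvider returns a copy of the provider's model list.
func (p modelPool) modelsForProvider(provider string) []string {
	return append([]string(nil), p[provider]...)
}

// providersForModel returns, in sorted order, every provider offering the model.
func (p modelPool) providersForModel(model string) []string {
	var providers []string
	for provider, models := range p {
		for _, m := range models {
			if m == model {
				providers = append(providers, provider)
				break
			}
		}
	}
	sort.Strings(providers)
	return providers
}
```

Sorting the provider list keeps results deterministic, since Go map iteration order is unspecified.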
4. Intelligent Cache Cost Handling
The Model Catalog integrates with semantic caching to provide accurate cost calculations:
- Cache Hits: Zero cost for direct cache hits, and embedding cost only for semantic matches.
- Cache Misses: Combined cost of the base model usage plus the embedding generation cost for cache storage.
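The cache rules above reduce to a three-way switch. The following sketch encodes them directly. The `cacheResult` type and `cacheAwareCost` function are hypothetical names used for illustration, not part of the catalog's API.

```go
package main

// cacheResult describes how a request was served relative to the semantic cache.
type cacheResult int

const (
	cacheMiss   cacheResult = iota // no cached answer; the model was called
	directHit                      // exact match served from cache
	semanticHit                    // match found via embedding similarity
)

// cacheAwareCost applies the rules described above:
// a direct hit is free, a semantic hit costs only the embedding lookup,
// and a miss costs the base model usage plus the embedding generated for storage.
func cacheAwareCost(r cacheResult, baseCost, embeddingCost float64) float64 {
	switch r {
	case directHit:
		return 0
	case semanticHit:
		return embeddingCost
	default:
		return baseCost + embeddingCost
	}
}
```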
5. Tiered Pricing Support
The system automatically applies different pricing rates for high-token contexts (e.g., above 128k tokens), reflecting real provider pricing models for various modalities.
Configuration
The ModelCatalog can be configured during initialization by passing a Config struct with the following fields:
- PricingURL: Overrides the default URL (https://getbifrost.ai/datasheet) for downloading the pricing sheet.
- PricingSyncInterval: Customizes the interval for periodic pricing data synchronization. The default is 24 hours.
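As an illustration of how these two options and their defaults might fit together, here is a minimal sketch. The field names and default values come from the description above. The pointer field types and the `resolved` helper are assumptions made for this example and may not match the real struct.

```go
package main

import "time"

// Defaults taken from the configuration description above.
const (
	defaultPricingURL   = "https://getbifrost.ai/datasheet"
	defaultSyncInterval = 24 * time.Hour
)

// Config mirrors the two documented options. Pointer fields (an assumption)
// let callers leave an option unset and fall back to the default.
type Config struct {
	PricingURL          *string
	PricingSyncInterval *time.Duration
}

// resolved is a hypothetical helper that returns the effective settings,
// substituting defaults for any option that is unset or empty.
func (c *Config) resolved() (string, time.Duration) {
	url, interval := defaultPricingURL, defaultSyncInterval
	if c == nil {
		return url, interval
	}
	if c.PricingURL != nil && *c.PricingURL != "" {
		url = *c.PricingURL
	}
	if c.PricingSyncInterval != nil && *c.PricingSyncInterval > 0 {
		interval = *c.PricingSyncInterval
	}
	return url, interval
}
```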
Architecture
ModelCatalog
The ModelCatalog is the central component that handles all model and pricing operations.
Pricing Data Structure
Each model’s pricing information includes comprehensive cost metrics, supporting various modalities and tiered pricing.
Usage in Plugins
Initialization
In Bifrost’s gateway, the ModelCatalog is initialized once at startup and shared across all plugins.
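The shared-instance pattern can be sketched as plain dependency injection: build the catalog once, then hand the same pointer to every plugin. The `catalog` and `plugin` types and the `newPlugin` constructor below are hypothetical placeholders, not Bifrost's plugin interfaces.

```go
package main

// catalog is a hypothetical stand-in for the shared ModelCatalog.
type catalog struct {
	prices map[string]float64 // per-token price by model name
}

// cost returns the price for a request, or 0 when the model is unknown.
func (c *catalog) cost(model string, tokens int) float64 {
	return c.prices[model] * float64(tokens)
}

// plugin is a hypothetical plugin that keeps a reference to the shared catalog.
type plugin struct {
	name    string
	catalog *catalog
}

// newPlugin wires an existing catalog into a plugin instead of creating a new one.
func newPlugin(name string, c *catalog) *plugin {
	return &plugin{name: name, catalog: c}
}
```

Because every plugin holds the same pointer, pricing is downloaded and synchronized once rather than once per plugin.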
Basic Cost Calculation
Calculate costs from a Bifrost response.
Advanced Cost Calculation with Usage Details
For more granular cost calculation with custom usage data.
Cache Aware Cost Calculation
For workflows that implement semantic caching, use cache-aware cost calculation.
Model Discovery
The ModelCatalog provides several methods to query for model and provider information.
Get Models for a Provider
Retrieve a list of all models supported by a specific provider.
Get Providers for a Model
Find all providers that offer a specific model.
Dynamically Add Models
You can dynamically add models to the catalog’s pool from a v1/models-compatible response structure. This is useful for providers that expose a model list endpoint.
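As a rough illustration, the sketch below parses a response shaped like the OpenAI-style model list (a top-level `data` array of objects with an `id` field) and merges the IDs into a provider's pool. The `modelsResponse` type and `addModelsFromResponse` function are hypothetical. The catalog's own method and input type may differ.

```go
package main

import "encoding/json"

// modelsResponse captures only the fields this sketch needs from a
// v1/models-style payload: {"data": [{"id": "..."}, ...]}.
type modelsResponse struct {
	Data []struct {
		ID string `json:"id"`
	} `json:"data"`
}

// addModelsFromResponse merges model IDs from the response body into the
// provider's entry in the pool, skipping empty IDs and duplicates.
func addModelsFromResponse(pool map[string][]string, provider string, body []byte) error {
	var resp modelsResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return err
	}
	seen := make(map[string]bool, len(pool[provider]))
	for _, m := range pool[provider] {
		seen[m] = true
	}
	for _, d := range resp.Data {
		if d.ID == "" || seen[d.ID] {
			continue
		}
		seen[d.ID] = true
		pool[provider] = append(pool[provider], d.ID)
	}
	return nil
}
```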
Reloading Configuration
You can reload the pricing configuration at runtime if you need to change the pricing URL or sync interval.
Error Handling and Fallbacks
The Model Catalog handles missing pricing data gracefully with intelligent fallbacks.
Cleanup and Lifecycle Management
Properly clean up resources when shutting down.
Thread Safety
All ModelCatalog operations are thread-safe, making it suitable for concurrent usage across multiple plugins and goroutines. The internal pricing data cache uses read-write mutexes for optimal performance during frequent lookups.
Best Practices
- Shared Instance: Use a single ModelCatalog instance across all plugins to avoid redundant data synchronization.
- Error Handling: Always handle the case where pricing returns 0.0 due to missing model data.
- Logging: Monitor pricing sync failures and missing model warnings in production.
- Cache Awareness: Use CalculateCostWithCacheDebug when implementing caching features.
- Resource Cleanup: Always call Cleanup() during application shutdown to prevent resource leaks.

