Overview

Fireworks is an OpenAI-compatible provider in Bifrost with native support for:

Chat Completions via /v1/chat/completions
Responses API via /v1/responses
Text Completions via /v1/completions
Embeddings via /v1/embeddings
Streaming for chat, responses, and completions
Tool calling for chat and responses

Unless noted below, Fireworks follows the standard OpenAI-compatible request and response behavior described in the OpenAI provider documentation.

Supported Operations

Operation	Non-Streaming	Streaming	Endpoint
Chat Completions	✅	✅	`/v1/chat/completions`
Responses API	✅	✅	`/v1/responses`
Text Completions	✅	✅	`/v1/completions`
Embeddings	✅	❌	`/v1/embeddings`
List Models	✅	-	`/v1/models`
Images	❌	❌	-
Speech / Transcription	❌	❌	-
Files	❌	❌	-
Batch	❌	❌	-
Count Tokens	❌	❌	-

Fireworks Responses support is native in Bifrost. Requests are sent to Fireworks’ /v1/responses endpoint directly, so fields such as previous_response_id, max_tool_calls, and store are preserved.

Setup & Configuration

Configure Fireworks as a provider.

The Web UI steps come first, followed by the equivalent config.json entry and Go SDK snippet.

Navigate to Models > Model Providers. Look for Fireworks under Configured Providers. If it is missing, click on Add New Provider and select Fireworks.
Click Add Key or edit an existing key.
Set a name for your key.
Paste your API key directly or use an environment variable (for example, env.FIREWORKS_API_KEY).
Set Allowed Models to All Models (default) or the specific model allowlist you want this key to serve.
Save the provider configuration.

{
  "providers": {
    "fireworks": {
      "keys": [
        {
          "name": "fireworks-key-1",
          "value": "env.FIREWORKS_API_KEY",
          "models": [
            "*"
          ],
          "weight": 1.0
        }
      ]
    }
  }
}

case schemas.Fireworks:
    return []schemas.Key{{
        Name:   "fireworks-key-1",
        Value:  *schemas.NewSecretVar("env.FIREWORKS_API_KEY"),
        Models: []string{"*"},
        Weight: 1.0,
    }}, nil
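
The env. prefix in the key value points at an environment variable rather than a literal secret. As a rough illustration of that convention, the sketch below parses a config.json fragment like the one above and resolves env.-prefixed values. The helper names resolve_key_value and fireworks_keys are hypothetical, the demo key is made up, and this is not how Bifrost itself loads configuration.

```python
import json
import os

# Same shape as the config.json fragment above.
CONFIG_TEXT = """
{
  "providers": {
    "fireworks": {
      "keys": [
        {
          "name": "fireworks-key-1",
          "value": "env.FIREWORKS_API_KEY",
          "models": ["*"],
          "weight": 1.0
        }
      ]
    }
  }
}
"""

ENV_PREFIX = "env."


def resolve_key_value(value, environ=None):
    """Hypothetical helper: turn 'env.NAME' into the value of NAME, else return the literal."""
    environ = os.environ if environ is None else environ
    if value.startswith(ENV_PREFIX):
        name = value[len(ENV_PREFIX):]
        if name not in environ:
            raise KeyError(f"environment variable {name} is not set")
        return environ[name]
    return value


def fireworks_keys(config_text, environ=None):
    """Hypothetical helper: return the Fireworks keys with env references resolved."""
    config = json.loads(config_text)
    keys = config["providers"]["fireworks"]["keys"]
    return [{**key, "value": resolve_key_value(key["value"], environ)} for key in keys]


# A made-up environment so the sketch runs without a real API key.
demo_env = {"FIREWORKS_API_KEY": "fw-demo-not-a-real-key"}
for key in fireworks_keys(CONFIG_TEXT, demo_env):
    print(key["name"], key["models"], key["weight"])
```

Passing the environment as an argument keeps the sketch easy to test; a missing variable raises instead of silently sending an empty key.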

1. Chat Completions

Fireworks chat completions use the standard OpenAI-compatible wire format.

Fireworks-specific handling

prediction is preserved and forwarded.
Bifrost maps prompt_cache_key to Fireworks prompt_cache_isolation_key for chat-completion cache isolation.
Assistant reasoning_content is preserved for Fireworks chat-completion models that support reasoning history.

Filtered Parameters

For Fireworks chat completions, Bifrost removes or rewrites a small set of OpenAI-specific fields before sending the request upstream:

prompt_cache_key is mapped to Fireworks prompt_cache_isolation_key
prompt_cache_retention is removed
verbosity is removed
store is removed
web_search_options is removed
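
The handling listed above can be pictured as a small transformation on the request body. The following is a minimal sketch under that reading, not Bifrost's actual implementation; the function name filter_fireworks_chat_params and the sample values are hypothetical.

```python
# Fields the list above says are dropped before the request goes upstream.
REMOVED_FIELDS = ("prompt_cache_retention", "verbosity", "store", "web_search_options")


def filter_fireworks_chat_params(body):
    """Hypothetical helper: return a copy of an OpenAI-style chat body adjusted for Fireworks."""
    out = {name: value for name, value in body.items() if name not in REMOVED_FIELDS}
    # Rename the OpenAI cache key to the Fireworks cache-isolation field.
    if "prompt_cache_key" in out:
        out["prompt_cache_isolation_key"] = out.pop("prompt_cache_key")
    return out


# Sample request; the cache key and prediction content are made-up values.
request = {
    "model": "fireworks/accounts/fireworks/models/deepseek-v3p2",
    "messages": [{"role": "user", "content": "Reply with exactly: fireworks ok"}],
    "prompt_cache_key": "tenant-42",
    "prediction": {"type": "content", "content": "fireworks ok"},
    "store": True,
    "verbosity": "low",
}

upstream = filter_fireworks_chat_params(request)
print(sorted(upstream))
```

Fields that are not listed, such as prediction, pass through unchanged, and the original request dictionary is left untouched.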

Example

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "fireworks/accounts/fireworks/models/deepseek-v3p2",
    "messages": [
      {"role": "user", "content": "Reply with exactly: fireworks ok"}
    ]
  }'

2. Responses API

Fireworks Responses use the native Fireworks endpoint:

/v1/responses

This preserves Responses-only fields and semantics, including:

previous_response_id
max_tool_calls
store
native responses streaming

Example

curl -X POST http://localhost:8080/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "fireworks/accounts/fireworks/models/deepseek-v3p2",
    "input": [
      {"role": "user", "content": "Reply with exactly: responses ok"}
    ],
    "max_tool_calls": 2
  }'

For continuation requests, Fireworks also supports previous_response_id.
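
A continuation request can be sketched as a second payload that carries the id of the earlier response. The example below only builds the request bodies and does not call the gateway. The helper names and the stand-in response id are hypothetical, and it assumes the response object exposes its identifier in an id field, as in the OpenAI Responses format.

```python
MODEL = "fireworks/accounts/fireworks/models/deepseek-v3p2"


def first_turn(text, max_tool_calls=None):
    """Hypothetical helper: build the initial /v1/responses body."""
    body = {"model": MODEL, "input": [{"role": "user", "content": text}]}
    if max_tool_calls is not None:
        body["max_tool_calls"] = max_tool_calls
    return body


def continuation(previous_response, text):
    """Hypothetical helper: build a follow-up body that points at the earlier response."""
    return {
        "model": MODEL,
        "input": [{"role": "user", "content": text}],
        # Assumption: the earlier response carries its identifier under "id".
        "previous_response_id": previous_response["id"],
    }


opening = first_turn("Reply with exactly: responses ok", max_tool_calls=2)
# Stand-in for the JSON returned by the first call; the id is made up.
earlier_response = {"id": "resp_example_123"}
follow_up = continuation(earlier_response, "Now reply with exactly: still ok")
print(follow_up["previous_response_id"])
```

Each body would be sent as the JSON payload of a POST to /v1/responses, in the same way as the curl example above.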

3. Text Completions

Fireworks text completions are sent to the native completions endpoint:

/v1/completions

Example

curl -X POST http://localhost:8080/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "fireworks/accounts/fireworks/models/deepseek-v3p2",
    "prompt": "In fruits, A is for apple and B is for"
  }'

For Fireworks text completions, Bifrost extracts prompt_cache_key from extra_params and maps it to Fireworks prompt_cache_isolation_key.
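
As an illustration of that extraction step, the sketch below lifts prompt_cache_key out of an extra_params dictionary and writes it to the Fireworks body field. The function name is hypothetical, and what Bifrost does with any remaining extra_params entries is not covered here, so they are simply returned alongside the body.

```python
def map_text_completion_cache_key(body, extra_params):
    """Hypothetical helper: move prompt_cache_key from extra_params into the request body."""
    out = dict(body)
    remaining = dict(extra_params)
    cache_key = remaining.pop("prompt_cache_key", None)
    if cache_key is not None:
        out["prompt_cache_isolation_key"] = cache_key
    return out, remaining


body = {
    "model": "fireworks/accounts/fireworks/models/deepseek-v3p2",
    "prompt": "In fruits, A is for apple and B is for",
}
# The cache key value is made up for the example.
upstream, leftover = map_text_completion_cache_key(body, {"prompt_cache_key": "tenant-42"})
print(upstream["prompt_cache_isolation_key"], leftover)
```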

4. Embeddings

Fireworks embeddings are sent to:

/v1/embeddings

Embedding-capable models may differ from the models used for chat and text completions.

Example

curl -X POST http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "fireworks/nomic-ai/nomic-embed-text-v1.5",
    "input": "embedding test"
  }'

Fireworks documents additional embedding-specific fields such as prompt_template, return_logits, and normalize. This page describes the standard embeddings flow currently covered by Bifrost.
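
For the standard flow, the vectors can be read out of the response as sketched below. This assumes the usual OpenAI-compatible embeddings response shape, with a data list whose items carry index and embedding fields. The helper name and the sample response, including its numbers, are made up for illustration.

```python
def embedding_vectors(response):
    """Hypothetical helper: return the embedding vectors ordered by their index field."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


# Made-up response in the OpenAI-compatible shape; real vectors are much longer.
sample_response = {
    "object": "list",
    "data": [
        {"object": "embedding", "index": 1, "embedding": [0.5, 0.25]},
        {"object": "embedding", "index": 0, "embedding": [0.125, 0.75]},
    ],
    "model": "nomic-ai/nomic-embed-text-v1.5",
}

vectors = embedding_vectors(sample_response)
print(len(vectors), len(vectors[0]))
```

Sorting by index keeps each vector aligned with the position of its input when several inputs are sent in one request.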

5. Unsupported Features

The following operations are still unsupported by the Fireworks provider in Bifrost:

Feature	Status
Image generation / editing / variations	❌
Speech / TTS	❌
Transcription / STT	❌
Files	❌
Batch	❌
Count tokens	❌
Rerank	❌

6. Caveats

Prompt Caching Semantics

For Fireworks chat completions, Bifrost maps prompt_cache_key to prompt_cache_isolation_key, the Fireworks body field for cache isolation. Fireworks also accepts the header form x-prompt-cache-isolation-key. For text completions, Bifrost extracts prompt_cache_key from extra_params and maps it to the same body field.

If you need Fireworks session-affinity behavior, pass user, configure x-session-affinity in the provider's extra headers, or send it through the HTTP gateway as x-bf-eh-x-session-affinity. Actual cache-hit behavior depends on the model and deployment.
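
The gateway header form can be sketched as a prefix applied to the upstream header name. The example below generalizes from the single x-bf-eh-x-session-affinity case named above, so treat the prefix rule as an assumption. The helper name and the session value are hypothetical.

```python
# Assumption: the gateway forwards headers that carry this prefix, inferred from
# the x-bf-eh-x-session-affinity example.
GATEWAY_EXTRA_HEADER_PREFIX = "x-bf-eh-"


def gateway_headers(extra_headers):
    """Hypothetical helper: build gateway request headers that carry upstream extra headers."""
    headers = {"Content-Type": "application/json"}
    for name, value in extra_headers.items():
        headers[GATEWAY_EXTRA_HEADER_PREFIX + name.lower()] = value
    return headers


# The session value is made up for the example.
headers = gateway_headers({"x-session-affinity": "session-abc"})
print(headers["x-bf-eh-x-session-affinity"])
```

With curl, the same header would be passed as an additional -H argument on the request.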

Reasoning History

Bifrost preserves assistant reasoning_content for Fireworks chat models that support reasoning history. Fireworks-specific reasoning controls such as reasoning_history do not receive special typed handling in this provider.
