DeepInfra supports the OpenAI embeddings API for all embedding models. The endpoint is:
POST https://api.deepinfra.com/v1/openai/embeddings

Example

import os
from openai import OpenAI

# Read the API token from the environment rather than hard-coding it
openai = OpenAI(
    api_key=os.environ["DEEPINFRA_TOKEN"],
    base_url="https://api.deepinfra.com/v1/openai",
)

input_text = "The food was delicious and the waiter..."
# Or a list: ["hello", "world"]

embeddings = openai.embeddings.create(
    model="Qwen/Qwen3-Embedding-8B",
    input=input_text,
    encoding_format="float"
)

print(embeddings.data[0].embedding)
print(embeddings.usage.prompt_tokens)
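If you prefer not to use the SDK, the same endpoint accepts a plain JSON POST with a bearer token. A minimal sketch of the request shape, using only the endpoint and fields documented above (the helper name is illustrative, not part of any API):

```python
import json

# Build the raw HTTP request the SDK sends under the hood.
# Endpoint, headers, and body fields match the docs above.
def build_embeddings_request(token, model, texts):
    url = "https://api.deepinfra.com/v1/openai/embeddings"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    body = {"model": model, "input": texts, "encoding_format": "float"}
    return url, headers, json.dumps(body)

url, headers, payload = build_embeddings_request(
    "YOUR_TOKEN", "Qwen/Qwen3-Embedding-8B", ["hello", "world"]
)
# Send with any HTTP client, e.g. requests.post(url, headers=headers, data=payload)
```

The response body has the same shape as the SDK objects: a `data` array of embeddings plus a `usage` object with token counts.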

Batch embeddings

Pass an array as input to embed multiple texts in a single request:
embeddings = openai.embeddings.create(
    model="Qwen/Qwen3-Embedding-8B",
    input=["Hello", "World", "How are you?"],
    encoding_format="float"
)

for i, item in enumerate(embeddings.data):
    print(f"Text {i}: {item.embedding[:5]}...")  # First 5 dims
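Once you have a batch of vectors, a common next step is comparing them pairwise, typically with cosine similarity. A minimal, self-contained sketch; the toy three-dimensional vectors below stand in for the real `item.embedding` values returned above:

```python
import math

# Cosine similarity: dot product divided by the product of vector norms.
# Returns 1.0 for identical directions, 0.0 for orthogonal vectors.
def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy stand-ins for embeddings.data[i].embedding
vec_a = [0.1, 0.3, 0.5]
vec_b = [0.2, 0.1, 0.4]
print(cosine_similarity(vec_a, vec_b))
```

In practice you would replace the toy vectors with `embeddings.data[i].embedding` and rank texts by similarity score.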

Supported parameters

Parameter          Notes
model              Embedding model name
input              String or array of strings
encoding_format    float only

Available models

Browse all embedding models: the catalog includes Qwen3 Embedding, BAAI/bge, sentence-transformers, and more.