The DeepInfra Native API gives you access to every model we provide, including model types not covered by the OpenAI-compatible API: image generation, speech recognition, object detection, token classification, fill mask, image classification, zero-shot image classification, and text classification.
For LLMs and embeddings, the OpenAI-compatible API is simpler and recommended. Use the native API when you need model types beyond LLMs/embeddings, or when you need features like webhooks or log probabilities.
The base endpoint is:
https://api.deepinfra.com/v1/inference/{model_name}
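As a minimal sketch of how `{model_name}` is filled in, the helper below builds the endpoint URL for a given model. The name `inferenceUrl` is hypothetical and not part of any SDK. It assumes the slash in names such as `deepseek-ai/DeepSeek-V3` stays a path separator, as it does in the example URLs on this page.

```javascript
// Hypothetical helper (not part of the deepinfra package): builds the
// native inference URL from the base endpoint shown above.
const BASE_URL = "https://api.deepinfra.com/v1/inference";

function inferenceUrl(modelName) {
  // Encode each path segment on its own so the slash between the
  // organization and the model stays a path separator.
  const path = modelName.split("/").map((segment) => encodeURIComponent(segment)).join("/");
  return `${BASE_URL}/${path}`;
}

console.log(inferenceUrl("deepseek-ai/DeepSeek-V3"));
// → https://api.deepinfra.com/v1/inference/deepseek-ai/DeepSeek-V3
```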
JavaScript client
Text Generation (LLMs)
// JavaScript, using the deepinfra package
import { TextGeneration } from "deepinfra";

// The client is constructed here with the full inference URL for the model.
const client = new TextGeneration(
  "https://api.deepinfra.com/v1/inference/deepseek-ai/DeepSeek-V3",
  process.env.DEEPINFRA_TOKEN // read the API token from the environment
);
const res = await client.generate({
  input: "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\nHello!<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n",
  stop: ["<|eot_id|>"],
});
console.log(res.results[0].generated_text);
// Token counts for the prompt and the completion
console.log(res.inference_status.tokens_input, res.inference_status.tokens_generated);
curl "https://api.deepinfra.com/v1/inference/deepseek-ai/DeepSeek-V3" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $DEEPINFRA_TOKEN" \
-d '{
"input": "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\nHello!<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n",
"stop": ["<|eot_id|>"],
"stream": false
}'
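The `input` string in both examples is a hand-assembled chat prompt. A minimal sketch of a helper that assembles the same layout from a list of messages might look like this. `buildPrompt` is a hypothetical name, and the special tokens are copied from the example above; other models may expect a different template, so treat this as an illustration rather than a general-purpose formatter.

```javascript
// Hypothetical helper: assembles a prompt in the same token layout as the
// example above. The special tokens are taken from that example and are an
// assumption for any other model.
function buildPrompt(messages) {
  const turns = messages
    .map(({ role, content }) =>
      `<|start_header_id|>${role}<|end_header_id|>\n\n${content}<|eot_id|>`)
    .join("");
  // End with an open assistant header so the model writes the reply.
  return `<|begin_of_text|>${turns}<|start_header_id|>assistant<|end_header_id|>\n\n`;
}

// Reproduces the `input` value used in the requests above.
const prompt = buildPrompt([{ role: "user", content: "Hello!" }]);
console.log(JSON.stringify(prompt));
```

The result can be passed as `input`, together with `stop: ["<|eot_id|>"]`, exactly as in the requests above.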
Embeddings
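// JavaScript, using the deepinfra package
import { Embeddings } from "deepinfra";

// Read the API token from the environment rather than hard-coding it.
const client = new Embeddings("Qwen/Qwen3-Embedding-8B", process.env.DEEPINFRA_TOKEN);
const output = await client.generate({
  inputs: [
    "What is the capital of France?",
    "What is the capital of Germany?",
  ],
});
// Embedding vector for the first input
console.log(output.embeddings[0]);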
curl -X POST \
-H "Authorization: Bearer $DEEPINFRA_TOKEN" \
-F 'inputs=["I like chocolate"]' \
'https://api.deepinfra.com/v1/inference/Qwen/Qwen3-Embedding-8B'
Image Generation
// JavaScript, using the deepinfra package
import { TextToImage } from "deepinfra";
import { createWriteStream } from "fs";
import { Readable } from "stream";

// Read the API token from the environment rather than hard-coding it.
const model = new TextToImage("stabilityai/stable-diffusion-2-1", process.env.DEEPINFRA_TOKEN);
const response = await model.generate({
  prompt: "a burger with a funny hat on the beach",
});

// Fetch the first returned image and stream it to disk.
const result = await fetch(response.images[0]);
if (result.ok && result.body) {
  Readable.fromWeb(result.body).pipe(createWriteStream("image.png"));
}
curl "https://api.deepinfra.com/v1/inference/stabilityai/stable-diffusion-2-1" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $DEEPINFRA_TOKEN" \
-d '{"prompt": "a burger with a funny hat on the beach"}'
Speech Recognition
curl -X POST \
-H "Authorization: Bearer $DEEPINFRA_TOKEN" \
-F audio=@audio.mp3 \
'https://api.deepinfra.com/v1/inference/openai/whisper-large'
Object Detection
curl -X POST \
-H "Authorization: Bearer $DEEPINFRA_TOKEN" \
-F image=@image.jpg \
'https://api.deepinfra.com/v1/inference/hustvl/yolos-small'
Token Classification
curl -X POST \
-d '{"input": "My name is John Doe and I live in San Francisco."}' \
-H "Authorization: Bearer $DEEPINFRA_TOKEN" \
-H 'Content-Type: application/json' \
'https://api.deepinfra.com/v1/inference/Davlan/bert-base-multilingual-cased-ner-hrl'
Fill Mask
curl -X POST \
-d '{"input": "I need my [MASK] right now!"}' \
-H "Authorization: Bearer $DEEPINFRA_TOKEN" \
-H 'Content-Type: application/json' \
'https://api.deepinfra.com/v1/inference/bert-base-cased'
Image Classification
curl -X POST \
-H "Authorization: Bearer $DEEPINFRA_TOKEN" \
-F image=@image.jpg \
'https://api.deepinfra.com/v1/inference/google/vit-base-patch16-224'
Zero-Shot Image Classification
curl -X POST \
-H "Authorization: Bearer $DEEPINFRA_TOKEN" \
-F image=@image.jpg \
-F 'candidate_labels=["dog", "cat", "car", "horse", "person"]' \
'https://api.deepinfra.com/v1/inference/openai/clip-vit-base-patch32'
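The `-F` flags in the file-upload examples send a multipart form rather than JSON. A minimal sketch of the same request body in JavaScript follows, assuming Node 18 or later, where `FormData` and `Blob` are globals. `buildZeroShotForm` is a hypothetical name; the field names `image` and `candidate_labels` come from the curl command above, and the `image/jpeg` content type is an assumption matching the `image.jpg` file used there.

```javascript
// Hypothetical helper: builds the multipart body that the curl command
// above sends with its two -F flags.
function buildZeroShotForm(imageBytes, labels) {
  const form = new FormData();
  // File field, equivalent to: -F image=@image.jpg
  form.append("image", new Blob([imageBytes], { type: "image/jpeg" }), "image.jpg");
  // The labels travel as a JSON array encoded in a plain text field.
  form.append("candidate_labels", JSON.stringify(labels));
  return form;
}

// Usage sketch (not executed here; needs a real image and token):
//   const bytes = await (await import("node:fs/promises")).readFile("image.jpg");
//   const res = await fetch(
//     "https://api.deepinfra.com/v1/inference/openai/clip-vit-base-patch32",
//     {
//       method: "POST",
//       headers: { Authorization: `Bearer ${process.env.DEEPINFRA_TOKEN}` },
//       body: buildZeroShotForm(bytes, ["dog", "cat", "car", "horse", "person"]),
//     }
//   );
```

No `Content-Type` header is set by hand here: `fetch` adds the multipart boundary itself when the body is a `FormData` object.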
Text Classification
curl -X POST \
-d '{"input": "Nvidia announces new AI chips months after latest launch"}' \
-H "Authorization: Bearer $DEEPINFRA_TOKEN" \
-H 'Content-Type: application/json' \
'https://api.deepinfra.com/v1/inference/ProsusAI/finbert'
HTTP / other languages
The native API is plain HTTP, so you can call it from any language (Go, C#, Java, PHP, Ruby, C++, etc.) without an SDK dependency.
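To illustrate, here is a minimal sketch of a JSON request made without the SDK, using only the global `fetch` available in Node 18 or later. The names `buildInferenceRequest` and `runInference` are hypothetical; the URL pattern and headers mirror the curl examples on this page. The same three pieces (URL, bearer token, JSON body) carry over to any other language's HTTP client.

```javascript
// Hypothetical helper: assembles the URL and fetch options for a JSON
// request, mirroring the curl examples (POST, JSON body, bearer token).
function buildInferenceRequest(modelName, payload, token) {
  return {
    url: `https://api.deepinfra.com/v1/inference/${modelName}`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${token}`,
      },
      body: JSON.stringify(payload),
    },
  };
}

// Hypothetical helper: sends the request and returns the parsed JSON body.
async function runInference(modelName, payload, token) {
  const { url, init } = buildInferenceRequest(modelName, payload, token);
  const res = await fetch(url, init);
  if (!res.ok) {
    throw new Error(`Inference request failed: HTTP ${res.status}`);
  }
  return res.json();
}

// Usage sketch (not executed here; needs a valid token):
//   const out = await runInference(
//     "ProsusAI/finbert",
//     { input: "Nvidia announces new AI chips months after latest launch" },
//     process.env.DEEPINFRA_TOKEN
//   );
//   console.log(out);
```

Separating request construction from sending keeps the first function free of network access, which makes it straightforward to inspect or unit-test.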