The DeepInfra Native API gives you access to every model we provide, including model types not covered by the OpenAI-compatible API: image generation, speech recognition, object detection, token classification, fill mask, image classification, zero-shot image classification, and text classification.

For LLMs and embeddings, the OpenAI-compatible API is simpler and is the recommended choice. Use the native API when you need model types beyond LLMs and embeddings, or features such as webhooks or log probabilities.

The base endpoint is:
https://api.deepinfra.com/v1/inference/{model_name}
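
As a minimal sketch, the `{model_name}` slot can be filled in with a small helper. `inferenceUrl` is a hypothetical name used only for illustration and is not part of the `deepinfra` package. It assumes model names have the `org/model` form seen throughout this page, with the slash kept as a path separator.

```javascript
// Hypothetical helper (not part of the deepinfra SDK): fills in the
// {model_name} slot of the native API base endpoint.
const NATIVE_BASE_URL = "https://api.deepinfra.com/v1/inference";

function inferenceUrl(modelName) {
  if (typeof modelName !== "string" || modelName.trim() === "") {
    throw new TypeError("modelName must be a non-empty string");
  }
  // Model names look like "org/model"; the slash stays a path separator.
  // Any leading slashes are dropped so the URL has no empty path segment.
  return `${NATIVE_BASE_URL}/${modelName.trim().replace(/^\/+/, "")}`;
}

console.log(inferenceUrl("deepseek-ai/DeepSeek-V3"));
// prints "https://api.deepinfra.com/v1/inference/deepseek-ai/DeepSeek-V3"
```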

JavaScript client

npm install deepinfra

Text Generation (LLMs)

import { TextGeneration } from "deepinfra";

const client = new TextGeneration(
  "https://api.deepinfra.com/v1/inference/deepseek-ai/DeepSeek-V3",
  process.env.DEEPINFRA_TOKEN // API token read from the environment
);

const res = await client.generate({
  input: "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\nHello!<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n",
  stop: ["<|eot_id|>"]
});

console.log(res.results[0].generated_text);
console.log(res.inference_status.tokens_input, res.inference_status.tokens_generated);

curl "https://api.deepinfra.com/v1/inference/deepseek-ai/DeepSeek-V3" \
   -H "Content-Type: application/json" \
   -H "Authorization: Bearer $DEEPINFRA_TOKEN" \
   -d '{
     "input": "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\nHello!<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n",
     "stop": ["<|eot_id|>"],
     "stream": false
   }'
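
The raw `input` string in the examples above is easier to maintain when it is assembled from a list of messages. The sketch below reproduces that exact token layout. `buildPrompt` and `buildGenerationBody` are hypothetical helper names, and whether a particular model expects this template is an assumption to check against its model card.

```javascript
// Sketch: assemble the raw `input` string from a list of chat messages.
// The token layout is copied from the example above; whether a given
// model expects this template is an assumption to verify.
function buildPrompt(messages) {
  let prompt = "<|begin_of_text|>";
  for (const { role, content } of messages) {
    prompt += `<|start_header_id|>${role}<|end_header_id|>\n\n${content}<|eot_id|>`;
  }
  // End with an open assistant header so generation starts the reply.
  return prompt + "<|start_header_id|>assistant<|end_header_id|>\n\n";
}

// Hypothetical helper: the JSON body for the text generation endpoint,
// using only the fields shown in the curl example.
function buildGenerationBody(messages) {
  return {
    input: buildPrompt(messages),
    stop: ["<|eot_id|>"],
    stream: false,
  };
}

const generationBody = buildGenerationBody([{ role: "user", content: "Hello!" }]);
console.log(JSON.stringify(generationBody, null, 2));
```

The resulting object can be passed to `JSON.stringify` and sent as the request body, or its `input` and `stop` fields can be handed to `client.generate`.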

Embeddings

import { Embeddings } from "deepinfra";

// API token read from the environment
const client = new Embeddings("Qwen/Qwen3-Embedding-8B", process.env.DEEPINFRA_TOKEN);
const output = await client.generate({
  inputs: [
    "What is the capital of France?",
    "What is the capital of Germany?",
  ],
});
console.log(output.embeddings[0]);

curl -X POST \
    -H "Authorization: Bearer $DEEPINFRA_TOKEN" \
    -F 'inputs=["I like chocolate"]' \
    'https://api.deepinfra.com/v1/inference/Qwen/Qwen3-Embedding-8B'
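
A common next step with embeddings is comparing two of them. The sketch below computes cosine similarity between two vectors; it assumes each entry of `output.embeddings` is a plain array of numbers, as the JavaScript example suggests. `cosineSimilarity` is an illustrative helper, not an SDK function.

```javascript
// Illustrative helper (not part of the deepinfra SDK): cosine similarity
// between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length || a.length === 0) {
    throw new RangeError("vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new RangeError("cosine similarity is undefined for a zero vector");
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy vectors standing in for output.embeddings[0] and output.embeddings[1].
console.log(cosineSimilarity([1, 0], [0, 1])); // prints 0
```

With real output, `cosineSimilarity(output.embeddings[0], output.embeddings[1])` would give a score between -1 and 1 for the two input sentences.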

Image Generation

import { TextToImage } from "deepinfra";
import { createWriteStream } from "fs";
import { Readable } from "stream";

// API token read from the environment
const model = new TextToImage("stabilityai/stable-diffusion-2-1", process.env.DEEPINFRA_TOKEN);
const response = await model.generate({
  prompt: "a burger with a funny hat on the beach",
});

const result = await fetch(response.images[0]);
if (result.ok && result.body) {
  Readable.fromWeb(result.body).pipe(createWriteStream("image.png"));
}

curl "https://api.deepinfra.com/v1/inference/stabilityai/stable-diffusion-2-1" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPINFRA_TOKEN" \
  -d '{"prompt": "a burger with a funny hat on the beach"}'

Speech Recognition

curl -X POST \
    -H "Authorization: Bearer $DEEPINFRA_TOKEN" \
    -F audio=@audio.mp3 \
    'https://api.deepinfra.com/v1/inference/openai/whisper-large'
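
The same upload can be sketched in JavaScript without the SDK, using the `fetch`, `FormData`, and `Blob` globals available in Node 18 and later. `buildAudioForm` and `transcribe` are hypothetical names. The response shape is not described on this page, so the parsed JSON is returned unchanged.

```javascript
// Sketch of the curl upload above using fetch and FormData (Node 18+ globals).
const WHISPER_URL = "https://api.deepinfra.com/v1/inference/openai/whisper-large";

// Wrap raw audio bytes in a multipart form under the `audio` field,
// mirroring `-F audio=@audio.mp3`.
function buildAudioForm(bytes, filename = "audio.mp3") {
  const form = new FormData();
  form.append("audio", new Blob([bytes]), filename);
  return form;
}

// Hypothetical wrapper: posts the form and returns the parsed JSON as-is.
async function transcribe(bytes, token) {
  const res = await fetch(WHISPER_URL, {
    method: "POST",
    // No Content-Type header here: fetch adds the multipart boundary itself.
    headers: { Authorization: `Bearer ${token}` },
    body: buildAudioForm(bytes),
  });
  if (!res.ok) {
    throw new Error(`Inference failed: HTTP ${res.status}`);
  }
  return res.json();
}
```

A call might look like `transcribe(await readFile("audio.mp3"), process.env.DEEPINFRA_TOKEN)`, with `readFile` imported from `node:fs/promises`.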

Object Detection

curl -X POST \
    -H "Authorization: Bearer $DEEPINFRA_TOKEN" \
    -F image=@image.jpg \
    'https://api.deepinfra.com/v1/inference/hustvl/yolos-small'

Token Classification

curl -X POST \
    -d '{"input": "My name is John Doe and I live in San Francisco."}' \
    -H "Authorization: Bearer $DEEPINFRA_TOKEN" \
    -H 'Content-Type: application/json' \
    'https://api.deepinfra.com/v1/inference/Davlan/bert-base-multilingual-cased-ner-hrl'

Fill Mask

curl -X POST \
    -d '{"input": "I need my [MASK] right now!"}' \
    -H "Authorization: Bearer $DEEPINFRA_TOKEN" \
    -H 'Content-Type: application/json' \
    'https://api.deepinfra.com/v1/inference/bert-base-cased'

Image Classification

curl -X POST \
    -H "Authorization: Bearer $DEEPINFRA_TOKEN" \
    -F image=@image.jpg \
    'https://api.deepinfra.com/v1/inference/google/vit-base-patch16-224'

Zero-Shot Image Classification

curl -X POST \
    -H "Authorization: Bearer $DEEPINFRA_TOKEN" \
    -F image=@image.jpg \
    -F 'candidate_labels=["dog", "cat", "car", "horse", "person"]' \
    'https://api.deepinfra.com/v1/inference/openai/clip-vit-base-patch32'
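
Note that `candidate_labels` travels as a single form field containing a JSON-encoded array, not as repeated fields. A minimal sketch of building that form in JavaScript follows; `buildZeroShotForm` is a hypothetical helper and relies on the `FormData` and `Blob` globals from Node 18 and later.

```javascript
// Hypothetical helper: multipart form for zero-shot image classification,
// mirroring the `-F image=@image.jpg -F candidate_labels=[...]` curl call.
function buildZeroShotForm(imageBytes, labels, filename = "image.jpg") {
  if (!Array.isArray(labels) || labels.length === 0) {
    throw new TypeError("labels must be a non-empty array of strings");
  }
  const form = new FormData();
  form.append("image", new Blob([imageBytes]), filename);
  // The labels are sent as one JSON-encoded string field.
  form.append("candidate_labels", JSON.stringify(labels));
  return form;
}

const zeroShotForm = buildZeroShotForm(new Uint8Array([0xff, 0xd8]), ["dog", "cat"]);
console.log(zeroShotForm.get("candidate_labels")); // prints ["dog","cat"]
```

The form can then be passed as the `body` of a `fetch` POST to the endpoint above, with only the `Authorization` header set.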

Text Classification

curl -X POST \
    -d '{"input": "Nvidia announces new AI chips months after latest launch"}' \
    -H "Authorization: Bearer $DEEPINFRA_TOKEN" \
    -H 'Content-Type: application/json' \
    'https://api.deepinfra.com/v1/inference/ProsusAI/finbert'

HTTP / other languages

The native API is plain HTTP, so you can call it from any language (Go, C#, Java, PHP, Ruby, C++, etc.) without an SDK dependency.
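
To illustrate how little is needed, here is a minimal sketch of an SDK-free call for the models that take a JSON `input` (token classification, fill mask, text classification). `buildJsonRequest` and `infer` are hypothetical names; the headers and body mirror the curl examples on this page, and the response is returned as parsed JSON without assuming its shape.

```javascript
// Hypothetical helper: fetch options equivalent to the JSON curl examples.
function buildJsonRequest(token, payload) {
  return {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${token}`,
    },
    body: JSON.stringify(payload),
  };
}

// Hypothetical wrapper around fetch (a Node 18+ global).
async function infer(modelName, payload, token) {
  const url = `https://api.deepinfra.com/v1/inference/${modelName}`;
  const res = await fetch(url, buildJsonRequest(token, payload));
  if (!res.ok) {
    throw new Error(`Inference failed: HTTP ${res.status}`);
  }
  return res.json();
}
```

For example, `infer("ProsusAI/finbert", { input: "Nvidia announces new AI chips months after latest launch" }, process.env.DEEPINFRA_TOKEN)` sends the same request as the text classification curl call. The same three pieces (URL, bearer token header, JSON body) carry over directly to any other HTTP client.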