# DeepInfra

## Docs

- [Authentication](https://docs.deepinfra.com/account/authentication.md): API tokens and scoped JWT for secure, scope-limited inference access.
- [Data Privacy](https://docs.deepinfra.com/account/data-privacy.md): How DeepInfra handles your data during inference: what's stored, what's not.
- [Okta SSO](https://docs.deepinfra.com/account/okta-sso.md): Configure Okta as a Single Sign-On provider for your DeepInfra team.
- [Rate Limits](https://docs.deepinfra.com/account/rate-limits.md): Default concurrent request limits and how to increase them.
- [Subprocessors](https://docs.deepinfra.com/account/subprocessors.md): Third-party services DeepInfra uses to provide its services.
- [Webhooks](https://docs.deepinfra.com/account/webhooks.md): Receive inference results asynchronously via HTTP callbacks.
- [Account Email Values](https://docs.deepinfra.com/api-reference/account-email-values.md)
- [Account Rate Limit](https://docs.deepinfra.com/api-reference/account-rate-limit.md)
- [Account Update Details](https://docs.deepinfra.com/api-reference/account-update-details.md)
- [Add Funds](https://docs.deepinfra.com/api-reference/add-funds.md)
- [Anthropic Messages](https://docs.deepinfra.com/api-reference/anthropic-messages.md)
- [Anthropic Messages Count Tokens](https://docs.deepinfra.com/api-reference/anthropic-messages-count-tokens.md)
- [Billing Portal](https://docs.deepinfra.com/api-reference/billing-portal.md)
- [Cli Version](https://docs.deepinfra.com/api-reference/cli-version.md)
- [Container Rentals Delete](https://docs.deepinfra.com/api-reference/container-rentals-delete.md)
- [Container Rentals Get](https://docs.deepinfra.com/api-reference/container-rentals-get.md)
- [Container Rentals Get Params](https://docs.deepinfra.com/api-reference/container-rentals-get-params.md)
- [Container Rentals List](https://docs.deepinfra.com/api-reference/container-rentals-list.md)
- [Container Rentals Start](https://docs.deepinfra.com/api-reference/container-rentals-start.md)
- [Container Rentals Update](https://docs.deepinfra.com/api-reference/container-rentals-update.md)
- [Create Api Token](https://docs.deepinfra.com/api-reference/create-api-token.md)
- [Create Lora](https://docs.deepinfra.com/api-reference/create-lora.md)
- [Create Openai Batch](https://docs.deepinfra.com/api-reference/create-openai-batch.md)
- [Create Openai Batch](https://docs.deepinfra.com/api-reference/create-openai-batch-1.md)
- [Create Scoped Jwt](https://docs.deepinfra.com/api-reference/create-scoped-jwt.md)
- [Create Ssh Key](https://docs.deepinfra.com/api-reference/create-ssh-key.md)
- [Create Voice](https://docs.deepinfra.com/api-reference/create-voice.md): Create a new voice
- [Deepstart Apply](https://docs.deepinfra.com/api-reference/deepstart-apply.md)
- [Delete Account](https://docs.deepinfra.com/api-reference/delete-account.md)
- [Delete Api Token](https://docs.deepinfra.com/api-reference/delete-api-token.md)
- [Delete Lora](https://docs.deepinfra.com/api-reference/delete-lora.md)
- [Delete Lora Model](https://docs.deepinfra.com/api-reference/delete-lora-model.md)
- [Delete Ssh Key](https://docs.deepinfra.com/api-reference/delete-ssh-key.md)
- [Delete Voice](https://docs.deepinfra.com/api-reference/delete-voice.md)
- [Deploy Create](https://docs.deepinfra.com/api-reference/deploy-create.md)
- [Deploy Create Hf](https://docs.deepinfra.com/api-reference/deploy-create-hf.md)
- [Deploy Create Llm](https://docs.deepinfra.com/api-reference/deploy-create-llm.md)
- [Deploy Delete](https://docs.deepinfra.com/api-reference/deploy-delete.md)
- [Deploy Detailed Stats](https://docs.deepinfra.com/api-reference/deploy-detailed-stats.md)
- [Deploy Gpu Availability](https://docs.deepinfra.com/api-reference/deploy-gpu-availability.md)
- [Deploy List](https://docs.deepinfra.com/api-reference/deploy-list.md)
- [Deploy List](https://docs.deepinfra.com/api-reference/deploy-list-1.md)
- [Deploy Start](https://docs.deepinfra.com/api-reference/deploy-start.md): Start a stopped deployment. Re-creates pods via auto-scaling.
- [Deploy Stats](https://docs.deepinfra.com/api-reference/deploy-stats.md)
- [Deploy Status](https://docs.deepinfra.com/api-reference/deploy-status.md)
- [Deploy Stop](https://docs.deepinfra.com/api-reference/deploy-stop.md): Stop a running deployment. Terminates pods. Can be restarted later.
- [Deploy Update](https://docs.deepinfra.com/api-reference/deploy-update.md)
- [Deployment Logs Query](https://docs.deepinfra.com/api-reference/deployment-logs-query.md): Query deployment logs.
  * Without timestamps (from/to) returns last `limit` messages (in last month).
  * With `from` only, returns first `limit` messages after `from` (inclusive).
  * With `to` only, returns last `limit` messages before `to` (inclusive).
  * With both `from` and `to`, return the first `li…
- [Deployment Stats](https://docs.deepinfra.com/api-reference/deployment-stats.md)
- [Export Api Token To Vercel](https://docs.deepinfra.com/api-reference/export-api-token-to-vercel.md)
- [Get Api Token](https://docs.deepinfra.com/api-reference/get-api-token.md)
- [Get Api Tokens](https://docs.deepinfra.com/api-reference/get-api-tokens.md)
- [Get Checklist](https://docs.deepinfra.com/api-reference/get-checklist.md)
- [Get Config](https://docs.deepinfra.com/api-reference/get-config.md)
- [Get Live Metrics](https://docs.deepinfra.com/api-reference/get-live-metrics.md): Get the latest values for the Live metrics section on the web front page.
- [Get Lora](https://docs.deepinfra.com/api-reference/get-lora.md)
- [Get Lora Status](https://docs.deepinfra.com/api-reference/get-lora-status.md)
- [Get Model Loras](https://docs.deepinfra.com/api-reference/get-model-loras.md)
- [Get Request Costs](https://docs.deepinfra.com/api-reference/get-request-costs.md)
- [Get Ssh Keys](https://docs.deepinfra.com/api-reference/get-ssh-keys.md)
- [Get User Loras](https://docs.deepinfra.com/api-reference/get-user-loras.md)
- [Get Voice](https://docs.deepinfra.com/api-reference/get-voice.md): Get a voice by its id
- [Get Voices](https://docs.deepinfra.com/api-reference/get-voices.md): Get available voices for a given user
- [Github Cli Login](https://docs.deepinfra.com/api-reference/github-cli-login.md): deepctl is calling this request waiting for auth token during login. The token is stored in /github/callback
- [Github Login](https://docs.deepinfra.com/api-reference/github-login.md): Initiate github SSO login flow. Callback is /github/callback
- [Inference Deploy](https://docs.deepinfra.com/api-reference/inference-deploy.md)
- [Inference Model](https://docs.deepinfra.com/api-reference/inference-model.md)
- [Inspect Scoped Jwt](https://docs.deepinfra.com/api-reference/inspect-scoped-jwt.md)
- [List Files](https://docs.deepinfra.com/api-reference/list-files.md)
- [List Files](https://docs.deepinfra.com/api-reference/list-files-1.md)
- [Logs Query](https://docs.deepinfra.com/api-reference/logs-query.md): Query inference logs.
  * Without timestamps (from/to) returns last `limit` messages (in last month).
  * With `from` only, returns first `limit` messages after `from` (inclusive).
  * With `to` only, returns last `limit` messages before `to` (inclusive).
  * With both `from` and `to`, return the first `lim…
- [Me](https://docs.deepinfra.com/api-reference/me.md)
- [Model Delete](https://docs.deepinfra.com/api-reference/model-delete.md)
- [Model Families Names](https://docs.deepinfra.com/api-reference/model-families-names.md)
- [Model Family](https://docs.deepinfra.com/api-reference/model-family.md)
- [Model Meta Update](https://docs.deepinfra.com/api-reference/model-meta-update.md)
- [Model Publicity](https://docs.deepinfra.com/api-reference/model-publicity.md)
- [Model Schema](https://docs.deepinfra.com/api-reference/model-schema.md)
- [Model Versions](https://docs.deepinfra.com/api-reference/model-versions.md)
- [Models Deployment List](https://docs.deepinfra.com/api-reference/models-deployment-list.md)
- [Models Featured](https://docs.deepinfra.com/api-reference/models-featured.md)
- [Models Info](https://docs.deepinfra.com/api-reference/models-info.md)
- [Models List](https://docs.deepinfra.com/api-reference/models-list.md)
- [Models Lora List](https://docs.deepinfra.com/api-reference/models-lora-list.md)
- [Okta Login](https://docs.deepinfra.com/api-reference/okta-login.md)
- [Openai Audio Speech](https://docs.deepinfra.com/api-reference/openai-audio-speech.md)
- [Openai Audio Speech](https://docs.deepinfra.com/api-reference/openai-audio-speech-1.md)
- [Openai Audio Transcriptions](https://docs.deepinfra.com/api-reference/openai-audio-transcriptions.md)
- [Openai Audio Transcriptions](https://docs.deepinfra.com/api-reference/openai-audio-transcriptions-1.md)
- [Openai Audio Translations](https://docs.deepinfra.com/api-reference/openai-audio-translations.md)
- [Openai Audio Translations](https://docs.deepinfra.com/api-reference/openai-audio-translations-1.md)
- [Openai Chat Completions](https://docs.deepinfra.com/api-reference/openai-chat-completions.md)
- [Openai Chat Completions](https://docs.deepinfra.com/api-reference/openai-chat-completions-1.md)
- [Openai Completions](https://docs.deepinfra.com/api-reference/openai-completions.md)
- [Openai Completions](https://docs.deepinfra.com/api-reference/openai-completions-1.md)
- [Openai Embeddings](https://docs.deepinfra.com/api-reference/openai-embeddings.md)
- [Openai Embeddings](https://docs.deepinfra.com/api-reference/openai-embeddings-1.md)
- [Openai Files](https://docs.deepinfra.com/api-reference/openai-files.md)
- [Openai Files](https://docs.deepinfra.com/api-reference/openai-files-1.md)
- [Openai Images Edits](https://docs.deepinfra.com/api-reference/openai-images-edits.md): Edit image using OpenAI Images Edits API
- [Openai Images Edits](https://docs.deepinfra.com/api-reference/openai-images-edits-1.md): Edit image using OpenAI Images Edits API
- [Openai Images Generations](https://docs.deepinfra.com/api-reference/openai-images-generations.md): Generate image using OpenAI Images API
- [Openai Images Generations](https://docs.deepinfra.com/api-reference/openai-images-generations-1.md): Generate image using OpenAI Images API
- [Openai Images Variations](https://docs.deepinfra.com/api-reference/openai-images-variations.md): Generate a similar image using OpenAI Images Variations API
- [Openai Images Variations](https://docs.deepinfra.com/api-reference/openai-images-variations-1.md): Generate a similar image using OpenAI Images Variations API
- [Openai Models](https://docs.deepinfra.com/api-reference/openai-models.md)
- [Openai Models](https://docs.deepinfra.com/api-reference/openai-models-1.md)
- [Openrouter Models](https://docs.deepinfra.com/api-reference/openrouter-models.md)
- [Private Models List](https://docs.deepinfra.com/api-reference/private-models-list.md)
- [Rent Gpu Availability](https://docs.deepinfra.com/api-reference/rent-gpu-availability.md)
- [Request Rate Limit Increase](https://docs.deepinfra.com/api-reference/request-rate-limit-increase.md)
- [Retrieve Openai Batch](https://docs.deepinfra.com/api-reference/retrieve-openai-batch.md)
- [Retrieve Openai Batch](https://docs.deepinfra.com/api-reference/retrieve-openai-batch-1.md)
- [Retrieve Openai Batches](https://docs.deepinfra.com/api-reference/retrieve-openai-batches.md)
- [Retrieve Openai Batches](https://docs.deepinfra.com/api-reference/retrieve-openai-batches-1.md)
- [Set Config](https://docs.deepinfra.com/api-reference/set-config.md)
- [Setup Topup](https://docs.deepinfra.com/api-reference/setup-topup.md)
- [Submit Feedback](https://docs.deepinfra.com/api-reference/submit-feedback.md): Submit feedback
- [Team Set Display Name](https://docs.deepinfra.com/api-reference/team-set-display-name.md)
- [Text To Speech](https://docs.deepinfra.com/api-reference/text-to-speech.md)
- [Text To Speech Stream](https://docs.deepinfra.com/api-reference/text-to-speech-stream.md)
- [Update Lora](https://docs.deepinfra.com/api-reference/update-lora.md)
- [Update Voice](https://docs.deepinfra.com/api-reference/update-voice.md)
- [Upload Lora Model](https://docs.deepinfra.com/api-reference/upload-lora-model.md)
- [Usage](https://docs.deepinfra.com/api-reference/usage.md)
- [Usage Api Token](https://docs.deepinfra.com/api-reference/usage-api-token.md)
- [Usage Rent](https://docs.deepinfra.com/api-reference/usage-rent.md)
- [Usage Tokens](https://docs.deepinfra.com/api-reference/usage-tokens.md)
- [Text Completions](https://docs.deepinfra.com/apis/completions.md): Legacy OpenAI-compatible completions API for raw text generation.
- [DeepInfra Native API](https://docs.deepinfra.com/apis/deepinfra-native.md): Advanced API with access to all model types including image generation, speech, object detection, and more.
- [Embeddings](https://docs.deepinfra.com/apis/embeddings.md): Generate embedding vectors from text using the OpenAI-compatible embeddings API.
- [Image Generation](https://docs.deepinfra.com/apis/image-generation.md): Generate images from text prompts using the OpenAI-compatible images API.
- [Reranking](https://docs.deepinfra.com/apis/reranker.md): Rerank a list of documents by relevance to a query.
- [Speech Recognition](https://docs.deepinfra.com/apis/speech.md): Transcribe audio to text using Whisper and other speech recognition models.
- [Text to Speech](https://docs.deepinfra.com/apis/text-to-speech.md): Convert text to natural-sounding audio using TTS models.
- [Text to Video](https://docs.deepinfra.com/apis/text-to-video.md): Generate video clips from text prompts.
- [Log Probabilities](https://docs.deepinfra.com/chat/log-probs.md): Get per-token log probabilities from LLM responses.
- [Chat Completions](https://docs.deepinfra.com/chat/overview.md): OpenAI-compatible chat completions API; just change the base URL and model name.
- [Prompt Caching](https://docs.deepinfra.com/chat/prompt-caching.md): Reduce latency and cost by caching repeated prompt prefixes.
- [Reasoning Models](https://docs.deepinfra.com/chat/reasoning.md): Configure chain-of-thought reasoning with reasoning_effort and the reasoning parameter.
- [Streaming](https://docs.deepinfra.com/chat/streaming.md): Stream chat completion responses token by token using server-sent events.
- [Structured Outputs](https://docs.deepinfra.com/chat/structured-outputs.md): Get model responses in JSON format using response_format.
- [Tool Calling](https://docs.deepinfra.com/chat/tool-calling.md): Let models call external functions, the foundation of AI agents.
- [Vision & OCR](https://docs.deepinfra.com/chat/vision.md): Send images to multimodal models for visual understanding and text extraction.
- [GPU Clusters](https://docs.deepinfra.com/gpu-instances/overview.md): Rent dedicated B200 and B300 GPU clusters with SSH access for training, fine-tuning, and custom workloads.
- [What is DeepInfra](https://docs.deepinfra.com/index.md): AI inference cloud: OpenAI-compatible API, 100s of open-source models, private GPU deployments, and GPU rental.
- [AI SDK (Vercel)](https://docs.deepinfra.com/integrations/ai-sdk.md): Use DeepInfra models with the Vercel AI SDK for TypeScript/JavaScript.
- [Anthropic SDK & Claude Code](https://docs.deepinfra.com/integrations/anthropic.md): Use DeepInfra models with the Anthropic Messages API, Claude Code, and the Anthropic SDK.
- [AutoGen](https://docs.deepinfra.com/integrations/autogen.md): Build multi-agent LLM applications with AutoGen using DeepInfra endpoints.
- [LangChain](https://docs.deepinfra.com/integrations/langchain.md): Use DeepInfra models with LangChain for LLM-powered applications.
- [LlamaIndex](https://docs.deepinfra.com/integrations/llama-index.md): Use DeepInfra LLMs and embeddings with LlamaIndex.
- [Models](https://docs.deepinfra.com/models.md): Browse 100+ open-source models available on DeepInfra.
- [Custom LLMs](https://docs.deepinfra.com/private-models/custom-llms.md): Deploy your own LLM on dedicated A100/H100/H200/B200/B300 GPUs with autoscaling and an OpenAI-compatible endpoint.
- [LoRA Adapters](https://docs.deepinfra.com/private-models/lora.md): Deploy LoRA fine-tuned language models on DeepInfra.
- [LoRA for Image Generation](https://docs.deepinfra.com/private-models/lora-image.md): Deploy LoRA adapters for text-to-image generation using models from Civitai.
- [Deploy Private Models](https://docs.deepinfra.com/private-models/overview.md): Run your own LLMs and image models on dedicated GPU infrastructure with autoscaling.
- [Quickstart](https://docs.deepinfra.com/quickstart.md): Make your first API call in 60 seconds, no installation required.
- [Stable Diffusion](https://docs.deepinfra.com/tutorials/stable-diffusion.md): Generate images with Stable Diffusion and SDXL on DeepInfra.
- [Whisper Speech Recognition](https://docs.deepinfra.com/tutorials/whisper.md): Transcribe audio to text with OpenAI Whisper on DeepInfra.

## OpenAPI Specs

- [openapi](https://docs.deepinfra.com/api-reference/openapi.json)

## Optional

- [Models](https://deepinfra.com/models)
- [Discord](https://discord.com/invite/x88dCvhqYq)