Deploy LoRA adapter models on top of base models hosted at DeepInfra. Your adapter is loaded on a supported base model and served with the standard OpenAI-compatible API.

Prerequisites

  1. A LoRA adapter model hosted on Hugging Face
  2. A base model that supports LoRA at DeepInfra (see supported base models in the upload form)
  3. A Hugging Face token if your LoRA adapter is private
  4. A DeepInfra account and API key

Deploy a LoRA model

  1. Go to Dashboard
  2. Click New Deployment
  3. Click the LoRA Model tab
  4. Fill in the form:
    • LoRA model name — name used to reference this deployment
    • Hugging Face Model Name — path to your LoRA adapter on Hugging Face
    • Hugging Face Token — optional, required for private repos

Example

Using the public adapter askardeepinfra/llama-3.1-8B-rank-32-example-lora (base: meta-llama/Meta-Llama-3.1-8B-Instruct):
  1. Go to Dashboard → New Deployment → LoRA Model
  2. Fill in:
    • LoRA model name: asdf/lora-example
    • Hugging Face Model Name: askardeepinfra/llama-3.1-8B-rank-32-example-lora
  3. Click Upload
The deployment appears in Dashboard → Deployments. The deployment moves through the states Initializing → Deploying → Running. Once running, your model page is at https://deepinfra.com/asdf/lora-example.

Inference

curl "https://api.deepinfra.com/v1/openai/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPINFRA_API_KEY" \
  -d '{
      "model": "asdf/lora-example",
      "messages": [
        {
          "role": "user",
          "content": "Hello!"
        }
      ]
    }'
The LoRA model name goes directly in the model field, the same as any other model hosted at DeepInfra.
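
The curl request above can also be issued from Python. The following is a minimal stdlib-only sketch that builds the same OpenAI-compatible chat-completions request; the helper name `build_request` is illustrative, not part of any DeepInfra SDK, and sending the request (commented out below) requires a valid `DEEPINFRA_API_KEY` and network access:

```python
import json
import os
import urllib.request

API_URL = "https://api.deepinfra.com/v1/openai/chat/completions"

def build_request(model: str, user_message: str) -> urllib.request.Request:
    """Build the same request body the curl example sends.

    `model` is the LoRA model name chosen at deployment time,
    e.g. "asdf/lora-example".
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ.get('DEEPINFRA_API_KEY', '')}",
        },
    )

req = build_request("asdf/lora-example", "Hello!")
print(req.full_url)

# To actually send the request (needs a real API key):
# with urllib.request.urlopen(req) as resp:
#     reply = json.load(resp)
#     print(reply["choices"][0]["message"]["content"])
```

Because the endpoint is OpenAI-compatible, any OpenAI-style client pointed at `https://api.deepinfra.com/v1/openai` should work the same way.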