DeepInfra hosts text-to-speech models that convert text into natural-sounding audio. Browse all TTS models.

Endpoint

POST https://api.deepinfra.com/v1/inference/{model_name}

Example

import os

import requests

# Read the API token from the DEEPINFRA_TOKEN environment variable
DEEPINFRA_TOKEN = os.environ["DEEPINFRA_TOKEN"]
MODEL = "hexgrad/Kokoro-82M"

response = requests.post(
    f"https://api.deepinfra.com/v1/inference/{MODEL}",
    headers={
        "Authorization": f"Bearer {DEEPINFRA_TOKEN}",
        "Content-Type": "application/json",
    },
    json={
        "text": "Hello! This is a text-to-speech example using DeepInfra.",
    },
)

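# Fail on an HTTP error so an error body is not written to disk as audio
response.raise_for_status()

# Save the returned audio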
with open("output.wav", "wb") as f:
    f.write(response.content)

Additional parameters

Each model may expose additional parameters such as voice selection, speed, and language. Check the model’s individual API documentation page for supported options.
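
As a rough illustration of passing such options, the sketch below merges extra keyword arguments into the JSON body next to "text". The helper build_tts_request is hypothetical, and the option names speed and voice are placeholders rather than confirmed parameters of any model; the real names and accepted values are listed on each model's API documentation page.

```python
import os

API_BASE = "https://api.deepinfra.com/v1/inference"


def build_tts_request(model, text, token, **options):
    """Return keyword arguments suitable for requests.post().

    `options` carries model-specific parameters. The names used below
    ("speed", "voice") are illustrative assumptions, not documented
    parameters. Options set to None are left out of the payload.
    """
    payload = {"text": text}
    payload.update({k: v for k, v in options.items() if v is not None})
    return {
        "url": f"{API_BASE}/{model}",
        "headers": {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        "json": payload,
    }


request_kwargs = build_tts_request(
    "hexgrad/Kokoro-82M",
    "Hello! This is a text-to-speech example using DeepInfra.",
    os.environ.get("DEEPINFRA_TOKEN", ""),
    speed=1.0,   # hypothetical parameter name
    voice=None,  # hypothetical; dropped from the payload because it is None
)
```

The resulting dictionary can be passed straight to the HTTP client, for example requests.post(**request_kwargs). Keeping request construction separate from the network call makes it easy to inspect the payload before sending it.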

Available models

Browse all text-to-speech models.