Whisper is OpenAI’s speech recognition model. Given an audio file, it produces transcribed text with per-sentence timestamps. DeepInfra hosts multiple Whisper variants. Browse all speech recognition models.Documentation Index
Fetch the complete documentation index at: https://docs.deepinfra.com/llms.txt
Use this file to discover all available pages before exploring further.
Models
| Model | Notes |
|---|---|
openai/whisper-large | Best accuracy |
openai/whisper-medium | Balanced |
openai/whisper-small | Fast |
openai/whisper-base | Smallest |
openai/whisper-timestamped-medium | Per-word timestamps |
whisper-timestamped gives per-word timestamps.
Example
Supported formats
mp3wav
Additional parameters
Each Whisper variant supports parameters likelanguage, task (transcribe vs. translate), and more. Check the model’s documentation page for the full list: