DeepInfra exposes an Anthropic-compatible Messages API. This means tools that target the Anthropic API — Claude Code, the Anthropic Python and TypeScript SDKs, and any framework with an Anthropic adapter — can point at DeepInfra and use open-source models.
## Endpoint

```
https://api.deepinfra.com/anthropic
```

Two endpoints are available:

| Endpoint | Description |
|---|---|
| `POST /anthropic/v1/messages` | Create a message (chat completion) |
| `POST /anthropic/v1/messages/count_tokens` | Count tokens for a message request |
## Authentication

Both standard Anthropic authentication methods are supported:

| Header | Example |
|---|---|
| `Authorization` | `Bearer $DEEPINFRA_TOKEN` |
| `x-api-key` | `$DEEPINFRA_TOKEN` |

You can also pass `anthropic-version` and `anthropic-beta` headers as needed.
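Either header form works on a plain HTTP request. A minimal offline sketch using Python's standard library (no request is actually sent; `"demo-token"` is a placeholder fallback, not a real credential):

```python
import os
import urllib.request

# Read the token from the environment, with a placeholder fallback for the demo.
token = os.environ.get("DEEPINFRA_TOKEN", "demo-token")

# Both request objects below authenticate identically.
bearer_req = urllib.request.Request(
    "https://api.deepinfra.com/anthropic/v1/messages",
    headers={"Authorization": f"Bearer {token}"},
)
api_key_req = urllib.request.Request(
    "https://api.deepinfra.com/anthropic/v1/messages",
    headers={"x-api-key": token},
)

# urllib normalizes stored header names with str.capitalize().
print(bearer_req.get_header("Authorization"))
print(api_key_req.get_header("X-api-key"))
```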
## Using the Anthropic SDK

```python
import os

import anthropic

client = anthropic.Anthropic(
    base_url="https://api.deepinfra.com/anthropic",
    api_key=os.environ["DEEPINFRA_TOKEN"],
)

message = client.messages.create(
    model="deepseek-ai/DeepSeek-V3",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Hello!"}
    ],
)

print(message.content[0].text)
```
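The `messages` list carries the whole conversation: to continue a chat, append the assistant's reply and your next user turn before calling `messages.create` again. A small offline sketch (no request is sent; the assistant text stands in for `message.content[0].text` from a previous response):

```python
# First exchange.
history = [{"role": "user", "content": "Hello!"}]

# Placeholder for the assistant's reply from the previous response.
assistant_text = "Hi! How can I help you today?"
history.append({"role": "assistant", "content": assistant_text})

# Next user turn; `history` is now ready to pass as `messages=`.
history.append({"role": "user", "content": "Summarize our conversation."})

# Turns alternate user/assistant, starting with "user".
roles = [m["role"] for m in history]
print(roles)  # ['user', 'assistant', 'user']
```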
## Using with Claude Code

Claude Code can use DeepInfra as its backend. To keep your normal Claude Code setup untouched, add a dedicated shell function to your `~/.bashrc` or `~/.zshrc`:

```bash
deepinfra() {
  export ANTHROPIC_BASE_URL=https://api.deepinfra.com/anthropic
  export ANTHROPIC_AUTH_TOKEN=$DEEPINFRA_TOKEN
  export ANTHROPIC_MODEL=deepseek-ai/DeepSeek-V3.1-Terminus
  export ANTHROPIC_SMALL_FAST_MODEL=Qwen/Qwen3-30B-A3B
  export CLAUDE_CODE_MAX_OUTPUT_TOKENS=16384
  claude "$@"
}
```
Then run `deepinfra` instead of `claude` to launch Claude Code via DeepInfra. Your regular `claude` command stays unchanged.

`ANTHROPIC_SMALL_FAST_MODEL` is used for lightweight tasks like tab completions and commit messages. Pick a fast, cheap model here to keep costs low.
## Streaming

Streaming works the same as with the Anthropic API: pass `stream=True` (Python) or `stream: true` (JS/cURL), or use the SDK's streaming helper:

```python
with client.messages.stream(
    model="deepseek-ai/DeepSeek-V3",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a short poem about open source."}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
```
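On the wire, a streamed response arrives as Anthropic-style server-sent events, with text carried in `content_block_delta` events containing `text_delta` payloads. A hedged offline sketch that reassembles text from a few hand-written sample events (the event bodies below are illustrative, not captured from a real response):

```python
import json

# Hand-written sample SSE data lines, illustrating the Anthropic stream format.
sse_lines = [
    'data: {"type": "message_start", "message": {"role": "assistant"}}',
    'data: {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "Open source "}}',
    'data: {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "sets code free."}}',
    'data: {"type": "message_stop"}',
]

# Accumulate only the text deltas, as the SDK's text_stream helper does for you.
text = ""
for line in sse_lines:
    if not line.startswith("data: "):
        continue
    event = json.loads(line[len("data: "):])
    if event["type"] == "content_block_delta" and event["delta"]["type"] == "text_delta":
        text += event["delta"]["text"]

print(text)  # Open source sets code free.
```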
## Token counting

Count the tokens in a message request before sending it:

```bash
curl "https://api.deepinfra.com/anthropic/v1/messages/count_tokens" \
  -H "Content-Type: application/json" \
  -H "x-api-key: $DEEPINFRA_TOKEN" \
  -d '{
    "model": "deepseek-ai/DeepSeek-V3",
    "messages": [
      {"role": "user", "content": "Hello, how are you?"}
    ]
  }'
```
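The endpoint responds with a JSON object whose `input_tokens` field holds the count. A small offline sketch of parsing it (the response body below is a made-up example; the real number depends on the model's tokenizer):

```python
import json

# Illustrative response body, not a real API reply.
body = '{"input_tokens": 12}'

result = json.loads(body)
print(result["input_tokens"])  # 12
```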
## Notes

- You are running open-source models via the Anthropic protocol, not Anthropic’s Claude models.
- Model names use DeepInfra identifiers (e.g. `deepseek-ai/DeepSeek-V3`), not Anthropic model names.
- Not all Anthropic-specific features may be supported. Standard message creation, streaming, and token counting work as expected.