Tool calling (also known as function calling) is the most important capability for building AI agents. It lets models decide when to invoke external tools — web search, code execution, database queries, API calls — and seamlessly weave the results into a final response. Without reliable tool calling, agentic systems break down.
Tool call accuracy is a top priority for us. We invest significant engineering effort to keep function call parsing, argument extraction, and round-trip reliability correct across all supported models. In third-party benchmarks such as the K2-Vendor-Verifier evaluation, DeepInfra's accuracy score for moonshotai/Kimi-K2-Instruct is among the highest of any provider tested.
We provide an OpenAI-compatible tool calling API. For more background, see the DeepInfra blog.
Setup
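import json

import openai

# OpenAI-compatible client pointed at the DeepInfra endpoint
client = openai.OpenAI(
    base_url="https://api.deepinfra.com/v1/openai",
    api_key="$DEEPINFRA_TOKEN",  # replace with your DeepInfra API token
)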
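The `"$DEEPINFRA_TOKEN"` string above is a placeholder, and Python passes it to the client literally. One way to avoid hardcoding the real token is to read it from the environment. The sketch below assumes an environment variable named `DEEPINFRA_TOKEN` (the name is borrowed from the placeholder) and a hypothetical helper `load_deepinfra_token`; neither is part of the DeepInfra API.

```python
import os


def load_deepinfra_token(env=None):
    """Return the API token from the environment, or fail with a clear message.

    DEEPINFRA_TOKEN is an assumed variable name, taken from the placeholder.
    """
    env = os.environ if env is None else env
    token = env.get("DEEPINFRA_TOKEN", "").strip()
    if not token:
        raise RuntimeError("Set the DEEPINFRA_TOKEN environment variable first")
    return token
```

The result can then be passed as `api_key=load_deepinfra_token()` when constructing the client.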
Define your function
def get_current_weather(location):
    """Get the current weather in a given location"""
    if "tokyo" in location.lower():
        return json.dumps({"location": "Tokyo", "temperature": "75"})
    elif "san francisco" in location.lower():
        return json.dumps({"location": "San Francisco", "temperature": "60"})
    elif "paris" in location.lower():
        return json.dumps({"location": "Paris", "temperature": "70"})
    else:
        return json.dumps({"location": location, "temperature": "unknown"})
tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA"
                }
            },
            "required": ["location"]
        },
    }
}]
Step 1: Send the request with your tool definitions
messages = [{"role": "user", "content": "What is the weather in San Francisco?"}]

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3",
    messages=messages,
    tools=tools,
    tool_choice="auto",
)

# tool_calls is None when the model answers without calling a tool
tool_calls = response.choices[0].message.tool_calls or []
for tool_call in tool_calls:
    print(tool_call.model_dump())
Output:
{'id': 'call_X0xYqdnoUonPJpQ6HEadxLHE', 'function': {'arguments': '{"location": "San Francisco"}', 'name': 'get_current_weather'}, 'type': 'function'}
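As the output shows, `arguments` arrives as a JSON string rather than a dict, and the model is not guaranteed to produce valid JSON every time. The following is a minimal sketch of defensive decoding; `parse_tool_arguments` is a hypothetical helper name, not part of the OpenAI client or the DeepInfra API.

```python
import json


def parse_tool_arguments(raw_arguments):
    """Decode the JSON string found in tool_call.function.arguments.

    Returns (args, error): args is a dict, error is None on success.
    """
    try:
        args = json.loads(raw_arguments or "{}")
    except json.JSONDecodeError as exc:
        return {}, f"invalid JSON in arguments: {exc.msg}"
    if not isinstance(args, dict):
        return {}, "arguments must decode to a JSON object"
    return args, None


args, error = parse_tool_arguments('{"location": "San Francisco"}')
print(args, error)
```

When `error` is set, one option is to send it back to the model as the tool result so it can retry with corrected arguments.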
Step 2: Execute the function and send results back
# Extend the conversation with the assistant's reply
messages.append(response.choices[0].message)

for tool_call in tool_calls:
    function_name = tool_call.function.name
    if function_name == "get_current_weather":
        function_args = json.loads(tool_call.function.arguments)
        function_response = get_current_weather(
            location=function_args.get("location")
        )
    else:
        # Report unknown tools instead of leaving function_response unset
        function_response = json.dumps({"error": f"unknown function: {function_name}"})

    # Every tool call needs a matching "tool" message
    messages.append({
        "tool_call_id": tool_call.id,
        "role": "tool",
        "content": function_response,
    })
# Get a new response from the model with function results
second_response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3",
    messages=messages,
    tools=tools,
    tool_choice="auto",
)
print(second_response.choices[0].message.content)
Output:
The current temperature in San Francisco, CA is 60 degrees.
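Step 2 checks for a single function name. With several tools, or when the model returns parallel tool calls, a lookup from tool name to Python callable keeps the dispatch short. This is a sketch under assumptions: `execute_tool_calls` and `registry` are hypothetical names, and each handler is assumed to accept keyword arguments and return a string.

```python
import json


def execute_tool_calls(tool_calls, registry):
    """Run each requested tool and return the matching "tool" messages.

    registry maps tool names to callables that return a string.
    """
    tool_messages = []
    for tool_call in tool_calls or []:
        name = tool_call.function.name
        handler = registry.get(name)
        if handler is None:
            content = json.dumps({"error": f"unknown tool: {name}"})
        else:
            try:
                arguments = json.loads(tool_call.function.arguments)
                content = handler(**arguments)
            except (json.JSONDecodeError, TypeError) as exc:
                # Malformed JSON or arguments that do not fit the handler
                content = json.dumps({"error": str(exc)})
        tool_messages.append({
            "tool_call_id": tool_call.id,
            "role": "tool",
            "content": content,
        })
    return tool_messages
```

With the example above, this would be used as `messages.extend(execute_tool_calls(tool_calls, {"get_current_weather": get_current_weather}))` before sending the second request.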
Tips
- Write clear, detailed function descriptions — model quality depends heavily on them
- Use lower temperatures (< 1.0) to avoid erratic parameter values
- Avoid system messages when using tool calling
- Model quality degrades with more functions — keep the list focused
- Keep top_p and top_k at their defaults
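Several of these tips translate directly into request parameters. The sketch below collects them in a hypothetical helper, `build_tool_request`; the default `temperature=0.3` is an illustrative value chosen for this example, not a documented recommendation.

```python
def build_tool_request(model, user_prompt, tools, temperature=0.3):
    """Build keyword arguments for client.chat.completions.create().

    temperature=0.3 is an illustrative default, not a documented value.
    """
    if not 0.0 <= temperature < 1.0:
        raise ValueError("temperature should stay below 1.0 for tool calling")
    return {
        "model": model,
        # Only a user message: the tips advise against system messages
        "messages": [{"role": "user", "content": user_prompt}],
        "tools": tools,
        "tool_choice": "auto",
        "temperature": temperature,
        # top_p and top_k are omitted so the server defaults apply
    }


request = build_tool_request(
    "deepseek-ai/DeepSeek-V3",
    "What is the weather in San Francisco?",
    tools=[],  # pass your own tool definitions here
)
print(sorted(request))
```

The resulting dict can be unpacked into the call as `client.chat.completions.create(**request)`.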
Supported features
| Feature | Supported |
|---|---|
| Single tool calls | ✅ |
| Parallel tool calls | ✅ (quality may vary) |
| tool_choice: "auto" | ✅ |
| tool_choice: "none" | ✅ |
| Streaming mode | ✅ |
| Nested calls | ❌ |
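In streaming mode, a tool call typically arrives in pieces: the id and function name first, then the argument string in fragments. The sketch below assumes the chunk layout used by the OpenAI streaming format (`choices[0].delta.tool_calls`, each fragment carrying an `index`); `collect_streamed_tool_calls` is a hypothetical helper, and the exact chunk contents may differ between models.

```python
def collect_streamed_tool_calls(chunks):
    """Merge streamed tool-call fragments into complete calls, ordered by index."""
    calls = {}
    for chunk in chunks:
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta
        for fragment in getattr(delta, "tool_calls", None) or []:
            call = calls.setdefault(
                fragment.index, {"id": None, "name": "", "arguments": ""}
            )
            if fragment.id:
                call["id"] = fragment.id
            if fragment.function is not None:
                if fragment.function.name:
                    call["name"] = fragment.function.name
                if fragment.function.arguments:
                    # Argument JSON is split across chunks; concatenate in order
                    call["arguments"] += fragment.function.arguments
    return [calls[index] for index in sorted(calls)]
```

The function accepts any iterable of chunks, so the stream returned by `client.chat.completions.create(..., stream=True)` can be passed in directly. Decode each `arguments` string with `json.loads` only after the stream has ended.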
Notes
- Function definitions count toward your input token usage
- Inference usage is counted as normal when using tool calling
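To see how much the function definitions add to a request, one rough approach is to send the same messages once with `tools` and once without, then compare the reported prompt tokens. The sketch assumes both responses expose `usage.prompt_tokens` as in the OpenAI response format; `tool_token_overhead` is a hypothetical helper, and the difference is only an estimate.

```python
def tool_token_overhead(response_with_tools, response_without_tools):
    """Difference in reported prompt tokens between two otherwise identical requests."""
    return (
        response_with_tools.usage.prompt_tokens
        - response_without_tools.usage.prompt_tokens
    )
```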