Tool calling (also known as function calling) is the most important capability for building AI agents. It lets models decide when to invoke external tools — web search, code execution, database queries, API calls — and seamlessly weave the results into a final response. Without reliable tool calling, agentic systems break down.

Tool call accuracy is a top priority for us. We invest significant engineering effort in ensuring that function call parsing, argument extraction, and round-trip reliability are correct across all supported models. In third-party benchmarks such as the K2-Vendor-Verifier evaluation, DeepInfra achieves a top accuracy score for moonshotai/Kimi-K2-Instruct — among the highest of any provider tested.

We provide an OpenAI-compatible tool calling API. For more background, see the DeepInfra blog.

Setup

import openai
import json

client = openai.OpenAI(
    base_url="https://api.deepinfra.com/v1/openai",
    api_key="$DEEPINFRA_TOKEN",  # replace with your DeepInfra API token
)

Define your function

def get_current_weather(location):
    """Get the current weather in a given location"""
    if "tokyo" in location.lower():
        return json.dumps({"location": "Tokyo", "temperature": "75"})
    elif "san francisco" in location.lower():
        return json.dumps({"location": "San Francisco", "temperature": "60"})
    elif "paris" in location.lower():
        return json.dumps({"location": "Paris", "temperature": "70"})
    else:
        return json.dumps({"location": location, "temperature": "unknown"})

Step 1: Send tools to the model

tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA"
                }
            },
            "required": ["location"]
        },
    }
}]
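Because `parameters` is standard JSON Schema, you can sanity-check example arguments against it before wiring up a real call. A minimal hand-rolled sketch (the `satisfies_required` helper is our own naming, not part of any SDK):

```python
# Illustrative only: checks required keys, not a full JSON Schema validation.
schema = {
    "type": "object",
    "properties": {
        "location": {
            "type": "string",
            "description": "The city and state, e.g. San Francisco, CA",
        }
    },
    "required": ["location"],
}

def satisfies_required(args, schema):
    """Return True if every required property is present in args."""
    return all(key in args for key in schema.get("required", []))

print(satisfies_required({"location": "San Francisco, CA"}, schema))  # True
print(satisfies_required({}, schema))  # False
```

A real deployment would use a full JSON Schema validator; this only verifies that required keys are present.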

messages = [{"role": "user", "content": "What is the weather in San Francisco?"}]

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3",
    messages=messages,
    tools=tools,
    tool_choice="auto",
)
tool_calls = response.choices[0].message.tool_calls
for tool_call in tool_calls:
    print(tool_call.model_dump())
Output:
{'id': 'call_X0xYqdnoUonPJpQ6HEadxLHE', 'function': {'arguments': '{"location": "San Francisco"}', 'name': 'get_current_weather'}, 'type': 'function'}
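Models occasionally emit malformed argument JSON, so it is worth guarding the `json.loads` call before executing anything. A small defensive sketch (the `parse_tool_arguments` helper is our own naming):

```python
import json

def parse_tool_arguments(raw):
    """Parse a tool call's argument string, returning {} on malformed JSON."""
    try:
        args = json.loads(raw)
    except json.JSONDecodeError:
        return {}
    # Arguments should decode to an object, not a bare value.
    return args if isinstance(args, dict) else {}

print(parse_tool_arguments('{"location": "San Francisco"}'))  # {'location': 'San Francisco'}
print(parse_tool_arguments('not json'))                       # {}
```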

Step 2: Execute the function and send results back

# Extend conversation with the assistant's reply (which contains the tool calls)
messages.append(response.choices[0].message)

for tool_call in tool_calls:
    function_name = tool_call.function.name
    if function_name == "get_current_weather":
        function_args = json.loads(tool_call.function.arguments)
        function_response = get_current_weather(
            location=function_args.get("location")
        )
        # Only append a tool message for calls we actually executed;
        # otherwise function_response would be undefined here.
        messages.append({
            "tool_call_id": tool_call.id,
            "role": "tool",
            "content": function_response,
        })

# Get a new response from the model with function results
second_response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3",
    messages=messages,
    tools=tools,
    tool_choice="auto",
)

print(second_response.choices[0].message.content)
Output:
The current temperature in San Francisco, CA is 60 degrees.
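Once several tools are in play, the if/elif dispatch in Step 2 is commonly generalized into a name-to-function registry. A sketch under that assumption (`TOOL_REGISTRY` and `run_tool_call` are illustrative names, not part of the SDK; `SimpleNamespace` stands in for the SDK's tool_call object):

```python
import json
from types import SimpleNamespace

def get_current_weather(location):
    """Stub weather lookup mirroring the function defined earlier."""
    return json.dumps({"location": location, "temperature": "unknown"})

# Map tool names (as declared in `tools`) to local callables.
TOOL_REGISTRY = {"get_current_weather": get_current_weather}

def run_tool_call(tool_call):
    """Execute one tool call and build the `tool` message to append."""
    func = TOOL_REGISTRY[tool_call.function.name]
    args = json.loads(tool_call.function.arguments)
    return {
        "tool_call_id": tool_call.id,
        "role": "tool",
        "content": func(**args),
    }

# Demo with a stand-in for the SDK's tool_call object.
example_call = SimpleNamespace(
    id="call_1",
    function=SimpleNamespace(
        name="get_current_weather",
        arguments='{"location": "Paris"}',
    ),
)
tool_message = run_tool_call(example_call)
```

Appending `tool_message` to `messages` then works for any registered tool, without per-function branches.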

Tips

  • Write clear, detailed function descriptions — model quality depends heavily on them
  • Use lower temperatures (< 1.0) to avoid erratic parameter values
  • Avoid system messages when using tool calling
  • Model quality degrades with more functions — keep the list focused
  • Keep top_p and top_k at their defaults

Supported features

Feature                 Supported
Single tool calls       ✅
Parallel tool calls     ✅ (quality may vary)
tool_choice: "auto"     ✅
tool_choice: "none"     ✅
Streaming mode          ✅
Nested calls            ✅

Notes

  • Function definitions count toward your input token usage
  • Inference usage is counted as normal when tool calling is used
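Since function definitions count toward input tokens, it can be useful to gauge how large your serialized tool list is. A rough sketch (the ~4 characters per token figure is a common approximation, not a DeepInfra guarantee):

```python
import json

# The tool schema from Step 1; its JSON serialization is part of the prompt.
tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA",
                }
            },
            "required": ["location"],
        },
    },
}]

serialized = json.dumps(tools)
approx_tokens = len(serialized) // 4  # rough heuristic: ~4 characters per token
print(f"tool definitions: {len(serialized)} chars, roughly {approx_tokens} tokens")
```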