Tool calling (also known as function calling) is the most important capability for building AI agents. It lets a model decide when to invoke external tools (web search, code execution, database queries, API calls) and weave the results into its final response. Without reliable tool calling, agentic systems break down.

Tool call accuracy is a top priority for us. We invest significant engineering effort in making function call parsing, argument extraction, and round-trip reliability correct across all supported models. In third-party benchmarks such as the K2-Vendor-Verifier evaluation, DeepInfra's accuracy score for moonshotai/Kimi-K2-Instruct is among the highest of any provider tested.

We provide an OpenAI-compatible tool calling API. For more background, see the DeepInfra blog.

Setup

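import json
import os

import openai

client = openai.OpenAI(
    base_url="https://api.deepinfra.com/v1/openai",
    # Read the token from the DEEPINFRA_TOKEN environment variable;
    # a literal "$DEEPINFRA_TOKEN" string is not expanded by Python.
    api_key=os.environ["DEEPINFRA_TOKEN"],
)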

Define your function

def get_current_weather(location):
    """Get the current weather in a given location"""
    if "tokyo" in location.lower():
        return json.dumps({"location": "Tokyo", "temperature": "75"})
    elif "san francisco" in location.lower():
        return json.dumps({"location": "San Francisco", "temperature": "60"})
    elif "paris" in location.lower():
        return json.dumps({"location": "Paris", "temperature": "70"})
    else:
        return json.dumps({"location": location, "temperature": "unknown"})

Step 1: Send tools to the model

tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA"
                }
            },
            "required": ["location"]
        },
    }
}]

messages = [{"role": "user", "content": "What is the weather in San Francisco?"}]

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3",
    messages=messages,
    tools=tools,
    tool_choice="auto",
)
# tool_calls is None when the model answers without calling a tool
tool_calls = response.choices[0].message.tool_calls or []
for tool_call in tool_calls:
    print(tool_call.model_dump())

Output:
{'id': 'call_X0xYqdnoUonPJpQ6HEadxLHE', 'function': {'arguments': '{"location": "San Francisco"}', 'name': 'get_current_weather'}, 'type': 'function'}
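
The `arguments` field in the output above is a JSON string produced by the model, so it may be worth validating before it reaches your function. The following is a minimal sketch, not part of the DeepInfra API: `parse_tool_arguments` is a hypothetical helper that checks the string is a JSON object and that the required keys are present.

```python
import json


def parse_tool_arguments(raw_arguments, required=()):
    """Parse the JSON arguments string of a tool call (hypothetical helper).

    Returns (args, error). On success args is a dict and error is None;
    on failure args is None and error is a short message.
    """
    try:
        args = json.loads(raw_arguments)
    except (json.JSONDecodeError, TypeError) as exc:
        return None, f"arguments are not valid JSON: {exc}"
    if not isinstance(args, dict):
        return None, "arguments must be a JSON object"
    missing = [name for name in required if name not in args]
    if missing:
        return None, f"missing required arguments: {', '.join(missing)}"
    return args, None


args, error = parse_tool_arguments(
    '{"location": "San Francisco"}', required=["location"]
)
print(args, error)
```

When `error` is not None, one option is to send it back as the content of the tool message so the model can retry with corrected arguments.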

Step 2: Execute the function and send results back

# Extend conversation with assistant's reply
messages.append(response.choices[0].message)

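for tool_call in tool_calls:
    function_name = tool_call.function.name
    if function_name == "get_current_weather":
        function_args = json.loads(tool_call.function.arguments)
        function_response = get_current_weather(
            location=function_args.get("location")
        )
    else:
        # Report unknown functions back to the model instead of
        # reusing a stale or undefined function_response
        function_response = json.dumps(
            {"error": f"unknown function: {function_name}"}
        )

    # Each tool call gets a tool message with the matching tool_call_id
    messages.append({
        "tool_call_id": tool_call.id,
        "role": "tool",
        "content": function_response,
    })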

# Get a new response from the model with function results
second_response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3",
    messages=messages,
    tools=tools,
    tool_choice="auto",
)

print(second_response.choices[0].message.content)

Output:
The current temperature in San Francisco, CA is 60 degrees.
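
With more than one tool, the name check in Step 2 can be replaced by a lookup table. The sketch below is one possible arrangement and not something the API requires: `TOOL_REGISTRY` and `run_tool_call` are hypothetical names, and the weather function is a stand-in for the one defined earlier.

```python
import json


def get_current_weather(location):
    """Stand-in for the weather function defined earlier on this page."""
    return json.dumps({"location": location, "temperature": "unknown"})


# Hypothetical registry mapping tool names to Python callables.
TOOL_REGISTRY = {"get_current_weather": get_current_weather}


def run_tool_call(name, raw_arguments, registry=TOOL_REGISTRY):
    """Run one tool call and always return a string for the tool message."""
    func = registry.get(name)
    if func is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        kwargs = json.loads(raw_arguments)
        return func(**kwargs)
    except (json.JSONDecodeError, TypeError) as exc:
        # Invalid JSON, a non-object payload, or wrong argument names
        return json.dumps({"error": str(exc)})


print(run_tool_call("get_current_weather", '{"location": "Paris"}'))
```

Returning an error payload instead of raising keeps the conversation going, since the model sees the failure in the tool message and can correct its call.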

Tips

  • Write clear, detailed function descriptions — model quality depends heavily on them
  • Use lower temperatures (< 1.0) to avoid erratic parameter values
  • Avoid system messages when using tool calling
  • Model quality degrades with more functions — keep the list focused
  • Keep top_p and top_k at their defaults
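
Some of these tips can be checked mechanically before a request is sent. The sketch below assumes the `tools` structure shown in Step 1; `check_tools` is a hypothetical helper, and its thresholds (`max_tools`, `min_description_length`) are arbitrary illustrations, not documented limits.

```python
def check_tools(tools, max_tools=10, min_description_length=20):
    """Return a list of warnings about a tools list.

    The thresholds are arbitrary defaults chosen for illustration.
    """
    warnings = []
    if len(tools) > max_tools:
        warnings.append(f"{len(tools)} tools defined; consider trimming the list")
    for tool in tools:
        function = tool.get("function", {})
        name = function.get("name", "<unnamed>")
        if len(function.get("description", "")) < min_description_length:
            warnings.append(f"{name}: function description is missing or very short")
        properties = function.get("parameters", {}).get("properties", {})
        for param, schema in properties.items():
            if not schema.get("description"):
                warnings.append(f"{name}.{param}: parameter has no description")
    return warnings


example_tools = [{
    "type": "function",
    "function": {
        "name": "lookup",
        "description": "Look up",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
        },
    },
}]
for warning in check_tools(example_tools):
    print(warning)
```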

Supported features

Feature | Supported
Single tool calls |
Parallel tool calls | ✅ (quality may vary)
tool_choice: "auto" |
tool_choice: "none" |
Streaming mode |
Nested calls |
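
When the model returns parallel tool calls, each call needs its own tool message with the matching `tool_call_id`. The following is a minimal sketch that works on the dict form produced by `tool_call.model_dump()`; `build_tool_messages` and the `execute` callback are hypothetical names used for illustration.

```python
import json


def build_tool_messages(tool_calls, execute):
    """Return one role="tool" message per tool call, in the same order.

    tool_calls: list of dicts shaped like tool_call.model_dump().
    execute: callable (name, raw_arguments) -> str, supplied by the caller.
    """
    messages = []
    for call in tool_calls:
        result = execute(call["function"]["name"], call["function"]["arguments"])
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": result,
        })
    return messages


def echo_execute(name, raw_arguments):
    """Stub executor that echoes the call; replace with real tool logic."""
    return json.dumps({"tool": name, "args": json.loads(raw_arguments)})


calls = [
    {"id": "call_1", "type": "function",
     "function": {"name": "get_current_weather", "arguments": '{"location": "Tokyo"}'}},
    {"id": "call_2", "type": "function",
     "function": {"name": "get_current_weather", "arguments": '{"location": "Paris"}'}},
]
for message in build_tool_messages(calls, echo_execute):
    print(message)
```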

Notes

  • Function definitions count toward your input token usage
  • Inference usage is counted as normal when using tool calling
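
Because function definitions are sent with every request, it can help to track how large they are. The sketch below only measures the serialized size in characters; the actual token count depends on the model's tokenizer and on how the server renders the definitions, so treat the number as a relative indicator. `tools_payload_chars` is a hypothetical helper. For real figures, OpenAI-compatible responses normally report prompt tokens in `response.usage.prompt_tokens`.

```python
import json


def tools_payload_chars(tools):
    """Return the length of the compact JSON serialization of a tools list."""
    return len(json.dumps(tools, separators=(",", ":")))


small = [{"type": "function", "function": {"name": "a", "description": "short"}}]
large = small + [{
    "type": "function",
    "function": {"name": "b", "description": "a much longer description " * 5},
}]
# More tools and longer descriptions mean more input sent per request
print(tools_payload_chars(small), tools_payload_chars(large))
```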