Tool Calling

The Responses API supports two types of tools:

  • Function tools: Client-executed. The model decides to call a function, returns the call details, and your code executes it locally and sends the result back.

  • MCP tools: Server-executed. Stateful Responses connects to an MCP (Model Context Protocol) endpoint, runs the tool on the server side, and feeds the result back to the model in an agentic loop. The LLM backend never sees the tool calls directly.


Function Tools (Client-Executed)

With function tools, the flow is:

  1. You define tools in the request

  2. The model returns a function_call output instead of a message

  3. Your code executes the function locally

  4. You send the result back as function_call_output using previous_response_id

  5. The model generates a final response incorporating the tool result
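
Steps 3 and 4 can be sketched as a small dispatcher that turns function_call items into function_call_output items. This is a minimal illustration rather than part of the API: TOOL_REGISTRY and build_function_call_outputs are hypothetical names, the weather data is a placeholder, and the output items are modeled as plain dicts.

```python
import json

def get_weather(location: str) -> dict:
    # Placeholder implementation; a real one would query a weather service.
    return {"location": location, "temperature": "22°C", "condition": "Sunny"}

# Hypothetical registry mapping tool names to local Python callables.
TOOL_REGISTRY = {"get_weather": get_weather}

def build_function_call_outputs(output_items, registry=TOOL_REGISTRY):
    """Run every function_call item locally and return function_call_output items."""
    results = []
    for item in output_items:
        if item.get("type") != "function_call":
            continue  # skip messages and other output item types
        func = registry.get(item["name"])
        if func is None:
            payload = {"error": f"unknown tool: {item['name']}"}
        else:
            args = json.loads(item["arguments"])  # arguments arrive as a JSON string
            payload = func(**args)
        results.append({
            "type": "function_call_output",
            "call_id": item["call_id"],  # must match the call being answered
            "output": json.dumps(payload, ensure_ascii=False),
        })
    return results

example_output = [
    {"type": "function_call", "name": "get_weather",
     "call_id": "call_abc123", "arguments": "{\"location\": \"Tokyo\"}"}
]
follow_up_input = build_function_call_outputs(example_output)
print(follow_up_input)
```

The returned list is what would be sent as input in the follow-up request, together with previous_response_id.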

Complete Function Tool Flow

  • curl

  • Python (OpenAI SDK)

  • Python (PydanticAI)

  • Python (LangGraph)

Step 1: Send request with tool definitions. The model returns a function_call:

curl -X POST $BASE_URL/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AA_TOKEN" \
  -d '{
    "model": "qwen3-32b-tool",
    "input": "What is the weather like in Tokyo?",
    "instructions": "Use the get_weather function. Do not make up weather data.",
    "tools": [
      {
        "type": "function",
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {
              "type": "string",
              "description": "The city name, e.g., San Francisco"
            }
          },
          "required": ["location"]
        }
      }
    ]
  }'

The response contains a function_call item:

{
  "id": "resp_001",
  "output": [
    {
      "type": "function_call",
      "name": "get_weather",
      "call_id": "call_abc123",
      "arguments": "{\"location\": \"Tokyo\"}"
    }
  ]
}
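
Note that arguments is a JSON-encoded string, not an object, so it has to be decoded before use. The sketch below decodes it and checks it against the tool's own parameters schema before anything is executed. parse_arguments is a hypothetical helper, and the checks cover only required and unknown keys, not full JSON Schema validation.

```python
import json

# The tool definition from the request above.
WEATHER_TOOL = {
    "type": "function",
    "name": "get_weather",
    "parameters": {
        "type": "object",
        "properties": {"location": {"type": "string"}},
        "required": ["location"],
    },
}

def parse_arguments(tool: dict, arguments_json: str) -> dict:
    """Decode a function_call's arguments string and check it against the tool schema."""
    args = json.loads(arguments_json)
    if not isinstance(args, dict):
        raise ValueError("arguments must decode to a JSON object")
    schema = tool.get("parameters", {})
    missing = [key for key in schema.get("required", []) if key not in args]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")
    unknown = [key for key in args if key not in schema.get("properties", {})]
    if unknown:
        raise ValueError(f"unexpected arguments: {unknown}")
    return args

args = parse_arguments(WEATHER_TOOL, "{\"location\": \"Tokyo\"}")
print(args["location"])
```

Failing early here keeps malformed model output from reaching the function implementation.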

Step 2: Execute the function locally, then send the result back:

curl -X POST $BASE_URL/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AA_TOKEN" \
  -d '{
    "model": "qwen3-32b-tool",
    "input": [
      {
        "type": "function_call_output",
        "call_id": "call_abc123",
        "output": "{\"location\": \"Tokyo\", \"temperature\": \"22°C\", \"condition\": \"Sunny\"}"
      }
    ],
    "previous_response_id": "resp_001",
    "tools": [
      {
        "type": "function",
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": {"type": "string", "description": "The city name"}
          },
          "required": ["location"]
        }
      }
    ]
  }'

Python (OpenAI SDK):

# Assumes `client` is an already-configured OpenAI SDK client.
import json

# Step 1: Send request with tools
tools = [
    {
        "type": "function",
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "strict": False,
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city name, e.g., 'San Francisco'",
                },
            },
            "required": ["location"],
        },
    }
]

response1 = client.responses.create(
    model="qwen3-32b-tool",
    input="What's the weather like in Tokyo?",
    instructions="Use the get_weather function. Do not make up weather data.",
    tools=tools,
)

# Step 2: Find the function call in the output
function_call = None
for item in response1.output:
    if item.type == "function_call":
        function_call = item
        break

# Step 3: Execute the function locally
if function_call is None:
    raise RuntimeError("The model did not return a function_call")
args = json.loads(function_call.arguments)
weather_result = json.dumps({
    "location": args["location"],
    "temperature": "22°C",
    "condition": "Sunny",
    "humidity": "45%",
})

# Step 4: Send the result back
response2 = client.responses.create(
    model="qwen3-32b-tool",
    input=[
        {
            "type": "function_call_output",
            "call_id": function_call.call_id,
            "output": weather_result,
        }
    ],
    previous_response_id=response1.id,
    tools=tools,
)

print(response2.output_text)
# "The weather in Tokyo is 22°C and Sunny with 45% humidity."

PydanticAI handles the tool-calling loop automatically. You define tools as decorated functions.

Assumes provider is configured as shown in Getting Started.

from pydantic_ai import Agent, RunContext
from pydantic_ai.models.openai import OpenAIResponsesModel

agent: Agent[None, str] = Agent(
    model=OpenAIResponsesModel("qwen3-32b-tool", provider=provider),
    system_prompt="Use the get_weather tool when asked about weather.",
)

@agent.tool
async def get_weather(ctx: RunContext[None], location: str) -> str:
    """Get the current weather for a location.

    Args:
        location: The city name, e.g., 'San Francisco'
    """
    # Your actual implementation here
    return f"Weather in {location}: 22°C, Sunny, Humidity: 45%"

# `await` needs an async context (an async function or a notebook cell).
result = await agent.run("What's the weather like in Tokyo?")
print(result.output)
# PydanticAI automatically calls get_weather and feeds the result back

LangChain's create_agent runs the client-side tool loop for you: at each iteration the model is invoked, any returned tool calls are executed locally as Python, their ToolMessage outputs are appended, and the model is invoked again. The loop terminates when the model returns a message with no further tool calls.

This is distinct from the server-side MCP loop in Stateful Responses (see MCP Tools below); that loop handles tools whose body runs on the server. create_agent handles tools whose body runs in your Python process.

Assumes llm is configured as shown in Getting Started.

from langchain.agents import create_agent
from langchain_core.messages import HumanMessage
from langchain_core.tools import tool

@tool
def get_weather(location: str) -> str:
    """Get the current weather for a location.

    Args:
        location: The city name, e.g., 'San Francisco'
    """
    # Your actual implementation here
    return f"Weather in {location}: 22°C, Sunny, Humidity: 45%"

agent = create_agent(
    llm,
    tools=[get_weather],
    system_prompt="Use the get_weather tool when asked about weather.",
)

result = agent.invoke(
    {"messages": [HumanMessage("What's the weather like in Tokyo?")]}
)
print(result["messages"][-1].text)
# "The weather in Tokyo is 22°C and Sunny with 45% humidity."
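
The client-side loop described above can be pictured with a plain-Python sketch. This is not LangChain's implementation: run_tool_loop, scripted_model, and the dict-based message shape are all hypothetical stand-ins chosen so the example runs without a model or network access.

```python
def run_tool_loop(model, tools, messages, max_turns=5):
    """Invoke the model until it returns a turn with no tool calls."""
    for _ in range(max_turns):
        reply = model(messages)
        messages.append(reply)
        if not reply.get("tool_calls"):
            return reply["content"]  # no further tool calls: the loop terminates
        for call in reply["tool_calls"]:
            # Execute each requested tool locally and append its output.
            result = tools[call["name"]](**call["args"])
            messages.append(
                {"role": "tool", "tool_call_id": call["id"], "content": result}
            )
    raise RuntimeError("tool loop did not finish within max_turns")

def get_weather(location: str) -> str:
    return f"Weather in {location}: 22°C, Sunny"

def scripted_model(messages):
    """Stand-in for an LLM: requests the tool once, then answers from its output."""
    tool_messages = [m for m in messages if m["role"] == "tool"]
    if not tool_messages:
        return {"role": "assistant", "content": "", "tool_calls": [
            {"id": "call_1", "name": "get_weather", "args": {"location": "Tokyo"}}
        ]}
    return {"role": "assistant", "content": tool_messages[-1]["content"],
            "tool_calls": []}

history = [{"role": "user", "content": "What's the weather like in Tokyo?"}]
answer = run_tool_loop(scripted_model, {"get_weather": get_weather}, history)
print(answer)
```

The max_turns cap is a design choice of this sketch: it bounds the loop if a model keeps requesting tools indefinitely.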

Function Tools with Streaming

Function tool calling also works with streaming. The stream includes events for function call arguments as they’re generated:

  • Python (OpenAI SDK)

stream = client.responses.create(
    model="qwen3-32b-tool",
    input="What's the weather like in Tokyo?",
    instructions="Use the get_weather function.",
    tools=tools,
    stream=True,
)

function_call_name = None
function_call_id = None
function_call_arguments = ""
response_id = None

for event in stream:
    if event.type == "response.output_item.added":
        if hasattr(event, "item") and event.item.type == "function_call":
            function_call_name = event.item.name
            function_call_id = event.item.call_id
    elif event.type == "response.function_call_arguments.delta":
        function_call_arguments += event.delta
    elif event.type == "response.completed":
        response_id = event.response.id

# Now execute the function and send the result back (same as non-streaming)
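
The loop above tracks a single function call. If the model emits several calls in one response, the deltas need to be routed to the right call. The sketch below keys each call by output_index, on the assumption that the server populates output_index on both event types as in the OpenAI event schema. collect_function_calls is a hypothetical helper, and SimpleNamespace objects stand in for real stream events.

```python
from types import SimpleNamespace

def collect_function_calls(events):
    """Accumulate streamed function-call arguments, one entry per output index."""
    calls = {}
    for event in events:
        if (event.type == "response.output_item.added"
                and event.item.type == "function_call"):
            calls[event.output_index] = {
                "name": event.item.name,
                "call_id": event.item.call_id,
                "arguments": "",
            }
        elif event.type == "response.function_call_arguments.delta":
            if event.output_index in calls:
                calls[event.output_index]["arguments"] += event.delta
    return [calls[index] for index in sorted(calls)]

# Fake events standing in for a real stream.
fake_events = [
    SimpleNamespace(
        type="response.output_item.added", output_index=0,
        item=SimpleNamespace(type="function_call", name="get_weather",
                             call_id="call_abc123"),
    ),
    SimpleNamespace(type="response.function_call_arguments.delta",
                    output_index=0, delta='{"location": '),
    SimpleNamespace(type="response.function_call_arguments.delta",
                    output_index=0, delta='"Tokyo"}'),
    SimpleNamespace(type="response.completed",
                    response=SimpleNamespace(id="resp_001")),
]
calls = collect_function_calls(fake_events)
print(calls)
```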

MCP Tools (Server-Executed)

MCP (Model Context Protocol) tools are executed on the server side. You provide a tool definition with a server_url pointing to an MCP-compatible endpoint, and the server handles discovery, execution, and result feeding.

Basic MCP Tool Call

The examples below connect to external MCP servers and require outbound network access.

  • curl

  • Python (OpenAI SDK)

  • Python (PydanticAI)

  • Python (LangGraph)

curl -X POST $BASE_URL/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $AA_TOKEN" \
  -d '{
    "model": "qwen3-32b-tool",
    "input": "What is the Microsoft Services Agreement?",
    "instructions": "You must use the mcp call at least once.",
    "tools": [
      {
        "type": "mcp",
        "server_url": "https://learn.microsoft.com/api/mcp",
        "server_label": "microsoft_docs",
        "require_approval": "never"
      }
    ],
    "tool_choice": "auto"
  }'

Python (OpenAI SDK):

tools = [
    {
        "type": "mcp",
        "server_url": "https://learn.microsoft.com/api/mcp",
        "server_label": "microsoft_docs",
        "require_approval": "never",
    }
]

response = client.responses.create(
    model="qwen3-32b-tool",
    input="What is the Microsoft Services Agreement?",
    instructions="You must use the mcp call at least once.",
    tools=tools,
    timeout=300,
)

print(response.output_text)

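With PydanticAI, MCP tools are passed via model_settings because they are server-executed rather than client-side. Assumes provider is configured as shown in Getting Started.

from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIResponsesModel

mcp_tools = [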
    {
        "type": "mcp",
        "server_url": "https://learn.microsoft.com/api/mcp",
        "server_label": "microsoft_docs",
        "require_approval": "never",
    }
]

agent = Agent(
    model=OpenAIResponsesModel("qwen3-32b-tool", provider=provider),
    system_prompt="You must use the mcp call to search Microsoft documentation.",
    model_settings={
        "tools": mcp_tools,
        "timeout": 300,
    },
)

# `await` needs an async context (an async function or a notebook cell).
result = await agent.run(
    "What is the Microsoft Services Agreement?"
)
print(result.output)

With LangGraph, bind the MCP tool definition to the LLM with bind_tools. Native Responses API tool types (mcp, web_search, etc.) are passed through unchanged. Assumes llm is configured as shown in Getting Started.

from langchain_core.messages import HumanMessage, SystemMessage

mcp_tools = [
    {
        "type": "mcp",
        "server_url": "https://learn.microsoft.com/api/mcp",
        "server_label": "microsoft_docs",
        "require_approval": "never",
    }
]

llm_with_mcp = llm.bind_tools(mcp_tools)

response = llm_with_mcp.invoke([
    SystemMessage("You must use the mcp call at least once."),
    HumanMessage("What is the Microsoft Services Agreement?"),
])
print(response.text)

MCP Response Structure

When MCP tools are used, the output contains additional item types:

Output Type       Description
mcp_list_tools    Tools discovered from the MCP server
mcp_call          The tool invocation with name and arguments
message           The model’s final response after incorporating tool results
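
As a rough illustration of reading these item types, the sketch below walks an output list and groups it by type. The sample data is made up, the tool name search_docs is hypothetical, and the tools and content field layouts are assumed to follow the OpenAI Responses schema; check them against an actual response from your server.

```python
def summarize_mcp_output(output_items):
    """Group MCP-related output items into discovered tools, calls, and final text."""
    summary = {"discovered_tools": [], "calls": [], "final_text": ""}
    for item in output_items:
        kind = item.get("type")
        if kind == "mcp_list_tools":
            # Assumed layout: a "tools" list whose entries carry a "name".
            summary["discovered_tools"] += [t["name"] for t in item.get("tools", [])]
        elif kind == "mcp_call":
            summary["calls"].append((item.get("name"), item.get("arguments")))
        elif kind == "message":
            # Assumed layout: content parts of type "output_text" with a "text" field.
            for part in item.get("content", []):
                if part.get("type") == "output_text":
                    summary["final_text"] += part.get("text", "")
    return summary

# Made-up sample output, for illustration only.
sample_output = [
    {"type": "mcp_list_tools", "tools": [{"name": "search_docs"}]},
    {"type": "mcp_call", "name": "search_docs", "arguments": "{\"query\": \"example\"}"},
    {"type": "message", "content": [{"type": "output_text", "text": "Summary text."}]},
]
summary = summarize_mcp_output(sample_output)
print(summary)
```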

MCP with Approval Flow

For sensitive operations, set require_approval: "always". This triggers a three-step flow:

  1. The first response returns an mcp_approval_request: the model wants to call a tool but needs your permission.

  2. You send back an mcp_approval_response approving (or denying) the call.

  3. The server executes the tool and the model generates its final response.
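
The decision in step 2 can be expressed as a small policy function. The sketch below approves a request only if the requested tool is on a local allowlist. build_approval_response and the request id are hypothetical, and only the decision fields are shown; the SDK example in this section additionally sets an id on the response item and echoes the original request.

```python
def build_approval_response(approval_request: dict, approved_tools: set) -> dict:
    """Approve the request only when its tool name is on the local allowlist."""
    return {
        "type": "mcp_approval_response",
        "approval_request_id": approval_request["id"],
        "approve": approval_request["name"] in approved_tools,
    }

# Illustrative request; the id value is made up.
request = {
    "type": "mcp_approval_request",
    "id": "mcpr_example",
    "name": "search_tiktoken_documentation",
    "server_label": "gitmcp",
    "arguments": "{}",
}

approved = build_approval_response(request, {"search_tiktoken_documentation"})
denied = build_approval_response(request, set())
print(approved["approve"], denied["approve"])
```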

  • Python (OpenAI SDK)

tools = [
    {
        "type": "mcp",
        "server_url": "https://gitmcp.io/openai/tiktoken",
        "server_label": "gitmcp",
        "allowed_tools": ["search_tiktoken_documentation"],
        "require_approval": "always",
    }
]

# Step 1: Initial request, which returns an approval request
response1 = client.responses.create(
    model="qwen3-32b-tool",
    input="What is tiktoken?",
    instructions="You must use the mcp tools.",
    tools=tools,
    stream=True,
)

# Collect the approval request from stream events
approval_request = None
response_id_1 = None
for event in response1:
    if event.type == "response.output_item.added":
        if event.item.type == "mcp_approval_request":
            approval_request = event.item
    elif event.type == "response.completed":
        response_id_1 = event.response.id

# Step 2: Approve the tool call
response2 = client.responses.create(
    model="qwen3-32b-tool",
    input=[
        {
            "type": "mcp_approval_request",
            "id": approval_request.id,
            "name": approval_request.name,
            "server_label": approval_request.server_label,
            "arguments": getattr(approval_request, "arguments", ""),
        },
        {
            "type": "mcp_approval_response",
            "id": f"mcpa_approval_{approval_request.id[-12:]}",
            "approval_request_id": approval_request.id,
            "approve": True,
        },
        {
            "type": "message",
            "role": "user",
            "content": "What is tiktoken?",
        },
    ],
    tools=tools,
    previous_response_id=response_id_1,
    stream=True,
)
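
Because response2 is also a stream, the final answer has to be read from its events. A minimal sketch, assuming the server emits response.output_text.delta events as in the OpenAI event schema; collect_output_text is a hypothetical helper, and SimpleNamespace objects stand in for the real stream so the example runs offline.

```python
from types import SimpleNamespace

def collect_output_text(stream):
    """Concatenate streamed text deltas and capture the completed response id."""
    parts = []
    response_id = None
    for event in stream:
        if event.type == "response.output_text.delta":
            parts.append(event.delta)
        elif event.type == "response.completed":
            response_id = event.response.id
    return "".join(parts), response_id

# Fake events standing in for the real response2 stream.
fake_stream = [
    SimpleNamespace(type="response.output_text.delta", delta="tiktoken is "),
    SimpleNamespace(type="response.output_text.delta", delta="a tokenizer."),
    SimpleNamespace(type="response.completed",
                    response=SimpleNamespace(id="resp_002")),
]
text, response_id_2 = collect_output_text(fake_stream)
print(text)
```

With a real client, the call would be collect_output_text(response2).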

require_approval Options

Value       Behavior
"never"     Tools execute immediately without approval. Use for non-interactive flows.
"always"    Every tool call requires explicit approval before execution.
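
To avoid typos in the tool definition, the fields used on this page can be assembled by a small builder that rejects unknown require_approval values. mcp_tool is a hypothetical helper, not part of any SDK, and it accepts only the two values documented above.

```python
VALID_APPROVAL_MODES = {"never", "always"}

def mcp_tool(server_url, server_label, require_approval="always", allowed_tools=None):
    """Build an MCP tool definition dict, validating require_approval."""
    if require_approval not in VALID_APPROVAL_MODES:
        raise ValueError(
            f"require_approval must be one of {sorted(VALID_APPROVAL_MODES)}, "
            f"got {require_approval!r}"
        )
    tool = {
        "type": "mcp",
        "server_url": server_url,
        "server_label": server_label,
        "require_approval": require_approval,
    }
    if allowed_tools is not None:
        tool["allowed_tools"] = list(allowed_tools)  # restrict which tools may be called
    return tool

docs_tool = mcp_tool(
    "https://learn.microsoft.com/api/mcp", "microsoft_docs", require_approval="never"
)
print(docs_tool)
```

Defaulting to "always" is a choice made in this sketch so that unattended execution has to be requested explicitly.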