What is the key structural difference between OpenAI function calling and Anthropic tool use?

Anthropic returns tool calls as tool_use content blocks inside the assistant message, where they can be mixed with text blocks, whereas OpenAI puts them in a separate tool_calls field (function_call in the legacy API). This means your code must iterate through all content blocks to identify tool_use entries and return corresponding tool_result blocks in the next message. The upside is that the model can naturally interleave explanation text with tool calls, but it requires more careful response parsing.

How much latency can parallel tool calls save in a multi-source agent?

Both OpenAI and Anthropic support parallel tool calls, where the model returns multiple tool_calls entries or tool_use blocks in a single response and you execute them concurrently. For an agent that needs 3 data sources, sequential calls require 3 tool round trips, while parallel calls reduce this to 1 round trip for tool selection plus 1 for results, which the post puts at roughly a 60% latency reduction.

How should production code handle tool execution failures?

Rather than propagating exceptions, return a structured error object inside the tool result containing an error_type, a message, and a retry_suggested boolean. This lets the model decide whether to retry, fall back to an alternative tool, or explain the failure to the user. The post also recommends implementing exponential backoff for transient failures before surfacing an error to the model.

Why can tool definitions significantly increase API costs at scale, and how can this be mitigated?

Tool definitions are counted as input tokens on every API call even when the model never invokes those tools. A set of 20 detailed tool definitions can add 2,000–5,000 tokens per call, which at GPT-4o pricing translates to $0.005–$0.012 extra per call — small individually but significant at high volume. The recommended mitigation is tool routing: run a pre-classification step with a cheap model such as GPT-4o-mini or Claude Haiku to identify relevant tools, then pass only that subset to the expensive model.

When should you choose Anthropic tool use over OpenAI function calling for a new project?

The post recommends Anthropic tool use when you need the most capable model (Claude Opus 4) or when your use case benefits from the model mixing explanations naturally alongside tool calls. OpenAI function calling is preferred when you need the broadest agent framework ecosystem compatibility, strict JSON mode for structured output, or support for fine-tuned models. For systems that may need to switch providers, building a thin abstraction layer supporting both APIs is advised.

Function Calling Patterns: OpenAI vs Anthropic Tool Use in Production

Function calling (or tool use, depending on which API you are using) is the feature that transforms LLMs from text generators into agents capable of taking real-world actions. I have implemented function calling integrations with both OpenAI and Anthropic APIs for production systems — ERP data lookups, external API calls, database writes, and multi-step orchestration. The APIs look similar on the surface but have meaningful differences in how they handle parallel calls, required vs optional tool use, and error reporting. This post compares them concretely and covers the production patterns that matter.

The Core Difference: Philosophy

OpenAI's function calling (now exposed through the tools parameter, which replaced the deprecated functions parameter) and Anthropic's tool use are both based on the same concept: you define a set of functions with JSON Schema descriptions, the model decides when to call them, and you execute the calls and return results. The philosophical difference is in control. OpenAI offers tool_choice values of 'auto' (model decides), 'required' (force the model to use a tool), 'none' (no tool calls), or forcing a specific tool. Anthropic offers tool_choice with 'auto', 'any' (must use at least one tool), and specific tool forcing. Both support parallel tool calls, but Anthropic's implementation requires careful handling of the content block structure.
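As a rough sketch, the tool_choice options map onto each other as shown here. The type names and the Intent helper are my own illustration, not SDK exports; only the literal values follow the request shapes of the two APIs.

```typescript
// Illustrative types only: they mirror the request shapes, not the SDKs' own type names.
type OpenAIToolChoice =
  | "auto" // model decides whether to call a tool
  | "required" // model must call at least one tool
  | { type: "function"; function: { name: string } }; // force one specific tool

type AnthropicToolChoice =
  | { type: "auto" } // model decides
  | { type: "any" } // must use at least one tool
  | { type: "tool"; name: string }; // force one specific tool

// Hypothetical provider-neutral intent used by the calling code.
type Intent = "model_decides" | "must_use_a_tool" | { forceTool: string };

function toOpenAI(intent: Intent): OpenAIToolChoice {
  if (intent === "model_decides") return "auto";
  if (intent === "must_use_a_tool") return "required";
  return { type: "function", function: { name: intent.forceTool } };
}

function toAnthropic(intent: Intent): AnthropicToolChoice {
  if (intent === "model_decides") return { type: "auto" };
  if (intent === "must_use_a_tool") return { type: "any" };
  return { type: "tool", name: intent.forceTool };
}
```

Keeping the intent separate from the provider-specific value makes it easier to switch providers later without touching the calling code.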

Anthropic Tool Use: The Content Block Pattern

Anthropic returns tool calls as content blocks inside the assistant message rather than in a separate tool_calls field. The response may contain mixed text and tool_use blocks. Your code must iterate through the content blocks, identify tool_use blocks, execute each, and return a tool_result block for each one in the next message. This structure lets the model naturally interleave explanation text with tool calls, which is a UX advantage, but it requires more careful response parsing than OpenAI's simpler structure.

┌─────────────────────────────────────────────────────────────┐
│         Parallel Tool Call Flow (Anthropic)                  │
│                                                             │
│  User: "What is the status of order #123 and invoice #456?" │
│                          │                                  │
│                          ▼                                  │
│  Model returns TWO tool_use blocks in one response:         │
│  ┌────────────────────┐  ┌────────────────────┐           │
│  │ get_order_status   │  │ get_invoice_status │           │
│  │ id: "tool_abc"     │  │ id: "tool_def"     │           │
│  │ order_id: "123"    │  │ invoice_id: "456"  │           │
│  └─────────┬──────────┘  └──────────┬─────────┘           │
│            │ parallel execution      │                      │
│            ▼                         ▼                      │
│  ┌────────────────────────────────────────────┐            │
│  │  Promise.all([getOrder(123), getInv(456)]) │            │
│  └────────────────────┬───────────────────────┘            │
│                       │                                     │
│                       ▼                                     │
│  Return BOTH tool_result blocks → Model synthesizes answer  │
└─────────────────────────────────────────────────────────────┘
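The content-block handling described above can be sketched without the SDK. The block types below are minimal local stand-ins for the real response types, and the example content is a hypothetical response shaped like the one in the diagram.

```typescript
// Minimal local stand-ins for the response shapes; the real SDK exports richer types.
type TextBlock = { type: "text"; text: string };
type ToolUseBlock = { type: "tool_use"; id: string; name: string; input: unknown };
type ContentBlock = TextBlock | ToolUseBlock;
type ToolResultBlock = { type: "tool_result"; tool_use_id: string; content: string };

// Separate the explanation text from the tool calls in one assistant turn.
function splitContent(content: ContentBlock[]): { text: string; toolCalls: ToolUseBlock[] } {
  const text = content
    .filter((b): b is TextBlock => b.type === "text")
    .map((b) => b.text)
    .join("\n");
  const toolCalls = content.filter((b): b is ToolUseBlock => b.type === "tool_use");
  return { text, toolCalls };
}

// Build the follow-up user message: one tool_result per tool_use, matched by id.
function buildToolResultMessage(
  toolCalls: ToolUseBlock[],
  results: Map<string, unknown>,
): { role: "user"; content: ToolResultBlock[] } {
  return {
    role: "user",
    content: toolCalls.map((call) => ({
      type: "tool_result" as const,
      tool_use_id: call.id,
      // A missing result is serialized as null so every tool_use still gets an answer.
      content: JSON.stringify(results.get(call.id) ?? null),
    })),
  };
}

// Hypothetical assistant turn: one text block plus two tool_use blocks.
const exampleContent: ContentBlock[] = [
  { type: "text", text: "Let me check both." },
  { type: "tool_use", id: "tool_abc", name: "get_order_status", input: { order_id: "123" } },
  { type: "tool_use", id: "tool_def", name: "get_invoice_status", input: { invoice_id: "456" } },
];
```

The important detail is the id matching: every tool_use block needs a tool_result carrying the same tool_use_id in the next user message.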

From my experience building multi-tool agents: implement a generic tool execution dispatcher from the start, not per-tool if/else chains. Map tool names to handler functions in a dictionary, validate inputs against the schema before executing, and wrap every handler in try/catch with structured error returns. This pattern scales to dozens of tools without code duplication and makes adding new tools a one-line registration.
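A minimal sketch of such a dispatcher follows. The required-field check is a simplified stand-in for full JSON Schema validation, and registerTool, dispatch, and the get_order handler are hypothetical names chosen for illustration.

```typescript
type ToolHandler = (input: Record<string, unknown>) => Promise<unknown>;

interface ToolRegistration {
  required: string[]; // simplified stand-in for full JSON Schema validation
  handler: ToolHandler;
}

type ToolOutcome =
  | { ok: true; result: unknown }
  | { ok: false; error_type: string; message: string; retry_suggested: boolean };

const registry = new Map<string, ToolRegistration>();

function registerTool(name: string, required: string[], handler: ToolHandler): void {
  registry.set(name, { required, handler });
}

async function dispatch(name: string, input: unknown): Promise<ToolOutcome> {
  const tool = registry.get(name);
  if (!tool) {
    return { ok: false, error_type: "unknown_tool", message: `No handler for ${name}`, retry_suggested: false };
  }
  if (typeof input !== "object" || input === null) {
    return { ok: false, error_type: "invalid_input", message: "Input must be an object", retry_suggested: false };
  }
  const args = input as Record<string, unknown>;
  // Validate before executing: reject calls that omit required fields.
  const missing = tool.required.filter((key) => !(key in args));
  if (missing.length > 0) {
    return { ok: false, error_type: "invalid_input", message: `Missing: ${missing.join(", ")}`, retry_suggested: false };
  }
  try {
    return { ok: true, result: await tool.handler(args) };
  } catch (err) {
    return { ok: false, error_type: "execution_failed", message: String(err), retry_suggested: true };
  }
}

// Adding a tool is a one-line registration (get_order is a hypothetical example).
registerTool("get_order", ["order_id"], async (args) => ({ order_id: args.order_id, status: "shipped" }));
```

Because dispatch never throws, the calling loop can serialize every outcome straight into a tool result.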

Parallel Tool Calls: A Production Game-Changer

Both APIs support parallel tool calls: the model returns multiple tool_calls entries or tool_use blocks in a single response, and you execute them concurrently before returning all results. This dramatically reduces latency for operations that can run in parallel: fetching data from multiple tables, calling multiple external APIs, or running independent computations. Without parallel tool calls, an agent needing 3 data sources makes 3 tool round trips before its final answer, four model calls in total. With parallel calls, it is 1 model call for tool selection plus 1 for the final answer, and the three tools run side by side. That works out to a latency reduction of roughly 60%, with the exact figure depending on how slow the tools are relative to the model.
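To make the latency claim concrete, here is a back-of-the-envelope model. It assumes every model call takes the same time and every tool call takes the same time, which real systems will not match exactly; the millisecond values are made-up inputs, not measurements.

```typescript
// Simplified latency model: every model call costs modelMs, every tool call costs toolMs.
function sequentialLatency(nTools: number, modelMs: number, toolMs: number): number {
  // One model call per tool, each followed by that tool, then a final answer call.
  return nTools * (modelMs + toolMs) + modelMs;
}

function parallelLatency(modelMs: number, toolMs: number): number {
  // One call to select tools, all tools run concurrently, then one final answer call.
  return modelMs + toolMs + modelMs;
}

function reductionPct(nTools: number, modelMs: number, toolMs: number): number {
  const seq = sequentialLatency(nTools, modelMs, toolMs);
  const par = parallelLatency(modelMs, toolMs);
  return Math.round(((seq - par) / seq) * 100);
}

// Assumed inputs: 3 tools, 1000 ms per model call, 1000 ms per tool call.
console.log(reductionPct(3, 1000, 1000)); // 57
```

Under this model the saving for three tools ranges from 50% when tools are instant to about 67% when tools dominate, so a figure around 60% is plausible for typical workloads.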

Error Handling and Retry Pattern

Production tool execution will fail. Network timeouts, database errors, external API rate limits — all of these must be communicated back to the model so it can handle them gracefully. Return a structured error in the tool result rather than propagating exceptions. Include: error_type, message, and a retry_suggested boolean. This lets the model decide whether to retry, use an alternative tool, or explain the failure to the user. Implement exponential backoff for transient failures before returning an error to the model.
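One way to combine backoff with the structured error is sketched below. How a failure is recognized as transient is an assumption here (a transient flag on the thrown error); real code would inspect status codes or error classes from the client library in use. The sleep parameter is injectable so the retry schedule can be tested without waiting.

```typescript
interface ToolError {
  error_type: string;
  message: string;
  retry_suggested: boolean;
}

type Sleep = (ms: number) => Promise<void>;
const defaultSleep: Sleep = (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Assumption: transient failures (timeouts, rate limits) carry a `transient: true` flag.
function isTransient(err: unknown): boolean {
  return typeof err === "object" && err !== null && (err as { transient?: boolean }).transient === true;
}

async function callWithBackoff<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 200,
  sleep: Sleep = defaultSleep,
): Promise<{ result: T } | { error: ToolError }> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return { result: await fn() };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      if (!isTransient(err)) {
        // Permanent failure: retrying will not help, so tell the model immediately.
        return { error: { error_type: "permanent", message, retry_suggested: false } };
      }
      if (attempt === maxAttempts) {
        // Retries exhausted: surface the error and let the model decide what to do next.
        return { error: { error_type: "transient", message, retry_suggested: true } };
      }
      await sleep(baseDelayMs * 2 ** (attempt - 1)); // 200, 400, 800, ...
    }
  }
  // Only reached when maxAttempts < 1, so nothing was attempted.
  return { error: { error_type: "not_attempted", message: "maxAttempts must be at least 1", retry_suggested: false } };
}
```

The model only sees an error after the cheap retries have been exhausted, which keeps transient blips out of the conversation.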

import Anthropic from "@anthropic-ai/sdk"

const client = new Anthropic()

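// Generic tool dispatcher.
// fetchOrder, fetchInvoice, fetchProducts, toolDescriptions, and toolSchemas
// are assumed to be defined elsewhere in the application.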
const toolHandlers: Record<string, (params: unknown) => Promise<unknown>> = {
  get_order: async (p) => fetchOrder(p as { order_id: string }),
  get_invoice: async (p) => fetchInvoice(p as { invoice_id: string }),
  list_products: async (p) => fetchProducts(p as { category?: string }),
}

// Agentic loop with parallel tool execution
async function runAgent(userMessage: string, maxIterations = 10) {
  const messages: Anthropic.Messages.MessageParam[] = [
    { role: "user", content: userMessage }
  ]

  for (let i = 0; i < maxIterations; i++) {
    const response = await client.messages.create({
      model: "claude-opus-4-5",
      max_tokens: 4096,
      tools: Object.keys(toolHandlers).map(name => ({
        name,
        description: toolDescriptions[name],
        input_schema: toolSchemas[name],
      })),
      messages,
    })

    // Collect all tool_use blocks
    const toolUseBlocks = response.content.filter(b => b.type === "tool_use")

    if (toolUseBlocks.length === 0 || response.stop_reason === "end_turn") {
      // No more tool calls: return the final text, joining every text block
      return response.content
        .map(b => (b.type === "text" ? b.text : ""))
        .join("")
    }

    // Execute all tool calls in PARALLEL
    const toolResults = await Promise.all(
      toolUseBlocks.map(async (block) => {
        if (block.type !== "tool_use") return null
        try {
          const handler = toolHandlers[block.name]
          if (!handler) throw new Error(`Unknown tool: ${block.name}`)
          const result = await handler(block.input)
          return { type: "tool_result" as const, tool_use_id: block.id, content: JSON.stringify(result) }
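        } catch (err) {
          // Structured error: lets the model retry, switch tools, or explain the failure
          return {
            type: "tool_result" as const,
            tool_use_id: block.type === "tool_use" ? block.id : "",
            content: JSON.stringify({
              error_type: "tool_execution_failed",
              message: err instanceof Error ? err.message : String(err),
              retry_suggested: true,
            }),
            is_error: true,
          }
        }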
      })
    )

    // Append assistant response + all tool results to messages
    messages.push({ role: "assistant", content: response.content })
    messages.push({ role: "user", content: toolResults.filter(Boolean) as Anthropic.Messages.ToolResultBlockParam[] })
  }

  throw new Error("Max iterations reached")
}

Structured Output Extraction Pattern

A powerful pattern that is often overlooked: use tool calling purely for structured output extraction, without any actual function execution. Define a tool with the schema of the structure you want to extract from text, set tool_choice to force that specific tool, and the model will respond with a tool call whose arguments follow your schema. This is more reliable than asking the model to return JSON in its text response, because the schema is part of the API request rather than a prompt instruction. Unless the provider's strict schema mode is enabled, still validate the arguments before using them, since a required field can occasionally be missing.
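A sketch of the extraction pattern against Anthropic's request shape follows. The extract_contact tool, its fields, and the prompt wording are hypothetical; the request is only built here, not sent, and parseContact checks the returned input before trusting it.

```typescript
// Hypothetical extraction schema, expressed as a tool definition that is never executed.
const extractContactTool = {
  name: "extract_contact",
  description: "Record the contact details found in the text.",
  input_schema: {
    type: "object",
    properties: {
      name: { type: "string" },
      email: { type: "string" },
    },
    required: ["name", "email"],
  },
} as const;

interface Contact {
  name: string;
  email: string;
}

// Request body for the Messages API, forcing the extraction tool by name.
function buildExtractionRequest(text: string) {
  return {
    model: "claude-opus-4-5",
    max_tokens: 1024,
    tools: [extractContactTool],
    tool_choice: { type: "tool" as const, name: extractContactTool.name },
    messages: [{ role: "user" as const, content: `Extract the contact from: ${text}` }],
  };
}

// Pull the forced tool call's input out of the response content and validate it.
function parseContact(content: Array<{ type: string; name?: string; input?: unknown }>): Contact | null {
  const block = content.find((b) => b.type === "tool_use" && b.name === extractContactTool.name);
  const input = block?.input;
  if (typeof input !== "object" || input === null) return null;
  const { name, email } = input as Record<string, unknown>;
  if (typeof name !== "string" || typeof email !== "string") return null;
  return { name, email };
}
```

The tool's input is the extracted structure, so no handler is ever called and the loop ends after a single response.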

Tool definitions count as input tokens on every API call, even when the model does not use those tools. A set of 20 detailed tool definitions can add 2,000-5,000 tokens per call. At GPT-4o pricing (about $2.50 per million input tokens), that is $0.005-0.012 added per call: negligible at low volume but significant at scale. Optimization: use tool routing to only send relevant tool definitions based on the user's message intent. A pre-classification call with a cheap model (GPT-4o-mini, Claude Haiku) can route to the appropriate tool subset before calling the expensive model.
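The cost arithmetic and the routing idea can be sketched together. The price constant is the rate implied by the figures above and should be checked against current pricing; the intent labels, tool groups, and keyword classifier are hypothetical, with the classifier standing in for the cheap-model call.

```typescript
// Assumed price: 2.50 USD per million input tokens (verify against current pricing).
const USD_PER_MILLION_INPUT_TOKENS = 2.5;

// Extra spend caused purely by tool definitions in the prompt.
function toolOverheadUsd(toolTokens: number, calls: number): number {
  return (toolTokens / 1e6) * USD_PER_MILLION_INPUT_TOKENS * calls;
}

// Hypothetical tool groups keyed by intent label.
const toolsByIntent: Record<string, string[]> = {
  orders: ["get_order", "list_products"],
  billing: ["get_invoice"],
};

// Stand-in for the cheap-model classifier: in production this would be a call
// to a small model that returns one of the intent labels.
function classifyIntent(message: string): string {
  return /invoice|payment|bill/i.test(message) ? "billing" : "orders";
}

// Only the tools for the detected intent are sent to the expensive model.
function selectTools(message: string): string[] {
  return toolsByIntent[classifyIntent(message)] ?? [];
}

// 5,000 tokens of tool definitions across 100,000 calls costs about 1,250 USD.
console.log(toolOverheadUsd(5000, 100_000).toFixed(2));
```

Routing pays off only when the classifier call costs less than the tokens it removes, so it is worth measuring both sides before adopting it.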

Multi-Step Orchestration

Complex agents require multiple rounds of tool use. The pattern is an agentic loop: call model, check for tool use, execute tools, append results, call model again, repeat until the model returns a text-only response. Implement a maximum iteration limit (typically 10-15 steps) to prevent infinite loops. Log each iteration with the full state for debugging. For long-running tasks, persist the conversation state to a database between iterations so the agent can be interrupted and resumed.
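The persistence and resume part can be sketched as follows. The step function is a hypothetical placeholder for one model call plus tool execution, kept synchronous for brevity, and MemoryStore stands in for a real database.

```typescript
interface AgentState {
  iteration: number;
  messages: unknown[]; // full conversation so far
  done: boolean;
}

interface StateStore {
  save(taskId: string, state: AgentState): void;
  load(taskId: string): AgentState | undefined;
}

// In-memory stand-in; a real system would back this with a database.
class MemoryStore implements StateStore {
  private states = new Map<string, string>();
  save(taskId: string, state: AgentState): void {
    this.states.set(taskId, JSON.stringify(state));
  }
  load(taskId: string): AgentState | undefined {
    const raw = this.states.get(taskId);
    return raw === undefined ? undefined : (JSON.parse(raw) as AgentState);
  }
}

// Hypothetical: one model call plus tool execution. Returns the messages to append
// and whether the model produced a text-only (final) response.
type Step = (messages: unknown[]) => { newMessages: unknown[]; done: boolean };

function runResumable(
  taskId: string,
  initialMessages: unknown[],
  step: Step,
  store: StateStore,
  maxIterations = 10,
): AgentState {
  // Resume from the last checkpoint if one exists, otherwise start fresh.
  let state: AgentState = store.load(taskId) ?? { iteration: 0, messages: initialMessages, done: false };
  while (!state.done && state.iteration < maxIterations) {
    const { newMessages, done } = step(state.messages);
    state = { iteration: state.iteration + 1, messages: [...state.messages, ...newMessages], done };
    store.save(taskId, state); // checkpoint after every iteration
  }
  return state;
}
```

Because the checkpoint holds the full message list, a run that hit its iteration limit or was interrupted continues from the stored iteration instead of starting over.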

OpenAI vs Anthropic: My Recommendation

For new projects, I recommend Anthropic's tool use when you need the most capable model (Claude Opus 4) or when your use case benefits from the model naturally mixing explanation with tool calls. Use OpenAI's function calling when you need the widest ecosystem compatibility (most agent frameworks target OpenAI first), when you need JSON mode for strict structured output, or when you are using fine-tuned models. In practice, the patterns are similar enough that building an abstraction layer that supports both is worth the upfront investment for any system that might need to switch models.
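A thin abstraction layer can start as small as a normalized tool-call type with one adapter per provider. The interfaces below are minimal local stand-ins for the two response shapes, and NormalizedToolCall is a hypothetical name.

```typescript
// Provider-neutral representation used by the rest of the application.
interface NormalizedToolCall {
  id: string;
  name: string;
  args: unknown;
}

// Minimal local stand-ins for the two response shapes.
interface OpenAIMessage {
  content: string | null;
  tool_calls?: { id: string; type: "function"; function: { name: string; arguments: string } }[];
}

type AnthropicBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown };

function fromOpenAI(message: OpenAIMessage): NormalizedToolCall[] {
  // OpenAI delivers arguments as a JSON string that must be parsed.
  return (message.tool_calls ?? []).map((call) => ({
    id: call.id,
    name: call.function.name,
    args: JSON.parse(call.function.arguments),
  }));
}

function fromAnthropic(content: AnthropicBlock[]): NormalizedToolCall[] {
  // Anthropic delivers input as an already-parsed object inside a content block.
  const calls: NormalizedToolCall[] = [];
  for (const block of content) {
    if (block.type === "tool_use") {
      calls.push({ id: block.id, name: block.name, args: block.input });
    }
  }
  return calls;
}
```

With both adapters producing the same shape, the dispatcher and the agent loop can stay provider-agnostic, and only request building and result formatting need per-provider code.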
