"Tool calling" from LLM. Understanding hot it works

Roman Gelembjuk

I am interested in learning how LLMs can understand requests requiring a "tool call".

In the post ""Tool Calling" and Ollama", there is a nice description of how tool calling works with Ollama.

The idea of this feature is that LLMs can have access to some tools (aka external APIs) and can call them to get extra information. To be able to do this, the LLM has to understand the current request, determine that it could be forwarded to a tool, and extract the arguments for the call. The canonical example is "What is the weather in …", which requires access to live data that the LLM does not have inside it.

Here is a shortened version of the code from the original article:

#!/bin/bash
SERVICE_URL="http://localhost:11434"

# Read the JSON request body into $DATA
# (read -d '' exits non-zero at end of input; that is expected here)
read -r -d '' DATA <<- EOM
{
  "model": "llama3.1",
  "messages": [
    {
      "role": "user",
      "content": "This is Bob. We are doing math. Help us to add 2 and 3. BTW. Say hello to him"
    }
  ],
  "stream": false,
  "tools": [
    {
      "function": {
        "description": "Say hello to a given person with his name",
        "name": "say_hello",
        "parameters": {
          "properties": {
            "name": {
              "description": "The name of the person",
              "type": "string"
            }
          },
          "required": [
            "name"
          ],
          "type": "object"
        }
      },
      "type": "function"
    },
    {
      "function": {
        "description": "Add two numbers",
        "name": "add_numbers",
        "parameters": {
          "properties": {
            "number1": {
              "description": "The first number",
              "type": "number"
            },
            "number2": {
              "description": "The second number",
              "type": "number"
            }
          },
          "required": [
            "number1",
            "number2"
          ],
          "type": "object"
        }
      },
      "type": "function"
    }
  ]
}
EOM

# Send the chat request with the tool definitions and pretty-print the JSON response
curl --no-buffer ${SERVICE_URL}/api/chat \
    -H "Content-Type: application/json" \
    -d "${DATA}" | jq '.'

I wanted to understand how reliably an LLM can recognize that a given message should be forwarded to a tool.

Additionally, it is interesting to know how an LLM selects a tool when the message could be interpreted as matching more than one tool.

And what if the message is not just short and simple, but also requires a "normal" text response?

I tested three models: qwen2.5:1.5b, mistral-nemo, and llama3.1 to see if the behavior is consistent or varies depending on the model.

Finally, I tried to determine if the results are always the same for a given message by repeating each request 100 times.

To automate this task, I created a small Go application.
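
The Go code itself is not shown here, but the idea of the loop is simple. A rough shell sketch of the same experiment (my own sketch, assuming SERVICE_URL and DATA from the script above are still set and jq is installed):

#!/bin/bash
# Repeat one prompt RUNS times and tally text responses vs. tool calls.
# Assumes SERVICE_URL and DATA from the script above are set, and jq is installed.
RUNS=100
MESSAGE="Say hello to Bob"

text=0; hello=0; add=0
for _ in $(seq 1 "$RUNS"); do
    # Put the current test message into the request body
    BODY=$(jq --arg msg "$MESSAGE" '.messages[0].content = $msg' <<< "$DATA")
    # Collect the names of all tools called in this response (empty if none)
    CALLS=$(curl -s "${SERVICE_URL}/api/chat" \
        -H "Content-Type: application/json" \
        -d "${BODY}" \
        | jq -r '[.message.tool_calls // [] | .[].function.name] | join(",")')
    [ -z "$CALLS" ] && text=$((text + 1))
    case "$CALLS" in *say_hello*)   hello=$((hello + 1));; esac
    case "$CALLS" in *add_numbers*) add=$((add + 1));; esac
done

echo "text: $text  say_hello: $hello  add_numbers: $add"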

Here is the list of user messages I sent to each LLM:

  • Hello
  • Say hello to Bob
  • Add 2 and 3
  • I came with my friend Bob. We will stay for 2 days. Say hello to him
  • This is Bob. He needs to add 2 and 3. Say hello to him
  • Help us to add 2 and 3
  • This is Bob. We are doing math. Help us to add 2 and 3. BTW. Say hello to him
  • We need to know what is the smallest natural number. Can you tell us?
  • This is Bob. We are doing math. Help us to add 2 and 3. BTW. Say hello to him. And we need to know what is the smallest natural number. Can you tell us?

Results for "qwen2.5:1.5b"

[Table: for each of the messages above, the counts of text responses, say_hello calls, and add_numbers calls over 100 runs.]

There are some differences for other models.

| User message | qwen2.5:1.5b | mistral-nemo | llama3.1 |
| --- | --- | --- | --- |
| Hello | Text | Text | say_hello call |
| Say hello to Bob | say_hello call | say_hello call | say_hello call |
| Add 2 and 3 | add_numbers call | add_numbers call | add_numbers call |
| I came with my friend Bob. We will stay for 2 days. Say hello to him | say_hello call | say_hello call | say_hello call |
| This is Bob. He needs to add 2 and 3. Say hello to him | say_hello call | say_hello + add_numbers call | say_hello + add_numbers call |
| Help us to add 2 and 3 | add_numbers call | add_numbers call | add_numbers call |
| This is Bob. We are doing math. Help us to add 2 and 3. BTW. Say hello to him | say_hello + add_numbers call | say_hello + add_numbers call | say_hello + add_numbers call |
| We need to know what is the smallest natural number. Can you tell us? | Text | Text | add_numbers call |
| This is Bob. We are doing math. Help us to add 2 and 3. BTW. Say hello to him. And we need to know what is the smallest natural number. Can you tell us? | say_hello + add_numbers call | say_hello + add_numbers call | say_hello + add_numbers call |

Unexpected results

  • llama3.1 calls say_hello for the prompt "Hello". The "name" argument is "Hello", and sometimes empty.

  • llama3.1 recognizes the prompt "We need to know what is the smallest natural number. Can you tell us?" as an add_numbers call with arguments 0 and 1. Why?

Conclusions

  • For a single prompt, a tool call and a text response do not happen together: either content is returned or a tool is called. (This is not proven, just what I observed.) A client can rely on this by branching on the response, as sketched below.

  • For a single prompt, an LLM can call more than one tool.
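
A minimal sketch of that branching, assuming the non-streaming /api/chat response has been saved to resp.json (the file name is mine, for illustration):

# Branch on tool calls vs. a plain text answer
if jq -e '.message.tool_calls | length > 0' resp.json > /dev/null; then
    # Dispatch each requested tool by name
    jq -r '.message.tool_calls[].function.name' resp.json
else
    # Ordinary text answer
    jq -r '.message.content' resp.json
fi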
