Function Calling in Bash (with Osprey)


In a previous article (Creating AI Agents with Agentic Compose, Bash, Curl, Jq, and Gum), we saw how to use Docker Model Runner with Bash, Jq, Curl, and Gum through a small Bash library I named Osprey. In this article, we'll see how to use Osprey for "loop function calling", thanks to a very recent contribution from Ron Evans (@deadprogram.com) 🙌. But first, let's talk a bit about "function calling" and LLMs.
What is "function calling"?
"Function calling" for LLMs that support it (also called "tool calling"), is the ability for an LLM to detect/recognize in a user prompt the intention to "execute something", such as doing an addition, wanting to know the weather in a particular city, or wanting to do an Internet search.
LLMs that "support tools", if you provide them upstream with a catalog of tools in JSON (like the example below), will be able to make the connection between the user's intention and the corresponding tool. The LLM will then generate a JSON response with the tool name and the parameters to call it.
Example of tools catalog:
[
  {
    "type": "function",
    "function": {
      "name": "calculate_sum",
      "description": "Calculate the sum of two numbers",
      "parameters": {
        "type": "object",
        "properties": {
          "a": {
            "type": "number",
            "description": "The first number"
          },
          "b": {
            "type": "number",
            "description": "The second number"
          }
        },
        "required": ["a", "b"]
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "say_hello",
      "description": "Say hello to the given name",
      "parameters": {
        "type": "object",
        "properties": {
          "name": {
            "type": "string",
            "description": "The name to greet"
          }
        },
        "required": ["name"]
      }
    }
  }
]
But be careful: the LLM will never execute these functions (or tools) itself; it only generates JSON with the function name and the parameters to pass. It's up to you to implement the function execution.
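To make this concrete, here is a minimal sketch of what "executing" a tool call means on your side. The shape of the response and the dispatching logic are taken from the examples later in this article; RESPONSE is a hypothetical variable holding the chat completion JSON:
# Minimal dispatch sketch (RESPONSE is assumed to contain a chat completion
# like the ones shown later in this article)
TOOL_NAME=$(echo "${RESPONSE}" | jq -r '.choices[0].message.tool_calls[0].function.name')
TOOL_ARGS=$(echo "${RESPONSE}" | jq -r '.choices[0].message.tool_calls[0].function.arguments')

case "${TOOL_NAME}" in
  "calculate_sum")
    A=$(echo "${TOOL_ARGS}" | jq -r '.a')
    B=$(echo "${TOOL_ARGS}" | jq -r '.b')
    echo "result: $((A + B))"
    ;;
  "say_hello")
    echo "👋 Hello, $(echo "${TOOL_ARGS}" | jq -r '.name')!"
    ;;
esac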
First example of "function calling"
Very Important: Small local models are not all excellent at "function calling", so you'll need to test several models against your use case and select one with a good success rate. One of my colleagues, Ignasi Lopez Luna, wrote an excellent blog post on this subject: Tool Calling with Local LLMs: A Practical Evaluation.
Curl request for "function calling"
Note: Docker Model Runner uses Llama.cpp to run LLM models and provides a REST API compatible with OpenAI's API.
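Before running anything, you can quickly check that the endpoint answers. This assumes Docker Model Runner's host-side TCP access is enabled on its default port 12434 (as in the scripts below) and that the /models route is exposed alongside /chat/completions:
# Quick sanity check of the OpenAI-compatible endpoint
curl --silent http://localhost:12434/engines/llama.cpp/v1/models | jq '.data[].id'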
The script below does the following:
1. Defines a catalog of tools in JSON (calculate_sum and say_hello) with their name, description, and parameters.
2. Creates a user message: "Say hello to Bob and to Sam, make the sum of 5 and 37" (so, theoretically, there are 3 tool calls: say_hello twice and calculate_sum once).
3. Prepares the data to send to Docker Model Runner's REST API, where we specify the model to use, the options (like the temperature, ALWAYS AT ZERO for function calling), the user message, and of course the tools catalog.
4. Sends the request to Docker Model Runner's REST API with curl.
5. Displays the raw JSON result and extracts the tool calls.
DMR_BASE_URL="http://localhost:12434/engines/llama.cpp/v1"
MODEL="ai/qwen2.5:latest"

# Tools index in JSON format
read -r -d '' TOOLS <<- EOM
[
  {
    "type": "function",
    "function": {
      "name": "calculate_sum",
      "description": "Calculate the sum of two numbers",
      "parameters": {
        "type": "object",
        "properties": {
          "a": {
            "type": "number",
            "description": "The first number"
          },
          "b": {
            "type": "number",
            "description": "The second number"
          }
        },
        "required": ["a", "b"]
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "say_hello",
      "description": "Say hello to the given name",
      "parameters": {
        "type": "object",
        "properties": {
          "name": {
            "type": "string",
            "description": "The name to greet"
          }
        },
        "required": ["name"]
      }
    }
  }
]
EOM

USER_MESSAGE="Say hello to Bob and to Sam, make the sum of 5 and 37"

read -r -d '' DATA <<- EOM
{
  "model": "${MODEL}",
  "options": {
    "temperature": 0.0
  },
  "messages": [
    {
      "role": "user",
      "content": "${USER_MESSAGE}"
    }
  ],
  "tools": ${TOOLS}
}
EOM

# Remove newlines from DATA
DATA=$(echo ${DATA} | tr -d '\n')

JSON_RESULT=$(curl --silent ${DMR_BASE_URL}/chat/completions \
  -H "Content-Type: application/json" \
  -d "${DATA}"
)

echo -e "\n📝 Raw JSON response:\n"
echo "${JSON_RESULT}" | jq '.'

echo -e "\n🔍 Extracted function calls:\n"
echo "${JSON_RESULT}" | jq -r '.choices[0].message.tool_calls[]? | "Function: \(.function.name), Args: \(.function.arguments)"'

echo -e "\n"
As output, you should get something like this:
📝 Raw JSON response:
{
  "choices": [
    {
      "finish_reason": "tool_calls",
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "type": "function",
            "function": {
              "name": "say_hello",
              "arguments": "{\"name\":\"Bob\"}"
            },
            "id": "k21o6x4s0idGnDIVzl7inoSo1tenBGE2"
          }
        ]
      }
    }
  ],
  "created": 1754206209,
  "model": "ai/qwen2.5:latest",
  "system_fingerprint": "b1-79e0b68",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 21,
    "prompt_tokens": 275,
    "total_tokens": 296
  },
  "id": "chatcmpl-3SpU44sqFty64mNL9tfrzmELnyr0EOph",
  "timings": {
    "prompt_n": 275,
    "prompt_ms": 1416.667,
    "prompt_per_token_ms": 5.151516363636364,
    "prompt_per_second": 194.1176013840938,
    "predicted_n": 21,
    "predicted_ms": 1037.057,
    "predicted_per_token_ms": 49.38366666666667,
    "predicted_per_second": 20.249610195003747
  }
}
🔍 Extracted function calls:
Function: say_hello, Args: {"name":"Bob"}
First observations
- The LLM detected that it needed to call the say_hello function with the name parameter equal to "Bob".
- The LLM did not detect that it needed to call the say_hello function with the name parameter equal to "Sam", nor the calculate_sum function.
- The tool calls are returned in the tool_calls field of the JSON response: "tool_calls": [...] (you can count them with the jq one-liner below).
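A quick way to check how many tool calls were detected is to count the entries in that field:
echo "${JSON_RESULT}" | jq '.choices[0].message.tool_calls | length'
# here: 1, while the user message implied 3 calls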
Let's change the user message
Modify the user message as follows:
USER_MESSAGE="Make the sum of 5 and 37, then say hello to Bob and to Sam"
Then run the script again. You should get something like this:
📝 Raw JSON response:
{
  "choices": [
    {
      "finish_reason": "tool_calls",
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "type": "function",
            "function": {
              "name": "calculate_sum",
              "arguments": "{\"a\":5,\"b\":37}"
            },
            "id": "Gqv0tJyrCUjvPAJL18COXYR7nRYKU5tC"
          }
        ]
      }
    }
  ],
  "created": 1754211196,
  "model": "ai/qwen2.5:latest",
  "system_fingerprint": "b1-79e0b68",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 28,
    "prompt_tokens": 276,
    "total_tokens": 304
  },
  "id": "chatcmpl-G6kIqQ4ijDlIQ388TbBCQ6X6k1IdmT4r",
  "timings": {
    "prompt_n": 276,
    "prompt_ms": 1434.675,
    "prompt_per_token_ms": 5.198097826086956,
    "prompt_per_second": 192.37806471849026,
    "predicted_n": 28,
    "predicted_ms": 1351.9,
    "predicted_per_token_ms": 48.28214285714286,
    "predicted_per_second": 20.711591094015827
  }
}
🔍 Extracted function calls:
Function: calculate_sum, Args: {"a":5,"b":37}
Conclusion(s)
So, the LLM knows how to detect the first tool call intention, but not the rest. For simple cases this might be enough, but for more complex cases you'll need to iterate and make the calls in a loop (or use "Parallel Tool Calls", but that's yet another topic). For now, we'll focus on simple calling with Osprey.
Simple Function Calling with Osprey
Osprey is a small Bash library that, thanks to a few helpers, makes it easier to interact with Docker Model Runner (and any LLM engine providing OpenAI-compatible endpoints). For function calling, Osprey provides the following functions:
- osprey_tool_calls: makes a "tool calls" type request
- print_tool_calls: displays the tool calls detected in the JSON response
- get_tool_calls: extracts the tool calls from the JSON response (as base64-encoded JSON); you can use decode_tool_call to get something readable from a base64-encoded tool call (see the small example after this list)
- get_function_name: extracts the name of the function to call
- get_function_args: extracts the arguments of the function to call
- get_call_id: extracts the tool call ID
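For example, a minimal way to inspect what get_tool_calls returns could look like this. It reuses the DMR_BASE_URL and DATA variables from the script below, and assumes decode_tool_call takes a single base64-encoded tool call as argument, as the other helpers do:
RESULT=$(osprey_tool_calls "${DMR_BASE_URL}" "${DATA}")
for tool_call in $(get_tool_calls "${RESULT}"); do
  # each item is base64-encoded JSON; decode it to read it
  decode_tool_call "${tool_call}"
done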
So, with Osprey, the previous script becomes much simpler and more readable (this time, I added the function execution to the script).
Installing Osprey is very simple:
curl -fsSL https://github.com/k33g/osprey/releases/download/v0.0.7/osprey.sh -o ./osprey.sh
chmod +x ./osprey.sh
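To check that the library loads correctly, you can source it and ask Bash for a couple of the helpers (declare -F only reports functions that are actually defined):
. ./osprey.sh
declare -F osprey_tool_calls get_tool_calls get_function_name
And here is the full script: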
#!/bin/bash
. "./osprey.sh" # Import Osprey library

DMR_BASE_URL=${MODEL_RUNNER_BASE_URL:-http://localhost:12434/engines/llama.cpp/v1}
MODEL=${MODEL_LUCY:-"ai/qwen2.5:latest"}

# tools catalog in JSON format
read -r -d '' TOOLS <<- EOM
[
  {
    "type": "function",
    "function": {
      "name": "calculate_sum",
      "description": "Calculate the sum of two numbers",
      "parameters": {
        "type": "object",
        "properties": {
          "a": {
            "type": "number",
            "description": "The first number"
          },
          "b": {
            "type": "number",
            "description": "The second number"
          }
        },
        "required": ["a", "b"]
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "say_hello",
      "description": "Say hello to the given name",
      "parameters": {
        "type": "object",
        "properties": {
          "name": {
            "type": "string",
            "description": "The name to greet"
          }
        },
        "required": ["name"]
      }
    }
  }
]
EOM

# Request with function calling
read -r -d '' DATA <<- EOM
{
  "model": "${MODEL}",
  "options": {
    "temperature": 0.0
  },
  "messages": [
    {
      "role": "user",
      "content": "Say hello to Bob and to Sam, make the sum of 5 and 37"
    }
  ],
  "tools": ${TOOLS}
}
EOM

echo "⏳ Making function call request..."
RESULT=$(osprey_tool_calls ${DMR_BASE_URL} "${DATA}")
echo ""

# Get tool calls for further processing
TOOL_CALLS=$(get_tool_calls "${RESULT}")

# Process each tool call
if [[ -n "$TOOL_CALLS" ]]; then
  echo ""
  echo "🚀 Processing tool calls..."

  for tool_call in $TOOL_CALLS; do
    FUNCTION_NAME=$(get_function_name "$tool_call")
    FUNCTION_ARGS=$(get_function_args "$tool_call")
    CALL_ID=$(get_call_id "$tool_call")

    echo "Executing function: $FUNCTION_NAME with args: $FUNCTION_ARGS"

    # Function execution
    case "$FUNCTION_NAME" in
      "say_hello")
        NAME=$(echo "$FUNCTION_ARGS" | jq -r '.name')
        HELLO="👋 Hello, $NAME!🙂"
        RESULT_CONTENT="{\"message\": \"$HELLO\"}"
        ;;
      "calculate_sum")
        A=$(echo "$FUNCTION_ARGS" | jq -r '.a')
        B=$(echo "$FUNCTION_ARGS" | jq -r '.b')
        SUM=$((A + B))
        RESULT_CONTENT="{\"result\": $SUM}"
        ;;
      *)
        RESULT_CONTENT="{\"error\": \"Unknown function\"}"
        ;;
    esac

    echo "Function result: $RESULT_CONTENT"
  done
else
  echo "No tool calls found in response"
fi
As output, you should get something like this:
⏳ Making function call request...
🛠️ Tool calls detected:
Function: say_hello, Args: {"name":"Bob"}
🚀 Processing tool calls...
Executing function: say_hello with args: {"name":"Bob"}
Function result: {"message": "👋 Hello, Bob!🙂"}
No surprise: the LLM only detected one tool call, the say_hello call with the name parameter equal to "Bob". But it works 🎉.
Let's now see how to do "loop function calling" with Osprey.
Loop Function Calling with Osprey
For the 0.0.7 release of Osprey, Ron Evans added the ability to do "loop function calling", that is, to make tool calls in a loop until there are no more tool calls to execute. The full example below will make this more concrete.
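In a nutshell, the pattern looks like this (a minimal sketch, not the final script; call_llm_with_tools is a hypothetical placeholder, the real script relies on the Osprey helpers described next):
# Sketch of the "loop function calling" pattern
MESSAGES='[{"role":"user","content":"..."}]'
while true; do
  RESPONSE=$(call_llm_with_tools "${MESSAGES}")   # hypothetical helper
  FINISH_REASON=$(echo "${RESPONSE}" | jq -r '.choices[0].finish_reason')
  [ "${FINISH_REASON}" != "tool_calls" ] && break   # final answer reached
  # 1. append the assistant message (with its tool_calls) to MESSAGES
  # 2. execute each requested tool locally
  # 3. append one {"role":"tool", ...} message per result to MESSAGES
done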
Ron added, among other things, the following functions:
- add_tool_calls_message: records the intention to call a tool, for example:
  {
    "role": "assistant",
    "tool_calls": [
      {
        "id": "call_123",
        "function": {
          "name": "calculate_sum",
          "arguments": "{\"a\":2,\"b\":3}"
        }
      }
    ]
  }
- add_tool_message: records the execution of a tool, for example:
  {
    "role": "tool",
    "tool_call_id": "call_123",
    "content": "{\"result\": 5}"
  }
- get_finish_reason: extracts the completion finish reason, for example tool_calls or stop, to determine the next action in the loop (see the short sketch after this list).
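Put together, the control flow of the loop relies on get_finish_reason, roughly like this (a simplified extract of the full script further down):
FINISH_REASON=$(get_finish_reason "${RESULT}")
case $FINISH_REASON in
  tool_calls)
    # record the model's intent, execute the tools,
    # then add one tool message per result
    add_tool_calls_message SET_OF_MESSAGES "${TOOL_CALLS}"
    # ... run the functions, then:
    # add_tool_message SET_OF_MESSAGES "${CALL_ID}" "${RESULT_CONTENT}"
    ;;
  stop)
    # the model produced its final answer: exit the loop
    STOPPED="true"
    ;;
esac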
This time my user message will be the following:
read -r -d '' USER_MESSAGE <<- EOM
Make the sum of 40 and 2,
then say hello to Bob and to Sam,
make the sum of 5 and 37
Say hello to Alice
EOM
And here's what the script will do, at the message level, according to the OpenAI API specifications (a concrete message array is shown right after this list):
1. User message: the initial request
2. Assistant message with tool_calls: the AI's function call requests
3. Tool messages: the results of the function executions
4. Assistant message: the final response, using the tool results
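Concretely, after one round of tool execution, the conversation history sent back to the model looks something like this (shapes and values taken from the examples above):
[
  { "role": "user", "content": "Make the sum of 40 and 2, then say hello to Bob ..." },
  {
    "role": "assistant",
    "tool_calls": [
      {
        "id": "call_123",
        "function": { "name": "calculate_sum", "arguments": "{\"a\":40,\"b\":2}" }
      }
    ]
  },
  { "role": "tool", "tool_call_id": "call_123", "content": "{\"result\": 42}" }
]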
sequenceDiagram
  participant API as LLM API
  participant Script as Script
  participant History as Message History
  API->>Script: Response with tool_calls
  Script->>History: add_tool_calls_message (AI's intent)
  Note over History: {"role":"assistant", "tool_calls":[...]}
  loop For each tool call
    Script->>Script: Execute function
    Script->>History: add_tool_message (execution result)
    Note over History: {"role":"tool", "tool_call_id":"...", "content":"..."}
  end
This will give us a function execution flow like this:
flowchart TD
  A[Tool Call Received] --> B[Extract Function Details]
  B --> C[Get Function Name]
  B --> D[Get Function Arguments]
  B --> E[Get Call ID]
  C --> F{Function Name?}
  F -->|calculate_sum| G[Extract a and b parameters]
  F -->|say_hello| H[Extract name parameter]
  F -->|unknown| I[Return error result]
  G --> J[Perform calculation: a + b]
  H --> K[Generate greeting message]
  J --> L[Format result as JSON]
  K --> L
  I --> L
  L --> M[Add tool result to conversation]
  M --> N[Continue conversation loop]

  style F fill:#fff3e0
  style J fill:#e8f5e8
  style K fill:#e8f5e8
  style I fill:#ffebee
Let's move on to the script.
The loop "function calling" script with Osprey
I started from the example proposed by Ron in https://github.com/k33g/osprey/tree/main/examples/16-multistep-mcp-tools and adapted it for this blog post.
Important note: this script will take longer to run than the previous ones, because it makes several tool calls in a loop until there are no more tool calls to execute.
Advice: the LLM may not detect all the tool calls, or may detect too many. Take this into account in your code, but also in your choice of LLM model (in general, qwen2.5 gives good results).
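A simple way to take the model choice into account in your code is to read it from an environment variable with a default, as the earlier script already did (the script name below is just an example):
# Pick the model from the environment, fall back to qwen2.5
MODEL=${MODEL:-"ai/qwen2.5:latest"}
# usage, with a hypothetical script name:
# MODEL="hf.co/menlo/lucy-128k-gguf:q4_k_m" ./loop-function-calling.sh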
Here is the complete script:
#!/bin/bash
. "./osprey.sh"

DMR_BASE_URL="http://localhost:12434/engines/llama.cpp/v1"
MODEL="ai/qwen2.5:latest"

# Tools catalog in JSON format
read -r -d '' TOOLS <<- EOM
[
  {
    "type": "function",
    "function": {
      "name": "calculate_sum",
      "description": "Calculate the sum of two numbers",
      "parameters": {
        "type": "object",
        "properties": {
          "a": {
            "type": "number",
            "description": "The first number"
          },
          "b": {
            "type": "number",
            "description": "The second number"
          }
        },
        "required": ["a", "b"]
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "say_hello",
      "description": "Say hello to the given name",
      "parameters": {
        "type": "object",
        "properties": {
          "name": {
            "type": "string",
            "description": "The name to greet"
          }
        },
        "required": ["name"]
      }
    }
  }
]
EOM

read -r -d '' USER_MESSAGE <<- EOM
Make the sum of 40 and 2,
then say hello to Bob and to Sam,
make the sum of 5 and 37
Say hello to Alice
EOM

STOPPED="false"
SET_OF_MESSAGES=()

add_user_message SET_OF_MESSAGES "${USER_MESSAGE}"

while [ "$STOPPED" != "true" ]; do
  # Build messages array conversation history
  MESSAGES=$(build_messages_array SET_OF_MESSAGES)

  # Build request data
  read -r -d '' DATA <<- EOM
  {
    "model": "${MODEL}",
    "options": {
      "temperature": 0.0
    },
    "messages": [${MESSAGES}],
    "tools": ${TOOLS}
  }
EOM

  echo "⏳ Making function call request..."
  RESULT=$(osprey_tool_calls ${DMR_BASE_URL} "${DATA}")
  echo ""
  echo "🛠️💡 Tool call detected in user message:"
  print_tool_calls "${RESULT}"

  # Get tool calls for further processing
  TOOL_CALLS=$(get_tool_calls "${RESULT}")
  FINISH_REASON=$(get_finish_reason "${RESULT}")

  case $FINISH_REASON in
    tool_calls)
      if [[ -n "$TOOL_CALLS" ]]; then
        add_tool_calls_message SET_OF_MESSAGES "${TOOL_CALLS}"
        echo ""
        echo "🚀 Processing tool calls..."

        for tool_call in $TOOL_CALLS; do
          FUNCTION_NAME=$(get_function_name "$tool_call")
          FUNCTION_ARGS=$(get_function_args "$tool_call")
          CALL_ID=$(get_call_id "$tool_call")

          echo "▶️ Executing function: $FUNCTION_NAME with args: $FUNCTION_ARGS"

          RESULT_CONTENT=""
          # Function execution
          case "$FUNCTION_NAME" in
            "say_hello")
              NAME=$(echo "$FUNCTION_ARGS" | jq -r '.name')
              HELLO="👋 Hello, $NAME!🙂"
              RESULT_CONTENT="{\"message\": \"$HELLO\"}"
              ;;
            "calculate_sum")
              A=$(echo "$FUNCTION_ARGS" | jq -r '.a')
              B=$(echo "$FUNCTION_ARGS" | jq -r '.b')
              SUM=$((A + B))
              RESULT_CONTENT="{\"result\": $SUM}"
              ;;
            *)
              RESULT_CONTENT="{\"error\": \"Unknown function\"}"
              ;;
          esac

          echo -e "Function result: $RESULT_CONTENT\n\n"
          add_tool_message SET_OF_MESSAGES "${CALL_ID}" "${RESULT_CONTENT}"
        done
      else
        echo "No tool calls found in response"
      fi
      ;;
    stop)
      STOPPED="true"
      ASSISTANT_MESSAGE=$(echo "${RESULT}" | jq -r '.choices[0].message.content')
      echo "🤖 ${ASSISTANT_MESSAGE}"
      # Add assistant response to conversation history
      add_assistant_message SET_OF_MESSAGES "${ASSISTANT_MESSAGE}"
      ;;
    *)
      echo "🔴 unexpected response: $FINISH_REASON"
      ;;
  esac
done
Then, when executing the script, you should get something like this:
⏳ Making function call request...
🛠️💡 Tool call detected in user message:
Function: calculate_sum, Args: {"a":40,"b":2}
🚀 Processing tool calls...
▶️ Executing function: calculate_sum with args: {"a":40,"b":2}
Function result: {"result": 42}
⏳ Making function call request...
🛠️💡 Tool call detected in user message:
Function: say_hello, Args: {"name":"Bob"}
🚀 Processing tool calls...
▶️ Executing function: say_hello with args: {"name":"Bob"}
Function result: {"message": "👋 Hello, Bob!🙂"}
⏳ Making function call request...
🛠️💡 Tool call detected in user message:
Function: say_hello, Args: {"name":"Sam"}
🚀 Processing tool calls...
▶️ Executing function: say_hello with args: {"name":"Sam"}
Function result: {"message": "👋 Hello, Sam!🙂"}
⏳ Making function call request...
🛠️💡 Tool call detected in user message:
Function: calculate_sum, Args: {"a":5,"b":37}
🚀 Processing tool calls...
▶️ Executing function: calculate_sum with args: {"a":5,"b":37}
Function result: {"result": 42}
⏳ Making function call request...
🛠️💡 Tool call detected in user message:
Function: say_hello, Args: {"name":"Alice"}
🚀 Processing tool calls...
▶️ Executing function: say_hello with args: {"name":"Alice"}
Function result: {"message": "👋 Hello, Alice!🙂"}
⏳ Making function call request...
🛠️💡 Tool call detected in user message:
🤖 The sum of 40 and 2 is 42.
Hello, Bob! 😊
Hello, Sam! 😊
The sum of 5 and 37 is 42.
Hello, Alice! 😊
Conclusion
🎉 It is therefore possible to detect all the function calls in a user message, even with local models. Some models are more or less "gifted" depending on the use case. For example, hf.co/menlo/lucy-128k-gguf:q4_k_m, which you can find on Hugging Face, is very good at single calls, while ai/qwen2.5:latest will manage in most cases.
There's another way to do "function calling" and detect all the tool calls at once: "Parallel Tool Calls", which lets the model emit all the tool calls in a single completion (see Parallel function calling). However, not all models support it yet, and in the case of Llama.cpp there seems to be a bug, so stay tuned.
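For reference, with the OpenAI Chat Completions API this is just one extra boolean field in the request payload. Whether Llama.cpp honors it is exactly the open question mentioned above, so treat the snippet below as a sketch (it reuses the MODEL, USER_MESSAGE, and TOOLS variables from the scripts in this article):
read -r -d '' DATA <<- EOM
{
  "model": "${MODEL}",
  "parallel_tool_calls": true,
  "options": {
    "temperature": 0.0
  },
  "messages": [
    {
      "role": "user",
      "content": "${USER_MESSAGE}"
    }
  ],
  "tools": ${TOOLS}
}
EOM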
It's also possible to "simulate" the "function calling" feature using only prompt engineering, but this requires patience and adapting the prompt to the model used. That will be the subject of a future article.
I hope this article has enlightened you on "function calling" in general (everything done here in Bash is easily adaptable to other programming languages). Tool support in LLMs is becoming essential in the era of MCP. If the model you're using isn't capable of recognizing tool calls, MCP becomes useless.
The examples are available here: https://github.com/k33g/osprey/tree/main/examples/17-tool-calls-loop