Function Calling in Bash (with Osprey)


In a previous article (Creating AI Agents with Agentic Compose, Bash, Curl, Jq, and Gum), we saw how to use Docker Model Runner with Bash, Jq, Curl, and Gum through a small Bash library I named Osprey. In this article, we'll see how to use Osprey for "loop function calling", thanks to a very recent contribution from Ron Evans (@deadprogram.com) 🙌. But first, let's talk a bit about "function calling" and LLMs.
What is "function calling"?
"Function calling" for LLMs that support it (also called "tool calling"), is the ability for an LLM to detect/recognize in a user prompt the intention to "execute something", such as doing an addition, wanting to know the weather in a particular city, or wanting to do an Internet search.
LLMs that "support tools", if you provide them upstream with a catalog of tools in JSON (like the example below), will be able to make the connection between the user's intention and the corresponding tool. The LLM will then generate a JSON response with the tool name and the parameters to call it.
Example of tools catalog:
[
  {
    "type": "function",
    "function": {
      "name": "calculate_sum",
      "description": "Calculate the sum of two numbers",
      "parameters": {
        "type": "object",
        "properties": {
          "a": {
            "type": "number",
            "description": "The first number"
          },
          "b": {
            "type": "number",
            "description": "The second number"
          }
        },
        "required": ["a", "b"]
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "say_hello",
      "description": "Say hello to the given name",
      "parameters": {
        "type": "object",
        "properties": {
          "name": {
            "type": "string",
            "description": "The name to greet"
          }
        },
        "required": ["name"]
      }
    }
  }
]
But be careful: the LLM will never execute these functions (or tools) itself; it only generates JSON with the function name and the parameters to pass. It's up to you to implement the function execution.
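To make this concrete, here is a minimal sketch of what "executing" a tool call means on your side. The shape of the response and the dispatching logic are taken from the examples later in this article; RESPONSE is a hypothetical variable holding the chat completion JSON:
# Minimal dispatch sketch (RESPONSE is assumed to contain a chat completion
# like the ones shown later in this article)
TOOL_NAME=$(echo "${RESPONSE}" | jq -r '.choices[0].message.tool_calls[0].function.name')
TOOL_ARGS=$(echo "${RESPONSE}" | jq -r '.choices[0].message.tool_calls[0].function.arguments')

case "${TOOL_NAME}" in
  "calculate_sum")
    A=$(echo "${TOOL_ARGS}" | jq -r '.a')
    B=$(echo "${TOOL_ARGS}" | jq -r '.b')
    echo "result: $((A + B))"
    ;;
  "say_hello")
    echo "👋 Hello, $(echo "${TOOL_ARGS}" | jq -r '.name')!"
    ;;
esac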
First example of "function calling"
Very Important: Small local models are not all excellent at "function calling", so you'll need to test several models against your use case and select one with a good success rate. One of my colleagues, Ignasi Lopez Luna, wrote an excellent blog post on this subject: Tool Calling with Local LLMs: A Practical Evaluation.
Curl request for "function calling"
Note: Docker Model Runner uses Llama.cpp to run LLM models and provides a REST API compatible with OpenAI's API.
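Before running anything, you can quickly check that the endpoint answers. This assumes Docker Model Runner's host-side TCP access is enabled on its default port 12434 (as in the scripts below) and that the /models route is exposed alongside /chat/completions:
# Quick sanity check of the OpenAI-compatible endpoint
curl --silent http://localhost:12434/engines/llama.cpp/v1/models | jq '.data[].id'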
The script below does the following:
1. Defines a catalog of tools in JSON (calculate_sum and say_hello) with their name, description, and parameters.
2. Creates a user message: "Say hello to Bob and to Sam, make the sum of 5 and 37" (so, theoretically, there are 3 tool calls: say_hello twice and calculate_sum once).
3. Prepares the data to send to Docker Model Runner's REST API, where we specify the model to use, the options (like the temperature, ALWAYS AT ZERO for function calling), the user message, and of course the tools catalog.
4. Sends the request to Docker Model Runner's REST API with curl.
5. Displays the raw JSON result and extracts the tool calls.
DMR_BASE_URL="http://localhost:12434/engines/llama.cpp/v1"
MODEL="ai/qwen2.5:latest"

# Tools index in JSON format
read -r -d '' TOOLS <<- EOM
[
  {
    "type": "function",
    "function": {
      "name": "calculate_sum",
      "description": "Calculate the sum of two numbers",
      "parameters": {
        "type": "object",
        "properties": {
          "a": {
            "type": "number",
            "description": "The first number"
          },
          "b": {
            "type": "number",
            "description": "The second number"
          }
        },
        "required": ["a", "b"]
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "say_hello",
      "description": "Say hello to the given name",
      "parameters": {
        "type": "object",
        "properties": {
          "name": {
            "type": "string",
            "description": "The name to greet"
          }
        },
        "required": ["name"]
      }
    }
  }
]
EOM

USER_MESSAGE="Say hello to Bob and to Sam, make the sum of 5 and 37"

read -r -d '' DATA <<- EOM
{
  "model": "${MODEL}",
  "options": {
    "temperature": 0.0
  },
  "messages": [
    {
      "role": "user",
      "content": "${USER_MESSAGE}"
    }
  ],
  "tools": ${TOOLS}
}
EOM

# Remove newlines from DATA
DATA=$(echo ${DATA} | tr -d '\n')

JSON_RESULT=$(curl --silent ${DMR_BASE_URL}/chat/completions \
  -H "Content-Type: application/json" \
  -d "${DATA}"
)

echo -e "\n📝 Raw JSON response:\n"
echo "${JSON_RESULT}" | jq '.'

echo -e "\n🔍 Extracted function calls:\n"
echo "${JSON_RESULT}" | jq -r '.choices[0].message.tool_calls[]? | "Function: \(.function.name), Args: \(.function.arguments)"'

echo -e "\n"
As output, you should get something like this:
📝 Raw JSON response:
{
  "choices": [
    {
      "finish_reason": "tool_calls",
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "type": "function",
            "function": {
              "name": "say_hello",
              "arguments": "{\"name\":\"Bob\"}"
            },
            "id": "k21o6x4s0idGnDIVzl7inoSo1tenBGE2"
          }
        ]
      }
    }
  ],
  "created": 1754206209,
  "model": "ai/qwen2.5:latest",
  "system_fingerprint": "b1-79e0b68",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 21,
    "prompt_tokens": 275,
    "total_tokens": 296
  },
  "id": "chatcmpl-3SpU44sqFty64mNL9tfrzmELnyr0EOph",
  "timings": {
    "prompt_n": 275,
    "prompt_ms": 1416.667,
    "prompt_per_token_ms": 5.151516363636364,
    "prompt_per_second": 194.1176013840938,
    "predicted_n": 21,
    "predicted_ms": 1037.057,
    "predicted_per_token_ms": 49.38366666666667,
    "predicted_per_second": 20.249610195003747
  }
}
🔍 Extracted function calls:
Function: say_hello, Args: {"name":"Bob"}
First observations
- The LLM detected that it needed to call the say_hello function with the name parameter equal to "Bob".
- The LLM did not detect that it needed to call the say_hello function with the name parameter equal to "Sam", nor the calculate_sum function.
- The tool calls are returned in the tool_calls field of the JSON response: "tool_calls": [...] (you can count them with the jq one-liner below).
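A quick way to check how many tool calls were detected is to count the entries in that field:
echo "${JSON_RESULT}" | jq '.choices[0].message.tool_calls | length'
# here: 1, while the user message implied 3 calls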
Let's change the user message
Modify the user message as follows:
USER_MESSAGE="Make the sum of 5 and 37, then say hello to Bob and to Sam"
Then run the script again. You should get something like this:
📝 Raw JSON response:
{
  "choices": [
    {
      "finish_reason": "tool_calls",
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "type": "function",
            "function": {
              "name": "calculate_sum",
              "arguments": "{\"a\":5,\"b\":37}"
            },
            "id": "Gqv0tJyrCUjvPAJL18COXYR7nRYKU5tC"
          }
        ]
      }
    }
  ],
  "created": 1754211196,
  "model": "ai/qwen2.5:latest",
  "system_fingerprint": "b1-79e0b68",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 28,
    "prompt_tokens": 276,
    "total_tokens": 304
  },
  "id": "chatcmpl-G6kIqQ4ijDlIQ388TbBCQ6X6k1IdmT4r",
  "timings": {
    "prompt_n": 276,
    "prompt_ms": 1434.675,
    "prompt_per_token_ms": 5.198097826086956,
    "prompt_per_second": 192.37806471849026,
    "predicted_n": 28,
    "predicted_ms": 1351.9,
    "predicted_per_token_ms": 48.28214285714286,
    "predicted_per_second": 20.711591094015827
  }
}
🔍 Extracted function calls:
Function: calculate_sum, Args: {"a":5,"b":37}
Conclusion(s)
So, the LLM knows how to detect the first tool call intention, but not the rest. For simple cases this might be enough, but for more complex cases you'll need to iterate and make the calls in a loop (or use "Parallel Tool Calls", but that's yet another topic). For now, we'll focus on simple calling with Osprey.
Simple Function Calling with Osprey
Osprey is a small Bash library that, thanks to a few helpers, makes it easier to interact with Docker Model Runner (and any LLM engine providing OpenAI-compatible endpoints). For function calling, Osprey provides the following functions:
- osprey_tool_calls: makes a "tool calls" type request
- print_tool_calls: displays the tool calls detected in the JSON response
- get_tool_calls: extracts the tool calls from the JSON response (as base64-encoded JSON); you can use decode_tool_call to get something readable from a base64-encoded tool call (see the small example after this list)
- get_function_name: extracts the name of the function to call
- get_function_args: extracts the arguments of the function to call
- get_call_id: extracts the tool call ID
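For example, a minimal way to inspect what get_tool_calls returns could look like this. It reuses the DMR_BASE_URL and DATA variables from the script below, and assumes decode_tool_call takes a single base64-encoded tool call as argument, as the other helpers do:
RESULT=$(osprey_tool_calls "${DMR_BASE_URL}" "${DATA}")
for tool_call in $(get_tool_calls "${RESULT}"); do
  # each item is base64-encoded JSON; decode it to read it
  decode_tool_call "${tool_call}"
done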
So, with Osprey, the previous script becomes much simpler and more readable (this time, I added the function execution to the script).
Installing Osprey is very simple:
curl -fsSL https://github.com/k33g/osprey/releases/download/v0.0.7/osprey.sh -o ./osprey.sh
chmod +x ./osprey.sh
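To check that the library loads correctly, you can source it and ask Bash for a couple of the helpers (declare -F only reports functions that are actually defined):
. ./osprey.sh
declare -F osprey_tool_calls get_tool_calls get_function_name
And here is the full script: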
#!/bin/bash
. "./osprey.sh" # Import Osprey library

DMR_BASE_URL=${MODEL_RUNNER_BASE_URL:-http://localhost:12434/engines/llama.cpp/v1}
MODEL=${MODEL_LUCY:-"ai/qwen2.5:latest"}

# tools catalog in JSON format
read -r -d '' TOOLS <<- EOM
[
  {
    "type": "function",
    "function": {
      "name": "calculate_sum",
      "description": "Calculate the sum of two numbers",
      "parameters": {
        "type": "object",
        "properties": {
          "a": {
            "type": "number",
            "description": "The first number"
          },
          "b": {
            "type": "number",
            "description": "The second number"
          }
        },
        "required": ["a", "b"]
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "say_hello",
      "description": "Say hello to the given name",
      "parameters": {
        "type": "object",
        "properties": {
          "name": {
            "type": "string",
            "description": "The name to greet"
          }
        },
        "required": ["name"]
      }
    }
  }
]
EOM

# Request with function calling
read -r -d '' DATA <<- EOM
{
  "model": "${MODEL}",
  "options": {
    "temperature": 0.0
  },
  "messages": [
    {
      "role": "user",
      "content": "Say hello to Bob and to Sam, make the sum of 5 and 37"
    }
  ],
  "tools": ${TOOLS}
}
EOM

echo "⏳ Making function call request..."
RESULT=$(osprey_tool_calls ${DMR_BASE_URL} "${DATA}")
echo ""

# Get tool calls for further processing
TOOL_CALLS=$(get_tool_calls "${RESULT}")

# Process each tool call
if [[ -n "$TOOL_CALLS" ]]; then
  echo ""
  echo "🚀 Processing tool calls..."

  for tool_call in $TOOL_CALLS; do
    FUNCTION_NAME=$(get_function_name "$tool_call")
    FUNCTION_ARGS=$(get_function_args "$tool_call")
    CALL_ID=$(get_call_id "$tool_call")

    echo "Executing function: $FUNCTION_NAME with args: $FUNCTION_ARGS"

    # Function execution
    case "$FUNCTION_NAME" in
      "say_hello")
        NAME=$(echo "$FUNCTION_ARGS" | jq -r '.name')
        HELLO="👋 Hello, $NAME!🙂"
        RESULT_CONTENT="{\"message\": \"$HELLO\"}"
        ;;
      "calculate_sum")
        A=$(echo "$FUNCTION_ARGS" | jq -r '.a')
        B=$(echo "$FUNCTION_ARGS" | jq -r '.b')
        SUM=$((A + B))
        RESULT_CONTENT="{\"result\": $SUM}"
        ;;
      *)
        RESULT_CONTENT="{\"error\": \"Unknown function\"}"
        ;;
    esac

    echo "Function result: $RESULT_CONTENT"
  done
else
  echo "No tool calls found in response"
fi
As output, you should get something like this:
⏳ Making function call request...
🛠️ Tool calls detected:
Function: say_hello, Args: {"name":"Bob"}
🚀 Processing tool calls...
Executing function: say_hello with args: {"name":"Bob"}
Function result: {"message": "👋 Hello, Bob!🙂"}
No surprise: the LLM only detected one tool call, the say_hello call with the name parameter equal to "Bob". But it works 🎉.
Let's now see how to do "loop function calling" with Osprey.
Loop Function Calling with Osprey
For the 0.0.7 release of Osprey, Ron Evans added the ability to do "loop function calling", that is, to make tool calls in a loop until there are no more tool calls to execute. The full example below will make this more concrete.
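In a nutshell, the pattern looks like this (a minimal sketch, not the final script; call_llm_with_tools is a hypothetical placeholder, the real script relies on the Osprey helpers described next):
# Sketch of the "loop function calling" pattern
MESSAGES='[{"role":"user","content":"..."}]'
while true; do
  RESPONSE=$(call_llm_with_tools "${MESSAGES}")   # hypothetical helper
  FINISH_REASON=$(echo "${RESPONSE}" | jq -r '.choices[0].finish_reason')
  [ "${FINISH_REASON}" != "tool_calls" ] && break   # final answer reached
  # 1. append the assistant message (with its tool_calls) to MESSAGES
  # 2. execute each requested tool locally
  # 3. append one {"role":"tool", ...} message per result to MESSAGES
done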
Ron added, among other things, the following functions:
- add_tool_calls_message: records the intention to call a tool, for example:
  {
    "role": "assistant",
    "tool_calls": [
      {
        "id": "call_123",
        "function": {
          "name": "calculate_sum",
          "arguments": "{\"a\":2,\"b\":3}"
        }
      }
    ]
  }
- add_tool_message: records the execution of a tool, for example:
  {
    "role": "tool",
    "tool_call_id": "call_123",
    "content": "{\"result\": 5}"
  }
- get_finish_reason: extracts the completion finish reason, for example tool_calls or stop, to determine the next action in the loop (see the short sketch after this list).
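Put together, the control flow of the loop relies on get_finish_reason, roughly like this (a simplified extract of the full script further down):
FINISH_REASON=$(get_finish_reason "${RESULT}")
case $FINISH_REASON in
  tool_calls)
    # record the model's intent, execute the tools,
    # then add one tool message per result
    add_tool_calls_message SET_OF_MESSAGES "${TOOL_CALLS}"
    # ... run the functions, then:
    # add_tool_message SET_OF_MESSAGES "${CALL_ID}" "${RESULT_CONTENT}"
    ;;
  stop)
    # the model produced its final answer: exit the loop
    STOPPED="true"
    ;;
esac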
This time my user message will be the following:
read -r -d '' USER_MESSAGE <<- EOM
Make the sum of 40 and 2,
then say hello to Bob and to Sam,
make the sum of 5 and 37
Say hello to Alice
EOM
And here's what the script will do, at the message level, according to the OpenAI API specifications (a concrete message array is shown right after this list):
1. User message: the initial request
2. Assistant message with tool_calls: the AI's function call requests
3. Tool messages: the results of the function executions
4. Assistant message: the final response, using the tool results
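Concretely, after one round of tool execution, the conversation history sent back to the model looks something like this (shapes and values taken from the examples above):
[
  { "role": "user", "content": "Make the sum of 40 and 2, then say hello to Bob ..." },
  {
    "role": "assistant",
    "tool_calls": [
      {
        "id": "call_123",
        "function": { "name": "calculate_sum", "arguments": "{\"a\":40,\"b\":2}" }
      }
    ]
  },
  { "role": "tool", "tool_call_id": "call_123", "content": "{\"result\": 42}" }
]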
sequenceDiagram
  participant API as LLM API
  participant Script as Script
  participant History as Message History
  API->>Script: Response with tool_calls
  Script->>History: add_tool_calls_message (AI's intent)
  Note over History: {"role":"assistant", "tool_calls":[...]}
  loop For each tool call
    Script->>Script: Execute function
    Script->>History: add_tool_message (execution result)
    Note over History: {"role":"tool", "tool_call_id":"...", "content":"..."}
  end
This will give us a function execution flow like this:
flowchart TD
  A[Tool Call Received] --> B[Extract Function Details]
  B --> C[Get Function Name]
  B --> D[Get Function Arguments]
  B --> E[Get Call ID]
  C --> F{Function Name?}
  F -->|calculate_sum| G[Extract a and b parameters]
  F -->|say_hello| H[Extract name parameter]
  F -->|unknown| I[Return error result]
  G --> J[Perform calculation: a + b]
  H --> K[Generate greeting message]
  J --> L[Format result as JSON]
  K --> L
  I --> L
  L --> M[Add tool result to conversation]
  M --> N[Continue conversation loop]

  style F fill:#fff3e0
  style J fill:#e8f5e8
  style K fill:#e8f5e8
  style I fill:#ffebee
Let's move on to the script.
The loop "function calling" script with Osprey
I started from the example proposed by Ron in https://github.com/k33g/osprey/tree/main/examples/16-multistep-mcp-tools and adapted it for this blog post.
Important note: this script will take longer to run than the previous ones, because it makes several tool calls in a loop until there are no more tool calls to execute.
Advice: the LLM may not detect all the tool calls, or may detect too many. Take this into account in your code, but also in your choice of LLM model (in general, qwen2.5 gives good results).
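A simple way to take the model choice into account in your code is to read it from an environment variable with a default, as the earlier script already did (the script name below is just an example):
# Pick the model from the environment, fall back to qwen2.5
MODEL=${MODEL:-"ai/qwen2.5:latest"}
# usage, with a hypothetical script name:
# MODEL="hf.co/menlo/lucy-128k-gguf:q4_k_m" ./loop-function-calling.sh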
Here is the complete script:
#!/bin/bash
. "./osprey.sh"

DMR_BASE_URL="http://localhost:12434/engines/llama.cpp/v1"
MODEL="ai/qwen2.5:latest"

# Tools catalog in JSON format
read -r -d '' TOOLS <<- EOM
[
  {
    "type": "function",
    "function": {
      "name": "calculate_sum",
      "description": "Calculate the sum of two numbers",
      "parameters": {
        "type": "object",
        "properties": {
          "a": {
            "type": "number",
            "description": "The first number"
          },
          "b": {
            "type": "number",
            "description": "The second number"
          }
        },
        "required": ["a", "b"]
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "say_hello",
      "description": "Say hello to the given name",
      "parameters": {
        "type": "object",
        "properties": {
          "name": {
            "type": "string",
            "description": "The name to greet"
          }
        },
        "required": ["name"]
      }
    }
  }
]
EOM

read -r -d '' USER_MESSAGE <<- EOM
Make the sum of 40 and 2,
then say hello to Bob and to Sam,
make the sum of 5 and 37
Say hello to Alice
EOM

STOPPED="false"
SET_OF_MESSAGES=()

add_user_message SET_OF_MESSAGES "${USER_MESSAGE}"

while [ "$STOPPED" != "true" ]; do
  # Build messages array conversation history
  MESSAGES=$(build_messages_array SET_OF_MESSAGES)

  # Build request data
  read -r -d '' DATA <<- EOM
  {
    "model": "${MODEL}",
    "options": {
      "temperature": 0.0
    },
    "messages": [${MESSAGES}],
    "tools": ${TOOLS}
  }
EOM

  echo "⏳ Making function call request..."
  RESULT=$(osprey_tool_calls ${DMR_BASE_URL} "${DATA}")
  echo ""
  echo "🛠️💡 Tool call detected in user message:"
  print_tool_calls "${RESULT}"

  # Get tool calls for further processing
  TOOL_CALLS=$(get_tool_calls "${RESULT}")
  FINISH_REASON=$(get_finish_reason "${RESULT}")

  case $FINISH_REASON in
    tool_calls)
      if [[ -n "$TOOL_CALLS" ]]; then
        add_tool_calls_message SET_OF_MESSAGES "${TOOL_CALLS}"
        echo ""
        echo "🚀 Processing tool calls..."

        for tool_call in $TOOL_CALLS; do
          FUNCTION_NAME=$(get_function_name "$tool_call")
          FUNCTION_ARGS=$(get_function_args "$tool_call")
          CALL_ID=$(get_call_id "$tool_call")

          echo "▶️ Executing function: $FUNCTION_NAME with args: $FUNCTION_ARGS"

          RESULT_CONTENT=""
          # Function execution
          case "$FUNCTION_NAME" in
            "say_hello")
              NAME=$(echo "$FUNCTION_ARGS" | jq -r '.name')
              HELLO="👋 Hello, $NAME!🙂"
              RESULT_CONTENT="{\"message\": \"$HELLO\"}"
              ;;
            "calculate_sum")
              A=$(echo "$FUNCTION_ARGS" | jq -r '.a')
              B=$(echo "$FUNCTION_ARGS" | jq -r '.b')
              SUM=$((A + B))
              RESULT_CONTENT="{\"result\": $SUM}"
              ;;
            *)
              RESULT_CONTENT="{\"error\": \"Unknown function\"}"
              ;;
          esac

          echo -e "Function result: $RESULT_CONTENT\n\n"
          add_tool_message SET_OF_MESSAGES "${CALL_ID}" "${RESULT_CONTENT}"
        done
      else
        echo "No tool calls found in response"
      fi
      ;;
    stop)
      STOPPED="true"
      ASSISTANT_MESSAGE=$(echo "${RESULT}" | jq -r '.choices[0].message.content')
      echo "🤖 ${ASSISTANT_MESSAGE}"
      # Add assistant response to conversation history
      add_assistant_message SET_OF_MESSAGES "${ASSISTANT_MESSAGE}"
      ;;
    *)
      echo "🔴 unexpected response: $FINISH_REASON"
      ;;
  esac
done
Then, when executing the script, you should get something like this:
⏳ Making function call request...
🛠️💡 Tool call detected in user message:
Function: calculate_sum, Args: {"a":40,"b":2}
🚀 Processing tool calls...
▶️ Executing function: calculate_sum with args: {"a":40,"b":2}
Function result: {"result": 42}
⏳ Making function call request...
🛠️💡 Tool call detected in user message:
Function: say_hello, Args: {"name":"Bob"}
🚀 Processing tool calls...
▶️ Executing function: say_hello with args: {"name":"Bob"}
Function result: {"message": "👋 Hello, Bob!🙂"}
⏳ Making function call request...
🛠️💡 Tool call detected in user message:
Function: say_hello, Args: {"name":"Sam"}
🚀 Processing tool calls...
▶️ Executing function: say_hello with args: {"name":"Sam"}
Function result: {"message": "👋 Hello, Sam!🙂"}
⏳ Making function call request...
🛠️💡 Tool call detected in user message:
Function: calculate_sum, Args: {"a":5,"b":37}
🚀 Processing tool calls...
▶️ Executing function: calculate_sum with args: {"a":5,"b":37}
Function result: {"result": 42}
⏳ Making function call request...
🛠️💡 Tool call detected in user message:
Function: say_hello, Args: {"name":"Alice"}
🚀 Processing tool calls...
▶️ Executing function: say_hello with args: {"name":"Alice"}
Function result: {"message": "👋 Hello, Alice!🙂"}
⏳ Making function call request...
🛠️💡 Tool call detected in user message:
🤖 The sum of 40 and 2 is 42.
Hello, Bob! 😊
Hello, Sam! 😊
The sum of 5 and 37 is 42.
Hello, Alice! 😊
Conclusion
🎉 It is therefore possible to detect all the function calls in a user message, even with local models. Some models are more or less "gifted" depending on the use case. For example, hf.co/menlo/lucy-128k-gguf:q4_k_m, which you can find on Hugging Face, is very good at single calls, while ai/qwen2.5:latest will manage in most cases.
There's another way to do "function calling" and detect all the tool calls at once: "Parallel Tool Calls", which lets the model emit all the tool calls in a single completion (see Parallel function calling). However, not all models support it yet, and in the case of Llama.cpp there seems to be a bug, so stay tuned.
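For reference, with the OpenAI Chat Completions API this is just one extra boolean field in the request payload. Whether Llama.cpp honors it is exactly the open question mentioned above, so treat the snippet below as a sketch (it reuses the MODEL, USER_MESSAGE, and TOOLS variables from the scripts in this article):
read -r -d '' DATA <<- EOM
{
  "model": "${MODEL}",
  "parallel_tool_calls": true,
  "options": {
    "temperature": 0.0
  },
  "messages": [
    {
      "role": "user",
      "content": "${USER_MESSAGE}"
    }
  ],
  "tools": ${TOOLS}
}
EOM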
It's also possible to "simulate" the "function calling" feature using only prompt engineering, but this requires patience and adapting the prompt to the model used. That will be the subject of a future article.
I hope this article has enlightened you on "function calling" in general (everything done here in Bash is easily adaptable to other programming languages). Tool support in LLMs is becoming essential in the era of MCP. If the model you're using isn't capable of recognizing tool calls, MCP becomes useless.
The examples are available here: https://github.com/k33g/osprey/tree/main/examples/17-tool-calls-loop