How to Install and Run Ollama with IBM Granite Model on Your Local Machine

Rohit Rai

With the growing demand for running large language models (LLMs) locally, Ollama has emerged as a powerful tool that simplifies the process. If you're particularly interested in IBM's Granite models, known for their efficiency and capabilities, this guide walks you through the entire setup on your own machine.


🧰 Minimum Requirements

Before you begin, ensure your system meets the minimum requirements (a quick way to check them from the terminal is shown after the list):

  • Operating System: Windows, macOS, or Linux

  • RAM: Minimum 8 GB (16+ GB recommended for better performance)

  • CPU/GPU: A CPU is sufficient; a CUDA-capable NVIDIA GPU (on Linux/Windows) improves performance

  • Disk Space: 4–8 GB per model, depending on size
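Not sure whether your machine qualifies? A few standard commands will confirm it. This is a minimal sketch for a Debian-based Linux system (macOS users can check "About This Mac" instead):

free -h    # total and available RAM
df -h ~    # free disk space in your home directory
nproc      # number of CPU cores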


⚙️ Step-by-Step Guide

1. Install Ollama

To install Ollama, visit the official download page:
🔗 https://ollama.com/download

Choose the appropriate version for your OS:

  • macOS: .pkg installer

  • Windows: .exe installer

  • Linux (Debian-based):

curl -fsSL https://ollama.com/install.sh | sh
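On Linux, the script installs the ollama binary and typically registers a systemd service. Regardless of OS, a quick way to verify the installation:

ollama --version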

Once installed, start the Ollama server (the macOS and Windows desktop apps usually start it automatically in the background):

ollama serve
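By default, the server listens on port 11434. To confirm it's up and reachable, you can query its version endpoint:

curl http://localhost:11434/api/version    # should return a JSON version string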


2. Download the IBM Granite Model

Ollama supports multiple Granite variants, such as:

  • granite3.2:8b

  • granite3.1-dense:2b

  • granite-code:8b

  • granite3.3:8b

To pull a model, use the following command in your terminal:

ollama pull granite3.3:8b

💡 You can browse other available Granite models in the Ollama library at https://ollama.com/library or via IBM-Granite.

To find the latest available Granite model, search for 'Granite' in the Ollama library.

Figure 1: The Ollama library webpage, showcasing available Granite AI models for download

Example:

ollama pull granite3.3:8b

Figure 2: Progress of downloading the IBM granite3.3:8b AI model using Ollama
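Once the download completes, you can confirm the model is available locally and inspect its metadata:

ollama list                  # all models downloaded to this machine
ollama show granite3.3:8b    # parameters, context length, license, etc.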


3. Run Granite in Terminal

To run the model and interact with it directly via command line:

ollama run granite3.3:8b

Figure 3: Interacting with the Granite AI model via Ollama on macOS

This starts an interactive chat session. Type your prompts and receive AI-generated responses. Exit with /bye or Ctrl + D.
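The chat session isn't the only way to use the model. For scripting, you can pass a one-shot prompt as an argument or call Ollama's REST API directly (the prompt below is just an illustrative example):

ollama run granite3.3:8b "Explain containers in one sentence."

curl http://localhost:11434/api/generate -d '{
  "model": "granite3.3:8b",
  "prompt": "Explain containers in one sentence.",
  "stream": false
}'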


🌐 Using Open WebUI for a Visual Experience

Prefer a UI-based experience similar to ChatGPT? You can integrate Ollama with Open WebUI, a lightweight, containerized interface.


Step-by-Step Installation of Open WebUI

1. Install Podman Desktop

Podman Desktop simplifies container management.

  • Download from: https://podman-desktop.io/

  • Follow installation prompts. It sets up the Podman machine needed for containerized apps on macOS.
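If you prefer the command line to the desktop app, you can set up and verify the Podman machine yourself:

podman machine init     # one-time VM setup on macOS/Windows
podman machine start
podman info             # should report a reachable Podman machine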

2. Pull the Open WebUI Image

Run this in your terminal:

podman pull ghcr.io/open-webui/open-webui:main
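You can verify the image was pulled successfully:

podman images ghcr.io/open-webui/open-webui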

3. Run Open WebUI Container

podman run -d -p 8080:8080 \
  --name open-webui \
  --restart always \
  -v open-webui:/app/backend/data \
  -e OLLAMA_BASE_URL=http://host.containers.internal:11434 \
  ghcr.io/open-webui/open-webui:main
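To confirm the container started cleanly, check its status and follow the startup logs (the first launch can take a minute while Open WebUI initializes):

podman ps --filter name=open-webui
podman logs -f open-webui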

Once running, visit:
🔗 http://localhost:8080/

Note: On your first visit, you’ll be prompted to create a username and password to access the interface. This step helps secure your local instance.

Figure 4: List of AI models downloaded and managed by Ollama on a local machine.

Figure 5: Open WebUI interface displaying the selection of the granite3.3:8b model

You’ll find all available models listed in the interface. Select your desired Granite model and start chatting!

📌 Note: When using Open WebUI, you don't need to manually run ollama run; Open WebUI connects to Ollama in the background.


🛠️ Maintenance & Troubleshooting

To stop, restart, or remove the Open WebUI container:

podman stop open-webui     # stop the running container

podman start open-webui    # start it again later

podman rm open-webui       # remove it entirely (stop it first)
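The models themselves are managed through Ollama, not Podman. To reclaim disk space from a model you no longer use, for example:

ollama list
ollama rm granite3.3:8b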


Final Thoughts

Running IBM's Granite models locally via Ollama gives you the power of LLMs without relying on cloud APIs. With optional integration into Open WebUI, you get a slick user interface complete with prompt history and better usability.
