Get started with Spring AI and Ollama


Introduction
In this article, we look at how to get started with the Spring AI project and connect to Ollama Large Language Models (LLMs) hosted locally on your desktop.
Feel free to browse the other articles in this Getting Started with Spring AI series.
Create a simple Spring Application (Beginners)
We can use Spring Initializr to generate a skeleton project for our simple application. You can do that either in the IDE itself, or directly on start.spring.io with a pre-configured project (e.g. Maven project, Java, Spring Boot 3.4.4, spring-ai-demo artifact, com.example package name, JAR packaging, Java 24, with the Spring Web dependency).
Clicking the “Generate” button will download a project skeleton zip archive, e.g. spring-ai-demo.zip,
which you can unzip and open in an editor of your choice (e.g. IntelliJ, VS Code, etc.)
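If you prefer the command line, Spring Initializr also exposes an HTTP API on start.spring.io that can generate the same skeleton. Here is a sketch of a one-liner mirroring the choices above (the parameter names come from the public Initializr API), e.g.
curl https://start.spring.io/starter.zip -d type=maven-project -d language=java -d bootVersion=3.4.4 -d groupId=com.example -d artifactId=spring-ai-demo -d name=spring-ai-demo -d packageName=com.example -d packaging=jar -d javaVersion=24 -d dependencies=web -o spring-ai-demo.zip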
We can start adding specific AI model Spring Boot starters later on.
Let’s add a HomeController, e.g.
package com.example;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class HomeController {

    @GetMapping("/")
    public String home() {
        return "Hello World!";
    }
}
Let’s run our simple application, e.g.
./mvnw spring-boot:run
Feel free to test it from another terminal window, e.g. using HTTPie: http localhost:8080
http localhost:8080
HTTP/1.1 200
Connection: keep-alive
Content-Length: 12
Content-Type: text/plain;charset=UTF-8
Date: Tue, 11 Apr 2025 08:00:00 GMT
Keep-Alive: timeout=60
Hello World!
or directly in the browser at http://localhost:8080
Feel free to explore the Spring Boot Getting Started Guide for more details on how to create a simple Spring Boot web application.
Connecting your Spring Application to Ollama models
Configure Ollama
In this section we will explore connecting your application to Ollama Large Language Models (LLMs) hosted on your desktop. If you don’t already have Ollama installed, please download and install it on your machine from the Ollama website.
You can start the ollama process from the command line too, e.g. ollama start
ollama start
2025/04/11 07:59:06 routes.go:1230: INFO server config env="map[HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:2048 OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/Users/neven/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:false http_proxy: https_proxy: no_proxy:]"
time=2025-04-11T07:59:06.948+02:00 level=INFO source=images.go:432 msg="total blobs: 79"
time=2025-04-11T07:59:06.956+02:00 level=INFO source=images.go:439 msg="total unused blobs removed: 0"
time=2025-04-11T07:59:06.962+02:00 level=INFO source=routes.go:1297 msg="Listening on 127.0.0.1:11434 (version 0.6.2)"
time=2025-04-11T07:59:07.020+02:00 level=INFO source=types.go:130 msg="inference compute" id=0 library=metal variant="" compute="" driver=0.0 name="" total="21.3 GiB" available="21.3 GiB"
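Once the server is up, you can sanity-check it from another terminal. Assuming the default address shown in the log above, an HTTP GET against port 11434 should come back with a short “Ollama is running” message, e.g.
http localhost:11434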
Download Models for Ollama
You can use the ollama command line to pull models, e.g. ollama pull llama3.2
ollama pull llama3.2
pulling manifest
pulling dde5aa3fc5ff... 100% ▕████████████████████████████████████████████████████████████████████▏ 2.0 GB
pulling 966de95ca8a6... 100% ▕████████████████████████████████████████████████████████████████████▏ 1.4 KB
pulling fcc5a6bec9da... 100% ▕████████████████████████████████████████████████████████████████████▏ 7.7 KB
pulling a70ff7e570d9... 100% ▕████████████████████████████████████████████████████████████████████▏ 6.0 KB
pulling 56bb8bd477a5... 100% ▕████████████████████████████████████████████████████████████████████▏ 96 B
pulling 34bb5ab01051... 100% ▕████████████████████████████████████████████████████████████████████▏ 561 B
verifying sha256 digest
writing manifest
success
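Most models come in several sizes; if your machine is short on memory, you can pull a smaller variant instead, e.g. ollama pull llama3.2:1b for the 1B-parameter build (you can see it in the model list below).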
You can list all the downloaded models, e.g. ollama list
ollama list
NAME                        ID              SIZE      MODIFIED
llama3.2:latest             a80c4f17acd5    2.0 GB    2 minutes ago
nomic-embed-text:v1.5       0a109f422b47    274 MB    8 weeks ago
mistral:latest              f974a74358d6    4.1 GB    3 months ago
mxbai-embed-large:latest    468836162de7    669 MB    3 months ago
llama3.2:1b                 baf6a787fdff    1.3 GB    3 months ago
llama3.2:3b                 a80c4f17acd5    2.0 GB    3 months ago
nomic-embed-text:latest     0a109f422b47    274 MB    3 months ago
Finally, let’s run a model, e.g. ollama run llama3.2
ollama run llama3.2
>>> who are you
I'm an artificial intelligence model known as Llama. Llama stands for "Large Language Model Meta AI."
>>> Send a message (/? for help)
You can type /bye to exit the interactive session. Now, let’s add the Spring AI code to start using the Ollama API.
Add Spring AI for Ollama
We will add the Spring AI Spring Boot starter for Ollama to our Maven pom.xml, e.g.
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-ollama</artifactId>
    <version>1.0.0-M7</version>
</dependency>
Notice that we are using version 1.0.0-M7, the newest at the time of writing this article. Please replace it with the latest Spring AI version.
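If you would rather manage the Spring AI version in one place, Spring AI also publishes a bill of materials (spring-ai-bom) that you can import into your dependencyManagement section; a sketch, assuming the same milestone version:

<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <version>1.0.0-M7</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>

With the BOM in place, the starter dependency above no longer needs its own <version> element.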
We also need to add a few Spring properties to our src/main/resources/application.properties file, e.g.
spring.application.name=spring-ai-demo
spring.ai.ollama.chat.options.model=llama3.2
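By default, the starter assumes Ollama is listening on http://localhost:11434, which matches the address in the server log above. If your Ollama instance runs elsewhere, or you want to tune sampling, you can add more properties; a sketch using the documented spring.ai.ollama.* property names:

spring.ai.ollama.base-url=http://localhost:11434
spring.ai.ollama.chat.options.temperature=0.7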
Notice that we have used the llama3.2 large language model (LLM) for this simple use case. Feel free to explore all the Ollama LLM models; they vary in size, speed, and training data.
Finally, let’s add a call to an LLM from our HomeController, e.g.
package com.example;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class HomeController {

    private final ChatClient chatClient;

    // Spring AI auto-configures a ChatClient.Builder for the Ollama model
    // declared in application.properties
    public HomeController(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    @GetMapping("/")
    public String home() {
        // send the prompt to the model and return its answer as plain text
        return chatClient
                .prompt()
                .user("who are you")
                .call()
                .content();
    }
}
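Hard-coding the prompt is fine for a smoke test, but the same fluent API makes it easy to pass the caller’s message through. Here is a hypothetical variation (not part of the article’s project; the /chat path, ChatController name, and message parameter are made up for illustration) that takes the prompt as a request parameter:

package com.example;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class ChatController {

    private final ChatClient chatClient;

    public ChatController(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    // falls back to "who are you" when no message is supplied
    @GetMapping("/chat")
    public String chat(@RequestParam(defaultValue = "who are you") String message) {
        return chatClient
                .prompt()
                .user(message)
                .call()
                .content();
    }
}

You could then try it with HTTPie’s query-parameter syntax, e.g. http :8080/chat message=="tell me a joke"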
Run the application again, e.g.
./mvnw spring-boot:run
Test the application in another terminal or browser, e.g.
http localhost:8080
HTTP/1.1 200
Connection: keep-alive
Content-Type: text/plain;charset=UTF-8
Date: Tue, 11 Apr 2025 08:05:00 GMT
Keep-Alive: timeout=60
I am an AI language model created by OpenAI, designed to assist with a wide range of
questions and topics. I am here to provide information, answer questions, and engage
in conversation. How can I help you today?
🏆 Congratulations! You have created your first application that talks to an Ollama-hosted LLM service.
Hope you enjoyed this beginner’s Getting Started with Spring AI article!
Connecting to other LLMs
If you want to run your Spring AI application with other LLM models, take a look at the other articles in this series.