Serving LLMs Locally with Ollama

Ilkay Polat

When working with large language models (LLMs) locally, Ollama is a powerful and efficient solution. It is a tool that lets developers run a wide range of open-source LLMs on their own machines without relying on cloud-based services. This makes it an excellent option for development and testing, especially for those who need to work with AI models offline or prefer local execution.

Getting Started with Ollama

To use Ollama, you first need to download and install it on your local machine. Once installed, you can pull the models you wish to use with the ollama command-line tool. Some of the popular models available include:

  • Meta’s Llama 3.1

  • Google’s Gemma

  • Alibaba’s Qwen

  • MistralAI’s Mistral 7B

For example, to install the Gemma 2B model locally, use the following command:

$ ollama pull gemma:2b

Similarly, to use MistralAI’s Mistral 7B model, execute:

$ ollama pull mistral:7b
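
Once a model has been pulled, you can also chat with it straight from the terminal to check that everything works. For example:

$ ollama run gemma:2b

This opens an interactive prompt; type /bye to exit.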

To check which models are installed on your machine, use:

$ ollama list

For more detailed information on the installed models, you can query Ollama's REST API (shown here with HTTPie):

$ http http://localhost:11434/api/tags -b
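
If you don't have HTTPie installed, the same endpoint can be queried with curl:

$ curl http://localhost:11434/api/tags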

Using Ollama with Spring AI

Once the models are installed and Ollama is running, you can integrate them into your project using Spring AI’s Ollama starter dependency:

implementation 'org.springframework.ai:spring-ai-ollama-spring-boot-starter'

Unlike cloud-based AI providers such as OpenAI or MistralAI, Ollama does not require an API key, as it runs locally. This simplifies setup and enhances privacy by keeping data on your machine.
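
With the starter on the classpath and Ollama running, Spring Boot auto-configures the Ollama chat model for you. As a minimal sketch, here is a controller using Spring AI's fluent ChatClient API (available in recent Spring AI versions); the ChatController class, the /chat endpoint, and the question parameter are illustrative names, not part of the starter:

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
class ChatController {

    private final ChatClient chatClient;

    // The starter auto-configures a ChatClient.Builder backed by the local Ollama model
    ChatController(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    @GetMapping("/chat")
    String chat(@RequestParam String question) {
        // Send the user message to the locally served model and return its text response
        return chatClient.prompt()
                .user(question)
                .call()
                .content();
    }
}

A request such as GET /chat?question=Tell me a joke is then answered entirely by the model running on your machine.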

By default, Spring AI uses the Mistral 7B model with Ollama. If you want to use a different model, specify it using the spring.ai.ollama.chat.model property in your configuration:

spring.ai.ollama.chat.model=gemma:2b

This setting directs Spring AI to use the Gemma 2B model from Ollama running on your local machine.
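
The Ollama endpoint itself is also configurable. By default, Spring AI talks to Ollama at localhost:11434; if your instance runs elsewhere, point the base URL at it:

spring.ai.ollama.base-url=http://localhost:11434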

Exploring More

For a practical implementation of a Spring AI-based project, you can check out the Board Game Buddy repository on GitHub.

Conclusion

Ollama is an excellent choice for running AI models locally without the need for cloud services. By integrating it with Spring AI, developers can streamline AI development while maintaining full control over their data and computational resources. Whether for development, testing, or privacy-sensitive applications, Ollama provides a robust and flexible solution for working with LLMs on your local machine.
