Ollama: a step-by-step guide to running open-source models locally on your machine

hey everyone! today, we’ll explore Ollama—what it is, how it differs from LLaMA, how it can be helpful, the models it supports, its limitations, and the possibilities it opens when paired with cloud computing. let’s break it all down in simple terms.


what exactly is Ollama?

Ollama is a platform designed to make working with Large Language Models (LLMs) more accessible and efficient. it acts as a tool that allows you to run and interact with LLMs on your local machine or in the cloud.

it’s particularly useful for developers, researchers, and organizations that want to use LLMs without diving deep into complex configurations.

in short: think of ollama as your personal assistant for managing and using ai models with ease.


are LLaMA and Ollama the same?

just like people often confuse git and GitHub, LLaMA and Ollama are not the same. they serve different purposes but are related.

  • LLaMA stands for Large Language Model Meta AI. it is a family of advanced models created by meta for natural language processing (NLP) tasks like text generation, summarization, and translation.

  • Ollama is a platform designed to make it easier to work with LLMs, including LLaMA and others. it provides a user-friendly interface, tools for interaction, and features like model customization and deployment, helping you use these models more efficiently.

think of LLaMA as the engine that powers ai tasks, and Ollama as the control panel that lets you manage and use the engine seamlessly.

why is this important?

while llama focuses on being a powerful language model, ollama helps developers and businesses use it effectively by handling setup, execution, and optimization.

ollama isn’t limited to llama; it also supports other popular llms, offering flexibility for diverse use cases. this makes it an essential tool for anyone exploring llms for their applications.


how is Ollama helpful?

Ollama simplifies working with ai models in several ways:

  1. easy setup
    you don’t need to be an ai expert to use it. ollama makes running llms on your local machine or the cloud straightforward with minimal setup.

  2. runs locally
    you can run models directly on your machine, keeping your data private. this is great for those who prefer local processing over cloud-based solutions.

  3. customization
    ollama lets you customize models for specific use cases, so you can adapt ai to your unique needs, like building chatbots, summarizing documents, or generating creative content (see the Modelfile sketch after this list).

  4. model selection
    it supports multiple models, giving you the flexibility to choose the right one for your task.

  5. integration-ready
    it integrates seamlessly with tools and workflows, making it easier to bring ai into your applications.
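to make the customization point concrete: in ollama this is done with a Modelfile, which layers parameters and a system prompt on top of a base model (it tweaks behavior rather than retraining weights). here’s a minimal sketch, where the name my-assistant and the system prompt are just placeholders:

FROM llama3.2:1b
PARAMETER temperature 0.7
SYSTEM "you are a concise assistant that answers in plain english."

save this as a file named Modelfile, then build and run your customized model:

ollama create my-assistant -f Modelfile
ollama run my-assistant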


how many models does ollama support?

ollama supports a variety of models designed for different tasks and applications. here are some key models it supports, with parameter counts and approximate download sizes:

  • phi-3 mini: 3.8 billion parameters, 2.3 gb

  • llama 3: 8 billion parameters, 4.7 gb

  • dolphin llama 3: 8 billion parameters, 4.7 gb (an uncensored fine-tune of llama 3)

  • wizardlm-2: 7 billion parameters, 4.1 gb

  • llama 2: 7 billion parameters, 3.8 gb

  • mistral: 7 billion parameters, 4.1 gb

  • command r: 35 billion parameters, 20 gb

  • gemma: available in sizes of 2b and 7b

beyond these, the ollama library lists many more models for various applications, covering both general-purpose and specialized tasks.

for a complete list and more details on each model, you can refer to the official ollama model library at https://ollama.com/library.
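every model in the library is addressed by a name:tag pair, which is what you pass to ollama’s pull and run commands. for example, to grab two of the smaller models above (tags current as of writing; check the library page if they’ve changed):

ollama pull phi3:mini
ollama pull gemma:2b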


cons of using ollama

like any tool, ollama has its drawbacks:

  1. resource-intensive
    running large models locally requires powerful hardware (like high ram and gpus). without sufficient resources, performance may be slow.

  2. limited offline functionality
    while ollama supports local processing, certain features still need an internet connection, such as pulling new models or fetching updates.

  3. learning curve
    though it simplifies working with ai, beginners might still need time to fully grasp the system’s features and capabilities.

  4. cost considerations
    using advanced models on the cloud may incur costs, especially for large-scale applications.


ollama + cloud: unlocking new possibilities

ollama becomes even more powerful when paired with cloud computing:

  1. scalability
    running models on the cloud removes hardware limitations. you can process large datasets and handle high traffic without worrying about local resources.

  2. collaboration
    deploying ollama in the cloud allows teams to share and collaborate on ai applications seamlessly.

  3. cost efficiency
    pay-as-you-go cloud models ensure you’re only paying for the resources you use, making it budget-friendly for startups and enterprises alike.

  4. global accessibility
    with cloud-based deployment, your ai solutions can be accessed from anywhere, enabling remote work and global outreach.

imagine using ollama in the cloud to power real-time chatbots, automate workflows, or generate insights at scale.
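as a sketch of what that looks like in practice: if ollama is running on a cloud VM, the same CLI you use locally can talk to it by pointing the OLLAMA_HOST environment variable at the server (the hostname below is a placeholder for your own machine):

OLLAMA_HOST=http://your-cloud-vm:11434 ollama run llama3.2:1b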


quick guide: how to get started with Ollama

if you want to run any model using Ollama, here's a simple guide to help you get started:

1. install ollama

first, you'll need to install ollama. open your terminal and run this command:

curl -fsSL https://ollama.com/install.sh | sh

this will download and install ollama on linux; on macOS or windows, you can grab the installer from ollama.com instead.
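you can check that the install worked by asking for the version:

ollama --version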

2. pull the model

once ollama is installed, you need to pull the model you want to run. replace model_name:tag with the model you need. for example, to pull the 1b-parameter llama 3.2 model, run:

ollama pull llama3.2:1b

this will download the specified model so you can use it locally. how long it takes depends on the size of the model you’ve chosen and your internet connection.
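to confirm the download, you can list every model stored locally along with its size:

ollama list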

3. test the model

after the model is downloaded, it’s good practice to test if it's working correctly. run the following command:

ollama run llama3.2:1b

this drops you into an interactive session with the model. if it responds to your prompts without errors, you’ve successfully set up ollama with your model!
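you don’t have to use the interactive session every time; passing the prompt directly gives you a one-off answer and then exits:

ollama run llama3.2:1b "explain what a large language model is in one sentence."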

now you can run any of the open-source models ollama supports locally. simply install ollama, pull the desired model, and run it on your system. whether it’s a small model like llama 3.2 1b or any other supported model, ollama makes it easy to set up and test locally with just a few simple commands.
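one more thing worth knowing, since it’s what makes the “integration-ready” point from earlier practical: whenever ollama is running, it also serves a REST API on localhost port 11434. here’s a minimal sketch against the documented /api/generate endpoint, assuming you’ve already pulled llama3.2:1b:

curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2:1b",
  "prompt": "why is the sky blue?",
  "stream": false
}'

with "stream": false, the endpoint returns a single json object containing the model’s full response instead of streaming tokens one by one.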


closing note

ollama bridges the gap between complex ai models and real-world applications. whether you’re running models locally for privacy or scaling them on the cloud for enterprise needs, ollama makes LLMs more accessible, flexible, and efficient.

explore ollama, experiment with different models, and unlock new possibilities for your projects!

see ya, happy building :)

always eager to learn and connect with like-minded individuals. happy to connect with you all here!
