Run LLMs locally with Ollama

Talvinder Singh
2 min read

So far we have been interacting with Large Language Models (LLMs) through API calls or in the browser, via tools like ChatGPT, Claude, Perplexity, and many others.

In this post, I will walk you through the steps to run LLMs on your local laptop or desktop using an IDE and Ollama.

What is Ollama?

Ollama is a tool that allows you to run large language models (LLMs) locally on your computer. It simplifies the process of downloading, managing, and interacting with these models, acting as a bridge between the models and your system.

Why run them locally?

Interesting question. There are several reasons:

  1. Privacy & Data Security

  2. Integration into Local Apps. Run multi-agent systems and workflows built with AutoGen, CrewAI, or LangGraph (see the sketch after this list)

  3. No Rate Limits or Throttling

  4. Experiment Freely. Test models like llama3, mistral, tinyllama, codellama, etc.

  5. Benchmark for accuracy, speed, and token cost before production use

  6. Build internal copilots

  7. Zero Inference Cost
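
To make point 2 concrete, here is a minimal sketch of calling a locally running model from Python over Ollama's REST API. The `ask` helper is just an illustrative name; the sketch assumes the Ollama server is running on its default port (11434) and that the tinyllama model has already been pulled, both of which are covered in the steps below.

```python
import json
import urllib.request

# Ollama exposes a local REST API; /api/generate does a one-shot completion.
# Assumes the Ollama server is running on the default port (11434)
# and that tinyllama has already been pulled.
OLLAMA_URL = "http://localhost:11434/api/generate"

def ask(prompt: str, model: str = "tinyllama") -> str:
    payload = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,  # return one JSON object instead of a token stream
    }).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

print(ask("Explain what Ollama does in one sentence."))
```

Because this is just a local HTTP endpoint, any app or agent framework on your machine can talk to it the same way.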

Let's get down to business:

  1. Download Ollama for your OS from https://ollama.com/download.

  2. Once installed, you can verify that Ollama is running locally by browsing to http://localhost:11434/, which should respond with "Ollama is running".

  3. Go to https://ollama.com/search, where you can browse the full list of models you can pull onto your laptop from any IDE or terminal.

  4. Run this command in your terminal: `ollama pull tinyllama`. TinyLlama is a compact 1.1B Llama model, about 638 MB in size.

  5. The command above pulls the tinyllama model image onto your laptop. Run `ollama list` to see the models available locally.

  6. Run `ollama run tinyllama` and the magic happens: you can now chat with the model. To exit, type /bye.
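
Beyond the interactive terminal chat, you can drive the same model from code. Here is a minimal sketch using the official ollama Python package; it assumes you have installed it (pip install ollama) and that the server and tinyllama model from the steps above are in place.

```python
import ollama  # official Python client; install with: pip install ollama

# Assumes the Ollama server is running and tinyllama has been pulled.
response = ollama.chat(
    model="tinyllama",
    messages=[{"role": "user", "content": "Write a haiku about local LLMs."}],
)
print(response["message"]["content"])
```

Swapping in a different model is just a matter of changing the model argument, so the same snippet works for any model you have pulled.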

There are plenty of models available, including multimodal ones, which you can install locally using `ollama pull modelname:tag`.

Bonus: you can also run:

  - Reasoning models like deepseek-r1 and qwen3

  - Embedding models like nomic-embed-text and paraphrase-multilingual

  - Multimodal models like llava, llama4, and more
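
Embedding models are worth a quick try, since they are the building block for the RAG assistant teased below. A small sketch, again assuming the ollama Python package and that you have run `ollama pull nomic-embed-text` first:

```python
import ollama

# Assumes: ollama pull nomic-embed-text has already been run.
result = ollama.embeddings(
    model="nomic-embed-text",
    prompt="Ollama runs LLMs locally.",
)
vector = result["embedding"]
print(f"Embedding dimension: {len(vector)}")  # 768 dimensions for nomic-embed-text
```

The returned vector can be stored in any vector database and compared against query embeddings for semantic search.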

I hope it was helpful. :)

Next up: implementing RAG with an LLM to build a contextual assistant that searches an internal knowledge base (KB) and suggests recommendations based on context. Stay tuned.


Written by

Talvinder Singh

I am a Lead DevOps/Platform engineer, the author of a novel, and an avid reader.