How I Turned an Old Laptop into a Local AI Server (for Free!) with Ollama + Cloudflare Tunnel

Raghul M

Introduction :

I had an old laptop and a wild idea…

Most GenAI tools need expensive GPUs or cloud credits. I had neither. So I asked myself: can I run a language model locally, without the cloud, and still make it accessible from anywhere?

Turns out, yes, and it was surprisingly fun. Here's exactly how I built a self-hosted AI server using Ollama and Cloudflare Tunnel, step by step.


Project overview :

Hardware & OS Setup 💻

I used an old laptop with the following specs:

  • Storage: 465 GB

  • RAM: 4 GB

  • OS: Xubuntu (lightweight and efficient) or Linux Mint (any Linux distribution will do)

Why Xubuntu/Linux Mint?

  • Low memory usage

  • Fast performance on older hardware

  • Easy to set up and supports modern tools


Installing Ollama :

Ollama is a powerful CLI tool that allows you to run and interact with language models locally. Here's how I installed it:

$ curl -fsSL https://ollama.com/install.sh | sh

Then I pulled a lightweight model for fast performance:

$ ollama pull tinyllama

I later tried deepseek-r1:1.5b and it worked great too!
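
For a quick sanity check before wiring anything else up, you can also chat with a pulled model directly in the terminal (the prompt here is just an example):

$ ollama run tinyllama "Write a haiku about old laptops"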


🔄 Serving the LLM

To serve the model and make it reachable from other machines (0.0.0.0 binds to all network interfaces):

$ OLLAMA_HOST=0.0.0.0 ollama serve

Note: If you hit an "address already in use" error, Ollama is most likely already running as a background service on the default port 11434. Either stop that service or bind to a different port (the port is set as part of OLLAMA_HOST):

$ OLLAMA_HOST=0.0.0.0:11435 ollama serve

Verify it's working; this lists all the models you've pulled (adjust the port to match the one you're serving on):

$ curl http://localhost:11435/api/tags
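
If you'd rather verify from a script, here's a minimal Python check. This is a sketch that assumes the requests library is installed and that the server is listening on port 11435; swap in 11434 if you kept the default.

import requests

# Base URL of the local Ollama server; change the port to match your setup.
OLLAMA_URL = "http://localhost:11435"

# List the models Ollama knows about (same as the curl check above).
tags = requests.get(f"{OLLAMA_URL}/api/tags", timeout=10).json()
print([m["name"] for m in tags.get("models", [])])

# Ask for a short, non-streaming completion to confirm inference works.
resp = requests.post(
    f"{OLLAMA_URL}/api/generate",
    json={"model": "tinyllama", "prompt": "Say hello in one sentence.", "stream": False},
    timeout=300,
)
print(resp.json().get("response", ""))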

Exposing the Ollama Localhost with Cloudflare Tunnel

To make the local server publicly accessible, I used Cloudflare Tunnel. Open a new terminal and install cloudflared:

$ wget https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64 -O cloudflared
$ chmod +x cloudflared
$ sudo mv cloudflared /usr/local/bin/

To verify the installation, run:

$ cloudflared --version

Once verified, let's expose the local server with cloudflared:

$ cloudflared tunnel --url http://localhost:11435

This gives you a public URL like:

https://your-unique-subdomain.trycloudflare.com
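
Keep in mind that these quick tunnels get a random trycloudflare.com subdomain that changes every time you restart cloudflared; if you need a stable hostname, you would have to set up a named tunnel tied to your own domain in a Cloudflare account.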

Now your Ollama model is securely accessible from anywhere! You can call it with curl or through the API from any application.

curl https://your-unique-subdomain.trycloudflare.com/api/generate -d '{
  "model": "deepseek-r1r:1.5b",
  "prompt": "Write a Python function to reverse a string"
}'
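
One detail worth knowing: /api/generate streams newline-delimited JSON chunks by default, so the output of the curl above arrives piece by piece. Add "stream": false to the payload if you want a single JSON object back, which is exactly what the Streamlit app below does.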

🌐 Creating a Streamlit Frontend

I built a Streamlit app to interact with the model easily:

import streamlit as st
import requests

# --- Config ---
OLLAMA_BASE_URL = "https://your-unique-subdomain.trycloudflare.com"  # your tunnel URL, or http://localhost:11434 for local use
TAGS_URL = f"{OLLAMA_BASE_URL}/api/tags"
GENERATE_URL = f"{OLLAMA_BASE_URL}/api/generate"

# --- Page Settings ---
st.set_page_config(page_title="Ollama Chat", layout="centered")
st.title("🧠 Ollama Chat Interface")

# --- Fetch Models ---
@st.cache_data
def fetch_models():
    try:
        response = requests.get(TAGS_URL)
        if response.status_code == 200:
            data = response.json()
            models = [model["name"] for model in data.get("models", [])]
            return models
        else:
            st.error(f"Failed to fetch models: {response.status_code}")
            return []
    except Exception as e:
        st.error(f"Error fetching models: {e}")
        return []

# --- UI: Select Model ---
models = fetch_models()
if not models:
    st.warning("No models available. Please load a model into Ollama.")
    st.stop()

selected_model = st.selectbox("📦 Choose a model:", models)

# --- UI: Prompt Input ---
prompt = st.text_area("💬 Enter your prompt:", height=200)

if st.button("🚀 Generate Response"):
    if not prompt.strip():
        st.warning("Prompt cannot be empty.")
    else:
        with st.spinner("Generating response..."):
            payload = {
                "model": selected_model,
                "prompt": prompt,
                "stream": False
            }
            try:
                response = requests.post(GENERATE_URL, json=payload)
                if response.status_code == 200:
                    result = response.json()
                    st.markdown("### ✅ Response")
                    st.write(result.get("response", "No response received."))
                else:
                    st.error(f"Error {response.status_code}: {response.text}")
            except Exception as e:
                st.error(f"Request failed: {e}")

📊 Bonus Tips

  • Use htop or glances to monitor memory and CPU.

  • Check disk and partition usage with lsblk and df -h.

  • Use lightweight models for fast inference on low-end machines.

🚀 What You Can Build

  • Personal chatbot

  • Code generation tool

  • Lightweight AI backend for your apps

  • Home AI server on a budget


📖 Conclusion :

This project proved something awesome: you don’t need top-tier hardware to experiment with AI. With a bit of creativity and the right tools, even an old laptop can become a powerful AI playground.

Feel free to check out the GitHub repository for the complete setup.

If you try this out or want help replicating it, feel free to reach out! Raghul M


Written by

Raghul M

I'm the founder of CareerPod, a Software Quality Engineer at Red Hat, Python Developer, Cloud & DevOps Enthusiast, AI/ML Advocate, and Tech Enthusiast. I enjoy building projects, sharing valuable tips for new programmers, and connecting with the tech community. Check out my blog at Tech Journal.