Node.js RecSys: Ollama & Vector Magic

The Quest Begins

Picture this: you're scrolling through Netflix for the 147th time, and somehow their algorithm thinks you'd enjoy "Cats" because you once watched "The Lion King." We've all been there.

Today, we're staging a rebellion. We're building something better (or at least attempting to): a recommendation system that actually makes sense, powered by embeddings, vector databases, and enough JavaScript magic to make even the most seasoned developer do a double-take.

What We're Building

We'll craft a movie recommendation system that:

Uses Ollama to generate embeddings from movie descriptions (because teaching machines to read is surprisingly effective)
Stores vectors in a simple in-memory database (keeping it lean and mean)
Finds similar movies using Approximate Nearest Neighbor (ANN) search (more on that in the next article)
Serves recommendations via a clean REST API (because nobody likes messy endpoints)

No complex infrastructure, no PhD in machine learning required, no sacrificial offerings to the cloud gods—just good old Node.js doing what it does best: making developers' lives easier while running on approximately three cups of coffee.

The Tech Stack

Node.js: Because JavaScript conquered the world, and we're not complaining
Ollama: Our local model runner that turns text into magical vectors
hnswlib-node: For ANN search (because life's too short for exact searches)
Express: To make our API as smooth as a good espresso

Project Structure

movie-recommender/
├── package.json
├── server.js
├── lib/
│   ├── vectorDb.js
│   ├── embeddings.js
│   └── recommender.js
├── data/
│   └── movies.json
└── README.md

The Magic Behind the Curtain

How Embeddings Work (In Human Terms)

Think of embeddings as a way to convert text into coordinates in a high-dimensional space. Similar movies end up close to each other, like finding your tribe at a party—action movies hang out together, rom-coms form their own cluster, and horror movies lurk in the corner making everyone uncomfortable.
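
To make "close to each other" concrete, here is a minimal sketch of cosine similarity, the same measure the index is configured with later on. The three-dimensional vectors are made-up toy values, not real embeddings (nomic-embed-text produces 768 numbers per text), and `cosineSimilarity` is a name chosen just for this illustration.

```javascript
// Cosine similarity: close to 1 means "pointing the same way",
// close to 0 means unrelated. Zero-length vectors are not handled here.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same number of dimensions');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy 3-dimensional "embeddings" (hypothetical values for illustration only).
const sciFiA = [0.9, 0.1, 0.0];
const sciFiB = [0.8, 0.2, 0.1];
const romCom = [0.0, 0.2, 0.9];

// The two sci-fi vectors score higher with each other than with the rom-com.
console.log(cosineSimilarity(sciFiA, sciFiB) > cosineSimilarity(sciFiA, romCom)); // true
```

Real embeddings work the same way, just with many more dimensions than anyone can picture.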

Why ANN (Approximate Nearest Neighbor)?

When you have thousands (or millions) of vectors, finding the exact nearest neighbors becomes slower than a dial-up internet connection, because every query has to be compared against every stored vector. ANN algorithms like HNSW (Hierarchical Navigable Small World) give us "close enough" results blazingly fast. It's like asking for directions and getting "turn left at the big oak tree" instead of GPS coordinates: practical and efficient.
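
For contrast, this is roughly what exact search looks like: score every stored vector against the query, sort, and keep the top k. It's a sketch with hypothetical names (`exactNearest`, toy 2-D vectors), and it assumes the vectors are already normalized so a plain dot product equals cosine similarity. The work grows linearly with the number of vectors, which is the full scan HNSW lets us skip.

```javascript
// Dot product of two equal-length vectors.
function dot(a, b) {
  return a.reduce((sum, value, i) => sum + value * b[i], 0);
}

// Exact (brute-force) nearest neighbors: compare the query with every item.
function exactNearest(items, query, k) {
  return items
    .map(item => ({ id: item.id, similarity: dot(item.vector, query) }))
    .sort((x, y) => y.similarity - x.similarity) // highest similarity first
    .slice(0, k);
}

// Toy 2-D unit vectors (hypothetical values, not real embeddings).
const items = [
  { id: 1, vector: [1, 0] },
  { id: 2, vector: [0.6, 0.8] },
  { id: 3, vector: [0, 1] }
];

console.log(exactNearest(items, [0.8, 0.6], 2).map(r => r.id)); // [ 2, 1 ]
```

With five movies this is perfectly fine. With five million, every query pays for five million dot products.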

Let's Build This Thing!

Prerequisites

Node.js (v18 or higher)
Ollama - Download from ollama.ai
Embedding model: Run ollama pull nomic-embed-text

Step 1: Project Setup

First, let's get our dependencies sorted:

npm init -y
npm install express ollama hnswlib-node
npm install --save-dev nodemon

Step 2: Sample Movie Data

We'll start with a curated list of movies to keep things interesting:

[
  {
    "id": 1,
    "title": "The Matrix",
    "genre": "Sci-Fi",
    "description": "A computer hacker learns from mysterious rebels about the true nature of his reality and his role in the war against its controllers."
  },
  {
    "id": 2,
    "title": "Inception",
    "genre": "Sci-Fi",
    "description": "A thief who steals corporate secrets through dream-sharing technology is given the inverse task of planting an idea into the mind of a C.E.O."
  },
  {
    "id": 3,
    "title": "The Godfather",
    "genre": "Crime",
    "description": "The aging patriarch of an organized crime dynasty transfers control of his clandestine empire to his reluctant son."
  },
  {
    "id": 4,
    "title": "Pulp Fiction",
    "genre": "Crime",
    "description": "The lives of two mob hitmen, a boxer, a gangster and his wife intertwine in four tales of violence and redemption."
  },
  {
    "id": 5,
    "title": "The Shawshank Redemption",
    "genre": "Drama",
    "description": "Two imprisoned men bond over a number of years, finding solace and eventual redemption through acts of common decency."
  }
]
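
Each `id` later doubles as the label for a vector in the index, so it can be worth checking the data before embedding it. The helper below is a hypothetical addition (`validateMovies` is not one of the project files): it assumes ids should be unique non-negative integers and that the three text fields are non-empty strings.

```javascript
// Returns a list of human-readable problems; an empty list means the data looks fine.
function validateMovies(movies) {
  const problems = [];
  const seenIds = new Set();

  movies.forEach((movie, position) => {
    if (!Number.isInteger(movie.id) || movie.id < 0) {
      problems.push(`Entry ${position}: id must be a non-negative integer`);
    } else if (seenIds.has(movie.id)) {
      problems.push(`Entry ${position}: duplicate id ${movie.id}`);
    } else {
      seenIds.add(movie.id);
    }

    for (const field of ['title', 'genre', 'description']) {
      if (typeof movie[field] !== 'string' || movie[field].trim() === '') {
        problems.push(`Entry ${position}: missing ${field}`);
      }
    }
  });

  return problems;
}

// Deliberately broken sample (descriptions shortened for the example).
const sample = [
  { id: 1, title: 'The Matrix', genre: 'Sci-Fi', description: 'A computer hacker learns the truth.' },
  { id: 1, title: 'Inception', genre: 'Sci-Fi', description: '' }
];

console.log(validateMovies(sample)); // two problems: a duplicate id and a missing description
```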

Step 3: The Vector Database

Our vector database will be simpler than your morning coffee order:

// lib/vectorDb.js
const { HierarchicalNSW } = require('hnswlib-node');

class SimpleVectorDB {
  constructor(dimensions = 768) {
    this.dimensions = dimensions;
    this.index = new HierarchicalNSW('cosine', dimensions);
    this.movies = new Map();
    this.initialized = false;
  }

  async initialize(maxElements = 1000) {
    this.index.initIndex(maxElements);
    this.initialized = true;
    console.log('Vector database initialized with', maxElements, 'max elements');
  }

  async addMovie(id, vector, movieData) {
    if (!this.initialized) {
      throw new Error('Database not initialized. Call initialize() first!');
    }

    this.index.addPoint(vector, id);
    this.movies.set(id, movieData);
    console.log(`Added movie: ${movieData.title}`);
  }

  async findSimilar(queryVector, k = 5) {
    if (!this.initialized) {
      throw new Error('Database not initialized');
    }

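    // Never ask for more neighbors than the index currently holds
    const neighborCount = Math.min(k, this.movies.size);
    if (neighborCount === 0) return [];

    const results = this.index.searchKnn(queryVector, neighborCount);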
    return results.neighbors.map((id, idx) => ({
      movie: this.movies.get(id),
      similarity: 1 - results.distances[idx] // Convert distance to similarity
    }));
  }

  getMovie(id) {
    return this.movies.get(id);
  }

  getAllMovies() {
    return Array.from(this.movies.values());
  }
}

module.exports = SimpleVectorDB;

Step 4: Embedding Service

Here's where Ollama works its magic:

// lib/embeddings.js
const { Ollama } = require('ollama');

class EmbeddingService {
  constructor() {
    this.ollama = new Ollama({ host: 'http://localhost:11434' });
    this.model = 'nomic-embed-text'; // Lightweight embedding model
  }

  async generateEmbedding(text) {
    try {
      console.log('Generating embedding for:', text.substring(0, 50) + '...');

      const response = await this.ollama.embeddings({
        model: this.model,
        prompt: text
      });

      return response.embedding;
    } catch (error) {
      console.error('Error generating embedding:', error.message);
      throw new Error(`Failed to generate embedding: ${error.message}`);
    }
  }

  async generateMovieEmbedding(movie) {
    // Combine title, genre, and description for richer embeddings
    const text = `${movie.title} ${movie.genre} ${movie.description}`;
    return this.generateEmbedding(text);
  }
}

module.exports = EmbeddingService;
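
If you're curious what the client does under the hood, the sketch below makes a comparable request to Ollama's REST endpoint by hand. It assumes the `/api/embeddings` route with a `{ model, prompt }` body and an `{ embedding }` response, matching the call used in the service. `embedViaHttp` and the injectable `fetchImpl` option are hypothetical names, added so the function can be tried without a running server.

```javascript
// Sketch: request an embedding from Ollama over plain HTTP.
async function embedViaHttp(text, options = {}) {
  const {
    host = 'http://localhost:11434',
    model = 'nomic-embed-text',
    fetchImpl = fetch // global fetch, available in Node.js 18+
  } = options;

  const response = await fetchImpl(`${host}/api/embeddings`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model, prompt: text })
  });

  if (!response.ok) {
    throw new Error(`Ollama responded with HTTP ${response.status}`);
  }

  const data = await response.json();
  return data.embedding; // array of numbers
}

// Example with a stubbed fetch (no Ollama needed); the vector is made up.
const fakeFetch = async () => ({
  ok: true,
  status: 200,
  json: async () => ({ embedding: [0.1, 0.2] })
});

embedViaHttp('The Matrix', { fetchImpl: fakeFetch })
  .then(vector => console.log(vector.length)); // 2
```

Leave out `fetchImpl` and the same function talks to a local Ollama instance, provided it is running and the model has been pulled.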

Step 5: The Recommendation Engine

The heart of our system:

// lib/recommender.js
const EmbeddingService = require('./embeddings'); // matches lib/embeddings.js
const SimpleVectorDB = require('./vectorDb');

class MovieRecommender {
  constructor() {
    this.embeddingService = new EmbeddingService();
    this.vectorDb = new SimpleVectorDB();
  }

  async initialize(movies) {
    console.log('Initializing movie recommender...');

    // Initialize the vector database
    await this.vectorDb.initialize(movies.length * 2);

    // Generate embeddings for all movies
    for (const movie of movies) {
      try {
        const embedding = await this.embeddingService.generateMovieEmbedding(movie);
        await this.vectorDb.addMovie(movie.id, embedding, movie);
      } catch (error) {
        console.error(`Failed to process movie ${movie.title}:`, error.message);
      }
    }

    console.log('Recommender system ready!');
  }

  async recommendByMovie(movieId, count = 5) {
    const movie = this.vectorDb.getMovie(movieId);
    if (!movie) {
      throw new Error(`Movie with ID ${movieId} not found`);
    }

    console.log(`Finding movies similar to: ${movie.title}`);

    // Get the movie's embedding and find similar ones
    const movieEmbedding = await this.embeddingService.generateMovieEmbedding(movie);
    const similar = await this.vectorDb.findSimilar(movieEmbedding, count + 1); // +1 to exclude self

    // Filter out the original movie
    return similar
      .filter(result => result.movie.id !== movieId)
      .slice(0, count);
  }

  async recommendByDescription(description, count = 5) {
    console.log('Finding movies similar to description:', description.substring(0, 50) + '...');

    const queryEmbedding = await this.embeddingService.generateEmbedding(description);
    return await this.vectorDb.findSimilar(queryEmbedding, count);
  }

  getAllMovies() {
    return this.vectorDb.getAllMovies();
  }
}

module.exports = MovieRecommender;

Step 6: The API Server

Finally, let's wrap it all up in a neat Express server:

// server.js
const express = require('express');
const fs = require('fs').promises;
const path = require('path');
const MovieRecommender = require('./lib/recommender');

const app = express();
const PORT = process.env.PORT || 3000;

let recommender;

app.use(express.json());
app.use(express.static('public'));

// Initialize the recommender system
async function initializeSystem() {
  try {
    console.log('Loading movie data...');
    const moviesData = await fs.readFile(path.join(__dirname, 'data/movies.json'), 'utf8');
    const movies = JSON.parse(moviesData);

    recommender = new MovieRecommender();
    await recommender.initialize(movies);

    console.log('System ready to recommend!');
  } catch (error) {
    console.error('Failed to initialize system:', error);
    process.exit(1);
  }
}

// API Routes
app.get('/api/movies', (req, res) => {
  try {
    const movies = recommender.getAllMovies();
    res.json({ success: true, movies });
  } catch (error) {
    res.status(500).json({ success: false, error: error.message });
  }
});

app.get('/api/recommend/movie/:id', async (req, res) => {
  try {
    const movieId = parseInt(req.params.id, 10);
    const count = parseInt(req.query.count, 10) || 3;

    const recommendations = await recommender.recommendByMovie(movieId, count);

    res.json({
      success: true,
      recommendations: recommendations.map(r => ({
        ...r.movie,
        similarity: Math.round(r.similarity * 100) / 100
      }))
    });
  } catch (error) {
    res.status(400).json({ success: false, error: error.message });
  }
});

app.post('/api/recommend/description', async (req, res) => {
  try {
    const { description, count = 3 } = req.body;

    if (!description) {
      return res.status(400).json({ 
        success: false, 
        error: 'Description is required' 
      });
    }

    const recommendations = await recommender.recommendByDescription(description, count);

    res.json({
      success: true,
      recommendations: recommendations.map(r => ({
        ...r.movie,
        similarity: Math.round(r.similarity * 100) / 100
      }))
    });
  } catch (error) {
    res.status(500).json({ success: false, error: error.message });
  }
});

app.get('/health', (req, res) => {
  res.json({ status: 'healthy', timestamp: new Date().toISOString() });
});

// Start the server once the recommender is ready, so no request hits an uninitialized system
initializeSystem().then(() => {
  app.listen(PORT, () => {
    console.log(`Movie Recommender API running on http://localhost:${PORT}`);
  });
});
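
One thing the routes gloss over is the `count` parameter: a negative or enormous value goes straight through to the search, and asking an index for far more neighbors than it holds may fail. A small guard like the one below could help. `parseCount` and its limits (default 3, at most 20) are assumptions made for this sketch, not part of the server code.

```javascript
// Hypothetical helper: turn a raw `count` value into a safe integer.
function parseCount(raw, { fallback = 3, min = 1, max = 20 } = {}) {
  const parsed = Number.parseInt(raw, 10);
  if (Number.isNaN(parsed)) {
    return fallback; // missing or non-numeric input
  }
  return Math.min(Math.max(parsed, min), max); // clamp into [min, max]
}

console.log(parseCount(undefined)); // 3
console.log(parseCount('5'));       // 5
console.log(parseCount('-2'));      // 1
console.log(parseCount('1000'));    // 20
```

In a route it would take the place of the bare `parseInt(...) || 3`, for example `parseCount(req.query.count)`.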

Running the Beast

Start the System

# Install dependencies
npm install

# Create the data directory and add movies.json
mkdir data
# (Add the movies.json file with our sample data)

# Start the server (with no "start" script in package.json, npm falls back to `node server.js`)
npm start

Test It Out

# Get all movies
curl http://localhost:3000/api/movies

# Get recommendations for The Matrix (ID: 1)
curl http://localhost:3000/api/recommend/movie/1

# Get recommendations by description
curl -X POST http://localhost:3000/api/recommend/description \
  -H "Content-Type: application/json" \
  -d '{"description": "A movie about time travel and paradoxes"}'

Why HNSW Works So Well

HNSW (Hierarchical Navigable Small World) is like having a really smart librarian who organizes books not just by genre, but by similarity. It creates multiple layers of connections:

Bottom layer: Contains all vectors
Upper layers: Contain shortcuts to jump around quickly
Search: Starts at the top and drills down, getting more precise at each level
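
How does a vector end up on an upper layer? In the HNSW design, each new element draws a random top layer from an exponentially decaying distribution, so most elements live only on the bottom layer and a lucky few become the shortcuts. The sketch below illustrates that draw. `randomLevel` is a name made up here, `M = 16` is just a commonly used setting, and hnswlib's internals may differ in detail.

```javascript
// Sketch of HNSW layer assignment: level = floor(-ln(u) * mL), with mL = 1 / ln(M).
// `M` (links per node) and the injectable `random` are illustrative parameters.
function randomLevel(M = 16, random = Math.random) {
  const mL = 1 / Math.log(M);
  const u = 1 - random(); // in (0, 1], which avoids ln(0)
  return Math.floor(-Math.log(u) * mL);
}

// Count how many of 10,000 simulated elements get each top layer.
const counts = {};
for (let i = 0; i < 10000; i++) {
  const level = randomLevel();
  counts[level] = (counts[level] || 0) + 1;
}

console.log(counts); // most elements stay on layer 0, far fewer reach each layer above
```

That thinning-out is what makes the top layers cheap to cross before the search drops down for precision.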

Performance Characteristics

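Build time: Roughly O(n log n) in practice - slower to build, but worth it
Query time: Roughly O(log n) in practice - lightning fast searches
Memory: Uses more RAM than a flat list, because every vector also stores its graph links
Accuracy: Often 95%+ recall with well-tuned parameters, though the exact figure depends on the data and index settings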
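
Recall is easy to measure yourself: run the same query through an exact search and through the ANN index, then check how many of the true top-k ids the index found. A minimal sketch, with the hypothetical helper name `recallAtK` and made-up ids:

```javascript
// recall@k: fraction of the true top-k ids that the approximate search also returned.
function recallAtK(exactIds, approxIds) {
  if (exactIds.length === 0) {
    return 1; // nothing to find, so nothing was missed
  }
  const found = new Set(approxIds);
  const hits = exactIds.filter(id => found.has(id)).length;
  return hits / exactIds.length;
}

// Hypothetical ids: the approximate search missed movie 4 and returned 9 instead.
console.log(recallAtK([1, 2, 3, 4], [2, 1, 3, 9])); // 0.75
```

Averaging this over a batch of real queries gives a number you can trust for your own data.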

Extending the System

Add More Features

User Preferences: Track what users like and personalize recommendations
Real-time Updates: Add new movies without rebuilding the entire index
Hybrid Recommendations: Combine content-based (what we built) with collaborative filtering
A/B Testing: Compare different embedding models and parameters
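
For the user-preferences idea, one simple starting point (an assumption, not something built in this article) is to average the embeddings of the movies a user liked into a single "taste" vector and pass it to `findSimilar` as the query. `averageVectors` is a hypothetical helper, and the vectors are toy values.

```javascript
// Average several vectors element by element to get one "taste" vector.
function averageVectors(vectors) {
  if (vectors.length === 0) {
    throw new Error('Need at least one vector to average');
  }
  const dimensions = vectors[0].length;
  const sum = new Array(dimensions).fill(0);

  for (const vector of vectors) {
    if (vector.length !== dimensions) {
      throw new Error('All vectors must have the same number of dimensions');
    }
    for (let i = 0; i < dimensions; i++) {
      sum[i] += vector[i];
    }
  }

  return sum.map(total => total / vectors.length);
}

// Toy 3-D vectors standing in for the embeddings of two liked movies.
const liked = [[1, 0, 0], [0, 1, 0]];
console.log(averageVectors(liked)); // [ 0.5, 0.5, 0 ]
```

A plain average treats every liked movie equally. Weighting recent likes more heavily would be a natural next experiment.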

Scale It Up

Persistent Storage: Replace in-memory storage with Redis or PostgreSQL with vector extensions
Distributed Search: Use Elasticsearch or Weaviate for production workloads
Caching: Add Redis caching for frequently requested recommendations
Load Balancing: Multiple server instances behind a load balancer

Troubleshooting Common Issues

Ollama connection refused

Make sure Ollama is running: ollama serve

Model not found

Pull the model first: ollama pull nomic-embed-text

Out of memory

Reduce the max elements in the vector database or use a smaller embedding model

Slow recommendations

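Every request calls Ollama to embed its query, and recommendByMovie re-embeds the source movie on each call even though that vector was already computed at startup. Consider caching the embeddings instead of recomputing them.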
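
One way to follow that caching advice is a small in-memory wrapper around whatever function produces the embeddings. The sketch below uses a hypothetical name (`withEmbeddingCache`) and a fake embedder so it runs without Ollama. In the real project, the wrapped function would be something like the service's `generateEmbedding`.

```javascript
// Wraps any async embed function with an in-memory cache keyed by the input text.
function withEmbeddingCache(embed) {
  const cache = new Map();

  return async function cachedEmbed(text) {
    if (!cache.has(text)) {
      // Store the promise so concurrent calls for the same text share one request.
      const pending = embed(text).catch(error => {
        cache.delete(text); // do not keep failures around
        throw error;
      });
      cache.set(text, pending);
    }
    return cache.get(text);
  };
}

// Usage with a fake embedder that counts its calls (stand-in for the real service).
let calls = 0;
const fakeEmbed = async text => {
  calls += 1;
  return [text.length, 0.5];
};

const cachedEmbed = withEmbeddingCache(fakeEmbed);
cachedEmbed('The Matrix')
  .then(() => cachedEmbed('The Matrix'))
  .then(vector => console.log(vector, calls)); // [ 10, 0.5 ] 1
```

The cache grows without bound, which is fine for a handful of movies. A larger catalog would want a size limit or an eviction policy.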

Conclusion: You Did It!

We've just built a recommendation system that would make Netflix engineers nod approvingly. Sure, it's not going to replace their billion-dollar algorithm overnight, but it's a solid foundation that demonstrates the core principles of modern recommendation systems.

The beautiful thing about what we've built is its simplicity. No complex infrastructure, no cloud dependencies, just Node.js, some clever algorithms, and a local LLM doing the heavy lifting. It's the kind of system you can understand, modify, and deploy without needing a team of PhD researchers.

Remember: every recommendation system starts simple. Even the most sophisticated systems are just clever combinations of the techniques we've used here. You're not just building code—you're building the future of how people discover content.

Now go forth and recommend!!
