Building a Similarity Search Engine

Ishan jirety
5 min read

Similarity search is nothing but comparing two vectors and saying how similar they are. In layman's terms, it's like comparing "fruit" and "apple": the words are different, but they carry related meaning. Overall, it's a way of scoring how similar one subject is to another on a scale of -1 to 1.

🖥️Where is it used?

Generally, it’s used for:

  • Semantic searches

  • Ecommerce Recommendations

  • Media Library

  • AI Applications

It helps systems understand what the user is really looking for, even if the exact words don’t match, which makes search more intelligent and useful.

🤔How is it done?

There are many ways of doing it, but let’s start with the basics.

Embeddings

Embeddings are a fancy way of turning images, text, and numbers into vectors. The reason we do this is that computers don’t understand English or any other language, but they are very good at numbers. So we can say embeddings are vectors of numbers that capture the meaning of a subject.

How Embeddings Work

There are tons of models created & trained to generate embeddings. Overall, they are trained to understand a subject and generate its relative vector. The training data might look like this dataset:

https://huggingface.co/datasets/sentence-transformers/natural-questions

Once a model is trained, it knows the difference between a fruit and an apple, and also the similarity between the two.

While doing a similarity search, we generate an embedding for the user’s query and compare it with the objects’ embeddings present in the DB using Cosine Similarity. A lot of you might have heard about this algorithm, but how does it work?

🤓The Analogy

Consider holding two crayons over a sheet: one pointing to the top-right corner, and the other pointing in the same direction but slightly off. Both are pointing the same way with a slight variation, so you can consider the similarity to be high, close to 1. Now consider holding the crayons in opposite directions. In that case, the similarity will be -1, since they are not pointing in the same direction at all.

It’s like apple & fruit: both crayons are pointing in the same direction, just slightly off,

whereas for apple & Iron Man, the crayons are pointing in opposite directions.

There are three rules:

  • If the vectors point in the same direction, the similarity is close to 1

  • If the vectors are at right angles, they are neutral: 0

  • If the vectors point in opposite directions, they are not similar: -1

For example, take three vectors in 3D space:

  • 🔴 Vector X points along the X-axis → (1, 0, 0)

  • 🔵 Vector Z points along the Z-axis → (0, 0, 1)

  • 🟢 Vector Diagonal points equally in all directions → (1, 1, 1)

Vector Pair      Cosine Similarity
X and Z          0.0
X and Diagonal   0.58
Z and Diagonal   0.58
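The values in that table come straight from the cosine similarity formula. Here is a minimal sketch of it (the function name is my own, not from the article's sandbox):

```python
import math

def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

x = (1, 0, 0)         # 🔴 Vector X along the X-axis
z = (0, 0, 1)         # 🔵 Vector Z along the Z-axis
diagonal = (1, 1, 1)  # 🟢 points equally in all directions

print(round(cosine_similarity(x, z), 2))         # 0.0
print(round(cosine_similarity(x, diagonal), 2))  # 0.58
print(round(cosine_similarity(z, diagonal), 2))  # 0.58
```

Note that a vector compared with its exact opposite, e.g. (1, 0, 0) and (-1, 0, 0), gives -1, matching the third rule above.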

📍The Workflow!

We are going to follow a basic workflow.

  • User uploads an Image

  • That uploaded image is stored and passed to any embedding model

    • text-embedding-3-small

    • text-embedding-ada-002

  • These models generate embeddings that look like [0.21, -0.09, 0.97, ...]; we then store them in any vector-enabled storage

  • And Done!
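The ingestion steps above can be sketched like this. Everything here is a stand-in I made up for illustration: `fake_embed` replaces a real model call (e.g. to text-embedding-3-small or CLIP), and the in-memory list replaces a vector-enabled store:

```python
import hashlib

def fake_embed(data: bytes, dims: int = 8) -> list[float]:
    # Stand-in for a real embedding model: derives a deterministic
    # pseudo-vector from the input bytes. A real model returns a
    # vector that captures meaning; this one does not.
    digest = hashlib.sha256(data).digest()
    return [(b - 128) / 128 for b in digest[:dims]]

vector_store: list[dict] = []  # stand-in for vector-enabled storage

def ingest(image_id: str, image_bytes: bytes) -> None:
    # 1) user uploads an image  2) it is passed to an embedding model
    # 3) the resulting vector is stored alongside the record
    embedding = fake_embed(image_bytes)
    vector_store.append({"id": image_id, "embedding": embedding})

ingest("tshirt-1", b"raw image bytes go here")
# vector_store now holds one record with an 8-dimensional pseudo-vector
```

A real implementation would swap `fake_embed` for an API call and the list for a proper database; the shape of the data stays the same.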

Database Structure

Going deeper into the workflow, these are the steps in which we process the input:

  • Task Queueing

  • Input Analysis (Any Vision Model like gpt-4o-mini)

  • Embedding Generation (I am using a CLIP model https://replicate.com/krthr/clip-embeddings)

  • Store Embeddings

  • Now, while querying, we create an embedding of the query and check its similarity against the stored records. The higher the similarity, the closer the object is to the query
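That final query step can be sketched as a brute-force scan, assuming the records already hold their embeddings (the vectors and IDs below are made up for illustration; a real system would embed the query with the same model used at ingestion):

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Stand-in records, as if embeddings were already stored in the DB.
records = [
    {"id": "blue-tshirt",  "embedding": [0.9, 0.1, 0.0]},
    {"id": "olive-tshirt", "embedding": [0.8, 0.2, 0.1]},
    {"id": "iron-man",     "embedding": [-0.7, 0.1, 0.6]},
]

def search(query_embedding, top_k=2):
    # Rank every stored record by cosine similarity to the query;
    # the higher the similarity, the closer the object is to the query.
    scored = [
        {"id": r["id"], "similarity": cosine_similarity(query_embedding, r["embedding"])}
        for r in records
    ]
    scored.sort(key=lambda r: r["similarity"], reverse=True)
    return scored[:top_k]

results = search([0.85, 0.15, 0.05])  # pretend this is the embedded query
print([r["id"] for r in results])     # the two t-shirts outrank iron-man
```

Real vector databases replace the full scan with an approximate nearest-neighbour index, but the ranking idea is the same.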

The code snippets are in the sandbox https://codesandbox.io/p/devbox/2986c4.

Result

Over here, we can see my query was quite long; this clearly shows that the length of the vector doesn’t matter, but the angle does.

{
    "query": "I want an image of a tshirt, do we have it ?",
    "totalResults": 2,
    "results": [
        {
            "id": "cb7e8a44-8f0f-41b3-aab8-1c365c4a35f4",
            "description": "The image features a plain blue t-shirt displayed against a neutral gray background. The t-shirt is shown in a 3D rendering style, highlighting its texture and fit. It appears to be a sports or casual wear item, designed for comfort and functionality.",
            "objects": [
                "t-shirt"
            ],
            "colors": [
                "blue",
                "gray"
            ],
            "materials": [
                "fabric"
            ],
            "type": "t-shirt",
            "objectCategory": "apparel",
            "styleTags": [
                "product photography",
                "3D rendering",
                "studio shot"
            ],
            "moodTags": [
                "clean",
                "simple",
                "professional"
            ],
            "tags": [
                "blue t-shirt",
                "casual wear",
                "sports clothing",
                "3D rendering",
                "plain t-shirt"
            ],
            "reversePrompt": "Create a 3D rendering of a plain blue sports t-shirt against a neutral gray background, emphasizing the texture and fit of the fabric.",
            "createdAt": "2025-07-22T11:01:59.736Z",
            "similarity": 0.237932321812679
        },
        {
            "id": "c80031b3-8db5-46ca-8888-0e5a77ff9271",
            "description": "A plain olive green t-shirt is displayed against a white background in a product photography style. The shirt features a classic crew neck design and short sleeves, showcasing a simple, minimalist aesthetic with clean lines and a smooth, uniform surface.",
            "objects": [
                "t-shirt",
                "crew neck collar",
                "short sleeves",
                "clothing tag"
            ],
            "colors": [
                "olive green",
                "army green",
                "white background"
            ],
            "materials": [
                "cotton",
                "jersey knit fabric"
            ],
            "type": "t-shirt",
            "objectCategory": "apparel",
            "styleTags": [
                "product photography",
                "commercial",
                "studio lighting",
                "clean background",
                "minimalist",
                "professional",
                "centered composition"
            ],
            "moodTags": [
                "casual",
                "practical",
                "understated",
                "versatile",
                "military-inspired",
                "professional"
            ],
            "tags": [
                "casual wear",
                "basic clothing",
                "unisex",
                "crew neck",
                "short sleeve",
                "solid color",
                "minimalist",
                "military green",
                "everyday wear"
            ],
            "reversePrompt": "Professional product photography of a plain olive green t-shirt with crew neck and short sleeves, centered on pure white background, studio lighting, clean minimal composition, high-quality commercial fashion photography",
            "createdAt": "2025-07-22T11:01:16.988Z",
            "similarity": 0.227979845702615
        }
    ]
}
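The claim above that vector length doesn’t matter, only the angle, is easy to verify: scaling a vector changes its length but not its direction, so its cosine similarity to anything else stays the same (the vectors below are made up for illustration):

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

v = [0.21, -0.09, 0.97]
w = [0.20, -0.10, 0.95]
scaled = [10 * x for x in v]  # same direction, 10x the length

print(math.isclose(cosine_similarity(v, w), cosine_similarity(scaled, w)))  # True
```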

✨Preferred Providers (Easy Integration)

  • Openrouter

  • Replicate

OpenRouter provides you with an SDK through which you can stick to one implementation and switch between different models according to your needs and results.

On Replicate, you’ll find most of the models listed, and you can do a similar operation with it.

I’ve used Replicate models for the embedding solution & OpenRouter for the vision model.

💬 Got Questions?

Drop a comment, reach out on LinkedIn, or shoot me a message if you’re trying this yourself.


Written by

Ishan jirety

Hi, I am Ishan, a web developer who likes to build beautiful and clean UIs.