How Do Vector Embeddings Work?

Chaitrali Kakde
5 min read

What is vector embedding?

P.S.: My mom is a huge fan of Govinda and Jeetendra.

One lazy Sunday, chai in hand, my mom was lost in old Bollywood songs. She paused at a Govinda hit. “Beta, iske jaisi comedy movie lagana!” (“Dear, put on a comedy movie like this one!”) she said.

The tricky part? Computers don’t understand “masti” (fun) like we do. That’s where vector embeddings come in: they turn vibes into numbers, mapping movies by mood.

Govinda’s comedies land in the “fun & colorful” zone, Jeetendra’s dramas in the “emotional & heartfelt” one.

So when mom asked, the AI didn’t just look for Govinda’s name; it found anything in that same joyful, comedic zone. In seconds, the screen lit up.

“Wah beta, computer ko toh taste aa gaya!” (“Wow, dear, the computer has developed taste!”) she laughed.

Vector embeddings are numerical representations of data points, transforming text, images, or other data into a format that machine learning models can understand and process.

For example, the word "comedy" might have a vector like this: [0.9, 0.2, 0.7, 0.3, 1, 0, 0, 0, 0.4, 0.8, 0.9] and "freedom" might be: [0.1, 0.8, 0.6, 0.7, 1, 0, 0, 0, 0.7, 0.3, 0.2].
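To make this concrete, here is a small sketch that holds the two vectors above as NumPy arrays. The values are the made-up ones from the example, not the output of any real model, and the variable names are only for illustration.

```python
import numpy as np

# Illustrative vectors copied from the example above (not produced by a real model).
comedy = np.array([0.9, 0.2, 0.7, 0.3, 1, 0, 0, 0, 0.4, 0.8, 0.9])
freedom = np.array([0.1, 0.8, 0.6, 0.7, 1, 0, 0, 0, 0.7, 0.3, 0.2])

# Both words live in the same 11-dimensional space, so they can be
# compared component by component.
print(comedy.shape, freedom.shape)

# Per-dimension gap: larger values mark the dimensions where the words differ most.
gap = np.abs(comedy - freedom)
print(gap.round(2))
```

The key point is that both vectors have the same length, which is what makes comparing them possible at all.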

Types of vector embedding

Vector embeddings represent different data types as points in a multidimensional space, where similar data points cluster close together. They can represent almost any type of data, including text, images, videos, users, and more.

  1. Word embeddings represent individual words as vectors.

  2. Sentence embeddings represent entire sentences as vectors.

  3. Document embeddings represent documents (anything from newspaper articles and academic papers to books) as vectors. They capture the semantic information and context of the entire document.

  4. Image embeddings represent images as vectors by capturing different visual features.

  5. User embeddings represent users in a system or platform as vectors. They capture user preferences, behaviors, and characteristics, and can be used in everything from recommendation systems to personalized marketing and user segmentation.

Are embeddings and vectors the same thing?

Embeddings and vectors mean almost the same thing: both are just lists of numbers that represent some piece of data in a certain space.

A vector is simply an array of numbers, like [1.2, -0.5, 3.0].
An embedding is a special kind of vector that is created to capture meaning and relationships in the data.

So, all embeddings are vectors, but not all vectors are embeddings. The word "vector" focuses on the raw numbers, while "embedding" focuses on the idea that those numbers are chosen in a way that keeps similar things close together in that space.

How are vector embeddings created?

When we say vector embedding, we mean representing complex data (like words, images, or audio) as a list of numbers.
The goal is: things that are similar in meaning end up close together in this number space.

A Relatable Analogy

Imagine your Spotify playlist. Each song has “vibes”: energy, mood, genre. You might describe them with words like “chill,” “happy,” or “intense.”
A recommendation engine like Spotify’s doesn’t work with those words directly. Conceptually, it stores numbers for each vibe. When you like one chill song, it looks for other songs with similar numbers. Boom: personalized playlist.
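The playlist analogy can be sketched in a few lines. The song names, the three "vibe" dimensions (energy, happiness, intensity), and their scores are all made up for illustration. A real recommendation engine would learn many more dimensions from data instead of having them assigned by hand.

```python
import numpy as np

# Hypothetical songs scored by hand on three made-up "vibe" dimensions:
# [energy, happiness, intensity], each between 0 and 1.
songs = {
    "Lazy Sunday":   np.array([0.2, 0.6, 0.1]),
    "Chai Evening":  np.array([0.3, 0.7, 0.2]),
    "Gym Anthem":    np.array([0.9, 0.5, 0.9]),
    "Monsoon Blues": np.array([0.2, 0.2, 0.3]),
}

def most_similar(liked, catalog):
    """Return the other song whose vibe vector is nearest (Euclidean distance)."""
    distances = {
        name: np.linalg.norm(vec - catalog[liked])
        for name, vec in catalog.items()
        if name != liked
    }
    return min(distances, key=distances.get)

print(most_similar("Lazy Sunday", songs))
```

With these hand-picked numbers, liking "Lazy Sunday" surfaces "Chai Evening", because its three scores are the closest match.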

Step 1: Start with Raw Data

For text: The model takes words, sentences, or even whole documents.

For images: It takes pixel values.

For audio: It takes waveform samples or spectrograms.
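Whatever the data type, the raw input is already numbers (or easily turned into numbers) before any embedding happens. A rough sketch, assuming a made-up three-word vocabulary, a tiny 2x2 image, and a synthetic tone standing in for real audio:

```python
import numpy as np

# Text: a hypothetical vocabulary maps each word to an integer id.
vocab = {"govinda": 0, "comedy": 1, "movie": 2}
token_ids = [vocab[word] for word in "govinda comedy movie".split()]

# Image: a tiny 2x2 grayscale "image" is just a grid of pixel intensities (0-255).
image = np.array([[0, 255], [128, 64]], dtype=np.uint8)

# Audio: one second of a 440 Hz tone sampled 8,000 times per second.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
waveform = np.sin(2 * np.pi * 440 * t)

print(token_ids, image.shape, waveform.shape)
```

These raw numbers carry no meaning yet. Turning them into vectors where closeness reflects similarity is the job of the next steps.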

Step 2: Learn Patterns Using Machine Learning

Machine learning models are trained on huge datasets so they can detect relationships.

For text, models learn which words appear together (context). For images, models learn patterns of shapes, colors, and edges.

Think of it like drawing a map where similar points are close together and dissimilar ones are far apart.
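For text, "which words appear together" can be made concrete by counting neighbours. The sketch below uses a made-up three-sentence corpus and a window of one word on each side. It only illustrates the kind of signal a model learns from; real models train on vastly more text.

```python
from collections import Counter

# Tiny made-up corpus, for illustration only.
corpus = [
    "govinda comedy movie",
    "funny comedy movie",
    "emotional drama movie",
]

window = 1  # count neighbours one position to the left and right
cooccurrence = Counter()
for sentence in corpus:
    words = sentence.split()
    for i, word in enumerate(words):
        lo, hi = max(0, i - window), min(len(words), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                cooccurrence[(word, words[j])] += 1

print(cooccurrence[("comedy", "movie")])  # "comedy" sits next to "movie" twice
print(cooccurrence[("drama", "movie")])   # "drama" sits next to "movie" once
```

Words that keep showing up in the same company tend to end up with similar vectors once a model is trained on counts or contexts like these.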

Step 3: Transform into a Vector

This is where the embedding layer of a neural network comes in.

  1. It takes the input (word/image/audio).

  2. It passes the input through layers of math operations.

  3. It outputs a vector, a list of numbers like:

     [0.12, -0.45, 0.33, 0.78, ...]

     Each number represents some learned feature.
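At its simplest, an embedding layer is a lookup table: a matrix with one row per vocabulary item, whose values get adjusted during training. The sketch below assumes a made-up three-word vocabulary and uses random, untrained values, so the numbers carry no meaning yet. It only shows the lookup mechanics.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical vocabulary; real models have tens of thousands of entries.
vocab = {"dog": 0, "puppy": 1, "pizza": 2}
embedding_dim = 4

# The embedding table: one row of `embedding_dim` numbers per word.
# Random here; training would nudge these values so similar words end up close.
embedding_table = rng.normal(size=(len(vocab), embedding_dim))

def embed(word):
    """Look up the vector for a word by its row index."""
    return embedding_table[vocab[word]]

print(embed("dog"))
print(embed("dog").shape)
```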

Popular models that produce embeddings include:

  1. Word2Vec (Google)

    Learns word meaning from context (skip-gram & CBOW methods).

    Famous for analogies like: king - man + woman ≈ queen.

  2. GloVe (Stanford)

    Uses global word co-occurrence statistics to create embeddings.

  3. BERT (Google)

    Creates contextual embeddings — the same word can have different vectors depending on the sentence.

  4. CLIP (OpenAI)

    Trains on image–text pairs so both images and text live in the same vector space.

    Example: "A photo of a cat" and an actual cat image will be near each other.
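The Word2Vec analogy is plain vector arithmetic. Here is a toy sketch with hand-made 2-D vectors (the two dimensions, "royalty" and "gender", and all the values are assumptions chosen so the arithmetic works out). Real Word2Vec vectors have hundreds of learned dimensions, and the analogy only holds approximately.

```python
import numpy as np

# Hand-made 2-D toy vectors: [royalty, gender]. Not real Word2Vec output.
vectors = {
    "king":  np.array([0.9,  0.9]),
    "queen": np.array([0.9, -0.9]),
    "man":   np.array([0.1,  0.9]),
    "woman": np.array([0.1, -0.9]),
    "apple": np.array([-0.5, 0.0]),
}

# The analogy as vector arithmetic.
target = vectors["king"] - vectors["man"] + vectors["woman"]

# Find the nearest word to the result, skipping the three input words.
candidates = {w: v for w, v in vectors.items() if w not in {"king", "man", "woman"}}
nearest = min(candidates, key=lambda w: np.linalg.norm(candidates[w] - target))
print(nearest)
```

Skipping the input words matters in practice too, since the result of the arithmetic often lands closest to one of the words you started with.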


Step 4: Similar Things Stay Close

Once embedded, you can visualize them like dots on a map:

Similar words/images are clustered together. Dissimilar ones are far apart.

For example:

dog   →  close to  →  puppy, cat
pizza →  close to  →  pasta, burger
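The "close to" idea above can be sketched by ranking words by how similar their vectors are. The 2-D vectors below are hand-made for illustration (the dimensions are roughly "animal-ness" and "food-ness"); real embeddings are learned and have far more dimensions.

```python
import numpy as np

# Hand-made 2-D toy vectors: [animal-ness, food-ness]. Real embeddings are learned.
vectors = {
    "dog":    np.array([0.90, 0.10]),
    "puppy":  np.array([0.85, 0.15]),
    "cat":    np.array([0.80, 0.05]),
    "pizza":  np.array([0.10, 0.90]),
    "pasta":  np.array([0.05, 0.85]),
    "burger": np.array([0.20, 0.80]),
}

def cosine(a, b):
    """Cosine of the angle between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def neighbors(word, k=2):
    """Return the k words whose vectors point in the most similar direction."""
    others = [w for w in vectors if w != word]
    ranked = sorted(others, key=lambda w: cosine(vectors[word], vectors[w]), reverse=True)
    return ranked[:k]

print("dog   ->", neighbors("dog"))
print("pizza ->", neighbors("pizza"))
```

With these toy values, the two nearest neighbours of "dog" are the other animals, and the two nearest neighbours of "pizza" are the other foods.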

How to compare two vectors?

  1. Euclidean distance: The distance between two n-dimensional vectors a and b is calculated by summing the squares of the differences between their corresponding components, then taking the square root of that sum. Because Euclidean distance is sensitive to magnitude, it’s useful for data that reflects things like size or counts. Values range from 0 (for identical vectors) to ∞.

  2. Cosine similarity is the cosine of the angle between two vectors, so it depends on their direction and not their magnitude. It ranges from -1 to 1, in which 1 means the vectors point in the same direction, 0 means they are orthogonal (unrelated), and -1 means they point in opposite directions. Cosine distance is simply 1 minus cosine similarity.

from sklearn.metrics.pairwise import cosine_similarity
import numpy as np

# Example vectors (could be word embeddings, sentence embeddings, etc.)
vector_a = np.array([[1, 2, 3]])
vector_b = np.array([[2, 3, 4]])

# Calculate cosine similarity
similarity = cosine_similarity(vector_a, vector_b)

print(f"Cosine Similarity: {similarity[0][0]:.4f}")

  3. Dot product is, algebraically speaking, the sum of the products of the corresponding components of the two vectors. Geometrically speaking, it’s a non-normalized version of cosine similarity that also reflects magnitude.
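To see how the three measures relate, here is a small sketch that computes all of them with plain NumPy on the same two vectors used in the scikit-learn snippet above. The variable names are only for illustration.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([2, 3, 4])

# Euclidean distance: square root of the summed squared differences.
euclidean = np.sqrt(np.sum((a - b) ** 2))

# Dot product: sum of the products of corresponding components.
dot = np.dot(a, b)

# Cosine similarity: the dot product divided by both vector lengths.
cosine = dot / (np.linalg.norm(a) * np.linalg.norm(b))

print(f"Euclidean distance: {euclidean:.4f}")  # 1.7321
print(f"Dot product: {dot}")                   # 20
print(f"Cosine similarity: {cosine:.4f}")      # 0.9926
```

The two vectors point in almost the same direction (cosine similarity close to 1) even though they are not identical (Euclidean distance above 0), which is why the choice of measure depends on whether magnitude matters for your data.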

The Cool Part

Once everything is turned into vectors, you can:

  • Find similar images (like Google Lens)

  • Search text by meaning, not just exact words

  • Match people with the perfect meme

Code example - vector embedding

from sentence_transformers import SentenceTransformer, util

# Load a pre-trained embedding model
model = SentenceTransformer('all-MiniLM-L6-v2')

# Example sentences
sentences = [
    "I love programming in Python",
    "Python coding is my passion",
    "The weather is sunny today"
]

# Generate embeddings
embeddings = model.encode(sentences)

# Show the embeddings
for i, sentence in enumerate(sentences):
    print(f"Sentence: {sentence}")
    print(f"Embedding vector shape: {embeddings[i].shape}")
    print(f"Embedding (first 5 values): {embeddings[i][:5]}")
    print("----")

# Calculate similarity between first two sentences
similarity = util.cos_sim(embeddings[0], embeddings[1])
print(f"Similarity between Sentence 1 and 2: {similarity.item():.4f}")
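Searching by meaning is then a matter of ranking stored vectors against a query vector. Here is a minimal sketch, assuming the embeddings are already available as NumPy arrays (for instance from `model.encode` above). The tiny 3-D vectors and the `top_k` helper are hypothetical stand-ins so the example runs without downloading a model.

```python
import numpy as np

def top_k(query_vec, doc_vecs, k=2):
    """Return indices of the k rows of doc_vecs most similar to query_vec (cosine)."""
    doc_norm = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    query_norm = query_vec / np.linalg.norm(query_vec)
    scores = doc_norm @ query_norm
    return np.argsort(-scores)[:k], scores

# Hypothetical stand-in embeddings for the three sentences above.
doc_vecs = np.array([
    [0.9, 0.1, 0.0],   # "I love programming in Python"
    [0.8, 0.2, 0.1],   # "Python coding is my passion"
    [0.0, 0.1, 0.9],   # "The weather is sunny today"
])
query_vec = np.array([1.0, 0.0, 0.0])  # stand-in for a query about Python

indices, scores = top_k(query_vec, doc_vecs)
print(indices)
```

With these stand-in values, the two Python sentences rank ahead of the weather sentence. Swapping in real embeddings only changes where the arrays come from, not the ranking logic.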