From Pixels to Paragraphs: Embeddings and Vector Spaces

Mojtaba Maleki

Hey there, curious minds! 👋
If you're anything like me, you're constantly tinkering, breaking things (on purpose... mostly 😅), and learning how all this amazing AI magic works behind the scenes. My latest deep dive? Embeddings, and wow, what a rabbit hole.

Let me take you along for the ride through a course I just finished on Vector Databases, Embeddings and Applications. If you're trying to wrap your head around how machines “understand” images and language, you’ll love this.


🧩 What Even Is an Embedding?

An embedding is like a secret code: a way of turning things like images or text into a vector of numbers. Why? Because numbers are the only language models truly speak. Once something is embedded, you can do all sorts of cool math on it, like measuring how similar two things are!

Companies like Google, OpenAI, and Meta, and basically every modern AI product, use embeddings to power search, recommendation systems, question answering, and more. Embeddings are everywhere, even if you don’t see them.


🧪 My First Hands-On: Embedding MNIST with a VAE

Okay, time to get nerdy. I started off with the MNIST dataset, those iconic 28x28 grayscale digits. I built a Variational Autoencoder (VAE) using Keras to compress these images down into just 2 dimensions. That’s right, each handwritten number got squeezed into a 2D vector.
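
For the curious, here’s a minimal sketch of that kind of setup in TensorFlow/Keras. The layer sizes, epochs, and other training details here are illustrative assumptions, not the exact configuration from the course:

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

latent_dim = 2  # squeeze each 28x28 digit into a 2D vector


class Sampling(layers.Layer):
    """Reparameterization trick: sample z from N(mean, exp(log_var))."""
    def call(self, inputs):
        z_mean, z_log_var = inputs
        eps = tf.random.normal(shape=tf.shape(z_mean))
        return z_mean + tf.exp(0.5 * z_log_var) * eps


# Encoder: image -> (z_mean, z_log_var, z)
enc_in = keras.Input(shape=(28, 28, 1))
x = layers.Flatten()(enc_in)
x = layers.Dense(256, activation="relu")(x)
z_mean = layers.Dense(latent_dim)(x)
z_log_var = layers.Dense(latent_dim)(x)
z = Sampling()([z_mean, z_log_var])
encoder = keras.Model(enc_in, [z_mean, z_log_var, z], name="encoder")

# Decoder: z -> reconstructed image
dec_in = keras.Input(shape=(latent_dim,))
x = layers.Dense(256, activation="relu")(dec_in)
x = layers.Dense(28 * 28, activation="sigmoid")(x)
dec_out = layers.Reshape((28, 28, 1))(x)
decoder = keras.Model(dec_in, dec_out, name="decoder")


class VAE(keras.Model):
    def __init__(self, encoder, decoder, **kwargs):
        super().__init__(**kwargs)
        self.encoder = encoder
        self.decoder = decoder

    def train_step(self, data):
        with tf.GradientTape() as tape:
            z_mean, z_log_var, z = self.encoder(data)
            reconstruction = self.decoder(z)
            # Reconstruction loss: how well the decoder redraws the digit
            recon_loss = tf.reduce_mean(
                tf.reduce_sum(
                    keras.losses.binary_crossentropy(data, reconstruction),
                    axis=(1, 2),
                )
            )
            # KL loss: keep the latent distribution close to a unit Gaussian
            kl_loss = -0.5 * tf.reduce_mean(
                tf.reduce_sum(
                    1 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var), axis=1
                )
            )
            loss = recon_loss + kl_loss
        grads = tape.gradient(loss, self.trainable_weights)
        self.optimizer.apply_gradients(zip(grads, self.trainable_weights))
        return {"loss": loss}


# Train on MNIST and grab the 2D embeddings
(x_train, y_train), _ = keras.datasets.mnist.load_data()
x_train = np.expand_dims(x_train, -1).astype("float32") / 255.0

vae = VAE(encoder, decoder)
vae.compile(optimizer="adam")
vae.fit(x_train, epochs=10, batch_size=128)

embeddings, _, _ = encoder.predict(x_train)  # z_mean: one 2D point per digit
```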

Here’s the magic moment: plotting those vectors. Suddenly, I could literally see how the model perceived similarity: zeros clustered together, ones in their own corner, and so on.
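
The plot itself is just a scatter of the 2D vectors, colored by digit. A rough sketch, reusing the `embeddings` and `y_train` from the snippet above:

```python
import matplotlib.pyplot as plt

plt.figure(figsize=(6, 6))
plt.scatter(embeddings[:, 0], embeddings[:, 1], c=y_train, cmap="tab10", s=2)
plt.colorbar(label="digit")
plt.xlabel("latent dim 1")
plt.ylabel("latent dim 2")
plt.title("MNIST digits in 2D latent space")
plt.show()
```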

That was my first “aha!” moment. These numbers? They weren’t random. They captured meaning. ✨


🔍 Measuring Similarity: How Close Are Two Zeros?

Once I had embeddings, I got to play scientist, comparing digits using different distance metrics:

  • Euclidean (L2): the straight-line distance.
  • Manhattan (L1): step-by-step grid movement.
  • Dot Product: projection of one vector onto another.
  • Cosine Similarity: angle between vectors, my personal favorite.

I compared two “0” digits and one “1”. Not surprisingly, the two zeros were closer, across all metrics. But seeing it quantified? That was powerful. It wasn’t just intuition anymore, it was math. And I was wielding it.
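
If you want to try the same thing, the metrics are a few lines of NumPy each. The vectors below are made-up illustrations, not my actual embeddings:

```python
import numpy as np

zero_a = np.array([0.8, -1.2])   # e.g. 2D embedding of one "0"
zero_b = np.array([0.9, -1.0])   # another "0"
one    = np.array([-1.5, 0.7])   # a "1"

def euclidean(a, b):
    return np.linalg.norm(a - b)          # L2: straight-line distance

def manhattan(a, b):
    return np.sum(np.abs(a - b))          # L1: grid-style distance

def dot(a, b):
    return np.dot(a, b)                   # projection of one vector onto another

def cosine_sim(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

for name, other in [("the other 0", zero_b), ("the 1", one)]:
    print(f"0 vs {name}: "
          f"L2={euclidean(zero_a, other):.3f}, "
          f"L1={manhattan(zero_a, other):.3f}, "
          f"dot={dot(zero_a, other):.3f}, "
          f"cos={cosine_sim(zero_a, other):.3f}")
```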


✍️ Then Came Sentences: Embedding Language

Images were fun, but I couldn’t wait to try this on text. Enter: the sentence-transformers library. With just a few lines of Python, I embedded these three:

  • "The team enjoyed the hike through the meadow"
  • "The national park had great views"
  • "Olive oil drizzled over pizza tastes delicious"

You can probably guess which two were more similar, right? Spoiler: the nature ones were closer in vector space than the pizza line 🍕.

That’s the beauty of sentence embeddings: they capture meaning, not just words. And with cosine distance, I could actually measure how similar two ideas were.


🛠️ Why This Matters (And Why I’m Hooked)

Embeddings are the foundation of so many AI applications: semantic search, chatbots, recommendation systems, LLM memory, you name it. Every time I use ChatGPT, I wonder: what vector magic is going on behind the scenes?

And now, I don’t have to wonder. I get to build that magic.

This course didn’t just teach me about embeddings; it helped me understand that thinking in vector spaces is one of the keys to working with modern AI systems. And honestly? That’s thrilling.


🌱 Small Wins, Big Dreams

It wasn’t all smooth sailing. I got confused by KL divergence (math ain't always friendly), my VAE reconstructions were blurry at first, and it took a bit to wrap my head around vector norms. But every bug I squashed and every scatter plot I drew felt like a step forward.
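
For anyone else wrestling with it: in a standard VAE, where the encoder outputs a diagonal Gaussian and the prior is a unit Gaussian, the KL term has a neat closed form (it’s the same expression used in the training sketch above):

$$
D_{\mathrm{KL}}\!\left(\mathcal{N}(\mu, \sigma^2) \,\|\, \mathcal{N}(0, I)\right)
= -\tfrac{1}{2} \sum_{i} \left(1 + \log \sigma_i^2 - \mu_i^2 - \sigma_i^2\right)
$$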

And for someone with Nobel Prize dreams (yeah, I said it 😎), these baby steps matter. A lot.

I’m just getting started, but embedding theory and practice are now part of my AI toolkit. And it feels good.


📌 TL;DR

  • Embeddings turn stuff (images, sentences) into vectors so AI can work with them.
  • I built a Variational Autoencoder to embed MNIST digits into 2D space.
  • I used sentence-transformers to embed and compare semantic meaning in text.
  • Cosine similarity and dot product are my go-to tools for measuring closeness.
  • Embeddings power modern search, chatbots, and LLM memory systems.

"Even the greatest AI starts with a print statement."

Stay nerdy,


Written by

Mojtaba Maleki

Hi everyone! My name is Mojtaba Maleki and I was born on the 11th of February 2002. I'm currently a Computer Science student at the University of Debrecen. I'm a jack-of-all-trades when it comes to programming, so if you have a problem, I'm your man! My expertise lies in Machine Learning, Web and Application Development and I have published four books about Computer Science on Amazon. I'm proud to have multiple valuable certificates from top companies, so if you're looking for someone with qualifications, you've come to the right place. If you're not convinced yet, I'm also a great cook, so if you're ever hungry, just let me know!