Day 2 of 50: Diving into Vector Databases

Mumtaz Fatima

What are vector databases?

Put simply, vector databases are like traditional databases in that they store data. The difference is that they store vectors and are built to run fast similarity searches over those vectors.

For example, suppose I have five documents: “apple”, “orange”, “plate”, “shoes”, and “drawer”. If I ran a similarity search for “apple”, the closest match would be “orange”, because both are fruits and their vectors would lie close to each other in the vector space.
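To make the five-document example concrete, here is a minimal sketch in Python. The three-number vectors are made up by hand purely for illustration; in practice they would come from an embedding model and have hundreds of dimensions. I am also assuming cosine similarity as the closeness measure and scanning every item, whereas a real vector database would use an index to avoid comparing against everything.

```python
import math

# Hypothetical hand-made vectors (not produced by any real model).
documents = {
    "apple":  [0.9, 0.1, 0.0],
    "orange": [0.8, 0.2, 0.0],
    "plate":  [0.1, 0.9, 0.1],
    "shoes":  [0.0, 0.1, 0.9],
    "drawer": [0.0, 0.6, 0.6],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(query, docs):
    """Return the other document names, most similar first."""
    query_vec = docs[query]
    others = [name for name in docs if name != query]
    return sorted(
        others,
        key=lambda name: cosine_similarity(query_vec, docs[name]),
        reverse=True,
    )

ranking = most_similar("apple", documents)
print(ranking[0])  # prints "orange"
```

With these toy numbers, “orange” comes out on top because its vector points in almost the same direction as the one for “apple”.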

Now that we have an initial understanding of vector databases, let me clarify a few terms and fill in some of the gaps.

An image describing how vector database search works including pink dots, a cat, banana, etc.

Image credit: MongoDB

A Brief Intro On Vectors

While I don’t want to sound pedantic, it is essential to level the playing field for all my readers.

So what are vectors? In math-speak, vectors are objects that have both magnitude and direction. In ML-speak, vectors are numerical representations of data: ordered lists of numbers that are easier for a model to process than raw text or images. Before data is fed into an ML model, it is converted into a vector, so that each piece of data becomes a point in the vector space.
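Here is a small sketch of what "converting data into a vector" can look like. I am using a simple word-count (bag-of-words) scheme because it fits in a few lines; that choice is my assumption for illustration, and modern ML systems usually rely on learned embeddings instead. The function names and sample sentences are hypothetical.

```python
import math

def build_vocabulary(texts):
    """Collect every distinct word across the texts, in sorted order."""
    return sorted({word for text in texts for word in text.lower().split()})

def vectorize(text, vocabulary):
    """Represent a text as word counts, one number per vocabulary word."""
    tokens = text.lower().split()
    return [tokens.count(word) for word in vocabulary]

def magnitude(vector):
    """Length of the vector (the 'magnitude' in math-speak)."""
    return math.sqrt(sum(x * x for x in vector))

texts = ["the cat sat", "the cat ate the banana"]
vocabulary = build_vocabulary(texts)
vectors = [vectorize(text, vocabulary) for text in texts]

print(vocabulary)   # ['ate', 'banana', 'cat', 'sat', 'the']
print(vectors[0])   # [0, 0, 1, 1, 1]
print(vectors[1])   # [1, 1, 1, 0, 2]
```

Each sentence is now a point in a five-dimensional space, one dimension per vocabulary word, and `magnitude` gives the length of the arrow from the origin to that point.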

Here is a visual example (credits to Deepset.ai for this visual):

What Is Text Vectorization? Everything You Need to Know | deepset Blog

Uses of Vector Databases

Recommendation Systems: Vector databases (VectorDB) can be used to match user preferences with items. We can vectorize both user preferences and item descriptions and quickly retrieve the most relevant items based on user queries and existing info on user preferences.

Retrieval-Augmented Generation (RAG): VectorDB can be used to store relevant information, like vector representations of different documents, and then used to retrieve the most relevant document for a given query.

Anomaly Detection: Previously detected anomalies can be stored in a VectorDB, and each time a new anomaly is encountered, it can be checked against the database to see whether it matches a known pattern.
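All three uses rest on the same core operation: store labelled vectors, then fetch the stored vectors nearest to a query. Below is a minimal in-memory sketch of that idea. `TinyVectorStore`, `match_known`, the labels, the vectors, and the 0.9 threshold are all hypothetical and chosen for illustration; this is not the API of any real vector database, and it scans every item instead of using an index.

```python
import math

class TinyVectorStore:
    """Toy store: keeps (label, vector) pairs and searches by brute force."""

    def __init__(self):
        self._items = []

    def add(self, label, vector):
        self._items.append((label, list(vector)))

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    def search(self, query, k=1):
        """Return the k (label, similarity) pairs closest to the query."""
        scored = [(label, self._cosine(query, vec)) for label, vec in self._items]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]

def match_known(store, vector, threshold=0.9):
    """Return the closest label only if it is similar enough, else None."""
    label, score = store.search(vector, k=1)[0]
    return label if score >= threshold else None

store = TinyVectorStore()
store.add("doc: refund policy", [0.9, 0.1, 0.0])
store.add("doc: shipping times", [0.1, 0.9, 0.1])
store.add("doc: account security", [0.0, 0.2, 0.9])

# Retrieval, as in recommendations or RAG: fetch the nearest stored item.
query_vector = [0.8, 0.2, 0.1]
best_label, best_score = store.search(query_vector, k=1)[0]
print(best_label)  # prints "doc: refund policy"

# Matching, as in the anomaly case: accept the nearest item only above a threshold.
print(match_known(store, [0.5, 0.5, 0.5]))  # prints None
```

For recommendations, the stored vectors would represent items and the query a user's preferences; for RAG, the stored vectors would represent documents and the query a question; for anomaly matching, the threshold decides whether a new case counts as a known pattern.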

If you’d like to learn more about vector databases, I’d recommend the following resources:
