Vector Databases

A vector database is a database used to store, manage, and query high-dimensional vectors. It allows us to quickly find similar items using similarity search (also called semantic search), with measures such as cosine similarity or dot product.
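As a concrete illustration of one of these measures, cosine similarity can be computed in plain Python (a toy sketch with hand-made vectors, not a real embedding model):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 = same direction, 0.0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Vectors pointing the same way score near 1.0; orthogonal vectors score 0.0.
print(cosine_similarity([1.0, 2.0], [2.0, 4.0]))  # 1.0 (same direction)
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0 (orthogonal)
```

Because cosine similarity ignores vector length and compares only direction, it is a common default for comparing embeddings.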

What are these high-dimensional vectors?

A high-dimensional vector is a numerical representation of data such as text, audio, or images, with the semantics of the original data preserved.

For example, in the case of text, the word "bat" in sentence one means something different from "bat" in sentence two:

sentence-1: The bat is a nocturnal animal.

sentence-2: The batsman hit the ball hard with the bat.

In the case of an image, the semantics may refer to implicit meaning: for example, cats and dogs can both be categorized as pets, so when a user searches for pets, these images appear.

High-dimensional vectors are nothing but vectors of numbers, and these numbers encode the semantics. Generally speaking, the number of dimensions required to capture the semantics of language is greater than the number of dimensions required to represent something like faces.

For example, consider the idiom "The cat is out of the bag." This does not involve a real cat; it means that a secret is finally out.

Languages have higher variability than faces.

How do we convert data into embeddings?

Converting textual data into embeddings means converting it into dense numerical vectors. For this, we use embedding models such as Sentence Transformers.

These Sentence Transformers split large data into small chunks and convert each chunk into an embedding (a dense vector).

Sentence Transformers convert the data into vectors in 4 steps:

Step 1: Tokenization: The input sentences are transformed into tokens using the tokenizer.

"Login failed" → ["[CLS]", "login", "failed", "[SEP]"]

Step 2: Transformer Model Encoding: The tokens are then processed by a transformer model such as BERT to give us one vector per token. The output shape is (number_of_tokens, hidden_dimension).

Step 3: Pooling: To reduce the token vectors to a single fixed-length vector, a pooling layer (typically mean pooling) is applied.

Step 4: Output Vector

Sentence = "Generative AI is transforming industries."

#if the above sentence is encoded with a sentence-embedding model, the embedding will be a
#fixed-length vector, e.g., of shape (384,) for a model like all-MiniLM-L6-v2:

corresponding_embedding = [
  0.03212439, -0.01842797, 0.04530185, -0.05213587, 0.00485467, 0.03122897,
  -0.06328751, 0.01565853, -0.04880953, 0.00924053, 0.02458452, -0.00849123,
  -0.03602738, 0.04425017, -0.01365709, 0.02958131, 0.06179914, -0.03802451,
  0.04291816, 0.01805874, -0.04919799, 0.00835763, 0.02501354, -0.05301875,
  0.00197589, 0.04798875, -0.05659477, 0.02387436, -0.00645328, 0.03848263,
  ...
  -0.01489217, -0.02178142, 0.01607488, -0.04318261, 0.01083614, -0.02533856,
  0.04928425, 0.00297121, -0.01523471, -0.04218689, 0.03562157, -0.00947391
]
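The four steps above can be sketched in plain Python. The "encoder" here produces random stand-in vectors rather than real transformer output (this is a toy illustration, not a real model); the point is how mean pooling turns a (number_of_tokens, hidden_dimension) matrix into one fixed-length sentence vector:

```python
import random

HIDDEN_DIM = 8  # real models use e.g. 384 or 768 dimensions

def tokenize(sentence):
    # Step 1: a toy whitespace tokenizer with BERT-style special tokens.
    return ["[CLS]"] + sentence.lower().split() + ["[SEP]"]

def encode(tokens):
    # Step 2: stand-in for a transformer; emits one HIDDEN_DIM vector per token.
    rng = random.Random(0)
    return [[rng.uniform(-1, 1) for _ in range(HIDDEN_DIM)] for _ in tokens]

def mean_pool(token_vectors):
    # Step 3: average over tokens -> one fixed-length sentence vector.
    n = len(token_vectors)
    return [sum(vec[i] for vec in token_vectors) / n for i in range(HIDDEN_DIM)]

tokens = tokenize("Login failed")              # ['[CLS]', 'login', 'failed', '[SEP]']
token_vectors = encode(tokens)                 # shape: (4, HIDDEN_DIM)
sentence_embedding = mean_pool(token_vectors)  # Step 4: output vector, shape (HIDDEN_DIM,)
print(len(sentence_embedding))                 # 8
```

With a real library such as sentence-transformers, all four steps are hidden behind a single `model.encode(sentence)` call.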

Examples of Vector Databases

Some open-source Vector databases:

  • ChromaDB -- Python-first vector database

  • Weaviate -- Feature-rich, supports hybrid search (vector + keyword)

  • Milvus -- Scalable, high-performance vector DB for enterprise workloads

  • Qdrant -- Fast Rust-based vector DB

Managed / Cloud Vector Databases:

  • Pinecone -- Fully managed, very popular for production RAG setups

  • Azure AI Search (with Vector Search) -- Microsoft’s search + vector indexing

  • AWS OpenSearch (Vector Search) -- Adds k-NN vector search on top of the OpenSearch engine

  • Google Vertex AI Matching Engine -- Enterprise-scale vector search for Google Cloud AI apps

Lightweight / Embedded Options:

  • FAISS (Facebook AI Similarity Search) -- Library for vector search, not a vector database

  • Annoy (Spotify) -- Approximate nearest neighbor library

  • ScaNN (Google) -- Efficient ANN search library

What are the main applications of Vector Databases?

Vector Databases have become essential in this AI age. They store embeddings effectively, making it easy to access, modify, or add new embeddings.

Semantic Search:

The embeddings of documents, paragraphs, or code snippets are stored, and they can be queried with natural language.

Example: Google search
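A brute-force sketch of what a vector database does under the hood (a toy illustration with hand-made 3-dimensional vectors; real systems use model-generated embeddings and approximate nearest-neighbor indexes):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Tiny "index": document -> hand-made embedding (stand-ins for real model output).
index = {
    "cats are popular pets":    [0.9, 0.1, 0.0],
    "dogs are loyal pets":      [0.8, 0.2, 0.0],
    "stock markets fell today": [0.0, 0.1, 0.9],
}

def search(query_embedding, k=2):
    # Rank every stored document by similarity to the query embedding.
    ranked = sorted(index.items(), key=lambda item: cosine(query_embedding, item[1]),
                    reverse=True)
    return [doc for doc, _ in ranked[:k]]

print(search([0.85, 0.15, 0.0]))  # the two pet documents rank first
```

A real vector database replaces this linear scan with an index (HNSW, IVF, etc.) so the search stays fast over millions of vectors.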

Retrieval-Augmented Generation:

The RAG architecture is one of the important concepts in Generative AI: it lets organizations use their private data with an LLM without fine-tuning the model. This makes it possible to build chatbots without fine-tuning LLMs.

The data is stored in the form of embeddings in a vector database. The query is used to search for relevant documents in the vector database, and these documents are then fed to the LLM along with the query.

Example: Chatbots of Companies
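The RAG flow described above can be sketched as: retrieve the most relevant documents for the query, then assemble them into a prompt for the LLM. Here the retriever uses naive keyword overlap as a stand-in for real embedding search, and no actual LLM is called (both are hypothetical simplifications for illustration):

```python
def retrieve(query, store, k=2):
    # Stand-in retriever: keyword overlap instead of real embedding similarity.
    def score(doc):
        return len(set(query.lower().split()) & set(doc.lower().split()))
    return sorted(store, key=score, reverse=True)[:k]

def build_prompt(query, documents):
    # The retrieved context is prepended to the user's question.
    context = "\n".join(f"- {d}" for d in documents)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

store = [
    "Refunds are processed within 5 business days.",
    "Our office is open Monday to Friday.",
    "Refunds require the original receipt.",
]

docs = retrieve("How long do refunds take", store)
prompt = build_prompt("How long do refunds take", docs)
print(prompt)  # in a real pipeline, this prompt is sent to the LLM
```

In a production RAG setup, `retrieve` would query a vector database with the embedded query instead of matching keywords.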

Recommendation Systems:

Recommendation systems use collaborative filtering and content-based filtering. They help us find similar users or items based on ratings (collaborative filtering), or similar items based on their features (content-based filtering).

Examples: Netflix movie recommendation
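A minimal content-based sketch: each item is described by a feature vector, and we recommend the items most similar to one the user liked (the movies and feature values below are made up for illustration):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Hand-made feature vectors: [action, romance, sci-fi]
movies = {
    "Space Battle":  [0.9, 0.0, 0.9],
    "Laser Hearts":  [0.2, 0.9, 0.6],
    "Quiet Romance": [0.0, 1.0, 0.0],
}

def recommend(liked, k=1):
    # Rank every other movie by feature similarity to the one the user liked.
    others = [(title, cosine(movies[liked], vec))
              for title, vec in movies.items() if title != liked]
    return [title for title, _ in sorted(others, key=lambda x: x[1], reverse=True)][:k]

print(recommend("Space Battle"))  # the movie sharing action/sci-fi features ranks first
```

A real system stores these item vectors in a vector database so that the nearest-neighbor lookup scales to large catalogs.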

Conclusion

Vector databases are crucial components of applications that require semantic search. The embeddings of text, image, audio, and other data are stored effectively in these databases and retrieved when searched, even using natural language. They are essential for powering RAG pipelines, semantic search, recommendations, and multimodal AI, making them a critical component in deploying scalable, intelligent applications.

A sample vector database workflow using ChromaDB
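A minimal sketch of the ChromaDB Python API (requires `pip install chromadb`; the collection name and documents here are illustrative). ChromaDB embeds the documents with its default embedding model, so a natural-language query works out of the box:

```python
import chromadb

client = chromadb.Client()  # in-memory client
collection = client.create_collection(name="articles")

# Documents are embedded automatically by ChromaDB's default embedding model.
collection.add(
    documents=[
        "The bat is a nocturnal animal.",
        "The batsman hit the ball hard with the bat.",
    ],
    ids=["doc1", "doc2"],
)

# Query with natural language; similarity search runs over the stored embeddings.
results = collection.query(query_texts=["nocturnal creatures"], n_results=1)
print(results["documents"])
```

For persistence across runs, ChromaDB also offers `chromadb.PersistentClient(path=...)` in place of the in-memory client.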



Written by

ramnisanth simhadri