Vector Databases


A vector database stores, manages, and queries high-dimensional vectors. It lets us quickly find similar items using similarity (semantic) search, with metrics such as cosine similarity and dot product.
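As a toy illustration of the similarity metrics mentioned above, here is cosine similarity computed by hand (a minimal sketch with hand-made 3-dimensional vectors; real vector databases use optimized approximate-nearest-neighbor indexes over vectors with hundreds of dimensions):

```python
import math

def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (|a| * |b|); closer to 1.0 means more similar
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" (real ones have hundreds of dimensions)
v_cat = [0.9, 0.1, 0.0]
v_dog = [0.8, 0.2, 0.1]
v_car = [0.0, 0.1, 0.9]

print(cosine_similarity(v_cat, v_dog))  # high: semantically close
print(cosine_similarity(v_cat, v_car))  # low: unrelated
```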
What are these high-dimensional vectors?
A high-dimensional vector is a numerical representation of data such as text, audio, or images, with the semantics preserved.
For example, in text, the word "bat" in sentence one means something different from "bat" in sentence two, and their embeddings reflect that:
sentence-1: The bat is a nocturnal animal.
sentence-2: The batsman hit the ball hard with the bat.
In the case of an image, the semantics may be implicit: for example, cats and dogs can both be categorized as pets, so when a user searches for pets, both kinds of images appear.
High-dimensional vectors are simply arrays of numbers, and those numbers encode the semantics. For example, consider the idiom "The cat is out of the bag": it does not involve a real cat, it means the secret is finally out, and the embedding must capture that. Generally speaking, language embeddings need more dimensions than face embeddings, because languages have higher variability than faces.
How do we convert data into embeddings?
Converting textual data into embeddings means converting it into dense numerical vectors. For this we use embedding models, such as Sentence Transformers.
A sentence transformer takes large data in small chunks and converts each chunk into an embedding (a dense vector).
A sentence transformer converts data into vectors in four steps:
Step 1: Tokenization: The input sentence is split into tokens by the tokenizer:
"Login failed" → ["[CLS]", "login", "failed", "[SEP]"]
Step 2: Transformer Model Encoding: The tokens are then processed by a transformer such as BERT, which produces one vector per token; the output has shape (number_of_tokens, hidden_dimension).
Step 3: Pooling: To turn the per-token vectors into a single fixed-length vector, a pooling layer (typically mean pooling) is applied.
Step 4: Output Vector: The result is a single dense vector representing the whole sentence.
```python
sentence = "Generative AI is transforming industries."

# If this sentence is encoded by a sentence-embedding model, the result is a
# single dense vector; e.g., all-MiniLM-L6-v2 produces shape (384,)
corresponding_embedding = [
    0.03212439, -0.01842797, 0.04530185, -0.05213587, 0.00485467, 0.03122897,
    -0.06328751, 0.01565853, -0.04880953, 0.00924053, 0.02458452, -0.00849123,
    -0.03602738, 0.04425017, -0.01365709, 0.02958131, 0.06179914, -0.03802451,
    0.04291816, 0.01805874, -0.04919799, 0.00835763, 0.02501354, -0.05301875,
    0.00197589, 0.04798875, -0.05659477, 0.02387436, -0.00645328, 0.03848263,
    ...
    -0.01489217, -0.02178142, 0.01607488, -0.04318261, 0.01083614, -0.02533856,
    0.04928425, 0.00297121, -0.01523471, -0.04218689, 0.03562157, -0.00947391
]
```
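Step 3 above (pooling) can be sketched in plain Python: the transformer outputs one vector per token, and mean pooling averages them into a single fixed-length sentence vector (toy 4-dimensional vectors here; real hidden dimensions are in the hundreds):

```python
def mean_pooling(token_vectors):
    # Average each dimension across all token vectors ->
    # one fixed-length vector regardless of sentence length
    dim = len(token_vectors[0])
    n = len(token_vectors)
    return [sum(vec[i] for vec in token_vectors) / n for i in range(dim)]

# Pretend the transformer produced these per-token vectors
# for ["[CLS]", "login", "failed", "[SEP]"]:
token_vectors = [
    [0.1, 0.2, 0.3, 0.4],
    [0.5, 0.1, 0.0, 0.2],
    [0.3, 0.3, 0.1, 0.0],
    [0.1, 0.0, 0.2, 0.2],
]
sentence_vector = mean_pooling(token_vectors)
print(sentence_vector)  # approximately [0.25, 0.15, 0.15, 0.2]
```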
Examples of Vector Databases
Some open-source vector databases:
ChromaDB -- Python-first vector database
Weaviate -- Feature-rich, supports hybrid search (vector + keyword)
Milvus -- Scalable, high-performance vector DB for enterprise workloads
Qdrant -- Fast Rust-based vector DB
Managed / Cloud Vector Databases:
Pinecone -- Fully managed, very popular for production RAG setups
Azure AI Search (with Vector Search) -- Microsoft’s search + vector indexing
AWS OpenSearch (Vector Search) -- Adds k-NN vector search on top of the OpenSearch engine
Google Vertex AI Matching Engine -- Enterprise-scale vector search for Google Cloud AI apps
Lightweight / Embedded Options:
FAISS (Facebook AI Similarity Search) -- Library for vector search, not a vector database
Annoy (Spotify) -- Approximate nearest neighbor library
ScaNN (Google) -- Efficient ANN search library
What are the main applications of Vector Databases?
Vector databases have become essential in the AI age. They store embeddings efficiently, so accessing, modifying, or adding embeddings is easy.
Semantic Search:
The embeddings of documents, paragraphs, or code snippets are stored and can be queried with natural language.
Example: Google search
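Semantic search can be sketched as a brute-force loop: store (document, embedding) pairs and return the stored items whose vectors are closest to the query vector (toy hand-made vectors stand in for real model output):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Tiny "index": document text -> toy embedding
index = {
    "Login failed for user":     [0.9, 0.1, 0.0],
    "Payment was declined":      [0.1, 0.9, 0.1],
    "Password reset successful": [0.7, 0.2, 0.1],
}

def semantic_search(query_vec, k=2):
    # Rank all stored documents by similarity to the query
    ranked = sorted(index.items(), key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

# A query embedding close to the "login" region of the space
print(semantic_search([0.8, 0.15, 0.05]))
```

A real system would embed the natural-language query with the same model used for the documents, and use an ANN index instead of sorting everything.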
Retrieval-Augmented Generation:
The RAG architecture is one of the key concepts in generative AI: it lets an organization ground an LLM in its private data without fine-tuning the model, for example when building a chatbot.
The organization's data is converted into embeddings and stored in a vector database. The user's query is used to retrieve the most relevant chunks from the vector database, and those chunks are fed to the LLM along with the query.
Example: Chatbots of Companies
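The retrieval step described above can be sketched as: embed the query, fetch the top-k chunks from the vector store, and paste them into the LLM prompt. The retrieval here reuses a toy cosine search, and `call_llm` is a hypothetical stand-in for any LLM API:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Toy vector store: chunk text -> hand-made embedding
store = {
    "Refunds are processed within 5 business days.": [0.9, 0.1],
    "Our office is open Monday to Friday.":          [0.1, 0.9],
}

def retrieve(query_vec, k=1):
    # Return the k chunks most similar to the query embedding
    ranked = sorted(store.items(), key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def build_rag_prompt(question, query_vec):
    # Feed the retrieved chunks to the LLM alongside the user's question
    context = "\n".join(retrieve(query_vec))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

prompt = build_rag_prompt("How long do refunds take?", [0.95, 0.05])
print(prompt)
# call_llm(prompt)  # hypothetical: send the grounded prompt to any LLM
```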
Recommendation Systems:
Recommendation systems use collaborative filtering and content-based filtering: they find similar users or items based on ratings, or similar content based on item features.
Examples: Netflix movie recommendation
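Content-based filtering from the paragraph above can be sketched the same way: represent each item by a feature vector and recommend the items most similar to one the user liked (toy genre-feature vectors, not real Netflix data):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Toy item features: [action, romance, sci-fi]
movies = {
    "The Matrix":   [0.9, 0.0, 0.9],
    "Blade Runner": [0.6, 0.1, 0.9],
    "The Notebook": [0.0, 0.9, 0.0],
}

def recommend(liked, k=1):
    # Rank every other movie by similarity to the one the user liked
    target = movies[liked]
    others = [(title, cosine(target, vec)) for title, vec in movies.items() if title != liked]
    others.sort(key=lambda t: t[1], reverse=True)
    return [title for title, _ in others[:k]]

print(recommend("The Matrix"))  # the sci-fi/action neighbour ranks first
```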
Conclusion
Vector databases are crucial components of applications that require semantic search. Embeddings of text, images, audio, and other data are stored efficiently in these databases and can be retrieved even with natural-language queries. They power RAG pipelines, semantic search, recommendations, and multimodal AI, making them a critical component of scalable, intelligent applications.
Below is a sample vector database workflow using ChromaDB.
Written by ramnisanth simhadri