Vector databases are a core part of a RAG pipeline and have a direct impact on the quality of the final results. A poorly chosen vector store can limit the performance of your entire RAG framework; after all, a chain is only as strong as its weakest link. That is reason enough to take a closer look at the diverse landscape of vector stores.

In this blog post, I describe and compare seven popular databases, highlighting their strengths, limitations, and the scenarios in which each one shines.

The first option I would like to introduce is Facebook (Meta) AI Similarity Search (FAISS), which was released in 2017. It is an open-source library by Facebook AI Research (FAIR) that addresses the challenges of efficient similarity search and clustering in large-scale datasets. It achieves this through a highly optimized C++ implementation, and Python bindings are available as well.

FAISS offers a comprehensive suite of similarity search algorithms. It supports exact brute-force search through flat indexes as well as Approximate Nearest Neighbor (ANN) structures such as the Inverted File Index (IVF) with various quantization schemes, Product Quantization (PQ) and its optimized form (OPQ), and graph-based indexes such as Hierarchical Navigable Small World (HNSW). These indexing methods can be combined (e.g., IVF + PQ or HNSW with PQ) to balance speed, memory usage, and accuracy.
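To make the speed versus accuracy trade-off concrete, here is a minimal pure-Python sketch of the difference between a flat index and an IVF index. It is not the FAISS API: the function names, the toy data, and the hand-picked centroids are assumptions for illustration (a real IVF index learns its centroids with k-means and stores the data far more compactly).

```python
def l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def flat_search(vectors, query, k):
    """Exact search: compare the query against every stored vector."""
    ranked = sorted(range(len(vectors)), key=lambda i: l2(vectors[i], query))
    return ranked[:k]


def build_ivf(vectors, centroids):
    """Put each vector id into the inverted list of its nearest centroid."""
    lists = {c: [] for c in range(len(centroids))}
    for i, v in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: l2(centroids[c], v))
        lists[nearest].append(i)
    return lists


def ivf_search(vectors, centroids, lists, query, k, nprobe):
    """Approximate search: scan only the nprobe lists closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: l2(centroids[c], query))
    candidates = [i for c in order[:nprobe] for i in lists[c]]
    candidates.sort(key=lambda i: l2(vectors[i], query))
    return candidates[:k]


# Toy data (assumed for illustration): three small clusters in 2D.
vectors = [[0.0, 0.0], [0.2, 0.3], [0.8, 0.9], [1.0, 1.2], [5.0, 5.0], [5.2, 4.9]]
centroids = [[0.1, 0.15], [0.9, 1.05], [5.1, 4.95]]  # hand-picked, not learned
lists = build_ivf(vectors, centroids)
query = [0.45, 0.5]

exact = flat_search(vectors, query, k=2)
approx_1 = ivf_search(vectors, centroids, lists, query, k=2, nprobe=1)
approx_2 = ivf_search(vectors, centroids, lists, query, k=2, nprobe=2)
print(exact, approx_1, approx_2)
```

With `nprobe=1` only the closest inverted list is scanned, so the second true neighbor (id 2) is missed. Raising `nprobe` to 2 recovers the exact result at the cost of scanning more vectors.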

While FAISS excels in vector similarity search, it does not provide integrated support for hybrid searches that combine keyword or text search functionalities. To implement hybrid search capabilities, FAISS must be integrated with external tools or databases.
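One common way to combine results from a vector index and a separate keyword engine is Reciprocal Rank Fusion (RRF). The sketch below is an assumption about how such an integration could look, not something FAISS provides: the function name, the document ids, and the two input rankings are hypothetical, and `k=60` is simply the constant conventionally used with RRF.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document ids into one fused ranking.

    Each document receives 1 / (k + rank) from every list it appears in.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))


vector_hits = ["doc_a", "doc_b", "doc_c"]   # hypothetical vector search result
keyword_hits = ["doc_b", "doc_d", "doc_a"]  # hypothetical keyword search result

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

Documents that rank well in both lists (here `doc_b` and `doc_a`) end up ahead of documents that appear in only one of them.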

Pinecone was founded in 2019 by Edo Liberty, a former head of Amazon AI Labs. Its managed vector database service officially launched in 2021, with the objective of reducing the complexity of building and maintaining vector search systems.

Pinecone employs a proprietary graph-based algorithm, developed in-house, known as the Pinecone Graph Algorithm (PGA). The flat graph topology that underpins PGA improves memory efficiency and facilitates real-time data updates. The method was influenced by Microsoft’s Vamana (also known as FreshDiskANN).

Besides its own algorithm, Pinecone offers automatic parameter tuning based on the uploaded data, so that developers can treat the vector store as a black box and focus on building the product. In addition, Pinecone offers hybrid search, which merges conventional lexical keyword search with semantic vector search.
On the other hand, a significant drawback of this service is that the detailed algorithms and inner workings of the vector store are hidden from the developer, which rules out advanced modifications. The result is a trade-off between ease of use and configurability.

Weaviate is an open-source vector database introduced by SeMI (Semantic Machine Insights) Technologies in 2018, designed to handle unstructured data efficiently through vector embeddings. At the time, the goal was to move beyond traditional keyword-based search by allowing the system to understand the context and meaning of queries. Over the years, Weaviate has evolved to include features like replication and integrations with technologies such as Kubernetes and Docker, improving its scalability and adaptability to various deployment needs. Weaviate is available as Weaviate Cloud (WCD), a managed SaaS solution, and can also be self-managed using Docker and Kubernetes.

The primary objective of Weaviate is to serve as a robust vector search engine that allows developers to build AI-native applications. It uses flat indexing (brute force) for small datasets and the HNSW algorithm for larger datasets. In addition, the memory footprint of the vectors can be reduced with product quantization (PQ). Weaviate also provides hybrid search capabilities, combining BM25 for keyword-based search with vector search for semantic similarity.
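The memory effect of product quantization can be estimated with simple arithmetic. The sketch below assumes 768-dimensional float32 vectors split into 96 segments with 8-bit codes; these numbers are illustrative assumptions, not Weaviate defaults, and the helper names are hypothetical.

```python
def raw_bytes(dim, bytes_per_float=4):
    """Size of one uncompressed float32 vector."""
    return dim * bytes_per_float


def pq_code_bytes(num_segments, bits_per_code=8):
    """Size of one PQ-encoded vector: one centroid id per segment."""
    return num_segments * bits_per_code // 8


def codebook_bytes(dim, bits_per_code=8, bytes_per_float=4):
    """Fixed overhead: every segment stores 2**bits centroids of its own width.

    Summed over all segments this is 2**bits * dim floats,
    independent of how many segments are used.
    """
    return (2 ** bits_per_code) * dim * bytes_per_float


dim = 768        # assumed embedding dimension
segments = 96    # assumed number of PQ segments (8 dimensions each)

raw = raw_bytes(dim)
compressed = pq_code_bytes(segments)
ratio = raw / compressed
overhead = codebook_bytes(dim)

print(f"raw: {raw} B, PQ code: {compressed} B, ratio: {ratio:.0f}x")
print(f"one-time codebook overhead: {overhead} B")
```

Under these assumptions each vector shrinks from 3072 bytes to 96 bytes. The codebooks add a fixed cost that is shared by all vectors, so the saving only pays off once the collection is reasonably large, and the compression is lossy.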

The entire vector database system is modular, allowing customization through vectorizer, reader, and generator modules to suit specific application needs. Custom models can also be created to further customize the indexing process.

Milvus is an open-source vector database developed by Zilliz. The project was initiated in 2018 with the aim of creating a data infrastructure that simplifies AI adoption. It was open-sourced under the Apache 2.0 license in 2019 and builds on top of the vector search library FAISS. The Milvus architecture is based on four layers: access layer, coordinator service, worker node, and storage. These layers are mutually independent, which helps with scaling and disaster recovery. For better performance, the Milvus server should be deployed with Docker or Kubernetes. This can be done by using Zilliz Cloud, the fully managed cloud service for Milvus, or your own Kubernetes cluster.

If the database grows very large, Milvus supports DiskANN (disk-accelerated approximate nearest neighbor search), an algorithm developed by Microsoft that performs similarity search while keeping most of the data on disk instead of in RAM.

Furthermore, one key feature of Milvus is its support for various indexing methods, such as IVF, HNSW, ANNOY, and DISKANN. This allows users to find a balance between search speed, accuracy, and resource consumption.

ChromaDB, commonly referred to as Chroma, is an open-source vector database designed specifically for applications involving LLMs. One key characteristic of ChromaDB is its simple implementation, as it focuses on only a few methods for creating, deleting, retrieving, and updating vectors in the database.

The company’s first seed funding occurred in April 2023, highlighting how young the product is and that it is still evolving and under development. This early-stage development phase might also explain why ChromaDB currently focuses entirely on dense vector search. Therefore, there is no hybrid search option available. However, it does support metadata fields that can be attached to each vector, allowing for filtering.
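Metadata filtering combined with dense search can be pictured as "filter first, then rank by similarity". The following toy sketch illustrates that idea in plain Python. It is not Chroma's API: the record layout, the `filtered_search` helper, and the `where` argument are assumptions made for this example.

```python
def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b)


def filtered_search(records, query, where, k):
    """Keep records whose metadata matches every key in `where`,
    then return the ids of the k most similar vectors."""
    matches = [
        r for r in records
        if all(r["metadata"].get(key) == value for key, value in where.items())
    ]
    matches.sort(key=lambda r: cosine_similarity(r["vector"], query), reverse=True)
    return [r["id"] for r in matches[:k]]


# Hypothetical records: each vector carries a small metadata dictionary.
records = [
    {"id": "a", "vector": [1.0, 0.0], "metadata": {"lang": "en"}},
    {"id": "b", "vector": [0.9, 0.1], "metadata": {"lang": "de"}},
    {"id": "c", "vector": [0.0, 1.0], "metadata": {"lang": "en"}},
]

english_hits = filtered_search(records, query=[1.0, 0.0], where={"lang": "en"}, k=2)
all_hits = filtered_search(records, query=[1.0, 0.0], where={}, k=2)
print(english_hits, all_hits)
```

Without a filter the two nearest vectors are `a` and `b`. With the filter on `lang`, record `b` is excluded before ranking, so `c` is returned even though it is far less similar to the query.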

All vectors added to Chroma are organized in an HNSW graph in memory. Chroma does not yet offer alternative ANN algorithms — HNSW is the default and only indexing method for dense vectors in current versions. Additionally, it does not maintain a separate flat index.

Another limitation of the current version is that Chroma does not support built-in sharding across multiple machines. Nor does it offer the option to use compressed indexes to save memory, such as PQ or disk-based search.

These limitations indicate that, in its current state, ChromaDB is well-suited for quick prototyping but is less appropriate for large-scale applications.

Qdrant is another open-source vector database. The company was founded in Berlin in 2021 by CEO André Zayarni and CTO Andrey Vasnetsov. In Qdrant, vectors are organized in collections. Similar to ChromaDB, metadata in the form of a payload with additional information can be attached to each vector. The data in the database can therefore also be filtered by payload alone, because the developers are convinced that vector search capabilities should go beyond simple nearest neighbour search.

Bittu Sharma
