Comprehensive guide to Qdrant Vector DB: Installation and Setup

Shreyas Dhaware
11 min read

Introduction:

In the era of AI-driven applications and unstructured data management, vector databases have become essential tools for enabling semantic search and similarity matching. Qdrant is one such cutting-edge, open-source vector database designed to simplify working with high-dimensional data embeddings. Whether you're building a recommendation system, integrating semantic search, or powering an AI chatbot, Qdrant makes it seamless to store, query, and manage vector embeddings at scale.

At FutureSmart AI, we pride ourselves on staying adaptable and embracing new technologies. If a solution has great potential and delivers value, we’re always eager to explore and adopt it. After testing Qdrant ourselves and seeing its impressive results, we’ve prepared this comprehensive guide to walk you through its installation and setup on Docker and locally. While the blog primarily focuses on on-premise deployment, it’s worth noting that Qdrant also offers a robust cloud platform that simplifies scaling and management for production use cases.

What is a Vector Database

Let us first understand what a vector database is and what advantages it has over traditional databases. In simple terms, a vector database is a specialized database designed to store and work with vector embeddings. Vector embeddings are numerical representations of data that capture its meaning, features, or relationships. These embeddings are often generated by AI models and are used to process unstructured data like text, images, audio, or videos.

Unlike traditional databases, which rely on exact matches (like finding a name or ID), vector databases focus on finding similarity between data points, even when the input isn't identical. This is what enables semantic search: advanced similarity retrieval and unstructured data management that relational databases cannot offer.

Unstructured data management with vector embeddings

The above image shows how we can still find relevant or similar objects even when they are structurally completely different. Even if the inputs were JSON objects instead of plain text, the search would still work. That's the power of similarity search, or vector search. You can try out this similarity checker yourself at AI Demos Playground.

What is Qdrant

Among the top open-source vector databases, Qdrant stands out as a powerful, Rust-based vector database and similarity search engine. It supports seamless integration with LangChain for building sophisticated AI solutions. It offers robust performance, a user-friendly API, and support for Python. With its use of indexing, Qdrant delivers both speed and precision, making it a competitive choice for modern applications.

Key Features of Qdrant:

  1. HNSW Indexing: Uses the Hierarchical Navigable Small World (HNSW) algorithm for fast and accurate similarity searches.

  2. Distance Metrics: Supports Cosine Similarity, Dot Product, and Euclidean Distance for flexible vector search.

  3. Free Vector Database: Qdrant is open-source and perfect for creating a local vector database.

  4. Multi-Language APIs: Offers APIs for Python, JavaScript/TypeScript, Rust, and Go, ensuring smooth integration with various tech stacks.

  5. Recommendation API: Includes a built-in API for creating efficient recommendation systems.

  6. Scalability and Production-Ready: Designed for real-world applications, it scales to handle millions or billions of vectors seamlessly.

  7. Hybrid Compatibility: Works well with databases like PostgreSQL for blending relational and vector data.

Semantic Search: Interprets user intent and context to deliver relevant results beyond simple keyword matching. It utilizes vector search to provide more accurate and contextually relevant outcomes.

Vector Search: Transforms text into vectors representing semantic meaning, enabling rapid similarity comparisons within large datasets.

In essence, vector search serves as a foundational component of semantic search, facilitating the understanding and retrieval of information based on meaning rather than mere keywords.
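To make the distinction concrete, here is a minimal, dependency-free sketch of the core operation behind vector search: cosine similarity over toy 3-dimensional "embeddings". The vectors and texts are illustrative; real embeddings have hundreds of dimensions and come from a model.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Toy embeddings: semantically similar texts get geometrically close vectors
query = [0.9, 0.1, 0.0]
docs = {
    "a movie about cricket": [0.8, 0.2, 0.1],
    "a cooking recipe": [0.1, 0.2, 0.9],
}

# Rank documents by similarity to the query, highest first
for text, vec in sorted(docs.items(), key=lambda kv: -cosine_similarity(query, kv[1])):
    print(f"{cosine_similarity(query, vec):.3f}  {text}")
```

Even with no keyword overlap between query and documents, the ranking reflects meaning, which is exactly what Qdrant does at scale with model-generated embeddings.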

Setting Up the Environment:

We'll cover two approaches to using Qdrant:

  1. Local Setup: Create a local LangChain vector database for prototyping.

  2. Docker-based Server: Use a containerized approach for scalability and production.

Installing Qdrant Using Docker:

  • Pull the Qdrant Docker Image:
docker pull qdrant/qdrant
  • Run the Qdrant Container:
docker run -p 6333:6333 -p 6334:6334 -v "${PWD}/qdrant_storage:/qdrant/storage:z" qdrant/qdrant

This Docker command runs the Qdrant container, exposing ports 6333 (REST API) and 6334 (gRPC API), while mapping the host directory ${PWD}/qdrant_storage to /qdrant/storage inside the container to persist data.

Python Client:

To create a vector database in Python, start by installing the Python client:

pip install "qdrant-client[fastembed]"

Securing Your Qdrant Instance:

When running the Qdrant container, you can enable API key authentication by setting a specific environment variable. This step is optional, but it helps secure your Qdrant vector database from unauthorized requests, ensuring only trusted clients can access it.

  • QDRANT__SERVICE__API_KEY: Set this to the desired API key.

Here’s the command:

docker run -d -p 6333:6333 -e QDRANT__SERVICE__API_KEY=your-api-key-here -v "${PWD}/qdrant_storage:/qdrant/storage:z" qdrant/qdrant

Replace your-api-key-here with a strong API key.

Understanding the Data

  • The JSON data represents movie entries, including key details like name, description, director, and release year. In Qdrant, each entry will be stored as a point: the vector is derived from the movie description (e.g., an embedding for semantic search), while fields like director, year, and name are stored as metadata (payload) for enriched querying and context.
[
    {
        "name": "Sholay",
        "description": "Two ex-convicts are hired by a retired policeman to capture a ruthless dacoit terrorizing a village.",
        "director": "Ramesh Sippy",
        "year": 1975
    },
    {
        "name": "Lagaan",
        "description": "Villagers unite to play a cricket match against British officers to abolish oppressive taxes.",
        "director": "Ashutosh Gowariker",
        "year": 2001
    }
        ...
        ...
]

Loading the Data

import os
import json

def read_files_from_folder(folder_path):
    file_data = []

    for file_name in os.listdir(folder_path):
        if file_name.endswith(".json"):
            with open(os.path.join(folder_path, file_name), 'r') as file:
                content = json.load(file)
                file_data.append({"file_name": file_name, "content": content})

    return file_data

folder_path = "data"
file_data = read_files_from_folder(folder_path)

Next, we build separate lists of documents, metadata, and IDs, which we then add to our collection.

import uuid

documents = []
metadatas = []
ids = []

for file_index, data in enumerate(file_data):
    context = data["content"]
    documents.extend(movie["description"] for movie in context)
    metadatas.extend(
        {**{key: value for key, value in movie.items() if key != "description"}, "source": data["file_name"]}
        for movie in context
    )
    ids.extend(str(uuid.uuid4()) for _ in context)  # Generate a unique UUID for each movie
The resulting lists look like this:

documents = ['Two ex-convicts are hired by a retired policeman to capture a ruthless dacoit terrorizing a village.',
...]

metadatas = [{'name': 'Sholay',
 'director': 'Ramesh Sippy',
 'year': 1975,
 'source': 'movies.json'},
...]

ids = ['7aa5f6f5-5de5-4776-8090-9a7c38a4cfcf',
...]

Initializing the Qdrant Client (Local or Docker)

Setting Up a Local In-Memory Qdrant Instance:

This approach is ideal for development, prototyping, or testing without the need for a running server.

from qdrant_client import QdrantClient, models
# Initialize the local client
qdrant = QdrantClient(":memory:")  # or QdrantClient(path="path/to/db")

Connecting to a Qdrant Server via Docker:

Ensure that your Qdrant server is running, typically accessible at http://localhost:6333.

from qdrant_client import QdrantClient, models
# Connect to Docker server Qdrant instance
qdrant = QdrantClient(url="http://localhost:6333", api_key="your-api-key-here")

You can choose between connecting to Qdrant running in a Docker container or using the local in-memory setup, depending on your specific requirements and environment. The code that follows is the same for both approaches.

Adding Documents

Inserting Data into Your Collection:

The same code works with the local in-memory client; just use whichever client variable you initialized.

# Use the add method
qdrant.add(
    collection_name="new_movie_collection",
    documents=documents,
    metadata=metadatas,
)

Output: a list of IDs for the added documents. If no IDs are provided, they are generated randomly.

['d8984772ed664b2b8e2da23b0660989c',...,'fb32ac5bc5a74e48ba0a8fbb578fedfe']

This method adds text documents to a Qdrant collection, creating the collection with the default vector configuration if it doesn't exist. Documents are embedded using the default embedding model. If you need custom embeddings or vectors, we will cover a different method shortly.

Querying

Performing Vector Searches:

search_result = qdrant.query(
    collection_name="new_movie_collection",
    query_text="for adults",
    limit=1,
)
print(search_result)
print(search_result)
# List of QueryResponse objects
[QueryResponse(id='96d0e002-bc96-47a5-9307-16e0924ff9f6', embedding=None, sparse_embedding=None, metadata={'document': 'The heartwarming tale of a mute and deaf man and his relationships with two women.', 'name': 'Barfi!', 'director': 'Anurag Basu', 'year': 2012}, document='The heartwarming tale of a mute and deaf man and his relationships with two women.', score=0.76938045)]

In Qdrant, a QueryResponse represents the outcome of a search query, typically including the point ID, document, similarity score, associated payload (metadata), and, optionally, vector embeddings. By default, vector data is not included in the response; to include it, set the with_vector parameter to True.

Using a Different Embedding Model

Qdrant’s default embedding model is BAAI/bge-small-en, used for generating vector embeddings. You can use open-source embedding models like all-MiniLM-L6-v2 for custom embeddings:

To use a different model when creating a new collection, you need to specify its configuration explicitly, including the embedding model's details, during the collection setup in the Qdrant API or dashboard.

from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")

This resource provides a comprehensive guide on implementing sentence embeddings, similarity measures, semantic search, and clustering using Sentence Transformers.

Create a Collection:

qdrant.create_collection(
    collection_name="my_movies",
    vectors_config=models.VectorParams(
        size=encoder.get_sentence_embedding_dimension(),  # Vector size is defined by used model
        distance=models.Distance.COSINE,
    ),
)

Format Data

import uuid

documents = []
metadatas = []
ids = []

for file_index, data in enumerate(file_data):
    context = data["content"]
    documents.extend(movie["description"] for movie in context)
    metadatas.extend(
        {**{key: value for key, value in movie.items() }, "source": data["file_name"]}
        for movie in context
    )
    ids.extend(str(uuid.uuid4()) for _ in context)  # Generate a unique UUID for each movie
The resulting lists look like this:

documents = ['Two ex-convicts are hired by a retired policeman to capture a ruthless dacoit terrorizing a village.',
...]

metadatas = [{'name': 'Sholay',
 'description': 'Two ex-convicts are hired by a retired policeman to capture a ruthless dacoit terrorizing a village.',
 'director': 'Ramesh Sippy',
 'year': 1975,
 'source': 'movies.json'},
...]

ids = ['7aa5f6f5-5de5-4776-8090-9a7c38a4cfcf',
...]

Insert Data:

In Qdrant, you can insert data using two methods:

  • Record-Oriented Approach: Use the upload_points method with a list of points, each containing an id, a vector embedding, and a payload.

  • Column-Oriented Approach: Use the upload_collection method, providing separate lists of ids, vectors, and payloads.

Both methods facilitate efficient data insertion, allowing you to choose the format that best suits your workflow.

qdrant.upload_collection(
    collection_name="my_movies",
    ids=ids,
    vectors=encoder.encode(documents),
    payload=metadatas
)

Querying:

  • In the provided query, a filter is applied on the payload data to retrieve points from the my_movies collection where the year field is greater than or equal to 2005. This refines the similarity search to include only relevant results.
from qdrant_client import models
from qdrant_client.models import Filter, FieldCondition

hits = qdrant.query_points(
    collection_name="my_movies",
    query=encoder.encode("engineering student life").tolist(),
    limit=3,
    with_payload=True,
    query_filter=Filter(
        must=[
            FieldCondition(
                key="year",
                range=models.Range(gte=2005),
            )
        ]
    ),
).points

for hit in hits:
    print(hit.payload, "score:", hit.score)

Output

{'name': '3 Idiots', 'description': 'Three engineering students navigate the pressures of academia while challenging societal norms.', 'director': 'Rajkumar Hirani', 'year': 2009} score: 0.6074454
{'name': 'Dangal', 'description': 'A former wrestler trains his daughters to become world-class wrestlers against societal odds.', 'director': 'Nitesh Tiwari', 'year': 2016} score: 0.23663726
{'name': 'Taare Zameen Par', 'description': 'A dyslexic boy discovers his artistic talent with the help of a compassionate teacher.', 'director': 'Aamir Khan', 'year': 2007} score: 0.1795995

In our AI Demos Playground, we provide AI tools like the Semantic Similarity Checker, which lets you upload text or files and receive similarity scores. It utilizes models from OpenAI, Hugging Face, Google, and Mistral to provide accurate assessments of semantic similarity.

Integration with OpenAI

To utilize OpenAI embeddings, encode your text using the OpenAI API and provide the resulting vector when uploading or querying points in your database. When creating a collection in Qdrant, set the vector size to match the embedding model's dimensionality: 1,536 for models like text-embedding-ada-002 and text-embedding-3-small.

import openai
import os 

embedding_model = "text-embedding-3-small"

openai_client = openai.Client(
    api_key="<YOUR_API_KEY>"
)

result = openai_client.embeddings.create(input="hey texts how are you?", model=embedding_model)
result

# Extract the embedding vector
embedding_vector = result.data[0].embedding

# Determine the dimensionality
embedding_dimension = len(embedding_vector)
print(f"The embedding dimensionality is: {embedding_dimension}")
# The embedding dimensionality is: 1536

Accessing the Qdrant Web UI:

You can manage Qdrant deployments through the Web UI, accessible at http://localhost:6333/dashboard.

The dashboard provides two primary sections: Console and Collections.

  1. Console: Use the REST API to interact with Qdrant for tasks like querying or managing data.

  2. Collections: This section lets you organize, manage, and upload collections, as well as handle snapshots for backups or migrations.

The WebUI also features a Graph Tool to visualize relationships within datasets. Found under the Graph Tab in Collections, it represents data points as an interactive tree graph. Clusters of similar points are grouped, helping users uncover hidden patterns and explore connections. The tool is flexible, allowing zooming and manipulation for enhanced clarity and deeper insights into the data structure.

Key Improvements for Using Qdrant

  1. Caching: Cache frequent query results using Redis or store precomputed embeddings to reduce latency.

  2. LangChain Integration: Ensure embedding dimensions match Qdrant vectors and streamline query pipelines for LLM-based systems.

  3. Scalability: Use Qdrant Cloud for scaling and dividing large datasets into smaller collections to enhance performance.

  4. Vector Search Techniques: Experiment with distance metrics and chunk large documents to improve search relevance and precision.

Conclusion:

In this blog, we've explored how to set up and utilize Qdrant, a robust open-source vector database, using both local and Docker-based approaches. We've covered data preparation, insertion, and querying, demonstrating how Qdrant facilitates efficient similarity searches and unstructured data management. By following these steps, you can leverage Qdrant's capabilities to build sophisticated AI solutions tailored to your specific needs.

As you continue to work with Qdrant, consider exploring its advanced features, such as hybrid search capabilities and integration with large language models, to enhance your applications further. For more detailed information and documentation, visit Qdrant's official website.


At FutureSmart AI, we specialize in developing advanced AI solutions tailored to your business needs. Our expertise encompasses building state-of-the-art vector databases and integrating LangChain-powered applications, enabling efficient semantic search and unstructured data management.

Have questions or need assistance? Contact us at contact@futuresmart.ai for a consultation! Visit our website to discover how our AI technologies have delivered measurable business value.

Don’t miss our next tutorial, where we’ll dive deeper into semantic search optimizations, vector search, and more advanced LangChain-Qdrant integrations.

Let’s build the future together! 🌟
