Perform Image-Driven Reverse Image Search on E-Commerce Sites with ImageBind and Qdrant
Introduction
In 1950, when Alan Turing explored the idea of machine intelligence in his paper "Computing Machinery and Intelligence," no one could have imagined that it would one day lead to innovations using artificial intelligence across so many domains. One domain that is especially popular and important among users is online shopping. With the surge in e-commerce, users increasingly rely on visual cues to guide their purchasing decisions. In response to this shift in consumer behavior, image-driven product search has emerged as a powerful tool for enhancing the shopping experience, and e-commerce platforms like Amazon, Myntra, Ajio, and Meesho use it widely.
You may already be familiar with image-driven search on shopping websites. This approach uses the visual content of images to let users explore products more intuitively and efficiently: by simply uploading or capturing an image, shoppers can quickly find similar or related items within a vast catalog. Whether seeking fashion inspiration, home decor ideas, or specific product recommendations, image-driven search offers a dynamic and personalized shopping journey tailored to individual preferences and tastes.
We can make the results more accurate by using ImageBind, the recently developed all-in-one embedding model from Meta. But before using the embedding model, we need a vector database to store the embeddings it produces.
When it comes to image search, vector databases have been particularly transformative. Traditional image search methods often rely on metadata tags or textual descriptions, which are limited in how well they capture the rich visual content of images. With vector databases, images are transformed into high-dimensional vectors that encapsulate their visual features, allowing for more accurate and nuanced similarity comparisons. This means users can search for images based on visual similarity, enabling e-commerce product search from images with remarkable precision. However, when it comes to vector databases, we face a dilemma: which database is best for our application?
Here, I have chosen the Qdrant vector database, which offers the HNSW algorithm for approximate nearest neighbor search in advanced AI applications. Meesho already uses the Qdrant vector database, but the results are still not very accurate. We can make them more accurate by integrating Qdrant with the ImageBind embedding model. Before diving into the content, let’s look at the steps:
Loading the Dataset.
Initializing the Qdrant Vector DB.
Image Embeddings with ImageBind.
Deploying with Gradio.
Reverse Product Image Search with Qdrant
Let's install the dependencies first to get started with the reverse product image search.
%pip install opendatasets gradio qdrant-client transformers sentence_transformers sentencepiece tqdm
Loading the Dataset
Using the opendatasets library, download the Kaggle dataset using your username and key. You can obtain them by visiting the Settings page on Kaggle: click on "Access API Keys," and a kaggle.json file will be downloaded. This file contains your username and API key.
import opendatasets as od
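# Download the dataset; opendatasets will prompt for the Kaggle username
# and key from kaggle.json. The dataset URL below is an assumption --
# substitute the fashion-product dataset you actually want to use.
od.download("https://www.kaggle.com/datasets/vikashrajluhaniwal/fashion-images")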
Now, let’s import the libraries we’ll need and store the image paths in a list so that we can access the images easily.
import os
import random
import tempfile

import gradio as gr
from PIL import Image
from tqdm import tqdm
from qdrant_client import QdrantClient
from qdrant_client.http import models
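With the imports in place, we can collect the image paths. The sketch below walks the downloaded dataset directory; the directory name is an assumption based on the Kaggle dataset slug, so adjust it to match your download.

# Gather every image file path from the dataset into a list.
DATASET_DIR = "fashion-images"

image_paths = []
for root, _, files in os.walk(DATASET_DIR):
    for name in files:
        if name.lower().endswith((".jpg", ".jpeg", ".png")):
            image_paths.append(os.path.join(root, name))

print(f"Found {len(image_paths)} images")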
Initializing the Qdrant Vector DB
Initialize the Qdrant Client with in-memory storage. The collection name will be “imagebind_data” and we will be using cosine distance.
# Initialize Qdrant client and load collection
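# (A minimal sketch: the vector size of 1024 matches the output dimension
# of the ImageBind-huge model we load below.)
client = QdrantClient(":memory:")  # in-memory storage; nothing is persisted

client.recreate_collection(
    collection_name="imagebind_data",
    vectors_config=models.VectorParams(
        size=1024,
        distance=models.Distance.COSINE,
    ),
)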
Image Embeddings with ImageBind
ImageBind is an innovative model developed by Meta AI’s FAIR Lab. This model is designed to learn a joint embedding across six different modalities: images, text, audio, depth, thermal, and IMU data. One of the key features of ImageBind is its ability to learn this joint embedding without requiring all combinations of paired data. It has been discovered that only image-paired data is necessary to bind the modalities together effectively. This unique capability allows ImageBind to leverage recent large-scale vision-language models and extend their zero-shot capabilities to new modalities simply by utilizing their natural pairing with images.
We’ll use ImageBind to create the embeddings, but before diving deep, let’s follow the steps required to install ImageBind.
- Clone the ImageBind git repository:
git clone https://github.com/facebookresearch/ImageBind.git
- Change the directory:
cd ImageBind
- Edit the requirements.txt file: delete Mayavi and Cartopy if you run into errors installing them.
- Install the requirements:
pip install -r requirements.txt
- Change back to your working directory.
Then, load the model.
import sys
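# Make the cloned repository importable; adjust the path if you cloned
# ImageBind somewhere else (assumption).
sys.path.append("./ImageBind")

import torch
from imagebind.models import imagebind_model

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the pretrained ImageBind (huge) checkpoint and switch to eval mode.
model = imagebind_model.imagebind_huge(pretrained=True)
model.eval()
model.to(device)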
After initializing the model, we will now create embeddings.
from imagebind.models.imagebind_model import ModalityType
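from imagebind import data

def embed_image(image_path):
    # A small helper (sketch) that returns a 1024-d ImageBind embedding
    # for a single image file, following the repo's documented API.
    inputs = {
        ModalityType.VISION: data.load_and_transform_vision_data(
            [image_path], device
        )
    }
    with torch.no_grad():
        embeddings = model(inputs)
    return embeddings[ModalityType.VISION][0].cpu().numpy().tolist()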
Then we’ll update the Qdrant Vector DB with the generated embeddings.
import uuid
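# Embed every catalog image and upsert it with its file path as payload.
# Deriving the category from the parent folder name is an assumption
# about the dataset's layout -- adapt it to yours.
points = []
for path in tqdm(image_paths):
    points.append(
        models.PointStruct(
            id=str(uuid.uuid4()),
            vector=embed_image(path),
            payload={
                "path": path,
                "category": os.path.basename(os.path.dirname(path)),
            },
        )
    )

# Upload in small batches so a large catalog doesn't build one huge request.
for i in range(0, len(points), 64):
    client.upsert(collection_name="imagebind_data", points=points[i : i + 64])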
We’ll prepare a processing function that takes an image as input and performs a reverse image search with the help of the embeddings.
def process_text(image_query):
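    # (Sketch of the function body.) Save the uploaded PIL image to a
    # temporary file, embed it, and query the collection for the closest
    # catalog image.
    with tempfile.NamedTemporaryFile(suffix=".jpg", delete=False) as tmp:
        image_query.convert("RGB").save(tmp.name)
        query_vector = embed_image(tmp.name)

    hits = client.search(
        collection_name="imagebind_data",
        query_vector=query_vector,
        limit=1,
    )
    # Return the best match as a PIL image for Gradio to display.
    return Image.open(hits[0].payload["path"])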
Deploying with Gradio
Now that we have prepared the processing function, we’ll deploy it with Gradio by defining an interface.
import tempfile
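# Reverse image search interface: upload an image, get back the most
# similar product. The variable name is ours (assumption); the title
# matches the first tab shown later.
image_search_interface = gr.Interface(
    fn=process_text,
    inputs=gr.Image(type="pil", label="Upload a product image"),
    outputs=gr.Image(label="Most similar product"),
    title="Reverse Image Search with ImageBind for E-Commerce",
)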
Image Search Using Product Category
If you want to search for images by product category, you have to define a few functions. First, we’ll define a function that fetches images for the selected category.
# Define function to get images of selected category
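def get_images_of_category(category):
    # (Sketch.) Pull the points whose payload category matches, then pick
    # one at random so repeated searches return different products.
    points, _ = client.scroll(
        collection_name="imagebind_data",
        scroll_filter=models.Filter(
            must=[
                models.FieldCondition(
                    key="category", match=models.MatchValue(value=category)
                )
            ]
        ),
        limit=100,
        with_payload=True,
    )
    return Image.open(random.choice(points).payload["path"])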
Then list the product categories.
# Define your product categories
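# These names are illustrative (assumption): they must mirror whatever
# values the "category" payload field actually holds for your dataset.
product_categories = ["Boys Apparel", "Girls Apparel", "Men Footwear", "Women Footwear"]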
After that, we’ll define a function for category selection.
# Define function to handle category selection
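def handle_category_selection(category):
    # Gradio passes the dropdown value in; delegate to the lookup above.
    return get_images_of_category(category)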
Deploying with Gradio
Now, we’ll create Gradio interface components for the category selection: a category dropdown and an output image. (gr.Interface supplies the submit button automatically.)
# Create interface components for the category selection
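category_dropdown = gr.Dropdown(choices=product_categories, label="Product Category")
category_output = gr.Image(label="Product from the selected category")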
After that, we’ll create a Gradio interface and pass the functions and components.
category_search_interface = gr.Interface(
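    # (Sketch.) The arguments below wire up the pieces defined above.
    fn=handle_category_selection,
    inputs=category_dropdown,
    outputs=category_output,
    title="Category-Driven Product Search for E-Commerce",
)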
Merging Two Gradio Interfaces
We have deployed two Gradio interfaces: one for Reverse Image Search and another for Image Search Using Product Category. What if we could see both in one application? That can be done using TabbedInterface.
# Combine both interfaces into the same API
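# (Sketch.) The tab names echo the two interface titles above, and
# share=True asks Gradio for a temporary public URL alongside the local one.
demo = gr.TabbedInterface(
    [image_search_interface, category_search_interface],
    tab_names=[
        "Reverse Image Search with ImageBind for E-Commerce",
        "Category-Driven Product Search for E-Commerce",
    ],
)
demo.launch(share=True)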
After launching, we’ll get a local URL and a public URL. The application is ready with two tabs:
Tab 0: Reverse Image Search with ImageBind for E-Commerce
Tab 1: Category-Driven Product Search for E-Commerce
Examples
Let’s see how our application performs.
I passed an image of a girl wearing a sleeveless dress. Let’s see if our application can find a similar image among the products.
The application found a dress which is quite similar. Impressive!
Let’s try passing shoes as an image. I passed a picture of three people’s feet wearing sneakers. Let’s see the result.
The result is impressive. The output is sneakers of the same style.
Let’s try a category search. For example, I want to see what products are in the Boys Apparel category. Our Gradio app returns only one image as output, but it will not return the same image for every search.
The first search gave an image of a white t-shirt. Let’s search again to see what other t-shirts there are.
The result is a blue-gray t-shirt. So, the results are not repeated. Great!
Conclusion
With the Qdrant vector DB, reverse image search is possible for e-commerce products. We saw how to perform a reverse image search by uploading an image, and how to retrieve similar products from selected product categories. The results were accurate with the help of the ImageBind embedding model.
Hope you enjoyed reading this blog. Now it is your turn to try this integration.
CodeSpace
You can find the code on GitHub.
Thanks for reading!
This article was originally published here: https://medium.com/p/0a62f0169e19