Vector Databases & Weaviate
Introduction
Artificial intelligence has evolved at a rate few could have imagined. It is reshaping how work gets done across many fields. To support this, efficient data storage and processing have become crucial for reducing the runtime of applications that rely on semantic search and other computationally expensive operations.
Vector Databases
Vector databases emerged in the late 2010s. They are a type of database system built around storing and processing data as vectors. Unlike traditional relational databases, which store data in rows and columns, vector databases store data as high-dimensional vectors. These vectors represent data points in a multi-dimensional space, where each dimension corresponds to a particular attribute or feature of the data. Depending on the complexity and granularity of the data, the number of dimensions can be tuned to minimize feature loss while still representing the data efficiently. The vectors are typically generated by applying a transformation to the raw data, most often an embedding produced by a machine learning model. The main advantage of a vector database is fast and accurate similarity search: data is retrieved based on vector distance or similarity. Instead of querying on exact matches or predefined criteria, you can ask for the data that is most similar or relevant to a query in its semantic or contextual meaning.
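To make "retrieval by vector similarity" concrete, here is a minimal sketch of a brute-force similarity search using cosine similarity. The `StoredItem` type, the `nearest` helper, and the three-dimensional toy vectors are all hypothetical and chosen only for illustration. Real embeddings usually have hundreds of dimensions, and a real vector database would rely on an index rather than scanning every item.

```typescript
// Cosine similarity of two equal-length vectors:
// 1 means same direction, 0 means orthogonal (unrelated).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('vectors must have the same dimension');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Convention chosen for this sketch: a zero vector is similar to nothing.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical shape of a stored object: an id plus its embedding.
interface StoredItem {
  id: string;
  vector: number[];
}

// Brute-force search: rank every item by similarity to the query, keep the top k.
function nearest(query: number[], items: StoredItem[], k: number): StoredItem[] {
  return [...items]
    .sort((x, y) => cosineSimilarity(query, y.vector) - cosineSimilarity(query, x.vector))
    .slice(0, k);
}

// Toy "embeddings" (made up for this example, not produced by a real model).
const items: StoredItem[] = [
  { id: 'cat', vector: [0.9, 0.1, 0.0] },
  { id: 'kitten', vector: [0.8, 0.2, 0.1] },
  { id: 'car', vector: [0.0, 0.1, 0.9] },
];

const top2 = nearest([1, 0, 0], items, 2).map((item) => item.id);
console.log(top2); // the two items whose vectors point closest to the query
```

The query never has to match an item exactly. Items are ranked purely by how close their vectors are to the query vector, which is what lets a vector database return semantically related results.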
Weaviate
Weaviate is an open-source vector database. It stores each data object together with its vector, which enables very fast search for that object or for semantically similar objects.
Some features that Weaviate offers:
- Weaviate allows you to store and retrieve data objects based on their semantic properties by indexing them with vectors.
- Weaviate can be used stand-alone (aka bring your own vectors) or with a variety of modules that can do the vectorization for you and extend the core capabilities.
- Weaviate has a GraphQL API to access your data easily. You query only the fields you want, which keeps retrieval efficient.
- Weaviate is fast (check out their open-source benchmarks).
Weaviate is very easy to use, thanks to its comprehensive documentation and ready-to-use functions.
For example:
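// TypeScript/JavaScript client library for Weaviate
import weaviate from 'weaviate-ts-client';

// Connect to a Weaviate instance running locally
const client = weaviate.client({
  scheme: 'http',
  host: 'localhost:8080',
});

// Fetch the full schema (top-level await requires an ES module context)
const response = await client.schema
  .getter()
  .do();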
This code fetches the full schema, that is, the definitions of all classes in the database. To fetch the schema of one particular class instead, only a few changes are needed:
// TypeScript/JavaScript client library for Weaviate
import weaviate from 'weaviate-ts-client';

// Name of the class whose schema we want
const classname = 'your-classname';

const client = weaviate.client({
  scheme: 'http',
  host: 'localhost:8080',
});

// Fetch the schema of that single class
const response = await client.schema
  .classGetter()
  .withClassName(classname)
  .do();
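The GraphQL API and the "bring your own vectors" mode mentioned in the feature list can be sketched together. The helper below only builds the text of a GraphQL `Get` query that searches by a raw vector; it does not contact a server. The function name `buildNearVectorQuery`, the class name `Article`, the field `title`, and the example vector are assumptions made for illustration and are not taken from any real schema.

```typescript
// Build the text of a Weaviate GraphQL `Get` query that searches by a raw vector.
// This is a hypothetical helper for illustration, not part of weaviate-ts-client.
function buildNearVectorQuery(
  className: string,
  fields: string[],
  vector: number[],
  limit: number,
): string {
  return `{
  Get {
    ${className}(nearVector: {vector: [${vector.join(', ')}]}, limit: ${limit}) {
      ${fields.join(' ')}
      _additional { distance }
    }
  }
}`;
}

// `Article` and `title` are made-up names; substitute a class from your own schema.
const query = buildNearVectorQuery('Article', ['title'], [0.1, 0.2, 0.3], 3);
console.log(query);
```

Only the fields listed in the query (here `title`, plus the distance to the query vector) come back in the response, which is the "query only what you want" property of the GraphQL API. Assuming the local instance from the examples above, the string would be sent as the `query` field of a JSON body in a POST request to the `/v1/graphql` endpoint on `localhost:8080`.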
Weaviate also offers its cloud service to new users as a free 14-day trial. You can run queries in the built-in sandbox, or get an API key and use it locally with third-party applications. Because Weaviate is open source, anyone interested is welcome to contribute to its codebase. Be sure to check out this wonderful tool in your projects!
Written by
Harsh Shah
4th year b.tech in computer engineering