Introduction

What is a database, and why do we need one?

A database is an organized collection of data, structured so that records are easy to create, read, update, and delete. Databases are the backbone of software applications, and the right kind of database depends on the kind of data the application handles. Consider a traditional e-commerce website: it stores user accounts, products, orders, payments, product reviews, shopping carts, activity logs, and more, and a single type of database rarely serves all of these well. Product reviews, shopping carts, and activity logs are often a better fit for NoSQL databases, while relational databases suit user accounts, products, and orders. In short, different kinds of data call for different databases.

Relational Databases

In relational databases, data is stored in tables made up of rows and columns. Tables are related to one another through keys: a primary key uniquely identifies each row, and a foreign key in one table references the primary key of another. SQL is used to query and manage the data.

Sample SQL code:

CREATE TABLE departments (
    id INT PRIMARY KEY,
    department_name VARCHAR(100)
);

CREATE TABLE employees (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    department_id INT,
    salary DECIMAL(10, 2),
    FOREIGN KEY (department_id) REFERENCES departments(id)
);
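
To see how the foreign key ties the two tables together, here is a minimal sketch that runs the same schema from Python. It assumes SQLite (through Python's built-in sqlite3 module) rather than a server database, and the sample rows (Alice, Bob, Carol, Engineering) are made up for illustration.

```python
import sqlite3

# In-memory SQLite database, chosen only so the sketch runs without a server.
conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE departments (
    id INT PRIMARY KEY,
    department_name VARCHAR(100)
);
CREATE TABLE employees (
    id INT PRIMARY KEY,
    name VARCHAR(100),
    department_id INT,
    salary DECIMAL(10, 2),
    FOREIGN KEY (department_id) REFERENCES departments(id)
);
""")

# Hypothetical sample data.
conn.execute("INSERT INTO departments VALUES (1, 'Engineering')")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [(1, "Alice", 1, 70000.00), (2, "Bob", 1, 65000.00)],
)

# Join the two tables through the department_id foreign key.
rows = conn.execute("""
    SELECT e.name, d.department_name
    FROM employees AS e
    JOIN departments AS d ON d.id = e.department_id
    ORDER BY e.id
""").fetchall()
print(rows)

# An employee that points at a department that does not exist is rejected.
try:
    conn.execute("INSERT INTO employees VALUES (3, 'Carol', 99, 50000.00)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
print("Foreign key violation rejected:", fk_rejected)
```

The join is what the relationship buys us: employee rows only store a department id, and the department name is looked up when needed.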

Examples of Relational Databases include:

MySQL
PostgreSQL
Oracle Database
Microsoft SQL Server

Document-Oriented Databases

In a document-oriented database, data is stored as documents, typically in a JSON-like format. The document structure is flexible, which makes this model well suited to semi-structured or unstructured data. Document databases are one kind of NoSQL database: they are non-relational and do not enforce fixed, table-like schemas.

The main advantage of document databases is that they are easy to scale, typically by distributing data across multiple servers. They are especially useful for social media applications.

Here, instead of tables, we have collections, and instead of rows, we have documents.
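
The collection and document terminology can be sketched in plain Python. This is only an illustration of the data model, not the API of MongoDB or any other product: the find helper and the second product are hypothetical.

```python
# A "collection" modeled as a plain list of dict "documents".
products = [
    {"name": "iPhone 15", "price": 1299, "colors": ["black", "silver", "blue"], "inStock": True},
    # Documents in the same collection do not need to share the same fields.
    {"name": "USB-C Cable", "price": 19, "lengthMeters": 2},
]


def find(collection, criteria):
    """Return the documents whose fields equal every key/value pair in criteria."""
    return [
        doc for doc in collection
        if all(key in doc and doc[key] == value for key, value in criteria.items())
    ]


in_stock = find(products, {"inStock": True})
print([doc["name"] for doc in in_stock])
```

Notice that the second document has no inStock field at all. A relational table would need a column for it in every row, while a document collection simply skips it.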

Examples: MongoDB, CouchDB, Amazon DocumentDB

Format of storage:

{
    "name": "iPhone 15",
    "price": 1299,
    "colors": ["black", "silver", "blue"],
    "inStock": true
}

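Sample Python code (using the PyMongo driver, installed with pip install pymongo):

from pymongo import MongoClient

# Connect to a MongoDB server running locally on the default port
client = MongoClient("mongodb://localhost:27017/")

# Databases and collections are created lazily, on the first write
db = client["company_db"]
employees = db["employees"]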

MongoDB does not use SQL. Queries are written as JSON-like documents with a JavaScript-like syntax, and can be run through:

MongoDB Shell (mongosh)
MongoDB drivers (for example, PyMongo for Python)
MongoDB Compass (GUI)

Key-Value Stores

These databases store data as key-value pairs, where each unique key maps to a value. They are built for lookups by key, which makes reads and writes very fast. Typical applications include session storage, caching, and real-time counters.

Data in these databases looks like this:

"user:101" → {"name": "Alice", "cart": ["item1", "item2"]}

Examples: Redis, DynamoDB, Riak, Memcached

Sample Redis code:

SET emp:1 '{"name": "Alice", "dept": "Engineering", "salary": 70000}'
GET emp:1
SET emp:1 '{"name": "Alice", "dept": "Engineering", "salary": 75000}'
DEL emp:1
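
The same set, get, update, and delete flow can be sketched in Python. The KeyValueStore class below is a hypothetical in-memory stand-in, not a Redis client. It is only meant to show that the store treats each value as an opaque blob, so changing one field means reading the value, modifying it, and writing the whole value back.

```python
import json


class KeyValueStore:
    """Toy in-memory key-value store (hypothetical, for illustration only)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        # Returns None when the key does not exist.
        return self._data.get(key)

    def delete(self, key):
        # Returns 1 if a key was removed, 0 otherwise.
        existed = key in self._data
        self._data.pop(key, None)
        return int(existed)

    def incr(self, key):
        # Counter pattern: a missing key is treated as 0.
        self._data[key] = int(self._data.get(key, 0)) + 1
        return self._data[key]


store = KeyValueStore()
store.set("emp:1", json.dumps({"name": "Alice", "dept": "Engineering", "salary": 70000}))

# The store cannot update a single field: read, modify, then write back.
employee = json.loads(store.get("emp:1"))
employee["salary"] = 75000
store.set("emp:1", json.dumps(employee))
updated = json.loads(store.get("emp:1"))

# Real-time counter, one of the applications mentioned above.
store.incr("page:views")
store.incr("page:views")
views = store.get("page:views")

deleted = store.delete("emp:1")
print(updated, views, deleted)
```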

Graph Databases

These databases store data as nodes (entities) and edges (the relationships between them). They are ideal for scenarios where understanding and navigating relationships is crucial, such as social networks.

Examples:

Neo4j
Amazon Neptune
ArangoDB

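Sample Cypher code (the query language used by Neo4j):

// Create the nodes and relationships in one statement so that
// the variables a, b, and c stay bound across the CREATE clauses
CREATE (a:Person {name: "Alice", age: 30})
CREATE (b:Person {name: "Bob", age: 32})
CREATE (c:Company {name: "TechCorp"})
CREATE (a)-[:FRIENDS_WITH]->(b)
CREATE (a)-[:WORKS_AT]->(c);

// Separate statement: find Alice's friends
MATCH (a:Person {name: "Alice"})-[:FRIENDS_WITH]->(friend)
RETURN friend;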
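
To make the node and edge model concrete, here is a rough Python sketch of the same small graph. It assumes a simple adjacency structure built from dictionaries, which is not how Neo4j or any other graph database stores data internally, and the helper names add_edge and neighbors are hypothetical.

```python
from collections import defaultdict

# Nodes keyed by an id, each with a label and properties.
nodes = {
    "alice": {"label": "Person", "name": "Alice", "age": 30},
    "bob": {"label": "Person", "name": "Bob", "age": 32},
    "techcorp": {"label": "Company", "name": "TechCorp"},
}

# edges[source][relationship_type] -> list of target node ids
edges = defaultdict(lambda: defaultdict(list))


def add_edge(source, rel_type, target):
    """Add a directed, typed edge from source to target."""
    edges[source][rel_type].append(target)


def neighbors(source, rel_type):
    """Follow outgoing edges of one type and return the target nodes."""
    return [nodes[target] for target in edges.get(source, {}).get(rel_type, [])]


add_edge("alice", "FRIENDS_WITH", "bob")
add_edge("alice", "WORKS_AT", "techcorp")

# Equivalent in spirit to the MATCH ... RETURN friend query.
friends = neighbors("alice", "FRIENDS_WITH")
print([friend["name"] for friend in friends])
```

Because the edges are directed, Bob has no outgoing FRIENDS_WITH edge here, just as the Cypher pattern with an arrow only matches in one direction.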

Vector Databases

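These databases are designed to store, manage, and search high-dimensional vector embeddings, which are numeric representations of content such as text or images. To make retrieval fast, many vector databases build an index that groups similar embeddings together, so a query only has to be compared against nearby vectors.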

Let’s say you are searching for something related to “Banks.” The query is converted into an embedding, and the vector database returns the stored documents whose embeddings are closest to it. You can adjust the similarity threshold, or the number of results, to control how closely the returned documents must match.

Vector databases are becoming popular because of generative AI.
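
The idea of similarity search can be sketched in a few lines of Python. This is a brute-force illustration under stated assumptions, not how a real vector database is implemented: the three-dimensional embeddings are hand-made values, the query vector stands in for an embedded “Banks” query, and the search function with its top_k and min_score parameters is hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made 3-dimensional "embeddings" (hypothetical values).
documents = {
    "Banks offer savings accounts and loans.": [0.9, 0.1, 0.0],
    "Interest rates affect mortgage payments.": [0.7, 0.3, 0.1],
    "Python is a popular programming language.": [0.0, 0.2, 0.9],
}


def search(query_vector, top_k=2, min_score=0.0):
    """Score every document, drop those below min_score, return the best top_k."""
    scored = [
        (cosine_similarity(query_vector, vector), text)
        for text, vector in documents.items()
    ]
    scored = [pair for pair in scored if pair[0] >= min_score]
    scored.sort(reverse=True)
    return scored[:top_k]


query = [1.0, 0.1, 0.0]  # stand-in embedding for a query about "Banks"
results = search(query, top_k=2, min_score=0.5)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

Raising min_score returns fewer, more closely related documents, while lowering it or raising top_k returns more. Real vector databases avoid comparing the query against every stored vector by using an index.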

Examples of vector databases include: Pinecone, Chroma, Weaviate, etc.

Query interfaces:

Vector store	Interface
Pinecone	REST API & SDKs (Python, etc.)
Chroma	Client libraries (Python, JavaScript/TypeScript)
Weaviate	GraphQL & REST APIs, client SDKs

Sample Chroma code:

import chromadb

# In-memory client; data is not persisted between runs
client = chromadb.Client()

collection = client.get_or_create_collection(name="my_documents")

# Chroma embeds the documents with its default embedding function
collection.add(
    documents=[
        "Python is a popular programming language.",
        "LangChain is a framework for building LLM apps.",
        "Databases store and manage data efficiently."
    ],
    ids=["doc1", "doc2", "doc3"],
    metadatas=[
        {"topic": "programming"},
        {"topic": "LLM"},
        {"topic": "database"}
    ]
)

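# The query text is embedded the same way and compared against the stored documents
results = collection.query(
    query_texts=["What is LangChain used for?"],
    n_results=2
)

print("Query results:")
# Chroma returns distances, so a lower value means a closer match
for doc, distance in zip(results['documents'][0], results['distances'][0]):
    print(f"→ {doc} (distance: {distance:.4f})")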

Conclusion

A single database rarely satisfies all the needs of an application, so we need to choose the appropriate database for each use case. We use relational databases for structure and integrity, whereas we use NoSQL databases for scalability and flexibility. Vector databases are useful for generative AI applications, and graph databases are useful for modeling relationships.

We need to know where each type of database fits. Data architects, developers, and data engineers should choose databases based on the requirements of the application.
