Database Selection Made Easy: How to Find the Best Fit for Your Project
Today, my co-founder Idil Saglam and I discussed which database to use for Vestiaire.io. The discussion made us realize how much debate there is around databases on social media. But is one database better than all the others? No.
No single database fits every project. However, the thought process for choosing one is the same regardless of the project. In this devlog, we document how we chose a database for Vestiaire.io so you can apply the same process to your projects.
Why is it Important?
A database is a tool for storing data. It is hosted on a server that runs 24/7, and your application interacts with it through an API.
Because they run 24/7, databases represent a significant share of a project's cost. The cost of a server generally depends on factors like storage, CPU, and RAM usage. Resource usage differs from one database to another, and so does response time, which directly affects your application's user experience.
Defining Your Use Cases
A clear vision of use cases is key when choosing the best database for your project.
But how do we define the use case?
The first step is to list everything that should be stored in the database.
For us at Vestiaire.io, we need to store the following data:
User details
Product information: multiple types of products with different properties (glasses, clothing, accessories, cosmetics, etc.)
Activity logs
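To make this list concrete, here is a minimal Python sketch of the three kinds of records. The field names (email, brand, price, properties, action) are illustrative assumptions, not the actual Vestiaire.io schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any


@dataclass
class User:
    id: int
    email: str  # hypothetical field, for illustration only


@dataclass
class Product:
    id: int
    type: str  # e.g. "clothing", "accessories", "cosmetics"
    brand: str
    price: float
    # Each product type has different properties, so they go in a free-form dict.
    properties: dict[str, Any] = field(default_factory=dict)


@dataclass
class ActivityLog:
    user_id: int
    product_id: int
    action: str  # e.g. "viewed" (hypothetical action name)
    at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


user = User(id=1, email="ada@example.com")
shirt = Product(id=10, type="clothing", brand="Acme", price=29.0,
                properties={"color": "blue", "size": "M"})
log = ActivityLog(user_id=user.id, product_id=shirt.id, action="viewed")
```

Writing the records down like this already raises the questions a database has to answer: how to store the free-form properties, and how to link a log entry to its user and product.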
The next step is to describe how your application will use this data. Applications rarely use data as is, without any relations or processing. At Vestiaire.io:
We want to relate users, products, and activity logs.
We want to implement a search using filters like color, shape, brand, and price for each product.
Our suggestion API returns a list of products, a maximum of one per type, depending on the user’s preferences and the other elements in the list.
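The last two use cases can be sketched in plain Python before any database is involved. This is a simplified illustration, not our actual API: the catalog, the function names, and the single "preferred color" preference are assumptions, and the sketch ignores how the other elements in the list influence a suggestion.

```python
# Hypothetical in-memory catalog standing in for the product store.
CATALOG = [
    {"id": 1, "type": "clothing", "brand": "Acme", "color": "blue", "price": 29.0},
    {"id": 2, "type": "clothing", "brand": "Zen", "color": "red", "price": 45.0},
    {"id": 3, "type": "accessory", "brand": "Acme", "color": "blue", "price": 15.0},
    {"id": 4, "type": "cosmetic", "brand": "Lux", "color": "red", "price": 60.0},
]


def search(catalog, max_price=None, **filters):
    """Return products matching every exact-match filter and the price cap."""
    results = []
    for product in catalog:
        if max_price is not None and product["price"] > max_price:
            continue
        if all(product.get(key) == value for key, value in filters.items()):
            results.append(product)
    return results


def suggest(catalog, preferred_color):
    """Pick at most one product per type, preferring the user's color."""
    best = {}
    for product in catalog:
        current = best.get(product["type"])
        matches = product["color"] == preferred_color
        # Keep the first product of each type, unless a later one matches
        # the preference and the current pick does not.
        if current is None or (matches and current["color"] != preferred_color):
            best[product["type"]] = product
    return list(best.values())


cheap_blue = search(CATALOG, max_price=30, color="blue")
suggestions = suggest(CATALOG, preferred_color="red")
```

Even this toy version shows the two access patterns the database must serve well: filtering on several properties at once, and selecting across product types based on something we know about the user.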
Once you have a clear list of what is stored in your database and how your application will use this data, you can evaluate the options for your use case.
Comparing Different Possibilities for Your Use Case
The Key Difference Between Different Databases
Databases are just tools that let you store data. But what’s the difference between them? To answer this, let’s divide databases into categories: NoSQL, SQL, and Graph.
SQL Databases
SQL databases, also called relational databases, store structured data in tables made of rows and columns. Foreign keys let a row in one table reference a row in another.
They provide ACID (Atomicity, Consistency, Isolation, Durability) guarantees, which makes them a trusted choice for mission-critical systems such as finance and healthcare.
Examples of SQL databases are PostgreSQL, MySQL, MariaDB, Oracle Database, and Microsoft SQL Server. ClickHouse also speaks SQL, but it is a column-oriented analytics database rather than a general-purpose transactional one.
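Here is a small sketch of tables and foreign keys using SQLite, which ships with Python. The table and column names are assumptions made up for the example. It shows a JOIN that reassembles related rows and the database refusing a log entry that points at a user who does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE activity_logs (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        product_id INTEGER NOT NULL REFERENCES products(id),
        action TEXT NOT NULL
    );
""")

conn.execute("INSERT INTO users VALUES (1, 'ada@example.com')")
conn.execute("INSERT INTO products VALUES (10, 'Blue shirt')")
conn.execute(
    "INSERT INTO activity_logs (user_id, product_id, action) VALUES (1, 10, 'viewed')"
)

# Follow the foreign keys to rebuild a readable log line.
rows = conn.execute("""
    SELECT u.email, p.name, a.action
    FROM activity_logs AS a
    JOIN users AS u ON u.id = a.user_id
    JOIN products AS p ON p.id = a.product_id
""").fetchall()

# User 99 does not exist, so the foreign key constraint rejects this row.
try:
    conn.execute(
        "INSERT INTO activity_logs (user_id, product_id, action) VALUES (99, 10, 'viewed')"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

The rejected insert is the point: the database itself, not the application code, guarantees that every log entry refers to a real user and a real product.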
NoSQL Databases
NoSQL databases generally do not require a strict schema, often relax ACID guarantees (although some, such as MongoDB and DynamoDB, now offer transactions), and usually do not enforce relationships between records. These trade-offs can allow faster response times and easier horizontal scaling. However, without enforced relations, related data is often duplicated across records, which increases storage usage. These databases are suitable for real-time applications or those handling huge amounts of data.
Examples of NoSQL databases are MongoDB, Cassandra, Couchbase, and DynamoDB.
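The storage trade-off can be illustrated without any database at all. In this sketch, plain Python dictionaries stand in for records, and the brand data is a made-up example. The document-style version embeds a copy of the brand in every product, so a read needs no lookup but the same data is stored several times.

```python
import json

# Relational style: the brand is stored once and referenced by id.
brands = {1: {"name": "Acme", "country": "FR"}}
products_normalized = [
    {"id": 10, "name": "Blue shirt", "brand_id": 1},
    {"id": 11, "name": "Red scarf", "brand_id": 1},
]

# Document style: each product embeds its own copy of the brand.
products_documents = [
    {"id": p["id"], "name": p["name"], "brand": dict(brands[p["brand_id"]])}
    for p in products_normalized
]

# Serialized size is a rough stand-in for storage usage.
normalized_size = len(json.dumps(brands)) + len(json.dumps(products_normalized))
document_size = len(json.dumps(products_documents))

# Reading the brand of a document needs no second lookup...
brand_name = products_documents[0]["brand"]["name"]

# ...but renaming the brand means updating every copy.
for doc in products_documents:
    doc["brand"]["name"] = "Acme Paris"
```

With two products the difference is small, but it grows with every record that embeds the same related data.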
Graph Databases
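Graph databases, the least known category (and often classified as a kind of NoSQL database), store data as a graph of nodes and edges. Because relationships are stored as data in their own right, they can model more complex relationships than SQL databases and traverse them quickly, with response times similar to NoSQL databases.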
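To show what nodes and edges look like in practice, here is a tiny in-memory graph in Python. It is only a sketch of the data model, not how a real graph database stores or queries data, and the node names and the LIKED relationship are invented for the example.

```python
from collections import defaultdict

# Each edge is (source node, relationship, target node).
edges = [
    ("user:ada", "LIKED", "product:shirt"),
    ("user:ada", "LIKED", "product:scarf"),
    ("user:bob", "LIKED", "product:shirt"),
    ("user:bob", "LIKED", "product:glasses"),
]

# Index edges in both directions so a traversal can hop either way.
outgoing = defaultdict(set)
incoming = defaultdict(set)
for source, relation, target in edges:
    outgoing[(source, relation)].add(target)
    incoming[(target, relation)].add(source)


def also_liked(user):
    """Products liked by users who share at least one liked product with `user`."""
    mine = outgoing[(user, "LIKED")]
    suggestions = set()
    for product in mine:
        for other in incoming[(product, "LIKED")]:
            if other != user:
                suggestions |= outgoing[(other, "LIKED")]
    return suggestions - mine


for_ada = also_liked("user:ada")
for_bob = also_liked("user:bob")
```

The query is a walk along edges: from a user to their products, to other users who liked those products, to what those users liked. Graph databases are built to make this kind of hop cheap.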
Choosing the Database that Fits Your Use Case
After evaluating the different database categories, it’s time to choose one for your use case. For Vestiaire.io, we need to define relationships between users and products and are not expecting a petabyte-scale database, so we can consider SQL or Graph databases.
Our data structure at Vestiaire.io requires several kinds of product-user relations, which means multiple JOINs between tables if we use an SQL database. This can slow queries down, especially complex ones like those behind our recommendation API.
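As an illustration of how the JOINs pile up, here is a sketch of a simple "users who liked this also liked" query in SQLite. The likes table and its data are assumptions for the example, not our schema. One relation table already needs to be joined to itself twice, plus a subquery to exclude what the user has already liked.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE likes (user_id TEXT NOT NULL, product_id TEXT NOT NULL);
    INSERT INTO likes VALUES
        ('ada', 'shirt'), ('ada', 'scarf'),
        ('bob', 'shirt'), ('bob', 'glasses');
""")

query = """
    SELECT DISTINCT theirs.product_id
    FROM likes AS mine
    -- other users who liked one of my products
    JOIN likes AS shared ON shared.product_id = mine.product_id
                        AND shared.user_id <> mine.user_id
    -- everything those users liked
    JOIN likes AS theirs ON theirs.user_id = shared.user_id
    WHERE mine.user_id = ?
      AND theirs.product_id NOT IN (
          SELECT product_id FROM likes WHERE user_id = ?
      )
"""
recommended = [row[0] for row in conn.execute(query, ("ada", "ada"))]
```

Every additional kind of relation (purchases, views, wishlists) adds more tables and more JOINs to queries like this one.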
We also plan to add many features to Vestiaire.io soon, so our data structure needs to stay flexible, which is harder with the fixed schema of an SQL database.
Thus, for our use case at Vestiaire.io, a Graph Database seems to be the best option compared to NoSQL and SQL databases.
Where to Host the Database?
A database runs as a server that stores data on its local file system and must run 24/7 to answer data requests.
You can host it yourself (self-hosted) or use a service provider.
The Maintenance Challenge
The most challenging part of hosting a database is maintenance. Updating your database can risk data loss and downtime.
Your database is more critical than your code: you can rewrite code, but you cannot recreate lost user data. And while the database is down, your application is unusable.
Issues with Service Providers
Service providers also have drawbacks:
Privacy and Security: You have to trust your provider not to sell or compromise your data.
Vendor Lock-in: Because your data lives with the provider, any change to their service affects you.
Pricing: Providers bill based on resource usage, creating cost uncertainty, especially in the early stages.
Best of Both Worlds
The best scenario is a service provider whose database is also available as open source, so you can self-host it if needed, and who offers a generous free tier.
If you can use a graph database, Neo4j fits this perfectly. Their startup program offers a free enterprise license to companies with less than $3M in annual revenue and fewer than 50 employees, and in return they ask you to share your story with the Neo4j community.
Deciding which database to choose is not a matter of preference but of meeting the project's requirements. Different databases can achieve similar results since they are just tools. Carefully choosing the best fit for your project can decrease development time, which in turn improves scalability and revenue.
Thanks for reading
Written by
Kaan Yagci
I am a freelance platform and software engineer working remotely with over ten years of experience. I am passionate about building scalable and revenue-optimized products. Over the years, I have built different products as a solo developer or with different-sized teams. I am here to share my knowledge and help others optimize their projects without years of learning and trying.