Messenger Application System Design
Messaging plays an important role these days: we do a lot of our communication by sharing text with each other rather than talking on the phone. That is why I thought writing an article on designing a messaging application would be a great idea.
Let's get started...
Let's go with a structured approach and wrap up the design process in 4 steps:
Write down the MVP.
Estimate the scale.
Design Trade Off.
System Design deep dive.
Write down the MVP:
In this step we discuss the key functionalities of the product that we are going to design.
-> Send or receive a message.
-> 1:1 messaging.
-> Group messaging.
-> Conversations that messages belong to.
Estimate the scale of Application: In this step we must estimate the traffic that could hit our application.
Assume that we have 2 billion users, of which 400 million are daily active users.
Assume each of them sends at most 50 messages a day.
So the messages per day = 400 million x 50, which is 20 billion messages per day.
The write queries per second would be 20 billion / 86400, which is roughly 231,000. Assuming read queries are 4 times the write queries, reads would be roughly 926,000 per second.
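Let's calculate the size of the message content:
Assume an average message size of 200 bytes.
So 200 bytes x 20 billion is equal to 4000 GB, meaning 4 TB of messages per day. At this rate the data quickly outgrows what a single database machine can hold, so we can say that database sharding is needed.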
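The estimate above can be reproduced with a few lines of arithmetic. This is only a back-of-envelope sketch: the constants are the assumptions stated in this article, and the variable names are made up for illustration.

```python
# Back-of-envelope scale estimate using the numbers assumed in this article.
DAILY_ACTIVE_USERS = 400_000_000
MESSAGES_PER_USER_PER_DAY = 50      # assumed upper bound per user
SECONDS_PER_DAY = 86_400
READ_TO_WRITE_RATIO = 4             # assumed: reads are 4x writes
MESSAGE_SIZE_BYTES = 200            # assumed average message size

messages_per_day = DAILY_ACTIVE_USERS * MESSAGES_PER_USER_PER_DAY
write_qps = messages_per_day / SECONDS_PER_DAY
read_qps = write_qps * READ_TO_WRITE_RATIO

# Decimal units: 1 TB = 10**12 bytes.
storage_tb_per_day = messages_per_day * MESSAGE_SIZE_BYTES / 10**12

print(f"messages per day : {messages_per_day:,}")
print(f"write QPS        : {write_qps:,.0f}")
print(f"read QPS         : {read_qps:,.0f}")
print(f"storage per day  : {storage_tb_per_day} TB")
```

Changing any one assumption (for example the message size) shows immediately how sensitive the storage figure is to it.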
Design Trade offs: By the CAP theorem, when a network partition occurs we must choose one of the two, availability or consistency. In this application I prefer to go with consistency, because I do not want my application to be eventually consistent. For example, if a user sends a message and only gets an acknowledgement 5 minutes later saying that the message failed to send, that is not a pleasant experience, I believe.
So I want my application to have reasonably low latency and high consistency, with availability compromised.
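One way to picture "consistency over availability" is a write that is acknowledged only when a majority of replicas have stored it, and is rejected immediately otherwise. The sketch below is a simplified illustration under that assumption; `Replica`, `WriteRejected` and `send_with_quorum` are hypothetical names, not part of any real database API.

```python
class Replica:
    """Hypothetical in-memory stand-in for one database replica."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.messages = {}

    def write(self, message_id, text):
        # An unreachable replica cannot acknowledge the write.
        if not self.healthy:
            return False
        self.messages[message_id] = text
        return True


class WriteRejected(Exception):
    """Raised when fewer than a majority of replicas acknowledged."""


def send_with_quorum(replicas, message_id, text):
    """Acknowledge a message only if a majority of replicas stored it."""
    quorum = len(replicas) // 2 + 1
    acks = sum(1 for replica in replicas if replica.write(message_id, text))
    if acks < quorum:
        # Fail fast so the sender learns about the failure right away.
        # A real system would also clean up the partial writes left on
        # the replicas that did respond; this sketch does not.
        raise WriteRejected(f"only {acks} of {len(replicas)} replicas acknowledged")
    return acks
```

With two of three replicas healthy the write succeeds; with only one healthy the sender gets an immediate error instead of a late failure notice, which is the availability being given up.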
Design Deep dive :
API design :
(a) sendMessage(sender_id,conversation_id,text, message_id,meta-data,timestamp);
(b) getConversationList(user_id,offset,limit);
(c) getMessages(user_id, conversation_id,offset,limit);
Here the sendMessage API is a simple POST request. I prefer conversation_id over receiver_id because in group chats a message has many receivers, so I treat every chat as a conversation and refer to it by an id.
getConversationList is a GET API, responsible for fetching the list of all conversations the user has.
getMessages is again a GET API, which fetches the messages of a particular conversation.
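To make the three APIs concrete, here is a minimal in-memory sketch of their behaviour. It is not a server implementation: the storage is plain dictionaries, `add_member` is a hypothetical helper that the article does not define, and the ordering choices (most recently active conversation first, newest message first) are assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Message:
    message_id: str
    sender_id: str
    conversation_id: str
    text: str
    timestamp: float
    metadata: dict = field(default_factory=dict)


class MessengerService:
    def __init__(self):
        self._members = defaultdict(set)     # conversation_id -> user_ids
        self._messages = defaultdict(list)   # conversation_id -> messages
        self._last_activity = {}             # conversation_id -> timestamp

    def add_member(self, conversation_id, user_id):
        """Hypothetical helper: put a user into a conversation."""
        self._members[conversation_id].add(user_id)

    def send_message(self, sender_id, conversation_id, text,
                     message_id, metadata, timestamp):
        if sender_id not in self._members.get(conversation_id, set()):
            raise PermissionError("sender is not part of this conversation")
        message = Message(message_id, sender_id, conversation_id,
                          text, timestamp, metadata)
        self._messages[conversation_id].append(message)
        previous = self._last_activity.get(conversation_id, timestamp)
        self._last_activity[conversation_id] = max(previous, timestamp)
        return message

    def get_conversation_list(self, user_id, offset, limit):
        # Assumption: most recently active conversations come first.
        conversations = [c for c, members in self._members.items()
                         if user_id in members]
        conversations.sort(key=lambda c: self._last_activity.get(c, 0.0),
                           reverse=True)
        return conversations[offset:offset + limit]

    def get_messages(self, user_id, conversation_id, offset, limit):
        if user_id not in self._members.get(conversation_id, set()):
            raise PermissionError("user is not part of this conversation")
        # Assumption: newest messages come first.
        newest_first = sorted(self._messages.get(conversation_id, []),
                              key=lambda m: m.timestamp, reverse=True)
        return newest_first[offset:offset + limit]
```

Because every call is keyed by conversation_id, a 1:1 chat and a group chat go through exactly the same code path; only the size of the member set differs.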
Database Sharding and the Sharding keys : I have selected a NoSQL database because the message content can vary. I have divided the database into three collections: i. user data collection ii. conversation data collection iii. latest conversation data collection.
User data sharding key : for the user data collection I have chosen user_id as the sharding key, so any query related to a user's data would be an intra-shard query.
Conversation data sharding key : for the conversation data collection I have chosen conversation_id as the sharding key. This makes any query about a particular conversation an intra-shard query.
Latest Conversation data sharding key : for the latest conversation data collection I have chosen user_id as the sharding key. This makes the query that fetches the conversations a user is involved in an intra-shard query.
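The sharding keys above can be sketched as a small routing function. This is an illustration only: the shard count, the collection names and the hash-then-modulo scheme are assumptions, not something prescribed by a particular database.

```python
import hashlib

NUM_SHARDS = 8  # assumed shard count, for illustration only

# Sharding key per collection, as chosen in this article.
# The collection names themselves are hypothetical.
SHARD_KEYS = {
    "users": "user_id",
    "conversations": "conversation_id",
    "latest_conversations": "user_id",
}


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a sharding-key value to a shard index with a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def route(collection: str, document: dict) -> int:
    """Pick the shard for a document based on its collection's sharding key."""
    return shard_for(document[SHARD_KEYS[collection]])


if __name__ == "__main__":
    print(route("users", {"user_id": "u1"}))
    print(route("conversations", {"conversation_id": "c1", "text": "hello"}))
```

All documents with the same key value land on the same shard, which is what keeps the queries described above inside one shard. A plain modulo moves most keys when the shard count changes, so a real deployment might prefer something like consistent hashing.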
WebSockets are used to push the latest updates on any conversation to the users.
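The push side can be pictured as a registry of open connections, with each new message fanned out to the conversation members who are currently connected. The sketch below uses an asyncio.Queue as a stand-in for a real WebSocket connection so that it stays self-contained; `ConnectionRegistry` and its methods are hypothetical names.

```python
import asyncio


class ConnectionRegistry:
    """Tracks which users currently have an open connection."""

    def __init__(self):
        self._connections = {}  # user_id -> queue standing in for a socket

    def connect(self, user_id):
        queue = asyncio.Queue()
        self._connections[user_id] = queue
        return queue

    def disconnect(self, user_id):
        self._connections.pop(user_id, None)

    async def publish(self, members, sender_id, payload):
        """Push a payload to every connected member except the sender."""
        delivered = []
        for user_id in members:
            if user_id == sender_id:
                continue
            queue = self._connections.get(user_id)
            if queue is not None:
                await queue.put(payload)
                delivered.append(user_id)
        return delivered


async def demo():
    registry = ConnectionRegistry()
    bob_socket = registry.connect("bob")   # carol stays offline
    delivered = await registry.publish(
        ["alice", "bob", "carol"], "alice",
        {"conversation_id": "c1", "text": "hello"},
    )
    received = await bob_socket.get()
    return delivered, received


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Members who are offline simply receive nothing here; they would pick up the message later through getMessages.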
Written by Sachin Allugani