Why Do LinkedIn and Twitter (Now X) Use NoSQL Databases? The Hidden Tech Behind Social Media Giants


Introduction: The Social Media Data Explosion
Every second, millions of people like, share, and comment on posts across social media platforms. Did you know?
Twitter (Now X) processes 500 million tweets per day (~6,000 tweets per second).
LinkedIn handles over 100,000 profile views per second.
Facebook, Twitter, and LinkedIn combined store over 500 petabytes of data.
If these platforms relied only on traditional SQL databases, they would struggle to scale and to keep response times low. This is why they use NoSQL databases, a more scalable and flexible alternative for this kind of workload.
Why Traditional SQL Databases Struggle
Most relational databases (MySQL, PostgreSQL) work well for small-scale applications, but they don’t handle large, real-time data flows efficiently.
Major SQL Limitations for Social Media Platforms
SQL Challenges | Impact on Social Media |
Limited Scalability | Social media platforms need to scale across multiple servers; SQL struggles with horizontal scaling. |
Schema Rigidity | Social media data is dynamic (e.g., tweets, likes, follows). SQL requires a fixed structure, making updates difficult. |
Slow Performance for Big Data | Twitter processes thousands of tweets per second, and SQL joins slow down retrieval speeds. |
High Write & Read Load | Billions of database transactions per day can overload SQL servers. |
💡 Example: Imagine a tweet goes viral. SQL databases would struggle to update millions of user timelines instantly.
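To make the viral-tweet problem concrete, here is a minimal sketch of "fan-out on write", where one new tweet becomes one timeline write per follower. The follower data, the `publish` helper, and the `TIMELINE_LIMIT` cap are all hypothetical names used for illustration, not Twitter's actual code.

```python
from collections import defaultdict, deque

TIMELINE_LIMIT = 3  # hypothetical cap on entries kept per timeline

# Toy follower graph: author -> list of follower IDs.
followers = {"alice": ["bob", "carol", "dave"]}

# One bounded timeline per user; the newest tweet sits at index 0.
timelines = defaultdict(lambda: deque(maxlen=TIMELINE_LIMIT))


def publish(author, tweet_id):
    """Push tweet_id onto every follower's timeline and return the write count."""
    writes = 0
    for follower in followers.get(author, []):
        timelines[follower].appendleft(tweet_id)
        writes += 1
    return writes


writes = publish("alice", "t1")
print(f"1 tweet caused {writes} timeline writes")
```

The write count grows linearly with the follower count, so an account with millions of followers turns a single tweet into millions of writes. That is the load a single database server has trouble absorbing.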
Why NoSQL? The Backbone of Social Media Data
NoSQL databases overcome SQL’s limitations by providing high scalability, real-time performance, and flexible schema design.
🌍 Key Advantages of NoSQL
Feature | SQL (Relational DB) | NoSQL (Voldemort, Manhattan) |
Scalability | Vertical (Limited) | Horizontal (add servers as load grows) |
Schema | Fixed (Tables, Columns) | Flexible (JSON, Key-Value, Graph, Column-based) |
Read/Write Speed | Slower due to joins | Fast due to distributed architecture |
Use Case | Structured data, transactions | Unstructured data, real-time queries |
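The "flexible schema" and "no joins" rows can be illustrated with a small sketch. It assumes a plain in-memory dictionary standing in for a key-value store, and the `put` and `get` helpers and the key names are hypothetical. Each record is a self-contained JSON document, so two posts can carry different fields and each is read with a single key lookup.

```python
import json

store = {}  # stands in for a key-value store: key -> JSON string


def put(key, record):
    """Serialize the record and store it under the key."""
    store[key] = json.dumps(record)


def get(key):
    """Return the decoded record, or None if the key is absent."""
    raw = store.get(key)
    return json.loads(raw) if raw is not None else None


# Two posts with different shapes; no table definition has to change.
put("post:1", {"author": "alice", "text": "hello", "likes": 2})
put("post:2", {"author": "bob", "text": "poll", "poll_options": ["a", "b"]})

post = get("post:2")
print(post["poll_options"])
```

The trade-off is that the application, not the database, has to cope with fields that may be missing from older records.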
🔵 How LinkedIn Uses NoSQL (Project Voldemort)
🔍 The Challenge:
LinkedIn stores over 100 terabytes of user data.
Handles 100,000+ queries per second for:
Profile views
Job recommendations
People You May Know feature
🔹 Solution: Project Voldemort
Voldemort is an open-source distributed key-value store developed at LinkedIn and modeled on Amazon's Dynamo design.
🔹 Why LinkedIn Chose Voldemort:
✅ High-speed profile lookups (fetches user data in <5 milliseconds).
✅ Decentralized storage (removes single points of failure).
✅ Reduces data retrieval latency by 40% compared to SQL.
💡 Real-World Example
When you see "People You May Know" on LinkedIn, Voldemort fetches user data across multiple servers in real-time to suggest connections within milliseconds.
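As a rough illustration of how a feature like this can sit on top of key lookups, the sketch below ranks "friends of friends" by the number of mutual connections. It is not LinkedIn's actual algorithm: the `connections` data and the `suggest` function are hypothetical, and a plain dictionary stands in for the key-value store.

```python
from collections import Counter

# Toy adjacency data: member ID -> set of direct connections.
connections = {
    "ann": {"ben", "cat"},
    "ben": {"ann", "dan", "eve"},
    "cat": {"ann", "dan"},
    "dan": {"ben", "cat"},
    "eve": {"ben"},
}


def suggest(user, limit=5):
    """Rank second-degree contacts by how many mutual connections they share."""
    direct = connections.get(user, set())
    mutual = Counter()
    for friend in direct:                      # one lookup per connection
        for candidate in connections.get(friend, set()):
            if candidate != user and candidate not in direct:
                mutual[candidate] += 1
    # Highest mutual count first; ties are broken by name for stable output.
    return sorted(mutual.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]


print(suggest("ann"))  # [('dan', 2), ('eve', 1)]
```

Each suggestion needs one lookup for the member plus one per connection, which is why fast, predictable key reads matter for this kind of feature.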
🛠️ Voldemort’s Architecture
Key-Value Storage: Data is stored as simple key-value pairs.
Partitioning & Replication: Ensures fault tolerance and load balancing.
Eventual Consistency Model: Prioritizes speed over strong consistency.
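Partitioning and replication in Dynamo-style stores are commonly built on a consistent hash ring. The following is a minimal sketch of that general technique, not Voldemort's actual implementation: the `Ring` class, the node names, and the `vnodes` and `replicas` parameters are assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(value):
    """Map a string to a large integer position on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self, nodes, vnodes=64, replicas=2):
        self.replicas = replicas
        self.node_count = len(set(nodes))
        # Each node gets several virtual points so keys spread more evenly.
        self._points = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self._points]

    def owners(self, key):
        """Walk clockwise from the key's position, collecting distinct nodes."""
        want = min(self.replicas, self.node_count)
        start = bisect.bisect_right(self._hashes, _hash(key))
        found = []
        for step in range(len(self._points)):
            node = self._points[(start + step) % len(self._points)][1]
            if node not in found:
                found.append(node)
                if len(found) == want:
                    break
        return found


ring = Ring(["node-a", "node-b", "node-c"])
print(ring.owners("member:42"))  # two distinct nodes hold this key
```

The benefit of the ring is that removing or adding a server only moves the keys that server owned, instead of reshuffling every key as a simple `hash(key) % server_count` scheme would.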
How Twitter (Now X) Uses NoSQL (Manhattan)
🔍 The Challenge:
Over 500M tweets per day (~6,000 per second).
Likes, retweets, and replies together generate millions of read/write operations.
Twitter’s previous SQL system couldn’t handle viral tweets efficiently.
🔹 Solution: Manhattan – Twitter’s Custom NoSQL Database
Manhattan is a distributed NoSQL system designed for ultra-low latency.
🔹 Why Twitter Chose Manhattan:
✅ Handles 150,000+ database queries per second.
✅ Ensures timeline updates in <1 second.
✅ Stores petabytes of tweets across multiple data centers.
✅ Can withstand sudden traffic spikes (e.g., viral tweets, breaking news).
💡 Real-World Example
When a tweet goes viral, Manhattan distributes data across thousands of servers to ensure real-time visibility.
🛠️ Manhattan’s Architecture
Multi-Model Support: Stores Key-Value, Column Family, and Graph data.
Global Replication: Copies tweets across data centers so they become available worldwide with minimal delay.
Real-Time Analytics: Helps Twitter rank and recommend trending content.
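Replication across data centers usually means a remote copy can lag briefly behind the local one. The sketch below shows that behavior with a simple "highest version wins" rule. It is an illustration under stated assumptions, not Manhattan's actual protocol: the `Replica` class, the `sync` function, and the integer version numbers are hypothetical.

```python
class Replica:
    """A toy data-center copy that keeps the highest version seen per key."""

    def __init__(self, name):
        self.name = name
        self.data = {}  # key -> (version, value)

    def write(self, key, value, version):
        current = self.data.get(key)
        # Accept the write only if it is newer than what is already stored.
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None


def sync(source, target):
    """Replay every entry from source onto target (stands in for async replication)."""
    for key, (version, value) in source.data.items():
        target.write(key, value, version)


us = Replica("us")
eu = Replica("eu")

us.write("tweet:1", "hello", version=1)
stale = eu.read("tweet:1")   # replication has not run yet
sync(us, eu)
fresh = eu.read("tweet:1")   # the copy has now caught up

print(stale, fresh)  # None hello
```

The short window in which `eu` returns nothing is the price of accepting writes locally without waiting for every region to confirm them.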
📊 NoSQL vs. SQL – The Final Verdict
Feature | SQL Databases (MySQL, PostgreSQL) | NoSQL Databases (Voldemort, Manhattan) |
Performance | Slower due to joins | Faster, distributed storage |
Scalability | Vertical (limited) | Horizontal (scales across servers) |
Schema Flexibility | Fixed schema | Schema-less, flexible data models |
Best Use Cases | Small apps, transactions | High-scale, real-time applications |
📌 Conclusion: NoSQL databases outperform SQL for real-time, large-scale applications like LinkedIn & Twitter.
📢 “Do you think SQL is outdated for social media platforms? Or does it still have a place? Let’s discuss in the comments!”
Written by Farhan Khan
