Why Do LinkedIn and Twitter (Now X) Use NoSQL Databases? The Hidden Tech Behind Social Media Giants

Farhan Khan

Introduction: The Social Media Data Explosion

Every second, millions of people like, share, and comment on posts across social media platforms. Did you know?

  • Twitter (Now X) processes 500 million tweets per day (~6,000 tweets per second).

  • LinkedIn handles over 100,000 profile views per second.

  • Facebook, Twitter, and LinkedIn combined store over 500 petabytes of data.
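As a quick sanity check, the per-second figure in the first bullet follows directly from the daily total. This is plain arithmetic on the number quoted above. It gives an average, so real traffic will peak well above it.

```python
# Average tweets per second implied by 500 million tweets per day.
TWEETS_PER_DAY = 500_000_000
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

avg_tweets_per_second = TWEETS_PER_DAY / SECONDS_PER_DAY
print(round(avg_tweets_per_second))  # 5787, i.e. roughly 6,000
```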

If these platforms relied on traditional SQL databases alone, they would struggle to scale and to keep response times low. This is why they use NoSQL databases: a faster, more scalable, and more flexible alternative for this kind of workload.

Why Traditional SQL Databases Struggle

Most relational databases (MySQL, PostgreSQL) work well for small-scale applications, but they don’t handle large, real-time data flows efficiently.

Major SQL Limitations for Social Media Platforms

| SQL Challenges | Impact on Social Media |
| --- | --- |
| Limited Scalability | Social media platforms need to scale across multiple servers; SQL struggles with horizontal scaling. |
| Schema Rigidity | Social media data is dynamic (e.g., tweets, likes, follows). SQL requires a fixed structure, making updates difficult. |
| Slow Performance for Big Data | Twitter processes thousands of tweets per second, and SQL joins slow down retrieval speeds. |
| High Write & Read Load | Billions of database transactions per day can overload SQL servers. |

💡 Example: Imagine a tweet goes viral. SQL databases would struggle to update millions of user timelines instantly.
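To see why a viral tweet is expensive, here is a minimal sketch of "fan-out on write", one common way to build timelines. It is an illustration only, not Twitter's actual implementation. The names `follow`, `post_tweet`, and `TIMELINE_LIMIT` are made up for this example, and plain in-memory dictionaries stand in for the database.

```python
from collections import defaultdict, deque

TIMELINE_LIMIT = 800  # hypothetical cap on entries kept per timeline

followers = defaultdict(set)  # author -> set of follower ids
timelines = defaultdict(lambda: deque(maxlen=TIMELINE_LIMIT))  # user -> recent tweet ids


def follow(follower, author):
    followers[author].add(follower)


def post_tweet(author, tweet_id):
    """Fan-out on write: push the tweet id onto every follower's timeline."""
    writes = 0
    for user in followers[author]:
        timelines[user].appendleft(tweet_id)  # newest first
        writes += 1
    return writes


follow("bob", "alice")
follow("carol", "alice")
writes = post_tweet("alice", "t1")
print(writes)                  # 2 timeline writes for 2 followers
print(list(timelines["bob"]))  # ['t1']
```

One post costs one write per follower, so an account with millions of followers turns a single tweet into millions of writes. That is the load the example above is describing.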

Why NoSQL? The Backbone of Social Media Data

NoSQL databases overcome SQL’s limitations by providing high scalability, real-time performance, and flexible schema design.

🌍 Key Advantages of NoSQL

| Feature | SQL (Relational DB) | NoSQL (Voldemort, Manhattan) |
| --- | --- | --- |
| Scalability | Vertical (Limited) | Horizontal (Unlimited) |
| Schema | Fixed (Tables, Columns) | Flexible (JSON, Key-Value, Graph, Column-based) |
| Read/Write Speed | Slower due to joins | Fast due to distributed architecture |
| Use Case | Structured data, transactions | Unstructured data, real-time queries |
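The "flexible schema" row is easiest to see with a small example. The sketch below uses a Python dictionary as a stand-in for a key-value store holding JSON values. The `put` and `get` helpers and the `tweet:1` key format are assumptions made for illustration, not any real database's API.

```python
import json

# A dict stands in for a key-value store; values are JSON strings.
store = {}


def put(key, record):
    store[key] = json.dumps(record)


def get(key):
    return json.loads(store[key])


# Two records with different fields live side by side; no schema migration needed.
put("tweet:1", {"user": "alice", "text": "hello"})
put("tweet:2", {"user": "bob", "text": "hi", "media": ["photo.jpg"], "reply_to": "tweet:1"})

print(get("tweet:1").get("media", []))  # []
print(get("tweet:2")["media"])          # ['photo.jpg']
```

The flip side is that the application, not the database, now has to cope with missing fields, which is why the reader code uses `.get("media", [])`.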

🔵 How LinkedIn Uses NoSQL (Project Voldemort)

🔍 The Challenge:

  • LinkedIn stores over 100 terabytes of user data.

  • Handles 100,000+ queries per second for:

    • Profile views

    • Job recommendations

    • People You May Know feature

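🔹 Solution: Project Voldemort

Voldemort is a distributed NoSQL key-value store that LinkedIn built and open-sourced. Its design is modeled on Amazon's Dynamo (the system described in the Dynamo paper, not the later DynamoDB service).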

🔹 Why LinkedIn Chose Voldemort:

✅ High-speed profile lookups (fetches user data in <5 milliseconds).
✅ Decentralized storage (removes single points of failure).
✅ Reduces data retrieval latency by 40% compared to SQL.

💡 Real-World Example

When you see "People You May Know" on LinkedIn, Voldemort fetches user data across multiple servers in real-time to suggest connections within milliseconds.

🛠️ Voldemort’s Architecture

  • Key-Value Storage: Data is stored as simple key-value pairs.

  • Partitioning & Replication: Ensures fault tolerance and load balancing.

  • Eventual Consistency Model: Prioritizes speed over strong consistency.
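The three ideas above can be sketched together in a few lines. This is a toy model under stated assumptions, not Voldemort's actual code: the `Ring` class, the node names, and the choice of MD5 and 64 virtual nodes are all illustrative. It shows a consistent-hash ring that places each key on two nodes, so a read still succeeds when one of them is down.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


class Ring:
    """Toy consistent-hash ring with replication (illustrative only)."""

    def __init__(self, nodes, vnodes=64, replicas=2):
        self.replicas = min(replicas, len(nodes))
        self.data = {n: {} for n in nodes}  # per-node key-value storage
        self.down = set()                   # nodes currently unavailable
        # Each node owns many points on the ring to spread load evenly.
        self._points = sorted((_hash(f"{n}#{i}"), n) for n in nodes for i in range(vnodes))
        self._keys = [p for p, _ in self._points]

    def preference_list(self, key):
        """Walk clockwise from the key's position, collecting distinct nodes."""
        i = bisect.bisect(self._keys, _hash(key))
        chosen = []
        while len(chosen) < self.replicas:
            node = self._points[i % len(self._points)][1]
            if node not in chosen:
                chosen.append(node)
            i += 1
        return chosen

    def put(self, key, value):
        # Write to every reachable replica; unreachable ones simply miss the write.
        for node in self.preference_list(key):
            if node not in self.down:
                self.data[node][key] = value

    def get(self, key):
        # Return the first copy found on a reachable replica.
        for node in self.preference_list(key):
            if node not in self.down and key in self.data[node]:
                return self.data[node][key]
        return None


ring = Ring(["node-a", "node-b", "node-c"])
ring.put("member:42", {"name": "Ada"})
owners = ring.preference_list("member:42")
ring.down.add(owners[0])      # simulate losing the first replica
print(ring.get("member:42"))  # {'name': 'Ada'}, served by the second replica
```

Note what the sketch leaves out: a node that was down during a write comes back with stale or missing data. Real systems add versioning and repair to reconcile replicas over time, which is what "eventual consistency" refers to.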

How Twitter (Now X) Uses NoSQL (Manhattan)

🔍 The Challenge:

  • Over 500M tweets per day (~6,000 per second).

  • Likes, retweets, and replies together generate millions of read/write operations.

  • Twitter’s previous SQL system couldn’t handle viral tweets efficiently.

🔹 Solution: Manhattan – Twitter’s Custom NoSQL Database

Manhattan is a distributed NoSQL system designed for ultra-low latency.

🔹 Why Twitter Chose Manhattan:

✅ Handles 150,000+ database queries per second.
✅ Ensures timeline updates in <1 second.
✅ Stores petabytes of tweets across multiple data centers.
✅ Can withstand sudden traffic spikes (e.g., viral tweets, breaking news).

💡 Real-World Example

When a tweet goes viral, Manhattan distributes data across thousands of servers to ensure real-time visibility.

🛠️ Manhattan’s Architecture

  • Multi-Model Support: Stores Key-Value, Column Family, and Graph data.

  • Global Replication: Ensures instant tweet availability worldwide.

  • Real-Time Analytics: Helps Twitter rank and recommend trending content.
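To give a feel for the "real-time analytics" point, here is a minimal sketch of ranking hashtags over a sliding time window. It is a generic technique, not a description of how Manhattan or Twitter's ranking works. The `TrendingWindow` class is invented for this example and assumes events arrive in timestamp order.

```python
from collections import Counter, deque


class TrendingWindow:
    """Counts hashtags seen in the last `window_seconds` (illustrative only)."""

    def __init__(self, window_seconds=60):
        self.window = window_seconds
        self.events = deque()    # (timestamp, hashtag), oldest first
        self.counts = Counter()  # hashtag -> count inside the window

    def record(self, hashtag, ts):
        self.events.append((ts, hashtag))
        self.counts[hashtag] += 1

    def _expire(self, now):
        # Drop events that have fallen out of the window.
        while self.events and self.events[0][0] <= now - self.window:
            _, old = self.events.popleft()
            self.counts[old] -= 1
            if self.counts[old] == 0:
                del self.counts[old]

    def top(self, n, now):
        self._expire(now)
        return self.counts.most_common(n)


trends = TrendingWindow(window_seconds=60)
trends.record("#oldnews", ts=0)
trends.record("#launch", ts=50)
trends.record("#launch", ts=55)
trends.record("#weather", ts=58)
print(trends.top(2, now=70))  # [('#launch', 2), ('#weather', 1)]
```

At production scale the counters would be partitioned across many servers, but the idea of counting recent events and dropping old ones stays the same.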

📊 NoSQL vs. SQL – The Final Verdict

| Feature | SQL Databases (MySQL, PostgreSQL) | NoSQL Databases (Voldemort, Manhattan) |
| --- | --- | --- |
| Performance | Slower due to joins | Faster, distributed storage |
| Scalability | Vertical (limited) | Horizontal (scales across servers) |
| Schema Flexibility | Fixed schema | Schema-less, flexible data models |
| Best Use Cases | Small apps, transactions | High-scale, real-time applications |
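The "slower due to joins" row can be made concrete with a small comparison. The sketch below uses Python's built-in `sqlite3` module for the relational side and a plain dictionary as a stand-in for a key-value store. The table names and the `timeline:1` key are assumptions for illustration, and the example shows the shape of the two reads rather than measuring their speed.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tweets (id INTEGER PRIMARY KEY, user_id INTEGER REFERENCES users(id), text TEXT);
    INSERT INTO users VALUES (1, 'alice');
    INSERT INTO tweets VALUES (10, 1, 'hello'), (11, 1, 'again');
""")

# Relational read: the author's name is resolved with a join at query time.
joined = db.execute(
    "SELECT t.id, u.name, t.text FROM tweets t JOIN users u ON u.id = t.user_id "
    "WHERE u.id = ? ORDER BY t.id",
    (1,),
).fetchall()

# Key-value read: the same view is precomputed and stored under a single key.
kv = {"timeline:1": json.dumps([[10, "alice", "hello"], [11, "alice", "again"]])}
denormalized = [tuple(row) for row in json.loads(kv["timeline:1"])]

print(joined == denormalized)  # True
```

Both reads return the same rows. The key-value version avoids the join by storing the author's name inside every entry, which means a name change has to be rewritten everywhere it was copied. That trade-off is the real difference behind the table.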

📌 Conclusion: NoSQL databases outperform SQL for real-time, large-scale applications like LinkedIn & Twitter.

📢 “Do you think SQL is outdated for social media platforms? Or does it still have a place? Let’s discuss in the comments!”
