A Dummies' Guide to Understanding Caching in Computer Systems

Imagine you’re a student preparing for exams, and you keep your most-used notes on your desk rather than digging through your entire bookshelf every time. This quick access saves time and effort. That’s what caching does in computer systems: it stores frequently accessed data in a smaller, faster storage location (the cache) to reduce the time it takes to fetch that data.

What is a Cache?

A cache is a small but super-fast storage area where computers save copies of data they need often. It helps reduce latency (the time it takes to get data) and improve performance (how fast stuff gets done).

But caches are small in storage size. So computer systems need smart strategies to decide:

  • When to put data in the cache

  • How to update that data

  • What to remove when the cache is full

Caching Strategies - How Data Gets Stored in (and Served from) a Cache

1. Write-Through Caching

Think of a write-through cache as a super-diligent student. Every time you write something in your notebook (cache), you also immediately copy it into your main notes folder (main storage).

  • How it works: Every time data is written to the cache, it’s also updated in the primary storage.

  • Why it’s good: The main storage always has the latest information.

  • Drawback: Slightly slower because you’re updating two places at once.
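
To make that concrete, here’s a minimal Python sketch of the idea. The `WriteThroughCache` class and its plain-dict “cache” and “main storage” are made-up stand-ins for illustration, not a real caching library:

```python
class WriteThroughCache:
    """Toy write-through cache: every write hits both layers."""

    def __init__(self):
        self.cache = {}          # small, fast storage (the desk)
        self.main_storage = {}   # slower, durable storage (the bookshelf)

    def write(self, key, value):
        # Update both places at once, so main storage is never stale
        # (at the cost of a slightly slower write).
        self.cache[key] = value
        self.main_storage[key] = value

    def read(self, key):
        # Serve from the cache if we can, fall back to main storage.
        return self.cache.get(key, self.main_storage.get(key))
```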

2. Write-Back Caching

This one’s like jotting down notes in a quick shorthand on your desk first, and only transferring them to your main folder when absolutely needed.

  • How it works: Data is written only to the cache first. Updates are sent to the primary storage later, like in a batch.

  • Why it’s good: Fast writes because the primary storage isn’t immediately involved.

  • Drawback: If the cache fails before syncing, the latest changes could be lost.
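
A tiny Python sketch of the same idea. The `WriteBackCache` name, the dict storage, and the `flush` method are all illustrative assumptions, not a real library:

```python
class WriteBackCache:
    """Toy write-back cache: writes hit the cache now, storage later."""

    def __init__(self):
        self.cache = {}
        self.main_storage = {}
        self.dirty = set()   # keys changed in the cache but not yet synced

    def write(self, key, value):
        # Only the cache is touched, so the write returns immediately.
        self.cache[key] = value
        self.dirty.add(key)

    def flush(self):
        # Later (on a timer, on eviction, or at shutdown) the dirty
        # entries are copied back to main storage in one batch.
        for key in self.dirty:
            self.main_storage[key] = self.cache[key]
        self.dirty.clear()
```

If the process crashes before `flush()` runs, the dirty entries are gone, which is exactly the drawback described above.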

3. Read-Through Caching

Imagine you’re studying from your desk notes (cache), and if something’s missing, you quickly fetch it from the bookshelf (main storage).

  • How it works: Data is first searched in the cache. If it’s not there, it’s fetched from the primary storage and added to the cache for future use.

  • Why it’s good: Reduces the load on the main storage for repetitive tasks.

  • Drawback: The first access might take longer if the data isn’t in the cache.
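
Here’s a rough Python sketch. The `ReadThroughCache` class and the `load_from_storage` callback are assumptions made for the example; in a real system the callback might be a database query:

```python
class ReadThroughCache:
    """Toy read-through cache: misses fall back to the backing store."""

    def __init__(self, load_from_storage):
        self.cache = {}
        self.load_from_storage = load_from_storage  # e.g. a DB lookup

    def read(self, key):
        if key in self.cache:                  # cache hit: fast path
            return self.cache[key]
        value = self.load_from_storage(key)    # cache miss: slow path
        self.cache[key] = value                # remember it for next time
        return value


# The first read is slow (a miss), the second is fast (a hit).
notes = ReadThroughCache(load_from_storage=lambda key: f"notes for {key}")
notes.read("chapter-3")   # fetched from "storage" and cached
notes.read("chapter-3")   # served straight from the cache
```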

4. Distributed Caching

What if you and your friends share notes across different desks in a study group? That’s what distributed caching is like.

  • How it works: Data is cached across multiple servers (desks) to handle large-scale requests.

  • Why it’s good: Great for systems with lots of users, ensuring quick access no matter where the data is.

  • Drawback: Managing coordination between servers can be tricky.
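
Here’s a deliberately simplified Python sketch of just the routing part: hashing each key so it always lands on the same “server”. The list-of-dicts servers and the modulo hashing are toy assumptions; real setups use cache servers like Redis or Memcached and usually consistent hashing:

```python
import hashlib


class DistributedCache:
    """Toy distributed cache: keys are spread across several 'servers'."""

    def __init__(self, num_servers=3):
        self.servers = [{} for _ in range(num_servers)]  # stand-in servers

    def _pick_server(self, key):
        # Hash the key so the same key always maps to the same server.
        digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
        return self.servers[digest % len(self.servers)]

    def set(self, key, value):
        self._pick_server(key)[key] = value

    def get(self, key):
        return self._pick_server(key).get(key)
```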

5. Cache Invalidation

If you’ve updated your main notes but forget to update your desk notes, you might study outdated material. Cache invalidation solves this.

  • How it works: Ensures that when data changes in the main storage, outdated copies in the cache are updated or removed.

  • Why it’s good: Keeps the cache accurate and prevents mistakes.

  • Drawback: Can momentarily slow things down due to synchronisation.
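
One common approach is to simply delete the cached copy whenever main storage changes, so the next read fetches the fresh value. A minimal Python sketch, with class and method names made up for illustration:

```python
class InvalidatingStore:
    """Toy store that invalidates the cached copy on every update."""

    def __init__(self):
        self.cache = {}
        self.main_storage = {}

    def update(self, key, value):
        # Write the new value to main storage, then drop the stale
        # cached copy so nobody reads outdated material.
        self.main_storage[key] = value
        self.cache.pop(key, None)   # invalidate

    def read(self, key):
        if key not in self.cache:   # miss: re-fetch the fresh value
            self.cache[key] = self.main_storage[key]
        return self.cache[key]
```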

Eviction Policies - What to Keep and What to Toss

Caches can only hold so much, just like your desk can only have limited space. When the desk is full, you’ll need a strategy to decide which notes to replace. Here are some commonly used policies:

1. Least Recently Used (LRU)

Replace the notes you haven’t looked at in ages.

  • How it works: Kicks out the data you haven’t used for the longest time.

  • Why it’s good: Smart and fair.

  • Drawback: A bit more work to track what was used last.
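
A compact Python sketch using `collections.OrderedDict` to track usage order. The `LRUCache` class and its tiny capacity are illustrative; Python also ships a ready-made `functools.lru_cache` decorator for caching function results:

```python
from collections import OrderedDict


class LRUCache:
    """Toy LRU cache: evicts the entry that was used longest ago."""

    def __init__(self, capacity=3):
        self.capacity = capacity
        self.items = OrderedDict()   # remembers the order entries were used

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # evict the least recently used
```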

2. First In First Out (FIFO)

Replace the note that’s been on the desk the longest.

  • How it works: Removes the data that was added first.

  • Why it’s good: Easy to manage.

  • Drawback: Might throw out something important you still need.
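
A small Python sketch (again, the `FIFOCache` class and the tiny capacity are just for illustration):

```python
from collections import deque


class FIFOCache:
    """Toy FIFO cache: evicts whatever was added first."""

    def __init__(self, capacity=3):
        self.capacity = capacity
        self.items = {}
        self.order = deque()   # remembers the order keys were added

    def put(self, key, value):
        if key not in self.items and len(self.items) >= self.capacity:
            oldest = self.order.popleft()   # the first key ever added
            del self.items[oldest]          # gets thrown out, used or not
        if key not in self.items:
            self.order.append(key)
        self.items[key] = value

    def get(self, key):
        return self.items.get(key)
```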

3. Random Replacement

Pick any note to replace, without overthinking it.

  • Why it’s good: Super simple.

  • Drawback: Not very smart.
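
And a Python sketch of random replacement, once more with made-up names and a tiny capacity:

```python
import random


class RandomCache:
    """Toy random-replacement cache: evicts an arbitrary entry when full."""

    def __init__(self, capacity=3):
        self.capacity = capacity
        self.items = {}

    def put(self, key, value):
        if key not in self.items and len(self.items) >= self.capacity:
            victim = random.choice(list(self.items))  # pick any key at all
            del self.items[victim]
        self.items[key] = value

    def get(self, key):
        return self.items.get(key)
```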

Where Does Caching Happen?

Caching doesn’t just happen in one place. It shows up all over a computer system; here are a few examples:

  • Web browsers (like Chrome, Firefox, or Safari): storing images and files to load websites faster

  • CPU (Central Processing Unit): storing instructions so programs run quickly

  • Databases: remembering common queries and results

  • CDNs (Content Delivery Networks): caching content close to users for faster access

Why Does Caching Matter?

Without caching, every request for data would go all the way to the main storage, which could be far away (or just slow). By keeping frequently used data closer, caching speeds things up, kind of like how having notes at your desk saves you from running to your bookshelf every few minutes.

Caching strategies help balance speed, accuracy, and resource usage. Whether it’s for websites, apps, or databases, choosing the right strategy depends on what’s most important for the system.

Just like keeping your notes organised helps you study faster, caching helps computer systems run better. It saves time, energy, and frustration. Next time a website loads instantly, or a game runs smoothly, remember it’s probably the cache doing its magic behind the scenes.
