Latency vs Throughput: The Hidden Trade-Off in System Design

Dichan Shrestha

When measuring system performance, two metrics matter most: latency and throughput.

Latency: Speed of a Single Operation

  • Definition: The time taken to complete one task (e.g., a database query).

  • Example: If an API call takes 200 ms, that's its latency (see the timing sketch after this list).

  • Goal: Lower latency means faster responses.
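
The simplest way to see latency is to time one operation. Here is a minimal sketch using Python's `time.perf_counter`; `fake_api_call` is a hypothetical stand-in for a real request, and its 200 ms sleep is purely illustrative.

```python
import time

def timed(fn, *args, **kwargs):
    """Run fn once and report its latency in milliseconds."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return result, elapsed_ms

def fake_api_call():
    time.sleep(0.2)  # hypothetical stand-in: simulates ~200 ms of work

_, latency_ms = timed(fake_api_call)
print(f"latency: {latency_ms:.1f} ms")
```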

Throughput: Total Operations Over Time

  • Definition: The number of operations a system can handle per second.

  • Example: A server processing 1,000 requests per second has high throughput (a measurement sketch follows this list).

  • Goal: Maximize throughput to handle more users.
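
A rough throughput measurement counts completed operations over a fixed time window. A minimal sketch, where `fake_request` is a hypothetical 1 ms operation:

```python
import time

def fake_request():
    time.sleep(0.001)  # hypothetical stand-in: simulates a 1 ms operation

# Run as many operations as possible in a fixed window,
# then report completed operations per second.
window_s = 1.0
count = 0
start = time.perf_counter()
while time.perf_counter() - start < window_s:
    fake_request()
    count += 1

print(f"throughput: {count / window_s:.0f} ops/sec")
```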

The Trade-Off

  • Optimizing for high throughput often increases latency (e.g., batching work amortizes overhead but slows individual requests; see the sketch after this list).

  • Optimizing for low latency can reduce throughput (e.g., real-time systems process each request immediately, giving up the efficiency of batching).
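
To make the trade-off concrete, here is a small simulation with made-up illustrative costs: batching amortizes the fixed per-call overhead, so throughput rises roughly 4x, but a caller now waits for a whole batch to finish instead of a single item.

```python
import time

BATCH_SIZE = 10
PER_ITEM_COST_S = 0.001      # illustrative work per item
PER_CALL_OVERHEAD_S = 0.005  # illustrative fixed cost per call (network, commit, etc.)

def process_one():
    time.sleep(PER_CALL_OVERHEAD_S + PER_ITEM_COST_S)

def process_batch(n):
    time.sleep(PER_CALL_OVERHEAD_S + n * PER_ITEM_COST_S)

# One at a time: each item pays the full overhead, but finishes immediately.
start = time.perf_counter()
for _ in range(100):
    process_one()
single = time.perf_counter() - start

# Batched: overhead is amortized (higher throughput), but each item
# waits for its whole batch to complete (higher latency).
start = time.perf_counter()
for _ in range(100 // BATCH_SIZE):
    process_batch(BATCH_SIZE)
batched = time.perf_counter() - start

print(f"one-at-a-time: {100 / single:.0f} items/sec, "
      f"~{1000 * single / 100:.1f} ms per item")
print(f"batched:       {100 / batched:.0f} items/sec, "
      f"~{1000 * batched / (100 // BATCH_SIZE):.1f} ms until a batch completes")
```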

Best Practice

  • For most systems, optimize for acceptably low latency while maximizing throughput.

  • Example: A search engine should return each result quickly (low latency) while also handling a huge volume of queries (high throughput). Little's Law, sketched below, ties the two metrics together.
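
Little's Law gives a handy way to reason about the balance: average requests in flight = throughput x average latency. A quick sanity check using the numbers from the examples above:

```python
# Little's Law: concurrency = throughput * latency.
# A service handling 1,000 requests/sec at 200 ms average latency
# holds about 200 requests in flight at any moment.
throughput_rps = 1_000  # requests per second
latency_s = 0.200       # average latency in seconds
concurrency = throughput_rps * latency_s
print(f"requests in flight: {concurrency:.0f}")  # -> 200
```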

