TCP & UDP: The Backbone of Video Calls and Streaming

Nitin Gumber
14 min read

What is TCP?

TCP (Transmission Control Protocol) is one of the core communication protocols that allow computers and devices to communicate with each other over the internet. It ensures that data sent from one device reaches the other accurately and in the correct order. Think of TCP as a reliable mail delivery service for data.

How Does TCP Work?

TCP divides large data into smaller pieces called segments (commonly referred to as packets) and sends them over the network. It detects lost pieces and retransmits them, so all data is delivered even if the network drops some of it. Here’s a step-by-step explanation of how TCP works:

  1. Connection Establishment (Handshake):

    • Before any data is sent, TCP establishes a connection between the sender and receiver using a process called the three-way handshake:

      1. SYN: The sender sends a "SYN" (synchronize) message to the receiver, saying, "Can we start communicating?"

      2. SYN-ACK: The receiver replies with a "SYN-ACK" (synchronize-acknowledge) message, saying, "Yes, I'm ready. Can you confirm?"

      3. ACK: The sender responds with an "ACK" (acknowledge) message, confirming the connection is established.

  2. Data Transfer:

    • The sender breaks the data into packets and sends them to the receiver.

    • Each packet has a sequence number so the receiver knows the correct order.

    • If a packet is lost, TCP resends it. If packets arrive out of order, the receiver uses the sequence numbers to put them back in the correct order.

  3. Acknowledgments:

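    • The receiver sends acknowledgments (ACKs) back to the sender to confirm what it has received. ACKs are cumulative, so a single ACK can confirm several packets at once. This is how the sender knows the data has arrived safely and which data, if any, it must resend.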
  4. Connection Termination:

    • After the data is sent, the connection is closed using a four-step exchange (often called the four-way handshake): each side sends a FIN and acknowledges the other side's FIN, which ensures no data is left behind.
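
The four phases above map fairly directly onto the standard socket API. The following is a minimal sketch using Python's `socket` module on the loopback interface. The helper names `run_echo_server` and `tcp_round_trip` are made up for this illustration, and the operating system performs the actual handshake, acknowledgments, and teardown behind these calls.

```python
import socket
import threading


def run_echo_server(server_sock: socket.socket) -> None:
    """Accept one connection and echo everything back until the peer closes."""
    conn, _addr = server_sock.accept()  # returns once the handshake has completed
    with conn:
        while True:
            chunk = conn.recv(1024)
            if not chunk:  # empty bytes: the peer has finished sending
                break
            conn.sendall(chunk)


def tcp_round_trip(message: bytes) -> bytes:
    """Send `message` over a loopback TCP connection and return the echo."""
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server_sock.listen(1)
    port = server_sock.getsockname()[1]
    worker = threading.Thread(target=run_echo_server, args=(server_sock,))
    worker.start()

    # create_connection() triggers the three-way handshake.
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(message)          # data transfer
        client.shutdown(socket.SHUT_WR)  # "no more data from me" (sends a FIN)
        received = b""
        while True:
            chunk = client.recv(1024)
            if not chunk:
                break
            received += chunk

    worker.join()
    server_sock.close()
    return received
```

Calling `tcp_round_trip(b"hello over tcp")` should return the same bytes, since TCP delivers the stream completely and in order.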

Features of TCP

  1. Reliable:

    • TCP guarantees that all data is delivered accurately and in the correct order.
  2. Error Detection and Correction:

    • TCP uses a checksum to detect corrupted data, and it resends corrupted or missing packets.
  3. Flow Control:

    • It ensures that the sender doesn’t overwhelm the receiver with too much data at once.
  4. Congestion Control:

    • TCP adjusts the rate of data transfer based on network conditions to avoid overloading the network.
  5. Connection-Oriented:

    • TCP requires a connection to be established between devices before data can be sent.
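
To make the "reliable and in order" feature concrete, here is a toy model of the receiving side. It is a simplified sketch, not real TCP: sequence numbers count whole segments (0, 1, 2, ...) instead of bytes, and the function name `receive_segments` is invented for this example.

```python
def receive_segments(arrivals):
    """Toy receiver for segments given as (seq, payload) in arrival order.

    Returns (payloads delivered to the application in order,
             cumulative ACK sent after each arrival).
    A cumulative ACK of n means "I have everything before n; send n next".
    """
    buffered = {}        # out-of-order segments waiting for a gap to fill
    next_expected = 0
    delivered = []
    acks = []
    for seq, payload in arrivals:
        if seq >= next_expected:       # ignore duplicates of delivered data
            buffered[seq] = payload
        while next_expected in buffered:
            delivered.append(buffered.pop(next_expected))
            next_expected += 1
        acks.append(next_expected)
    return delivered, acks
```

For example, if segment 2 arrives before segment 1, the receiver keeps acknowledging 1 until the gap is filled, then jumps straight to 3. A repeated ACK like this is one of the signals a sender can use to decide that a retransmission is needed.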

Real-World Analogy for TCP

Imagine you are sending a series of letters (data) to a friend:

  1. Three-Way Handshake:

    • You call your friend first to confirm they're available to receive your letters (establish a connection).
  2. Sending Letters:

    • You number each letter so your friend can read them in order. If one gets lost in the mail, you resend it.
  3. Acknowledgment:

    • After receiving each letter, your friend calls you to confirm they got it.
  4. End of Conversation:

    • Once all letters are received, you both agree to hang up the call (close the connection).

TCP works the same way by ensuring all "letters" (data packets) are delivered accurately and in sequence.

Importance of the Three-Way Handshake

  1. Ensures Both Devices Are Ready to Communicate:

    • It confirms that both the sender and receiver are online, available, and ready to communicate. Without this confirmation, data could be sent to an unresponsive or offline device.
  2. Establishes Synchronization:

    • Both devices synchronize their sequence numbers during the handshake. This ensures that the data packets are tracked correctly, enabling proper reordering or retransmission if packets are lost or arrive out of sequence.
  3. Avoids Data Loss:

    • By establishing a connection before sending data, TCP ensures that the receiver is prepared to accept the data. If data were sent without this process, packets could be lost if the receiver is not ready.
  4. Confirms Network Reliability:

    • The handshake helps check that the network path between the devices is functioning correctly. It verifies that the devices can exchange packets in both directions.
  5. Enables Full-Duplex Communication:

    • TCP is a full-duplex protocol, meaning both devices can send and receive data simultaneously. The handshake sets up this two-way communication channel.
  6. Prevents Resource Wastage:

    • If the receiver is unavailable, the handshake prevents the sender from wasting resources by attempting to send data that will never reach its destination.

How the Three-Way Handshake Works

The process involves three steps:

  1. SYN (Synchronize):

    • The sender sends a SYN packet to the receiver to initiate the connection, saying, "I want to start communicating. Here's my sequence number."
  2. SYN-ACK (Synchronize-Acknowledge):

    • The receiver responds with a SYN-ACK packet, saying, "Got your request. Here's my sequence number, and I acknowledge yours."
  3. ACK (Acknowledge):

    • The sender sends an ACK packet back, saying, "Got your response. Let's start communicating."
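
The numbers exchanged in these three steps follow a simple rule: each side acknowledges the other side's initial sequence number plus one. The sketch below only computes those values for illustration. The function name and the dictionary layout are assumptions of this example, and real initial sequence numbers are chosen by the operating system.

```python
SEQ_SPACE = 2**32  # TCP sequence numbers are 32-bit and wrap around


def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments for the given initial sequence numbers."""
    syn = {"flags": "SYN", "seq": client_isn, "ack": None}
    syn_ack = {
        "flags": "SYN-ACK",
        "seq": server_isn,
        "ack": (client_isn + 1) % SEQ_SPACE,  # "I acknowledge yours"
    }
    ack = {
        "flags": "ACK",
        "seq": (client_isn + 1) % SEQ_SPACE,
        "ack": (server_isn + 1) % SEQ_SPACE,  # "Got your response"
    }
    return [syn, syn_ack, ack]
```

With a client sequence number of 1000 and a server sequence number of 5000, the SYN-ACK carries `ack=1001` and the final ACK carries `seq=1001, ack=5001`.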

Real-World Analogy

Think of the three-way handshake as a phone call:

  1. You Call (SYN):

    • You dial your friend's number and say, "Hey, are you free to talk?"
  2. Your Friend Replies (SYN-ACK):

    • Your friend answers the call and says, "Yes, I'm free. Can you hear me?"
  3. You Confirm (ACK):

    • You respond, "Yes, I can hear you. Let's start talking."

Only after this process do you start your conversation, ensuring both parties are ready and connected.

What Happens Without the Handshake?

If the three-way handshake didn't exist, the following problems could arise:

  1. Lost Data:

    • The receiver might not be ready, causing initial packets to be dropped.
  2. Miscommunication:

    • Sequence numbers wouldn't be synchronized, making it impossible to properly reorder data.
  3. Wasted Resources:

    • The sender might waste bandwidth sending data to an unavailable receiver.

Conclusion of the Three-Way Handshake

The three-way handshake is critical in TCP because it ensures reliable, synchronized, and efficient communication between devices. By confirming that both devices are ready and can handle the connection, it sets the stage for accurate and orderly data transfer.

What is UDP?

UDP (User Datagram Protocol) is a communication protocol used to send data across the internet. Unlike TCP, which is reliable and ensures all data reaches the destination, UDP focuses on speed and simplicity. It sends data without establishing a connection and does not guarantee delivery, making it faster but less reliable.

How Does UDP Work?

  1. No Connection Establishment:

    • UDP is a connectionless protocol, meaning it doesn’t establish a connection between the sender and receiver before sending data. It just sends the data immediately.
  2. Packets of Data (Datagrams):

    • UDP sends data in small packets called datagrams. Each packet is independent, so they may arrive out of order or not arrive at all.
  3. No Acknowledgments:

    • Unlike TCP, UDP doesn’t wait for acknowledgments from the receiver. This saves time but means the sender doesn’t know if the data was received.
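
The three points above can be seen in a few lines of socket code. This is a minimal sketch on the loopback interface using Python's `socket` module. The function name `udp_send_once` is made up for the example, and the timeout is there because nothing guarantees the datagram will arrive (on loopback it almost always does).

```python
import socket


def udp_send_once(message: bytes, timeout: float = 1.0) -> bytes:
    """Send one datagram to a local receiver and return what it received."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    receiver.settimeout(timeout)     # UDP gives no delivery guarantee
    port = receiver.getsockname()[1]

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # No connect(), no handshake: the datagram is sent immediately,
    # and the sender gets no acknowledgment back.
    sender.sendto(message, ("127.0.0.1", port))
    sender.close()

    try:
        data, _addr = receiver.recvfrom(2048)
    finally:
        receiver.close()
    return data
```

Compared with a TCP client, there is no connection setup and no teardown, and the sender never learns whether the datagram was received.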

Features of UDP

  1. Fast:

    • Because it skips the handshake, acknowledgments, and retransmissions (it keeps only a simple checksum), UDP has lower latency and less overhead than TCP.
  2. Unreliable:

    • UDP doesn’t guarantee that data will reach its destination or be in the correct order.
  3. Connectionless:

    • No need to establish a connection before sending data.
  4. Lightweight:

    • UDP has minimal overhead, making it efficient for applications where speed is more important than reliability.
  5. Broadcast and Multicast:

    • UDP supports sending data to multiple devices at once, making it useful for applications like streaming or gaming.

Real-World Analogy for UDP

Imagine sending postcards through the mail:

  1. You write a postcard (data) and drop it in the mailbox (send it via UDP).

  2. The mail service delivers the postcards to the recipient.

  3. You don’t wait for confirmation that the postcards were received, and if one gets lost, you don’t resend it.

In this analogy, postcards are like UDP packets—fast and simple but not guaranteed to arrive.

Where is UDP Used?

UDP is ideal for applications where speed is more important than reliability. Examples include:

  1. Streaming Media:

    • In live video or audio streaming, missing a small piece of data is better than introducing delays. (On-demand services such as Netflix typically buffer ahead and favor reliable delivery instead.)
  2. Online Gaming:

    • In multiplayer games, real-time updates are more important than resending lost packets. For example, a slight lag in player movement is acceptable.
  3. Voice Over IP (VoIP):

    • Applications like Zoom or WhatsApp calls use UDP to prioritize speed over reliability for smooth conversations.
  4. DNS (Domain Name System):

    • When looking up a website's IP address, DNS queries usually use UDP because the request and the response are small enough to fit in a single datagram. If no answer arrives, the client simply asks again.
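
To show how small a DNS lookup really is, the sketch below builds the bytes of a basic query for an A record with Python's `struct` module. The function name `build_dns_query` and the default query ID are assumptions of this example. It only constructs the message and does not send it or parse a reply.

```python
import struct


def build_dns_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query asking for the A record of `hostname`."""
    # Header: ID, flags (0x0100 = standard query, recursion desired),
    # then four counts: 1 question, 0 answers, 0 authority, 0 additional.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # The name is encoded as length-prefixed labels ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A (1), QCLASS=IN (1)
    return header + question
```

A query for `example.com` built this way is 29 bytes long, which fits comfortably in one datagram. It could then be sent with `sock.sendto(query, (resolver_ip, 53))`, where `resolver_ip` stands for whichever DNS resolver your network uses.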

Why is UDP Important?

  • UDP provides a lightweight and efficient way to send data quickly.

  • It’s perfect for applications where delays are unacceptable, even if some data is lost.

  • Without UDP, real-time communication and media streaming would be much slower and less practical.

Conclusion of UDP

UDP is a fast and simple protocol designed for speed and efficiency, not reliability. It’s an essential tool for applications that prioritize real-time communication, even if some data is lost along the way.

Comparison: UDP vs. TCP

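| Feature        | TCP                                 | UDP                    |
|----------------|-------------------------------------|------------------------|
| Reliability    | Reliable                            | Unreliable             |
| Connection     | Connection-oriented                 | Connectionless         |
| Speed          | Slower                              | Faster                 |
| Error Checking | Extensive                           | Minimal                |
| Use Cases      | Web browsing, file downloads, email | Streaming, gaming, DNS |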

Why Does Zoom Use UDP for Hosts and Calls?

Zoom, like many real-time communication platforms (such as Google Meet or Microsoft Teams), primarily uses UDP (User Datagram Protocol) for its audio and video communication. Here's why Zoom hosts and participants are connected via UDP:


1. Speed Over Reliability

  • Real-time Communication: In a Zoom call, speed is critical. Even a small delay in transmitting audio or video can disrupt the experience, leading to lags or awkward pauses.

  • UDP Advantage: UDP doesn't wait for acknowledgments or resend lost packets, making it much faster than TCP. It focuses on delivering data quickly, even if a few packets are dropped.

Example: Imagine you're in a live video call, and a few frames or sounds are skipped—it's better than freezing the entire stream to retransmit missing data.


2. Tolerance for Data Loss

  • Real-time Prioritization: In audio and video calls, small data losses (e.g., a skipped word or a slight visual glitch) are acceptable because the human brain can often fill in the gaps.

  • Why Not TCP?: TCP prioritizes reliability over speed by resending lost packets. This would cause delays, resulting in choppy video or out-of-sync audio.


3. Low Latency

  • UDP Keeps Latency Low: Since UDP doesn’t require handshakes, acknowledgments, or retransmissions, it minimizes latency. This ensures smooth and uninterrupted conversations, even in situations with fluctuating network quality.

4. Adaptive to Network Conditions

  • Zoom uses a technique called Forward Error Correction (FEC) to compensate for UDP’s lack of reliability:

    • FEC adds redundant data to the packets so the receiver can reconstruct some missing packets without requiring retransmission.

    • This combination lets Zoom recover from moderate packet loss without paying the delay cost of retransmissions.
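
The idea behind FEC can be illustrated with the simplest possible scheme: one XOR parity packet per group of equal-length packets. This is only a sketch of the general technique and is not a description of the scheme Zoom actually uses. The function names are made up for the example.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets):
    """Redundant packet: the XOR of every packet in the group."""
    return reduce(xor_bytes, packets)


def recover_missing(received, parity):
    """Rebuild at most one lost packet (marked as None) using the parity packet."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        raise ValueError("one parity packet can repair only one loss per group")
    if not missing:
        return list(received)
    present = [p for p in received if p is not None]
    # XOR-ing the parity with all surviving packets leaves the lost one.
    rebuilt = reduce(xor_bytes, present, parity)
    repaired = list(received)
    repaired[missing[0]] = rebuilt
    return repaired
```

If the sender transmits three packets plus their parity and the second packet is lost, `recover_missing([p0, None, p2], parity)` reconstructs it locally with no round trip back to the sender. The cost is extra bandwidth: here, one additional packet for every three.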


5. Multicast and Broadcast Support

  • Efficient for Group Calls: UDP supports multicast and broadcast, meaning the same data can be sent to multiple recipients efficiently on networks that support it. On the public internet, group calls are usually relayed through servers instead, but UDP still keeps that fan-out lightweight when multiple participants receive the same audio/video stream.

Why Doesn’t Zoom Use TCP for Calls?

Zoom uses TCP for certain tasks, like logging in, file transfers, or chat messages (where reliability is more important than speed). However, for real-time audio and video:

  1. TCP causes delays due to retransmissions.

  2. Congestion control in TCP slows down communication in poor network conditions, which would make calls unwatchable.

  3. UDP’s simplicity and speed are better suited for smooth real-time interaction.

Why Are Zoom Viewers Connected via TCP?

In Zoom, viewers—those who are passively watching a webinar or meeting—are often connected using TCP (Transmission Control Protocol) instead of UDP. Here’s why TCP is the better choice for this use case:


1. Reliability Is More Important for Viewers

  • Consistent Experience: For viewers, it’s important that all data, such as slides, presentations, and videos, arrives completely and in the correct order.

  • TCP Advantage: TCP guarantees reliable delivery by retransmitting lost packets and ensuring data is received in the right order. This is crucial for a smooth viewing experience.

Example: Imagine watching a webinar where parts of a presenter’s slide are missing. TCP ensures you get the full content, even if there’s a slight delay.


2. Viewers Don’t Need Low Latency

  • Not Real-Time Interaction: Unlike hosts or active participants, viewers are not actively speaking or sharing their video/audio. Small delays caused by TCP retransmissions won’t disrupt the experience since viewers are passively consuming content.

  • Why Not UDP?: UDP is designed for low latency but doesn’t guarantee delivery. For viewers, missing parts of the stream (like slides or a critical sentence in audio) would be more frustrating than a slight delay.


3. TCP Handles Congestion Better

  • Stable Connections: TCP has built-in mechanisms to manage network congestion. If the viewer’s internet connection is unstable, TCP adjusts the data flow to prevent overwhelming the connection.

  • Why Important for Viewers?: Viewers are more likely to be on diverse and potentially less stable networks. TCP ensures a consistent experience regardless of the connection quality.


4. Content Delivery and Buffering

  • Buffering with TCP: Viewers can benefit from buffering. Because TCP delivers every byte in order, the player can preload a few seconds of content (like video or audio), so a brief network disruption doesn't cause interruptions.

  • Buffering Is Harder with UDP: UDP itself never retransmits, so anything lost in transit stays lost unless the application adds its own recovery. This makes it less suitable for non-interactive viewing.


5. Consistency and Reliability for Large Groups

  • Webinars and Large-Scale Events: For large-scale events where viewers are in “listen-only” mode, TCP ensures every viewer gets the same consistent stream without data loss.

  • Multimedia Content: TCP is ideal for delivering mixed content, such as video, slides, and chat messages, which need to be synchronized.

Why Doesn’t Zoom Use UDP for Viewers?

  • Viewers prioritize reliability over speed, making TCP a better fit.

  • Missing content (slides, audio, or video) would significantly impact the viewer's experience.

  • The slight delay caused by TCP is negligible for viewers who don’t need real-time interaction.

Switching from TCP to UDP in Communication

Switching from TCP to UDP in a system, like Zoom or any real-time application, involves prioritizing speed and efficiency over reliability. Here's an overview of how and why such a switch might occur, and the scenarios where this is applicable.


Why Switch from TCP to UDP?

  1. Low Latency Requirements:

    • In real-time applications like video calls, gaming, or live streaming, latency is critical. Delays caused by TCP retransmissions can disrupt user experience.

    • UDP eliminates the overhead of connection establishment and retransmissions, offering faster delivery.

  2. Real-Time Communication:

    • For audio/video, occasional packet loss (minor glitches) is acceptable, but delays due to retransmission are not. UDP ensures continuous data flow.
  3. Improved Performance:

    • TCP can struggle in high-latency or unstable networks due to its congestion control and acknowledgment mechanisms.

    • UDP’s simpler protocol performs better in such conditions.

  4. Efficiency in Multicast/Broadcast:

    • In scenarios like webinars or live events, where the same data is sent to multiple users, UDP is more efficient as it supports multicast.

How Does the Switch Happen?

When switching from TCP to UDP:

  1. Connection Negotiation:

    • Initially, the system may establish a connection using TCP (e.g., for login, authentication, or setting up the session).

    • Once the session is authenticated, the system might switch to UDP for transmitting real-time data (e.g., audio/video streams).

  2. Protocol Coordination:

    • The sender and receiver agree to switch protocols during the session.

    • For example, in a Zoom call, the system might start with TCP for connection setup and metadata transfer and then switch to UDP for the call itself.

  3. Fallback Mechanism:

    • If UDP fails (due to blocked UDP ports or firewalls), the system can fall back to TCP as a backup for data transmission.
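
The fallback step can be sketched as a simple probe: try UDP first, and use TCP only if no reply comes back in time. The code below is an illustrative assumption about how such a check might look, not Zoom's actual logic. The function names, the probe payload, and the timeout value are all invented for this example, and it assumes the server answers probe datagrams.

```python
import socket


def udp_reachable(host: str, port: int, timeout: float = 0.5) -> bool:
    """Send one probe datagram and report whether any reply arrived in time."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as probe:
        probe.settimeout(timeout)
        try:
            probe.sendto(b"probe", (host, port))
            probe.recvfrom(64)
            return True
        except OSError:  # timed out, or the OS reported the port as unreachable
            return False


def pick_media_transport(host: str, udp_port: int) -> str:
    """Prefer UDP for media; fall back to TCP when UDP appears to be blocked."""
    return "udp" if udp_reachable(host, udp_port) else "tcp"
```

A single lost probe would wrongly trigger the fallback here, so a more careful version would retry a few times before giving up on UDP.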

Real-World Example: Zoom

  1. Initial Setup:

    • When a Zoom meeting starts, it often uses TCP to handle login, authentication, and establishing the connection.

    • TCP is reliable, ensuring all critical setup data is transmitted completely and in order.

  2. Switching to UDP:

    • Once the meeting begins, Zoom may switch to UDP for real-time audio and video transmission, as UDP minimizes latency and keeps the call smooth.
  3. Fallback to TCP:

    • If UDP is unavailable (e.g., due to network restrictions or firewalls), Zoom can continue the meeting using TCP, though this might result in higher latency.

Challenges of Switching to UDP

  1. Reliability:

    • UDP doesn’t guarantee data delivery. Systems must implement error correction (e.g., Forward Error Correction) to handle lost packets.
  2. Firewalls and Network Restrictions:

    • Many networks block or restrict UDP traffic, making it less accessible than TCP.
  3. Synchronization:

    • UDP doesn’t ensure packets arrive in order, so additional mechanisms are needed to manage out-of-order or missing packets.
  4. Fallback Complexity:

    • Implementing a fallback to TCP adds complexity to the system design.
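
The synchronization challenge is usually handled in the application by numbering each datagram and reordering on arrival. The sketch below shows the core idea under simplifying assumptions: it processes a finished batch instead of a live stream, and the function name `playout_order` is made up for the example. Unlike TCP, a packet that never arrived is not re-requested. It simply becomes a gap.

```python
def playout_order(arrivals, total):
    """Return what a receiver would play for sequence numbers 0..total-1.

    `arrivals` is a list of (seq, payload) in arrival order. Missing packets
    show up as None, a gap the application has to conceal (for example by
    repeating the previous video frame).
    """
    by_seq = {}
    for seq, payload in arrivals:
        by_seq.setdefault(seq, payload)  # keep the first copy, drop duplicates
    return [by_seq.get(seq) for seq in range(total)]
```

For instance, if frames arrive in the order 0, 2, 1, 4 and frame 3 is lost, the playout order is frames 0, 1, 2, a gap, then frame 4. A real receiver would also apply a deadline, so that a late packet is treated as lost instead of delaying everything behind it.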

Scenarios for Switching Between TCP and UDP

  1. Video Conferencing:

    • Start with TCP for connection setup, authentication, and sending metadata.

    • Switch to UDP for real-time audio and video to minimize delays.

  2. Gaming:

    • TCP is used for login and game state synchronization.

    • UDP is used for fast-paced in-game actions like player movements and interactions.

  3. Streaming Services:

    • TCP is used for on-demand video streaming (like Netflix) where reliability is crucial.

    • UDP is used for live broadcasts or real-time streams.

Conclusion

Both TCP and UDP are essential to modern internet communication, and each serves specific purposes. While TCP ensures reliability and accuracy, UDP prioritizes speed and low latency. Systems that switch between them, like Zoom, leverage the best of both worlds: the reliability of TCP for setup and control, and the speed of UDP for real-time communication.

This balance ensures that users experience a seamless, efficient, and reliable service, tailored to the demands of different tasks, whether it’s a dependable login, a fast-paced gaming session, or a smooth Zoom call.
