Demystifying YouTube: A Look Behind the Scenes

surani smit

YouTube is one of the most popular video-streaming platforms, allowing users to upload, watch, search, and share video content. It also lets users like, dislike, and comment on videos. So YouTube is a huge system! But have you ever thought about how it works and the principles behind its design? Let’s look at this in detail.

Key Requirements:

  • Functionality:

    • Users should be able to upload videos.

    • Users should be able to view videos.

    • Users should be able to change video quality.

    • The system should keep the count of likes, dislikes, comments, and views.

  • Non-Functional Requirements:

    • Performance: Fast video uploads and seamless streaming experiences.

    • Scalability & Reliability: The system must handle massive user traffic and data volume while remaining highly available. Consistency can be slightly compromised for high availability (e.g., slight lag in view count).

    • Latency & Throughput: Low latency and high throughput for optimal video delivery.

    • Cost-Efficiency: Striking a balance between functionality and cost.

Capacity Estimation:

  • Suppose total users = 2 billion.

  • Suppose daily active users = 400 million.

  • Suppose the number of videos watched/day/user = 5.

  • Total video views/day = 400 million × 5 = 2 billion views/day.

  • YouTube is a view-heavy (read-heavy) system. Suppose the view-to-upload ratio (read-to-write ratio) is 100:1; then total video uploads/day = 2 billion / 100 = 20 million uploads/day.

  • Suppose the average video size is 100 MB. Total storage needed/day = 20 million × 100 MB = 2,000 TB/day = 2 PB/day. Here we ignore video compression and replication, which would change our estimates.

  • If we use an existing cloud CDN service to serve videos, we pay for data transfer. Suppose we use Amazon’s CDN, CloudFront, and assume it costs $0.01 per GB of data transfer. Then the total cost for video streaming/day = total video views/day × average video size in GB × $0.01 = 2 billion × 0.1 × 0.01 = $2 million/day. Serving every video from the CDN would clearly be expensive.
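The arithmetic above can be reproduced with a short script. This is only a back-of-the-envelope sketch: every input is one of the assumptions stated in the list (including the flat $0.01/GB transfer price), decimal units are used (1 GB = 1,000 MB), and the function and constant names are made up for illustration.

```python
# Back-of-the-envelope capacity estimate using the article's assumed inputs.
DAILY_ACTIVE_USERS = 400_000_000
VIEWS_PER_USER_PER_DAY = 5
VIEWS_PER_UPLOAD = 100        # assumed view-to-upload ratio of 100:1
AVG_VIDEO_SIZE_MB = 100       # assumed average size, ignoring compression
CDN_COST_PER_GB_USD = 0.01    # assumed flat data-transfer price

MB_PER_GB = 1_000             # decimal units, matching the article's rounding
MB_PER_PB = 1_000_000_000


def estimate_capacity() -> dict:
    """Return daily views, uploads, storage (PB) and CDN cost (USD)."""
    views_per_day = DAILY_ACTIVE_USERS * VIEWS_PER_USER_PER_DAY
    uploads_per_day = views_per_day // VIEWS_PER_UPLOAD
    storage_pb_per_day = uploads_per_day * AVG_VIDEO_SIZE_MB / MB_PER_PB
    transfer_gb_per_day = views_per_day * AVG_VIDEO_SIZE_MB / MB_PER_GB
    cdn_cost_usd_per_day = transfer_gb_per_day * CDN_COST_PER_GB_USD
    return {
        "views_per_day": views_per_day,
        "uploads_per_day": uploads_per_day,
        "storage_pb_per_day": storage_pb_per_day,
        "cdn_cost_usd_per_day": cdn_cost_usd_per_day,
    }


if __name__ == "__main__":
    for name, value in estimate_capacity().items():
        print(f"{name}: {value:,}")
```

Changing any single constant (for example, the average video size) shows how sensitive the storage and cost figures are to that assumption.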

High-Level Design:

The system comprises several key components:

  • Architecture diagram: https://app.eraser.io/workspace/8T4MhoX69aGgJ352sjFt

  • Client: Users access YouTube through computers, mobile phones, etc.

  • Video Storage: A BLOB (Binary Large Object) storage system stores transcoded videos.

  • Transcoding Server: Converts videos into multiple formats and resolutions for optimal playback on various devices and bandwidths.

  • API Server: Handles non-streaming requests like feed recommendations, video upload URL generation, metadata updates, and user signup.

  • Web Server: Routes incoming client requests to the API server or transcoding server.

  • CDN (Content Delivery Network): Stores encoded videos for faster streaming. Popular videos are typically served from the CDN.

  • Load Balancer: Distributes requests evenly among API servers.

  • Metadata Storage: Stores video metadata like title, URL, thumbnails, user information, view counts, etc. It's sharded and replicated for high performance and availability.

  • Metadata Cache: Improves read performance by caching frequently accessed metadata, user info, and thumbnails.
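The split between the CDN and video storage in the component list can be illustrated with a small routing sketch. It assumes a simple rule that the article does not specify: a video is promoted to the CDN once its daily views cross a threshold. The threshold value, the `cdn_cache` set, and the `choose_source` function are all hypothetical names chosen for this example.

```python
POPULARITY_THRESHOLD = 10_000  # hypothetical daily-view cutoff for promotion

cdn_cache: set[str] = set()    # IDs of videos currently held at the CDN edge


def choose_source(video_id: str, daily_views: int) -> str:
    """Return where this request is served from: "cdn" or "video_storage"."""
    if video_id in cdn_cache:
        return "cdn"
    if daily_views >= POPULARITY_THRESHOLD:
        # Promote the video so that later requests are served by the CDN.
        cdn_cache.add(video_id)
    # This request itself is still served from the origin video storage.
    return "video_storage"
```

With this rule, the first request for a newly popular video still goes to video storage, and subsequent requests are answered by the CDN.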

Video Uploading Process:

We use the uploadVideo API to upload video content. It returns an HTTP response indicating whether the video was uploaded successfully.

string uploadVideo(string apiKey, stream videoData, string videoTitle, string videoDescription, string videoCategory, string videoTags[], string videoLanguage, string videoLocation)

  • apiKey: An identification of the registered account.

  • videoData: Uploaded video data.

  • videoTitle: The title of the video.

  • videoDescription: The description text of the video.

  • videoCategory: Video category data like sports, education, etc.

  • videoTags[]: A list of tags for the video.

  • videoLanguage: The language of the content like English, Hindi, etc.

  • videoLocation: The location where the video was recorded.

The video upload flow is divided into two processes running in parallel: 1) Uploading the video content and 2) Updating the video metadata.

  1. Uploading Video Content:

    • Users upload videos, triggering transcoding for various formats and resolutions.

    • Parallelization across multiple machines increases throughput.

    • Popular videos might undergo further compression for size reduction while maintaining quality.

  2. Updating Video Metadata:

    • While uploading, metadata like title, description, and thumbnails are sent for storage.
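The two parallel parts of the upload flow can be sketched with a thread pool. This is a toy model under stated assumptions: the in-memory dictionaries stand in for BLOB storage and the metadata database, `transcode` only copies bytes instead of running a real encoder, and the resolution list and all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

RESOLUTIONS = ("240p", "480p", "720p", "1080p")  # assumed target renditions

blob_store: dict[str, bytes] = {}     # stand-in for BLOB video storage
metadata_store: dict[str, dict] = {}  # stand-in for the metadata database


def transcode(video_id: str, raw: bytes, resolution: str) -> str:
    """Pretend to transcode one rendition and return its storage key."""
    key = f"{video_id}/{resolution}"
    blob_store[key] = raw  # placeholder: a real encoder would run here
    return key


def save_metadata(video_id: str, title: str, description: str) -> None:
    """Store the video's metadata record."""
    metadata_store[video_id] = {"title": title, "description": description}


def upload_video(video_id: str, raw: bytes, title: str, description: str) -> list[str]:
    """Run the metadata update and per-resolution transcoding in parallel."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        meta_future = pool.submit(save_metadata, video_id, title, description)
        transcode_futures = [
            pool.submit(transcode, video_id, raw, resolution)
            for resolution in RESOLUTIONS
        ]
        meta_future.result()  # re-raises if the metadata write failed
        return [future.result() for future in transcode_futures]
```

In a production system the transcoding jobs would typically be spread across separate machines rather than threads in one process; the thread pool here only shows the fan-out shape.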

Video Streaming Process:

We use the viewVideo API to stream video content. It continuously sends small pieces of the video media stream, starting from the given offset.

stream viewVideo(string apiKey, string videoId, int videoOffset, string codec, string videoResolution)

  • apiKey: An identification of the registered account.

  • videoId: An identifier for the video.

  • videoOffset: This is the time from the start of the video, which enables users to watch a video on any device from the same point where they left off.

  • codec: A codec is a video compression standard that shrinks large video content using efficient compression algorithms. We send the client’s codec information in the API call so that playback can be paused and resumed across different devices.

  • videoResolution: We also send the client’s resolution details because different devices may have different resolutions.

  1. Client Request:

    • Users request a video, and the platform considers factors like device type, screen size, processing power, and network bandwidth.

  2. Content Delivery:

    • Based on the above factors, the system delivers the optimal video version from the nearest edge server in real time.

    • Devices load video data in small chunks, receiving a continuous stream from CDN or video storage.
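Chunked delivery from a given offset can be sketched as a generator. The sketch assumes the video has already been split into fixed-duration segments (2 seconds here, an arbitrary choice) and maps the `videoOffset` time onto a segment index; the function name and parameters are illustrative and are not the actual viewVideo implementation.

```python
from typing import Iterator


def stream_from_offset(
    segments: list[bytes],
    video_offset_s: float,
    segment_duration_s: float = 2.0,  # assumed fixed segment length
) -> Iterator[bytes]:
    """Yield video segments one at a time, starting at the given time offset."""
    if video_offset_s < 0 or segment_duration_s <= 0:
        raise ValueError("offset must be >= 0 and segment duration > 0")
    # Resume from the segment that contains the requested timestamp.
    first_segment = int(video_offset_s // segment_duration_s)
    for segment in segments[first_segment:]:
        yield segment
```

For example, with four segments and an offset of 5 seconds, streaming resumes at the third segment, which covers seconds 4 to 6.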

Optimizations:

  • CDN Strategy: Popular videos are streamed from the CDN, while less popular ones reside in high-capacity video storage. Videos gaining popularity can be migrated to the CDN.

  • Streaming Protocol: Standard protocols like MPEG-DASH ensure efficient data transfer during streaming. This allows for:

    • Dynamic bitrate adjustments to minimize buffering.

    • Quality delivery based on network bandwidth and user devices.
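Dynamic bitrate adjustment can be illustrated with a simple throughput-based rule: pick the highest rendition whose bitrate fits within a fraction of the measured bandwidth. The bitrate ladder, the 0.8 safety factor, and the function name are assumptions made for this sketch; real MPEG-DASH players usually combine throughput with buffer level and other signals.

```python
# Hypothetical bitrate ladder: rendition name -> required bitrate in kbit/s.
RENDITIONS_KBPS = {"240p": 400, "480p": 1000, "720p": 2500, "1080p": 5000}


def pick_rendition(measured_kbps: float, safety_factor: float = 0.8) -> str:
    """Pick the highest rendition that fits within the bandwidth budget."""
    budget = measured_kbps * safety_factor
    affordable = [
        (bitrate, name)
        for name, bitrate in RENDITIONS_KBPS.items()
        if bitrate <= budget
    ]
    if not affordable:
        # Bandwidth is below every rendition: fall back to the lowest one.
        return min(RENDITIONS_KBPS, key=RENDITIONS_KBPS.get)
    return max(affordable)[1]
```

A client would call this again after each downloaded chunk, so the chosen quality follows the network as it improves or degrades.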

Metadata Management:

  • MySQL Database: Stores user and video information in separate tables.

  • Replication (Master-Slave Architecture): Scales read requests by distributing them across read replicas. However, this might cause temporary data inconsistencies (e.g., view count differences).

  • Sharding: Distributes data across multiple machines for efficient read/write operations.

  • Vitess: A horizontal scaling system for MySQL, managing shard distribution and improving performance.

    Vitess is a database clustering system that runs on top of MySQL. Its built-in features let us scale horizontally, much like a NoSQL database.

    Here are some important features of Vitess:

    Scalability: Its built-in sharding lets you grow the database without adding sharding logic to the application.

    Performance: It automatically rewrites queries that would hurt database performance. It also uses caching mechanisms and prevents duplicate queries.

    Manageability: It handles failovers and backups automatically.

    Sharding management: MySQL doesn’t natively support sharding, but we will need it as the database grows. Vitess supports live resharding with minimal read-only downtime.

  • Caching: Utilizes distributed caches like Redis or Memcached to store frequently accessed metadata, reducing database load.
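Sharding and caching can be combined in a small cache-aside sketch. It is a toy model, not how Vitess or Redis work internally: plain dictionaries stand in for the MySQL shards and the distributed cache, the shard is chosen by hashing the video ID, and the shard count and function names are assumptions for illustration.

```python
import hashlib

NUM_SHARDS = 4  # assumed shard count

shards: list[dict] = [{} for _ in range(NUM_SHARDS)]  # stand-in for MySQL shards
cache: dict[str, dict] = {}  # stand-in for Redis or Memcached


def shard_for(video_id: str) -> int:
    """Map a video ID to a shard index using a stable hash."""
    digest = hashlib.sha256(video_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def put_metadata(video_id: str, metadata: dict) -> None:
    """Write to the owning shard and invalidate any stale cache entry."""
    shards[shard_for(video_id)][video_id] = metadata
    cache.pop(video_id, None)


def get_metadata(video_id: str):
    """Cache-aside read: try the cache first, then fall back to the shard."""
    if video_id in cache:
        return cache[video_id]
    record = shards[shard_for(video_id)].get(video_id)
    if record is not None:
        cache[video_id] = record  # populate the cache for later reads
    return record
```

Note that a plain modulo scheme like this forces most keys to move when the shard count changes, which is one reason to delegate shard management to a system such as Vitess.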

Disaster Management:

  • Data Backups: Data is backed up across geographically dispersed data centers for redundancy in case of outages or disasters.

Summary:

YouTube is a massive video-streaming platform that allows users to upload, watch, and interact with videos. This article explores YouTube's key functionalities, non-functional requirements, capacity estimation, high-level design, video uploading and streaming processes, optimizations, metadata management, and disaster management strategies. It provides insights into how YouTube handles scalability, performance, and cost-efficiency to deliver a seamless user experience.

By:

  • Surani Smit [210303125005]

  • Aviral [210303125016]

  • Srushtee patil [210303125002]

  • Parth Mishra [210303125020]

  • Faizal Ikhariya [210303125009]
