Scaling Video Uploads and Processing in a Startup: Why We Moved to Cloud-Based Chunk Uploads (and How It Works)

Aditya Raj
5 min read

When we first launched VidSimplify, an AI-powered platform that summarizes, transcribes, and intelligently edits long-form videos, our upload and processing pipeline was entirely custom-built. In an earlier post, I shared how we built a custom chunked upload system and parallel video-processing logic for our AI startup: https://vidsimplify.hashnode.dev/why-chunked-and-resumable-uploads-are-a-game-changer-for-video-processing

That setup worked reasonably well for short videos. But as we began supporting long-form content (15–60+ minutes, often several gigabytes in size), our infrastructure began to strain under the load.

We needed something more scalable, fault-tolerant, and cloud-native.

That’s when we decided to migrate to a secure, resumable, and parallel-friendly architecture using Google Cloud Storage (GCS). This post explains the why, the how, and the tangible benefits of this shift, with detailed architecture, pseudo-code, and diagrams.


📦 The Backend-Driven Upload System (Our Initial Approach)

How It Worked

  • The frontend chunked large files (e.g., into 5MB segments).

  • Each chunk was uploaded to our FastAPI backend via a multipart request.

  • The backend assembled and stored the final video on disk.

  • Once the video was fully saved, we triggered AI processing.
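
For context, here is a minimal sketch of what that chunk-assembly endpoint looked like. The route, field names, and UPLOAD_DIR are illustrative, not our exact production code:

# Simplified sketch of the old backend-driven chunk endpoint (illustrative names).
import os
from fastapi import FastAPI, Form, UploadFile

app = FastAPI()
UPLOAD_DIR = "/tmp/uploads"  # hypothetical staging directory

@app.post("/upload-chunk")
async def upload_chunk(file: UploadFile, upload_id: str = Form(...), chunk_index: int = Form(...)):
    # Every chunk flows through the API server: memory and disk I/O on our box.
    os.makedirs(UPLOAD_DIR, exist_ok=True)
    part_path = os.path.join(UPLOAD_DIR, f"{upload_id}.part")
    with open(part_path, "ab") as f:   # append-only assembly assumes in-order chunks
        f.write(await file.read())     # the whole chunk is buffered in memory
    return {"received": chunk_index}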

Why It Failed at Scale

  • 🚧 High memory & disk I/O on backend servers

  • 🐌 Slow uploads due to routing everything through the backend

  • ❌ No resumability if upload was interrupted

  • 😩 No processing until full upload was complete

  • 🔁 Complex retry logic to handle partial failures


☁️ Why GCS-Based Direct Upload Made Sense

As our user base grew and upload sizes ballooned, we had to find a solution that:

  • Scales with file size (no backend bottleneck)

  • Allows upload resumability

  • Ensures secure, token-based uploads

  • Supports partial file access for parallel AI processing

Google Cloud Storage offered all of this out of the box, along with strong ecosystem support.


✅ New Upload Architecture: Cloud-Native, Resumable, and Parallel

High-Level Flow

[Frontend]
    ├──▶ Requests Signed Resumable URL from FastAPI
    └──▶ Uploads File in Chunks Directly to GCS (Resumable)

[GCS]
    └── Stores Uploaded Video Securely

[Backend]
    ├── Polls or Listens for Completion Signal
    └── Downloads Partial Segments and Triggers Parallel Processing

Why It Works

  • πŸ” Secure: Only signed, time-limited URLs allow upload

  • πŸ”„ Resumable: Upload can recover from interruptions

  • πŸš€ Faster: No backend bandwidth bottleneck

  • οΏ½οΏ½ Segmented Processing: Enables parallel download and analysis


πŸ” Secure Resumable Upload via GCS

Step 1: Get Signed Upload URL (Backend → GCS)

from google.cloud import storage

def get_resumable_upload_url(file_name: str, content_type: str) -> str:
    bucket = storage.Client().bucket("vidsimplify-uploads")
    blob = bucket.blob(file_name)
    # Starts a resumable upload session and returns its session URL.
    # Note: create_resumable_upload_session has no expiration parameter;
    # GCS session URLs are valid for up to one week.
    return blob.create_resumable_upload_session(
        content_type=content_type,
        origin="https://vidsimplify.com",  # CORS origin allowed to upload
    )

The frontend receives a resumable session URL like:

{
  "upload_url": "https://storage.googleapis.com/upload/storage/v1/b/vidsimplify-uploads/o?uploadType=resumable&..."
}
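
On the backend, this sits behind a small FastAPI route; a sketch (the route path and request model are illustrative, not our exact API):

# Sketch of the endpoint the frontend calls; route and model names are illustrative.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class UploadSessionRequest(BaseModel):
    file_name: str
    content_type: str

@app.post("/uploads/session")
def create_upload_session(req: UploadSessionRequest):
    # Authentication and quota checks would run here before issuing a session URL.
    url = get_resumable_upload_url(req.file_name, req.content_type)
    return {"upload_url": url}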

Step 2: Upload Chunks via fetch or XMLHttpRequest

async function uploadChunk(file, url, start, end) {
  const chunk = file.slice(start, end + 1); // slice end is exclusive, so +1
  const res = await fetch(url, {
    method: 'PUT',
    headers: {
      // Content-Length is a forbidden header in fetch; the browser sets it.
      'Content-Range': `bytes ${start}-${end}/${file.size}`,
    },
    body: chunk,
  });
  // GCS replies 308 while more chunks are expected, 200/201 on completion.
  return res.status;
}

💡 Bonus: The frontend tracks chunk upload progress and shows precise feedback like:

Uploading: 67% complete
Speed: 2.3 MB/s
Note: Slow uploads are often caused by weak internet, not the app.


🧠 Smart Processing: Parallelism + Byte Ranges

The real power of this architecture comes after the upload: we parallelize processing using partial downloads of the uploaded video.

Range-Based Downloading

import requests

def download_segment(url: str, start_byte: int, end_byte: int) -> str:
    # Fetch only the requested byte range from GCS (bounds are inclusive).
    headers = {"Range": f"bytes={start_byte}-{end_byte}"}
    res = requests.get(url, headers=headers, stream=True)
    res.raise_for_status()  # expect 206 Partial Content
    path = f"segment_{start_byte}.mp4"
    with open(path, "wb") as f:
        for chunk in res.iter_content(8192):
            f.write(chunk)
    return path

We split long videos into N segments (e.g., 5-minute blocks), and assign each to a separate worker:

Segment 1: 0s–5m   → Worker A
Segment 2: 5m–10m  → Worker B
Segment 3: 10m–15m → Worker C

Each worker runs transcription, summarization, or key-moment detection in parallel, then syncs results.
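
The split_by_duration helper used in the pseudo-code below can be as simple as reading the total duration with ffprobe and emitting (start, end) pairs; a sketch, assuming ffprobe is installed (this is illustrative, not our exact code):

# Sketch: derive (start, end) second pairs from the video duration via ffprobe.
import json
import subprocess

def split_by_duration(path: str, duration: int = 300):
    probe = subprocess.run(
        ["ffprobe", "-v", "quiet", "-print_format", "json", "-show_format", path],
        capture_output=True, text=True, check=True,
    )
    total = float(json.loads(probe.stdout)["format"]["duration"])
    return [(start, min(start + duration, int(total)))
            for start in range(0, int(total), duration)]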


Pseudo-Code: Task Queue Integration

@app.post("/process")
def start_processing(path: str):
    segments = split_by_duration(path, duration=300)  # 5-minute blocks
    for start, end in segments:
        process_segment.delay(path, start, end)  # enqueue one task per segment

# Worker side (.delay above implies a Celery-style task queue):
@celery_app.task
def process_segment(file_path, start, end):
    # Pseudo-code: start/end are time offsets here; a real implementation
    # maps them to byte offsets before the range download.
    url = generate_signed_url(file_path)
    segment = download_segment(url, start, end)
    result = run_ai_tasks(segment)
    store_results(result)
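
For completeness, generate_signed_url above would be a standard GCS v4 signed read URL; a minimal sketch (the bucket name matches earlier examples, and the expiry is illustrative):

# Sketch: short-lived read URL so workers can issue Range requests against GCS.
import datetime
from google.cloud import storage

def generate_signed_url(file_path: str, minutes: int = 30) -> str:
    blob = storage.Client().bucket("vidsimplify-uploads").blob(file_path)
    return blob.generate_signed_url(
        version="v4",
        expiration=datetime.timedelta(minutes=minutes),
        method="GET",
    )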

🧭 System Architecture Diagram

          ┌──────────────────────────────────────────────────┐
          │                   Frontend UI                    │
          │           (Chunk uploader + progress)            │
          └─────────────────────────┬────────────────────────┘
                                    │
          ┌─────────────────────────▼────────────────────────┐
          │    Get Signed GCS Resumable URL (FastAPI API)    │
          └─────────────────────────┬────────────────────────┘
                                    │
          ┌─────────────────────────▼────────────────────────┐
          │     Upload in Chunks to GCS (Resumable PUT)      │
          └─────────────────────────┬────────────────────────┘
                                    │
          ┌─────────────────────────▼────────────────────────┐
          │           Google Cloud Storage Bucket            │
          └─────────────────────────┬────────────────────────┘
                                    │
          ┌─────────────────────────▼────────────────────────┐
          │  Detect Upload Completion & Trigger Processing   │
          └─────────────────────────┬────────────────────────┘
                                    │
          ┌─────────────────────────▼────────────────────────┐
          │ Launch Parallel Workers with Byte-Range Requests │
          └──────────────────────────────────────────────────┘
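
For the "Detect Upload Completion" step, the backend can either poll for the finished object or subscribe to GCS Pub/Sub notifications. A minimal polling sketch (interval and timeout are illustrative); it relies on the fact that a GCS object only becomes visible once its resumable upload finalizes:

# Sketch: poll GCS until the uploaded object exists, then kick off processing.
import time
from google.cloud import storage

def wait_for_upload(file_name: str, timeout_s: int = 3600) -> bool:
    blob = storage.Client().bucket("vidsimplify-uploads").blob(file_name)
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        if blob.exists():
            return True   # object finalized; safe to launch the workers
        time.sleep(5)
    return False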

UX Improvement: Chunk Progress and Transparency

To make things user-friendly, we added:

  • Real-time upload progress bar: Showing percentage uploaded and current speed

  • Upload stage indicators: "Initializing", "Uploading", "Finalizing", and "Processing"

  • Error awareness: If the upload is stuck, we inform users that it’s likely due to their network, not our server

This transparency improved user trust and reduced customer support issues significantly.


💡 Key Benefits for Startups

Feature                          | Old Backend Upload | GCS Resumable Upload
---------------------------------+--------------------+---------------------
Server memory use                | High               | Minimal
Fault tolerance (resume uploads) | Manual             | Native
Upload speed                     | Limited by backend | CDN-accelerated
Processing start                 | Post-upload        | Instantly on chunk
Long video support               | Complex            | Streamlined
Parallel processing              | Difficult          | Seamless via ranges

👋 Final Thoughts

For us, the move to a cloud-first, chunked, and parallel processing pipeline was a natural evolution as our users uploaded longer and heavier videos. It simplified the backend, improved user experience, and drastically boosted our processing speed.

If you're building a media-heavy AI product, especially one dealing with large files, I highly recommend starting with this architecture or migrating to it before scale hits you hard.
