Scaling Video Uploads and Processing in a Startup: Why We Moved to Cloud-Based Chunk Uploads (and How It Works)


When we first launched VidSimplify, an AI-powered platform that summarizes, transcribes, and intelligently edits long-form videos, our upload and processing pipeline was custom-built. In an earlier post, I shared how we built a custom chunked upload system and parallel video processing logic for our AI startup: https://vidsimplify.hashnode.dev/why-chunked-and-resumable-uploads-are-a-game-changer-for-video-processing
That setup worked reasonably well for short videos. But as we began supporting long-form content (15–60+ minutes, often several gigabytes in size), our infrastructure began to strain under the load.
We needed something more scalable, fault-tolerant, and cloud-native.
That's when we decided to migrate to a secure, resumable, and parallel-friendly architecture using Google Cloud Storage (GCS). This post explains the why, the how, and the tangible benefits of this shift, with detailed architecture, pseudo-code, and diagrams.
The Backend-Driven Upload System (Our Initial Approach)
How It Worked
The frontend chunked large files (e.g., into 5MB segments).
Each chunk was uploaded to our FastAPI backend via a multipart request.
The backend assembled and stored the final video on disk.
Once the video was fully saved, we triggered AI processing.
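For context, the backend side of that old flow looked roughly like the sketch below. This is simplified and illustrative rather than our production code: the route path, form fields, and upload directory are assumptions, and it assumes chunks arrive in order.

```python
import os
from fastapi import FastAPI, File, Form, UploadFile

app = FastAPI()
UPLOAD_DIR = "/tmp/uploads"  # illustrative path

@app.post("/upload-chunk")
async def upload_chunk(
    file: UploadFile = File(...),
    upload_id: str = Form(...),
    chunk_index: int = Form(...),
):
    # Every chunk passes through the backend and is appended to a file on
    # local disk, which is exactly what caused the memory and disk I/O
    # pressure described in the next section. (Simplification: assumes
    # chunks arrive strictly in order.)
    os.makedirs(UPLOAD_DIR, exist_ok=True)
    path = os.path.join(UPLOAD_DIR, f"{upload_id}.mp4")
    with open(path, "ab") as f:
        f.write(await file.read())
    return {"received_chunk": chunk_index}
```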
Why It Failed at Scale
High memory & disk I/O on backend servers
Slow uploads due to routing everything through the backend
No resumability if an upload was interrupted
No processing until the full upload was complete
Complex retry logic to handle partial failures
Why GCS-Based Direct Upload Made Sense
As our user base grew and upload sizes ballooned, we had to find a solution that:
Scales with file size (no backend bottleneck)
Allows upload resumability
Ensures secure and token-based upload
Supports partial file access for parallel AI processing
Google Cloud Storage offered all of this out of the box along with strong ecosystem support.
New Upload Architecture: Cloud-Native, Resumable, and Parallel
High-Level Flow
```plaintext
[Frontend]
  ├──▶ Requests Signed Resumable URL from FastAPI
  └──▶ Uploads File in Chunks Directly to GCS (Resumable)
[GCS]
  └── Stores Uploaded Video Securely
[Backend]
  ├── Polls or Listens for Completion Signal
  └── Downloads Partial Segments and Triggers Parallel Processing
```
Why It Works
Secure: Only signed, time-limited URLs allow uploads
Resumable: Uploads can recover from interruptions
Faster: No backend bandwidth bottleneck
Segmented processing: Enables parallel download and analysis
Secure Resumable Upload via GCS
Step 1: Get Signed Upload URL (Backend → GCS)
```python
from google.cloud import storage

def get_resumable_upload_url(file_name: str, content_type: str) -> str:
    bucket = storage.Client().bucket("vidsimplify-uploads")
    blob = bucket.blob(file_name)
    # Starts a resumable upload session and returns the session URI.
    # The URI is tied to the allowed origin and must be used within a week,
    # so it acts as a time-limited, single-purpose upload credential.
    return blob.create_resumable_upload_session(
        content_type=content_type,
        origin="https://vidsimplify.com",
    )
```
The frontend receives a signed URL like:

```json
{
  "upload_url": "https://storage.googleapis.com/upload/storage/v1/b/vidsimplify-uploads/o?uploadType=resumable&..."
}
```
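On our backend this response comes from a small FastAPI route wrapping the helper above. A minimal sketch, where the route path and request model are illustrative, not our exact API:

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class UploadRequest(BaseModel):
    file_name: str
    content_type: str = "video/mp4"

@app.post("/uploads/init")
def init_upload(req: UploadRequest):
    # Reuses get_resumable_upload_url() from the previous snippet.
    url = get_resumable_upload_url(req.file_name, req.content_type)
    return {"upload_url": url}
```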
Step 2: Upload Chunks via fetch or XMLHttpRequest
```js
async function uploadChunk(file, url, start, end) {
  // File.slice() excludes its end index, while Content-Range is inclusive.
  // Every chunk except the last should be a multiple of 256 KiB.
  const chunk = file.slice(start, end + 1);
  const res = await fetch(url, {
    method: 'PUT',
    headers: {
      // The browser sets Content-Length automatically from the Blob.
      'Content-Range': `bytes ${start}-${end}/${file.size}`,
    },
    body: chunk,
  });
  // GCS responds with 308 for intermediate chunks and 200/201 once complete.
  return res.status;
}
```
Bonus: The frontend tracks chunk upload progress and shows precise feedback like:
Uploading: 67% complete
Speed: 2.3 MB/s
Note: Slow uploads are often caused by weak internet, not the app.
Smart Processing: Parallelism + Byte Ranges
The real power of this architecture comes after the upload, when we parallelize processing using partial downloads of the uploaded video.
Range-Based Downloading
```python
import requests

def download_segment(url: str, start_byte: int, end_byte: int) -> str:
    # Ask GCS for only the requested byte range via an HTTP Range header.
    headers = {"Range": f"bytes={start_byte}-{end_byte}"}
    res = requests.get(url, headers=headers, stream=True)
    path = f"segment_{start_byte}.mp4"
    with open(path, "wb") as f:
        for chunk in res.iter_content(8192):
            f.write(chunk)
    return path
```
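The URL the workers hit can be a V4 signed URL, so they never need bucket-wide credentials. Here is a minimal sketch of the generate_signed_url helper used later; the bucket name and expiry are assumptions:

```python
import datetime
from google.cloud import storage

def generate_signed_url(blob_name: str, minutes: int = 60) -> str:
    # Read-only, time-limited URL. GCS honours HTTP Range headers on it,
    # which is what makes the byte-range downloads above possible.
    bucket = storage.Client().bucket("vidsimplify-uploads")
    return bucket.blob(blob_name).generate_signed_url(
        version="v4",
        expiration=datetime.timedelta(minutes=minutes),
        method="GET",
    )
```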
We split long videos into N segments (e.g., 5-minute blocks), and assign each to a separate worker:
```plaintext
Segment 1: 0s–5m   → Worker A
Segment 2: 5m–10m  → Worker B
Segment 3: 10m–15m → Worker C
```
Each worker runs transcription, summarization, or key-moment detection in parallel, then syncs results.
Pseudo-Code: Task Queue Integration
```python
# FastAPI endpoint: split the video into 5-minute segments and enqueue
# one Celery task per segment (api = FastAPI(), celery_app = Celery(...)).
@api.post("/process")
def start_processing(path: str):
    segments = split_by_duration(path, duration=300)  # 5-min chunks
    for start, end in segments:
        process_segment.delay(path, start, end)
```

```python
# Celery task: each worker downloads only its byte range and runs the AI pipeline.
@celery_app.task
def process_segment(file_path, start, end):
    url = generate_signed_url(file_path)
    segment_path = download_segment(url, start, end)
    result = run_ai_tasks(segment_path)
    store_results(result)
```
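The pseudo-code glosses over how time-based segments become byte ranges. One simple, admittedly approximate, way is to read the video's duration and size and split the byte range proportionally. The sketch below assumes a roughly constant bitrate, assumes ffprobe is installed, and reuses the generate_signed_url helper from above:

```python
import math
import subprocess
from google.cloud import storage

def split_by_duration(blob_name: str, duration: int = 300) -> list[tuple[int, int]]:
    # Approximation: with a roughly constant bitrate, equal time slices map
    # to roughly equal byte slices, which is enough to fan work out evenly.
    bucket = storage.Client().bucket("vidsimplify-uploads")
    blob = bucket.blob(blob_name)
    blob.reload()  # fetch object metadata, including its size in bytes
    total_secs = float(subprocess.check_output([
        "ffprobe", "-v", "error",
        "-show_entries", "format=duration",
        "-of", "default=noprint_wrappers=1:nokey=1",
        generate_signed_url(blob_name),  # ffprobe can read straight from a signed URL
    ]))
    n = max(1, math.ceil(total_secs / duration))
    step = blob.size // n
    return [
        (i * step, blob.size - 1 if i == n - 1 else (i + 1) * step - 1)
        for i in range(n)
    ]
```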
System Architecture Diagram

```plaintext
┌───────────────────────────────────────────────────┐
│                    Frontend UI                     │
│            (Chunk uploader + progress)             │
└─────────────────────────┬─────────────────────────┘
                          │
┌─────────────────────────▼─────────────────────────┐
│    Get Signed GCS Resumable URL (FastAPI API)      │
└─────────────────────────┬─────────────────────────┘
                          │
┌─────────────────────────▼─────────────────────────┐
│      Upload in Chunks to GCS (Resumable PUT)       │
└─────────────────────────┬─────────────────────────┘
                          │
┌─────────────────────────▼─────────────────────────┐
│            Google Cloud Storage Bucket             │
└─────────────────────────┬─────────────────────────┘
                          │
┌─────────────────────────▼─────────────────────────┐
│   Detect Upload Completion & Trigger Processing    │
└─────────────────────────┬─────────────────────────┘
                          │
┌─────────────────────────▼─────────────────────────┐
│ Launch Parallel Workers with Byte-Range Requests   │
└───────────────────────────────────────────────────┘
```
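The "detect upload completion" step can be as simple as polling the object from the backend once the frontend reports its final chunk; a GCS Pub/Sub notification on object finalize is the cleaner production option. A minimal polling sketch, where the bucket name and timeouts are illustrative:

```python
import time
from google.cloud import storage

def wait_for_upload(blob_name: str, timeout_s: int = 3600, poll_s: int = 10) -> bool:
    # Returns True once the object exists in the bucket, False on timeout.
    bucket = storage.Client().bucket("vidsimplify-uploads")
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        if bucket.blob(blob_name).exists():
            return True
        time.sleep(poll_s)
    return False
```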
UX Improvement: Chunk Progress and Transparency
To make things user-friendly, we added:
Real-time upload progress bar: Showing percentage uploaded and current speed
Upload stage indicators: "Initializing", "Uploading", "Finalizing", and "Processing"
Error awareness: If the upload is stuck, we inform users that it's likely due to their network, not our server
This transparency improved user trust and reduced customer support issues significantly.
Key Benefits for Startups
| Feature | Old Backend Upload | GCS Resumable Upload |
| --- | --- | --- |
| Server memory use | High | Minimal |
| Fault tolerance (resume uploads) | Manual | Native |
| Upload speed | Limited by backend | CDN-accelerated |
| Processing start | Post-upload | Instantly on chunk |
| Long video support | Complex | Streamlined |
| Parallel processing | Difficult | Seamless via ranges |
Final Thoughts
For us, the move to a cloud-first, chunked, and parallel processing pipeline was a natural evolution as our users uploaded longer and heavier videos. It simplified the backend, improved user experience, and drastically boosted our processing speed.
If you're building a media-heavy AI product, especially one dealing with large files, I highly recommend starting with this architecture or migrating to it before scale hits you hard.