Scaling Video Uploads and Processing in a Startup: Why We Moved to Cloud-Based Chunk Uploads (and How It Works)


When we first launched VidSimplify, an AI-powered platform that summarizes, transcribes, and intelligently edits long-form videos, our upload and processing pipeline was custom-built. In an earlier post, I shared how we built a custom chunked upload system and parallel video processing logic for our AI startup: https://vidsimplify.hashnode.dev/why-chunked-and-resumable-uploads-are-a-game-changer-for-video-processing
That setup worked reasonably well for short videos. But as we began supporting long-form content (15–60+ minutes, often several gigabytes in size), our infrastructure began to strain under the load.
We needed something more scalable, fault-tolerant, and cloud-native.
That's when we decided to migrate to a secure, resumable, and parallel-friendly architecture using Google Cloud Storage (GCS). This post explains the why, the how, and the tangible benefits of this shift, with detailed architecture, pseudo-code, and diagrams.
The Backend-Driven Upload System (Our Initial Approach)
How It Worked
The frontend chunked large files (e.g., into 5MB segments).
Each chunk was uploaded to our FastAPI backend via a multipart request.
The backend assembled and stored the final video on disk.
Once the video was fully saved, we triggered AI processing.
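For context, the backend side of that old flow looked roughly like the sketch below. This is simplified and illustrative rather than our production code: the route path, form fields, and upload directory are assumptions, and it assumes chunks arrive in order.

```python
import os
from fastapi import FastAPI, File, Form, UploadFile

app = FastAPI()
UPLOAD_DIR = "/tmp/uploads"  # illustrative path

@app.post("/upload-chunk")
async def upload_chunk(
    file: UploadFile = File(...),
    upload_id: str = Form(...),
    chunk_index: int = Form(...),
):
    # Every chunk passes through the backend and is appended to a file on
    # local disk, which is exactly what caused the memory and disk I/O
    # pressure described in the next section. (Simplification: assumes
    # chunks arrive strictly in order.)
    os.makedirs(UPLOAD_DIR, exist_ok=True)
    path = os.path.join(UPLOAD_DIR, f"{upload_id}.mp4")
    with open(path, "ab") as f:
        f.write(await file.read())
    return {"received_chunk": chunk_index}
```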
Why It Failed at Scale
High memory & disk I/O on backend servers
Slow uploads due to routing everything through the backend
No resumability if an upload was interrupted
No processing until the full upload was complete
Complex retry logic to handle partial failures
Why GCS-Based Direct Upload Made Sense
As our user base grew and upload sizes ballooned, we had to find a solution that:
Scales with file size (no backend bottleneck)
Allows upload resumability
Ensures secure and token-based upload
Supports partial file access for parallel AI processing
Google Cloud Storage offered all of this out of the box along with strong ecosystem support.
New Upload Architecture: Cloud-Native, Resumable, and Parallel
High-Level Flow
```plaintext
[Frontend]
  ├──▶ Requests Signed Resumable URL from FastAPI
  └──▶ Uploads File in Chunks Directly to GCS (Resumable)
[GCS]
  └── Stores Uploaded Video Securely
[Backend]
  ├── Polls or Listens for Completion Signal
  └── Downloads Partial Segments and Triggers Parallel Processing
```
Why It Works
Secure: Only signed, time-limited URLs allow uploads
Resumable: Uploads can recover from interruptions
Faster: No backend bandwidth bottleneck
Segmented processing: Enables parallel download and analysis
Secure Resumable Upload via GCS
Step 1: Get Signed Upload URL (Backend → GCS)
```python
from google.cloud import storage

def get_resumable_upload_url(file_name: str, content_type: str) -> str:
    bucket = storage.Client().bucket("vidsimplify-uploads")
    blob = bucket.blob(file_name)
    # Starts a resumable upload session and returns the session URI.
    # The URI is tied to the allowed origin and must be used within a week,
    # so it acts as a time-limited, single-purpose upload credential.
    return blob.create_resumable_upload_session(
        content_type=content_type,
        origin="https://vidsimplify.com",
    )
```
The frontend receives a signed URL like:

```json
{
  "upload_url": "https://storage.googleapis.com/upload/storage/v1/b/vidsimplify-uploads/o?uploadType=resumable&..."
}
```
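On our backend this response comes from a small FastAPI route wrapping the helper above. A minimal sketch, where the route path and request model are illustrative, not our exact API:

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class UploadRequest(BaseModel):
    file_name: str
    content_type: str = "video/mp4"

@app.post("/uploads/init")
def init_upload(req: UploadRequest):
    # Reuses get_resumable_upload_url() from the previous snippet.
    url = get_resumable_upload_url(req.file_name, req.content_type)
    return {"upload_url": url}
```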
Step 2: Upload Chunks via fetch or XMLHttpRequest
```js
async function uploadChunk(file, url, start, end) {
  // File.slice() excludes its end index, while Content-Range is inclusive.
  // Every chunk except the last should be a multiple of 256 KiB.
  const chunk = file.slice(start, end + 1);
  const res = await fetch(url, {
    method: 'PUT',
    headers: {
      // The browser sets Content-Length automatically from the Blob.
      'Content-Range': `bytes ${start}-${end}/${file.size}`,
    },
    body: chunk,
  });
  // GCS responds with 308 for intermediate chunks and 200/201 once complete.
  return res.status;
}
```
Bonus: The frontend tracks chunk upload progress and shows precise feedback like:
Uploading: 67% complete
Speed: 2.3 MB/s
Note: Slow uploads are often caused by weak internet, not the app.
Smart Processing: Parallelism + Byte Ranges
The real power of this architecture comes after the upload, when we parallelize processing using partial downloads of the uploaded video.
Range-Based Downloading
```python
import requests

def download_segment(url: str, start_byte: int, end_byte: int) -> str:
    # Ask GCS for only the requested byte range via an HTTP Range header.
    headers = {"Range": f"bytes={start_byte}-{end_byte}"}
    res = requests.get(url, headers=headers, stream=True)
    path = f"segment_{start_byte}.mp4"
    with open(path, "wb") as f:
        for chunk in res.iter_content(8192):
            f.write(chunk)
    return path
```
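The URL the workers hit can be a V4 signed URL, so they never need bucket-wide credentials. Here is a minimal sketch of the generate_signed_url helper used later; the bucket name and expiry are assumptions:

```python
import datetime
from google.cloud import storage

def generate_signed_url(blob_name: str, minutes: int = 60) -> str:
    # Read-only, time-limited URL. GCS honours HTTP Range headers on it,
    # which is what makes the byte-range downloads above possible.
    bucket = storage.Client().bucket("vidsimplify-uploads")
    return bucket.blob(blob_name).generate_signed_url(
        version="v4",
        expiration=datetime.timedelta(minutes=minutes),
        method="GET",
    )
```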
We split long videos into N segments (e.g., 5-minute blocks), and assign each to a separate worker:
```plaintext
Segment 1: 0s–5m   → Worker A
Segment 2: 5m–10m  → Worker B
Segment 3: 10m–15m → Worker C
```
Each worker runs transcription, summarization, or key-moment detection in parallel, then syncs results.
Pseudo-Code: Task Queue Integration
```python
# FastAPI endpoint: split the video into 5-minute segments and enqueue
# one Celery task per segment (api = FastAPI(), celery_app = Celery(...)).
@api.post("/process")
def start_processing(path: str):
    segments = split_by_duration(path, duration=300)  # 5-min chunks
    for start, end in segments:
        process_segment.delay(path, start, end)
```

```python
# Celery task: each worker downloads only its byte range and runs the AI pipeline.
@celery_app.task
def process_segment(file_path, start, end):
    url = generate_signed_url(file_path)
    segment_path = download_segment(url, start, end)
    result = run_ai_tasks(segment_path)
    store_results(result)
```
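The pseudo-code glosses over how time-based segments become byte ranges. One simple, admittedly approximate, way is to read the video's duration and size and split the byte range proportionally. The sketch below assumes a roughly constant bitrate, assumes ffprobe is installed, and reuses the generate_signed_url helper from above:

```python
import math
import subprocess
from google.cloud import storage

def split_by_duration(blob_name: str, duration: int = 300) -> list[tuple[int, int]]:
    # Approximation: with a roughly constant bitrate, equal time slices map
    # to roughly equal byte slices, which is enough to fan work out evenly.
    bucket = storage.Client().bucket("vidsimplify-uploads")
    blob = bucket.blob(blob_name)
    blob.reload()  # fetch object metadata, including its size in bytes
    total_secs = float(subprocess.check_output([
        "ffprobe", "-v", "error",
        "-show_entries", "format=duration",
        "-of", "default=noprint_wrappers=1:nokey=1",
        generate_signed_url(blob_name),  # ffprobe can read straight from a signed URL
    ]))
    n = max(1, math.ceil(total_secs / duration))
    step = blob.size // n
    return [
        (i * step, blob.size - 1 if i == n - 1 else (i + 1) * step - 1)
        for i in range(n)
    ]
```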
System Architecture Diagram

```plaintext
┌───────────────────────────────────────────────────┐
│                    Frontend UI                     │
│            (Chunk uploader + progress)             │
└─────────────────────────┬─────────────────────────┘
                          │
┌─────────────────────────▼─────────────────────────┐
│    Get Signed GCS Resumable URL (FastAPI API)      │
└─────────────────────────┬─────────────────────────┘
                          │
┌─────────────────────────▼─────────────────────────┐
│      Upload in Chunks to GCS (Resumable PUT)       │
└─────────────────────────┬─────────────────────────┘
                          │
┌─────────────────────────▼─────────────────────────┐
│            Google Cloud Storage Bucket             │
└─────────────────────────┬─────────────────────────┘
                          │
┌─────────────────────────▼─────────────────────────┐
│   Detect Upload Completion & Trigger Processing    │
└─────────────────────────┬─────────────────────────┘
                          │
┌─────────────────────────▼─────────────────────────┐
│ Launch Parallel Workers with Byte-Range Requests   │
└───────────────────────────────────────────────────┘
```
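The "detect upload completion" step can be as simple as polling the object from the backend once the frontend reports its final chunk; a GCS Pub/Sub notification on object finalize is the cleaner production option. A minimal polling sketch, where the bucket name and timeouts are illustrative:

```python
import time
from google.cloud import storage

def wait_for_upload(blob_name: str, timeout_s: int = 3600, poll_s: int = 10) -> bool:
    # Returns True once the object exists in the bucket, False on timeout.
    bucket = storage.Client().bucket("vidsimplify-uploads")
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        if bucket.blob(blob_name).exists():
            return True
        time.sleep(poll_s)
    return False
```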
UX Improvement: Chunk Progress and Transparency
To make things user-friendly, we added:
Real-time upload progress bar: Showing percentage uploaded and current speed
Upload stage indicators: "Initializing", "Uploading", "Finalizing", and "Processing"
Error awareness: If the upload is stuck, we inform users that it's likely due to their network, not our server
This transparency improved user trust and reduced customer support issues significantly.
Key Benefits for Startups
| Feature | Old Backend Upload | GCS Resumable Upload |
| --- | --- | --- |
| Server memory use | High | Minimal |
| Fault tolerance (resume uploads) | Manual | Native |
| Upload speed | Limited by backend | CDN-accelerated |
| Processing start | Post-upload | Instantly on chunk |
| Long video support | Complex | Streamlined |
| Parallel processing | Difficult | Seamless via ranges |
Final Thoughts
For us, the move to a cloud-first, chunked, and parallel processing pipeline was a natural evolution as our users uploaded longer and heavier videos. It simplified the backend, improved user experience, and drastically boosted our processing speed.
If you're building a media-heavy AI product, especially one dealing with large files, I highly recommend starting with this architecture or migrating to it before scale hits you hard.