Large File Uploads: Choosing Between Streaming and Buffering

Handling large file uploads is a common challenge when building systems that deal with media, backups, or user-generated content. Once your files start exceeding your machine’s RAM (or, in some cases, even its disk), naive upload methods break down fast. This article explores the approaches you can use to upload large files, especially when the file size exceeds RAM, and how each strategy behaves across VM-based and serverless environments.
We’ll walk through three common techniques:
In-memory buffering
Temporary disk buffering
Streaming uploads
1. In-Memory Buffering
How it works:
You load the entire file into memory (e.g., a Buffer or byte array) before uploading it to the provider.
Pros
Simple and fast for small-to-medium files
Code is easier to write/debug
Cons
Fails or crashes if the file is larger than available memory
Not suitable for serverless, where memory is capped (e.g., 512MB)
Use when
Files are small or predictable in size
You control the infrastructure and can scale memory
You need to inspect the whole file in memory
Code Example
// Imports used by the examples in this article.
import (
	"context"
	"io"
	"mime/multipart"
	"os"

	"cloud.google.com/go/storage"
)

// uploadInMemory reads the entire file into a byte slice, then writes it to
// Cloud Storage in one shot. Memory usage grows linearly with fileSize.
func uploadInMemory(ctx context.Context, file multipart.File, fileSize int64, bucketName, objectName string) error {
	buffer := make([]byte, fileSize)
	// io.ReadFull keeps reading until the buffer is full; a single file.Read
	// call may return fewer bytes than requested.
	if _, err := io.ReadFull(file, buffer); err != nil {
		return err
	}
	client, err := storage.NewClient(ctx)
	if err != nil {
		return err
	}
	defer client.Close()
	wc := client.Bucket(bucketName).Object(objectName).NewWriter(ctx)
	if _, err := wc.Write(buffer); err != nil {
		wc.Close()
		return err
	}
	return wc.Close()
}
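For context, here is a minimal sketch of how a function like this might be wired into an HTTP handler (add net/http to the imports above). The "file" form field and "my-bucket" are illustrative assumptions, not part of the snippet itself:
// Hypothetical handler wiring; the "file" form field and "my-bucket"
// bucket name are assumptions for illustration.
func handleUpload(w http.ResponseWriter, r *http.Request) {
	file, header, err := r.FormFile("file")
	if err != nil {
		http.Error(w, "bad multipart upload", http.StatusBadRequest)
		return
	}
	defer file.Close()

	// header.Size supplies the fileSize argument used above.
	if err := uploadInMemory(r.Context(), file, header.Size, "my-bucket", header.Filename); err != nil {
		http.Error(w, "upload failed", http.StatusInternalServerError)
		return
	}
	w.WriteHeader(http.StatusCreated)
}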
2. Disk-Based Temporary Buffering
How it works:
You store the file on disk temporarily (e.g., in /tmp) before uploading it in full.
Pros
Works with files larger than RAM
More stable than in-memory on constrained environments
Cons
Needs disk space, which is ephemeral on serverless (e.g., Google Cloud Functions 2nd gen caps /tmp at 2GB)
Slower than in-memory due to disk I/O
Use when
You're running in a VM or container with disk access
The file is too large for memory but fits on local disk
Code Example
// uploadFromDisk spools the upload to a temporary file first, then streams
// that file to Cloud Storage. Memory stays flat; disk must fit the file.
func uploadFromDisk(ctx context.Context, file multipart.File, bucketName, objectName string) error {
	tempFile, err := os.CreateTemp("", "upload-*.tmp")
	if err != nil {
		return err
	}
	defer os.Remove(tempFile.Name()) // clean up the spool file
	defer tempFile.Close()

	// Copy the incoming file to disk in chunks.
	if _, err := io.Copy(tempFile, file); err != nil {
		return err
	}
	// Rewind to the start instead of reopening the file.
	if _, err := tempFile.Seek(0, io.SeekStart); err != nil {
		return err
	}

	client, err := storage.NewClient(ctx)
	if err != nil {
		return err
	}
	defer client.Close()

	wc := client.Bucket(bucketName).Object(objectName).NewWriter(ctx)
	if _, err := io.Copy(wc, tempFile); err != nil {
		return err
	}
	return wc.Close()
}
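Because /tmp is small and ephemeral on serverless, it's worth rejecting oversized request bodies before they fill the disk. Here is a minimal sketch using the standard library's http.MaxBytesReader; the 2GB limit mirrors the /tmp cap mentioned above and is an assumption you should tune:
// Hypothetical guard: cap the request body so an oversized upload fails
// fast instead of exhausting /tmp. The limit value is an assumption.
const maxUploadBytes = 2 << 30 // 2GB

func withBodyLimit(next http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		r.Body = http.MaxBytesReader(w, r.Body, maxUploadBytes)
		next(w, r)
	}
}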
3. Streaming Upload
How it works:
Stream the file directly from the input (request or file source) to the storage provider without buffering the whole file.
Pros:
Handles truly massive files
Minimal RAM usage
Ideal for serverless and modern cloud-native apps
Cons:
Slightly more complex implementation (you manage streams, backpressure, etc.)
Limited control over retries if the connection drops mid-upload
Use when:
File size is unpredictable
You care about performance, cost, and stability
You want maximum scalability
Code Example
// uploadStreamed pipes the incoming data straight into the Cloud Storage
// writer, so only a small internal buffer is held in memory at any time.
// An io.Reader is enough here: the data is read once, sequentially
// (a multipart.File satisfies this).
func uploadStreamed(ctx context.Context, file io.Reader, bucketName, objectName string) error {
	client, err := storage.NewClient(ctx)
	if err != nil {
		return err
	}
	defer client.Close()

	wc := client.Bucket(bucketName).Object(objectName).NewWriter(ctx)
	// wc.ChunkSize controls the writer's internal buffer (16MB by default);
	// lower it to trade upload throughput for a smaller memory footprint.
	if _, err := io.Copy(wc, file); err != nil {
		return err
	}
	return wc.Close()
}
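One caveat: if the handler obtains the file via r.FormFile, the standard library has already buffered the part to memory or a temp file before uploadStreamed ever runs. For end-to-end streaming you can read the multipart stream directly. A sketch, assuming the file arrives as the first part:
// Hypothetical end-to-end streaming handler. r.MultipartReader gives us
// the raw multipart stream without any intermediate buffering.
func handleStreamedUpload(w http.ResponseWriter, r *http.Request) {
	mr, err := r.MultipartReader()
	if err != nil {
		http.Error(w, "expected multipart body", http.StatusBadRequest)
		return
	}
	part, err := mr.NextPart() // assumption: the file is the first part
	if err != nil {
		http.Error(w, "missing file part", http.StatusBadRequest)
		return
	}
	defer part.Close()

	if err := uploadStreamed(r.Context(), part, "my-bucket", part.FileName()); err != nil {
		http.Error(w, "upload failed", http.StatusInternalServerError)
		return
	}
	w.WriteHeader(http.StatusCreated)
}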
Conclusion
Handling large file uploads effectively requires different strategies depending on the environment and constraints. In-memory buffering is simple but limited by RAM size, making it unsuitable for serverless environments. Disk-based buffering allows handling of larger files than RAM can accommodate but requires disk access and is slower due to I/O operations. Streaming uploads offer the best solution for handling large, unpredictable file sizes with minimal RAM use, suited for serverless and cloud-native applications. Each approach has its pros, cons, and suitable use cases.
When handling large file uploads, avoid buffering into memory unless absolutely necessary. For modern apps, streaming uploads are the most scalable and efficient method, especially when working within the constraints of serverless environments.
If you want resumability and reliability, Google Cloud Storage's native resumable uploads are automatically used when you write via Writer. For reducing backend load altogether, pre-signed URLs are a great client-side strategy.
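As a sketch of the pre-signed URL idea: the backend hands out a short-lived URL and the client PUTs the file straight to Cloud Storage. The method, expiry, and bucket name here are illustrative; this uses BucketHandle.SignedURL from a recent cloud.google.com/go/storage client, and depending on how the client authenticates you may also need to set GoogleAccessID and PrivateKey in the options:
// Hypothetical signed-URL generation; add net/http and time to the imports.
func generateUploadURL(ctx context.Context, bucketName, objectName string) (string, error) {
	client, err := storage.NewClient(ctx)
	if err != nil {
		return "", err
	}
	defer client.Close()

	// The client uploads directly to this URL, bypassing our backend.
	return client.Bucket(bucketName).SignedURL(objectName, &storage.SignedURLOptions{
		Method:  http.MethodPut,
		Expires: time.Now().Add(15 * time.Minute),
		Scheme:  storage.SigningSchemeV4,
	})
}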
Let the infrastructure work with you, not against your RAM.