S3 Object Versioning Responsibly


I’ll start off with what I’m not saying - I’m not saying to disable replication of your objects in your storage provider. As with any other service in $cloud_of_choice, there are a lot of knobs to turn, and there’s a time and a place for each of them. In this segment, I’ll talk about S3 versioning and its responsible implementation. I’ll primarily be speaking in AWS terms - as that’s the flavor of the month - but these features are widespread and shouldn’t vary wildly between providers.
The What
S3 versioning is a feature used to keep copies of objects as they change. Any time an object is uploaded, overwritten, or deleted, a new “version” is created. It’s enabled at the bucket level. It’s helpful for backup scenarios, keeping copies of changing files, and as a CYA in oops moments. Other clouds have this feature as well: Azure calls it blob versioning, and GCP calls it object versioning in Google Cloud Storage (GCS).
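To make that concrete, here’s a quick AWS CLI sketch (hello.txt and $BUCKET_NAME are placeholders) showing that overwriting a key on a versioned bucket stacks up versions rather than replacing the object:

# Upload the same key twice on a versioned bucket
echo "version one" > hello.txt
aws s3 cp hello.txt "s3://$BUCKET_NAME/hello.txt"
echo "version two" > hello.txt
aws s3 cp hello.txt "s3://$BUCKET_NAME/hello.txt"

# Both versions still exist - only one is marked "IsLatest"
aws s3api list-object-versions --bucket "$BUCKET_NAME" --prefix "hello.txt"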
The How
To enable object versioning in AWS, here are the basic steps in the console:
1. Sign in to the AWS Management Console and open the Amazon S3 console at https://console.aws.amazon.com/s3/.
2. In the left navigation pane, choose General purpose buckets.
3. In the buckets list, choose the name of the bucket that you want to enable versioning for.
4. Choose Properties.
5. Under Bucket Versioning, choose Edit.
6. Choose Suspend or Enable, and then choose Save changes.
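If you’d rather script it, the AWS CLI equivalent is a one-liner (a quick sketch - $BUCKET_NAME is a placeholder for your bucket):

# Enable versioning on an existing bucket
aws s3api put-bucket-versioning \
  --bucket "$BUCKET_NAME" \
  --versioning-configuration Status=Enabled

# Confirm it took effect (should print "Status": "Enabled")
aws s3api get-bucket-versioning --bucket "$BUCKET_NAME"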
If you’re using Terraform or OpenTofu, here’s a code example:
resource "aws_s3_bucket" "example" {
bucket = "example-bucket"
}
resource "aws_s3_bucket_acl" "example" {
bucket = aws_s3_bucket.example.id
acl = "private"
}
resource "aws_s3_bucket_versioning" "versioning_example" {
bucket = aws_s3_bucket.example.id
versioning_configuration {
status = "Enabled"
}
}
Now, if you’re using a Terraform/OpenTofu module for the implementation, it could look something like this:
module "s3_bucket" {
source = "terraform-aws-modules/s3-bucket/aws"
version = "4.6.1"
bucket = local.bucket_name
tags = {
Owner = "Robservations"
}
versioning = {
status = true
mfa_delete = false
}
}
So at this point, you have a bucket, you’ve enabled versioning, you’re safe and sound.
The Why (tf is it so expensive)
Like many things in AWS, if you implement what you’d like without some sort of cleanup mechanism, things will get costly. On paper it seems trivial: if you keep copies of everything, it’s going to get expensive. Let’s step through a hypothetical situation:
Artifacts and Expirations (a purely fictional story)
You’re the operator of a self-hosted tool that runs CI/CD jobs (on a large scale) and stores all of its items in S3. So we’ll say we’re just looking at an “artifacts” bucket:
| Stored Item | Source | Notes |
| --- | --- | --- |
| .zip or .gz files | From artifacts: in .gitlab-ci.yml | These are the output of artifacts:paths from jobs |
| Test Reports | artifacts:reports in pipeline jobs | JUnit, code quality, coverage, accessibility, etc. |
| Manual Uploads | Jobs that explicitly store something | E.g. binaries, docker images (rare but happens) |
| Pipeline Metadata | Metadata files alongside artifacts | Pipeline refs, trace data (sometimes) |
So you watch this bucket grow. It becomes unruly. You realize that you have two issues you’re up against:
1. The bucket is growing, and appears to be forever growing.
2. You don’t have a backup strategy.
So you, being the devops wizard that you are, decide to look into the application to solve part one of this problem. You’ve identified there’s a setting for expiring the artifacts after $x number of days - this is a great option to stop the bleeding. You set it to 30 days, and the bucket should now have a predictable size and growth pattern (FinOps teams love this). So #1 is done. 🎉
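If that self-hosted tool happens to speak .gitlab-ci.yml (as the artifacts table above hints), the per-job flavor of that setting is artifacts:expire_in. A minimal sketch with a hypothetical job:

build:
  script:
    - make build
  artifacts:
    paths:
      - dist/
    # Artifacts are cleaned up 30 days after the job finishes
    expire_in: 30 days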
Now for some backups - because just in case. There are a lot of ways to skin this, but let’s say that you’ve settled on versioning, and there are guardrails to protect against bucket deletion, etc. So at this point, you’ve verified that your bucket has versioning enabled. Issue #2 is solved! 🎉🎉
The following shows what happens when a version of an object is uploaded or modified in place, a few times.
The important piece to note here is the versions and their sizes. In this case, the original file size was 5.8 MB, the in-between object (version tAO…) is 8.7 KB, and the active object is 42.8 KB. It should come as no surprise that the total storage for this is the combination of all three, and that’s how she goes.
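If you want to see the same picture from the command line, listing the versions of a single key shows each version’s ID, size, and which one is active (a sketch - bucket and key are placeholders):

# List every version of one object: ID, size in bytes, and latest flag
aws s3api list-object-versions \
  --bucket "$BUCKET_NAME" \
  --prefix "car.jpg" \
  --query 'Versions[].[VersionId, Size, IsLatest]' \
  --output table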
Where It Gets Tricky
The versioning we’ve seen is doing what it’s intended to do. Now, I’ve decided that the files are too big and this car.jpg has tipped the scales on my bill. Time to nuke the file - inside of AWS or programmatically. Once it’s gone, you get the warm and fuzzy - job is safe, no more 💸 - but the bill doesn’t change. The bleeding continues. The bucket states Objects (0) and the bill says 📈.
Even though you’ve deleted the objects, their versions remain. The versioning that was enabled to protect us is doing its job - keeping versions of the objects. A delete on a versioned bucket doesn’t actually remove anything; it just places a delete marker on top of the version stack. While this seems obvious, it’s an easy item to overlook when analyzing your cloud spend. So in order to really, really delete the file, you must delete the noncurrent versions and the delete marker as well.
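For completeness, here’s a hedged AWS CLI sketch of what “really, really delete” looks like for a single key - it permanently removes every version and every delete marker. It’s illustrative, not hardened for edge cases, and it is irreversible:

# WARNING: irreversible. KEY and $BUCKET_NAME are placeholders.
KEY="car.jpg"

# Delete every stored version of the object...
for vid in $(aws s3api list-object-versions \
    --bucket "$BUCKET_NAME" --prefix "$KEY" \
    --query 'Versions[].VersionId' --output text); do
  aws s3api delete-object --bucket "$BUCKET_NAME" --key "$KEY" --version-id "$vid"
done

# ...then delete the delete markers themselves
for vid in $(aws s3api list-object-versions \
    --bucket "$BUCKET_NAME" --prefix "$KEY" \
    --query 'DeleteMarkers[].VersionId' --output text); do
  aws s3api delete-object --bucket "$BUCKET_NAME" --key "$KEY" --version-id "$vid"
done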
Some tools to help identify these
If you’re using the AWS CLI and have a small bucket, you can evaluate the objects as such:
aws s3api list-object-versions --bucket $BUCKET_NAME
This will cause your screen to turn to text salad, eyes will glaze over, and while the data is great, this won’t be useful unless used programmatically.
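One way to tame the text salad is to let the CLI do the filtering with --query (JMESPath). As a sketch, this pulls out just the noncurrent versions and the delete markers - the two things quietly inflating the bill:

# Noncurrent versions: key, version ID, and size in bytes
aws s3api list-object-versions \
  --bucket "$BUCKET_NAME" \
  --query 'Versions[?IsLatest==`false`].[Key, VersionId, Size]' \
  --output table

# Delete markers are reported separately
aws s3api list-object-versions \
  --bucket "$BUCKET_NAME" \
  --query 'DeleteMarkers[].[Key, VersionId]' \
  --output table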
A more reasonable option for a bucket with more data would be creating an S3 Inventory Report.
Enabling S3 Inventory on your bucket will give you daily or weekly CSV/ORC/Parquet reports of all objects including:
Object key
Size
Last modified
Is latest version
Storage class
Non-current flag (if versioned)
Steps:
Go to your bucket → Management tab
Click "Create Inventory"
Choose destination bucket and file format
Enable versioning metadata
Wait for the report (can be daily/weekly)
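If you’d rather manage the report as code, the AWS provider has an aws_s3_bucket_inventory resource. A minimal sketch, reusing the aws_s3_bucket.example resource from earlier and assuming a separate (hypothetical) reports bucket as the destination:

resource "aws_s3_bucket_inventory" "versions_report" {
  bucket = aws_s3_bucket.example.id
  name   = "weekly-versions-report"

  # Include every version, not just the current ones
  included_object_versions = "All"

  schedule {
    frequency = "Weekly"
  }

  destination {
    bucket {
      format     = "CSV"
      bucket_arn = aws_s3_bucket.inventory_reports.arn # hypothetical reports bucket
    }
  }

  optional_fields = ["Size", "LastModifiedDate", "StorageClass"]
}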
Fixing This
What you’ll want to do is look at lifecycle rules.
Create a Lifecycle Rule in AWS Console
1. Go to the S3 Console
Navigate to: https://s3.console.aws.amazon.com/s3/
Choose the bucket where you want to apply the lifecycle rule.
2. Open the “Management” Tab
Once in the bucket view, click on the “Management” tab.
Scroll down to "Lifecycle rules".
Click “Create lifecycle rule”.
3. Name Your Rule
Give it a descriptive name like: Expire-NonCurrent-Versions. Optionally, add a tag filter or prefix if you want the rule to apply only to a subset of objects (e.g., all objects under logs/).
4. Choose Rule Scope
Choose “Apply to all objects in the bucket” unless you're targeting a specific prefix or tag.
5. Set Lifecycle Rule Actions
Here’s where you define what happens to non-current versions:
✅ Check “Permanently delete noncurrent versions of objects”
Set “Days after objects become noncurrent” - a good default is 30 days.
(Optional) Check “Delete expired object delete markers” for added cleanup of the markers that deletes leave behind.
Example:
Permanently delete noncurrent versions after: 30 days
Delete expired object delete markers: enabled (optional; removes a marker once no versions remain behind it)
6. Review and Create
Review the rule summary.
Click “Create rule”.
Fixing This as Code
If you’re using the S3 module, the following Terraform will help keep these versions from spiraling into a financial burden. This example keeps noncurrent versions for only 30 days. You can, and should, adjust this to meet your application’s needs and service levels.
lifecycle_rule = [
  {
    id      = "remove-old-versions"
    enabled = true

    noncurrent_version_expiration = {
      days = 30
    }
  }
]
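If you’re not using the module, the same cleanup can be expressed directly with the provider’s aws_s3_bucket_lifecycle_configuration resource. A minimal sketch, assuming the aws_s3_bucket.example resource from the earlier example:

resource "aws_s3_bucket_lifecycle_configuration" "remove_old_versions" {
  bucket = aws_s3_bucket.example.id

  rule {
    id     = "remove-old-versions"
    status = "Enabled"

    # An empty filter applies the rule to every object in the bucket
    filter {}

    # Permanently delete noncurrent versions 30 days after they're replaced
    noncurrent_version_expiration {
      noncurrent_days = 30
    }

    # Clean up delete markers once no versions remain behind them
    expiration {
      expired_object_delete_marker = true
    }
  }
}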
In a Sentence
Object versioning is a good way to protect the objects in your bucket; however, left unchecked, versioned buckets can grow without bound over time, and with that growth comes significant cost.