Managing S3 object lifecycle policies with Python

In our previous article, we learned how to upload files to a bucket.
In this article, we will see how to manage the lifecycle of the objects in a bucket using Python.
There are quite a few options to choose from when selecting a lifecycle policy, but some of them cost money. For example, transitioning objects from one S3 storage class to another may incur charges because transitions are not included in the free tier. Deleting objects, on the other hand, does not cost anything. Since this series is not concerned with storage transitions, we will skip them and focus on the last three options:
- Expire current version of objects
This rule targets the current version of each object and automatically removes it after a specified number of days. It helps make sure that stale files don't stay in our bucket and build up storage costs.
{
    'ID': '<give a unique name to easily identify>',
    'Filter': {},  # an empty filter applies the rule to every object in the bucket
    'Status': 'Enabled',
    'Expiration': {'Days': 4}
}
- Permanently delete noncurrent versions of object
If versioning is enabled on the bucket, every update or overwrite turns the previous copy into a noncurrent version. A noncurrent-version expiration rule deletes those old versions once they have aged out, which helps us save on storage costs.
{
    'ID': '<give a unique name to easily identify>',
    'Filter': {},
    'Status': 'Enabled',
    'NoncurrentVersionExpiration': {
        'NoncurrentDays': 60
    },
}
- Delete expired object markers or incomplete multipart uploads
An expired object delete marker appears in a versioned bucket when every version of an object has been removed and only the delete marker is left behind. A dedicated rule can clean up these markers so the bucket view stays tidy.
Aborting incomplete multipart uploads purges the parts of any multipart upload that was not completed within a set timeframe, which prevents orphaned parts from piling up and increasing the bill.
{
    'ID': '<give a unique name to easily identify>',
    'Filter': {},
    'Status': 'Enabled',
    # ExpiredObjectDeleteMarker belongs inside the Expiration block
    'Expiration': {'ExpiredObjectDeleteMarker': True},
    'AbortIncompleteMultipartUpload': {
        'DaysAfterInitiation': 3
    }
}
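The noncurrent-version and delete-marker rules above only have something to act on when the bucket is versioned. A minimal sketch of switching versioning on, assuming `s3_client` is a boto3 S3 client created elsewhere; the function name and its parameters are illustrative and not part of the original code:

```python
def enable_bucket_versioning(s3_client, bucket_name):
    """Turn on versioning so noncurrent versions and delete markers can exist."""
    # s3_client is assumed to be a boto3 S3 client supplied by the caller.
    s3_client.put_bucket_versioning(
        Bucket=bucket_name,
        VersioningConfiguration={"Status": "Enabled"},
    )
```

Passing the client in as an argument keeps the helper easy to try out against a stub before pointing it at a real bucket.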
Creating a lifecycle policy
With boto3, we use put_bucket_lifecycle_configuration to create a lifecycle configuration. Note that this call replaces any lifecycle configuration that already exists on the bucket.
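from botocore.exceptions import ClientError

# Method on the class that holds self.s3 (the boto3 client) and self.bucket_name
def apply_s3_lifecycle_configuration(self, lifecycle_configuration):
    try:
        # Replaces the bucket's whole lifecycle configuration with the given rules
        self.s3.put_bucket_lifecycle_configuration(
            Bucket=self.bucket_name,
            LifecycleConfiguration=lifecycle_configuration
        )
    except ClientError as e:
        print(f"Error: {e}")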
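The `lifecycle_configuration` argument is a dictionary with a single `Rules` list. A minimal sketch of assembling it from rules like the ones described earlier; the helper name `build_lifecycle_configuration`, the rule IDs, and the day counts are illustrative assumptions:

```python
def build_lifecycle_configuration(rules):
    """Wrap rule dictionaries in the shape put_bucket_lifecycle_configuration expects."""
    ids = [rule["ID"] for rule in rules]
    # Rule IDs have to be unique within one configuration.
    if len(ids) != len(set(ids)):
        raise ValueError("Each lifecycle rule needs a unique ID")
    return {"Rules": list(rules)}


# Hypothetical rule IDs and day counts, chosen only for illustration.
lifecycle_configuration = build_lifecycle_configuration([
    {
        "ID": "ExpireCurrentAfter4Days",
        "Filter": {},
        "Status": "Enabled",
        "Expiration": {"Days": 4},
    },
    {
        "ID": "DeleteNoncurrentAfter60Days",
        "Filter": {},
        "Status": "Enabled",
        "NoncurrentVersionExpiration": {"NoncurrentDays": 60},
    },
    {
        "ID": "AbortStaleMultipartUploads",
        "Filter": {},
        "Status": "Enabled",
        "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 3},
    },
])
```

The resulting dictionary can then be passed to `apply_s3_lifecycle_configuration` on an instance of the class that holds `self.s3` and `self.bucket_name`.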
After running the code, the new rules should show up in the bucket's lifecycle configuration if we check the console.
You can also check it from your Python code:
def get_lifecycle_configurate(self):
    configs = self.s3.get_bucket_lifecycle_configuration(Bucket=self.bucket_name)
    print(configs["Rules"])
The output should look something like this:
[{'Expiration': {'Days': 3}, 'ID': 'DeleteUploadsAfter3Days', 'Filter': {'Prefix': 'aws-demo-fruit'}, 'Status': 'Enabled'}, {'ID': 'ExpireNonVersioned', 'Filter': {}, 'Status': 'Enabled', 'NoncurrentVersionExpiration': {'NoncurrentDays': 3}}]
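If the bucket has no lifecycle configuration at all, `get_bucket_lifecycle_configuration` raises an error with the code `NoSuchLifecycleConfiguration` instead of returning an empty list. A hedged sketch of a reader that treats that case as "no rules": the function name is hypothetical, `s3_client` is assumed to be a boto3 S3 client, and the broad `except Exception` is only there to keep the sketch free of extra imports (in real code, catching `botocore.exceptions.ClientError` would be the more precise choice).

```python
def get_lifecycle_rules(s3_client, bucket_name):
    """Return the bucket's lifecycle rules, or an empty list if none are set."""
    try:
        response = s3_client.get_bucket_lifecycle_configuration(Bucket=bucket_name)
    except Exception as error:
        # boto3 client errors carry the service error code in error.response.
        code = getattr(error, "response", {}).get("Error", {}).get("Code")
        if code == "NoSuchLifecycleConfiguration":
            return []
        raise  # anything else (permissions, wrong bucket name) is a real problem
    return response.get("Rules", [])
```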
Deleting the lifecycle configuration
Deleting a single rule with the boto3 client is a bit tricky, since there is no call that removes one rule on its own. put_bucket_lifecycle_configuration always replaces the whole configuration, so we have to fetch the existing rules, drop the one we don't want, and reapply the rest. First, we get all the rules on the bucket:
response = s3.get_bucket_lifecycle_configuration(Bucket="<our bucket>")  # fetches the current lifecycle configuration; nothing is deleted yet
rules = response["Rules"]
We filter out the one we don't need:
needed_rules = [r for r in rules if r.get('ID') != rule_id]  # rule_id is the ID of the rule to remove
Then we reapply the ones we need:
s3.put_bucket_lifecycle_configuration(
    Bucket="<our bucket>",
    LifecycleConfiguration={'Rules': needed_rules}
)
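The three steps above can be wrapped into one helper. This is a sketch under a few assumptions: the function name is made up for illustration, `s3_client` is a boto3 S3 client passed in by the caller, and the fallback to `delete_bucket_lifecycle` is there because, as far as I know, S3 does not accept a configuration with an empty `Rules` list.

```python
def remove_lifecycle_rule(s3_client, bucket_name, rule_id):
    """Remove one rule by ID and keep every other rule on the bucket."""
    response = s3_client.get_bucket_lifecycle_configuration(Bucket=bucket_name)
    rules = response["Rules"]
    needed_rules = [rule for rule in rules if rule.get("ID") != rule_id]

    if len(needed_rules) == len(rules):
        return False  # no rule matched, so leave the bucket untouched

    if needed_rules:
        # Reapply the remaining rules; this overwrites the old configuration.
        s3_client.put_bucket_lifecycle_configuration(
            Bucket=bucket_name,
            LifecycleConfiguration={"Rules": needed_rules},
        )
    else:
        # Nothing left to keep, so remove the lifecycle configuration entirely.
        s3_client.delete_bucket_lifecycle(Bucket=bucket_name)
    return True
```

The boolean return value makes it easy for the caller to tell whether a rule with that ID was actually found.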
The delete action might take a while to show up, sometimes up to 24 hours, so be patient if you do not see an immediate result.
Check the AWS documentation for further learning.
As always, the full code is on my GitHub.
Written by Saurab Dahal
