- 09 Jan 2023
Deleting Large Buckets via Lifecycle Policies
- Updated on 09 Jan 2023
When a bucket holds tens of millions of objects, deleting them with multi-object delete (at most 1000 keys per delete request) can take longer than client application or load balancer timeouts allow.
To delete these massive buckets without risking failures or timeouts, set a bucket Lifecycle Policy.
The Lifecycle processes run by the backend storage nodes then delete the objects in the background.
This also lets the tenant handle the deletion without involving an operator/admin.
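To put the request count in perspective, here is a rough back-of-the-envelope sketch in Python. The figure of 50 million objects and the 0.5 seconds per delete request are assumptions chosen for illustration, not measurements; only the 1000-key limit comes from the text above.

```python
MAX_KEYS_PER_REQUEST = 1000  # multi-object delete limit mentioned above


def delete_requests_needed(object_count, batch_size=MAX_KEYS_PER_REQUEST):
    """Number of multi-object delete requests needed (integer ceiling division)."""
    return (object_count + batch_size - 1) // batch_size


def estimated_hours(object_count, seconds_per_request):
    """Total time in hours if the requests are sent one after another."""
    return delete_requests_needed(object_count) * seconds_per_request / 3600


# Assumed example values: 50 million objects, 0.5 s per request.
print(delete_requests_needed(50_000_000))          # 50000
print(round(estimated_hours(50_000_000, 0.5), 1))  # 6.9
```

Under these assumed numbers a client-driven delete would run for hours, well beyond typical request or session timeouts, which is the case a Lifecycle Policy avoids.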
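Below are two examples of the same policy, one in JSON for awscli and one in XML for s3cmd. The policy expires current objects after 1 day, expires noncurrent (prior) versions after 3 days, and aborts incomplete multipart uploads 1 day after they were initiated. Both examples assume the clients are already configured with the appropriate credentials.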
Once all the objects are deleted from the bucket, the bucket can be deleted as normal.
awscli
expire.json
{
  "Rules": [
    {
      "ID": "delete-all-objects",
      "Prefix": "",
      "Status": "Enabled",
      "Expiration": {
        "Days": 1
      }
    },
    {
      "ID": "delete-prior-versions",
      "Prefix": "",
      "Status": "Enabled",
      "Expiration": {
        "Days": 1
      },
      "NoncurrentVersionExpiration": {
        "NoncurrentDays": 3
      }
    },
    {
      "ID": "delete-incomplete-multipart-uploads",
      "Prefix": "",
      "Status": "Enabled",
      "AbortIncompleteMultipartUpload": {
        "DaysAfterInitiation": 1
      }
    }
  ]
}
Then run the command:
aws s3api put-bucket-lifecycle --bucket <bucketname> --lifecycle-configuration file://expire.json
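A local sanity check of expire.json before uploading can catch a malformed file early. The following is a minimal sketch using only Python's standard library; the function name `check_lifecycle_policy` and the particular checks are assumptions made for illustration and do not reproduce the validation the storage service itself performs.

```python
import json

# Actions used by the rules in expire.json; other actions are not checked here.
KNOWN_ACTIONS = (
    "Expiration",
    "NoncurrentVersionExpiration",
    "AbortIncompleteMultipartUpload",
)


def check_lifecycle_policy(text):
    """Return a list of problems found in a lifecycle policy given as JSON text."""
    try:
        policy = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]

    rules = policy.get("Rules") if isinstance(policy, dict) else None
    if not isinstance(rules, list) or not rules:
        return ['top-level "Rules" must be a non-empty list']

    problems = []
    for index, rule in enumerate(rules):
        if not isinstance(rule, dict):
            problems.append(f"rule #{index}: must be a JSON object")
            continue
        label = rule.get("ID", f"rule #{index}")
        if rule.get("Status") not in ("Enabled", "Disabled"):
            problems.append(f'{label}: "Status" must be "Enabled" or "Disabled"')
        if not any(action in rule for action in KNOWN_ACTIONS):
            problems.append(f"{label}: no expiration action found")
    return problems


# Example use against the file from this article:
# with open("expire.json") as handle:
#     print(check_lifecycle_policy(handle.read()) or "no problems found")
```

An empty list means none of these basic checks failed; it does not guarantee that the service will accept the policy.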
s3cmd
lifecycle_policy.xml
<LifecycleConfiguration>
  <Rule>
    <ID>delete-all-objects</ID>
    <Prefix></Prefix>
    <Status>Enabled</Status>
    <Expiration>
      <Days>1</Days>
    </Expiration>
  </Rule>
  <Rule>
    <ID>delete-prior-versions</ID>
    <Prefix></Prefix>
    <Status>Enabled</Status>
    <Expiration>
      <Days>1</Days>
    </Expiration>
    <NoncurrentVersionExpiration>
      <NoncurrentDays>3</NoncurrentDays>
    </NoncurrentVersionExpiration>
  </Rule>
  <Rule>
    <ID>delete-incomplete-multipart-uploads</ID>
    <Prefix></Prefix>
    <Status>Enabled</Status>
    <AbortIncompleteMultipartUpload>
      <DaysAfterInitiation>1</DaysAfterInitiation>
    </AbortIncompleteMultipartUpload>
  </Rule>
</LifecycleConfiguration>
Then run the command:
s3cmd setlifecycle lifecycle_policy.xml s3://<bucketname>
You will see output similar to this:
s3://bucketname/: Lifecycle Policy updated
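After the Lifecycle processes have had time to run, you may want to confirm that the bucket really is empty before deleting it. The sketch below assumes a boto3-style S3 client and that the storage service supports the standard `list_object_versions`, `list_multipart_uploads` and `delete_bucket` calls; the helper names and the endpoint URL in the usage comment are hypothetical.

```python
def bucket_is_empty(client, bucket):
    """True when no object versions, delete markers, or multipart uploads remain."""
    versions = client.list_object_versions(Bucket=bucket, MaxKeys=1)
    if versions.get("Versions") or versions.get("DeleteMarkers"):
        return False
    uploads = client.list_multipart_uploads(Bucket=bucket, MaxUploads=1)
    return not uploads.get("Uploads")


def delete_bucket_if_empty(client, bucket):
    """Delete the bucket only if it is empty; return True if it was deleted."""
    if not bucket_is_empty(client, bucket):
        return False
    client.delete_bucket(Bucket=bucket)
    return True


# Example use (requires boto3 and configured credentials):
# import boto3
# client = boto3.client("s3", endpoint_url="https://s3.example.com")  # hypothetical endpoint
# print(delete_bucket_if_empty(client, "bucketname"))
```

If the function returns False, objects, versions, delete markers or unfinished uploads are still present, and the check can simply be repeated later.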