Deleting Large Buckets via Lifecycle Policies
  • 09 Jan 2023


When a bucket holds tens of millions of objects, deleting them with multi-object delete requests (at most 1,000 keys per request) can take longer than client application or load balancer timeouts allow.
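
A rough calculation puts the scale in perspective. The sketch below assumes a hypothetical bucket of 50 million objects and a hypothetical average of 0.5 seconds per multi-object delete request; both figures are illustrative, not measurements, and the helper name is made up for this example.

```python
MAX_KEYS_PER_DELETE = 1000  # multi-object delete limit per request


def delete_requests_needed(object_count: int) -> int:
    """Number of multi-object delete requests needed to remove object_count objects."""
    # Integer ceiling division: a partially filled final batch still costs one request.
    return (object_count + MAX_KEYS_PER_DELETE - 1) // MAX_KEYS_PER_DELETE


# Hypothetical figures, for illustration only.
object_count = 50_000_000
seconds_per_request = 0.5

requests = delete_requests_needed(object_count)
hours = requests * seconds_per_request / 3600
print(f"{requests} requests, roughly {hours:.1f} hours if sent one after another")
```

Under these assumptions the client would have to keep a single session alive for 50,000 sequential requests, which is well beyond typical timeout settings.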

A way to delete such large buckets without the risk of failures or timeouts is to set a bucket Lifecycle Policy.

The lifecycle processes that run on the backend storage nodes then delete the objects in the background.

This also lets the tenant handle the deletion without involving an operator or admin.

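Below are two examples of the same policy, one in JSON for awscli and one in XML for s3cmd. The policy expires current objects after 1 day, expires noncurrent versions after 3 days, and aborts incomplete multipart uploads after 1 day. Both examples assume the client is already configured with the appropriate credentials.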

Once all the objects are deleted from the bucket, the bucket can be deleted as normal.


awscli

expire.json

{
  "Rules": [
    {
      "ID": "delete-all-objects",
      "Prefix": "",
      "Status": "Enabled",
      "Expiration": {
        "Days": 1
      }
    },
    {
      "ID": "delete-prior-versions",
      "Prefix": "",
      "Status": "Enabled",
      "Expiration": {
        "Days": 1
      },
      "NoncurrentVersionExpiration": {
        "NoncurrentDays": 3
      }
    },
    {
      "ID": "delete-incomplete-multipart-uploads",
      "Prefix": "",
      "Status": "Enabled",
      "AbortIncompleteMultipartUpload": {
        "DaysAfterInitiation": 1
      }
    }
  ]
}

Then run the command:

aws s3api put-bucket-lifecycle --bucket <bucketname> --lifecycle-configuration file://expire.json
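
As an optional sanity check before applying the policy, the file can be inspected to confirm that every rule is enabled, covers the whole bucket, and carries at least one deletion action. This is a minimal sketch, not part of awscli; the helper names `check_rules` and `check_policy_file` are hypothetical, and the checks only reflect the policy shown above.

```python
import json

# Rule keys that cause something to be deleted or aborted.
ACTION_KEYS = (
    "Expiration",
    "NoncurrentVersionExpiration",
    "AbortIncompleteMultipartUpload",
)


def check_rules(policy: dict) -> list:
    """Return a list of problems found in a lifecycle policy dict (empty if none)."""
    problems = []
    rules = policy.get("Rules", [])
    if not rules:
        problems.append("policy has no rules")
    for rule in rules:
        rule_id = rule.get("ID", "<no ID>")
        if rule.get("Status") != "Enabled":
            problems.append(f"{rule_id}: rule is not enabled")
        if rule.get("Prefix", "") != "":
            problems.append(f"{rule_id}: prefix is not empty, so only part of the bucket is covered")
        if not any(key in rule for key in ACTION_KEYS):
            problems.append(f"{rule_id}: rule has no deletion action")
    return problems


def check_policy_file(path: str) -> list:
    """Load a JSON policy file and run check_rules on it."""
    with open(path) as f:
        return check_rules(json.load(f))
```

Calling `check_policy_file("expire.json")` returns an empty list when none of these problems are found. It does not guarantee that the storage service accepts the policy.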



s3cmd

lifecycle_policy.xml

<LifecycleConfiguration>
    <Rule>
        <ID>delete-all-objects</ID>
        <Prefix></Prefix>
        <Status>Enabled</Status>
        <Expiration>
            <Days>1</Days>
        </Expiration>
    </Rule>
    <Rule>
        <ID>delete-prior-versions</ID>
        <Prefix></Prefix>
        <Status>Enabled</Status>
        <Expiration>
            <Days>1</Days>
        </Expiration>
        <NoncurrentVersionExpiration>
            <NoncurrentDays>3</NoncurrentDays>
        </NoncurrentVersionExpiration>
    </Rule>
    <Rule>
        <ID>delete-incomplete-multipart-uploads</ID>
        <Prefix></Prefix>
        <Status>Enabled</Status>
        <AbortIncompleteMultipartUpload>
            <DaysAfterInitiation>1</DaysAfterInitiation>
        </AbortIncompleteMultipartUpload>
    </Rule>
</LifecycleConfiguration>


Then run the command:

s3cmd setlifecycle lifecycle_policy.xml s3://<bucketname>


You will see output similar to this:

s3://bucketname/: Lifecycle Policy updated
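
Because the lifecycle processes work in the background, it is not obvious when the bucket is finally empty and can be deleted. The sketch below shows one way to check, assuming an S3 client object that offers boto3-style `list_object_versions` and `list_multipart_uploads` calls and a storage service that supports them; the function name `bucket_is_empty` is hypothetical.

```python
def bucket_is_empty(client, bucket: str) -> bool:
    """Return True if the bucket holds no object versions, delete markers,
    or in-progress multipart uploads."""
    # Asking for a single entry is enough to tell "empty" from "not empty".
    versions = client.list_object_versions(Bucket=bucket, MaxKeys=1)
    if versions.get("Versions") or versions.get("DeleteMarkers"):
        return False
    uploads = client.list_multipart_uploads(Bucket=bucket, MaxUploads=1)
    return not uploads.get("Uploads")
```

With boto3 installed, `client` could be created with `boto3.client("s3")` using the same credentials as the command-line tools. Once the function returns True, the bucket can be deleted as normal.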
