How to remove multi-part-upload (MPU) data and free up bucket space


For larger file uploads, most S3 clients make use of the multi-part-upload (MPU) feature of the S3 protocol. This allows the client to break large files into smaller chunks, upload these smaller chunks, and re-try any chunks that failed without having to start over.
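As a quick illustration of the chunking arithmetic, the sketch below computes how many part uploads a client would issue for a given file and part size (the 100 MB part size is illustrative, not a client default):

```python
import math

MB = 1024 ** 2
GB = 1024 ** 3

def mpu_part_count(file_size, part_size):
    """Number of parts an MPU client would upload for a file,
    given its chosen part size (the last part may be smaller)."""
    return int(math.ceil(file_size / float(part_size)))

# A 5 GB file split into 100 MB parts needs 52 part uploads
# (5 * 1024 / 100 = 51.2, rounded up). Any one of those parts
# can be retried on failure without restarting the whole upload.
print(mpu_part_count(5 * GB, 100 * MB))  # 52
```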

Most S3 clients are good about cleaning up MPU data they no longer need, but if a connection drops or the client crashes, this data can be left behind. The orphaned data is generally never used again, but it silently consumes additional disk space on your account until it is removed. It is worth checking for and removing this MPU data if disk storage costs appear larger than expected.

Most S3 clients don’t have an MPU data purge feature, so the following example uses Python and the boto library to check for and clean up this data.

Step 1 — Create a .boto file to store your keys.

View the following external instructions on how to create a .boto config file. This will be used to store your DreamObjects keys.

There should now be a file named .boto in your user's home directory, which stores your DreamObjects keys.
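If you prefer to create the file by hand, a minimal .boto file has the shape below (the key values are placeholders; substitute your own DreamObjects keys):

```ini
[Credentials]
aws_access_key_id = YOUR_DREAMOBJECTS_ACCESS_KEY
aws_secret_access_key = YOUR_DREAMOBJECTS_SECRET_KEY
```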

Step 2 — Create the clean-up script

Create a file for the script via SSH. The following article explains how to do this.

You can then add the code below to this file. The script iterates over all buckets, checking for MPU data. If any is found, it displays the file name, the date it was uploaded, and its size, and then asks whether the data should be deleted.

Once the MPU data is deleted, it cannot be recovered. Please be sure you don’t need the data before removing it.

Clean-up script code

You do not need to adjust any of the code below since your keys are already stored in your .boto file from step 1 above.


import boto
from boto.s3.connection import OrdinaryCallingFormat

# Connect to DreamObjects (set host to your DreamObjects endpoint)
c = boto.connect_s3(host='', calling_format=OrdinaryCallingFormat())

# Iterate over all buckets
for b in c.get_all_buckets():
    print '\nBucket: ' + b.name

    # Check for MPU data and calculate the total storage used
    total_size = 0
    for mpu in b.get_all_multipart_uploads():
        ptotalsize = 0
        for p in mpu.get_all_parts():
            ptotalsize += p.size
        print mpu.initiated, mpu.key_name, ptotalsize, str(round(ptotalsize * 1.0 / 1024 ** 3, 2)) + 'GB'
        total_size += ptotalsize
    print 'Total: ' + str(round(total_size * 1.0 / 1024 ** 3, 2)) + 'GB'

    # If there is any usage, prompt to delete it and do so if requested
    if total_size > 0 and str(raw_input('Delete MPU data? (y/n) ')) == 'y':
        for mpu in b.get_all_multipart_uploads():
            mpu.cancel_upload()
        print 'MPU data deleted!'
    else:
        print 'No changes made to bucket.'

Clean-up script example output

Bucket: my-user-bucket
2024-04-02T19:36:21.072Z backups/ 0.1GB
Total: 0.1GB
Delete MPU data? (y/n) y
MPU data deleted!

Bucket: workbackup
Total: 0.00GB
No changes made to bucket.
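The GB figures in this output come from the script's rounding expression. As a quick illustration (the helper name is ours, not part of the script):

```python
def to_gb(nbytes):
    """Mirror the script's formatting: bytes -> gibibytes, two decimals."""
    return str(round(nbytes * 1.0 / 1024 ** 3, 2)) + 'GB'

# Roughly 107 MB of orphaned parts shows up as the 0.1GB seen above
print(to_gb(107374182))  # 0.1GB
```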

Step 3 — Run the file

While still logged into your server via SSH, run the file by using the following command.

[server]$ python
