The tale of the versioned AWS S3 Bucket and the Increasing S3 Bill

Simon Bennett · Sep 15, 2022

This story started with an email to SnapShooter support that raised my eyebrows. The subject line was: Increasing S3 Bill. An increasing S3 bill is not always unusual for customers who are still getting set up, as their assets grow and their daily, weekly, and monthly retention rules take time to reach their limits. A bucket will increase in size until it hits a natural level.

The first line of the email, however, did raise my heart rate.

It's just been flagged to me that our S3 bill is roughly going up £15 per month each, even though we've not added any new backups to SnapShooter.

I would be lying if I said I did not have a slight panic on their behalf. What if SnapShooter was not deleting old files correctly? The AWS S3 API is incredibly reliable, so did we send the correct delete requests?

I looked, and SnapShooter reported 160GB of data in the bucket. With the customer's permission, we used the S3 API to scan the bucket and check what the bucket itself was reporting.

Here is a quick and dirty way to scan a bucket and sum the file sizes in PHP.

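<?php

require 'vendor/autoload.php'; // Composer autoloader for the AWS SDK for PHP

use Aws\S3\Exception\S3Exception;
use Aws\S3\S3Client;

$size = 0; // Running total in bytes

$s3 = new S3Client([
    'version' => 'latest',
    'region' => 'us-east-1', // Change
    'credentials' => [
        'key' => 'key', // Change
        'secret' => 'secret', // Change
    ],
]);

$bucket = 'example'; // Change

try {
    // The paginator follows continuation tokens, so buckets with
    // more than 1,000 objects are scanned in full
    $results = $s3->getPaginator('ListObjectsV2', [
        'Bucket' => $bucket,
    ]);

    foreach ($results as $result) {
        // 'Contents' is missing when a page holds no objects
        foreach ($result['Contents'] ?? [] as $object) {
            echo $object['Key'] . ' - ' . $object['Size'] . PHP_EOL;
            $size += $object['Size'];
        }
    }
} catch (S3Exception $e) {
    echo $e->getMessage() . PHP_EOL;
}

echo 'size: ' . $size . ' bytes' . PHP_EOL;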

The result was 160GB of data (once you convert bytes to GB), the same figure SnapShooter reported. Time to check versions. That is not something we can always do via SnapShooter, as it depends on whether the customer has provided the permissions needed to work with object versions.
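When the credentials do allow it, a version-aware scan is a small variation on the script above. What follows is a minimal sketch rather than what SnapShooter actually runs: it assumes the key has the s3:ListBucketVersions permission, and the helper names summariseVersionPages and bytesToGb are made up for illustration. It splits the total into current and noncurrent bytes, which is the number a plain listing never shows.

```php
<?php

/**
 * Sum object sizes from pages of ListObjectVersions results.
 * Each page is expected to carry a 'Versions' list whose entries have
 * 'Size' (bytes) and 'IsLatest' (bool), as the S3 API returns them.
 */
function summariseVersionPages(iterable $pages): array
{
    $totals = ['current' => 0, 'noncurrent' => 0];

    foreach ($pages as $page) {
        // 'Versions' is missing on pages that only hold delete markers
        foreach ($page['Versions'] ?? [] as $version) {
            $state = $version['IsLatest'] ? 'current' : 'noncurrent';
            $totals[$state] += $version['Size'];
        }
    }

    return $totals;
}

/** Convert bytes to GB (1 GB taken as 1024^3 bytes), rounded to two decimals. */
function bytesToGb(int $bytes): float
{
    return round($bytes / (1024 ** 3), 2);
}

// Usage with an already configured Aws\S3\S3Client, as in the scan script:
// $pages  = $s3->getPaginator('ListObjectVersions', ['Bucket' => $bucket]);
// $totals = summariseVersionPages($pages);
// echo 'current: ' . bytesToGb($totals['current']) . 'GB' . PHP_EOL;
// echo 'noncurrent: ' . bytesToGb($totals['noncurrent']) . 'GB' . PHP_EOL;
```

A backup that has been deleted only shows up in the noncurrent total, because its newest entry is a delete marker with no size.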

So we arranged a Zoom meeting to discuss our findings and to help the customer.

The first thing I always do with AWS is go to the billing page, look at last month's invoice, and see where the spending is.

In this example from an AWS account we own, we can see that most of the cost is storage, with some costs for Tier 1 and Tier 2 requests. Sometimes users have a very high number of API requests, which increases the bill. But in the customer's account, as in this example, the cost was all in storage.

Our customer was storing, in effect, 5.3TB of data. Somewhat more than the 160GB SnapShooter had reported. At this point, I knew it was AWS S3 Versioning, but it was time to get the customer to go to S3 so I could show them.

S3 Bucket Metrics

The customer only had one bucket, so we viewed the metrics on that and were presented with a graph going up and to the right, growing by ~750GB each month.

We went to the properties tab, and there it was, Bucket Versioning: Enabled!

What is Bucket Versioning?

Versioning is a way to keep multiple variants of an object in the same bucket. You can use the S3 Versioning feature to preserve, retrieve, and restore every version of everything stored in your buckets. In summary, when you overwrite or delete a file, AWS keeps the old version of the object around, and you do not lose data.

However, if not managed, your bill will go up and up. In the case of SnapShooter, we prune old backups. So, if you ask to keep the last 3 months of backups, once we complete another monthly backup, we delete the oldest to keep it to 3. SnapShooter deleted that oldest backup from the bucket's current versions, but S3 kept holding it as an old (noncurrent) version. It is not viewable from the typical bucket view, but the old version is still there and still billed.

How to Fix the Incorrect Bucket Versioning?

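There are two ways to fix bucket versioning. The first and simplest is to suspend it (once enabled, versioning cannot be switched fully off). Be careful, though: suspending only stops new versions from being created. The old versions already in the bucket stay there, and stay on the bill, until something deletes them. Make sure you understand that before relying on it.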

The way we recommend is to set up a lifecycle rule on the bucket to delete old versions of files.

  • Go to the Management Tab of the bucket you want to fix

  • Press Create lifecycle rule

  • Name your rule, for example "Prune old versions".

  • Choose a rule scope. If you only want to prune SnapShooter's backups, you can enter the prefix "snapshooter/", or you can set the rule to apply to the whole bucket. Again, make sure you know what you're doing.

  • Under Lifecycle rule actions, tick Permanently delete noncurrent versions of objects.

  • Choose the number of days after an object becomes noncurrent. In the SnapShooter example, that is when a file got deleted. The minimum you can enter is 1. Leave the number of newer versions to retain blank for SnapShooter, as we never write to the same file twice.

  • Press Create rule
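The console steps above can also be sketched with the AWS SDK for PHP. Treat this as an illustration under assumptions, not SnapShooter's own tooling: the helper name pruneOldVersionsRule is invented here, and the rule ID and "snapshooter/" prefix simply mirror the example values from the list. Note that putBucketLifecycleConfiguration replaces whatever lifecycle configuration the bucket already has, so merge in any existing rules first.

```php
<?php

/**
 * Build a lifecycle configuration with one rule that permanently
 * deletes noncurrent versions. Rule ID and default prefix are examples.
 */
function pruneOldVersionsRule(string $prefix = 'snapshooter/', int $noncurrentDays = 1): array
{
    return [
        'Rules' => [
            [
                'ID' => 'Prune old versions',
                'Status' => 'Enabled',
                // An empty prefix applies the rule to the whole bucket
                'Filter' => ['Prefix' => $prefix],
                'NoncurrentVersionExpiration' => [
                    // Days after a version becomes noncurrent; minimum is 1
                    'NoncurrentDays' => $noncurrentDays,
                    // 'NewerNoncurrentVersions' is left out on purpose,
                    // matching the blank field in the console
                ],
            ],
        ],
    ];
}

// Applying it with an already configured Aws\S3\S3Client.
// Warning: this overwrites any existing lifecycle rules on the bucket.
// $s3->putBucketLifecycleConfiguration([
//     'Bucket' => $bucket,
//     'LifecycleConfiguration' => pruneOldVersionsRule(),
// ]);
```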

The rule might take a little while to start running, but you will then see your total bucket size dropping. S3 Storage Lens runs on a 24h delay, so you will need to come back the next day to see the change.

The Final Result

As you can see in this customer's example, they went from storing over 5.5TB of data back to their expected 160GB. A saving of ~$109.44 a month. My heart rate returned to normal.
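For anyone wanting to estimate their own saving, the arithmetic is simple. The sketch below assumes a flat storage rate of $0.02 per GB per month; that rate is not stated anywhere in this post, it is only the value that happens to reproduce the figure above, so check the current S3 pricing for your region and storage class. The sizes are the 5.5TB peak and the 160GB SnapShooter reported.

```php
<?php

/** Monthly storage cost in dollars for a given size, rounded to cents. */
function monthlyStorageCost(float $gigabytes, float $ratePerGb): float
{
    return round($gigabytes * $ratePerGb, 2);
}

$ratePerGb = 0.02;       // Assumed $/GB-month, not taken from the invoice
$before    = 5.5 * 1024; // 5.5TB expressed in GB
$after     = 160;        // GB of current backups

$saving = monthlyStorageCost($before - $after, $ratePerGb);

echo 'Approximate monthly saving: $' . number_format($saving, 2) . PHP_EOL;
```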