Website backups to DreamObjects using Python

This tutorial walks you through writing a short Python script that backs up your DreamHost website to DreamObjects. The script runs automatically every day, so you always have recent backups available if you need to restore a file.

For simplicity, the script keeps 7 daily backups, but the same approach works for any number of backups on any schedule you like.

This version of the script does not back up your databases, but you could extend it to do so.

The Steps

  1. First, set up a DreamObjects account and a bucket to store the backups.
  2. Create a .boto config file.
  3. Next, write the Python backup script step by step.
  4. Finally, add a cron job in the DreamHost Control Panel to make the backup script run daily.

A finished version of the script for you to download is at the end of this tutorial, but it will help if you follow along first.

Enable DreamObjects and create a Bucket

Follow the DreamObjects Users and DreamObjects Bucket tutorials to create a user and bucket in your DreamObjects account.

Create a .boto config file

This script uses the Python programming language and a library called boto. Boto handles the communication with DreamObjects, but it must be configured before you begin.

  1. Log into your website via SSH.
  2. Navigate to your user's home directory.
    [server]$ cd $HOME
  3. Create a .boto file (note the leading dot) that stores your access key and secret key. View the following article for instructions:
  4. Once the .boto file has been created, proceed to the next section.
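
Before moving on, it can help to confirm that the .boto file contains what boto expects. Boto reads credentials from a [Credentials] section with the options aws_access_key_id and aws_secret_access_key. The sketch below is only an illustration: the helper name missing_boto_settings is hypothetical and not part of boto, and it checks that the options exist, not that the keys are valid.

```python
import os

try:
    import configparser  # Python 3
except ImportError:
    import ConfigParser as configparser  # Python 2

# Options boto looks for in the [Credentials] section of ~/.boto.
REQUIRED_KEYS = ('aws_access_key_id', 'aws_secret_access_key')


def missing_boto_settings(path):
    """Return the required options that are absent from the file at path.

    Hypothetical helper for illustration only; it is not part of boto.
    """
    parser = configparser.RawConfigParser()
    parser.read(path)  # a missing file is silently skipped
    if not parser.has_section('Credentials'):
        return list(REQUIRED_KEYS)
    return [key for key in REQUIRED_KEYS
            if not parser.has_option('Credentials', key)]
```

Calling missing_boto_settings(os.path.expanduser('~/.boto')) should return an empty list when both options are present.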

Create the script file

  1. Create a Python file with the extension ".py". You can name it anything you like. The very first line of your script must be this:
    #!/usr/bin/python
  2. Next, make it executable so it can be run. The easiest way to do that is to run:
    [server]$ chmod a+x SCRIPT_NAME.py

Adding code to the Python script

Add the code examples below to your script in the order shown. Put a blank line between each section to make your code more readable.

Import modules

Your script needs several modules to run the code. Add these four lines at the top of your script (below #!/usr/bin/python):

import tarfile
import datetime
import os
import boto

Define the home directory

Just below the import statements, add the following to make sure the script is running from your home directory.

home_dir = os.getenv('HOME')
os.chdir(home_dir)
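
One thing to watch for: os.getenv('HOME') returns None when the HOME variable is not set, and os.chdir(None) then fails. The sketch below shows one possible fallback; the function name resolve_home is hypothetical, and whether your cron environment sets HOME is an assumption you should check yourself.

```python
import os


def resolve_home(environ=None):
    """Return the home directory, with a fallback when HOME is not set.

    Hypothetical helper for illustration; the tutorial script itself
    reads HOME directly.
    """
    if environ is None:
        environ = os.environ
    home = environ.get('HOME')
    if not home:
        # On POSIX systems, expanduser falls back to the password
        # database when HOME is missing from the process environment.
        home = os.path.expanduser('~')
    return home
```

With this helper, the two lines above would become home_dir = resolve_home() followed by os.chdir(home_dir).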

Define variables

This part of the script defines a few variables and makes sure everything is ready for the backup itself. Below the os.chdir(home_dir) line, define some variables:

tmp_dir = 'tmp'
backup_bucket = 'website-backup'
target_dir = 'example.com'

Here’s what they will all be used for:

  • tmp_dir defines where the backup file is temporarily stored.
  • backup_bucket defines the bucket name where backups will be stored. This is the name of the bucket you created in the panel.
  • target_dir defines the name of the directory used as the basis for backups. This example assumes it’s just your website directory. Change 'example.com' to your actual website.

Create a temporary directory if it doesn’t exist already

Add these lines below your defined variables:

if not os.path.isdir(tmp_dir):
    os.makedirs(tmp_dir)

Define time frame and path to the backup file

day_number = datetime.datetime.today().weekday()

backup_filename = "{0}.backup.{1}.tar.gz".format(
    target_dir,
    str(day_number),
    )
backup_filepath = os.path.join(
    home_dir,
    tmp_dir,
    backup_filename,
    )

This example keeps 7 daily backups, with each day's backup overwriting the one from a week earlier. This bit of code defines a name and location for the backup that will be created.

  • First, use the datetime library to figure out what day of the week today is as a number. With weekday(), Monday is 0, Tuesday is 1, Thursday is 3, and Sunday is 6. Then, put that number into the “day_number” variable.
  • Next, define the file name for the backup file to be something like “example.com.backup.3.tar.gz”. That’s what that backup_filename line works out to (on Thursdays).
  • Finally, define the full path to the backup file.
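
The naming scheme above can be tried out on fixed dates, as in the sketch below. The function names are hypothetical. The second function is an assumed generalization, not part of the tutorial script: it numbers backups by the date's ordinal modulo a slot count, so a backup is overwritten only after that many days.

```python
import datetime


def weekday_backup_name(target_dir, day):
    """File name used by the tutorial: one slot per day of the week."""
    # weekday() returns 0 for Monday through 6 for Sunday.
    return "{0}.backup.{1}.tar.gz".format(target_dir, day.weekday())


def rotating_backup_name(target_dir, day, slots):
    """Hypothetical variant that keeps `slots` daily backups."""
    # Consecutive days have consecutive ordinals, so the slot number
    # cycles through 0..slots-1 and repeats every `slots` days.
    return "{0}.backup.{1}.tar.gz".format(target_dir, day.toordinal() % slots)


# 4 January 2024 was a Thursday.
thursday = datetime.date(2024, 1, 4)
print(weekday_backup_name('example.com', thursday))
# prints: example.com.backup.3.tar.gz
```

To keep 14 daily backups, for example, you could call rotating_backup_name(target_dir, datetime.date.today(), 14) in place of the weekday-based name.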

Create the backup file

The next few lines create the backup file:

tar = tarfile.open(backup_filepath, "w:gz")
tar.add(target_dir)
tar.close()

In the above code, the tar file is created, the website is added to it (defined earlier as “target_dir”) and then it's closed.
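
If you want to confirm what ended up in the archive, you can reopen it and list its contents. The sketch below is one way to do that using only the tarfile module; the function names create_archive and list_archive are hypothetical. It also uses a with block, which closes the archive even if adding files fails partway through.

```python
import os
import tarfile


def create_archive(archive_path, source_dir):
    """Write source_dir (and everything under it) to a gzipped tar file."""
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(source_dir)


def list_archive(archive_path):
    """Return the names of all members stored in the archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()
```

After create_archive(backup_filepath, target_dir), calling list_archive(backup_filepath) should return a list that starts with the website directory itself, followed by the files inside it.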

Upload the backup file to DreamObjects

Now that the backup file is created, copy it to DreamObjects. First, open a connection to DreamObjects:

connection = boto.connect_s3(
    host='objects-us-west-1.dream.io',
    )

That takes only a single statement because the access key and secret key are already defined in the .boto file you created earlier.

Upload the tarfile

Next, upload the tarfile to DreamObjects:

bucket = connection.get_bucket(backup_bucket)
key = bucket.new_key(backup_filename)
key.set_contents_from_file(open(backup_filepath, 'rb'))

The first line of this code defines the bucket to be used with DreamObjects (“backup_bucket”).

The second line creates the object in DreamObjects with the name defined earlier (“backup_filename”).

The third line sends the file stored locally (“backup_filepath”) up to DreamObjects.
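
The three upload lines can also be wrapped in a small function, sketched below. This is an optional restructuring, not a required change: the function name upload_backup is hypothetical, and it assumes you pass in the connection object returned by boto.connect_s3. The only behavioral difference is that the local file is opened in a with block, so it is closed as soon as the upload finishes.

```python
def upload_backup(connection, bucket_name, key_name, filepath):
    """Upload the file at filepath to bucket_name under key_name.

    `connection` is assumed to be the object returned by
    boto.connect_s3(); this hypothetical helper uses only the three
    calls shown in the tutorial.
    """
    bucket = connection.get_bucket(bucket_name)
    key = bucket.new_key(key_name)
    # The with block closes the local file once the upload is done.
    with open(filepath, 'rb') as backup_file:
        key.set_contents_from_file(backup_file)
    return key
```

In the tutorial script, this would be called as upload_backup(connection, backup_bucket, backup_filename, backup_filepath).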

Clean up

At this point, a backup file is still sitting around on the hosting server where it doesn’t belong. Add this line to the bottom to get rid of it:

os.remove(backup_filepath)
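
As written, the os.remove line is skipped if the upload raises an error, which leaves the archive behind in the temporary directory. One possible way to guard against that is a try/finally block, sketched below. The function name run_then_remove is hypothetical, and `action` stands in for whatever upload step you use.

```python
import os


def run_then_remove(filepath, action):
    """Call action(filepath), then delete the file even if action fails.

    Hypothetical helper for illustration; `action` is any callable that
    takes the path of the backup file, such as an upload function.
    """
    try:
        return action(filepath)
    finally:
        # Runs whether action succeeded or raised an exception.
        if os.path.exists(filepath):
            os.remove(filepath)
```

Any exception from the upload still propagates after the file is removed, so a failed run remains visible in the cron output.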

The Cron Job

Now that the script is complete, create a cron job that runs it every day to back up your website to DreamObjects.

  1. First, make sure the website user of your domain is a shell user.
  2. View the 'How do I create a cron job' article to set up a daily cron in your panel.
  3. In the 'Command to run' field, you would enter the command 'python', followed by the location of your script. For example, let's say you create a script titled SCRIPT_NAME.py in your user's home directory. You would enter this as the 'Command to run':
    python $HOME/SCRIPT_NAME.py

The full script

#!/usr/bin/python

import tarfile
import datetime
import os
import boto

home_dir = os.getenv('HOME')
os.chdir(home_dir)

tmp_dir = 'tmp'
backup_bucket = 'website-backup'
target_dir = 'example.com'

if not os.path.isdir(tmp_dir):
    os.makedirs(tmp_dir)

day_number = datetime.datetime.today().weekday()

backup_filename = "{0}.backup.{1}.tar.gz".format(
    target_dir,
    str(day_number),
    )
backup_filepath = os.path.join(
    home_dir,
    tmp_dir,
    backup_filename,
    )

tar = tarfile.open(backup_filepath, "w:gz")
tar.add(target_dir)
tar.close()

connection = boto.connect_s3(
    host='objects-us-west-1.dream.io',
    )

bucket = connection.get_bucket(backup_bucket)
key = bucket.new_key(backup_filename)
key.set_contents_from_file(open(backup_filepath, 'rb'))

os.remove(backup_filepath)
