Yet another simple Amazon S3 backup script for Drupal

As part of setting up servers, we need to set up a backup system. There are many backup solutions out there, but they often boil down to a script. In that vein, I recently wrote a backup script to back up a server with multiple sites to Amazon S3.

Here were my requirements:
1. Back up off the server (i.e. to Amazon S3).
2. Back up all sites and databases on the server individually.
3. Only back up the contents of tables that are not cache tables.
4. Automatically expire old backup files.

I decided on S3 because it was easy, fast and cheap. It also lets you set up lifecycle rules so that as files get older, they can be moved to Glacier or deleted. This removed the need for the backup script to handle the deletion logic. All I needed was a script to send the backup files to Amazon S3 every night.

This is how I did it.

First, sign up with Amazon S3 and generate an Access Key and Secret Key. This can be done from your account's Security Credentials page.

Next, create a bucket. Follow Amazon's instructions if you don't know how. Then go to the bucket's properties and set the lifecycle rules however you want. For now, mine are to move files to Glacier after 7 days and delete them after 3 months. Glacier is far cheaper than S3 but takes 4-5 hours to retrieve files.
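
If you would rather script the lifecycle rules than click through the console, recent versions of s3cmd include a setlifecycle command that accepts the standard S3 lifecycle XML (if your version doesn't have it, the console works fine). Here is a minimal sketch matching the rules above; the rule ID and file path are just placeholders:

#!/bin/bash
# Sketch: apply lifecycle rules to the backup bucket from the command line.
# Assumes a recent s3cmd that supports the setlifecycle command.
cat > /tmp/lifecycle.xml <<'EOF'
<LifecycleConfiguration>
  <Rule>
    <ID>backup-rotation</ID>
    <Prefix></Prefix>
    <Status>Enabled</Status>
    <Transition>
      <Days>7</Days>
      <StorageClass>GLACIER</StorageClass>
    </Transition>
    <Expiration>
      <Days>90</Days>
    </Expiration>
  </Rule>
</LifecycleConfiguration>
EOF
s3cmd setlifecycle /tmp/lifecycle.xml s3://bucket-name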

Next, install s3cmd which is a command line tool for sending files to S3. If you are on Ubuntu, it is in the apt repo so just type apt-get install s3cmd. You can get more information along with other install methods at http://s3tools.org/s3cmd

Then run s3cmd --configure from the command line of your server and enter your Access and Secret keys. This will store the keys in a config file so that your server can access your S3 account. You can also optionally enter a GPG encryption key. This is separate from the access keys and will be used to encrypt the data before it is sent to S3. Use this if you are very worried about security. By default, your data should be safe unless you make it public. Encrypting sensitive stuff, however, is never a bad idea.
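
For reference, s3cmd writes those settings to ~/.s3cfg. A trimmed sketch of what that file ends up looking like (values are placeholders):

[default]
access_key = YOUR_ACCESS_KEY
secret_key = YOUR_SECRET_KEY
gpg_passphrase = only-set-if-you-chose-encryption
use_https = True

A quick s3cmd ls s3://bucket-name is an easy way to confirm the credentials work before wiring up the script.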

At this point, your command line should be set up to send files to S3. Now add the script that does the backups. Create a file that cron will run daily (an example cron entry is further down) containing:


#!/bin/bash

TMP="/tmp"
DB_USER="root"
DB_PASSWD="password"
SITES_DIR="/var/www"
S3_BUCKET="s3://bucket-name"
DATE=$(date +%Y-%m-%d)

# Backup all databases to S3
for DB in $(mysql --user=$DB_USER --password=$DB_PASSWD -e 'show databases' -s --skip-column-names|grep -Ev "^(information_schema|performance_schema|mysql)$");
do
#First dump the structures
TABLES=$(mysql --skip-column-names -e 'show tables' --user=${DB_USER} --password=${DB_PASSWD} ${DB})
mysqldump --complete-insert --disable-keys --single-transaction --no-data --user=$DB_USER --password=$DB_PASSWD --opt $DB $TABLES > $TMP/$DB-$DATE
#Then dump the data, except for cache and temporary tables.
TABLES2=$(echo "$TABLES" | grep -Ev "^(accesslog|cache|cache_.*|flood|search_.*|semaphore|sessions|watchdog)$")
mysqldump --complete-insert --disable-keys --single-transaction --no-create-info --user=$DB_USER --password=$DB_PASSWD $DB $TABLES2 >> $TMP/$DB-$DATE
#Gzip everything
gzip -v $TMP/$DB-$DATE;
#Upload to Amazon S3
s3cmd put $TMP/$DB-$DATE.gz $S3_BUCKET/databases/$DB-$DATE.gz;
#Cleanup
rm $TMP/$DB-$DATE.gz;
done

# Backup all sites to S3
cd "$SITES_DIR";
find "$SITES_DIR" -mindepth 1 -maxdepth 1 -type d | while read -r DIR;
do
#Tar and Gzip each directory
BASE=$(basename "$DIR");
tar -czf "$TMP/$BASE.tar.gz" "$BASE";
#Upload to Amazon S3
s3cmd put "$TMP/$BASE.tar.gz" "$S3_BUCKET/sites/$BASE-$DATE.tar.gz";
#Cleanup
rm "$TMP/$BASE.tar.gz";
done

That's it! The bash script should be fairly straightforward and easy to modify if you want to. One thing to note is that I first back up the table structures and then the contents, so that I don't get the contents of cache tables. (I couldn't find the article where I got that code, so if you add a comment below I'll add a link to it.)
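
To run the script nightly, make it executable and add a cron entry. A sketch using a file in /etc/cron.d; the script path, log path and 2 a.m. schedule are just examples:

# /etc/cron.d/s3-backup -- run the backup script every night at 2:00 AM
# (script path, log path and schedule are examples; adjust to taste)
0 2 * * * root /usr/local/bin/s3-backup.sh >> /var/log/s3-backup.log 2>&1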

Let me know if you have any improvements.

Why didn't I use drush or backup_migrate?

I wanted a solution that runs outside of Drupal and works for any type of site. While backup_migrate can do many of the same things, having the backups run from within Drupal is a major issue: if the site breaks, the backups break with it.

Photo courtesy of http://www.flickr.com/photos/kapten/416527996/
