I normally try to avoid server/unix jargon on this blog, but as you can tell I have become infatuated with Amazon’s affordable storage solution, S3, as of late. We all know that it is important to keep recent backups of anything you value at all, so why not automate the process? Until I started tinkering with S3, my server backup process involved manually downloading and compressing the contents of my server and storing the compressed file on my hard drive. That usually took longer than I’d like, so I wouldn’t do it terribly often. The main benefit of backing up with S3 is that you take advantage of your server’s high-speed connection and bypass the need to download files over your own connection, while also safely storing files somewhere other than your own server.
For the most part, I took the advice of John Eberly in his automated S3 backups article. However, I did several things differently so I thought I would show what I did in an easy-to-follow format. The fruits of my labor come into play near the end with a simple shell script I wrote that compresses my entire server httpdocs directory (similar to public_html or www-type folders on other servers), does a mysql dump of my WordPress database and sends a tar.gz file of them to a backup bucket I have on my Amazon S3 account.
S3 Basics
I’m going to assume that you already have an Amazon S3 account. Log in to amazon.com/s3 and find your Access Key ID and Secret Access Key on the AWS Access Identifiers page. If you’re still hesitant about giving S3 a try, take a look at this.

That’s my S3 bill for last month. I use S3 a good bit and I haven’t even hit $3 per month yet. You literally have nothing to lose.
I can has bucket?

Now that you have your S3 information, you’ll need to create a bucket to store your server backup data. A bucket in S3 lingo can simply be considered a top level directory. Buckets work like regular folders, although technically S3 must be tricked into using folders within buckets with specially-crafted filenames like example_folder/yourfile.jpg. Bucket names are globally unique, so you have to pick a name that isn’t already taken. To easily interact with S3, I recommend the S3fox Firefox add-on.

S3fox’s listing of my buckets
To create a bucket in S3fox, right-click anywhere in the right pane and select Create Directory. Now you have your bucket.
Setting up Ruby & s3sync
To interact with S3 with my shell script, we’ll need to install s3sync, a popular Ruby program for interacting with S3 over a command line. As such, you’ll need to install Ruby if your server doesn’t have it already. The following commands need to be run on your server, so SSH into it (ssh you@yourserver.com in Terminal on OS X, Putty on Windows).
I have the yum package manager installed on my server, so installing Ruby was trivial. If you don’t have yum, apt-get or emerge on your box, you can install Ruby and the SSL library libopenssl-ruby via RPM with rpm -Uhv ruby-*.rpm. Or you may opt to build it yourself with the source and ./configure && make && make install.
Ensure that Ruby is actually up and running with ruby -v. It should return your Ruby version. Mine was ruby 1.8.5 (2006-12-04 patchlevel 2) [i386-linux].
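A commenter further down reports that s3sync lists Ruby 1.8.4 as its minimum supported version, so it can be worth comparing the installed version against that. A minimal sketch, assuming a POSIX shell with awk available; the helper name version_ge is made up for this example:

```shell
# Hypothetical helper: succeeds when dotted version $1 >= $2.
# Only the first three numeric fields are compared.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, x, "."); split(b, y, ".")
        for (i = 1; i <= 3; i++) {
            if (x[i] + 0 > y[i] + 0) exit 0
            if (x[i] + 0 < y[i] + 0) exit 1
        }
        exit 0
    }'
}

# Check the installed Ruby, if any, against the 1.8.4 minimum.
if command -v ruby >/dev/null 2>&1; then
    installed=$(ruby -e 'print RUBY_VERSION')
    if version_ge "$installed" 1.8.4; then
        echo "ruby $installed looks new enough for s3sync"
    else
        echo "ruby $installed is older than 1.8.4; consider upgrading"
    fi
else
    echo "ruby not found; install it first"
fi
```

The comparison is done in awk rather than with string tests so that a version like 1.10 would not sort below 1.8.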
Now to actually download and decompress s3sync. But before you do this, make sure you are in an appropriate directory. I extracted the s3sync folder in my home directory, ~/. You can determine where you are by running pwd (print working directory). Some of the commands below might not work without sudo in front of them if you are not already logged in as root or a privileged user. The commands assume s3sync.tar.gz has already been downloaded into the current directory.
tar xvzf s3sync.tar.gz
rm s3sync.tar.gz
cd s3sync
mkdir certs
cd certs
wget http://mirbsd.mirsolutions.de/cvs.cgi/~checkout~/src/etc/ssl.certs.shar
sh ssl.certs.shar
cd ..
mkdir s3backup
S3sync is now decompressed within its own s3sync folder in your home directory, along with a subdirectory certs containing SSL files to be used later, and another subdirectory s3backup where temporary backup files will be stored while they are being transferred to S3.
Giving s3sync your S3 info
In the s3sync folder, edit s3config.yml with your Access Key ID, Secret Access Key and directory for SSL certs. The directory will be ~/s3sync/certs if you followed this guide. Just to be safe you might want to give it your full path, such as /var/www/vhosts/yourserver.com/s3sync/certs or whatever it may be.
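For orientation, here is a rough sketch of what the finished file might look like, written out by a small shell function. The three key names are assumed to match the example config bundled with s3sync, so check them against your own copy; the function name and the placeholder values are made up for illustration.

```shell
# Hypothetical helper: write a skeleton s3config.yml into directory $1.
# Key names are assumed to match s3sync's bundled example config.
write_s3config() {
    conf_dir=$1
    cat > "$conf_dir/s3config.yml" <<EOF
aws_access_key_id: YOUR_ACCESS_KEY_ID
aws_secret_access_key: YOUR_SECRET_ACCESS_KEY
ssl_cert_dir: $conf_dir/certs
EOF
    # The file holds credentials, so keep it readable by its owner only.
    chmod 600 "$conf_dir/s3config.yml"
}

# Example: write_s3config /var/www/vhosts/yourserver.com/s3sync
```

Note that the function overwrites any existing s3config.yml in that directory, and that passing a full path gives ssl_cert_dir a full path as well.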
Making the Shell Script
Go into the ~/s3sync folder (you should already be here) and open up a text editor to create the actual backup script, s3backup.sh, below. I’ll be using vi (vi s3backup.sh) but you can just as easily create the file in something like TextMate and FTP it into the appropriate place.
Now that you’re in vi, press i to enable insert mode, and type the following code. You can paste it but sometimes a bit will get cut off from the top, so be on the lookout.
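#!/bin/sh
# directory structure:
# ~/s3sync has scripts
# ~/s3sync/s3backup is a folder for temp backup files
cd ~/ || exit 1
BUCKET=your_bucket_name
DBNAME=your_database_name
DBPWD=your_database_password
DBUSER=your_database_user_name
# date stamp such as _Aug_06_07
NOW=$(date +_%b_%d_%y)
# archive the public web folder and move it into the temp folder
tar czvf httpdocs_backup$NOW.tar.gz httpdocs
mv httpdocs_backup$NOW.tar.gz s3sync/s3backup
cd s3sync/s3backup || exit 1
# dump and compress the database
touch $DBNAME.backup$NOW.sql.gz
mysqldump -u $DBUSER -p$DBPWD $DBNAME | gzip -9 > $DBNAME.backup$NOW.sql.gz
# bundle both files into one archive, then remove the pieces
tar czvf server_backup$NOW.tar.gz $DBNAME.backup$NOW.sql.gz httpdocs_backup$NOW.tar.gz
rm -f $DBNAME.backup$NOW.sql.gz httpdocs_backup$NOW.tar.gz
cd ..
# upload the contents of s3backup/ to the bucket over SSL (two plain hyphens before ssl)
ruby s3sync.rb -r --ssl s3backup/ $BUCKET:
# clear out the temp folder
cd s3backup || exit 1
rm -f *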
IMPORTANT: This script is written for my (mt) (dv) 3.0 server, which uses “httpdocs” as the public web folder. If your server uses another folder, please change that wherever it appears in the script. Also edit the first few lines with your database info and S3 bucket name. Once you’ve typed that and edited the necessary information, press esc to exit insert mode, then type :wq and press enter to save and quit vi.
IMPORTANT 2: Use full paths if you can. For example, cron jobs are often run as a different user than you, so “cd ~/” will resolve to a different directory and is bound to mess things up. In my case, I replaced “cd ~/” with “cd /var/www/vhosts/paulstamatiou.com” but it varies with each server config.
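Along the same lines, a script run from cron is safer if it stops when the cd fails, rather than archiving whatever directory it happens to start in. A minimal sketch, assuming a hypothetical BASE_DIR variable and helper name that are not part of the original script:

```shell
# BASE_DIR is a placeholder; set it to the full path that holds
# httpdocs and s3sync on your server.
BASE_DIR="${BASE_DIR:-/var/www/vhosts/yourserver.com}"

# Hypothetical helper: change into BASE_DIR or report why not.
enter_base_dir() {
    if cd "$BASE_DIR" 2>/dev/null; then
        return 0
    fi
    echo "cannot cd to $BASE_DIR" >&2
    return 1
}

# At the top of the backup script this would stand in for "cd ~/":
# enter_base_dir || exit 1
```

Because the path is spelled out, the result no longer depends on which user cron runs the script as.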
If you try to run that script now, it won’t work for several reasons. First off, it doesn’t have the proper permissions. Make it executable with chmod +x s3backup.sh.
Now your script is executable, but if you ran into the same problem I did and received a “bad interpreter” error, you can run this:
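A “bad interpreter” error commonly means the file was saved with Windows (CRLF) line endings, so the first line ends in a stray carriage return. A minimal sketch, assuming that is the cause and that the script is named s3backup.sh; the helper name is made up, and where the dos2unix utility is installed it does the same job:

```shell
# Hypothetical helper: strip carriage returns from file $1 in place.
# Writing back with cat keeps the file's permissions intact.
strip_cr() {
    tmp=$(mktemp) || return 1
    tr -d '\r' < "$1" > "$tmp" && cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example: strip_cr s3backup.sh
```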
Test it out
Everything is set up and all you need to do is execute the script. Make sure you’re in the ~/s3sync folder and run s3backup.sh with ./s3backup.sh.
You will be asked for your password and then things will begin whizzing by your terminal as files are compressed. Script execution time depends heavily on how many files are being compressed and uploaded. For me, the entire process takes around 65 seconds with a 60MB server backup ending up on S3.
When I was testing this script, I put the source/destination paths in the wrong place while running s3sync.rb with the --delete (sync) flag and lost all of my backup files from the past year. I recommend making a test bucket if you plan on playing with s3sync.rb/s3cmd.rb directly.
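The s3sync usage text quoted in the comments further down lists a --dryrun (-n) option, which should report what a sync would do without actually doing it. A minimal sketch of a preview wrapper, assuming that flag behaves as its name suggests in your version; the function name and the test bucket name are hypothetical:

```shell
# Hypothetical helper: preview a sync without transferring or deleting
# anything. $1 is the source, $2 the destination.
# Run it from the ~/s3sync folder, where s3sync.rb lives.
preview_sync() {
    ruby s3sync.rb -r --ssl --dryrun "$1" "$2"
}

# Example: preview_sync s3backup/ your_test_bucket:
```

Checking the preview against a throwaway bucket first makes a swapped source and destination much cheaper to discover.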
Automate, Automate, Automate!
Okay, so now you’re hopefully grinning after s3backup.sh worked flawlessly. If not, leave a comment and I’ll try to help troubleshoot the issue. The next step is to get this running every day, week, month or however often you would like to back up your server files and MySQL database.
Fortunately my server and many others have cron job folders in /etc. If a script is put in the cron.daily folder, for example, it will be run daily. If put in the cron.monthly folder, the script will be run monthly. I decided to set this script up to run daily, so all I had to do was copy s3backup.sh into the /etc/cron.daily/ folder with cp s3backup.sh /etc/cron.daily/.
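One thing to watch for: on some systems (Debian-style run-parts, for instance) files in the cron folders are skipped if their names contain a dot, so a copy named s3backup.sh may never run there. A minimal sketch that installs the script under an extension-free name, assuming a hypothetical helper name and that you run it as root:

```shell
# Hypothetical helper: copy script $1 into cron directory $2 under an
# extension-free name, with the executable bit set.
install_cron_script() {
    name=$(basename "$1" .sh)
    cp "$1" "$2/$name" && chmod 755 "$2/$name"
}

# Example (as root): install_cron_script ~/s3sync/s3backup.sh /etc/cron.daily
```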
Feedback
I wrote this post rather hastily as I have to begin studying for final exams for my summer courses, but if you see anything that doesn’t seem right, please let me know. Also, you are following this how to on your own and I am not liable should you somehow reformat your entire hard drive. If you have never touched a unix system in your life, I don’t think this is the best article to start with. If you’re a Linux guru and know of a few ways to optimize anything in this post or my shell script, I’d love to hear about it.
UPDATE: In the recent version of s3sync, you will have to update s3config.rb and change the confpath to this:
confpath = ["./", "#{ENV['S3CONF']}", "#{ENV['HOME']}/.s3conf", "/etc/s3conf"]
Comments
Nice information, Paul. For me, Amazon S3 remains one of the things that I must try in the future. It is pretty cheap, at least when I look at your bill.
If loving Amazon S3 is wrong, I don’t want to be right. The other day I tried explaining the service to my fiance but when we got to the (low, low) pricing part of it, he stopped believing that it could possibly work.
However, before the FF plugin came along, I tried a lot of access methods that just didn’t work so well with my computer/routines. The FF-focused system is best for me.
I have to agree with you, this is one of the most useful articles you’ve posted in a while. I have been using S3 for a few months now, mainly as an off-site backup for my critical documents and my entire iTunes library. It’s impossible to beat the pricing; Only $3.00 per month for 20GB of storage. The only problem I have is that uploading files is painfully slow. When I did my original backup of my iTunes library, it took six days of continuous uploading. And it was only about 13GB at that point.
I just finished implementing this backup solution on my own server, and customized the bash script to better suit its structure. I’m running it now on a development domain, and it’s taking a long time. Performing “ls -g” on the directory before running the script told me the directory was only about 10MB. Perhaps it’s time to start looking at a different host.
@Abi – “but when we got to the (low, low) pricing part of it, he stopped believing that it could possibly work.”
haha, too funny.
Nice post again Paul. You keep making my life more and more simple. Any thoughts on making a Wordpress plugin than will do all of this? Do you use S3 to backup your personal computer also?
This is a fairly cheap route, if you don’t use too much bandwidth. If you get up to around 2TB/mo or so, the normal bandwidth quota prices for server rentals, it ends up costing $360/mo whereas I can rent a 2TB/mo server for $90/mo.
@Don – you bring up a good point. For me though, I have a found a nice usage balance between data stored and data transferred.
Ruby is one thing I’m not familiar with, so this error threw me off:
./s3config.rb:17: undefined method `load_file' for YAML:Module (NoMethodError)
from s3sync.rb:25:in `require'
from s3sync.rb:25
Anyone with any ideas?
My server is running ruby 1.8.1 (2004-01-21) [i386-linux]
Thanks for a great article!
@Jared – did that come up when running s3backup.sh? It sounds like you possibly didn’t save s3config.yml correctly or added some extra characters it doesn’t recognize. I’m no Ruby expert either, so it might have something to do with your actual Ruby install. You could try installing a newer version of Ruby.
I really enjoyed your article. Thanks.
Thanks for this pointer to S3. I am going to give it a shot based on the feedback I’m seeing from you and others. BTW you made it to the front page of Digg!
Thanks for this, small problem I had when blindly following: I wasn’t aware that all bucket names are global !
I also couldn’t convince s3fox to create a bucket, so I had to use the java program JetS3t (which you can run online from within their web page) to create the bucket. Once the bucket was created, I could navigate into it with s3fox, and it worked as advertised after that.
The other issue I had was building ruby from source, you do have to go into ext/openssl and build and install that as well, and for that to work, you need an openssl devel setup, and be careful your old one isnt in /usr/include while you try to install the latest one into /usr/local/include. Ruby mkmf might find the old one and refuse to continue.
Ruby probs and things might be better sent to the s3sync forums than here; just a thought.
That guy running 1.8.1 has no hope without an upgrade… The http streaming and stuff wasn’t added till something like 1.8.4.
BTW s3sync lists 1.8.4 as min supported version.
You have a very good way of making the complicated seem achievable :-)
This could cost you THOUSANDS OF DOLLARS!
I did this very thing, backing up to S3 nightly. My bandwidth is calculated not by total bandwidth used, but by a percentile based on the mbits/sec used.
It turns out, during the several hours it would take to back up, I jumped into a very high percentile, costing me an extra $1,000 for the month. And that was just from about two weeks of using this.
Very nice. Is there also a way to do this without having to install Ruby?
Is this Akismet at work? Wow, I’m impressed.
Kidding aside, this is currently #3 on the popular del.icio.us feed, nice work Paul!
@Scott – well it obviously depends on your type of backup. My backup is about 60MB and takes about a minute, yours is probably hundreds of gigabytes.
I can run this script and everything seems to work until it tries to process one of the last lines. I receive this:
One argument must be on S3
s3sync.rb [options] <source> <destination> version 1.1.4
--help -h --verbose -v --dryrun -n
--ssl -s --recursive -r --delete
--public-read -p --expires="<exp>" --cache-control="<cc>"
--exclude="<regexp>" --progress --debug -d
One of <source> or <destination> must be of S3 format, the other a local path.
Examples: (using S3 bucket 'bucket' and prefix 'pre')
Put the local etc directory itself into S3
s3sync.rb -r /etc bucket:pre
(This will yield S3 keys named pre/etc/…)
Put the contents of the local /etc dir into S3, rename dir:
s3sync.rb -r /etc/ bucket:pre/etcbackup
(This will yield S3 keys named pre/etcbackup/…)
Put contents of S3 "directory" etc into local dir
s3sync.rb -r bucket:pre/etc/ /root/etcrestore
(This will yield local files at /root/etcrestore/…)
Put the contents of S3 "directory" etc into a local dir named etc
s3sync.rb -r bucket:pre/etc /root
(This will yield local files at /root/etc/…)
another nice tutorial… cheers paul
@eville84 – that dialog usually comes up when you run s3sync.rb without any arguments. Do your source/destination directories/buckets exist?
Nice tutorial.
Couple of points though. I understand that the ruby line in the bash script should be:
ruby s3sync.rb -r --ssl ./s3backup bucket:mybucket
That’s two dashes before ‘ssl’ and ‘bucket:mybucket’ to specify the bucket.
(I can’t test it though until I’ve upgraded Ruby :( )
thanx.
@jalal – the syntax is just “yourbucket:” there’s no actual “bucket” before it. run s3sync.rb without any parameters/arguments when you can and it will show you the syntax.
Awesome post Paul. Just set it all up in under 10 minutes and it seems to be working great! I have a few notes, though.
First, @eville84: I think you’re missing one of the dashes in front of ssl (it should be two dashes, WordPress converts a double dash to one long dash… stupid formatting issue). I had the same problem.
I changed the first line of your script to cd ~mmalone/ instead of just cd ~/. A tilde will cd to your home directory, by default, but when cron executes the script it’ll probably go to the root user’s home directory (I didn’t actually try this, and I’m sure it depends on the system). Putting a username in there will make cd change to that user’s directory.
If your DB is on a different server like mine is, simply add -h to mysqldump (line 16) between -p$DBPWD and $DBNAME.
The date command on my system requires a + sign before the formatting rule. Simple change, just add a + before _%b_%d_%y — line 11 now looks like this: NOW=$(date +%b_%d_%y).
One other little security issue: when you include your password as a command line argument to mysqldump it may show up in top/ps/other process lists on your system. If you’re the only one on your system, or if users can’t see one another’s processes, then you’re probably safe… but still, just a heads up.
Paul you are a superstar! Wasn’t too sure about S3 but you really sold me with this!
Your guide was a breeze to follow and the biggest problem I had was installing yum on my (dv), a problem which I eventually attributed to a few issues with dependencies, and a typo in the file name of the yum-centos rpm which I kept making!
Hey Paul, that was a great tut. However I needed this to run on Windows, so what I did was:
1) Install the Windows binary of Ruby.
2) Install bsdtar for Windows
3) Install a Windows version of the *nix touch command
4) Created a .bat script (instead of a .sh script) to run all the commands
Here is my .bat script:
echo on
cd c:\s3sync
SET BUCKET=your_bucket_name
SET DBNAME=your_database_name
SET DBPWD=your_database_password
SET DBUSER=your_database_user_name
@REM Get the date in MMDDYY format
FOR /F "TOKENS=1* DELIMS= " %%A IN ('DATE/T') DO SET CDATE=%%B
FOR /F "TOKENS=1,2 eol=/ DELIMS=/ " %%A IN ('DATE/T') DO SET mm=%%B
FOR /F "TOKENS=1,2 DELIMS=/ eol=/" %%A IN ('echo %CDATE%') DO SET dd=%%B
FOR /F "TOKENS=2,3 DELIMS=/ " %%A IN ('echo %CDATE%') DO SET yyyy=%%B
SET NOW=%mm%%dd%%yyyy%
bsdtar czvf httpdocs_backup%NOW%.tar.gz C:\http\xampp\htdocs
mv httpdocs_backup%NOW%.tar.gz c:\s3sync\s3backup
cd c:\s3sync\s3backup
touch %DBNAME%.backup%NOW%.sql.gz
mysqldump -u %DBUSER% -p%DBPWD% %DBNAME% | gzip -9 > %DBNAME%.backup%NOW%.sql.gz
bsdtar czvf server_backup%NOW%.tar.gz %DBNAME%.backup%NOW%.sql.gz httpdocs_backup%NOW%.tar.gz
del /F %DBNAME%.backup%NOW%.sql.gz httpdocs_backup%NOW%.tar.gz
cd ..
ruby s3sync.rb -r s3backup\ %BUCKET%:
cd s3backup
del /F /Q *
The only catch was getting the --ssl switch to work. I can only assume the SSL functionality is using something built into *nix. Other than that, everything worked out well.
This tutorial is really useful. Thank you!!
Nice writeup Paul- good to get back to your roots with tutorials! Sadly my server doesn’t like Ruby, I should really change to a new one at some point.
Paul, I enjoyed your article.
However I found out that to create a backup of clicdev.com, I was going to get killed by bandwidth usage. I came up with a solution involving Jungledisk and true rsync; you can read it here: http://nexus.zteo.com/2007/08/06/cheap-server-backup-with-amazon-s3/
I would appreciate it if you find the time to let me know if you see any problem with my approach.
Cheers,
-C.
It amazes me how inexpensive Amazon S3 really is. However, I can see that through their vast distributed infrastructure, they can make it possible. I imagine if other companies wanted to compete they easily could, but it would have to have an enticing API with unbeatable performance.
Hey Paul,
Not sure anyone will see this, but to get a consistent backup using mysqldump you need to read-lock the tables and flush the logs:
http://dev.mysql.com/doc/refman/4.1/en/backup.html
http://dev.mysql.com/doc/refman/4.1/en/mysqldump.html
so mysqldump command becomes…
mysqldump -u $DBUSER -p$DBPWD $DBNAME --flush-logs --lock-all-tables …
Of course there are options to handle larger datasets as well, and also for getting consistent backups of InnoDB tables, like --single-transaction.
Consistency is the C in A.C.I.D. Non-consistent backups purely provide a dump of what the data was at the point in time of the read (to dump the data), without taking into account any relationships between the data.
Have Fun
Paul
Thanks for posting, it was helpful in getting started. A few tips I’ve made:
1) Split the backup and upload steps into two scripts (make_bashup.sh and s3upload.sh) so you can test/debug each part separately.
2) exclude unnecessary files from the tar (such as SVN directories):
tar --exclude ".svn" -czvf betterexplained$NOW.tar.gz /var/www/vhosts/…
3) Add echo statements so you can see where your script is at (for debugging)
>> echo "backing up…"
>> echo "uploading to s3…"
4) Use the --progress option for s3 to see how far along you are (also, the ssl option needs --ssl not -ssl):
ruby s3sync.rb -r --ssl --progress s3backup/ $BUCKET:
(By the way, the colon is not a typo… that’s the format s3 needs. That confused me for a sec.)
Anyway, those are the major changes I made, thanks for kicking this off!
@Kalid – thanks for stopping by. As for the double dash for ssl, I have that but WordPress thinks it’s cool to automatically convert it to a single dash. As such I noted under the code to substitute it for two dashes. Great suggestion for the progress flag, I didn’t even think about it.
Ah, it seems like you have auto conversion of dash-dash (--) to an em-dash. The “-ssl” in the scripts above should be --ssl (dash dash SSL).
Anyway, nice article.
@Paul – On a related note, you can use HTML entities (in this case & ndash ;, with no spaces between the characters) to represent the two dashes. WordPress won’t touch them. So instead of trying to use two dashes, use &ndash;&ndash;. That’s ampersand (&), ndash, and semi-colon (;).
@Sean – thanks for that! I never knew what the entities were.
I use a similar Bash script to prepare my backup files and then call s3sync. One essential step I use is to call ncrypt (ncrypt.sourceforge.net) to encrypt the backup files and then par2 (parchive.sourceforge.net) to create parity recovery metadata (in case of data corruption) for the encrypted files. It is a very bad idea to send your files unencrypted to S3 (or to any other provider). At least for us this is essential, because we send critical business data to S3.
Regards,
MV
Nice script! Tried it and it worked flawlessly, except that instead of the S3 account I used an ordinary FTP account on my server at home.
Thanks for the great read!
s3sync.rb seems to have been updated in a way that breaks this article. Can you update your article?
Millions of thanks, saved my lot of time :)
Once, in the freezing winter season, I was wandering around the net and stumbled on this post. I liked it a lot! Much respect! I’m even adding it to my bookmarks!
I’ve written yet another S3 backup script that you may want to check out: http://dev.davidsoergel.com/trac/s3napback/. It’s very easy to use and handles backup rotation, incremental backups, compression, encryption, and MySQL and Subversion dumps. Enjoy!
I use As3FileSync to backup files.