Whenever I talk to another developer and find out they’re not using version control (a.k.a. Source Code Management system, or SCM) as part of their workflow, I become a little shocked and horrified. There are just too many great reasons for using version control. Both Git and Subversion are free to use, relatively simple to set up, and give you snapshots to go back to anytime you break something in your code. An SCM is indispensable for any team of more than one developer, but it’s just as useful if you’re on your own.
Tons of developers love Git, and although Git does have some really great features when compared to Subversion, there’s one particular benefit to using Subversion that Git users rarely consider. When you use Git to commit hundreds or even thousands of revisions to your local machine, what happens if your hard drive crashes? Unless you have also set up a remote repository and gotten familiar with pull requests and merges, those commits are gone. Git actually requires a little more effort to get this big benefit, whereas it’s built into Subversion by design.
One of the primary benefits of using Subversion as your SCM solution is that it’s like having an insurance policy against your local machine breaking, your laptop being stolen, or your hard drive crashing. Every time you commit, you’re sending that code to another server. Granted you can do the same thing with a master Git repository on Github.com or your own server in the cloud, but unlike Git, Subversion is meant to be run somewhere other than your laptop or workstation.
To extend the insurance metaphor, as great as it is having the code on your laptop backed up to a central Subversion server, it makes just as much sense to protect yourself from the possibility that your Subversion server’s hard drive will crash.
The best way to do this is to create a mirror repository on another server, and use the svnsync program to create a replica of your primary Subversion repository, or repositories, if you have multiple. If you’re familiar with master-slave database replication, Subversion repository replication is quite similar. Following are the steps and a few gotchas I learned this week as I finally had a chance to set up a proper Subversion mirror replication slave server.
Step One: Set up Subversion on a 2nd Server
This guide assumes you’re already running Subversion on one server. We’ll call that source-server from now on. The first step is to set up a second Subversion server to be used as a mirror. We’ll call that mirror-server from here on out.
What’s interesting to note here is that unlike most database replication setups, with Subversion, it doesn’t matter too much what version the server is, or what platform you want to run it on. Unfortunately, I didn’t realize that when I started. I purposely downloaded an older version of VisualSVN 2.06 bundled with Subversion 1.6.17, and later found out I would’ve been better off running the latest VisualSVN 2.5.3 bundled with Subversion 1.7.3. Why? As of Subversion 1.7, you can now use svnsync with the new --allow-non-empty option, which is designed for exactly the situation of starting to sync a mirror when it already has content in it. More details in Chapter 9: Subversion Repository Mirroring in the SVN Book.
In our case, we have our source-server repository running Subversion 1.6.6 on Ubuntu Server 10.04 LTS, and I set up the free version of VisualSVN 2.06 (bundled with Subversion 1.6x) on a Windows 7 PC. Whether you install from source or from a binary, it’s important to at least know what version of Subversion each server is running.
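Since the rest of this guide branches on whether a server runs Subversion 1.7 or later, here is a small sketch for checking that from a shell. It relies on `svn --version --quiet`, which prints only the version number; the helper name `supports_allow_non_empty` is my own invention, not something Subversion ships.

```shell
# Hypothetical helper: succeeds when a version string such as "1.7.3"
# is 1.7 or newer, i.e. when svnsync's --allow-non-empty is available.
supports_allow_non_empty() {
    version=$1
    major=${version%%.*}          # "1.7.3" -> "1"
    rest=${version#*.}            # "1.7.3" -> "7.3"
    minor=${rest%%.*}             # "7.3"   -> "7"
    [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 7 ]; }
}

# Run this on each server; skipped quietly if the svn client is not installed.
if command -v svn >/dev/null 2>&1; then
    installed=$(svn --version --quiet)
    if supports_allow_non_empty "$installed"; then
        echo "Subversion $installed: --allow-non-empty is available"
    else
        echo "Subversion $installed: plan on the dump-and-load workaround"
    fi
fi
```

On a Windows mirror running VisualSVN, typing `svn --version --quiet` at a command prompt gives the same number to compare by eye.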
Step Two: Dump the Source Repository
Unless you’re starting fresh with both an empty source-repo and an empty mirror-repo, chances are you have lots of commits on your current source-repo. You can actually “play back” these changes on the mirror-repo starting from revision 0 forward, but in most cases it’s faster to dump the source and load it onto the mirror.
#> svnadmin dump /svn/repositories/source-repo > source-repo.dump
#> tar czf source-repo.tgz source-repo.dump
Note that svnadmin works on a local repository path rather than a URL, so run these commands on source-server itself.
Note for Subversion 1.6 and below: Unfortunately, the svnsync program has a limitation in that it assumes you’re starting your mirror-repo from revision 0. This is fine if your repository is small with only a few revisions, but can be quite slow if the reverse is true. Anyway, we’ll be performing some manual tweaks to our mirror-repo later on using this dump-and-load technique. If you’re already running Subversion 1.7 or greater, the --allow-non-empty option to svnsync was added to get around this limitation.
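For readers already on 1.7 or later, a rough sketch of that route looks like the following. It is untested in the setup described here (the mirror in this guide runs 1.6), the URLs are the placeholder names used throughout, and the `DRY_RUN` switch is purely a convention of this sketch so you can see the command before running it.

```shell
# Sketch for Subversion 1.7+: attach svnsync to a mirror that already has
# content (for example after svnadmin load) instead of replaying from r0.
# DRY_RUN is an assumption of this sketch: 1 prints the command, 0 runs it.
DRY_RUN=${DRY_RUN:-1}

init_loaded_mirror() {
    mirror_url=$1
    source_url=$2
    if [ "$DRY_RUN" = "1" ]; then
        echo "svnsync initialize --allow-non-empty $mirror_url $source_url"
    else
        svnsync initialize --allow-non-empty "$mirror_url" "$source_url"
    fi
}

init_loaded_mirror http://mirror-server/svn/mirror-repo \
                   http://source-server/svn/source-repo
```

As far as I understand the option, svnsync simply skips its "is the mirror empty?" check, so it is on you to make sure the mirror really was loaded from the same source. The mirror still needs a revision property change hook that lets the sync user through.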
Step Three: Create the Mirror Repository
Most guides to using svnsync warn you to never commit to the mirror repository. The reason for this is that you only ever want your replication user (syncuser) to make changes. Otherwise you risk breaking replication on the slave. On the mirror-server open up a terminal or command prompt and type:
#> cd /svn/repos
#> svnadmin create mirror-repo
In my case, using VisualSVN makes this incredibly simple. Just open the administrative interface, right-click on the Repositories container, and create a new repository. Don’t check the box to create branches, tags and trunk! Then add a new user, syncuser. This is the only user that will need access to the mirror-server repositories. More on this a little later.
Now at this point, we have an empty mirror-repo at revision 0. Here’s the first gotcha. The svnsync program needs to store some special properties about its own synchronization activities. It does this by setting some properties on the repository at -r 0. In order to do this there has to be a valid pre-revision property change hook (pre-revprop-change) on the repository that calls exit 0. Some tutorials have you simply add the line exit 0 to your script, but I would recommend against this approach because it leaves the door open for some other user to modify properties and muck up the works. This hook is a perfect place to put your check that only syncuser is allowed to do things. Here’s the script I used for the pre-revision property change hook on Windows:
IF "%3" == "syncuser" (goto :label1) else (echo "Only syncuser may change revision properties" >&2)
exit 1
goto :eof
:label1
exit 0
You might get errors such as svnsync: DAV request failed, svnsync: Error setting property 'sync-lock' could not remove a property if you forget this step. It took me quite a while to come up with the above on Windows; the vast majority of samples online for pre-revision hooks are bash scripts.
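If your mirror lives on Linux instead, the same check might be sketched as below. Subversion passes the user name as the third argument to this hook, just as `%3` is used in the Windows script. The function name `install_sync_hook` and the example path are my own placeholders.

```shell
# Sketch: install a pre-revprop-change hook on a Unix mirror that only lets
# syncuser change revision properties. install_sync_hook is a made-up name.
install_sync_hook() {
    repo_path=$1                      # e.g. /svn/repos/mirror-repo
    hook="$repo_path/hooks/pre-revprop-change"
    cat > "$hook" <<'EOF'
#!/bin/sh
# Subversion passes: $1 repository path, $2 revision, $3 user,
# $4 property name, $5 action.
USER_NAME="$3"
if [ "$USER_NAME" = "syncuser" ]; then
    exit 0
fi
echo "Only syncuser may change revision properties" >&2
exit 1
EOF
    chmod +x "$hook"                  # hooks must be executable to run
}

# Usage (path is an assumption): install_sync_hook /svn/repos/mirror-repo
```

Anything the hook writes to stderr is what a rejected user sees, which makes the refusal message worth keeping.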
With the hook in place, copy source-repo.tgz over to mirror-server and unpack it:
#> cd /svn/repos
#> tar xzf source-repo.tgz
Step Four: Load the source repository
This is where these directions differ from most of what you’ll find online. Typically, the next step you normally see in other tutorials is to start the synchronize process with
#> svnsync init file:///X:/Repositories/mirror-repo http://source-server/svn/source-repo
I tried that first, and it did work fine, but I could tell shortly that it would take a very, very long time to sync from -r 0 to HEAD over the network. I subsequently got some extra advice on the Subversion user’s mailing list to perform the sync on a local repository, but the technique I describe here works just as well.
We can import our dump file to our mirror repository with the svnadmin load command as follows:
#> svnadmin load mirror-repo < source-repo.dump
This command can take a while to run, proportional to the size and number of commits in your svn dump file.
Step Five: Manually Set Sync Properties
Now we have two repositories that are exact copies of each other, but they aren’t yet set up in a master-slave configuration, and they’re not automatically syncing just yet. If you try to run the svnsync initialize command now, you’ll get the following error:
svnsync: Cannot initialize a repository with content in it
This can be really frustrating if you’ve never used svnsync before. As I said earlier, the svnsync program expects to be initialized on an empty repository at revision 0, to play the revision history forward from there. In this case, our repository has thousands of commits in it already, and we want to start up sync from the current revision forward.
To do this, we have to understand what happens when calling svnsync initialize. What happens is the svnsync program creates three special properties at -r 0, for tracking its own syncing activities. These can be seen on an actively mirrored Subversion repository with the svn proplist command.
#> svn proplist --revprop -r0 http://mirror-server/svn/mirror-repo
Unversioned properties on revision 0:
  svn:sync-from-uuid
  svn:sync-last-merged-rev
  svn:date
  svn:sync-from-url
You can ignore svn:date; only the svn:sync* properties are relevant to syncing. Okay, now that we know what the unversioned properties on -r 0 are, we’re going to hack our own values into those properties, using the svn propset command. We’ll take these one at a time.
To set the svn:sync-from-uuid property by hand, we need to find out the UUID of the source-server’s source-repo, with
#> svn info http://source-server/svn/source-repo
Authentication realm: <http://localhost:80> Subversion Repository
Password for 'yourusername':
Path: source-repo
URL: http://localhost/svn/source-repo
Repository Root: http://localhost/svn/zupper.com.br
Repository UUID: 9d96f4c0-7d9a-42f6-b8c8-54e79b961fad
Revision: 3738
Node Kind: directory
Last Changed Author: jsmith
Last Changed Rev: 3738
Last Changed Date: 2012-03-01 16:38:38 -0700 (Thu, 01 Mar 2012)
Okay, there we can see it in the output, so copy and paste it — you don’t want to type that. Back on mirror-server we can now issue this command:
#> svn propset --revprop -r0 svn:sync-from-uuid 9d96f4c0-7d9a-42f6-b8c8-54e79b961fad http://mirror-server/svn/mirror-repo
property 'svn:sync-from-uuid' set on repository revision 0
That response means it worked. Okay, next, we can set the last-merged-rev, or the revision that was last merged. To be safe, you should check the current revision number of both repositories, and use the lower of the two. That will probably be your mirror-repo’s, which simply indicates that someone has committed new code on source-repo since you took the dump.
#> svn propset --revprop -r0 svn:sync-last-merged-rev 3738 http://mirror-server/svn/mirror-repo
property 'svn:sync-last-merged-rev' set on repository revision 0
Again, a successful response. Next, we need to set the source URL on the mirror repository using
#> svn propset --revprop -r0 svn:sync-from-url http://source-server/svn/source-repo http://mirror-server/svn/mirror-repo
property 'svn:sync-from-url' set on repository revision 0
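If you would rather not copy the UUID and revision numbers by hand, they can be pulled out of the svn info output with a little shell. This is only a sketch: `info_field` and `lower_rev` are helper names I made up, and it assumes the English "Label: value" output format shown earlier.

```shell
# Hypothetical helper: print the value of one "Label: value" line from
# `svn info` output supplied on stdin.
info_field() {
    label=$1
    sed -n "s/^$label: //p"
}

# Print the lower of two revision numbers.
lower_rev() {
    if [ "$1" -le "$2" ]; then echo "$1"; else echo "$2"; fi
}

# Usage on mirror-server (URLs are the placeholder names from this guide):
#   SRC=http://source-server/svn/source-repo
#   DST=http://mirror-server/svn/mirror-repo
#   uuid=$(svn info "$SRC" | info_field "Repository UUID")
#   rev=$(lower_rev "$(svn info "$SRC" | info_field "Revision")" \
#                   "$(svn info "$DST" | info_field "Revision")")
#   svn propset --revprop -r0 svn:sync-from-uuid "$uuid" "$DST"
#   svn propset --revprop -r0 svn:sync-last-merged-rev "$rev" "$DST"
#   svn propset --revprop -r0 svn:sync-from-url "$SRC" "$DST"
```

Matching on the start of the line matters here, because "Revision" must not pick up the "Last Changed Rev" line as well.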
Great, now we’re ready to tell our Subversion mirror to sync:
#> svnsync synchronize http://mirror-server/svn/mirror-repo
Transmitting file data .
Committed revision 3739.
Copied properties for revision 3739.
You may not see a confirmation message exactly like mine… in my case it just means that the mirror was able to fetch 1 new change from source-repo.
Last Step: Automate synchronization
Now that we have two subversion repositories mirrored, we need to add a post-commit hook on our source-repo that pushes commits to the mirror. This is done by editing the repository’s post-commit hook. On the source-server
#> sudo vi /svn/repositories/source-repo/hooks/post-commit
svnsync --non-interactive --username syncuser --password XXXXXXX sync http://mirror-server/svn/mirror-repo/ &
That should be it. Commit some code as normal (to source-repo), then browse to your mirror-repo or do an svn info on it to make sure your commit made it over to mirror-server. If so, congratulations! You’ve just completed this tutorial and are twice as safe from Subversion hard drive failure as you were before.
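Because the hook runs svnsync in the background, the mirror can trail the source for a moment after a commit. A quick way to check, sketched below, is to compare the two head revisions. `head_rev` and `report_lag` are names of my own, and the URLs are the placeholders used in this guide.

```shell
# Hypothetical check: report whether the mirror has caught up with the source.
head_rev() {
    # Print the "Revision:" value that svn info reports for a repository URL.
    svn info "$1" | sed -n 's/^Revision: //p'
}

report_lag() {
    src_rev=$1
    mirror_rev=$2
    if [ "$mirror_rev" -ge "$src_rev" ]; then
        echo "mirror is up to date at r$mirror_rev"
    else
        echo "mirror is behind by $((src_rev - mirror_rev)) revision(s)"
    fi
}

# Usage:
#   report_lag "$(head_rev http://source-server/svn/source-repo)" \
#              "$(head_rev http://mirror-server/svn/mirror-repo)"
```

If the mirror stays behind, running svnsync synchronize by hand against the mirror usually shows the underlying error.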
One obvious security concern in the example above is that you probably don’t want to store the syncuser’s password in clear text in the post-commit hook. It doesn’t actually need to be there; I only showed it to make the point that your source-server has to be able to reach your mirror-server and have the syncuser password hashed or stored somewhere it can use. It’s not a big deal in our case, since our repos are on the LAN and nobody can fiddle without access to the box. In any case, there’s a variety of methods out there to conceal your Subversion password. Storing encrypted passwords on Ubuntu Server without Gnome Keyring… now that’s a whole other story.
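One simple option, sketched here, is to keep the password in a separate file that only the account running the hook can read, and have the hook load it from there. The file path and the `read_sync_password` helper are assumptions of mine. Note that this only keeps the password out of the hook file; it is still passed on the svnsync command line.

```shell
#!/bin/sh
# Sketch of a post-commit hook that keeps the password out of the hook file.
# The path below is an assumption; create the file with chmod 600, owned by
# the account that runs the hook.
PASSWORD_FILE=${PASSWORD_FILE:-/etc/subversion/syncuser.pass}

read_sync_password() {
    # Print the first line of the file, or fail if it cannot be read.
    [ -r "$1" ] || return 1
    head -n 1 "$1"
}

if password=$(read_sync_password "$PASSWORD_FILE"); then
    # Output is discarded so the commit does not wait on the background job.
    svnsync --non-interactive --username syncuser --password "$password" \
        sync http://mirror-server/svn/mirror-repo/ >/dev/null 2>&1 &
else
    echo "post-commit: cannot read $PASSWORD_FILE, mirror not synced" >&2
fi
```

A missed sync is not fatal: the next successful run of svnsync picks up every revision the mirror has not seen yet.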