Using Mercurial for Version Control

The Neuroinformatics Research Group has switched from using CVS to Mercurial, a modern and distributed version control system.

Why Mercurial?

Mercurial, often referred to by its command, "hg," offers several distinct advantages:

  • Better support of open-source development by allowing third-party developers without a commit bit on the central repository to still be first-class citizens when modifying the code. Mercurial also allows changes by other labs and developers to be merged more easily into the official code base.
  • Great support for merging. Too often with CVS, we devoted a whole day to merging changes. Mercurial greatly simplifies the process of merging.
  • Speed & Safety. Every checkout includes a full, local copy of the entire history. Therefore, most operations against the repository can be done without ever going over the "slow" network. The full copy of the repository also means that there is no single point of failure if a server fails. Mercurial additionally features atomic commits, so the repository is never left in a partial state.
  • Changesets allow for grouping of related changes into a single commit instead of committing each file individually, making it easier to understand the exact changes related to any specific feature or bug.
  • Easy learning curve coming from CVS and Subversion. Many of the commands and concepts are familiar to users of CVS and Subversion. Additionally, tools like TortoiseHG (http://bitbucket.org/tortoisehg/stable/wiki/Home) and MercurialEclipse (http://bitbucket.org/mercurialeclipse/main/wiki/Home) exist for users familiar with similar CVS and Subversion tools.

Getting Started

This page will not attempt to fully teach Mercurial, since there are many excellent guides already:

  • O'Reilly's Mercurial: The Definitive Guide (http://www.amazon.com/Mercurial-Definitive-Guide-Animal/dp/0596800673) by Bryan O'Sullivan is available for free online at hgbook.red-bean.com (http://hgbook.red-bean.com/).
  • Joel Spolsky's hginit is a very nice tutorial.
  • For the impatient, Mercurial has a quick start guide.

Tools for interacting with Mercurial include:

There are several other resources that we have found useful:

We host all NRG related repositories at bitbucket.org/nrg/.

Before we get started, please consider configuring Mercurial with your name and email address, and enabling the improved git-style diff format, which can track file renames. In ~/.hgrc or %USERPROFILE%\Mercurial.ini:

[ui]
username = John Paulett <me@wustl.edu>

[diff]
git = 1

To check out the XNAT repository, we can issue the following command:

hg clone https://bitbucket.org/nrg/xdat_release

You can then make changes and commit those changes.

Great Commit Messages

Instead of meaningless commit messages like "Fixed some bugs", try to provide a description of the changes and the reason for the change. Additionally, do not try to fit too much into a changeset. A logical unit for the size of a changeset is the minimum change needed to fix a bug or implement a feature. If fixing multiple bugs, it might make sense to split those into multiple changesets. Appropriately sized changesets with meaningful commit messages let other developers grasp what changes are occurring in the code base and potentially suggest improvements to the changeset.

Please refer to Louis Brandy's Writing better commit messages for a great discussion on commit messages.

Integration Manager Workflow

Mercurial supports multiple workflows, from a centralized model such as Subversion provides to the Dictator and Lieutenant model used by the Linux kernel. We have chosen to use an "Integration Manager" workflow, which allows for open development, but enforces that all code is checked by a second party before becoming part of the official, mainline repository that everyone relies upon.


file:hg_repos.svg

Let's use an example of xdat_core. All official NRG repositories are located at bitbucket.org/nrg (http://bitbucket.org/nrg), so the official version of xdat_core is at bitbucket.org/nrg/xdat_core (http://bitbucket.org/nrg/xdat_core). Tim Olsen and I (John) have personal public forks of xdat_core at bitbucket.org/timolsen23/xdat_core (now located at bitbucket.org/timolsen23/xdat_core_tim/src (https://bitbucket.org/timolsen23/xdat_core_tim/src)) and bitbucket.org/johnpaulett/xdat_core (http://bitbucket.org/johnpaulett/xdat_core/src), respectively. On my computer, I have cloned my public fork. Let's say I have made a change to fix a bug. I commit my change locally, then push to my public repository.

For xdat_core, Tim acts as the Integration Manager, so in bitbucket, I send a pull request to Tim to review my changes. Once Tim is comfortable with my changeset, he will pull from my public fork, then push into the official repository. If Tim found a problem with my changeset, he would likely ask me to fix the problem before he would push my code to the official repository.

Once the changes appear in the official repository, other developers can pull from the official repository into their local clones (merging, if needed), then eventually push to their public forks. If another developer needs a changeset from my public repository that is not yet in the official repository, she can pull directly from my public repository.

In my xdat_core/.hg/hgrc file, I have configured several aliases to the repositories I typically interact with:

[paths]
default = ssh://hg@bitbucket.org/johnpaulett/xdat_core/
nrg = ssh://hg@bitbucket.org/nrg/xdat_core/
tim = ssh://hg@bitbucket.org/timolsen23/xdat_core

So, I can pull from Tim's repository by:

hg pull tim

How to Get Your Change into the Mainline

One of the great advantages of Mercurial is that you no longer need commit access to NRG's CVS server to easily share your changes with the world. Previously, when a developer had a change, she would typically email the XNAT discussion list with the files she modified. This approach was error-prone because it required the developer to find all changed files and required an NRG developer to manually check for potential merge issues.

With Mercurial, there are several methods for contributing code. The first step, regardless of the method you choose, is to ensure that you have recently pulled from the official repository, merged in your changes, if necessary, and re-tested your changes. Doing so will make the work of the integration manager much easier and reduce the chance of introducing bugs.

By far the easiest method to contribute is to create your own fork on bitbucket, make your changes, push your changes to your fork on bitbucket, then send a Pull Request via bitbucket or the XNAT discussion list. The Integration Manager can then easily review your changes and integrate them into the official repository.

The Mercurial wiki details several other methods for communicating your changes (http://mercurial.selenic.com/wiki/CommunicatingChanges), including exports, bundles and patch bombs (http://mercurial.selenic.com/wiki/PatchbombExtension).

Please, share your changes! We are excited about growing the XNAT developer community.

Branching Strategies

Branching in Mercurial is very powerful, but terminology and architectural differences from other version control systems can lead to some confusion for developers.

Mercurial features named branches ("hg branch"), but we suggest staying away from them unless there is a specific reason for using them (they permanently reside in the repository).

In most cases a separate clone is the most effective form of branching for branches that should one day disappear (e.g. feature branches). For example, when converting xdat_core to Maven2, I created a clone of an existing xdat_core repository (local clones will use hard-links, thus saving disk space):

hg clone xdat_core xdat_core_mvn

I could then hack on the xdat_core_mvn project, adding support for Maven2. As bugs in xdat_core needed to be resolved, I could just cd into the xdat_core project, fix the bugs, then go back to xdat_core_mvn.

Advanced Mercurial

Useful hgrc Settings

You can have a global settings file at ~/.hgrc. Each repository has its own version at <repo>/.hg/hgrc, which can override or add settings. See the hgrc man page (http://www.selenic.com/mercurial/hgrc.5.html) for more information on this file and the Using Extensions (http://mercurial.selenic.com/wiki/UsingExtensions) wiki page to see additional extensions. hgtips also has a good introduction to the hgrc files.

Below is an example of my (JP) current ~/.hgrc file:

[ui]
username = First Last <me@wustl.edu>
# use the compact layout for hg log
style = compact
# speed up communication via SSH with compression
ssh = ssh -C

[extensions]
# get color output for commands like hg status
color =
# automate typical hg pull && hg update && hg merge cycle
fetch =
# enable the Mercurial Patch Queue (mq) extension
mq =
# pretty command line view of development lines (Usage: hg glog)
graphlog =
# locally bookmark changesets
bookmarks =
# conversion tools
convert =
# per-author change statistics (Usage: hg churn)
churn =

[diff]
# use the advanced git diff format instead of the old unix diff format
git=1

[bookmarks]
# only move the current bookmark forward
track.current = True

[alias]
# DANGEROUS revert that wipes everything
killitwithfire = revert --no-backup --all
# create an alias to find the status of a Mercurial patch queue
qstatus = status --rev -2:.
# version the queues
qinit = qinit -c
# add user name to patches
qnew = qnew -U
# show the past 5 changesets
recent = log -l 5
# make the web server look better
serve = serve --style gitweb
# include the file that the config appears in
debugconfig = showconfig --debug
# show changeset (Usage: hg show 27)
show = log --patch --rev

Reducing Merge Changesets

With a distributed version control system, multiple developers will often change the same file, so one of them needs to merge. Each merge is represented as a changeset in the repository's history, and excessive merge commits can clutter that history. While it is a relatively minor issue, a clean history is easier to browse and understand. There are two techniques in Mercurial that avoid merge changesets.

Mercurial Patch Queues are a system that allows you to carefully build multiple patches and only turn them into changesets when you are ready and synced up to the official tip.

Rebasing, a concept from git, takes any local changes you have made, severs them from the repository, pulls the tip from the remote repository, and finally plants your local changes on top of the tip. Never rebase changesets that you have already pushed out from your local repository (it can cause painful issues). Rebasing effectively edits the repository's history, and you never want to edit public history.

Mercurial Patch Queues (mq)

The MQ Extension provides a powerful tool for managing patches:

Using Git (Unsupported)

Some developers may prefer using git to Mercurial. GitHub developed a tool, hg-git (http://hg-git.github.com), for checking out git repositories using Mercurial. With a little work (http://traviscline.com/blog/2010/04/27/using-hg-git-to-work-in-git-and-push-to-hg), you can also go in the other direction and interact with a Mercurial repository using the git client.

We believe that the richest experience of developing for XNAT comes from using Mercurial, and we will only officially support Mercurial. However, if you are a die-hard git user, you still have an option for hacking on XNAT. Also consider looking at the Mercurial wiki's Git Concepts.


Mercurial Tips & Tricks

hg serve

Launches a web server that shows the repository's history at http://localhost:8000 (by default). Other users can pull from this webserver. Useful for quickly sharing a changeset.

Additional Resources
