Version Control Systems: Mercurial
Forest Bond
Overview
- Author: Matt Mackall
- Model: distributed
- Language: Python / C
- License: GPL
History
- Originally written as a possible BitKeeper replacement for the Linux
  kernel developers, but git had already been adopted by the time Mercurial
  was ready.
- Is now approaching version 1.0 (current is 0.9.3), and is used by several
  decently-sized projects:
- ALSA
- MoinMoin
- mutt
- NTFS-3G
- rpm.org
- Xen
Concept
Mercurial is a fully distributed version control system. It resembles
Bazaar-NG in some respects but differs in others:
- A branch is a working copy (checkout) plus a store (.hg directory).
- Cherry-picking changesets is disallowed by design.
User Interface
The primary Mercurial interface is the hg command-line utility:
- The hg command-line interface should be somewhat familiar to
  Subversion/CVS users, although it provides many more commands (the version
  I tested has 77).
- In addition to commands provided by svn and others, hg also provides
  clone, push, and pull for moving changesets between repos.
- The hg command set can be augmented through the use of extensions, some of
  which are shipped with hg itself.
User Interface (cont'd)
- The hg command-line utility calls external scripts to perform certain
  tasks: hgmerge, hgeditor.
- These scripts can be replaced to achieve limited customization of handling
  of critical activities like resolving conflicts and cryptographically
  signing changesets.
Data Management
- Mercurial stores the smaller of (delta, file object), delivering high
  disk-space efficiency.
- SHA1 hashes of file objects and changesets are stored both for
  identification and for data integrity.
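
The "smaller of (delta, file object)" rule and the hash check can be sketched in a few lines of Python. This is an illustration only: the prefix/suffix delta, the function names, and hashing the content alone are all assumptions made for the example, not Mercurial's actual delta format or hash inputs.

```python
import hashlib


def make_delta(old: bytes, new: bytes) -> bytes:
    """Hypothetical delta: drop the common prefix and suffix, keep the middle."""
    limit = min(len(old), len(new))
    prefix = 0
    while prefix < limit and old[prefix] == new[prefix]:
        prefix += 1
    suffix = 0
    while (suffix < limit - prefix
           and old[len(old) - 1 - suffix] == new[len(new) - 1 - suffix]):
        suffix += 1
    middle = new[prefix:len(new) - suffix]
    return b"%d,%d;" % (prefix, suffix) + middle


def apply_delta(old: bytes, delta: bytes) -> bytes:
    """Rebuild the new text from the old text and a delta from make_delta()."""
    header, middle = delta.split(b";", 1)
    prefix, suffix = (int(n) for n in header.split(b","))
    return old[:prefix] + middle + old[len(old) - suffix:]


def encode_revision(previous: bytes, text: bytes):
    """Return (kind, payload, digest), storing whichever payload is smaller."""
    delta = make_delta(previous, text)
    if len(delta) < len(text):
        kind, payload = "delta", delta
    else:
        kind, payload = "full", text
    # The SHA1 digest doubles as an identifier and an integrity check.
    return kind, payload, hashlib.sha1(text).hexdigest()


def decode_revision(previous: bytes, kind: str, payload: bytes, digest: str) -> bytes:
    """Reconstruct a revision and verify it against its stored digest."""
    text = apply_delta(previous, payload) if kind == "delta" else payload
    if hashlib.sha1(text).hexdigest() != digest:
        raise ValueError("integrity check failed")
    return text
```

A one-word change to a large file is stored as a short delta, while a brand-new file falls back to the full text because no delta can be smaller than it.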
Data Management (cont'd)
A RevLog is stored for each file object. Its index file points to the exact
location of the data for each revision of the object. From a design document
on the Mercurial wiki:
- "With one read of the index to fetch the record and then one read of the
  revlog, Mercurial can reconstruct any version of a file in time
  proportional to the file size."
- "So that adding a new version requires only O(1) seeks, the revlogs and
  their indices are append-only."
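
The index-plus-data idea in these quotes can be sketched as a toy append-only store. Everything here is an assumption for illustration: the `TinyRevlog` class, the fixed-size (offset, length) record, and storing full texts instead of deltas. It is not Mercurial's on-disk format.

```python
import os
import struct
import tempfile

# Hypothetical fixed-size index record: (offset into data file, length).
RECORD = struct.Struct(">QI")


class TinyRevlog:
    """Toy append-only store: one index file plus one data file."""

    def __init__(self, path):
        self.index_path = path + ".i"
        self.data_path = path + ".d"
        for p in (self.index_path, self.data_path):
            open(p, "ab").close()  # create the file if it is missing

    def add(self, text: bytes) -> int:
        """Append a revision; both files only ever grow at the end."""
        offset = os.path.getsize(self.data_path)
        with open(self.data_path, "ab") as data:
            data.write(text)
        rev = os.path.getsize(self.index_path) // RECORD.size
        with open(self.index_path, "ab") as index:
            index.write(RECORD.pack(offset, len(text)))
        return rev

    def read(self, rev: int) -> bytes:
        """One index read to find the record, then one data read."""
        with open(self.index_path, "rb") as index:
            index.seek(rev * RECORD.size)
            offset, length = RECORD.unpack(index.read(RECORD.size))
        with open(self.data_path, "rb") as data:
            data.seek(offset)
            return data.read(length)


def demo():
    """Store two revisions in a temporary directory and read them back."""
    with tempfile.TemporaryDirectory() as tmp:
        log = TinyRevlog(os.path.join(tmp, "example"))
        log.add(b"first\n")
        log.add(b"second version\n")
        return [log.read(0), log.read(1)]
```

Because index records have a fixed size, the record for revision N sits at a known offset, so a lookup needs no scan of earlier revisions.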
Advantages & Disadvantages
Advantages:
- Some systems perform worse (bzr)
- Small code size
- Cross-platform
- Low disk-space usage
- Written In Python :)
Disadvantages:
- Some systems perform better (git)
- Limited merge freedom (no cherry-picking)
- Yet another DVCS / not widely used
- Linus: lacks guarantees against corruption
Advantages – Small Code Size
Mercurial is significantly smaller than comparable packages:
| Package | Language | Lines Of Code |
| Mercurial (hg) 0.9.3 | Python/C | 20013 |
| Monotone 0.31 | C/C++ | 63462 |
| Bazaar-NG (bzr) 0.14 | Python | 63807 |
| git 1.4.4.2 | C with helper scripts | 78233 |
Note: these numbers are mine. I tried to be as fair as possible.
Advantages – Cross-Platform
Mercurial should run on just about any system with a reasonably complete
Python implementation. Binary packages are available from the Mercurial
website for the following platforms:
- GNU/Linux
- Mac OS X
- FreeBSD
- NetBSD
- Solaris
- Windows
Advantages & Disadvantages – Performance
From a page at Jst's Blog:
| Operation | bzr (0.12.0c1) | hg (0.9) | git (1.4.2.4) |
| diff (top level) | 16.957 | 5.600 | 1.572 |
| diff dom/ | 10.596 | 2.240 | 0.140 |
| diff dom/src/ | 10.504 | 2.212 | 0.124 |
| diff dom/src/base/ | 10.468 | 2.212 | 0.124 |
| diff dom/src/base/nsDOMClassInfo.cpp | 10.472 | 2.084 | 0.116 |
| diff dom/src/base/nsGlobalWindow.cpp | 10.012 | 2.024 | 0.088 |
| diff in dom/ | 16.833 | 5.548 | 0.136 |
| diff in dom/src/base/ | 16.881 | 5.504 | 0.112 |
Advantages & Disadvantages – Performance (cont'd)
The (dated and somewhat anecdotal) evidence on the previous slide suggests
that Mercurial is roughly 3-5 times faster than Bazaar-NG and anywhere from
about 4 to about 50 times slower than git, depending on the operation.
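
The speed ratios can be recomputed directly from the timings on the previous slide. The sketch below copies those figures as they appear in the table; the `TIMINGS` and `speed_ratios` names are made up for this example.

```python
# Timings copied from the table: (bzr, hg, git) per operation.
TIMINGS = {
    "diff (top level)": (16.957, 5.600, 1.572),
    "diff dom/": (10.596, 2.240, 0.140),
    "diff dom/src/": (10.504, 2.212, 0.124),
    "diff dom/src/base/": (10.468, 2.212, 0.124),
    "diff dom/src/base/nsDOMClassInfo.cpp": (10.472, 2.084, 0.116),
    "diff dom/src/base/nsGlobalWindow.cpp": (10.012, 2.024, 0.088),
    "diff in dom/": (16.833, 5.548, 0.136),
    "diff in dom/src/base/": (16.881, 5.504, 0.112),
}


def speed_ratios(rows):
    """Return ((min, max) of bzr/hg, (min, max) of hg/git) across all rows."""
    bzr_over_hg = [bzr / hg for bzr, hg, _ in rows.values()]
    hg_over_git = [hg / git for _, hg, git in rows.values()]
    return ((min(bzr_over_hg), max(bzr_over_hg)),
            (min(hg_over_git), max(hg_over_git)))


if __name__ == "__main__":
    (b_lo, b_hi), (g_lo, g_hi) = speed_ratios(TIMINGS)
    print(f"hg is {b_lo:.1f}x to {b_hi:.1f}x faster than bzr")
    print(f"hg is {g_lo:.1f}x to {g_hi:.1f}x slower than git")
```

The widest gaps to git come from the two "diff in" rows, where hg takes about as long as a top-level diff while git does not.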
Conclusion
Mercurial is a (relatively) fast & light distributed version control
system that suits a branch-merge, peer-to-peer workflow.