The Advisory Boar
Some months ago, someone posted a link to my article on
using Git to manage a web site to
the Hacker News web site,
where it was briefly quite popular.
In March, the founder of
Hacker Monthly magazine ("The
Best of Hacker News, in Print") contacted me to ask for permission to
reprint my article in the next issue. I got a print copy and a year's
digital subscription. I had only heard of the magazine in passing
before, but it looks nice.
Hacker Monthly issue #11 is out now, and here's
a PDF of my article.
I wrote long ago about the
trouble I had with Net::XMPP
while setting up a notification hook for Archiveopteryx, but I didn't
think anyone would find the script itself particularly interesting. But
people have asked me about it, so here it is.
When the Archiveopteryx source code lived in Perforce, we would submit
everything to the
src/main branch, review and test, then use
p4 integrate to merge selected changes into release branches like
src/rel/2.0. The only changes we submitted
directly to the latter branches were release-specific, like setting
the version number in
Jamsettings. We could safely re-run
p4 integrate at any time, and it would show us only those
changes that we had not already reviewed.
When we moved to git, we continued to work this way: development happened
on master, and we would use
git cherry-pick to
integrate or backport selected changes into older release branches. New
release branches were created by branching from the current master, and
maintained the same way. We did this for almost two years and several
releases, but it was not much fun.
There was no easy way to answer the question
"Which commits do
I need to consider for inclusion?" for any given release branch. In
theory, git log --cherry-pick will tell you, but it doesn't
work very well. We used to do monthly releases, but it was so painful to
deal with the build-up of commits at release time that we were forced to
backport changes in smaller batches throughout the month (but that was
not, in itself, a bad thing).
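For illustration, here is a rough sketch of how git log --cherry-pick can be asked that question. It builds a throwaway repository so that it can be run anywhere; the branch name rel-2.0, the file names, and the commit messages are all made up for the example.

```shell
#!/bin/sh
# Scratch demo: which commits on master have no equivalent on rel-2.0?
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch "master"
git config user.name "Example"
git config user.email "example@example.org"

echo base > file.txt
git add file.txt
git commit -q -m "base"
git branch rel-2.0                        # hypothetical release branch

echo fix > fix.txt
git add fix.txt
git commit -q -m "fix crash"
echo feature > feature.txt
git add feature.txt
git commit -q -m "new feature"

# Backport only the fix to the release branch.
git checkout -q rel-2.0
git cherry-pick master~1 >/dev/null
git checkout -q master

# Commits reachable only from master, minus any whose patch already
# exists on rel-2.0 (compared by patch-id).
pending=$(git log --cherry-pick --right-only --no-merges --format=%s rel-2.0...master)
echo "$pending"
```

A commit that had to be modified while backporting (to resolve a conflict, say) no longer has the same patch-id, so it would still be listed as pending.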
This tutorial explains how to share a Git repository among developers.
It is meant for small teams who are adopting Git for the first time, and
want to get started quickly with a familiar setup before exploring Git's
many new possibilities.
If you follow this route, you will end up with a single centrally-hosted
repository that everyone in your group can use to publish their own work
and fetch whatever others have published. People used to a centralised
VCS will find this model easy to adjust to, but of course, each user's
"working copy" will itself be a fully-fledged Git repository, and many
new workflows are available to users as they learn more.
It would help if you're familiar with basic Git terminology and usage,
but if not, you can skim through to find out which commands you need to
read about and experiment with. (I recommend
Git from the bottom up
for an introduction.) I shall assume that everyone has git 1.6.5 or
later installed, and that they have ssh access to the server that will
host the repository.
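As a rough sketch of where this route ends up, the following creates a shared bare repository and has two users publish to it and fetch from it. It uses local paths in a temporary directory so that it can be run as-is; on a real server the URL would be an ssh one, and the names project.git, alice, and bob are only placeholders.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)

# The central repository: bare (no working copy) and group-writable.
git init -q --bare --shared=group "$work/project.git"
git --git-dir="$work/project.git" symbolic-ref HEAD refs/heads/master

# First user: clone the (still empty) repository and publish some work.
# With a real server this would be something like
#   git clone ssh://server.example.org/path/to/project.git
git clone -q "$work/project.git" "$work/alice" 2>/dev/null
cd "$work/alice"
git config user.name "Alice"
git config user.email "alice@example.org"
echo hello > README
git add README
git commit -q -m "Initial commit"
git push -q origin HEAD:refs/heads/master

# Second user: clone and see what has been published.
git clone -q "$work/project.git" "$work/bob"
seen=$(git -C "$work/bob" log -1 --format=%s)
echo "$seen"
```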
I ran git commit and
git push, and a few
seconds later, the mains power died. Normally, I wouldn't have noticed,
but my trusty UPS is broken, so for the first time in many years, every
power glitch makes its presence felt; and now, I can fully experience
the joy of being bitten in the rear by Ext4's delayed allocation.
When my machine came up again, the newly-created commit object and some
associated tree objects were corrupted. My branch
pointed to that corrupted commit, so most git commands died with this
error:
$ git log
fatal: object 54590b644cb542d30ec962c138a763dddc26aac0 is corrupted
To my great good fortune, my
git push had completed before
the power failed, so I knew I could recover everything from the remote
repository. I flailed around a little before finding out how, but here's
what ultimately worked for me.
First, I kept running
git fsck and deleting the objects it
complained about:
$ git fsck
fatal: object 54590b644cb542d30ec962c138a763dddc26aac0 is corrupted
$ rm -f .git/objects/54/590b644cb542d30ec962c138a763dddc26aac0
Then I copied the corrupted objects back from the remote repository one
by one, using a trick Sam Vilain
showed me on IRC:
$ ssh remote.ho.st \
"git cat-file commit 54590b644cb542d30ec962c138a763dddc26aac0" | \
git hash-object -w -t commit --stdin
If I had deleted the corrupted objects and reset my branch to
point to an older commit, a plain old
git fetch should have
retrieved the missing objects. I didn't think of that soon enough, and
recovered the missing commit first, so
git fetch thought
everything was up to date. But fetching the objects one by one worked,
and git fsck stopped complaining.
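The same cat-file and hash-object trick can be tried safely on a scratch repository. The sketch below is only a stand-in for what happened: a local bare repository plays the remote (so no ssh is needed), and the damage is simulated by deleting a loose commit object rather than by a power failure. All paths and names are invented for the example.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
git init -q --bare "$work/remote.git"     # stand-in for the remote
git init -q "$work/local"
cd "$work/local"
git config user.name "Example"
git config user.email "example@example.org"

echo data > file.txt
git add file.txt
git commit -q -m "important work"
git push -q "$work/remote.git" HEAD:refs/heads/backup

# Simulate the damage: remove the loose object for the newest commit.
sha=$(git rev-parse HEAD)
obj=".git/objects/$(echo "$sha" | cut -c1-2)/$(echo "$sha" | cut -c3-)"
rm -f "$obj"

# Copy the object back from the other repository, one object at a time.
restored=$(git --git-dir="$work/remote.git" cat-file commit "$sha" |
    git hash-object -w -t commit --stdin)

if git fsck >/dev/null 2>&1; then fsck_ok=yes; else fsck_ok=no; fi
echo "$restored $fsck_ok"
```

For tree or blob objects, the type given to cat-file and hash-object would have to change accordingly.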
I'm not sure what I would have been able to do if the remote repository
had not been updated in time. I would almost certainly have lost the
most recent commit, and perhaps also its immediate parent.
I really hope my UPS gets fixed soon.
I've noticed that a lot of people in the open source world have a
negative opinion of Perforce,
whether they've used it or not. Here is one example:
There's also Perforce, which I don't know much about, but I gather it's
a crappy proprietary centralised VCS which is worse than Subversion in
pretty much every way.
This kind of offhand dismissal by people who are not familiar with
Perforce is very common. When we were switching from Perforce to git
for the Perl 5 source code, a lot of people assumed we wanted to do it
because Perforce wasn't good enough (but it was really because the open
source licensing procedure was non-trivial, and the lack of anonymous
repository access was seen as inhibiting contributors; there were also
objections to depending on a free-but-not-Free program).
There are other people who have used Perforce and not liked something
about it. Their opinions range from reasoned critiques to
[Dear Perforce… ] Fuck you, you miserable, untrustworthy, misleading,
overpriced bastard. I hope your office goes up in flames along with all
your off-site backups. I pray that some open source product that
actually works is embraced by all the major companies and drives you out
of business. I hope that no other company is duped by your salespeople
into thinking you have something even remotely close in quality to the
ancient and craptastic product known as CVS. Never before have I
experienced so much pain in the most simplistic of version control tasks
as I have since starting to work at a company that made the mistake of
I used Perforce exclusively for many years, both for large projects with
many other users and small personal projects, and my experience with it
was very different. I loved Perforce. I found it refreshingly simple to
learn, it worked fast and unsurprisingly and well, and it had excellent
support and documentation (of the kind that few open source programs of
any kind have, even now). I encountered only two or three minor bugs in
it after several years of use, and I never once had to fix the
repository (a welcome change from CVS).
There are, of course, many valid criticisms of Perforce, and my
intention is not to defend it against those. I've suffered from some of
its problems myself: its (mostly justifiable) dependence on the network
was at odds with my very slow dialup link, p4p (the proxy) didn't work
very well for me, some administrators I know had problems configuring
their server the way they wanted, and so on. I switched to git myself a
few years ago, and later helped other projects
I cared about to move away from Perforce too. I haven't regretted the
move.
But Perforce certainly did not suck, and there are some things I still
miss about it. As non-distributed VCSes go, I think Perforce is vastly
better than the (many) other programs I've used.
I wrote a git
post-commit hook that looks at certain files
in my repository whenever I change them, edits them a bit if it wants
to, and commits any changes it made. The changes it makes are not very
interesting, but such a hook could, for example, be used to maintain
"Last modified: ..." lines in static HTML files as shown below.
Let's say we want to update all
foo/*.html files that
contain something like the following line:
<div class=lastmod>Last modified: ...</div>
The idea is simple: use
git diff --name-only HEAD^ HEAD to
get a list of the files that were changed by the last commit, pick the
ones we're interested in, edit them using sed, and commit any changes.
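A minimal sketch of that idea follows. It is not the hook I actually use: the date format, the commit message, and the scratch repository around it are assumptions made so that the example can be run as-is, and it assumes file names without whitespace. The hook's own commit triggers the hook again, but the second run finds nothing left to change and stops.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.org"

mkdir -p .git/hooks
cat > .git/hooks/post-commit <<'EOF'
#!/bin/sh
# Nothing to compare against on the very first commit.
git rev-parse -q --verify HEAD^ >/dev/null || exit 0
today=$(date +%Y-%m-%d)
changed=""
# Files touched by the last commit, limited to foo/*.html.
for f in $(git diff --name-only HEAD^ HEAD | grep '^foo/[^/]*\.html$'); do
    [ -f "$f" ] || continue
    sed "s|<div class=lastmod>Last modified: .*</div>|<div class=lastmod>Last modified: $today</div>|" "$f" > "$f.tmp"
    if cmp -s "$f" "$f.tmp"; then
        rm -f "$f.tmp"                 # already up to date
    else
        mv "$f.tmp" "$f"
        changed="$changed $f"
    fi
done
# Commit only the files this hook edited.
[ -z "$changed" ] || git commit -q -m "Update Last modified lines" -- $changed
EOF
chmod +x .git/hooks/post-commit

mkdir foo
printf '%s\n' '<h1>Hello</h1>' '<div class=lastmod>Last modified: ...</div>' > foo/index.html
git add foo/index.html
git commit -q -m "Add page"            # first commit: the hook exits early
echo '<p>More</p>' >> foo/index.html
git commit -q -a -m "Edit page"        # the hook edits the file and commits

stamped=$(grep -c "Last modified: $(date +%Y-%m-%d)" foo/index.html)
commits=$(git rev-list --count HEAD)
echo "$stamped $commits"
```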
The HTML source for my (i.e., this) web site lives in a
Git repository on my local workstation.
This page describes how I set things up so that I can make changes live
by running just "git push web".
The one-line summary: push to a remote
repository that has a detached work tree, and a hook
that runs "git checkout -f".
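Here is a rough sketch of that arrangement, done entirely inside a temporary directory so that it can be tried without a server. The choice of a post-receive hook, the directory names site.git and www, and the use of GIT_WORK_TREE to point at the detached work tree are assumptions made for the example; on a real server the remote URL would be an ssh one.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
mkdir "$work/www"                          # stand-in for the web root

# The "remote" repository: bare, with its work tree living elsewhere.
git init -q --bare "$work/site.git"
git --git-dir="$work/site.git" symbolic-ref HEAD refs/heads/master
mkdir -p "$work/site.git/hooks"
cat > "$work/site.git/hooks/post-receive" <<EOF
#!/bin/sh
# After every push, check the latest tree out into the web root.
GIT_WORK_TREE="$work/www" git checkout -f
EOF
chmod +x "$work/site.git/hooks/post-receive"

# The local repository where the site is edited.
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.org"
echo '<h1>Hello</h1>' > index.html
git add index.html
git commit -q -m "First page"

git remote add web "$work/site.git"
git push -q web master 2>/dev/null         # triggers the hook

live=$(cat "$work/www/index.html")
echo "$live"
```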