When the Archiveopteryx source code lived in Perforce, we would submit
everything to the src/main branch, review and test, then use p4 integrate
to merge selected changes into release branches like src/rel/2.0. The only
changes we submitted directly to the latter branches were release-specific,
like setting the version number in Jamsettings. We could safely re-run
p4 integrate at any time, and it would show us only those changes that we
had not already reviewed.
When we moved to git, we continued to work this way: development happened
in master, and we would use git cherry-pick to
integrate or backport selected changes into older release branches. New
release branches were created by branching from the current master, and
maintained the same way. We did this for almost two years and several
releases, but it was not much fun.
There was no easy way to answer the question "Which commits do I need to
consider for inclusion?" for any given release branch. In theory,
git log --cherry-pick will tell you, but it doesn't
work very well. We used to do monthly releases, but it was so painful to
deal with the build-up of commits at release time that we were forced to
backport changes in smaller batches throughout the month (but that was
not, in itself, a bad thing).
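For the record, this is roughly the query we mean, shown here against a throwaway repository. It is only a sketch: the branch name rel/2.0, the commit helper and the file names are made up for illustration, and --right-only needs a reasonably recent Git. The three-dot range together with --cherry-pick asks for commits on master that have no patch-equivalent commit on the release branch.

```shell
set -e
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Helper: one new file per commit, so cherry-picks apply cleanly.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git branch rel/2.0
commit feature-a
commit bugfix-b
commit feature-c

# Backport only the bugfix to the release branch.
git checkout -q rel/2.0
git cherry-pick master~1 >/dev/null

# Commits on master with no patch-equivalent commit on rel/2.0.
pending=$(git log --cherry-pick --right-only --no-merges --format=%s rel/2.0...master)
echo "$pending"
```

In this clean case the backported bugfix drops out of the list and only the two feature commits remain. The trouble starts when the cherry-picked patch no longer matches the original.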
Sometimes we had to backport an important fix before we had finished
reviewing all of the commits before it. Over time, out-of-order
application caused cherry-picked commits to have different patch-ids
from the original commits, thus defeating --cherry-pick's ability to
recognise them. So we had to keep wading through piles of
commits that we had already backported. A related problem was that
there's no simple way to record that we have considered a particular
change and decided not to backport it (a null integration in Perforce
terms). We had to fake it by creating a tag that recorded the last
commit we'd reviewed, and moving it forward after each review session.
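The marker-tag trick looks something like the following sketch. The tag name reviewed-for-2.0 and the commits are invented for the example; our real helper scripts did more than this, but the idea is the same.

```shell
set -e
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git branch rel/2.0
commit feature-a
commit bugfix-b

# End of a review session: everything up to here has been considered,
# whether or not it was actually backported.
git tag reviewed-for-2.0 master

commit feature-c
commit bugfix-d

# Next session: only commits after the marker need attention.
to_review=$(git log --reverse --format=%s reviewed-for-2.0..master)
echo "$to_review"

# When the session is over, move the marker forward.
git tag -f reviewed-for-2.0 master >/dev/null
remaining=$(git log --format=%s reviewed-for-2.0..master)
```

Note that the tag only records how far we have read, not what we decided about each commit, which is why it is a poor substitute for a real null integration.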
It's possible to do this with some discipline and a few helper scripts,
but cherry-picking individual commits is not a natural way to work with
Git. It breaks the history, as described above, and should be used only
in exceptional situations. In general, you should merge entire branches,
preserving history and keeping the toolchain firmly on your side. This
means you can't keep doing everything in master, because
there's no way to exclude the new commits you don't want to release
when you merge into an older release branch.
The trick is to make each change (by which I mean one or more commits
related to a single feature or bugfix) in its own branch so that it can
be merged forwards into any later branches without bringing in
unwanted commits. In practice, this means creating a "topic branch" for
each change, rooted at or before the oldest branch in which the change
needs to be released. Then you can decide, for each release branch,
whether to merge the topic branch or not. When you're done merging, you
can delete the branch. If you use consistent names for topic branches,
you can see what branches need to be merged at any time, and re-merging
a branch is safe (if you haven't made changes to it, nothing happens).
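Here is a small sketch of that workflow in a scratch repository. The names (rel/2.0, topic/fix-crash, the commit helper) are assumptions made up for the example. The topic branch is rooted at the merge-base of the release branch and master, so that merging it into master does not drag in the release-specific version commit.

```shell
set -e
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git branch rel/2.0
commit new-feature                 # on master, not wanted in 2.0
git checkout -q rel/2.0
commit set-version                 # release-specific change

# Root the topic branch at the point where rel/2.0 left master.
git checkout -q -b topic/fix-crash "$(git merge-base rel/2.0 master)"
commit fix-crash

# Merge the fix into the release branch first.
git checkout -q rel/2.0
git merge -q --no-edit topic/fix-crash

# Which topic branches still have to go into master?
pending_before=$(git for-each-ref --no-merged=master \
    --format='%(refname:short)' refs/heads/topic/)

git checkout -q master
git merge -q --no-edit topic/fix-crash
pending_after=$(git for-each-ref --no-merged=master \
    --format='%(refname:short)' refs/heads/topic/)

# Re-merging an unchanged topic branch does nothing.
head_before=$(git rev-parse HEAD)
git merge -q --no-edit topic/fix-crash
head_after=$(git rev-parse HEAD)

# Done merging everywhere, so the branch can go.
git branch -q -d topic/fix-crash
```

The release branch ends up with the fix but not new-feature, and the consistent topic/ prefix is what makes the "what is still unmerged?" query a one-liner.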
It's easy to create lots of branches in Git, but it's not always easy to
decide how far back a feature or bugfix is relevant. If you get it wrong
and don't realise it before you merge forwards, it's hard to then merge
the changes into an older branch (and that's a case in which cherry-pick
may be the only answer). But this is the workflow that Git was designed
for, and it makes sense for a project like the Linux kernel, with rapid
development, many unrelated contributors, and very little backporting.
It's a little harder to adjust to for a project like Archiveopteryx, but
it would simplify release management at the cost of the development-time
effort needed to always organise changes into separate branches.
Aside: whether you merge forwards or integrate backwards, you have to
deal with conflicts that arise when the code you're bringing in depends
on something that has changed in a newer branch, or something that isn't
in your branch at all. p4 integrate makes it much easier to review a merge
and fix conflicts than git merge, which just dumps you at the shell prompt.
A "git add -p"-like merge interface would be very useful.
There have been many proposals to change Git to allow easier tracking of
cherry-picked commits. The most promising one was to annotate them with
the original patch-id and maintain a local database to map patch-ids to
commits, and teach git-cherry to use it. Linus approved the
idea in principle, but Junio Hamano (the current Git maintainer) was not
especially thrilled with it and didn't want to support it directly as a
special case. With Git 1.7.x, however, it should be possible to use git notes
to annotate backported commits sensibly, but that idea is still firmly
in "roll your own" territory.
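To give a flavour of what rolling your own might involve, here is one possible sketch, not an established convention. It attaches a note to each original commit on master recording the decision made for a release branch, including "skipped", which gives us something like a null integration. The notes ref name backports, the note wording and the loop are all invented for this example.

```shell
set -e
# Throwaway repository; every name below is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git branch rel/2.0
commit feature-a
commit bugfix-b
commit feature-c
feat=$(git rev-parse master~2)
fix=$(git rev-parse master~1)

# Backport the bugfix and record that on the original commit.
git checkout -q rel/2.0
git cherry-pick "$fix" >/dev/null
git notes --ref=backports add -m "rel/2.0: backported as $(git rev-parse HEAD)" "$fix"

# Record a decision NOT to backport, too.
git notes --ref=backports add -m "rel/2.0: skipped" "$feat"

# Anything on master without a note has not been considered yet.
unreviewed=$(
  for c in $(git rev-list --reverse rel/2.0..master); do
    git notes --ref=backports show "$c" >/dev/null 2>&1 ||
      git log -1 --format=%s "$c"
  done
)
echo "$unreviewed"
```

Because the note lives on the original commit rather than depending on patch-ids, it survives out-of-order backports. The price is that someone has to remember to write the note every time.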