Managing release branches: git merge vs. p4 integrate

By Abhijit Menon-Sen <ams@toroid.org>

2010-05-02

When the Archiveopteryx source code lived in Perforce, we would submit everything to the src/main branch, review and test, then use p4 integrate to merge selected changes into release branches like src/rel/2.0. The only changes we submitted directly to the latter branches were release-specific, like setting the version number in Jamsettings. We could safely re-run p4 integrate at any time, and it would show us only those changes that we had not already reviewed.

When we moved to Git, we continued to work this way: development happened in master, and we used git cherry-pick to integrate or backport selected changes into older release branches. New release branches were created by branching from the current master, and maintained the same way. We did this for almost two years and several releases, but it was not much fun.

There was no easy way to answer the question "Which commits do I need to consider for inclusion?" for any given release branch. In theory, git log --cherry-pick will tell you, but in practice it doesn't work very well. We used to do monthly releases, but the build-up of commits at release time was so painful to deal with that we were forced to backport changes in smaller batches throughout the month (which was not, in itself, a bad thing).
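For reference, here is a minimal sketch of the kind of query meant here, run in a throwaway repository. The branch name rel/2.0, the file names and the commit messages are invented for illustration; only master comes from the description above.

```shell
#!/bin/sh
set -e

# Scratch repository so the example is self-contained.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.org"

echo base > file.txt
git add file.txt
git commit -q -m "base"
git branch rel/2.0                      # hypothetical release branch

# Two changes land on master.
echo fix > fix.txt
git add fix.txt
git commit -q -m "bugfix"
echo feature > feature.txt
git add feature.txt
git commit -q -m "feature"

# The release branch gets a release-specific commit, then the backport.
git checkout -q rel/2.0
echo 2.0 > version.txt
git add version.txt
git commit -q -m "set version"
git cherry-pick master~1 >/dev/null     # backport only the bugfix

# Commits on master that have no patch-equivalent commit on rel/2.0.
candidates=$(git log --cherry-pick --right-only --no-merges \
                     --format=%s rel/2.0...master)
echo "$candidates"
```

In this tidy case the backported bugfix is filtered out and only the feature commit is listed, because the cherry-picked patch is identical to the original.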

Sometimes we had to backport an important fix before we had finished reviewing all of the commits before it. Over time, out-of-order application caused cherry-picked commits to have different patch-ids from the original commits, thus defeating --cherry-pick's ability to recognise them. So we had to keep wading through piles of commits that we had already backported. A related problem was that there's no simple way to record that we have considered a particular change and decided not to backport it (a null integration in Perforce terms). We had to fake it by creating a tag that recorded the last commit we'd reviewed, and moving it forward after each review session.
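The tag workaround can be sketched roughly as follows. This is not our actual helper script; the tag name reviewed/rel-2.0 and the commits are made up for the example.

```shell
#!/bin/sh
set -e

# Scratch repository so the example is self-contained.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.org"

for n in 1 2 3 4; do
    echo "$n" > "change$n.txt"
    git add "change$n.txt"
    git commit -q -m "change $n"
done

# A review session covered everything up to "change 2", whether or not
# each commit was backported. Record that point with a movable tag
# (hypothetical name).
git tag -f reviewed/rel-2.0 master~2 >/dev/null

# At the next session, only commits after the tag need to be considered.
unreviewed=$(git log --reverse --format=%s reviewed/rel-2.0..master)
echo "$unreviewed"

# Once those have been reviewed, move the tag forward.
git tag -f reviewed/rel-2.0 master >/dev/null
remaining=$(git rev-list --count reviewed/rel-2.0..master)
```

The tag only records a single point in master's history, so it cannot express "reviewed this commit but not the one before it", which is exactly what out-of-order backporting needs.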

It's possible to do this with some discipline and a few helper scripts, but cherry-picking individual commits is not a natural way to work with Git. It breaks the history, as described above, and should be used only in exceptional situations. In general, you should merge entire branches, preserving history and keeping the toolchain firmly on your side. This means you can't keep doing everything in master, because merging master into an older release branch brings along every new commit, and there's no way to exclude the ones you don't want to release.

The trick is to make each change (by which I mean one or more commits related to a single feature or bugfix) in its own branch so that it can be merged forwards into any later branches without bringing in unwanted commits. In practice, this means creating a "topic branch" for each change, rooted at or before the oldest branch in which the change needs to be released. Then you can decide, for each release branch, whether to merge the topic branch or not. When you're done merging, you can delete the branch. If you use consistent names for topic branches, you can see what branches need to be merged at any time, and re-merging a branch is safe (if you haven't made changes to it, nothing happens).
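A minimal sketch of that workflow in a scratch repository follows. The names rel/2.0 and topic/fix-crash, and the topic/ prefix convention, are assumptions made for the example rather than anything Git prescribes.

```shell
#!/bin/sh
set -e

# Scratch repository so the example is self-contained.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.org"

echo base > file.txt
git add file.txt
git commit -q -m "base"
git branch rel/2.0                      # hypothetical release branch

# master moves ahead with work that is not meant for 2.0.
echo feature > feature.txt
git add feature.txt
git commit -q -m "feature for the next release"

# Root the topic branch at the oldest branch that needs the fix.
git checkout -q -b topic/fix-crash rel/2.0
echo fix > fix.txt
git add fix.txt
git commit -q -m "fix crash"

# With consistent names, list the topic branches not yet merged into 2.0.
pending=$(git branch --format='%(refname:short)' \
                     --no-merged rel/2.0 --list 'topic/*')
echo "$pending"

# Merge the topic forwards into each branch that should carry it.
git checkout -q rel/2.0
git merge -q --no-ff -m "Merge topic/fix-crash into rel/2.0" topic/fix-crash
git checkout -q master
git merge -q --no-ff -m "Merge topic/fix-crash into master" topic/fix-crash

# Re-merging an unchanged topic branch leaves HEAD where it was.
before=$(git rev-parse HEAD)
git merge -q topic/fix-crash
after=$(git rev-parse HEAD)

# The release branch has the fix but not the unreleased feature.
rel_files=$(git ls-tree --name-only rel/2.0)

# Done merging, so the topic branch can go.
git branch -d topic/fix-crash >/dev/null
```

The --no-ff option is a matter of taste here; it keeps a merge commit on the release branch even when a fast-forward would have been possible, so the topic stays visible in the history.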

It's easy to create lots of branches in Git, but it's not always easy to decide how far back a feature or bugfix is relevant. If you get it wrong and don't realise it before you merge forwards, it's hard to then merge the changes into an older branch (and that's a case in which cherry-pick may be the only answer). But this is the workflow that Git was designed for, and it makes sense for a project like the Linux kernel, with rapid development, many unrelated contributors, and very little backporting. The adjustment is a little harder for a project like Archiveopteryx, but it would simplify release management at the cost of the development-time effort needed to always organise changes into separate branches.

Aside: whether you merge forwards or integrate backwards, you have to deal with conflicts that arise when the code you're bringing in depends on something that has changed in a newer branch, or something that isn't in your branch at all. p4 integrate makes reviewing a merge and fixing conflicts much easier than git merge does; the latter just dumps you at the shell prompt. A "git add -p"-like merge interface would be very useful.
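For what it's worth, this is roughly what the shell-prompt experience looks like when a topic branch conflicts with something that has changed in master. It is only a sketch in a scratch repository; the branch names and file contents are invented, and the "resolution" is just a rewritten file standing in for manual editing.

```shell
#!/bin/sh
set -e

# Scratch repository so the example is self-contained.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.org"

printf 'one\ntwo\nthree\n' > file.txt
git add file.txt
git commit -q -m "base"
git branch rel/2.0                      # hypothetical release branch

# master changes a line...
printf 'one\ntwo (reworked in master)\nthree\n' > file.txt
git commit -q -am "rework in master"

# ...and a topic branch rooted at the release branch changes the same line.
git checkout -q -b topic/fix-two rel/2.0
printf 'one\ntwo (fixed)\nthree\n' > file.txt
git commit -q -am "fix line two"

# Merging forwards into master stops with a conflict.
git checkout -q master
git merge --no-commit topic/fix-two >/dev/null 2>&1 || true

# List the files that still need attention.
conflicted=$(git diff --name-only --diff-filter=U)
echo "$conflicted"

# Resolve by hand (simulated here), stage, and review before committing.
printf 'one\ntwo (reworked in master, fixed)\nthree\n' > file.txt
git add file.txt
git diff --cached --stat
git commit -q -m "Merge topic/fix-two into master"
```

Using --no-commit also stops a clean merge before the commit is made, which leaves a chance to inspect the staged result with git diff --cached first.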

There have been many proposals to change Git to allow easier tracking of cherry-picked commits. The most promising one was to annotate them with the original patch-id and maintain a local database to map patch-ids to commits, and teach git-cherry to use it. Linus approved the idea in principle, but Junio Hamano (the current Git maintainer) was not especially thrilled with it and didn't want to support it directly as a special case. With Git 1.7.x, however, it should be possible to use notes to annotate backported commits sensibly, but that idea is still firmly in "roll your own" territory.
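As an illustration of the "roll your own" approach, here is one way notes could be used to record where a backport came from and then filter the list of candidates. Everything about the scheme is an assumption made for this sketch: the notes ref name backports, the "backported-from:" line format, and the filtering loop are not an established convention.

```shell
#!/bin/sh
set -e

# Scratch repository so the example is self-contained.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.org"

echo base > file.txt
git add file.txt
git commit -q -m "base"
git branch rel/2.0                      # hypothetical release branch

echo fix > fix.txt
git add fix.txt
git commit -q -m "bugfix"
echo feature > feature.txt
git add feature.txt
git commit -q -m "feature"
orig=$(git rev-parse master~1)          # the bugfix commit on master

# Backport the bugfix onto a release branch that has diverged.
git checkout -q rel/2.0
echo 2.0 > version.txt
git add version.txt
git commit -q -m "set version"
git cherry-pick "$orig" >/dev/null

# Annotate the backported commit with the commit it came from, in a
# dedicated notes ref (hypothetical name and format).
git notes --ref=backports add -m "backported-from: $orig" HEAD
note=$(git notes --ref=backports show HEAD)

# Collect every recorded origin from the notes...
backported=$(git notes --ref=backports list |
    while read -r blob commit; do
        git cat-file -p "$blob" | sed -n 's/^backported-from: //p'
    done)

# ...and list the commits on master that have not been backported.
todo=$(for c in $(git rev-list --reverse rel/2.0..master); do
    case "$backported" in
        *"$c"*) ;;                                  # already backported
        *) git log -1 --format=%s "$c" ;;
    esac
done)
echo "$todo"
```

Unlike patch-id matching, this keeps working when the backported patch had to be modified, because the link is recorded explicitly. The same notes ref could also hold a marker for commits that were reviewed and deliberately skipped.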