Linus Makes a git of Himself
Recently, Linus gave a Google tech talk on git, the SCM that he wrote to manage Linux kernel development after the falling out with BitKeeper. In the talk, Linus lambasts both CVS and Subversion, for a variety of reasons, and argues why he believes that git is far superior. In true Linus fashion, he frequently uses hyperbole and false(?) egotism as he lays out his arguments. This in itself could all be in good fun, but Linus takes it too far. The even worse part, in my opinion, is that Linus confidently peddles arguments that are either specious or poorly focused. In the end, he just loses a lot of respect.
Now both CVS and Subversion are far from perfect. In fact, I personally dislike CVS and am disappointed with the progress of Subversion. Git clearly includes some advanced features that empower the user when compared with either of these tools. However, the way Linus presents the superiority of git is very misleading. Firstly, he throws out several insults towards CVS/Subversion that are plain wrong, such as:
- Tarballs/patches (the way kernel development was managed way, way back) is superior to using CVS. This almost has to be a joke, but sadly I do not think it was meant that way. CVS may be lacking many powerful features, but this statement ignores the many extremely useful features that CVS does have.
- There is no way to do CVS “right”. Only if you buy the argument that decentralised is the only way, which Linus does not convince me of one bit.
- It is not possible to do repeated merges in CVS/Subversion. It may suck that the tools do not support this directly, but everyone knows how to do repeated merges with a small amount of extra effort on the user’s part. I dislike the extra effort, but this is very different from it being impossible.
- To merge in CVS you end up preparing for a week and setting aside a whole day. Only a stupid process, one that is not adapted to the tool being used, would ever lead to such a ridiculous situation. A more accurate criticism would be to say that CVS limits your ability to parallelise development due to weak merging capabilities.
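To make the "small amount of extra effort" for repeated merges concrete, here is a minimal sketch in Python of the bookkeeping a user ends up doing by hand when the tool does not track merges itself: remember how far a branch has already been merged, and only merge the revisions after that point next time. The names `MergeLog`, `next_range` and `record`, and the revision numbers, are made up for illustration; this is not how CVS or Subversion work internally.

```python
from dataclasses import dataclass, field


@dataclass
class MergeLog:
    """Hand-kept record of the last branch revision merged into trunk."""
    last_merged: dict = field(default_factory=dict)

    def next_range(self, branch, branch_point, head):
        """Return the (start, end) revision range still to merge, or None."""
        # The first merge starts from where the branch was created; later
        # merges start from wherever the previous merge stopped.
        start = self.last_merged.get(branch, branch_point)
        if start >= head:
            return None  # nothing new on the branch since the last merge
        return (start, head)

    def record(self, branch, merged_up_to):
        """Note how far the branch has been merged, after a successful merge."""
        self.last_merged[branch] = merged_up_to


log = MergeLog()
first = log.next_range("feature-x", branch_point=100, head=120)
log.record("feature-x", 120)
second = log.next_range("feature-x", branch_point=100, head=135)
log.record("feature-x", 135)
print(first, second)
```

In practice the "log" is usually a tag or a note in a commit message rather than a program. Forgetting to update it means re-merging changes that are already on trunk, which is exactly the annoyance in question, but it is an annoyance and not an impossibility.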
Ignoring the insults, the very core of Linus’ arguments does not make sense. He claims the absolute superiority of the decentralised model, but the key advantages that he uses to back this up are not unique to decentralised systems. Almost all of the key advantages are much more about the ability to handle complex branching and merging than they are about decentralisation. The live audience was not fooled, and one member even questioned Linus on pretty much this exact point a bit over halfway through the talk. In response, Linus starts off on a complete tangent about geographically distributed teams and network latency, leaving the question totally unanswered. The closest he gets is making some very weak points about namespace issues with branches in centralised systems (as if that would somehow be difficult to solve).
The inability to answer (or perhaps the ability to avoid) important questions is in fact a recurring theme throughout the talk. Another prime example is when an audience member asks an insightful question about whose responsibility it is to merge changes. In a centralised system, the author is naturally tasked with merging their own new work into the shared code base. The advantage here is they know their own changes and are thus well equipped to merge them. In a decentralised system, people pull changes from others and thus end up merging the new work of others into their own repository. Linus is so happy with this question he quips that he had paid to be asked it. Such a shame, then, that he fails to answer it in any meaningful way. Instead, he waffles about the powerful merging capabilities of git (again, not unique to distributed systems). He then hypothesises a case where he pulls a conflict and, instead of merging the change himself, asks the author of the change to do the merging for him. He concludes proudly that this is so “natural” in the distributed model. But hang on a second, the original point was that this would have actually been more natural in the centralised model: the author would have merged their own change and Linus need never have been bothered. Honestly, this is as close as you can get to stand-up comedy when giving a talk about an SCM!
All in all, a very disappointing talk. Distributed SCMs are a great thing, and bring many advantages. The powerful merging that the distributed model has brought (through necessity) is something that other SCM vendors should be taking a good look at. This is no way to promote the distributed model, however. A more thoughtful view on the pros and cons of distributed versus centralised models, including what each can take from the other, would have been much more informative. My feeling is that centralisation has many benefits for certain projects, and like most other things, the best solution would adapt the good parts of each model to the project at hand.
This entry was posted on Tuesday, June 5th, 2007 at 1:33 am and is filed under Opinion, SCM, Technology.

June 5th, 2007 at 4:10 pm
After studying (note, not using) popular decentralized systems, I became pretty convinced that the concepts are better than svn, but not all that different. They just get more of the basics right. Imagine hashes as version numbers instead of simple increments like in svn. Less stomping on feet that way. And imagine that a branch is just a separate repo. And you are right that they tend to have way better merging than svn. They can still be used in a centralized fashion if wanted, though.
June 5th, 2007 at 11:59 pm
Hi Tom,
I think the most interesting thing about the distributed model is the extra challenges it presents to the implementation. These challenges have given birth to solutions that would also be very useful in a centralised environment. Layering a partially centralised process and/or UI on top of the distributed technology – hiding some of the unwanted complexity of distribution that is not always needed – could provide a compelling replacement to traditional centralised SCMs.
July 11th, 2007 at 1:24 pm
Chris Suer says: I think the idea of categorising a version control system as centralised or decentralised is not going to make sense very soon and it’s probably causing a lot of confusion already.
All of the so called centralised version control systems will eventually have the ability to perform local commits (if they don’t already) at which point they become similar to the decentralised systems.
Any decentralised version control system can be considered a centralised system if you call one particular node the master (in the Linux case, it would be Linus’s tree).
I’m skipping over the differences in processes involved, e.g. push vs. pull, but they aren’t really anything to do with whether it’s centralised or not.
My biggest concern with git is that you have to have the entire history of a project downloaded locally. They say it’s fast so it doesn’t matter (I haven’t tried it) but it doesn’t sound like something that will scale (history only gets longer). I realise that you can do shallow copies with git but I believe that leaves you with a crippled local copy. It strikes me that whether history is present on a local machine is purely a caching issue and so the absence of it shouldn’t be a problem.
August 21st, 2007 at 3:03 am
Martijn Meijering says: Ah well, maybe Linus just likes stepping on people’s toes?
Personally, I find git, mercurial, monotone etc very interesting, but I’m very happy with subversion as it is, even without merge tracking. The thing is, coming from an XP background, I’m really into continuous integration, so I try not to use branching.
There are at least two ways to deal with the pain that comes from the merging that goes with branching: continuous integration or excellent merging tools.
If you do CI, you avoid working on a private branch, you just work in your working copy, synchronise a couple of times a day and use tools that make this easy. Not everybody likes this, but done properly (with unit testing, TDD etc), this is a breeze. Small changes, rare conflicts, no poisoning of the trunk. Subversion is excellent if you want to work like this.
The people who like git take the opposite approach: instead of avoiding the pain of merging by not branching, they accept branching as natural and develop excellent tools to make merging easy. If you like to work the way the kernel developers do, Subversion is probably not the tool you want to be using.
Some people will feel strongly that distributed development is better than CI, just as I think the opposite is true. But clearly, both methods can be very successful.
Git is a more powerful tool in the sense that it supports both models well. However, since I do not need the extra features it offers, I’d rather use Subversion, if only because it has nice GUIs such as Tortoise. This will undoubtedly change however, as I’m sure the git people will come up with decent GUIs as well.
Just my 2 cents.
August 22nd, 2007 at 12:09 am
Hi Martijn,
Yeah, Linus does enjoy taking that attitude. Just a shame he overdid that rather than talking about the really interesting parts of git.
Re: CI vs branching, I am as big a fan of CI as anyone. Heck, I started a company to make a CI server :). I also take your point that CI can be an alternative to some forms of branching. However, there are other forms of branching that CI cannot replace. For example, maintenance branches for old release streams and development branches for experimental/long term projects. In these cases developer teams cannot avoid parallel development, and there will always be merging.
Taking things even further, in an agile approach teams are encouraged to start simple and refactor mercilessly. This can be painful, however, when there are parallel codelines. Refactoring is the enemy of parallel development as the merging tools are not smart enough to handle it. This is a big enough topic to be worth its own post!