Lotka's Law in OSS Authorship; and, Gauntlet Launches

My friend Adam, one of the founders of Gauntlet Systems, writes about Lotka’s Law, a type of 80/20 distribution, and its applicability to the authorship of open source software:

The 80/20 rule is actually only one example of a specific type of numerical relationship known generally as power laws. Much has been written about power laws and their applicability to everything from linguistics to hedge funds. Recently folks have been writing a lot about the power law scaling of web logs and of the “long tail” of web businesses. Way back in 1926, a statistician for MetLife named Alfred Lotka published a paper in which he observed that the productivity of scientific authors also followed a power-law relationship. Put simply, “Lotka’s Law” says that a few authors do most of the work, dragging along a long tail of less productive authors.

Using Gauntlet Systems’ analysis tools, we can look for this phenomenon in software development as well. These tools facilitate reporting on the activity in a software project – you can think of it as Business Intelligence for software project managers. We’ve pulled a number of Open Source Software projects into the Gauntlet System environment. We can easily use these tools to see how much different authors contribute to projects. Taking two popular projects, Lucene and Hibernate, as examples, we can easily generate the following graphs for activity over the past year.
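To make the "few authors do most of the work" idea concrete, here is a minimal sketch of how one might measure it from a list of commit authors. This is not Gauntlet's tooling; the function name `top_author_share`, the `top_fraction` parameter, and the sample data are all made up for illustration, and a real analysis would read author names out of a project's version control history.

```python
from collections import Counter


def top_author_share(commit_authors, top_fraction=0.2):
    """Return the fraction of all commits made by the most active
    `top_fraction` of authors (hypothetical helper, for illustration)."""
    counts = Counter(commit_authors)
    if not counts:
        return 0.0
    # Commit counts per author, most active first.
    ranked = sorted(counts.values(), reverse=True)
    # Always include at least one author in the "top" group.
    top_n = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:top_n]) / sum(ranked)


# Made-up sample: one author name per commit.
commits = (
    ["alice"] * 70 + ["bob"] * 15 + ["carol"] * 8 + ["dave"] * 4 + ["erin"] * 3
)
share = top_author_share(commits)
print(f"Top 20% of authors made {share:.0%} of commits")
```

A share well above the top fraction itself (here, one author in five accounting for most of the commits) is the kind of skew a power-law distribution would produce, though a single ratio like this does not by itself show that the data follows Lotka's Law.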

Adam and the rest of the Gauntlet team were all early engineers at WebLogic, and I’m psyched to see that they’ve launched a web site about their new company. I’m completely biased since they’re all friends, but what they’re doing is incredibly cool and worthwhile for any development team. (I wish I’d had it at my last company; what we were able to cobble together for similar purposes, in our “spare” time, was pathetic by comparison.) Their demo is full of great analytics of open source software projects. I think the tools they’re developing will become just as reflexively necessary for good software development as source control is today.