Analysis of free software communities (III): activity

This post is part of a series: introduction (I), adoption (II), activity (III), work hours (IV), generations (V), and coda (VI).

  • Images: on the left, the number of changes to the codebase (commits) aggregated by year. On the right, the number of developers with at least 1 commit that year.
  • Data: trunk from project repositories during the period 1999-2010.
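As a rough illustration of how these two yearly figures can be derived, here is a minimal sketch. It assumes the repository log has already been parsed into (author, date) pairs; the function name and the sample records are hypothetical and are not the scripts or data used for this post.

```python
from collections import defaultdict
from datetime import date


def yearly_activity(commits):
    """Return {year: (number_of_commits, number_of_distinct_authors)}.

    `commits` is an iterable of (author, commit_date) pairs; this input
    format is an assumption made for the sketch.
    """
    commit_counts = defaultdict(int)
    authors = defaultdict(set)
    for author, day in commits:
        commit_counts[day.year] += 1   # left chart: commits per year
        authors[day.year].add(author)  # right chart: developers per year
    return {
        year: (commit_counts[year], len(authors[year]))
        for year in sorted(commit_counts)
    }


# Hypothetical records, for illustration only (not real project data).
sample = [
    ("alice", date(2004, 3, 1)),
    ("alice", date(2004, 7, 15)),
    ("bob", date(2004, 9, 2)),
    ("bob", date(2005, 1, 20)),
]
print(yearly_activity(sample))  # {2004: (3, 2), 2005: (1, 1)}
```

A developer counts for a given year as soon as they have one commit in it, which matches the "at least 1 commit that year" rule described above.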

Data patterns

Commit counts certainly do not measure the number of features developed or bugs fixed. It is even barely possible to compare activity between projects, because changesets vary widely: some people send several small changesets while others send a single big one, some projects have policies that affect the results (e.g., one commit that reformats the code according to the style rules and another with the actual changes), etc. Some people could even argue that the language a project is written in affects the number of changes (GRASS is written in C, gvSIG in Java and QGIS in C++) because of the libraries available or the semantics of each language. So, is it possible to find out anything? Well, in my opinion, we can trace at least the following:

  • the internal evolution of a project.
  • how a project is doing in terms of adding new blood.

So, once again, let’s try to find out what’s happening here:

GRASS

  • The activity curve of the project stands out: growth in periods (2001-2004 and 2005-2007) with local maxima in 2004 and 2007. Our hypothesis was that this was due to the way the project works: the developers make changes both in the trunk and in the branch of the release being prepared (be it 6.4 or 6.5) at the same time, with a lot of changesets moved between the trunk and the branches (i.e., heavy backporting). In a recent conversation, Markus Neteler explained to me in more detail how they work, and I guess the rhythm we see in the graphics is due to that.
  • In terms of number of developers, GRASS showed continuous growth until 2008; since then, the number of regular developers has stabilized.

gvSIG

  • gvSIG shows an incredibly high period of activity during 2006-2008 (4500 changesets per year and more than 30 people involved!). To understand the bell-shaped curve of activity, one needs to know the background of the project: gvSIG development has been led by contract, which means that all activities (planning, development, testing, etc.) were driven by the needs of the client who paid for them. Only recently have these processes been opened to a broader community (firms and volunteers collaborating in the project within the gvSIG association). So, it makes sense that the beginnings saw less activity (heavy planning phases) and that afterwards the project managed to bring together so many people in such a short period of time.
  • But in 2010 it suffered a sudden stop in development (only 233 changes to the codebase were made, compared with a pace of 4500 changes per year in previous years). This decrease in activity is highly correlated with the number of developers involved. It’s hard to say why it happened: could it be because efforts were directed to gvSIG 2.0 development? Could it be due to the reorganization of the project and the creation of the gvSIG association? Little can be said in this respect with the data available; further research is required to determine that.
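To put the 2010 drop in perspective, the two figures quoted above (about 4500 changes per year before, 233 in 2010) can be turned into a relative change. This is only a back-of-the-envelope sketch; the helper name is hypothetical, and 4500 is the approximate yearly pace mentioned in the text, not an exact count.

```python
def relative_change(previous, current):
    """Relative change from `previous` to `current` (negative means a drop)."""
    return (current - previous) / previous


# Figures quoted in the post: ~4500 changes/year before, 233 in 2010.
change = relative_change(4500, 233)
print(f"{change:.1%}")  # -94.8%
```

In other words, under these figures the trunk saw roughly a 95% decrease in commits from one period to the next.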

QGIS

  • Steady growth both in terms of contributions and contributors. The years 2004 and 2008 mark two peaks of activity and of people participating in the development. Our preliminary hypothesis was that this was due to the release of the first stable version and the release of 1.0, as well as becoming an official OSGeo project. Gary Sherman has confirmed that in a recent post (history of QGIS committers) and an interview (part1 and part2). Besides, he pointed out that in 2007 the project added Python support for plugin development, which was possibly one of the reasons for the growth in 2008 and afterwards.
  • An interesting finding is that every 4 years the project has doubled the number of developers involved, with slower but steady growth in activity.
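For a sense of scale, doubling every 4 years can be translated into an equivalent yearly rate. The sketch below assumes constant compound growth, which is a simplification (the real curve has peaks in 2004 and 2008); the function name is hypothetical.

```python
def annual_growth_rate(factor, years):
    """Constant yearly growth rate that multiplies a quantity by `factor`
    over `years` years (compound growth is assumed)."""
    return factor ** (1.0 / years) - 1.0


# Doubling (factor 2) every 4 years, as observed for QGIS developers.
rate = annual_growth_rate(2, 4)
print(f"{rate:.1%}")  # 18.9%
```

Under that assumption, the developer base would be growing by a bit less than 19% per year.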

Well, I hope these graphics have helped us better understand each project’s activity and the manpower each project is able to gather around it. The next post in the series will focus on the developers involved and the culture surrounding them. Looking forward to your feedback!


Comments

  1. Hi, very interesting post! Just one question: where did you find the data that you used in the charts?

    1. Hello Cesare, the data comes from the code repository of every project. We parsed it and generated the stats. If you are interested in playing with them, find them here.

  2. Regarding gvSIG, I guess you were looking at the gvSIG main repo, I don’t know. Just for the record, I want to note that gvSIG 2.0 development has been split into several OSOR projects, and because of Maven modularity there are many different locations where activity happens. César Ordiñana has been maintaining the list of repos at the gvSIG Desktop 2.0 entry at Ohloh: http://www.ohloh.net/p/gvsig-desktop-2/enlistments
    I agree that gvSIG development activity from “main contracts” has decreased, but contributions are increasing through small contracts that public administrations make to improve specific parts of the products. I like this approach a lot, as it demonstrates how mature public bodies’ decision makers have become in understanding what free software is (paying for improvements and maintenance, not just for new ultra-cool features).
    There is more to discuss here, but well, that’s enough for a blog comment 🙂
    Nice reports!!

    1. Yep, the report depicts the activity in the gvSIG 1.X line.
