Forums : Feedback Forum

Project report update rate and development activity indicators

I was wondering if Ohloh takes into account the frequency of recent commits and the relative size of the codebase of a project to establish the optimal update rate. There are many projects with high development activity and a relatively small codebase. Updating these projects more frequently would result in more accurate statistics at a relatively small cost in terms of server workload. At the moment Ohloh seems to take into account only recency to schedule update rates.

Also, I was looking at the development activity metrics and was wondering whether you may want to change the current scale used to determine whether a project is active or not. At the moment only projects that have been in a revision control system for more than two years (a reference year plus the current year) display development activity statistics, right? Given the average commit rate of open source projects (http://cia.navi.cx/), this may be an unnecessarily large and not very meaningful scale for detecting changes in development activity. A short-term indicator (for instance the first derivative of the number of commits per week or per month) could be a more interesting metric in my opinion.
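To make the short-term indicator concrete, here is a minimal sketch of a "first derivative" of weekly commit counts. It is only an illustration: the function names and the sample commit dates are made up, and nothing here reflects how Ohloh actually computes its statistics.

```python
from collections import Counter
from datetime import date, timedelta


def weekly_commit_counts(commit_dates, start, weeks):
    """Count commits in each of `weeks` consecutive 7-day windows from `start`."""
    counts = Counter()
    for d in commit_dates:
        index = (d - start).days // 7
        if 0 <= index < weeks:
            counts[index] += 1
    return [counts[i] for i in range(weeks)]


def first_difference(series):
    """Discrete first derivative: change from one period to the next."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


# Made-up commit dates, for illustration only.
start = date(2007, 1, 1)
commits = [start + timedelta(days=n) for n in (0, 1, 3, 8, 9, 10, 11, 15, 22, 23)]

counts = weekly_commit_counts(commits, start, weeks=4)
trend = first_difference(counts)
print(counts)  # [3, 4, 1, 2]
print(trend)   # [1, -3, 1]
```

A positive value means more commits than in the previous week and a negative value means fewer. The same functions would work for monthly buckets with a different window length.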

dartar about 17 years ago
 

Hi dartar,

That's a great suggestion. We've been thinking of doing something similar for a while. Before adding this feature, however, we're hoping to throw hardware at the problem. We made a recent hardware purchase that should show up in the next 2 weeks. I'm hopeful that this will enable us to make daily project updates.

Meanwhile, let me know what projects you're interested in and I'll write a custom script to get your project(s) updated more frequently.

Jason Allen about 17 years ago
 

The project I maintain is WikkaWiki, but I think I can live with weekly (-ish) updates if this is the rule, so there's no need for custom treatment :) I do think it'd be really nice to have daily updates for all tracked projects, so I look forward to your hardware upgrade. Hopefully this will also allow more accurate statistics on recent activity, along the lines I suggested above. Keep up the good work.

dartar about 17 years ago
 

I decided to make a minor change that should help out: the wording of our current 'factoid'. It now reads:

Decreasing year-over-year development activity

The goal of this factoid is to give interested users an idea of the broad trends within a project. I don't see weekly or monthly activity factoids being very useful for users. However, they might be useful for project developers/maintainers. I think we should consider adding a monthly activity historical chart.

Does this sound reasonable?

Jason Allen about 17 years ago
 

My point was simply that development activity stats as described by factoids are biased towards projects that have at least two years of history in their code repository. This is especially true when factoids are used for ranking (e.g. project A has the largest activity, project B has the slowest development etc.), which is inaccurate unless the ranking is explicitly restricted to projects older than two years.

A monthly activity chart sounds like a nice addition though.

dartar about 17 years ago
 

I don't have a good suggestion yet. I just would like small projects not to appear dead. Maybe add another statement that says something about recent commit activity, e.g. average commits in the last year (daily, weekly, ...).
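As a rough sketch of such a statement, the averages could be computed from commit dates like this. The function name, the 365-day window and the sample data are all assumptions made for illustration, not anything Ohloh provides.

```python
from datetime import date, timedelta


def average_commit_rates(commit_dates, today, window_days=365):
    """Average commits per day, week and month over the trailing window."""
    cutoff = today - timedelta(days=window_days)
    recent = sum(1 for d in commit_dates if cutoff < d <= today)
    per_day = recent / window_days
    return {
        "commits": recent,
        "per_day": per_day,
        "per_week": per_day * 7,
        "per_month": per_day * 365 / 12,  # average month length
    }


# Made-up history: one commit every 5 days, plus one older commit
# that falls outside the trailing year and is ignored.
today = date(2007, 4, 1)
history = [today - timedelta(days=5 * n) for n in range(73)]
history.append(today - timedelta(days=400))

rates = average_commit_rates(history, today)
print(rates)
```

Even a small project with a steady trickle of commits would show a non-zero weekly average here, which is the point of the suggestion.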

Stefan Sauer about 17 years ago
 

Hi,

I also want to voice concerns at this 'Decreasing year-over-year development activity' as it relates to WIKINDX:
http://www.ohloh.net/projects/4584

If you look at the codebase history, it increases steeply year on year, but from January this year commit activity dropped to zero (no commits for the last few months until about a week ago because I was moving country/job). However, that's just three months, which is not 'year on year'.

Describing WIKINDX as having 'Decreasing year-over-year development activity' is not only not true but reflects badly on the project.

If monthly factoids/charts are no use to users then neither should be used to assess year on year activity (or is it just my bad luck that your 'year' starts at the beginning of January and so my commits in the first two weeks of January don't add up compared to other years?).

Apart from that, I appreciate the work you've done.

Mark.

sirfragalot almost 17 years ago
 

Our 'year-over-year' measurements consider full 12 month periods leading up to today, not calendar years.

We haven't updated the WIKINDX report in about a week, so we're missing the recent burst of activity over the last few days. Our report is based on the most recent changes we have for the project, from around the end of February.

For WIKINDX, there were 433 commits between March 1, 2005 and March 1, 2006. Over the next year from March 2006 to March 2007, there were only 292 commits. Is this not true?

We show the decreasing year-over-year activity flag when the number of commits is less than 75% of the number in the prior year. Given your recent burst of activity, the number of commits is probably rising quickly, and this flag should go away soon.
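The rule described above can be sketched in a few lines. The 75% threshold and the two trailing 12-month windows come from this post; the function names, the 365-day approximation of a year, and the handling of an empty prior year are assumptions for illustration only.

```python
from datetime import date, timedelta

DECREASE_THRESHOLD = 0.75  # from the post: flag below 75% of the prior year


def commits_in_window(commit_dates, start, end):
    """Count commits with start < date <= end."""
    return sum(1 for d in commit_dates if start < d <= end)


def is_decreasing(recent_commits, prior_commits, threshold=DECREASE_THRESHOLD):
    """True when recent commits fall below `threshold` of the prior period."""
    if prior_commits == 0:
        return False  # assumption: with no baseline, show no flag
    return recent_commits < threshold * prior_commits


def year_over_year_flag(commit_dates, today):
    """Compare the trailing 12 months with the 12 months before that."""
    one_year = timedelta(days=365)  # assumption: a year approximated as 365 days
    recent = commits_in_window(commit_dates, today - one_year, today)
    prior = commits_in_window(commit_dates, today - 2 * one_year, today - one_year)
    return is_decreasing(recent, prior)


# The WIKINDX figures quoted above: 292 / 433 is about 0.67, below 0.75.
print(is_decreasing(292, 433))  # True
print(is_decreasing(330, 433))  # False
```

With the numbers quoted for WIKINDX, the flag is raised because 292 is below 324.75 (75% of 433); about 33 more commits in the recent window would clear it.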

Robin Luckey almost 17 years ago
 

Lies, damn lies and statistics. How then do you explain the codebase activity chart, which continues to show a rise year on year? The rate of increase over 2005 may not be as much as the rate of increase over 2004 but that is of course to be expected with any software project. In the early stages of development there is a lot more of the base code to be developed than in subsequent years.

It's like writing a series of textbooks. First you have to write them, a period of frenetic activity to get your first 10 books out in public -- every single line is new. In subsequent years you may supplement this basis with one or two more books but you may equally spend time refining and updating your existing set, leading to new editions. This may not lead to such an increase in the number of new lines as in the first year but is, nonetheless, still an indication of ongoing and still frenetic activity. By all means present the graph showing a decline in the rate of new code production but, because this says nothing about ongoing refinement of existing code, I believe it's not true to say that the project has 'decreasing year-over-year development activity' (you only chart activity for just over three years -- a further reason why you have no basis to make this claim).

In the periods you give there are more ways of measuring activity than CVS commits. From March 05 to March 06 there were 29 releases (core, plug-ins and localisations) while from March 06 to date there have been 27. Fewer, yes, but bearing no relation to the roughly one-third decrease you describe in CVS commits (433 down to 292) and so surely not enough of a decline to be damned with the phrase 'decreasing year-over-year development activity'.

As you can see, my main complaint is that phrase. Not true depending upon which figures are used and not true because you don't have the year on year data to truly make the comparison (wait until you can measure a decline over five years or more before making such a sweeping pronouncement).

Mark

sirfragalot almost 17 years ago
 

Hi Mark,

The 'decreasing year-over-year development activity' flag is turning out to be more controversial than we expected. We're not interested in making developers feel damned or having bad luck. It's not Ohloh's goal to pass judgment, but rather to make the raw data simple enough to understand that non-technical users can easily make their own decisions.

The intention of this metric is to simply state that activity has slowed lately -- which you admit, having taken several months off this year. However, this is the feedback I'm hearing:

  1. It's not plainly obvious how we come up with this label. That's true; nowhere on the web site do we explain ourselves in this matter. Point taken.

  2. The time span should be longer. That's interesting, because we've also heard from people who want the time span shortened. I tend to agree with you, that a longer time span is better. However, perhaps the real problem is simply that we don't have a good, simple visualization of activity over time, which would make this label unnecessary.

  3. Our methodology for coming up with this metric might be flawed, because the number of commits isn't the sole metric for measuring activity. I agree with you. In this regard, I would emphasize that we're not trying to make a blanket evaluation of the project; we're just trying to state that the number of commits is going down, and we let the users make their own call. Our choice of language might lead a reader to think that we are implying more than this. We've changed the wording once already in response to feedback, and we're still open to suggestions on what to do about this metric.

I think there might be some confusion over the codebase history chart. This graph shows the total lines of code, not the amount of activity. We are planning to add an additional chart that displays the amount of activity over time -- and we'll get some better labels on these graphs.

We're trying to balance simplicity and utility while remaining transparent. As they say, our website is still under construction. Your feedback is welcome.

Thanks,
Robin

Robin Luckey almost 17 years ago
 

Hi Robin,

Thanks for the response. There are several reasons why the 'decreasing year-over-year development activity' metric is not a good label, and is one that I feel is misleading and damning (let's not forget, it is boldly presented with a warning symbol on the front page as part of the project summary). These reasons are supported by your statements above:

The intention of this metric is to simply state that activity has slowed lately. This is true but it would be more accurate to call this a 'month on month' decline and, in this case, it is a period of just three months. However, to tune it to the smaller period means that the system would be too closely correlated to blips in the CVS activity level as has been the case in my project.

It's not Ohloh's goal to pass judgment. Nevertheless, to those like myself who are strongly proprietorial and (I'll admit it) proud of our projects (which, in this case, is not a commercial business as I understand you plan to make yours -- correct me if I'm wrong), the 'year on year' statement comes across as highly judgemental. It is a value judgement based on one and only one set of interpreted data, where other data analysed in a different manner may show the opposite to be true. As an example of this, the recent burst of activity for WIKINDX (over just the past week following a hiatus of about three months) has resulted in my highest ever SourceForge daily ranking yet -- not quite in the top 50 SourceForge projects but very close. Obviously, SourceForge analyses different data differently and this is a day-by-day account -- today's figure contrasts very strongly with the picture a week ago.

I'm not suggesting that the time span for taking snapshots of 'activity' should be longer. I am suggesting that to measure just three years and then claim a 'decreasing year-over-year development activity' risks accusations of not using a long enough statistical sample with which to judge yearly activity.

I agree, a graph of CVS commits might present a clearer and more precise picture than the broad brush 'year on year' statement, but it should be clearly explained that this is not necessarily a way to judge the vitality of a project. Assessing the number of new lines committed is likewise not a good way to judge vitality, because it does not take into account debugging activity and refinement, which may not add lines (indeed, it may remove lines) because it simply edits existing lines. Neither is the number of commits/year necessarily a good measurement. Wikindx is the first open source project I've worked on and the first in which I've used CVS. I know from my own experience that as I mature in my familiarity with such a system I store up code for less frequent commits than I did in the first rush of enthusiasm. Additionally, the script system I latterly developed to upgrade the database of wikindx for existing users means that I'm less likely to commit frequently during a developmental stage, as this may cause problems for those accessing the CVS repository for the latest developmental snapshot.

In a previous post you give a CVS commits figure from March 2005 - March 2006 compared with the same period a year later. Do you have figures for the same period 2004 - 2005?

I know that there are any number of ways of interpreting data and any number of ways of selecting which data to interpret, which is why I gave the (in)famous quote at the beginning of my first post, and thus those gathering and interpreting such data (yourself) will always be open to accusations of bias, misinterpretation, selection etc. But please, change that label. The explanation behind it states that it represents a 'substantial decline in development activity' but it also states that it may mean 'a maturing software base'. One is negative (in terms of public perception of a project) whereas the other is positive. Yet the label on the front page reflects only the first. One could equally argue that the label should state: 'Maturing codebase year on year'. Naturally, I'd be happier with that....

Mark

sirfragalot almost 17 years ago
 

What about differentiating between 'the project's codebase is expanding' (commits that add stuff) and 'the project's codebase is maturing/getting refactored' (it's not growing a lot, but it's changing)?
I think the only fact worth a warning is a project with no commits for a year. If the commit rate is decreasing linearly over time, then a weak warning might also be okay. Too bad that bug trackers have no standard API, because a bug tracker with bugs filed but no activity is also a good indicator of a dead project.
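One way to picture this distinction is a small classifier over a year of line counts. This is only a sketch: the `YearStats` type, the label strings, and both thresholds (20% net growth, 365 days of silence) are hypothetical choices for illustration, not anything Ohloh implements.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class YearStats:
    lines_added: int    # lines added over the trailing year
    lines_removed: int  # lines removed over the trailing year
    last_commit: date   # date of the most recent commit


def classify(stats, today, growth_ratio=0.2, stale_days=365):
    """Label a project; both thresholds are illustrative assumptions."""
    if (today - stats.last_commit).days > stale_days:
        return "no commits for a year"
    churn = stats.lines_added + stats.lines_removed
    if churn == 0:
        return "unchanged"
    net_growth = stats.lines_added - stats.lines_removed
    # Mostly additions: the codebase is growing.
    if net_growth / churn >= growth_ratio:
        return "expanding"
    # Plenty of change but little net growth: refinement of existing code.
    return "maturing/refactoring"


today = date(2007, 4, 1)
print(classify(YearStats(5000, 500, date(2007, 3, 20)), today))   # expanding
print(classify(YearStats(1200, 1100, date(2007, 3, 20)), today))  # maturing/refactoring
print(classify(YearStats(300, 100, date(2005, 12, 1)), today))    # no commits for a year
```

Only the last case would earn a warning under the suggestion above; the first two are simply different kinds of healthy activity.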

Stefan Sauer almost 17 years ago