PHP Eats Rails for Breakfast

This thread is for discussion of the article PHP Eats Rails for Breakfast

Robin Luckey over 17 years ago

The reason for the huge increase in PHP is that PHP projects like Content Management Systems go through huge social upheavals and fork into separate projects on a regular basis.

For example, take this one

http://next.ohloh.net/projects/14

NOBODY writes 1.4 million lines of PHP on their own in one year.

Either your PHP lines of code scanner is whacked and can't handle projects like that (i.e. it is including HTML or something) or you are essentially massively amplifying the line count as large PHP projects fork.

Joomla! is another example. Millions of lines of code in one year? It's just not possible unless you are forking other projects outright.

Adam Kennedy over 17 years ago

As long as we don't tag every line of code with an author & license comment we will never have a chance to produce serious stats here. Every coder is grabbing code from others and using it in his own code - sometimes just part of a short line, sometimes thousands of lines. And - as already mentioned - forks start with an existing code base that others (or the same people!) created.

Open source code tends by design to mix and merge and split, and that is one of its biggest advantages over other development models, as it ensures much more flexibility and faster development. However, for bigger forked projects we could try to deduct the code taken from other still existing projects.

Greetings,
Chris

Chris Hildebrandt over 17 years ago

I work a lot on PHP and Perl projects -- and you have me pretty well profiled ;-) and I have nothing for or against either.

But I think your metrics are wrong.

Most Perl projects are published on CPAN, packaged in library-centric ways. Whole CMSs are published this way, and what you do is add a few mod_perl config options to your Apache config. So CPAN is where the action is for Perl code. Unfortunately, as CPAN doesn't have a centralised SCM, it's very hard to track. I cannot see any evidence of ohloh tracking CPAN at all.

In the case of PHP, most of the work is happening in environments like SourceForge that are easier for ohloh to track. This is because there is almost no code in PEAR (perhaps because the PEAR repo is much newer, the tools greener and the approach rather elitist). So the level of libification is lower and people just copy code around a lot more. In that sense, the approach is less sophisticated.

Perhaps PHP is still more dynamic -- I don't know. But your stats don't include the main repo where Perl code is being developed/published, so there is no way to tell.

regards

martin

Martin Langhoff almost 17 years ago

Is Java so low in the stats that it doesn't even appear? Just wondering... What is the proportion of other languages in these charts? (C, C++, Java, whatever...)
Statistics are never easily interpreted and I think most of us know it, but it's always nice to have some at hand, so I consider the article a good thing overall. Is there a way to access global statistics like this on a dynamic page?

ywarnier over 16 years ago

ywarnier,

I think you might be interested in the languages comparison page.

Robin Luckey over 16 years ago

Comparing lines of code may not be very valuable, because some languages are just plain more verbose than others (in this case, I think it's fair to say that PHP tends to be more verbose than either Python or Ruby).

Forest Bond over 14 years ago

@forest: Agreed. For most of our own internal data wrangling, we look at the number of commits rather than the number of lines of code. The number of commits tends to better represent the amount of work invested in a project.

That's why we go to the trouble of processing the source control history, and we don't accept tarballs for processing.
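
To make the difference concrete, here is a minimal sketch with made-up numbers. The `Commit` record and the `summarize` helper are hypothetical names for illustration only, not how Ohloh actually processes source control history. It shows how a single commit that imports an existing code base swamps the line count while adding only one commit:

```python
from dataclasses import dataclass


@dataclass
class Commit:
    project: str      # hypothetical project name
    lines_added: int  # lines of code added by this commit


def summarize(commits):
    """Return {project: (commit_count, total_lines_added)}."""
    totals = {}
    for c in commits:
        count, lines = totals.get(c.project, (0, 0))
        totals[c.project] = (count + 1, lines + c.lines_added)
    return totals


# Illustrative history: the fork starts by importing an existing code base.
history = [
    Commit("fork", 1_400_000),  # one commit importing the forked code
    Commit("fork", 120),
    Commit("original", 300),
    Commit("original", 250),
    Commit("original", 80),
]

summary = summarize(history)
for project, (commit_count, lines) in summary.items():
    print(f"{project}: {commit_count} commits, {lines} lines added")
```

Ranked by lines added, the fork looks thousands of times more active than the original; ranked by commits, it is the smaller of the two.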

Robin Luckey over 14 years ago