
Statistics for Drupal are flawed

Hey,

the LOC statistics for Drupal are definitely flawed somehow. Ohloh reports 21,103 LOC for Drupal core but there are definitely more.

The reason might be that many PHP files in Drupal have a different extension - namely .module, .engine and .theme. Without counting these, the statistics for Drupal are worthless, unfortunately, as more than half of Drupal's code lives in .module files.

As of today's Drupal CVS HEAD:

find drupal -type f \( -name '*.php' -o -name '*.inc' \) -exec egrep -vh '^$' {} \; | wc -l

27220

(= number of non-blank lines in .php and .inc files)

find drupal -type f \( -name '*.module' -o -name '*.theme' -o -name '*.engine' \) -exec egrep -vh '^$' {} \; | wc -l

27351

(= number of non-blank lines in .module, .theme and .engine files)

So, Ohloh is basically ignoring half of Drupal's code.

One solution would be to make either the file types that are used to calculate the LOC or the filetype->language mapping a project-specific setting.
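For anyone who wants to reproduce the two counts earlier in this post without find and egrep, here is a minimal Python sketch. The function name count_nonblank_lines and the two extension sets are made up for illustration; this is not how Ohloh counts lines, it only mirrors the shell commands.

```python
import os
from collections import Counter

# Extension groups taken from the two find commands in this post.
STANDARD = {".php", ".inc"}
DRUPAL_ONLY = {".module", ".theme", ".engine"}


def count_nonblank_lines(root):
    """Return a Counter mapping file extension -> number of non-empty lines."""
    wanted = STANDARD | DRUPAL_ONLY
    totals = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1]
            if ext not in wanted:
                continue
            with open(os.path.join(dirpath, name), "rb") as handle:
                # Like egrep -v '^$': drop only lines that are completely
                # empty; lines holding just whitespace are still counted.
                totals[ext] += sum(1 for line in handle if line.rstrip(b"\n"))
    return totals
```

Summing the totals for STANDARD and for DRUPAL_ONLY separately gives the two numbers compared above.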


Frando

almost 7 years ago
 

I'm afraid making file extensions project-specific would allow many people to "cheat". Since PHP files always contain "<?php", could a solution be to search for that opening tag in non-.php files? Of course, there are also <? and <%, but these are disabled by default and do not guarantee that the file contains PHP (it could also be XML or ASP).


Dietrich Moerman

almost 7 years ago
 

Greetings all,

Our detector uses file extensions and file contents to try to determine the language of each file. As Frando suspects, we do NOT currently recognize .module, .theme and .engine files as PHP.

Dietrich - we have some disambiguation logic to try and tell if a file should be treated as X or Y. So, the rule COULD be something like:

`if extension =~ /.module|.theme|.engine/ AND file.contents =~ /<\?php/` [...] there's always outliers that make life difficult. Frando, Dietrich - what do you think?
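A minimal Python sketch of the kind of rule described in this post, for illustration only. The names looks_like_php, AMBIGUOUS_EXTENSIONS and PHP_OPEN_TAG are hypothetical and are not taken from the actual detector; the assumption is simply "ambiguous extension plus a PHP open tag in the contents means PHP".

```python
import re

# Extensions named in this thread as not yet recognized as PHP.
AMBIGUOUS_EXTENSIONS = (".module", ".theme", ".engine")

# The full PHP open tag; the short tags <? and <% are deliberately ignored
# because they also appear in XML and ASP files.
PHP_OPEN_TAG = re.compile(rb"<\?php\b")


def looks_like_php(filename, contents):
    """Hypothetical rule: ambiguous extension AND a PHP open tag in the bytes."""
    if not filename.endswith(AMBIGUOUS_EXTENSIONS):
        return False
    return PHP_OPEN_TAG.search(contents) is not None
```

With this rule, a file named system.module that starts with the PHP open tag is treated as PHP, while a .module file holding only an XML declaration is not.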


Jason Allen

almost 7 years ago
 

I think this would be a nice solution. :)


Dietrich Moerman

almost 7 years ago
 

Yup, that should work. All PHP files must contain "<?php", so checking against that sounds like the best thing to do.

Here's a complete list of file endings that Drupal uses at the moment for PHP files:

.php .inc .module .theme .engine .schema .install .profile

This applies to both Drupal (core) and Drupal (contributions).

Maybe just checking all text files against "<?php" would be the easiest and most future-proof?
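To make the "check all text files" idea concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the name is_php_text_file, the 8192-byte sniff size, and the heuristic that a NUL byte marks a binary file. It ignores the extension entirely.

```python
def is_php_text_file(path, sniff_bytes=8192):
    """Return True if the start of the file looks like text with a PHP open tag.

    Only the first sniff_bytes bytes are inspected, so a tag that appears
    later in a large file is missed.
    """
    with open(path, "rb") as handle:
        head = handle.read(sniff_bytes)
    if b"\0" in head:
        # Assumption: a NUL byte in the first chunk means a binary file.
        return False
    return b"<?php" in head
```

The upside is that new extensions such as .schema or .install need no special handling; the downside is that every text file in the repository has to be opened.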

Thanks for your efforts in fixing this!


Frando

almost 7 years ago
 

Wouldn't it be easier to use some MIME magic on the non-binary files to figure out what they are? The Unix 'file' utility does a good job of figuring out the file type:

file index.php includes/common.inc modules/system/system.module

Gives

index.php: PHP script text

includes/common.inc: PHP script text

modules/system/system.module: PHP script text


elmuerte

almost 7 years ago
 

The file utility does exactly what the fix introduced in Ohloh does: it reads the file looking for a PHP open tag. I tried this out myself.

$ file tagadelic.module
tagadelic.module: PHP script text

After removing the PHP open tag:

$ file tagadelic.module
tagadelic.module: ASCII C++ program text, with very long lines

So, I think the original solution is the best (no need to use third-party, *NIX-only binaries).


Dietrich Moerman

almost 7 years ago
 

Any news here?


Frando

almost 7 years ago
 

bump


Frando

almost 7 years ago
 
