uber_data_crawler adding bogus projects

It appears as if the uber_data_crawler account is scouring SourceForge and adding dead and/or abandoned projects. Is this on purpose? Can we please delete these projects?

These junk projects may inflate Ohloh's numbers but they also seriously dilute its effectiveness. I'm getting tired of searching for something only to have 99% of the results be something I can't use.

Thanx!
Richard

BTW: I was searching for grade, as in teacher grade book software, when I found this. Look at these projects for some prime examples:

Richard Hurt about 15 years ago
 

Some projects that are dead on sourceforge.net and were added by uberdatacrawler have since moved to other homes and were correctly registered by humans. This is the case with denemo:

the bad one (uberdatacrawler) (3 months ago):
https://www.ohloh.net/p/28635

the good one (2 years ago):
https://www.ohloh.net/p/denemo

Benoît Rouits about 15 years ago
 

I definitely do not want to remove any valid, human-registered projects. But I would like to know whether it is OK for us to remove the other, invalid projects registered by uberdatacrawler. If so, what is the proper way to remove them? And if we remove them, will they just get put back in a couple of months or so?

I'm sorry for any misunderstanding I might have caused.

Richard Hurt about 15 years ago
 

If you find a duplicate project, please do delete the duplicate. The instructions are in the FAQ. Once deleted, our data crawler will not re-add the project.

However, please don't delete projects just because they are abandoned or inactive. It is Ohloh's goal to record all open source activity, even if it does seem trivial. Knowing that the project is abandoned is better than knowing nothing about the project at all.

After that, it's our job to figure out which projects are relevant to your search.

Bad search results are not a consequence of too many projects -- they're a consequence of immature search tools. We've been building a new search engine over the last few weeks, so we are still working out some problems.

We're digging into the specific case you described right now, and should have some improvements very soon.

Thanks,
Robin

Robin Luckey about 15 years ago
 

I may have phrased that badly (I am not a native English speaker). I meant that uberdatacrawler was wrong, and I would also like to be able to remove uberdatacrawler's wrong entry and tell 'him' not to register it again.

Benoît Rouits about 15 years ago
 

Robin, it seems to me you guys have quite a task on your hands. Those two projects that be1 pointed out are duplicates, but that would be kinda hard for a computer to figure out. They have different names and homepages. Unless you have a seriously good natural language parser, I think you're going to have a tough time finding these needles in the Ohloh haystack. :)

One search suggestion I could make is to rank projects with no source code, users, maintainers, or RSS feeds at the bottom of the results, and/or show more information about project activity in the search results listing. That way I can page past all the junk projects.
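
To make the ranking idea concrete, here is a minimal sketch in Python. It is not how Ohloh's search works; the `Project` fields and the `relevance` score are hypothetical names made up for illustration, and it assumes "empty" means a project has none of the four signals (source code, users, maintainers, RSS feed).

```python
from dataclasses import dataclass


@dataclass
class Project:
    # All field names here are hypothetical, not Ohloh's actual schema.
    name: str
    relevance: float            # assumed text-match score from the search engine
    has_source_code: bool = False
    user_count: int = 0
    maintainer_count: int = 0
    has_rss_feed: bool = False


def is_empty(project):
    """True when the project has no source code, users, maintainers, or RSS feed."""
    return not (
        project.has_source_code
        or project.user_count
        or project.maintainer_count
        or project.has_rss_feed
    )


def rank(projects):
    """Order results so empty projects sink to the bottom.

    The sort key puts non-empty projects first (False sorts before True),
    then orders each group by descending relevance.
    """
    return sorted(projects, key=lambda p: (is_empty(p), -p.relevance))


results = rank([
    Project("grade-stub", relevance=0.9),
    Project("gradebook", relevance=0.5, has_source_code=True, user_count=3),
    Project("grader", relevance=0.7, has_rss_feed=True),
])
for project in results:
    print(project.name)
```

Nothing is deleted in this approach: the empty project still appears in the results, just after everything that shows some sign of life.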

Later...
Richard

Richard Hurt about 15 years ago