Tags : Browse Projects

Scrapy

  Analyzed about 20 hours ago

Scrapy is a fast high-level scraping and web crawling framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.

36K lines of code

50 current contributors

5 days since last commit

20 users on Open Hub

High Activity
Rating: 5.0

Coq proof assistant

  Analyzed about 12 hours ago

Coq is a formal proof management system: a proof done with Coq is mechanically checked by the machine. In particular, Coq allows users to:
* define functions or predicates,
* state mathematical theorems and software specifications,
* interactively develop formal proofs of these theorems,
* check these proofs with a relatively small certification "kernel".

456K lines of code

74 current contributors

1 day since last commit

19 users on Open Hub

Very High Activity
Rating: 4.85714

Weboob

  Analyzed 2 days ago

Web Outside of Browsers. Weboob is a collection of applications able to interact with websites, without requiring the user to open them in a browser. It also provides well-defined APIs to talk to websites lacking one.

122K lines of code

50 current contributors

2 days since last commit

12 users on Open Hub

Very High Activity
Rating: 5.0

GNU libextractor

  Analyzed 3 months ago

GNU libextractor is a library used to extract meta-data from files. The goal is to provide developers of file-sharing networks or WWW-indexing bots with a universal library to obtain simple keywords to match against queries. Currently, GNU libextractor supports the following formats: HTML, PS, OLE2 (DOC, XLS, PPT), OpenOffice (sxw), StarOffice (sdw), DVI, MAN, FLAC, MP3 (ID3v1 and ID3v2), NSF (NES Sound Format), SID, OGG, WAV, EXIV2, JPEG, GIF, PNG, TIFF, DEB, RPM, TAR(.GZ), ZIP, FLV, REAL, RIFF (AVI), MPEG, QT and ASF. Also, various additional MIME types are detected.

99.6K lines of code

2 current contributors

5 months since last commit

1 user on Open Hub

Activity Not Available
Rating: 0.0

7-Zip-JBinding

  Analyzed over 1 year ago

Native (JNI) cross-platform library to extract (password-protected, multi-part) 7z, Zip, Rar, Tar, Split, Lzma, Iso, HFS, GZip, Cpio, BZip2, Z, Arj, Chm, Lhz, Cab, Nsis, Deb, Rpm, Wim and Udf archives from Java. Archive creation and more formats coming soon.

300K lines of code

0 current contributors

over 5 years since last commit

1 user on Open Hub

Activity Not Available
Rating: 0.0

Hyperkit Java/Groovy Web Crawler F.

  Analyzed over 1 year ago

The Hyperkit Java/Groovy web crawler framework provides the basics for running scriptable web crawler tasks. The core of the framework is implemented in pure Java; crawler tasks are linked with arbitrary Groovy scripts and stored in a queue.

281 lines of code

0 current contributors

almost 10 years since last commit

1 user on Open Hub

Activity Not Available
Rating: 0.0

auto-unrar

  Analyzed 2 days ago

Smart extraction of many RAR archives on Linux, written in Perl. Features:
* handles all three multipart archive naming conventions
* duplicates the directory structure tree (see http://bit.ly/dafLxF)
* moves/renames normal files (non-RAR archives)
* checks minimum free space on the device
* can delete archives if they were extracted successfully
* saves status to a file
* smart error handling (maintains a list of undo actions for recovery to the initial state)
* support for rsync integration (generates an rsync exclude list, checks mtime, ...)
* and many more

1.97K lines of code

0 current contributors

over 6 years since last commit

1 user on Open Hub

Inactive
Rating: 5.0

Apache Tika for TYPO3

  Analyzed 1 day ago

Apache Tika for TYPO3 offers several services to extract metadata and content from files. The extension also comes with a service to detect the language of a text (requires Tika 0.8+). EXT:tika can use either a locally available Tika CLI app or a remote Apache Solr server. The provided services can then be used by other extensions such as EXT:dam or EXT:solr.

4.18K lines of code

2 current contributors

about 1 month since last commit

1 user on Open Hub

Low Activity
Rating: 5.0

Feature Extraction plugin API

  Analyzed over 1 year ago

Easy-to-use platform-independent plugin API for the extraction of low-level features from audio data in PCM format, as required in the context of music information retrieval software.

1.71K lines of code

0 current contributors

about 15 years since last commit

1 user on Open Hub

Activity Not Available
Rating: 0.0
Licenses: No declared licenses

QuickCode (formerly ScraperWiki)

  Analyzed over 1 year ago

QuickCode is the new name for the original ScraperWiki product. We renamed it, as it isn’t a wiki or just for scraping any more. It’s a Python and R data analysis environment, ideal for economists, statisticians and data managers who are new to coding.

188K lines of code

1 current contributor

about 2 years since last commit

1 user on Open Hub

Activity Not Available
Rating: 0.0