Tags : Browse Projects

file

  Analyzed 2 days ago

File tests each argument in an attempt to classify it. There are three sets of tests, performed in this order: filesystem tests, magic number tests, and language tests. The first test that succeeds causes the file type to be printed. Starting with version 4, the file command is not much more than a wrapper around the "magic" library.
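The ordering described above can be sketched as a chain of tests in which the first match wins. This is only an illustration under stated assumptions, not the file command's actual implementation: the function names, the three-entry magic table, and the UTF-8 based language check are all hypothetical simplifications, and the path is assumed to exist.

```python
import os

# Hypothetical, heavily abbreviated magic table: (leading bytes, description).
MAGIC_TABLE = [
    (b"\x89PNG\r\n\x1a\n", "PNG image data"),
    (b"%PDF-", "PDF document"),
    (b"\x1f\x8b", "gzip compressed data"),
]


def filesystem_test(path):
    """Classify from filesystem metadata alone, or return None."""
    if os.path.islink(path):
        return "symbolic link"
    if os.path.isdir(path):
        return "directory"
    if os.path.isfile(path) and os.path.getsize(path) == 0:
        return "empty"
    return None


def magic_test(path):
    """Compare the file's leading bytes against the table, or return None."""
    with open(path, "rb") as handle:
        head = handle.read(16)
    for signature, description in MAGIC_TABLE:
        if head.startswith(signature):
            return description
    return None


def language_test(path):
    """Crude text check (an assumption): a UTF-8 decodable sample is text."""
    with open(path, "rb") as handle:
        sample = handle.read(4096)
    try:
        text = sample.decode("utf-8")
    except UnicodeDecodeError:
        return None
    if text.startswith("#!"):
        return "script text executable"
    return "text"


def classify(path):
    """Run the three test sets in order; the first success decides the type."""
    for test in (filesystem_test, magic_test, language_test):
        result = test(path)
        if result is not None:
            return result
    # Nothing matched: fall back to a generic label.
    return "data"
```

Calling `classify(path)` on a directory stops at the filesystem test and never opens the path, which is why the order of the three test sets matters.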

15.3K lines of code

1 current contributor

5 days since last commit

157 users on Open Hub

Very Low Activity
4.4375
   
Natural Language Toolkit (NLTK)

  Analyzed about 7 hours ago

NLTK, the Natural Language Toolkit, is a suite of open source Python modules, linguistic data and documentation for research and development in natural language processing. It supports dozens of NLP tasks, with distributions for Windows, Mac OS X and Linux.

213K lines of code

60 current contributors

8 days since last commit

45 users on Open Hub

High Activity
5.0
 
WEKA

  Analyzed over 1 year ago

Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well-suited for developing new machine learning schemes.

707K lines of code

3 current contributors

over 1 year since last commit

37 users on Open Hub

Activity Not Available
3.93333
   
Licenses: No declared licenses

Apache Mahout

Claimed by Apache Software Foundation

  Analyzed 8 days ago

Apache Mahout's goal is to build scalable machine learning libraries. By scalable we mean scalable to reasonably large data sets. Our core algorithms for clustering, classification and batch-based collaborative filtering are implemented on top of Apache Hadoop using the map/reduce paradigm. However, we do not restrict contributions to Hadoop-based implementations: contributions that run on a single node or on a non-Hadoop cluster are welcome as well. The core libraries are highly optimized to allow for good performance for non-distributed algorithms too.

182K lines of code

14 current contributors

18 days since last commit

24 users on Open Hub

Moderate Activity
3.6
   
dlib C++ Library

  Analyzed about 11 hours ago

This project is a modern C++ library with a focus on portability and program correctness. It strives to be easy to use right and hard to use wrong. Thus, it comes with extensive documentation and thorough debugging modes. The library provides a platform abstraction layer for common tasks such as interfacing with network services, handling threads, or creating graphical user interfaces. Additionally, the library implements many useful algorithms such as data compression routines, linked lists, binary search trees, linear algebra and matrix utilities, machine learning algorithms, XML and text parsing, and many other general utilities.

373K lines of code

59 current contributors

7 days since last commit

10 users on Open Hub

High Activity
4.75
   
PyMVPA

  Analyzed 1 day ago

Python module to ease pattern classification analyses of large datasets. It provides high-level abstraction of typical processing steps (e.g. data preparation, classification, feature selection, generalization testing), a number of implementations of some popular algorithms (e.g. kNN, Ridge Regressions, Sparse Multinomial Logistic Regression, GPR, RFE, I-RELIEF), and bindings to external ML libraries (libsvm, shogun, R). While it is not limited to neuroimaging data (e.g. fMRI), it is eminently suited for such datasets.

139K lines of code

0 current contributors

over 1 year since last commit

8 users on Open Hub

Very Low Activity
5.0
 
Java Data Mining Package (JDMP)

  Analyzed 1 day ago

The Java Data Mining Package (JDMP) is an open source Java library for data analysis and machine learning. It facilitates access to data sources and machine learning algorithms (e.g. clustering, regression, classification, graphical models, optimization) and provides visualization modules. It includes a matrix library for storing and processing any kind of data, with the ability to handle very large matrices even when they do not fit into memory. Import and export interfaces are provided for JDBC databases, TXT, CSV, Excel, Matlab, LaTeX, MTX, HTML, WAV, BMP and other file formats. JDMP provides a number of algorithms and tools, as well as interfaces to other machine learning and data mining packages (Weka, LibSVM, Mallet, Lucene, Octave).

40.7K lines of code

0 current contributors

about 2 years since last commit

2 users on Open Hub

Inactive
0.0
 
FAKE GAME

  Analyzed 6 months ago

The FAKE GAME tool uses natural evolution to evolve Data Mining models. It incorporates several preprocessing, optimization and visualization methods aimed at streamlining the Knowledge Discovery process. Knowledge Extraction from data is being automated!

0 lines of code

0 current contributors

almost 8 years since last commit

2 users on Open Hub

Activity Not Available
0.0
 
Mostly written in language not available
Licenses: No declared licenses

MARF: Modular Audio Recognition Framework

  Analyzed 2 days ago

MARF is an open-source research platform and a collection of voice/sound/speech/text and natural language processing (NLP) algorithms written in Java and arranged into a modular and extensible framework that facilitates the addition of new algorithms. MARF can run in a distributed manner over the network and may act as a library in applications or be used as a source for learning and extension.

79.8K lines of code

0 current contributors

almost 2 years since last commit

2 users on Open Hub

Very Low Activity
5.0
 
Cryptolysis

  Analyzed over 1 year ago

A Java framework for testing heuristics in breaking classical ciphers.

3.42K lines of code

0 current contributors

almost 5 years since last commit

1 user on Open Hub

Activity Not Available
5.0
 