Tags : Browse Projects

file

  Analyzed about 1 month ago

File tests each argument in an attempt to classify it. There are three sets of tests, performed in this order: filesystem tests, magic number tests, and language tests. The first test that succeeds causes the file type to be printed. Starting with version 4, the file command is not much more than a wrapper around the "magic" library.

15K lines of code

4 current contributors

about 1 month since last commit

155 users on Open Hub

Activity Not Available
4.4375
   
Natural Language Toolkit (NLTK)

  Analyzed about 1 month ago

NLTK (the Natural Language Toolkit) is a suite of open source Python modules, linguistic data, and documentation for research and development in natural language processing. It supports dozens of NLP tasks and has distributions for Windows, Mac OS X, and Linux.

211K lines of code

47 current contributors

about 2 months since last commit

45 users on Open Hub

Activity Not Available
5.0
 
WEKA

  Analyzed 10 months ago

707K lines of code

3 current contributors

12 months since last commit

37 users on Open Hub

Activity Not Available
3.93333
   
Licenses: No declared licenses

Apache Mahout

Claimed by Apache Software Foundation

  Analyzed 20 days ago

Apache Mahout's goal is to build scalable machine learning libraries. By scalable we mean: scalable to reasonably large data sets. Our core algorithms for clustering, classification, and batch-based collaborative filtering are implemented on top of Apache Hadoop using the map/reduce paradigm. However, we do not restrict contributions to Hadoop-based implementations: contributions that run on a single node or on a non-Hadoop cluster are welcome as well. The core libraries are also highly optimized to give good performance for non-distributed algorithms.

136K lines of code

9 current contributors

about 2 months since last commit

25 users on Open Hub

Moderate Activity
3.6
   
dlib C++ Library

  Analyzed 9 months ago

This project is a modern C++ library with a focus on portability and program correctness. It strives to be easy to use right and hard to use wrong. Thus, it comes with extensive documentation and thorough debugging modes. The library provides a platform abstraction layer for common tasks such as interfacing with network services, handling threads, or creating graphical user interfaces. Additionally, the library implements many useful algorithms such as data compression routines, linked lists, binary search trees, linear algebra and matrix utilities, machine learning algorithms, XML and text parsing, and many other general utilities.

357K lines of code

23 current contributors

10 months since last commit

10 users on Open Hub

Activity Not Available
4.75
   
PyMVPA

  Analyzed 6 months ago

Python module to ease pattern classification analyses of large datasets. It provides high-level abstraction of typical processing steps (e.g. data preparation, classification, feature selection, generalization testing), a number of implementations of some popular algorithms (e.g. kNN, Ridge Regressions, Sparse Multinomial Logistic Regression, GPR, RFE, I-RELIEF), and bindings to external ML libraries (libsvm, shogun, R). While it is not limited to neuroimaging data (e.g. fMRI), it is eminently suited for such datasets.

139K lines of code

6 current contributors

about 1 year since last commit

8 users on Open Hub

Activity Not Available
5.0
 
FAKE GAME

  Analyzed 17 days ago

The FAKE GAME tool uses natural evolution to evolve Data Mining models. It incorporates several preprocessing, optimization and visualization methods aimed to streamline the Knowledge Discovery process. Knowledge Extraction from data is being automated!

0 lines of code

0 current contributors

over 7 years since last commit

2 users on Open Hub

Activity Not Available
0.0
 
Mostly written in: language not available
Licenses: No declared licenses

Java Data Mining Package (JDMP)

  Analyzed 20 days ago

The Java Data Mining Package (JDMP) is an open source Java library for data analysis and machine learning. It facilitates access to data sources and machine learning algorithms (e.g. clustering, regression, classification, graphical models, optimization) and provides visualization modules. It includes a matrix library for storing and processing any kind of data, with the ability to handle very large matrices even when they do not fit into memory. Import and export interfaces are provided for JDBC databases, TXT, CSV, Excel, MATLAB, LaTeX, MTX, HTML, WAV, BMP and other file formats. JDMP provides a number of algorithms and tools, but also interfaces to other machine learning and data mining packages (Weka, LibSVM, Mallet, Lucene, Octave).

40.7K lines of code

0 current contributors

over 1 year since last commit

2 users on Open Hub

Very Low Activity
0.0
 
MARF: Modular Audio Recognition Framework

  Analyzed 18 days ago

MARF is an open-source research platform and a collection of voice/sound/speech/text and natural language processing (NLP) algorithms written in Java and arranged into a modular and extensible framework that facilitates the addition of new algorithms. MARF can run in a distributed manner over the network and may act as a library in applications or be used as a source for learning and extension.

79.8K lines of code

0 current contributors

over 1 year since last commit

2 users on Open Hub

Very Low Activity
5.0
 
rapaio

  Analyzed 20 days ago

Statistics, data mining, and machine learning toolbox written in Java.

40.4K lines of code

2 current contributors

9 months since last commit

1 user on Open Hub

Very Low Activity
0.0
 