Tags : Browse Projects

file

  Analyzed about 18 hours ago

File tests each argument in an attempt to classify it. There are three sets of tests, performed in this order: filesystem tests, magic number tests, and language tests. The first test that succeeds causes the file type to be printed. Starting with version 4, the file command is not much more than a wrapper around the "magic" library.
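The test ordering described above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation of file or the "magic" library: the function names, the three-entry signature table, and the output strings are all assumptions made for the example.

```python
import os
import stat

# Illustrative signature table (an assumption for this sketch);
# the real magic database is far larger and more expressive.
MAGIC_SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG image data"),
    (b"%PDF-", "PDF document"),
    (b"\x1f\x8b", "gzip compressed data"),
]


def filesystem_test(path):
    """Stage 1: classify from file metadata alone, without reading contents."""
    st = os.lstat(path)
    if stat.S_ISDIR(st.st_mode):
        return "directory"
    if stat.S_ISLNK(st.st_mode):
        return "symbolic link"
    if st.st_size == 0:
        return "empty"
    return None


def magic_test(path):
    """Stage 2: compare the leading bytes against known signatures."""
    with open(path, "rb") as fh:
        head = fh.read(16)
    for signature, description in MAGIC_SIGNATURES:
        if head.startswith(signature):
            return description
    return None


def language_test(path):
    """Stage 3: if the content looks like text, guess what kind of text."""
    with open(path, "rb") as fh:
        head = fh.read(4096)
    try:
        text = head.decode("ascii")
    except UnicodeDecodeError:
        return None
    if text.startswith("#!") and "python" in text.splitlines()[0]:
        return "Python script, ASCII text"
    return "ASCII text"


def classify(path):
    """Run the three stages in order; the first one that succeeds wins."""
    for test in (filesystem_test, magic_test, language_test):
        result = test(path)
        if result is not None:
            return result
    return "data"  # fallback label when no test matches
```

Calling classify() on a directory stops at the first stage, while a file beginning with a PNG signature is only identified by the second stage.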

19.3K lines of code

1 current contributors

10 days since last commit

159 users on Open Hub

Moderate Activity
4.4375

Natural Language Toolkit (NLTK)

  Analyzed 1 day ago

NLTK, the Natural Language Toolkit, is a suite of open source Python modules, linguistic data, and documentation for research and development in natural language processing. It supports dozens of NLP tasks, with distributions for Windows, Mac OS X and Linux.

234K lines of code

42 current contributors

18 days since last commit

45 users on Open Hub

Low Activity
5.0

WEKA

  Analyzed about 10 hours ago

Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well-suited for developing new machine learning schemes.

780K lines of code

3 current contributors

over 1 year since last commit

38 users on Open Hub

Very Low Activity
3.93333
Licenses: No declared licenses

Apache Mahout

Claimed by Apache Software Foundation. Analyzed about 6 hours ago

Apache Mahout's goal is to build scalable machine learning libraries. By scalable we mean scalable to reasonably large data sets. Our core algorithms for clustering, classification and batch-based collaborative filtering are implemented on top of Apache Hadoop using the map/reduce paradigm. However, we do not restrict contributions to Hadoop-based implementations: contributions that run on a single node or on a non-Hadoop cluster are welcome as well. The core libraries are highly optimized to give good performance for non-distributed algorithms too.

146K lines of code

0 current contributors

2 months since last commit

25 users on Open Hub

Low Activity
3.6

dlib C++ Library

  Analyzed about 19 hours ago

This project is a modern C++ library with a focus on portability and program correctness. It strives to be easy to use right and hard to use wrong. Thus, it comes with extensive documentation and thorough debugging modes. The library provides a platform abstraction layer for common tasks such as interfacing with network services, handling threads, or creating graphical user interfaces. Additionally, the library implements many useful algorithms such as data compression routines, linked lists, binary search trees, linear algebra and matrix utilities, machine learning algorithms, XML and text parsing, and many other general utilities.

450K lines of code

25 current contributors

11 days since last commit

11 users on Open Hub

Moderate Activity
4.75

PyMVPA

  Analyzed about 5 hours ago

Python module to ease pattern classification analyses of large datasets. It provides high-level abstraction of typical processing steps (e.g. data preparation, classification, feature selection, generalization testing), a number of implementations of some popular algorithms (e.g. kNN, Ridge Regressions, Sparse Multinomial Logistic Regression, GPR, RFE, I-RELIEF), and bindings to external ML libraries (libsvm, shogun, R). While it is not limited to neuroimaging data (e.g. fMRI), it is eminently suited for such datasets.

113K lines of code

0 current contributors

about 8 years since last commit

8 users on Open Hub

Inactive
5.0

MARF: Modular Audio Recognition Framework

  Analyzed 3 months ago

MARF is an open-source research platform and a collection of voice/sound/speech/text and natural language processing (NLP) algorithms written in Java and arranged into a modular and extensible framework that facilitates the addition of new algorithms. MARF can run in a distributed manner over the network and may act as a library in applications or be used as a source for learning and extension.

12.5K lines of code

0 current contributors

over 8 years since last commit

2 users on Open Hub

Activity Not Available
5.0

Java Data Mining Package (JDMP)

  Analyzed about 11 hours ago

The Java Data Mining Package (JDMP) is an open source Java library for data analysis and machine learning. It facilitates access to data sources and machine learning algorithms (e.g. clustering, regression, classification, graphical models, optimization) and provides visualization modules. It includes a matrix library for storing and processing any kind of data, with the ability to handle very large matrices even when they do not fit into memory. Import and export interfaces are provided for JDBC databases, TXT, CSV, Excel, MATLAB, LaTeX, MTX, HTML, WAV, BMP and other file formats. JDMP provides a number of algorithms and tools, but also interfaces to other machine learning and data mining packages (Weka, LibSVM, Mallet, Lucene, Octave).

40.7K lines of code

0 current contributors

over 8 years since last commit

2 users on Open Hub

Inactive
0.0

FAKE GAME

  Analyzed 4 days ago

The FAKE GAME tool uses natural evolution to evolve Data Mining models. It incorporates several preprocessing, optimization and visualization methods aimed at streamlining the Knowledge Discovery process. Knowledge Extraction from data is being automated!

0 lines of code

0 current contributors

over 14 years since last commit

2 users on Open Hub

Activity Not Available
0.0
Mostly written in language not available
Licenses: No declared licenses

CRM114 for Ruby

  Analyzed about 9 hours ago

Ruby interface to the CRM114 Controllable Regex Mutilator, an advanced and fast text classifier that uses sparse binary polynomial matching with a Bayesian Chain Rule evaluator and a hidden Markov model to categorize data with up to 99.87% accuracy.

197 lines of code

0 current contributors

over 14 years since last commit

1 user on Open Hub

Inactive
5.0