Tags : Browse Projects

file

  Analyzed about 11 hours ago

File tests each argument in an attempt to classify it. There are three sets of tests, performed in this order: filesystem tests, magic number tests, and language tests. The first test that succeeds causes the file type to be printed. Starting with version 4, the file command is not much more than a wrapper around the "magic" library.

15K lines of code

4 current contributors

2 days since last commit

156 users on Open Hub

Moderate Activity
4.4375
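
The test order in the file description above (filesystem tests, then magic number tests, then language tests, stopping at the first success) can be illustrated with a minimal Python sketch. This is not the file command's or the "magic" library's actual code: the magic table, the function names, and the returned labels are all hypothetical and chosen only to show the first-match-wins ordering.

```python
import os
import stat
import string

# Hypothetical, deliberately tiny magic table: (leading bytes, description).
# The real "magic" library consults a far larger database.
MAGIC_TABLE = [
    (b"\x89PNG\r\n\x1a\n", "PNG image data"),
    (b"%PDF-", "PDF document"),
    (b"\x7fELF", "ELF"),
    (b"PK\x03\x04", "Zip archive data"),
]

PRINTABLE = frozenset(string.printable.encode("ascii"))


def filesystem_test(path):
    """Stage 1: classify from lstat() information alone."""
    st = os.lstat(path)
    if stat.S_ISDIR(st.st_mode):
        return "directory"
    if stat.S_ISLNK(st.st_mode):
        return "symbolic link"
    if not stat.S_ISREG(st.st_mode):
        return "special file"  # illustrative label for devices, FIFOs, sockets
    if st.st_size == 0:
        return "empty"
    return None


def magic_test(head):
    """Stage 2: compare the leading bytes against known signatures."""
    for prefix, description in MAGIC_TABLE:
        if head.startswith(prefix):
            return description
    return None


def language_test(head):
    """Stage 3: a crude text check (assumption: printable ASCII means text)."""
    if head and all(byte in PRINTABLE for byte in head):
        return "ASCII text"
    return None


def classify(path):
    """Run the three stages in order; the first one that succeeds wins."""
    result = filesystem_test(path)
    if result is not None:
        return result
    with open(path, "rb") as handle:
        head = handle.read(512)
    for test in (magic_test, language_test):
        result = test(head)
        if result is not None:
            return result
    return "data"  # fallback when no stage recognizes the content
```

Because the stages run in a fixed order, a file beginning with `%PDF-` is reported by the magic stage even though its first bytes would also pass the text check in the last stage.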
   
Natural Language Toolkit (NLTK)

  Analyzed 6 days ago

NLTK, the Natural Language Toolkit, is a suite of open source Python modules, linguistic data and documentation for research and development in natural language processing, supporting dozens of NLP tasks, with distributions for Windows, Mac OS X and Linux.

211K lines of code

47 current contributors

24 days since last commit

45 users on Open Hub

High Activity
5.0
 
WEKA

  Analyzed 9 months ago

Weka is a collection of machine learning algorithms for solving real-world data mining problems. It is written in Java and runs on almost any platform. The algorithms can either be applied directly to a dataset or called from your own Java code.

707K lines of code

3 current contributors

11 months since last commit

37 users on Open Hub

Activity Not Available
3.93333
   
Licenses: No declared licenses

Apache Mahout

Claimed by Apache Software Foundation

  Analyzed 5 months ago

Apache Mahout's goal is to build scalable machine learning libraries. By scalable we mean scalable to reasonably large data sets. Our core algorithms for clustering, classification and batch-based collaborative filtering are implemented on top of Apache Hadoop using the map/reduce paradigm. However, we do not restrict contributions to Hadoop-based implementations: contributions that run on a single node or on a non-Hadoop cluster are welcome as well. The core libraries are highly optimized to allow for good performance also for non-distributed algorithms.

132K lines of code

8 current contributors

5 months since last commit

25 users on Open Hub

Activity Not Available
3.6
   
dlib C++ Library

  Analyzed 8 months ago

This project is a modern C++ library with a focus on portability and program correctness. It strives to be easy to use right and hard to use wrong. Thus, it comes with extensive documentation and thorough debugging modes. The library provides a platform abstraction layer for common tasks such as interfacing with network services, handling threads, or creating graphical user interfaces. Additionally, the library implements many useful algorithms such as data compression routines, linked lists, binary search trees, linear algebra and matrix utilities, machine learning algorithms, XML and text parsing, and many other general utilities.

357K lines of code

23 current contributors

9 months since last commit

10 users on Open Hub

Activity Not Available
4.75
   
PyMVPA

  Analyzed 5 months ago

Python module to ease pattern classification analyses of large datasets. It provides high-level abstraction of typical processing steps (e.g. data preparation, classification, feature selection, generalization testing), a number of implementations of some popular algorithms (e.g. kNN, Ridge Regressions, Sparse Multinomial Logistic Regression, GPR, RFE, I-RELIEF), and bindings to external ML libraries (libsvm, shogun, R). While it is not limited to neuroimaging data (e.g. fMRI), it is eminently suited for such datasets.

139K lines of code

6 current contributors

about 1 year since last commit

8 users on Open Hub

Activity Not Available
5.0
 
FAKE GAME

  Analyzed 15 days ago

The FAKE GAME tool uses natural evolution to evolve Data Mining models. It incorporates several preprocessing, optimization and visualization methods aimed at streamlining the Knowledge Discovery process. Knowledge Extraction from data is being automated!

0 lines of code

0 current contributors

over 7 years since last commit

2 users on Open Hub

Activity Not Available
0.0
 
Mostly written in language not available
Licenses: No declared licenses

Java Data Mining Package (JDMP)

  Analyzed 5 months ago

The Java Data Mining Package (JDMP) is an open source Java library for data analysis and machine learning. It facilitates access to data sources and machine learning algorithms (e.g. clustering, regression, classification, graphical models, optimization) and provides visualization modules. It includes a matrix library for storing and processing any kind of data, with the ability to handle very large matrices even when they do not fit into memory. Import and export interfaces are provided for JDBC databases, TXT, CSV, Excel, MATLAB, LaTeX, MTX, HTML, WAV, BMP and other file formats. JDMP provides a number of algorithms and tools, but also interfaces to other machine learning and data mining packages (Weka, LibSVM, Mallet, Lucene, Octave).

40.7K lines of code

0 current contributors

over 1 year since last commit

2 users on Open Hub

Activity Not Available
0.0
 
MARF: Modular Audio Recognition Framework

  Analyzed about 2 months ago

MARF is an open-source research platform and a collection of voice/sound/speech/text and natural language processing (NLP) algorithms written in Java and arranged into a modular and extensible framework that facilitates the addition of new algorithms. MARF can run distributed over the network and may act as a library in applications or be used as a source for learning and extension.

79.8K lines of code

0 current contributors

over 1 year since last commit

2 users on Open Hub

Activity Not Available
5.0
 
rapaio

  Analyzed 6 months ago

Statistics, data mining and machine learning toolbox written in Java.

40.4K lines of code

4 current contributors

8 months since last commit

1 user on Open Hub

Activity Not Available
0.0
 