Tags : Browse Projects

scikit-learn

  Analyzed about 5 hours ago

scikit-learn is a Python module integrating various machine learning algorithms under a common interface. It offers a wide range of methods such as Support Vector Machines, linear models (L1, L2 penalized), logistic regression, Gaussian mixture models and more. The large number of algorithms already implemented allows for easy comparison of the accuracy and performance of various algorithms.

139K lines of code

354 current contributors

about 19 hours since last commit

64 users on Open Hub

Very High Activity
5.0
 
Apache Spark

Claimed by Apache Software Foundation

Analyzed about 8 hours ago

Apache Spark is an open source cluster computing system that aims to make data analytics fast, both fast to run and fast to write. To run programs faster, Spark provides primitives for in-memory cluster computing: your job can load data into memory and query it repeatedly more rapidly than with disk-based systems like Hadoop. To make programming faster, Spark offers high-level APIs in Scala, Java and Python, letting you manipulate distributed datasets like local collections. You can also use Spark interactively to query big data from the Scala or Python shells. Spark integrates closely with Hadoop to run inside Hadoop clusters and can access any existing Hadoop data source.

1.18M lines of code

437 current contributors

about 9 hours since last commit

48 users on Open Hub

Very High Activity
5.0
 
Natural Language Toolkit (NLTK)

  Analyzed 3 days ago

NLTK, the Natural Language Toolkit, is a suite of open source Python modules, linguistic data and documentation for research and development in natural language processing. It supports dozens of NLP tasks, with distributions for Windows, Mac OS X and Linux.

212K lines of code

52 current contributors

16 days since last commit

45 users on Open Hub

High Activity
5.0
 
WEKA

  Analyzed about 1 year ago

Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well-suited for developing new machine learning schemes.

707K lines of code

3 current contributors

about 1 year since last commit

37 users on Open Hub

Activity Not Available
3.93333
   
Licenses: No declared licenses

Apache Mahout

Claimed by Apache Software Foundation

Analyzed 3 days ago

Apache Mahout's goal is to build scalable machine learning libraries. By scalable we mean scalable to reasonably large data sets. Our core algorithms for clustering, classification and batch-based collaborative filtering are implemented on top of Apache Hadoop using the map/reduce paradigm. However, we do not restrict contributions to Hadoop-based implementations: contributions that run on a single node or on a non-Hadoop cluster are welcome as well. The core libraries are highly optimized to allow for good performance also for non-distributed algorithms.

182K lines of code

14 current contributors

3 days since last commit

24 users on Open Hub

Moderate Activity
3.6
   
YALE -- Open-Source Java Data Mining

  Analyzed almost 2 years ago

YALE (Yet Another Learning Environment) is the most comprehensive open-source software for intelligent data analysis, data mining, knowledge discovery, machine learning, predictive analytics, forecasting, and analytics in business intelligence (BI). YALE provides more than 400 data mining operators, a graphical user interface (GUI), an online tutorial with hands-on data mining applications, a comprehensive PDF tutorial, many visualization schemes for data sets and data mining results, and many different learning and meta-learning schemes ranging from decision tree and rule learners to neural networks, SVMs, ensemble methods, etc. YALE is implemented in Java and available under the GPL (GNU General Public License) as well as under a developer license (OEM license) for closed-source developers.

3.53M lines of code

3 current contributors

over 2 years since last commit

17 users on Open Hub

Activity Not Available
4.25
   
Licenses: No declared licenses

Apache OpenNLP

Claimed by Apache Software Foundation

Analyzed 3 days ago

Apache OpenNLP is a Java machine learning toolkit for natural language processing (NLP).

129K lines of code

19 current contributors

about 2 months since last commit

12 users on Open Hub

Moderate Activity
5.0
 
gCube

  Analyzed 5 months ago

gCube is a framework dedicated to scientists. It enables the declarative and interactive creation of transient Virtual Research Environments that aggregate and deploy on-demand content resources and application services by exploiting computational and storage resources offered by private and commercial cloud providers.

22.7M lines of code

3 current contributors

5 months since last commit

11 users on Open Hub

Activity Not Available
5.0
 
dlib C++ Library

  Analyzed 12 months ago

This project is a modern C++ library with a focus on portability and program correctness. It strives to be easy to use right and hard to use wrong. Thus, it comes with extensive documentation and thorough debugging modes. The library provides a platform abstraction layer for common tasks such as interfacing with network services, handling threads, or creating graphical user interfaces. Additionally, the library implements many useful algorithms such as data compression routines, linked lists, binary search trees, linear algebra and matrix utilities, machine learning algorithms, XML and text parsing, and many other general utilities.

357K lines of code

23 current contributors

about 1 year since last commit

10 users on Open Hub

Activity Not Available
4.75
   
MOA - Massive Online Analysis

  Analyzed 12 months ago

MOA is a framework for learning from a data stream, that is, a continuous supply of examples. It includes classification and clustering methods. It is related to the WEKA project and is also written in Java, but scales to more demanding problems.

115K lines of code

14 current contributors

about 1 year since last commit

9 users on Open Hub

Activity Not Available
0.0
 